Preface
Today we continue exploring the ARMv8 machine.
We started with a «one-dimensional» product: one core executing straight, linear code. In the previous post we added a second «dimension» by making the product multi-core, so now it has four cores running independent code in parallel. This time we will add one more dimension (or at least half of one). We will review a concept that is important for all computing machines, and ARMv8 is no exception: exceptions.
Theory
Let's start with the theory. The concept of exceptions consists of three parts: a cause (in ARMv8 terms, a «syndrome»), a handler (the code that processes a raised exception) and an entity that binds them (in ARMv8 terms, the «Vector Table»).
Although the handlers are divided into groups, from a technical point of view they are identical for all exceptions. The Vector Table, the entity that associates exceptions with their handlers, is just 2kB of code aligned in a certain way and divided into 16 equal-sized blocks. Each of those blocks holds 32 ARMv8 instructions (128 bytes). So the Vector Table is a binding entity by its structure (strictly defined by the ARMv8 architecture) and a set of handlers by its content.
The causes, or, generally speaking, the exceptions themselves, are of two types: asynchronous and synchronous. Asynchronous exceptions take place when some «external» event occurs. An interrupt is a good example: while writing code you never know when an interrupt will occur. Synchronous exceptions arise as a direct result of executing (or attempting to execute) an instruction. In other words, a synchronous exception is a reaction to an instruction. It can be raised by an instruction that causes an error, a situation in which the machine cannot continue to function normally without handling that error. An example is an illegal instruction, where a fetched instruction cannot be decoded (and executed) by the core. «Handling» such errors means analysing the state of the machine and attempting to fix it before allowing the code to continue, or, in the worst case, preventing the machine from running further code by (usually) leaving it in an infinite loop. Finally, there is a set of synchronous exceptions that are designed not to handle errors but to serve regular, normal cases that the developer triggers on purpose. These are the topic of today's post, the Secure Monitor Call (SMC instruction) in particular.
The SMC exception is synchronous, and from the developer's point of view it looks like a regular call (in ARMv8 terms, a branch) to a normal function, because its handler is executed immediately after this instruction and before the next one. That makes it a good case for learning about exceptions on ARMv8 and for playing with them.
Thus, here is our plan for today:
1. Form out Vector Table
2. Configure the core to use our Vector Table
3. Design handler for SMC exception
4. Generate exception with SMC instruction
Let's get started and briefly run through the plan.
Form out Vector Table
The Vector Table is a block of regular ARMv8 code 2kB in size, split into 16 sections of 128 bytes each. The Vector Table itself must also be aligned to a 2kB boundary. We place our Vector Table in a separate file, vbar_64.s. It is aligned to 2kB and consists of 16 sections, each aligned to 128 bytes. There is nothing special to discuss here for now; we will get back to the content of this file later.
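The geometry described above can be written down as a few constants. The following is only a minimal C sketch for illustration; the names vector_entry_offset and vector_base_is_aligned are hypothetical and do not come from the project sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Geometry of the Vector Table as described in the text. */
enum {
    VECTOR_ENTRY_COUNT = 16,   /* sections in the table */
    VECTOR_ENTRY_SIZE  = 128,  /* bytes per section: 32 instructions x 4 bytes */
    VECTOR_TABLE_SIZE  = VECTOR_ENTRY_COUNT * VECTOR_ENTRY_SIZE  /* 2048 bytes */
};

/* Byte offset of section `index` (0..15) from the start of the table. */
static uint32_t vector_entry_offset(unsigned index)
{
    assert(index < VECTOR_ENTRY_COUNT);
    return index * VECTOR_ENTRY_SIZE;
}

/* True if `base` sits on a 2kB boundary, i.e. its low 11 bits are zero. */
static bool vector_base_is_aligned(uint64_t base)
{
    return (base & (VECTOR_TABLE_SIZE - 1u)) == 0;
}
```

A check like vector_base_is_aligned() could be applied to the address of the table before it is written to the base address register; the addresses used to try it out are arbitrary.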
Configure the core to use our Vector Table
The base address of the Vector Table is set by writing it to the VBAR_EL3 register. VBAR stands for Vector Base Address Register. This is done in crt0_64.S in a newly added function, _set_vbar, which is called from the _crt0_main function right after the initial value of VBAR_EL3 has been stored as the fourth parameter of barium_main() for later review. Everything is simple and clear here:
_set_vbar:
ldr x7, =_vbar
msr VBAR_EL3, x7
dsb sy
isb
ret
Design handler for SMC exception
According to the ARMv8 Vector Table structure (or map), our SMC handler is the fifth entry in vbar_64.s, at offset 0x200 from the table base (4 × 128 bytes). Usually an exception handler consists of entry and exit parts, which save and restore registers, and of code that obtains the exception syndrome and branches to the real handler payload function. We will review our handler a little later.
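To see why the handler is the fifth entry, recall that the architecture splits the 16 entries into four groups of four: the group depends on where the exception was taken from, and the position inside the group on the kind of exception. Here is a small C sketch of that map. The enum and function names are hypothetical, and the sketch assumes that our code runs at EL3 on its own stack pointer (the «current EL with SPx» group), which is what the fifth position implies.

```c
#include <stdint.h>

/* Where the exception was taken from (one group = 4 entries = 0x200 bytes). */
enum vector_group {
    CURRENT_EL_SP0   = 0,  /* same exception level, using SP_EL0 */
    CURRENT_EL_SPX   = 1,  /* same exception level, using its own SP */
    LOWER_EL_AARCH64 = 2,  /* lower exception level running AArch64 */
    LOWER_EL_AARCH32 = 3   /* lower exception level running AArch32 */
};

/* Kind of exception inside a group (one entry = 0x80 bytes). */
enum vector_kind {
    VEC_SYNCHRONOUS = 0,
    VEC_IRQ         = 1,
    VEC_FIQ         = 2,
    VEC_SERROR      = 3
};

/* Offset of the matching entry from the Vector Table base. */
static uint32_t vector_offset(enum vector_group group, enum vector_kind kind)
{
    return ((uint32_t)group * 4u + (uint32_t)kind) * 0x80u;
}
```

With these assumptions an SMC executed by our own code is a synchronous exception from the current EL with SPx, so vector_offset(CURRENT_EL_SPX, VEC_SYNCHRONOUS) gives 0x200, the fifth entry.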
Generate exception by executing SMC instruction
After setting VBAR and implementing a handler for the SMC exception, we can place an SMC instruction in our code. It will generate an exception, and the core will select the corresponding handler and run it by simply branching to its address. Let's review the SMC instruction itself. Its format is:
smc #imm16
#imm16 means that the instruction takes a so-called 16-bit «immediate» value: a value that is known at build time, either a literal number or a #define. This means we cannot use a register as the parameter of this instruction. ARMv8 instructions are 32 bits long and consist of an opcode and its parameters. When translating assembly code into machine code, the assembler builds the encoding of this instruction from the specified immediate value. That is what we know about the SMC instruction and its nature.
We can obtain all 16 bits of this value later in the handler from the exception syndrome; we'll review this later. But how can we specify this parameter when we need to pass different values? For example, we may want to pass some information to our handler via this parameter. It looks as if the parameter was designed exactly for that purpose, but it seems unusable because it is an immediate value. Of course, we could implement a big switch/case or if/else block that would look like:
if (a)
smc #1
else
if (b)
smc #2
...
else
smc #0xFFFF
But this would be an enormous block of boring code. We don't like big blocks of boring code, and we are used to doing something exceptional in our posts. Let's not make an exception today. We will present a method of passing a variable argument to the SMC instruction at run time, and it will be a small piece of code.
How can we do that? We build the SMC instruction itself with the required immediate value, write it to some memory address and execute it. In other words, we implement a function that does part of the assembler's work at run time. Whenever we need to execute an SMC instruction, we call a function that forms the instruction and then executes it. This is done in smc.s, in the _form_smc function. What actually happens there? First we form the SMC 0 instruction, which is the base for the one we need. Its opcode is D4000003h. Then we keep the low 16 bits of the _form_smc parameter (x0), shift them left by 5, which is the exact offset of #imm16 within the SMC instruction, and combine the result with the base opcode formed above. That is all it takes to form the opcode of an SMC instruction with a given #imm16. After that we store this opcode at the address of the _smc label.
That might look like a complete solution (just branch to, or fall through to, _smc), but it would not work, and the reason is caches. We have already turned them on, and our application is so tiny that it fits in the caches entirely, so the code runs completely from cache. The store goes through the data cache, while instructions are fetched through the instruction cache, so we need to force the core to refetch our newly generated instruction. This is done by cleaning the data cache line that holds the instruction and invalidating the matching instruction cache line. This is the last part: we perform the cache maintenance and fall through to our newly formed instruction (_smc) without a branch, because it is located right after the _form_smc function. You can see the code below:
.globl _form_smc
_form_smc:
# Form instruction - smc with given immediate value.
# Form instruction SMC 0 - the base for desired one,
# its opcode is D4000003h:
mov x1, 0x0003
movk x1, 0xD400, lsl 16
# Ensure we have exact amount of bits we need - immediate is 16 bit long:
and x0, x0, 0xFFFF
# Shift the immediate value to position it takes in instruction:
lsl x0, x0, #5
# Put the immediate value into instruction code by orring:
orr x0, x0, x1
# Obtain the address we want to modify:
ldr x1, =_smc
# Write new instruction to destination address:
str w0, [x1]
# As we have caches enabled, the new instruction may still sit only in
# the D-cache while the I-cache holds the old one:
adr x1, _smc
# Clean the D-cache line to the point of unification:
dc cvau, x1
dsb ish
# Invalidate the I-cache line:
ic ivau, x1
dsb ish
isb
Now let's review the template of the _smc section. It holds an SMC instruction with a base immediate value, for which we chose 0. Keep in mind that at this point we are still inside a function: we did not branch to _smc, we fell through into it from _form_smc, which has just overwritten the instruction there. After executing the SMC, the core runs the exception handler and, after returning from it, continues with the instruction immediately following the SMC. Thus we have to add a ret instruction there, as the end of the _form_smc function.
A function can return a value. What could we return from _form_smc? It would be interesting to return the opcode of the generated instruction, and that is exactly what we do. No additional data movement is needed, because the opcode is already in x0, the register that holds a function's return value according to the ABI. You can see this section below:
# Fall to newly formed instruction:
_smc:
# smc with default immediate value:
smc 0x0000
# We keep in mind that we are still in function (_form_smc), which
# is called from C-code. Thus have to put ret here. Also we use the
# return value for reviewing of instruction code we've generated.
# At this moment it is stored in w0, thus we do not move any values.
ret;
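The bit manipulation that _form_smc performs can be modelled in plain C, which is handy for checking opcodes by hand. This is only a sketch of the encoding step, not the author's code: smc_encode and smc_decode_imm16 are hypothetical names, and nothing here writes to memory or touches the caches.

```c
#include <stdint.h>

#define SMC_BASE_OPCODE 0xD4000003u  /* encoding of SMC #0 */
#define SMC_IMM16_SHIFT 5            /* imm16 occupies bits [20:5] */
#define SMC_IMM16_MASK  0xFFFFu

/* Build the opcode of SMC #imm, keeping only the low 16 bits of `imm`. */
static uint32_t smc_encode(uint64_t imm)
{
    return SMC_BASE_OPCODE |
           ((uint32_t)(imm & SMC_IMM16_MASK) << SMC_IMM16_SHIFT);
}

/* Extract the immediate value back from an SMC opcode. */
static uint16_t smc_decode_imm16(uint32_t opcode)
{
    return (uint16_t)((opcode >> SMC_IMM16_SHIFT) & SMC_IMM16_MASK);
}
```

For example, smc_encode(0x0D) yields D40001A3h: 0x0D shifted left by 5 is 0x1A0, which is then combined with D4000003h.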
Now we have code that causes an exception: an SMC with the given parameter as its immediate value. So it's time to get back to our handler. It is a simple routine that reads the immediate value and the so-called exception class (the actual type of the exception) from the exception syndrome, and obtains the number of the core on which the exception occurred. After that it branches to the C function exception_handler(), which just outputs the gathered information: the exception type (SMC only at the moment), its immediate value and the core number. After exception_handler() the handler returns the core to normal code. You can see the code of the handler in the repository, in the vbar_64.s file.
Now we have VBAR set up, a function that builds an SMC instruction at run time with the specified parameter as its immediate value and returns its opcode, and an exception handler. So how do we put it all together to get a nice result? We use the characters we receive from UART as the immediate value for SMC and print the resulting opcode. Here is what we get:
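The decoding the handler has to do can be sketched in C as follows. This is not the code from vbar_64.s; it is a hypothetical illustration that assumes the syndrome comes from ESR_EL3 and the core number from the lowest affinity field of MPIDR_EL1 (the usual layout on a single-cluster Cortex-A53). The struct and function names are made up for the example.

```c
#include <stdint.h>

#define ESR_EC_SHIFT        26      /* exception class lives in bits [31:26] */
#define ESR_EC_MASK         0x3Fu
#define ESR_EC_SMC_AARCH64  0x17u   /* class of an SMC executed in AArch64 */
#define ESR_ISS_IMM16_MASK  0xFFFFu /* for SMC, ISS bits [15:0] hold imm16 */
#define MPIDR_AFF0_MASK     0xFFu   /* assumed: core number in Aff0 */

struct exception_info {
    uint32_t ec;     /* exception class */
    uint16_t imm16;  /* immediate value, meaningful only for SMC */
    uint8_t  core;   /* core on which the exception occurred */
};

/* Split raw register values into the fields the handler reports. */
static struct exception_info decode_exception(uint64_t esr, uint64_t mpidr)
{
    struct exception_info info;
    info.ec    = (uint32_t)((esr >> ESR_EC_SHIFT) & ESR_EC_MASK);
    info.imm16 = (uint16_t)(esr & ESR_ISS_IMM16_MASK);
    info.core  = (uint8_t)(mpidr & MPIDR_AFF0_MASK);
    return info;
}
```

A handler written this way would compare info.ec with ESR_EC_SMC_AARCH64 before trusting info.imm16, since the low syndrome bits mean different things for other exception classes.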
Barium No-Boot V0.4 (iMX8MP)
Build: 14:44:44, Sep 26 2025
Running at: 2000MHz
ALU Core №: 0
Vector BAR: 0000000000000000
ALU Core №: 1
Vector BAR: 0000000000000000
ALU Core №: 2
Vector BAR: 0000000000000000
ALU Core №: 3
Vector BAR: 0000000000000000
Awaiting commands from UART:
Exception: SMC 000D, Core: 3
Instruction opcode: D40001A3
Exception: SMC 0020, Core: 3
Instruction opcode: D4000403
What do we see here? VBAR is initially all zeros on every core. After that we see pairs of lines. The first line of each pair comes from the exception handler: it prints the type of the exception, its immediate value and the number of the core on which the exception occurred. The second line comes from the barium_main() function: it prints the return value of _form_smc(), which is the opcode of the instruction we generated.