Designing the ISA and the simulator is fine and dandy but it does not beat running a custom CPU on real hardware. Unfortunately, I don't have tens of thousands of euros to make a custom chip. The best I can do is to run it on an FPGA. An FPGA is a chip with a lot of logic gates that can be wired together to make any logic circuit. And our CPU is a logic circuit.
What we need to do is to use a hardware description language such as Verilog or VHDL to describe the CPU. I choose to use Verilog as I prefer its more concise syntax but both languages can do the same things in a very similar manner.
I designed my CPU in 4 main blocks:
The CPU is nice to have but it cannot do much on its own. It needs peripherals to interact with the outside world. The one we will use the most is the UART, a peripheral used to send and receive serial messages, such as our beloved "Hello, world!"
The peripherals are put on the memory bus of the CPU so that they can be interacted with. There is also the RAM and the ROM of the processor on that bus.
The peripherals are controlled by writing and reading at special memory addresses, in a way similar to the simulator. For example, the addresses used to interact with the UART are between 0xFF16 and 0xFF19. Here is the code used to do so, notice how similar it is to the code that ran on the simulator:
@align_word label string @string Hello, world! @rawbytes D A 0 label start set 15 cpy R3 ; Length of the message to print in R3 setlab string cpy R2 ; Pointer to the character to print in R2 label print-loop ; Loading a character from R2 and storing it in R3 tbm load R2 cpy R1 tbm ; Function call made thanks to the `callf` macro callf printc ; Incrementing the pointer to the desired character set 1 add R2 cpy R2 ; Decrementing the number of characters left to print set 1 sub R3 cc2 cpy R3 ; If the number of characters left to print is not 0, jump back to the start of the loop set 0 eq R3 cmpnot setlab print-loop jif quit label printc pushr R2 ;addrs pushr R4 ;waiting loop pointer set+ 0xFF16 ;UART tx_cmd addr tbm cpy R2 setlab printcLoop cpy R4 label printcLoop load R2 ;testing that R2 is not 0 to see if we are ready to print cpy R12 set 0 eq R12 read R4 ;until ready, go back jif set 1 ;computing the data addr add R2 cpy R12 read R1 ;writing the char str R12 set 0 ;sending command str R2 tbm popr R4 ;restoring registers popr R2 ret
To ensure that everything works well, it is important to test the CPU, its code, and its peripherals. For example, here is the result of the simulation that sends "Hello, world!" over a UART TX line.
Note how the TX line moves a bit each time a character is sent.
Once the Verilog code is ready, we must synthesize it to convert it into a bitstream that can be loaded into the FPGA. Here comes the not-fun part: "Slack not met".
If a signal goes through too many logic gates between two registers, the signal does not have the time to be fully updated before reaching the end register. This causes the design implemented in the FPGA to have a different behavior than what has been described in Verilog. Thankfully, the FPGA design tools (Vivado in my case) can alert us if such a problem arises and it is up to the designer to fix the issue.
The issue can be fixed by putting intermediate registers in the signal paths that cause issues. This increases latency but this is a small price to pay to have a working design. Once the issues are ironed out, the design can be downloaded to the FPGA.
We are now able to have our hardware design in the FPGA, but how can we give it software to run? The first solution is to bake the software as a ROM in the MCU design. This is very easy to do but this has a major drawback, to put a new software, we must restart the synthesis process, which is lengthy.
A better solution is to embed a bootloader in the MCU that can read firmware given over the UART link and put it into memory. That way, new software can effortlessly be downloaded.
The bootloader is a normal program that I write into a ROM in the MCU. It is executed on boot and uses the peripherals (UART and timers) to get the desired software from an outside source and write it into RAM from which it will be executed. Once the software is fully loaded, the bootloader resets the MCU's state and jumps into the new software.
Here is a small clip of me uploading the "Hello, world!" program to the processor and its reply.
Sending a "Hello, world!" over a UART link is a nice step, but it would be even better to write it onto a screen. To do so, we will need to design a basic GPU to pair with our CPU.
Back to homepage