Here is a pattern I have seen too many times:
- You need / are inspired to learn a new tool or technology.
- You get started using it and learn things as you go.
- You create simplified mental models that hold the information together.
- You don’t ask why things are the way they are and memorize what you don’t understand.
- You struggle to fit the pieces together.
- Something comes up and shatters your assumptions.
- You have to rethink everything.
Sound familiar? This type of learning can work, but I think it’s one of the fundamental reasons people struggle with their first programming language, especially if that language is C.
I have seen so many people who memorize code without understanding it. They waste hours guessing at what’s wrong with their code, when a basic grasp of what’s under the hood would let them spot the problem in no time!
Our brain learns best one step at a time, by associating the new information with what’s already understood.
In math, no one asks you to find x in “x + 5 = 7” before you know what numbers are and how to add and subtract them.
In order to make sense of something, you have to make sense of the simpler things that it is built upon.
Sure, there are a lot of details you shouldn’t pay attention to unless you really need them, but also a lot of core ideas you should absolutely pay attention to.
The famous 80 / 20 rule also applies to what we learn.
We use 20% of what we learn 80% of the time; the other 80% we use just 20% of the time.
Starting with “hello world” is a terrible idea, because by the time you finish writing it, you might already have a million questions in your mind.
This is a famous book that explains how a CPU can be built one step at a time, starting from a basic circuit that we can all understand and building on top of it until a CPU is born.
Here is the thing though: almost no one actually cares how an Intel CPU really works. We just need a mental model that’s good enough that the CPU doesn’t look so alien to us.
This is what the book does. It presents the essence of a CPU.
But it only goes so far. It never tries to go further and teach something like a programming language the same way.
It’s a project made for students learning assembly language. It’s a simulation of a computer inside your browser.
You write a simplified version of assembly language and it shows the CPU throwing numbers around RAM, one at a time.
Nothing is hidden behind an operating system, compiler, or anything else. There is a table that maps from each instruction to a number. It is self contained.
You can see everything that is going on, all in one page.
You do something different, you immediately see what happens.
You don’t need to spend time configuring anything; you just open a link and it runs in any browser.
It is so small that it all fits on one screen.
It is not the be-all and end-all of assembly language tutorials, however.
After you understand everything going on there, you still have to cover the “real world details” to call yourself an assembly language programmer.
It shows us the essence of assembly language.
If you want to get an idea of what the CPU is doing all day, it’s the perfect place to start.
If you are serious about assembly language, you can go further. But the journey is still easier if you start with the 8 bit assembler simulator.
After seeing 8 bit simulator, I had one question stuck in my mind:
Why isn’t there something like 8 bit assembler simulator for people learning a higher level language like C?
I know one thing for sure: As I learned C, I didn’t like memorizing things I didn’t understand.
I just wanted to have the big picture of how it all works (and why it is this way) and then, as details come up, I could easily fit them into the big picture.
I didn’t find anything close to 8 bit simulator for C, so I started working on it myself. 3 iterations and 10 months later, miniC was born.
It’s much like 8 bit simulator, just taken one step further.
You write your C code, compile it, then you can see everything that’s going on under the hood, as it gets compiled and executed.
miniC is also an interactive tutorial series that teaches C from the ground up, starting from the very basic machine instructions.
(work in progress, part 1 is here).
miniC aims to answer questions like:
- What is going on under the hood when I write and execute “2+2”?
- What about the if statement?
- What about pointers?
- What about functions?
- What about structs?
- What do they all look like in memory?
- What does the compiler do?
- What do the errors mean?
- What does the compiler not do? How is it thinking? How can you trick the compiler?
- What assumptions does it make?
- How does the CPU underneath work? How are the instructions executed?
- Why is it this way? How did people even come up with these things? Where are they coming from?
No configurations, no distractions, no headaches, just a web page and a tutorial series. All the details that make it hard to wrap your head around are gone. It is self-contained, and every piece fits together with the existing ones.
My number one goal: No more memorizing without understanding.
Don’t worry though, unlike 8 bit simulator, it will also cover the messy real world details.
That’s some motivation, now for the compiler…
It feels like just yesterday that I had no compiler and no idea how to write one, just my past frustrations and a vision.
8 months and 3 failed attempts later, I let miniC loose on reddit.
At the heart of miniC lies a
“C-subset compiler, that targets a stack-based VM, that is written in C#, that runs on a Mono .NET virtual machine, that’s compiled to WebAssembly, that runs in the browser” (reddit/u/dangerbird2).
If that sounds crazy, well… it is, a bit. But I can explain!
In this post series I’ll be sharing compiler internals and how miniC became what it is today.