Oct 2, 2018
'Modules' in machine code
SubX now supports code in multiple files. Segments with the same name get merged.
One subtlety: later code is *prepended* to earlier code. This gives us executable libraries with overridable entrypoints.
$ git clone https://github.com/akkartik/mu
$ cd mu/subx
$ ./subx translate *.subx -o out
$ ./out # run tests
$ ./subx translate *.subx apps/factorial.subx -o out
$ ./out; echo $?
120 # factorial(5)
$ ./out test # run tests
https://github.com/akkartik/mu/blob/29ab43973a/subx/Readme.md
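A rough model of the merge rule described in this post, in Python rather than SubX. The file contents and names (`library`, `app`, `merge_segments`) are hypothetical, and it assumes that execution starts at the beginning of the merged code segment, which is what would let a later file override the entrypoint.

```python
# Hypothetical model: each file maps segment names to lists of lines,
# and files are processed in command-line order.

def merge_segments(files):
    """Merge same-named segments; later contents go in front of earlier ones."""
    merged = {}
    for segments in files:
        for name, lines in segments.items():
            # Prepend: the newest file's lines come first.
            merged[name] = lines + merged.get(name, [])
    return merged

library = {"code": ["run_tests", "exit"], "data": ["lib_msg"]}
app = {"code": ["factorial_main", "exit"]}

merged = merge_segments([library, app])
print(merged["code"])  # ['factorial_main', 'exit', 'run_tests', 'exit']
```

With only the library on the command line, its own first instructions (here, running tests) stay at the front of the code segment.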
We also now have the ability to allocate new segments in virtual memory using mmap(). The first new segment I plan on is a 'trace' segment that tests can write to and make assertions on. Automated white-box testing.
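To make the trace idea concrete, here is a small Python sketch, not SubX: an anonymous mapping stands in for the new segment, code under test appends lines to it, and a test asserts on what was written. The helper names, the 4096-byte size and the line format are all assumptions for illustration.

```python
import mmap

TRACE_SIZE = 4096                  # hypothetical segment size
trace = mmap.mmap(-1, TRACE_SIZE)  # anonymous mapping, not backed by a file

def trace_write(segment, line):
    """Append one newline-terminated line at the segment's current position."""
    segment.write(line.encode("ascii") + b"\n")

def trace_contains(segment, line):
    """Check whether a full line was written before the current position."""
    written = segment[:segment.tell()]
    return line.encode("ascii") in written.split(b"\n")

# Code under test emits trace lines as it runs...
trace_write(trace, "parse: read label loop")
trace_write(trace, "emit: 5 bytes")

# ...and a white-box test asserts on them afterwards.
assert trace_contains(trace, "emit: 5 bytes")
```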
Next up, once I have some tracing primitives: a dependency-injected interface for sockets so that we can write automated tests for a fake network.
* *
Sep 23, 2018
More adventures with machine code
SubX now has a test harness, and support for string literals.
Current (increasingly silly) factorial program to compare with the parent toot:
http://akkartik.name/images/20180923-subx-factorial.html
Two new features:
a) Autogenerated `run_tests` (line 26) which calls all functions starting with 'test_'.
b) String literals (line 31). They get transparently moved to the data segment and replaced with their address.
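A minimal Python sketch of the string-literal transformation in (b). The data segment address, the null terminator and the function name are assumptions made for the example; SubX's real layout may differ.

```python
import re

DATA_START = 0x0A000000  # hypothetical start address of the data segment

def hoist_string_literals(code_lines, data_start=DATA_START):
    """Move "..." literals into a data blob; leave their address in the code."""
    data = bytearray()

    def replace(match):
        address = data_start + len(data)
        # Null-terminating the string is an assumption of this sketch.
        data.extend(match.group(1).encode("ascii") + b"\0")
        return f"0x{address:08x}"

    out = [re.sub(r'"([^"]*)"', replace, line) for line in code_lines]
    return out, bytes(data)

code, data = hoist_string_literals(['push "abc"', 'push "de"'])
print(code)  # ['push 0x0a000000', 'push 0x0a000004']
print(data)  # b'abc\x00de\x00'
```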
https://github.com/akkartik/mu/blob/37d53a709/subx/Readme.md
There's only one problem: I don't know how to build a compiler. Not really. And definitely not in machine code. So I'm going to be fumbling around for a bit. Lots more wrong turns in my future.
I've been trying to port the Crenshaw compiler to SubX. With tests. It's been slow going, because I have to think about how to make Crenshaw's primitives like Error() and Abort() testable.
I don't know if just learning to build a compiler will sustain my motivation, though. So some other ideas:
a) Some sort of idealized register allocator in Python or something. I've never built one, and my intuitions on how hard it is seem off.
b) Port Mu's fake screen/keyboard to SubX so that I can reimplement https://github.com/akkartik/mu/tree/master/edit#readme. It was just too sluggish since Mu was interpreted. Even my 12-year-old students quickly dropped it in favor of Vim.
* *
Aug 13, 2018
SubX now supports basic file operation syscalls:
https://github.com/akkartik/mu/blob/7328af20a/subx/ex8.subx
I've also made labels a little safer, so you can't call into the middle of a function, or jump into a different function: https://github.com/akkartik/mu/blob/7328af20a/subx/037label_types.cc
Next stop: socket syscalls!
https://github.com/akkartik/mu/blob/7328af20a/subx/Readme.md
* *
Aug 11, 2018
Now that it can translate labels to offsets, SubX also warns on explicit use of error-prone raw offsets. Both when running and in Vim.
As I build up the ladder of abstractions I want to pull up the ladder behind me:
a) Unsafe programs will always work.
b) But unsafe programs will always emit warnings.
As long as SubX programs are always distributed in source form, it will be easy to check for unsafe code. Coming soon: type- and bounds-checking.
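For readers who haven't hand-computed jump targets, this is roughly what translating labels to offsets involves. It's a Python sketch with a made-up instruction representation, relying only on the x86 rule that a relative jump's displacement is measured from the end of the jump instruction.

```python
def resolve_labels(instructions):
    """instructions: list of (label or None, size in bytes, jump target or None).
    Returns one displacement per jump, relative to the end of that jump."""
    # Pass 1: record the byte address of every label.
    addresses = {}
    position = 0
    for label, size, _ in instructions:
        if label is not None:
            addresses[label] = position
        position += size

    # Pass 2: displacement = target address - address after the jump.
    displacements = []
    position = 0
    for _, size, target in instructions:
        position += size
        if target is not None:
            displacements.append(addresses[target] - position)
    return displacements

program = [
    ("loop", 5, None),    # 5-byte instruction at address 0
    (None,   2, None),
    (None,   2, "loop"),  # backward jump, ends at address 9
    (None,   2, "done"),  # forward jump, ends at address 11
    ("done", 1, None),    # address 11
]
print(resolve_labels(program))  # [-9, 0]
```

Writing `-9` by hand is exactly the kind of raw offset that goes stale when an instruction is inserted in the loop.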
https://github.com/akkartik/mu/tree/76aace4625/subx
* *
Aug 4, 2018
It's been a slow week, but one idea I've been playing with is "comment tokens"[1]: effect-less words that you can sprinkle into your programs to make them more readable.
Concretely, SubX lines can get long, and the comment at the end is often far away and hard to line up visually with the instruction it's referring to. The solution: dot leaders[2].
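A tiny Python sketch of how a translator could treat such tokens. It assumes that `#` starts a trailing comment and that any word made up only of dots is a comment token; both are assumptions of the sketch rather than a description of SubX's parser.

```python
def strip_comment_tokens(line):
    """Drop a trailing '#' comment and any words made only of dots."""
    code = line.split("#", 1)[0]
    return [word for word in code.split() if word.strip(".") != ""]

line = "bb . . . . . . 42/imm32  # copy to EBX"
print(strip_comment_tokens(line))  # ['bb', '42/imm32']
```

The dots change nothing about the instruction; they only lead the eye from the operands to the comment.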
https://github.com/akkartik/mu/tree/55b4627de1/subx
[1] Originally in the context of Lisp: https://github.com/akkartik/wart/commit/c2e6d0c6d3
[2] https://www.w3.org/Style/Examples/007/leaders.en.html
* *
Jul 30, 2018
Factorial on SubX
http://akkartik.name/images/20180730-subx-factorial.html
Ok, I think I understand calling conventions now.
Also coming face to face with the pain of debugging machine code 😀
https://github.com/akkartik/mu/commit/62c6d1638a
* *
Jul 26, 2018
Adventures in machine code #3
I just spent a couple of days separating out bitfields in my programs, and writing a translator to pack them correctly.
Then I realized that doing so makes it harder to count bytes when computing jump targets.
Luckily there are just two such bytes in the 32-bit x86 encoding, and most of the time the rule becomes, "add 1 byte if these three columns contain anything".
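A Python sketch of the packing and the byte-counting rule. It reads the two bytes as x86's ModR/M and SIB bytes, which is an assumption about what the post means; the field names and the `extra_bytes` helper are invented for the example.

```python
def pack_modrm(mod, reg, rm):
    """ModR/M byte: 2 bits of mod, 3 bits of reg/opcode, 3 bits of r/m."""
    assert 0 <= mod < 4 and 0 <= reg < 8 and 0 <= rm < 8
    return (mod << 6) | (reg << 3) | rm

def pack_sib(scale, index, base):
    """SIB byte: 2 bits of scale, 3 bits of index, 3 bits of base."""
    assert 0 <= scale < 4 and 0 <= index < 8 and 0 <= base < 8
    return (scale << 6) | (index << 3) | base

def extra_bytes(fields):
    """Count the bytes contributed by bitfield columns on one line:
    one byte per group of three columns that contains anything."""
    groups = (("mod", "reg", "rm"), ("scale", "index", "base"))
    return sum(1 for group in groups
               if any(fields.get(name) is not None for name in group))

# 'mov eax, ebx' is opcode 89 followed by a ModR/M byte:
# mod=3 (register direct), reg=3 (EBX), rm=0 (EAX).
print(hex(pack_modrm(3, 3, 0)))                    # 0xd8
print(extra_bytes({"mod": 3, "reg": 3, "rm": 0}))  # 1
```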
https://github.com/akkartik/mu/tree/6e51c60c699/subx/Readme.md
* *
Jul 9, 2018
After playing with the ELF format for a few days, it's starting to sink in that binaries need to specify the location of the data segment, but the locations of the stack and heap are the kernel's prerogative.
Obvious with hindsight.
* *
Jul 1, 2018
Lately I've been programming in raw (32-bit x86) machine code, evolving some minimal tooling for error checking rather than information hiding. A few different ways to write the same instruction ("mov ebx, 42"):
- <binary>
- `bb 2a 00 00 00`
- `bb 42/imm32` (todo: check that `bb` accepts an imm32)
- `mov_imm 42/imm32` (planned; like Forth, no overloading names)
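As a cross-check on the raw bytes above, here is a short Python sketch that builds the same five bytes. It uses the standard x86 encoding for `mov r32, imm32` (opcode `b8` plus the register number, followed by a little-endian 32-bit immediate); the function name is made up.

```python
import struct

EBX = 3  # register number of EBX in the x86 encoding

def mov_reg_imm32(reg, value):
    """Encode 'mov r32, imm32': opcode 0xb8+reg, then a little-endian imm32."""
    return bytes([0xB8 + reg]) + struct.pack("<I", value & 0xFFFFFFFF)

encoded = mov_reg_imm32(EBX, 42)
print(encoded.hex(" "))  # bb 2a 00 00 00
```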
It'll eventually start getting more high level.
- String literals.
- Function calls.
- ...
* *