hisham hm

🔗 That time I almost added Tetris to htop

Confession time: once I *almost* added a terminal version of Tetris as an Easter egg in htop.

I managed to implement a really crude but working version of it, code-golfing it to make it as short as possible, and got it pretty tiny. Then I added it to the help screen so it would activate by typing h, t, o, p (since h would take you to the help screen and the other keys would be nops in that screen).

Then there’s the question of how to hide an Easter egg in a FOSS codebase… The best I could think of was to make it into a long one-liner starting at column 200 so that most people looking at the code without word-wrapping editors would miss it. But after everything was coded, I decided that trying to “sneak code in”, even in my own codebase, was a bad practice and the good intention of innocent fun wasn’t worth it.

My fascination with Tetris goes way back. I first implemented it when I was in high school, and getting it done really gave me pause: that was a real program, something that people paid real money for in Nintendo cartridges. It was the first time I thought I could really call myself a programmer. At the same time, it was my first contact with the ethics of software. I had never heard of FOSS then, and yet I asked myself: “what if my friends ask for the source code? what should I do?”

Years later, when we did the first CD version for our GoboLinux distro, I took an existing ncurses version of Tetris and hacked it into our installer, adding a progress bar that showed the status of files copying from CD to disk, while the user played the game (distro installers took forever back then!). Everyone loved it–except for the fact that it was supposed to auto-quit when the installation was finished but we changed the list of packages last minute so it got the count wrong.

A lot of people just kept playing for a long time without realizing the installation was done! (But it wasn’t too bad, they could just press Esc or something to quit and finish the install.)

Our early Gobo releases were full of little fun tweaks like that. In one release we included an emulator and legend has it that some hidden folder contains a ROM (not Tetris!), but not even I remember where that is, and that ISO probably isn’t even online anymore. (We really should have preserved our old stuff better!)

The memory of the Tetris installer in Gobo having a last-minute bug was another thing that dissuaded me from the idea of the Tetris Easter egg in htop: while having bugs is just normal, I couldn’t bear the thought of htop having some serious bug caused by code added for silly reasons…

htop has its fair share of “unnecessary code”, such as the “big-digit LCD” meter and the themes, which are more artsy than utilitarian and I stand by them. If anything, I think software in general should be more artsy.

But “hidden Tetris in htop causes buffer overflow” would be terrible PR for the project (and my reputation by extension, I guess). That, along with the bad taste that the idea of hiding code in FOSS left in my mouth, made me drop the Easter egg idea.

I wish I still had that code, though! If only to keep it to myself as an autobiographical side-note.

Come to think of it, after writing all of this I realize I probably _should_ have included that code… as a comment!! Maybe that’s the way to do Easter eggs in FOSS? Add a fun/silly feature but leave it commented out, so that someone tinkering with the code finds it, enables it and has fun with it for a bit. I know that *I* would have enjoyed finding something like that in a codebase.

Oh well, maybe someday I’ll pull this off in some project.

🔗 Finally got rid of a/ and b/ in git diff outputs!

You know how there are these little annoyances that are just mild enough that you do nothing about them?

In the world of open source there’s always this notion of “if you want something to be different, the code is there, you can change it”, but most often this is not practical: I would never go about carrying a patched version of Git with me to every machine I work on just because of the annoying `a/` and `b/` prefixes that show up on Git diffs.

But those tiny prefixes always made me unable to select and paste a filename with a double-click and a middle-click on the terminal.

Today, after who knows how many years, I decided to search for it (“I can’t be the only one annoyed by this, right?”) and lo and behold: someone did ask about this on StackOverflow, and there is a global configuration to disable those prefixes:

git config --global diff.noprefix true

And just like that, this annoyance is gone!

🔗 Turns out gcc has imperative argument handling

The Linux program with the most contrived argument handling logic ever has got to be gcc.

Everything in it has a reason, of course, but the end result is that you get a weird mix where the order matters for some args and not for others PLUS there are imperative arguments:

Say you want to link a static library into your program (I’m going to use [...] to skip other flags):

gcc -o myprogram [...] myprogram.c libmylibrary.a [...]

This works, but now you want to add plugins to your program. So you add some runtime dynamic linking logic and add -ldl.

Oops, you realize your plugins can’t find some symbols from the static library, only those already used by the main program. The compiler threw away everything from libmylibrary.a that was “unused”.

-Wl,--whole-archive to the rescue!

Wait, what’s that? Two flags joined by a comma?

Turns out gcc is a main driver command which launches other programs, and passes arguments along to them. -Wl,--something means that it will pass the flag --something to the linker. (You can add after -Wl, anything that is understood by ld, the GNU linker.)

But you have other libraries you’re linking as well, and now you start getting duplicated symbol errors when compiling, because it is linking too much stuff! The solution? Wait for it…

gcc [...stuff...] -Wl,--whole-archive libfoo.a -Wl,--no-whole-archive [...other libs...]

The arguments in gcc when dealing with linker options are not only positional, they are imperative!

And I mean that in a quite literal sense. They are interpreted like a sequence with side effects: you set a flag, the next libraries are affected by it, you unset the flag, the following libraries aren’t affected anymore.
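To make the "sequence with side effects" idea concrete, here is a toy model in Python. It is not how gcc or ld are actually implemented: `classify_link_args` is a hypothetical helper that only mimics the toggling behavior for the two flags discussed above, and `libbar.a` is a made-up second library.

```python
def classify_link_args(args):
    """Walk linker-related arguments left to right, like a tiny interpreter.

    Returns a list of (library, whole_archive) pairs. Toy model only: it
    understands just the two -Wl flags discussed in this post.
    """
    whole_archive = False  # mutable state that the flags switch on and off
    result = []
    for arg in args:
        if arg == "-Wl,--whole-archive":
            whole_archive = True
        elif arg == "-Wl,--no-whole-archive":
            whole_archive = False
        elif arg.endswith(".a"):
            # A static library is affected by whatever state is current
            result.append((arg, whole_archive))
    return result


args = ["-Wl,--whole-archive", "libfoo.a", "-Wl,--no-whole-archive", "libbar.a"]
for lib, whole in classify_link_args(args):
    print(lib, "-> whole archive" if whole else "-> only what is needed")
```

The same library name lands in a different bucket depending only on which flag was seen most recently, which is exactly what makes the arguments read like statements in a program rather than options.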

I thought find was a strong contender for Unix command with the weirdest argument handling, but I guess gcc takes the cake. 🍰

🔗 Conway’s Law applied to the industry as a whole

Melvin Conway famously said that organizations design systems that mirror their own communication structure. But how about Conway’s Law applied to the entire industry rather than a single company?

The tech industry, and open source (OSS) in particular, are mostly shaped now around the dominating communication structure — GitHub. Nadia Eghbal’s book “Working in Public” does a great job at explaining how OSS’s centralization around a big platform mirrors what happened everywhere on the internet, with us going from personal websites to social networks.

Another huge shift in organizational and communication structure, especially in Open Source, has been the increasing coalescence of maintainership: we historically talk about “a loosely-knit group of contributors” but most OSS nowadays is written by employees of big companies.

The commit stats in big projects like the Linux kernel indicate this, as do GitHub stats and the like. There’s a long tail of small independent contributors, of course, but by quantity major projects are dominated by those hired full-time to work on them.

One thing I haven’t seen discussed a lot is how much this reality changes the way projects are run and developed. Sometimes we see it coming up in particular cases, such as the relationship between Amazon and Rust, but this is a general phenomenon.

When Canonical came into the scene back in 2004-2005, I remember distinctly noticing their impact on OSS; it wasn’t just “more getting done” (yay?) but also what and how—various projects shifted direction around that time (GNOME comes to mind); it didn’t feel like a coincidence.

I don’t mean to imply it’s all bad, just that we don’t discuss enough how the influence of Big Co development styles affects, in a “Conway’s-law-way”, the development of OSS, and even tech in general, since both open and closed development are so linked nowadays.

OSS has a big impact on how tech in general works (through the reliance of every company on OSS dependencies), and Big Cos have an impact on how OSS works (through their huge presence in the OSS developer community), so in this way they affect everybody. People bring in the experiences they know and how they’re used to working, from coding styles to architecture and deployment patterns to decision processes.

One great example where this is more evident is the “monorepo” discussion, which happens to projects of many sizes nowadays, and where Google and FB experiences are often brought up.

“help our codebase is too big” no, your company is too big. try sharding into microservice entities operating as a cluster in the same management substrate rather than staying as a monolith

@myrrlyn on Twitter

The tweet above is such a great insight: we often see conversations about how to deal with huge codebases (using the likes of Google and FB as examples) AND we often see conversations about Big Tech monopolies — and how they’ve grown way beyond the status at which other monopolies were broken up in the past — but those two topics are hardly ever linked.

If we agree that some aspects of Big Tech as organizations are negative, how much of those do they bring into tech as technology practices via Conway’s Law? OSS seems to act as a filter that makes this relationship less evident, because contributions come from individuals, even though they work for these companies.

These individuals will often, even if unknowingly, replicate practices from these companies. This is, after all, a process of cultures spreading and influencing each other. It just seems to me that we as an industry are not aware enough of this phenomenon, and we probably should be more attuned to it.

🔗 Data Oriented Design, a.k.a. Lower Level Programming?

I’m not sure if this title is clickbaity, but it certainly summarizes some of the impressions I wanted to write about.

Yesterday I watched Andrew Kelley’s fun talk on Practical Data Oriented Design — do check it out! — and this post will contain some “spoilers” (as in, I will discuss his takeaways). I was drawn to the talk for two reasons: first, because I wanted to check if I was up-to-date on my programming TLAs, but also because he starts by talking about how he felt he had been stuck in a plateau as a programmer for the past decade — a feeling I’m sure many of us have felt at times! — and how this new knowledge got him out of it.

The bulk of the talk, and of his takeaways on refactoring his Zig compiler to use Data Oriented Design, is about how to get better runtime performance by making data structures smaller, so they are easier on the cache.

DOD techniques

Lots of the examples involved understanding struct alignment, to raise awareness of how much space gets wasted if you don’t take it into account. One way to deal with it is to replace 64-bit pointers with 32-bit array indices. He points out the assumption that we can then only have at most 4G items (which is often fair) and, most importantly, that type safety is lost once you no longer have a `MyStruct*` but just a `u32`. This comes along with moving from arrays of structures to structures of arrays, so you can pack data more tightly.
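As a rough illustration of the alignment point, Python's `ctypes` module lays out structures following the platform's C rules, so it can show the padding without leaving the interpreter. The `Padded` and `Reordered` structures and their field names are made up for this sketch; they are not taken from the talk.

```python
import ctypes


class Padded(ctypes.Structure):
    # Small fields on both sides of a large one force padding on typical ABIs
    _fields_ = [
        ("alive", ctypes.c_bool),
        ("health", ctypes.c_uint64),
        ("kind", ctypes.c_uint8),
    ]


class Reordered(ctypes.Structure):
    # Same fields, largest first, so the small ones sit next to each other
    _fields_ = [
        ("health", ctypes.c_uint64),
        ("alive", ctypes.c_bool),
        ("kind", ctypes.c_uint8),
    ]


print("Padded:   ", ctypes.sizeof(Padded), "bytes")
print("Reordered:", ctypes.sizeof(Reordered), "bytes")
print("pointer:  ", ctypes.sizeof(ctypes.c_void_p), "bytes")
print("u32 index:", ctypes.sizeof(ctypes.c_uint32), "bytes")
```

On a typical 64-bit system the first structure should come out at 24 bytes and the second at 16, with pointers at 8 bytes versus 4 for the index, but the exact numbers depend on the platform ABI, so it is worth running this rather than trusting the guess.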

Another method is to apply “encodings” of data to avoid additional booleans in structs. Instead of an `enum Creature { Elf, Orc }` and a boolean `isAlive`, you do an `enum Creature { AliveElf, DeadElf, AliveOrc, DeadOrc }`, effectively moving that bit of data into the byte used by the enum. This is no different than packing structures using bitfields. Combining this with the switch to arrays, you can possibly even avoid using that bit altogether, by keeping two arrays `dead_creatures` and `living_creatures`.
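Here is a small sketch of that encoding in Python, using an `IntEnum` stored in a `bytearray` so that each creature really takes one byte. The helper names `is_alive` and `kill` are assumptions for illustration, not something from the talk.

```python
from enum import IntEnum


class Creature(IntEnum):
    # The "alive" bit is folded into the tag instead of a separate boolean
    ALIVE_ELF = 0
    DEAD_ELF = 1
    ALIVE_ORC = 2
    DEAD_ORC = 3


def is_alive(c):
    return c in (Creature.ALIVE_ELF, Creature.ALIVE_ORC)


def kill(c):
    # Map each living variant to its dead counterpart; dead stays dead
    return {Creature.ALIVE_ELF: Creature.DEAD_ELF,
            Creature.ALIVE_ORC: Creature.DEAD_ORC}.get(c, c)


# One byte per creature: the tag alone carries both kind and liveness
creatures = bytearray([Creature.ALIVE_ELF, Creature.ALIVE_ORC])
creatures[1] = kill(Creature(creatures[1]))
print([Creature(b).name for b in creatures])  # ['ALIVE_ELF', 'DEAD_ORC']
```

The cost is visible right away: every new creature kind needs two enum entries and an update to both helpers, which is the kind of maintenance trade-off discussed further down.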

As he went through the various examples of refactors to reach this goal, one by one I kept getting this sense of deja vu: “hey, this is how we used to program in the olden days!”

8-bit coding

If you look at how assembly works for the 6502, the 8-bit processor used in the NES (my first game console) and the Apple II (my first computer!), you’ll see some of those tricks embedded in the processor design itself.

The 6502 is an 8-bit processor with a 16-bit address space: each instruction features a 1-byte opcode optionally followed by up to two bytes. Since the address space is 16 bits, addresses can go from 0 ($0000) to 65535 ($FFFF). So, to load a byte from memory position $1234 into the A register, you do a `LDA $1234`, which takes three bytes: `AD 34 12` (yes, the 6502 is little-endian!). However, to allow for more compact code, the first 256 bytes of memory have special processor support: addresses $0000 to $00FF, the “Zero Page”. So, just like in the enum trick for `AliveElf` and `DeadElf`, the “enum of opcodes” in the 6502 processor uses a separate number for loading from the Zero Page, so `LDA $0012` encodes into two bytes only: `A5 12`. This also reminds me of switching from pointers to integers, since that one-byte offset into the Zero Page is also a half-sized index that can be used given a set of assumptions.
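The two encodings can be sketched as a tiny assembler function. `encode_lda` is a hypothetical helper that always picks the shortest form; a real assembler typically lets you choose the addressing mode explicitly.

```python
LDA_ABSOLUTE = 0xAD   # LDA with a full 16-bit address
LDA_ZERO_PAGE = 0xA5  # LDA with a one-byte Zero Page address


def encode_lda(address):
    """Encode `LDA address` as 6502 machine code bytes."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address out of 16-bit range")
    if address <= 0xFF:
        # Zero Page: the opcode itself implies that the high byte is $00
        return bytes([LDA_ZERO_PAGE, address])
    # Absolute: low byte first, since the 6502 is little-endian
    return bytes([LDA_ABSOLUTE, address & 0xFF, address >> 8])


print(encode_lda(0x1234).hex(" ").upper())  # AD 34 12
print(encode_lda(0x0012).hex(" ").upper())  # A5 12
```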

Going from arrays of structs to structs of arrays is also a very old trick. In fact, I recall my earliest days of BASIC programming, where we didn’t have structs and only had arrays, so storing each “attribute” in its own array was essentially the only way: if I wanted to store x/y coordinates and a name for a bunch of characters, I’d have three arrays `XS`, `YS` and `NS$`. I also remember how, over time, using parallel arrays like this started to get frowned upon as “poor technique”, since arguably, code using arrays of structs is easier to read and maintain than that using structs of arrays, where you need to manually juggle more things in sync.
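To show what "juggling things in sync" looks like, here is a minimal side-by-side in Python. The `Character` class and the two removal helpers are invented for the comparison; the lists `xs`, `ys` and `names` stand in for the BASIC arrays.

```python
from dataclasses import dataclass


@dataclass
class Character:
    x: int
    y: int
    name: str


# Array of structs: one list, each element carries all of its attributes
characters = [Character(1, 2, "ELF"), Character(5, 7, "ORC")]

# Parallel arrays, BASIC-style: one list per attribute, same index in each
xs = [1, 5]
ys = [2, 7]
names = ["ELF", "ORC"]


def remove_character_aos(chars, i):
    del chars[i]  # a single operation keeps everything consistent


def remove_character_soa(xs, ys, names, i):
    # Every array has to be updated together, or the indices drift apart
    del xs[i]
    del ys[i]
    del names[i]


remove_character_aos(characters, 0)
remove_character_soa(xs, ys, names, 0)
print(characters[0].name, (xs[0], ys[0], names[0]))  # ORC (5, 7, 'ORC')
```

Forgetting one of the three `del` lines would not fail immediately; it would silently pair the wrong name with the wrong coordinates, which is why this style earned its reputation.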

Refactoring for performance

And this is a common theme: all those old-school techniques being reframed in the talk as Data Oriented Design were in fact one day the norm, and they started to be phased out in the name of ease of development and maintenance. Yes, they do result in faster code (sometimes much faster code!) if you restructure your code to count each byte and optimize for cache usage. But a key word there is restructure. Writing code this way makes sense when you know how the data is used, and how it will continue to be used. I was happy to see Andrew doing real-world measurements in his talk, and he correctly points out the assumptions involved, with comments such as “if we assume that most monsters are alive”, etc.

It’s very difficult to do this from the get-go, as you’re still iterating around your problem space. But once you know the typical behavior of the program, you can rework the data to match it. And yes, that will most likely give you a performance boost, but most often not without a cost in maintainability: how does that change in the structure change the client code that uses it?

Further, how hard would it be to change it all over again if the underlying assumptions change? For example, if the usage patterns change, if we port it over and the architecture changes, or if we need to add another bit of data into that structure. Sometimes those are important concerns, for example in a codebase of projects that change often and fast (think a startup evolving its product as market targets move), but sometimes projects reach a stage of maturity where you can step back, look at it and say: “Well, I think I have a good understanding of how this behaves now. What is the most memory-efficient representation for the data?”

Andrew’s case looks like a prime example of that. Once you get the tokenizer for a compiler done, you don’t really expect big seismic changes to its codebase (in fact, I think I could benefit from making some similar changes to my own Teal compiler!). Indeed, a compiler is a perfect project for these kinds of techniques: it’s fairly low-level and performance-critical code. If I recall correctly, Andrew used to work for a web company before Zig, so it makes sense that the style of code he gravitated towards before was higher-level than the one he’s excited about now.

What about maintenance

Optimizing code for performance always feels like a fun puzzle, but the maintenance cost is always in the back of my mind. Even in something like a compiler, making the code “as tight as possible” can backfire, if your implementation language does not allow for proper abstractions. The difficulties in adapting LuaJIT’s C codebase to the changes in newer versions of the Lua language come to mind. One such low-level trick in that codebase hinged on the fact that 32-bit address spaces were limited to 4GB, which allowed for some neat packing of data; that assumption, which was perfectly fair in the early 2000s, became central to the implementation. Of course, 64-bit systems arrived and assumptions changed. Getting rid of that limitation in a codebase full of smart data packing turned out to be a multi-year process.

Of course, if you can get a memory-efficient representation without hitting a maintenance cost, that’s the ideal situation. Some languages are better for this than others. I was impressed that Zig implements structs-of-arrays as MultiArrayList using apparently the same client interface as a regular ArrayList, such that changing from one to the other seems to be a “5-character change”. If you think of other languages that offer no such abstraction, that’s a much more impactful change throughout a codebase (think of all the places where you’d have to change a `monsters[i]->health` into `monster_healths[i]`, and how the memory management of those arrays and their contents change). I’ve also seen Edward Kmett pull some very cool tricks in Haskell combining super-efficient internal representations with very clean high-level abstractions.
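The idea of keeping the client interface stable while changing the storage can be sketched in Python as well. This is only an analogy and makes no claim about how Zig's `MultiArrayList` works internally; `MonsterColumns`, `Monster` and `total_health` are hypothetical names.

```python
from array import array
from collections import namedtuple

Monster = namedtuple("Monster", ["health", "x", "y"])


class MonsterColumns:
    """Stores each field in its own typed array, but indexes like a list."""

    def __init__(self):
        self._health = array("H")  # unsigned short
        self._x = array("i")
        self._y = array("i")

    def append(self, monster):
        self._health.append(monster.health)
        self._x.append(monster.x)
        self._y.append(monster.y)

    def __len__(self):
        return len(self._health)

    def __getitem__(self, i):
        # Reassemble a record on demand so that client code is unchanged
        return Monster(self._health[i], self._x[i], self._y[i])


def total_health(monsters):
    # Works with a plain list of Monster or with MonsterColumns
    return sum(monsters[i].health for i in range(len(monsters)))


as_list = [Monster(10, 0, 0), Monster(25, 3, 4)]
as_columns = MonsterColumns()
for m in as_list:
    as_columns.append(m)

print(total_health(as_list), total_health(as_columns))  # 35 35
```

Because `total_health` only relies on `len` and indexing, swapping the container is a one-line change at the call site, while the three parallel arrays stay hidden behind the class.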

In conclusion…

Still, I think it’s nice that some “old-school” techniques are getting a fresh coat of paint and are being revisited. We all benefit from being more performance conscious, and thinking about it also means thinking about when to do it.

There’s something to be said about bringing back “old-school” techniques for programming, though, especially for those of us old enough to remember them: the trade-offs for modern architectures are definitely different. Andrew raises a good point about memoization vs. recomputation: the kinds of things you should choose to memoize when coding for the 6502 processor on an NES are very different than those for a modern x86-64. So it’s actually good that those things are being rethought rather than just rehashed; there’s too much outdated advice out there, especially regarding performance.

The one piece of advice regarding performance that never gets old is: measure. And keep measuring, to see if the tricks you’re keen on using still make sense as the years go by! Another conclusion we get from this is that optimization and abstractions are not at odds with each other; in fact, combining them, across language and application levels, is the right way to do it, so that we can keep the performance and the high-level code. But that’s probably a subject for another time!

