hisham hm

🔗 Parsing and preserving Teal comments

The Teal compiler currently discards comments when parsing, and it constructs an abstract syntax tree (AST) which it then traverses to generate output Lua code. This is fine for running the code, but it would be useful to have the comments around for other purposes. Two things come to mind:

  • JavaDoc/Doxygen/LDoc/(TealDoc?) style documentation generated from comments
  • a code formatter that preserves comments

Today I spent some time playing with the lexer and parser and AST, looking into preserving comments from the input, with these two goals in mind.

The compiler has separate lexer and parser steps, both handwritten, and comments were being discarded at the lexer stage. That bit was easy: as I consumed each comment, I stored it as metadata attached to the following token. (What if there’s no following token? I’m currently dropping a comment that is the last thing in the file, but that would be trivial to fix.)
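
To make this concrete, here’s a minimal sketch of the idea in plain Lua. This is not the actual tl.lua code; the token shape (`kind`, `text`) and the `comments` field are made up for illustration.

-- buffer comments and attach them as metadata to the next real token
local function attach_comments(raw_tokens)
   local tokens = {}
   local pending = {}
   for _, tk in ipairs(raw_tokens) do
      if tk.kind == "comment" then
         table.insert(pending, tk.text)
      else
         if #pending > 0 then
            tk.comments = pending   -- metadata on the following token
            pending = {}
         end
         table.insert(tokens, tk)
      end
   end
   -- a comment at the very end of the file has no following token,
   -- so it is silently dropped here (the easy-to-fix case mentioned above)
   return tokens
end

-- usage with hand-made tokens:
local toks = attach_comments({
   { kind = "comment", text = "-- doc for x" },
   { kind = "name",    text = "x" },
})
print(toks[1].comments[1])   --> -- doc for x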

The hard part of dealing with comments is what to do with them when parsing: the parser stage reads tokens and builds the AST. The question then becomes: to which nodes should each comment be attached as metadata? This is way trickier than it looks, because comments can appear everywhere:

--[[ comment! ]] if --[[ comment! ]] x --[[ comment! ]] < --[[ comment! ]] 2 --[[ comment! ]]  then --[[ comment! ]]
   --[[ comment! ]]
   --[[ comment! ]] print --[[ comment! ]] ( --[[ comment! ]] x --[[ comment! ]]  ) --[[ comment! ]]
   --[[ comment! ]]
--[[ comment! ]]  end --[[ comment! ]]

After playing a bit with a bunch of heuristics to determine which AST nodes I should attach comments to, in an attempt to preserve them all, I came to the conclusion that this is not the way to go.

The point of the abstract syntax tree is to abstract away the syntax, meaning that all those purely syntactical items such as then and end are discarded. So storing information such as “there’s a --[[ comment! ]] right before “then” and then another one right after, and then another in the following line—no, in fact two of them!—before the next statement” is pretty much antithetical to the concept of an AST.

However, there are comments for which there is an easy match as to which AST node they should attach to. And fortunately, those are sufficient for the “TealDoc” documentation use case: it’s easy to catch the comment that appears immediately before a statement node and attach it to that node. This way, we get pretty much all the relevant ground covered: type definitions, function declarations, variable declarations. Then, building a “TealDoc” tool would be a matter of traversing the AST and writing a small separate parser to read the contents of the documentation comments (e.g. reading `@param` tags and the like), as sketched below. I chose to attach comments to statements and to fields in data structures, and to stop there. Especially in a handwritten parser, comment-handling code can add to the noise quickly.
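
To give an idea of what that traversal could look like, here’s a rough sketch in plain Lua. The AST node shape (`kind`, `name`, `comments`) and the helper names are hypothetical, not the real Teal AST.

-- parse @param tags out of a documentation comment
local function parse_doc_comment(text)
   local doc = { params = {} }
   for line in text:gmatch("[^\n]+") do
      local pname, pdesc = line:match("@param%s+(%S+)%s+(.*)")
      if pname then
         doc.params[pname] = pdesc
      end
   end
   return doc
end

-- walk top-level statements and report documented function declarations
local function tealdoc(program)
   for _, node in ipairs(program) do
      if node.kind == "function" and node.comments then
         local doc = parse_doc_comment(table.concat(node.comments, "\n"))
         print("function " .. node.name)
         for pname, pdesc in pairs(doc.params) do
            print("  param " .. pname .. ": " .. pdesc)
         end
      end
   end
end

-- usage with a hand-made “AST”:
tealdoc({
   { kind = "function", name = "add",
     comments = { "--- adds two numbers",
                  "-- @param a first operand",
                  "-- @param b second operand" } },
})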

As for the second use case, a formatter/beautifier, I don’t think the AST is the right route. A formatter tool could share the compiler’s lexer, which does read and collect all tokens and comments, but then it should operate purely in the syntactic domain, without abstracting anything away. I wrote a basic “token dumper” in the early days of Teal that did some of this work, but it hasn’t been maintained — as the language got more complex, it now needs more context in order to do its work properly, so I’m guessing it needs a state machine and/or more lookahead. But with the comments preserved by the lexer, it should be possible to extend it into a proper formatter.
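
As a toy illustration of that token-stream approach, a dumper that re-emits tokens along with their attached comments could look something like this (same made-up token shape as in the sketch above; a real formatter would also have to track indentation and line breaks):

-- re-emit tokens, putting each token's attached comments back in front of it
local function dump(tokens)
   local out = {}
   for _, tk in ipairs(tokens) do
      if tk.comments then
         for _, c in ipairs(tk.comments) do
            table.insert(out, c)
         end
      end
      table.insert(out, tk.text)
   end
   return table.concat(out, " ")
end

print(dump({
   { comments = { "--[[ comment! ]]" }, text = "if" },
   { text = "x" }, { text = "<" }, { text = "2" }, { text = "then" },
}))
--> --[[ comment! ]] if x < 2 then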

The code is in the preserve-comments branch. So, with these two additions (comments fully stored by the lexer, and comments partially attached to the AST by the parser), I think we have everything we need to serve as a good base for future documentation and formatter tooling. Any takers on these projects?

🔗 User power, not power users: htop and its design philosophy

What are the principles that underlie the software you make?

This short story is not really about htop, or about the feature request that I’ll use as an illustration, but about the core principles that drive the development of a piece of software, down to every last detail. And by “core principles” I really mean it.

When we develop software, we have to make a million decisions. We’re often driven by some unspoken general principles, ranging from our personal aesthetics on the external visuals, to our sense of what makes a good UX in the product behavior, to things such as “where does bloat cross the line” in the engineering internals. In FOSS, we’re often wearing all these hats at the same time.

We don’t always have it clear in our minds what drives those principles; we often “just know”. There’s no better opportunity to assess those principles than when user feedback asks for a change in behavior. And there’s no better way to explain to yourself why the change “feels wrong” than to put those principles in writing.

Today was one such opportunity.

Having recently retired from this FOSS project I started, I was peeking at the htop issue tracker as an end user, which is a refreshing experience. I spotted a feature request asking for htop to hide threads by default.

The rationale was sensible:

People casually using htop usually have no idea what userland threads are for.
People who actually need to see them can easily enable them via SHIFT+H.

With them currently enabled by default, it is very inconvenient to go through the list and see what is running, taking up RAM, CPU usage and whatnot, therefore I think it’d be more user-friendly to not show them by default.

He proceeded to show two screenshots: one with the default behavior, full of threads (in green) mixed with the processes, and another with threads disabled.

When one of the current developers said that it’s easier for the user to figure out how to hide things than for them to discover that something hidden can be shown, the counter-argument was also sensible:

Htop can also show Disk IO, which can be arguably very useful, but is hidden by default.

At that point, I decided to put my “original author” hat on to explain the intent behind the existing behavior. Here’s what I wrote:


Hi @C0rn3j, I thought I’d drop by and give a bit of historical background on my motivation for showing threads by default.

People casually using htop usually have no idea what userland threads are for.

Yes! I fully sympathize with this sentiment. And the choice for enabling threads and painting them green was deliberate.

You have no idea how many times I was asked “hey, why are some processes green?” over these 15+ years — and no, it wasn’t annoying: each of these times it was an opportunity to teach someone about threads!


(credit: xkcd)

htop was designed to provide a view into what’s going on in the system. If a process is spawning threads like crazy, or if all your cores are overwhelmed because multiple threads of a process are doing work, I think it’s fair to show it to the user, even if they don’t know a thing about threads — or I would say, especially if they don’t know a thing about threads, otherwise these things would be happening and they wouldn’t even know where to look.

Htop can also show Disk IO, which can be arguably very useful, but is hidden by default.

One of my last projects when I was still active in htop development was to add tabs to the interface, so that the user would have a more discoverable path for navigating through these additional columns.

This code is in the next branch of the old repo https://github.com/hishamhm/htop/ — I think the code for tabs is actually finished (though not super tested); it’s pretty nice, you can click on them or cycle with the Tab key. But the Perf Counters support I was working on never really got stable to a point of my liking, and that’s when I got busy with other things and drifted away from development — turns out I enjoy writing interface code a lot more than the systems monitoring part!

Anyway, I think that also illustrates the pattern that went into the design: I implemented the entire tabs feature because I wanted to make IO and Perf Counters more discoverable. I wanted to put that information in front of users’ faces, especially for users who don’t know about these things! I considered the fact that I had implemented the entire functionality of iotop inside htop but people didn’t know about it to be a personal failure, and a practical learning experience: adding systems functionality is useless if the UI design doesn’t get it into users’ hands. (And even I didn’t use the IO features, because there was no convenient way of using them.)

I never wanted to implement a tool for “super advanced Linux power users who know what they’re doing”, otherwise I would never have spent a full line at the bottom showing F-keys, and I would have spent my time implementing Vim bindings (ugh ;) ) instead of mouse support. I’ve always made a point that every setting is settable via the Setup screen, which you can access via F2-Setup (shown at the bottom line!) and which you can control with the keyboard or mouse; no “edit this config file to use this advanced feature for the initiated only”, and not even “read the man page” (in fact it only has a man page because Debian developers contributed it!).

I wanted to make a tool that has the power but doesn’t hide it from users, and instead invites users into becoming more powerful. A tool that reaches out its hand and guides them along the way, helping users educate themselves about what’s happening in the internals of their system, and in this way have more control over it. This is how you hand power to users: not by erecting barriers of initiation or by giving them the bare minimum they need. I reject this dichotomy between “complicated tools for power users” and “stripped-down tools for mere mortals”, the latter being a design paradigm made popular by certain companies and unfortunately copied by so many OSS GUI projects without realizing what the goals of that paradigm really were, but that’s another rant.

And that’s why threads are enabled by default, and colored in green!

(PS: and to put the point to the proof, I must say that the tabs feature was a much bigger code change than Perf Counters themselves; it included some quite big internal refactors (there’s no “toolkit”, everything is drawn “by hand” by htop) which unfortunately might make it difficult to resurrect that code (or not! who knows?), and of course tabs are user-definable and user-editable!)


The user who proposed the change in the defaults thanked me for the history tidbits and closed the feature request. But in some sense writing this down was probably more enlightening to me than it was for them.

When I started writing htop, I did not set out to create “an instrument for user empowerment” or any such lofty goal. (All I wanted was to have a top that scrolled and was more forgiving with mistypes than circa-2005 top!) But as I proceeded to write the software, every small decision had to come from somewhere, even if done without much deliberate thought, even if at the time it was just “it felt right”. That’s how your principles start to percolate into the project.

Over time, the picture I described in that reply above became clear to me. It helped me build practical guidelines, such as “every setting must be UI-accessible”, it helped me prioritize or even reject feature requests based on how much they aligned to those principles, and it helped me build a sense of purpose for the software I was working on.

If it isn’t clear to you what principles are foundational to the software you’re building, I recommend giving this exercise a try: try to explain why the things in the software are the way they are. There are always principles, even if they are not conscious to you at the moment.

(PS: It would be awesome if some enterprising soul would dig up the tab support code and resurrect it for htop 3! I don’t plan to do so myself any time soon, but all the necessary bits and pieces are there!)

🔗 Getting Amethyst up and running

I have played a bit with Rust in the past, but I’ve never done much with it.

To get myself more acquainted with the language, I’ve decided to try to write a small game with it, always a fun way to play with programming.

For that, I’ve started digging into the options and found Are we game yet?, a great resource with pointers for all things Rust and game development.

I’ve decided to try Amethyst for a game engine, as my first impressions of its documentation were quite good, and it will be nice to catch up with more modern game event management concepts.

I followed the book’s instructions to get the latest stable Rust, and used the 2D starter template repo that the book also provides. I managed to get it up and running, but I ran into a couple of hiccups, which I thought I’d document here.

First, I got this problem:

thread 'main' panicked at 'attempted to leave type `platform::platform::x11::util::input::PointerState` uninitialized, which is invalid', /rustc/d9a105fdd46c926ae606777a46dd90e5b838f92f/library/core/src/mem/mod.rs:659:9

Fortunately, 10 days ago someone at /r/rust_gamedev ran into the same thing, and the solution was to downgrade Rust from 1.48.0 to 1.46.0.

Doing that was simple:

rustup default 1.46.0

(Really impressed with how smooth the tooling is!)

I ran `cargo run` again, everything rebuilt, but then I ran into another problem:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Unsupported', /Users/hisham/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-backend-vulkan-0.3.3/src/lib.rs:326:9

There’s a ton of people on the internet reporting different errors for “panicked at 'called `Result::unwrap()` on an `Err` value'”, but a search for “err value unsupported gfx-backend-vulkan” put me on the right track. Another Amethyst beginner had fixed the same issue by installing the necessary Vulkan library.

The recipe for this library was not yet available for GoboLinux, but thanks to our friends in Gentoo, it was easy to read their ebuilds for vulkan-loader and vulkan-headers and build the equivalents for Gobo.

With the Vulkan library built, I tried again. It progressed further but still failed; this time, though, the output itself told me what was missing and gave me a useful hint:

vulkan: No DRI3 support detected - required for presentation
Note: you can probably enable DRI3 in your Xorg config

Fixing that was easy, thanks to a very on-point blog post by an Arch user, which told me to edit my /etc/X11/xorg.conf and add the following entries for my Intel Xorg driver:

Section "Device"
   Identifier  "Intel Graphics"
   Driver      "intel"
   Option      "AccelMethod"  "sna"
   Option      "TearFree"     "true"
   Option      "DRI"          "3"
EndSection

I restarted X, and voilà! The 2D starter repo for Amethyst showed me a window with graphics and accepted keyboard input.

Now let’s continue reading the Amethyst book to learn how to make a simple game in Rust! :)

🔗 Talking htop at the Changelog podcast

I was interviewed on the Changelog podcast about the surprising story of the maintainership transition in htop:

The Changelog 413: How open source saved htop – Listen on Changelog.com

🔗 Dynamic type systems aren’t even simpler

Alexis King just published a great blog post titled “No, dynamic type systems are not inherently more open”.

That reminded me of the talk I gave last year at FOSDEM, titled “Minimalism versus types”, where I advocated for static types from a slightly different angle. I tried to convince people that types in dynamically-typed programs are often more complicated than people realize. And often more complicated than in typical statically-typed languages.

People often say the opposite, that static type systems are more complicated, and dynamically-typed languages are simpler. At the surface level, this seems true: in a dynamic world you just go merrily declaring variables, assigning values and doing things with them, without ever having to write down any types, no matter how trivial or complex they are. Things can’t get any simpler in the typing department than “doing nothing”, right?

Well, types are nothing more than the shapes and allowed behaviors of your data. You have shapes and behaviors in any program, regardless of the language; so you have types, whether you write them down or not. They are there, even in assembly language, even if only at a conceptual level, as the sets of “valid values” your program can manipulate. And you have to think about them, and they have to make sense, and they have to do the right thing. So, in short, in a correct dynamically-typed program the types need to be just as correct as they are in a statically-typed one (or else you’ll get runtime errors).

In other words, the types are there, but you have to run the type checker in your head. And you know what’s funny? When people don’t write down the types, they often end up with types that are more complicated than the types from people who do write them. The complexity just goes under the radar, so it piles up.

One day you open that module which you haven’t touched in six months, and you see a function call where the third argument is null. You need to remember what kinds of values you can pass as that third argument, or read the whole source code to figure it out. You follow through the code to see all the places that third argument is used, and realize that its accepted type depends on what you give to the second argument. Congratulations, you’re dealing with a dependent type, which means you’ve just surpassed Haskell in terms of type system complexity. Compilers that deal with this kind of type system are so complex they are effectively proof assistants (and are at the forefront of programming language research), and here you are dealing with those data types with your brain (and your faith in your ability to come up with sufficient tests) alone.
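
Here’s a made-up Lua example of the kind of thing I mean (no real codebase in mind): what the third argument must be depends on the value of the second one, and nothing but your head and your tests will check it.

-- the valid type of `data` depends on the value of `shape`
local function draw(canvas, shape, data)
   if shape == "circle" then
      -- here `data` must be a number (the radius)
      return canvas .. ": circle of radius " .. data
   elseif shape == "rect" then
      -- here `data` must be a table with width and height fields
      return canvas .. ": rect " .. data.width .. "x" .. data.height
   end
end

print(draw("screen", "circle", 10))
print(draw("screen", "rect", { width = 3, height = 4 }))
-- draw("screen", "circle", { width = 3 }) only blows up at runtime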

Given that there is no mechanical type checker to prescribe what is expressible, and that the dynamic runtime will accept anything as long as the program doesn’t crash, when doing typechecking in your head you essentially have the world’s most powerful and complicated type checker at your disposal. And once you start making use of this power, you end up dealing with the world’s most complicated type system.

And when you give people expressive power, they use it. In my experience, people go wild constructing complicated structures in dynamic languages that they normally wouldn’t in static languages. It’s not that static languages are less powerful (Turing equivalence, blah blah), but they make the things you’re doing more obvious to you (Alexis’s post has some great examples). In a dynamically-typed program, people are all too keen to make variables and data structures perform double or triple duty (“this is an X but also a Y under circumstances Z”), but when they have to write down what they’re doing as types, it’s like a little conscience check, and they think twice before creating a more complex type for something that could be described in a simpler way (simple example: they’ll probably make two plain functions instead of one function that takes a string argument that changes the behavior of the other arguments). Static types nudge you towards simpler, less “clever” solutions (and we all know what kind of solution is more maintainable in the long run).
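
Sticking with the made-up `draw` example from above, the “two plain functions” version is the one where each signature can be written down as a simple, honest type (shown here as comments, Teal-style, since the snippet is plain Lua):

-- draw_circle: function(canvas: string, radius: number): string
local function draw_circle(canvas, radius)
   return canvas .. ": circle of radius " .. radius
end

-- draw_rect: function(canvas: string, width: number, height: number): string
local function draw_rect(canvas, width, height)
   return canvas .. ": rect " .. width .. "x" .. height
end

print(draw_circle("screen", 10))
print(draw_rect("screen", 3, 4))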

But okay, let’s assume we avoid “clever” and pick the same solutions in both. Writing the same program in a static or a dynamic language to process the same data in the same way, you will end up with values of roughly the same types in both. Whether the variables have static types written down or not does not change that.

“But in a dynamic language I don’t have to write the types! It’s less work!”

“Not having to” write types but having to think about them anyway is like doing math while “not having to” write anything down, doing all the calculations in your head. How is it an advantage not to use pen and paper to track your train of thought as you do a complex calculation, and instead be restricted to doing it only in your head? And how is it an advantage not to have a mechanical tool — like a calculator, which can actually compute the things you wrote down — to check whether what you wrote with pen and paper makes sense?

I’m lazy, so I hate doing any math in my head. I’ll take writing things down and have a machine check it for me any day of the week. Why wouldn’t I want the same when programming? That’s what computers are for, right? To save us from computing things in our head. So I’ll write my types, and have the compiler check whether they make sense, thank you very much. It’s less work.

