hisham hm

🔗 What’s faster? Lexing Teal with Lua 5.4 or LuaJIT, by hand or with lpeg

I ran a small experiment where I ported my handwritten lexer written in Teal (which translates to Lua) to lpeg, then ran the compiler on the largest Teal source I have (itself).

lexer time / total time
LuaJIT+hw   -  36 ms / 291 ms
LuaJIT+lpeg -  40 ms / 325 ms
Lua5.4+hw   - 105 ms / 338 ms
Lua5.4+lpeg -  66 ms / 285 ms

These are average times from multiple runs of tl check tl.tl, done with hyperfine.

The “lexer time” was done by adding an os.exit(0) call right after the lexer pass. The “total time” is the full run, which includes additional lexer passes on some tiny code snippets that are generated on the fly, so the change between the lpeg lexer and the handwritten lexer affects it a bit as well.

Looks like on LuaJIT my handwritten parser beats lpeg, but the rest of the compiler (lots of AST manipulation and recursive walks) seems to run faster on Lua 5.4.2 than it does on LuaJIT 2.1-git.

Then I decided to move the os.exit(0) further, to after the parser step but before the type checking and code generation steps. These are the results:

lexer+parser time
LuaJIT+hw   -  107 ms ± 20 ms
LuaJIT+lpeg -  107 ms ± 21 ms
Lua5.4+hw    - 163 ms ± 3 ms
Lua5.4+lpeg -  120 ms ± 2 ms

The Lua 5.4 numbers I got seem consistent with the lexer-only tests: the same 40 ms gain was observed when switching to lpeg, which tells me the parsing step is taking about 55 ms with the handwritten parser. With LuaJIT, the handwritten parser seems to take about 65 ms — what was interesting though was the variance reported by hyperfine: Lua 5.4 tests gave be a variation in the order of 2-3 ms, and LuaJIT was around 20 ms.

I re-ran the “total time” tests running the full tl check tl.tl to observe the variance, and it was again consistent, with LuaJIT’s performance jitter (no pun intended!) being 6-8x that of Lua 5.4:

total time
LuaJIT+hw   -  299 ms ± 25 ms
LuaJIT+lpeg -  333 ms ± 31 ms
Lua5.4+hw    - 336 ms ± 4 ms
Lua5.4+lpeg -  285 ms ± 4 ms

My conclusion from this experiment is that converting the Teal parser from the handwritten one into lpeg would probably increase performance in Lua 5.4 and decrease the performance variance in LuaJIT (with the bulk of that processing happening in the more performance-reliable lpeg VM — in fact, the variance numbers of the “lexer tests” in lpeg mode are consistent between LuaJIT and Lua 5.4). It’s unclear to me whether an lpeg-based parser will actually run faster or slower than the handwritten one under LuaJIT — possibly the gain or loss will be below the 8-9% margin of variance we observe in LuaJIT benchmarks, so it’s hard to pay much attention to any variability below that range.

Right now, performance of the Teal compiler is not an immediate concern, but I wanted to get a feel of what’s within close reach. The Lua 5.4 + lpeg combo looks promising and the LuaJIT pure-Lua performance is great as usual. For now I’ll keep using the handwritten lexer and parser, if only to avoid a C-based dependency in the core compiler — it’s nice to be able to take the generated tl.lua from the Github repo, drop it into any Lua project, call tl.loader() to register the package loader and get instant Teal support in your require() calls!


Update: Turns out a much faster alternative is to use LuaJIT with the JIT disabled! Here are some rough numbers:

JIT + lpeg: 327ms
5.4: 310ms
JIT: 277ms
5.4 + lpeg: 274ms
JIT w/ http://jit.off: 173ms
JIT w/ http://jit.off + lpeg: 157ms

I have now implemented a function which disables the JIT on LuaJIT and temporarily disables garbage collection (which provides further speed ups) and merged it into Teal. The function was appropriately named:

turbo(true)

🔗 Parsing and preserving Teal comments

The Teal compiler currently discards comments when parsing, and it constructs an abstract syntax tree (AST) which it then traverses to generate output Lua code. This is fine for running the code, but it would be useful to have the comments around for other purposes. Two things come to mind:

  • JavaDoc/Doxygen/LDoc/(TealDoc?) style documentation generated from comments
  • a code formatter that preserves comments

Today I spent some time playing with the lexer and parser and AST, looking into preserving comments from the input, with these two goals in mind.

The compiler does separate lexer and parser steps, both handwritten, and the comments were discarded at the lexer stage. That bit was easy. As I consumed each comment, I stored it as a metadata item to the following token (what if there’s no following token? I’m currently dropping the final comment if it’s the last thing in the file, but that would be trivial to fix).

The hard part of dealing with comments is what to do with them when parsing: the parser stage reads tokens and builds the AST. The question then becomes, to which nodes attach each comment as metadata. This is way trickier than it looks, because comments can appear everywhere:

--[[ comment! ]] if --[[ comment! ]] x --[[ comment! ]] < --[[ comment! ]] 2 --[[ comment! ]]  then --[[ comment! ]]
   --[[ comment! ]]
   --[[ comment! ]] print --[[ comment! ]] ( --[[ comment! ]] x --[[ comment! ]]  ) --[[ comment! ]]
   --[[ comment! ]]
--[[ comment! ]]  end --[[ comment! ]]

After playing a bit with a bunch of heuristics to determine to which AST nodes I should be attaching comments in attempts to preserving them all, I came to the conclusion that this is not the way to go.

The point of the abstract syntax tree is to abstract away the syntax, meaning that all those purely syntactical items such as then and end are discarded. So storing information such as “there’s a –[[ comment! ]] right before “then” and then another one right after, and then another in the following line—no, in fact two of them!—before the next statement” are pretty much antithetical to the concept of an AST.

However, there are comments for which there is an easy match as to which AST node they should attach. And fortunately, those are sufficient for the “TealDoc” documentation generation: it’s easy to catch the comment that appears immediately before a statement node and attach to it. This way, we get pretty much the whole relevant ground covered: type definitions, function declarations, variable declarations. Then, to build a “TealDoc” tool, it would be a matter of traversing the AST and then writing a small separate parser to read the contents of the documentation comments (e.g. reading `@param` tags and the like). I chose to attach comments to statements and fields in data structures, and to stop there. Especially for a handwritten parser, adding comment handling code can add to the noise quickly.

As for the second use-case, a formatter/beautifier, I don’t think that going through the route of the AST is the right way to go. A formatter tool could share the compiler’s lexer, which does read and collect all tokens and comments, but then it should operate purely in the syntactic domain, without abstracting anything. I wrote a basic “token dumper” in the early days of Teal that did some of this work, but it hasn’t been maintained — as the language got more complex, it now needs some more context in order to do its work properly, so I’m guessing it needs a state machine and/or more lookahead. But with the comments preserved by the lexer, it should be possible to extend it into a proper formatter.

The code is in the preserve-comments branch. So, I think with these two additions (comments fully stored by the lexer, and comments partially attached to the AST by the parser), I think we have everything we need to serve as a good base for future documentation and formatter tooling. Any takers on these projects?

🔗 User power, not power users: htop and its design philosophy

What the principles that underlie the software you make?

This short story is not really about htop, or about the feature request that I’ll use as an illustration, but about what are the core principles that drive the development of a bit of software, down to every last detail. And by “core principles” I really mean it.

When we develop software, we have to make a million decisions. We’re often driven by some unspoken general principles, ranging from our personal aesthetics on the external visuals, to our sense of what makes a good UX in the product behavior, to things such as “where does bloat cross the line” in the engineering internals. In FOSS, we’re often wearing all these hats at the same time.

We don’t always have it clear in our mind what drives those principles, we often “just know”. There’s no better opportunity to assess those principles than when user feedback asks for a change in the behavior. And there’s no better way to explain to yourself why the change “feels wrong” than to put those principles in writing.

Today was one such opportunity.

I was peeking at the htop issue tracker as an end-user, which is a refreshing experience, having recently retired from this FOSS project I started. I spotted a feature request, asking for a change to make it hide threads by default.

The rationale was sensible:

People casually using htop usually have no idea what userland threads are for.
People who actually need to see them can easily enable them via SHIFT+H.

With them currently enabled by default, it is very inconvenient to go through the list and see what is running, taking up RAM, CPU usage and whatnot, therefore I think it’d be more user-friendly to not show them by default.

He proceeded to show two screenshots: one with the default behavior, full of threads (in green) mixed with the processes, and another with threads disabled.

When one of the current developers said that it’s easier for the user to figure out how to hide things than for them to discover that something hidden can be shown, the counter-argument was also sensible:

Htop can also show Disk IO, which can be arguably very useful, but is hidden by default.

At that point, I decided to put my “original author” hat on to explain what was the intent behind the existing behavior. Here’s what I wrote:


Hi @C0rn3j, I thought I’d drop by and give a bit of historical background as to what was my motivation for showing threads by default.

People casually using htop usually have no idea what userland threads are for.

Yes! I fully sympathize with this sentiment. And the choice for enabling threads and painting them green was deliberate.

You have no idea how many times I was asked “hey, why are some processes green?” over these 15+ years — and no, it wasn’t annoying: each of these times it was an opportunity to teach someone about threads!


(credit: xkcd)

htop was designed to provide a view to what’s going on in the system. If a process is spawning threads like crazy, or if all your cores are overwhelmed because multiple threads of a process are doing work, I think it’s fair to show it to the user, even if they don’t know a thing about threads — or I would say, especially if they don’t know a thing about threads, otherwise these things would be happening and they wouldn’t even know where to look.

Htop can also show Disk IO, which can be arguably very useful, but is hidden by default.

One of my last projects when I was still active in htop development was to add tabs to the interface, so that the user would have a more discoverable path for navigating through these additional columns:

This code is in the next branch of the old repo https://github.com/hishamhm/htop/ — I think the code for tabs is actually finished (though not super tested); it’s pretty nice, you can click on them or cycle with the Tab key, but the Perf Counters support which I was working on, never really got stable to a point of my liking, and that’s when I got busy with other things and drifted away from development — turns out I enjoy writing interface code a lot more than the systems monitoring part!

Anyway, I think that also illustrates the pattern that went into the design: I implemented the entire tabs feature because I wanted to make IO and Perf Counters more discoverable. I wanted to put that information in front of users faces, especially for users who don’t know about these things! I considered the fact that I had implemented the entire functionality of iotop inside htop but people didn’t know about it to be a personal failure, and a practical learning experience: adding systems functionality is useless if the UI design doesn’t get it to users’ hands. (And even I didn’t use the IO features because there was no convenient way of using them.)

I never wanted to implement a tool for “super advanced Linux power users who know what they doing”, otherwise I would have never spent a full line at the bottom showing F-keys, and I would have spent my time implementing Vim bindings (ugh ;) ) instead of mouse support. I’ve always made a point that every setting can be settable via the Setup screen, which you can access via F2-Setup (shown at the bottom line!) and which you can control with the keyboard or mouse, no “edit this config file to use this advanced feature for the initiated only” or even “read the man page” (in fact it only has a man page because Debian developers contributed it!).

I wanted to make a tool that has the power but doesn’t hide it from users, and instead invites users into becoming more powerful. A tool that reaches out its hand and guides them along the way, helping users to educate themselves about what’s happening on the internals of their system, and in this way have more control of it. This is how you hand power to users, not by erecting barriers of initiation or by giving them the bare minimum they need. I reject this dicothomy between “complicated tools for power users” and “stripped-down tools for mere mortals”, the latter being a design paradigm made popular by certain companies and unfortunately copied by so many OSS GUI projects, without realizing what the goals of that paradigm really were, but that’s another rant.

And that’s why threads are enabled by default, and colored in green!

(PS: and to put the point to the proof, I must say that the tabs feature was a much bigger code change than Perf Counters themselves; it included some quite big internal refactors (there’s no “toolkit”, everything is drawn “by hand” by htop) with unfortunately might make it difficult to ressurect that code (or not! who knows?), and of course tabs are user-definable and user-editable!)


The user who proposed the change in the defaults thanked me for the history tidbits and closed the feature request. But in some sense writing this down was probably more enlightening to me than it was for them.

When I started writing htop, I did not set out to create “an instrument for user empowerment” or any such lofty goal. (All I wanted was to have a top that scrolled and was more forgiving with mistypes than circa-2005 top!) But as I proceeded to write the software, every small decision had to come from somewhere, even if done without much deliberate thought, even if at the time it was just “it felt right”. That’s how your principles start to percolate into the project.

Over time, the picture I described in that reply above became clear to me. It helped me build practical guidelines, such as “every setting must be UI-accessible”, it helped me prioritize or even reject feature requests based on how much they aligned to those principles, and it helped me build a sense of purpose for the software I was working on.

If you don’t have it clear to yourself what are the principles that are foundational to the software you’re building, I recommend you to give this exercise a try — try to explain why the things in the software are the way they are. There are always principles, even if they are not conscious to you at the moment.

(PS: It would be awesome if some enterprising soul would dig down the tab support code and ressurrect it for htop 3! I don’t plan to do so myself any time soon, but all the necessary bits and pieces are there!)

🔗 Getting Amethyst up and running

I have played a bit with Rust in the past, but I’ve never done much with it.

To get myself more acquainted with the language, I’ve decided to try to write a small game with it, always a fun way to play with programming.

For that, I’ve started digging into the options and found Are we game yet?, a great resource with pointers for all things Rust and game development.

I’ve decided to try Amethyst for a game engine, as my first impressions with the documentation look quite good, and it will be nice to get updated with more modern game event management concepts.

I’ve followed the book’s instructions to get the latest stable Rust, and it also provides a 2D starter template repo. I managed to get it up and running, but I ran into a couple of hiccups, which I thought I’d document here.

First, I got this problem:

thread 'main' panicked at 'attempted to leave type `platform::platform::x11::util::input::PointerState` uninitialized, which is invalid', /rustc/d9a105fdd46c926ae606777a46dd90e5b838f92f/library/core/src/mem/mod.rs:659:9

Fortunately, 10 days ago someone at /r/rust_gamedev ran into the same thing, and the solution was to downgrade Rust from 1.48.0 to 1.46.0.

Doing that was simple:

rustup default 1.46.0

(Really impressed with how smooth the tooling is!)

I ran `cargo run` again, everything rebuilt, but then I ran into another problem:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Unsupported', /Users/hisham/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-backend-vulkan-0.3.3/src/lib.rs:326:9

There’s a ton of people reporting different errors for panicked at ‘called `Result::unwrap()` on an `Err` on the internet, but a search for “err value unsupported gfx-backend-vulkan” put me on the right track. Another Amethyst beginner fixed the same issue by installing the necessary Vulkan library.

The recipe for this library was not yet available for GoboLinux, but thanks to our friends in Gentoo, it was easy to read their ebuilds for vulkan-loader and vulkan-headers and build the equivalents for Gobo.

With the Vulkan library built, I tried again, and then it progressed further, but still failed. But then, the output itself told me what was missing and gave me a useful hint:

vulkan: No DRI3 support detected - required for presentation
Note: you can probably enable DRI3 in your Xorg config

Fixing that was easy, thanks to a very on-point blog post by an Arch user, which told me to edit my /etc/X11/xorg.conf and add the following entries for my Intel Xorg driver:

Section "Device"
   Identifier  "Intel Graphics"
   Driver      "intel"
   Option      "AccelMethod"  "sna"
   Option      "TearFree"     "true"
   Option      "DRI"          "3"
EndSection

I restarted X, and voilà! The 2D starter repo for Amethyst showed me window with graphics and accepted keyboard input.

Now let’s continue reading the Amethyst book to learn how to make a simple game in Rust! :)

🔗 Talking htop at the Changelog podcast

I was interviewed at the Changelog podcast about the surprising story of the maintainership transition in htop:

The Changelog 413: How open source saved htop – Listen on Changelog.com


Follow

🐘 MastodonRSS (English), RSS (português), RSS (todos / all)


Last 10 entries


Search


Admin