hisham hm

Fun hack to redirect stdout and stderr in order

Prologue

This is an anecdote about roundabout ways to get stuff done. Pierre mentioned in the comments below that a proper way to solve this is to use unbuffer. But if you want to read the improper way to do this, read on! :)

The story

Due to buffering (typically done by the program’s own I/O library, not by the terminal or the shell), the order of stdout and stderr of a program gets mixed up when redirecting to a file or another program. The outputs of both descriptors come out in the correct order relative to each other when printing straight to the terminal:

] ./my_program
stdout line 1
stdout line 2
stderr line 1
stdout line 3
stderr line 2
stderr line 3

This doesn’t change the order:

] ./my_program 2>&1
stdout line 1
stdout line 2
stderr line 1
stdout line 3
stderr line 2
stderr line 3

but it changes the order when saving to a file or redirecting to any program:

] ./my_program 2>&1 | cat
stderr line 1
stderr line 2
stderr line 3
stdout line 1
stdout line 2
stdout line 3

This behavior is the same in three shells I tested (bash, zsh, dash).
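To poke at this without my_program, here is a minimal sketch in Python. It assumes a small Python child process standing in for my_program (the CHILD script and the run_child helper are made up for this example), with both streams merged into one pipe, much like 2>&1 | cat. Running the child with -u turns off Python's output buffering, which keeps the writes in program order; without -u the order depends on the interpreter's buffering, so no particular result is claimed for that case.

```python
import subprocess
import sys

# Hypothetical stand-in for ./my_program: alternates between stdout and stderr.
CHILD = r'''
import sys
for i in (1, 2):
    print(f"stdout line {i}")
    print(f"stderr line {i}", file=sys.stderr)
'''

def run_child(unbuffered):
    """Run the child with stderr merged into the stdout pipe (like 2>&1 | cat)."""
    cmd = [sys.executable]
    if unbuffered:
        cmd.append("-u")  # force unbuffered stdout and stderr in the child
    cmd += ["-c", CHILD]
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return result.stdout.splitlines()

if __name__ == "__main__":
    print("buffered:  ", run_child(unbuffered=False))
    print("unbuffered:", run_child(unbuffered=True))
```

The same lines show up in both runs; only the unbuffered run is guaranteed to keep them interleaved the way the program wrote them.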

A weird “solution”

I wanted to save the log while preserving the order of events. So I ended up with this evil hack:

] strace -ewrite -o trace.txt -s 2048 ./my_program; sed 's,^[^"]*"\(.*\)"[^"]*$,\1,g;s,\\n,,g;' trace.txt > mytrace.txt
] cat mytrace.txt
stdout line 1
stdout line 2
stderr line 1
stdout line 3
stderr line 2
stderr line 3
+++ exited with 0 +++

It turns out that strace does log each write in the correct order, so I’m catching the write syscall.

Note the limitations: it truncates lines to 2048 characters (good enough for my logs) and I was simply cutting off \n and not cleaning up any other escape characters. But it worked well enough so I could read my ordered logs in a text editor!
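If the leftover escape characters bother you, the cleanup can also be sketched in Python instead of sed. This is only a sketch: it assumes trace lines shaped like write(1, "stdout line 1\n", 14) = 14, as produced by a plain strace -ewrite run without -f, and the names WRITE_LINE, unescape and ordered_output are made up for this example.

```python
import re

# Assumed line shape (plain `strace -ewrite -o trace.txt`, no -f):
#   write(1, "stdout line 1\n", 14)          = 14
# A trailing "..." after the closing quote marks a string truncated by -s.
WRITE_LINE = re.compile(r'^write\((\d+), "(.*)"(?:\.\.\.)?, \d+\)\s*= ')

def unescape(text):
    """Decode the C-style escapes strace uses (newline, tab, quote, octal bytes)."""
    raw = text.encode("ascii").decode("unicode_escape")
    # Escaped bytes come back as Latin-1 code points; reassemble them as UTF-8.
    return raw.encode("latin-1").decode("utf-8", errors="replace")

def ordered_output(trace_lines):
    """Yield (fd, text) for every write() in trace order, skipping other lines."""
    for line in trace_lines:
        match = WRITE_LINE.match(line)
        if match:
            yield int(match.group(1)), unescape(match.group(2))
```

Something like "".join(text for _, text in ordered_output(open("trace.txt"))) would then give the log back in order, and the file descriptor is still there if you want to tell stdout from stderr.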


You can’t automate SemVer, or: There is no way around Rice’s Theorem

Rice’s Theorem, proved in 1951, states that it is impossible to write a program that performs precisely any non-trivial analysis of the execution of other programs. More precisely, it’s impossible to code an analyzer for some non-trivial property that is able to decide whether any given analyzed program has that property or not. And by “trivial property” we mean a property that either _all_ algorithms in the world have or _none_ has. So, yeah, “non-trivial property” is basically any property you can think of: “does it ever calculate 5 + 2”, “does it always use less than 10MB of memory”, “does it ever print something to the screen”, “does it ever access the network”?

At this point you might say “wait! I can write a program that checks if programs access the network or not! We can parse the code and if there are no calls whatsoever in it to any networking code such as connect(), then it doesn’t access the network!”. Sure, you can do that: but if the code has calls to connect(), you can’t decide for sure that it will access the network when it’s executed.
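To make that concrete, here is a minimal sketch of such a syntactic check in Python, using the standard ast module. The helper mentions_call and both sample programs are made up for this example, and is_counterexample is a hypothetical function that the sample never defines: the checker only looks at the code's shape, it never runs it.

```python
import ast

def mentions_call(source, name):
    """Return True if `source` contains a syntactic call to `name` anywhere."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # Plain calls have .id (connect(...)); method calls have .attr (s.connect(...)).
            called = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if called == name:
                return True
    return False

# No call to connect() at all: we can safely answer "never accesses the network".
NO_NETWORK = "print('hello')"

# A call to connect() exists, but whether it ever runs depends on whether the
# search loop finds anything, which the syntactic check cannot know.
MAYBE_NETWORK = '''
def search():
    n = 4
    while True:
        if is_counterexample(n):
            connect("example.org")
            return
        n += 2
'''

if __name__ == "__main__":
    print(mentions_call(NO_NETWORK, "connect"))
    print(mentions_call(MAYBE_NETWORK, "connect"))
```

A False here is a real guarantee (under the assumption that connect is the only way to reach the network); a True only means "maybe".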

In 1936 Alan Turing proved that it is impossible to write a program that solves the Halting Problem, that is, to write an analyzer that checks any program and tells whether it always terminates (“halts”) or might enter an endless loop given some specific input. Okay, that’s a classic result, but that’s one property, how can Rice’s Theorem say we can’t make an analyzer for any property at all, even the silliest ones?

The proof for this amazingly powerful theorem is surprisingly simple. Turns out that if we had an analyzer for any silly property, we could use it to make a Halting Problem analyzer (which Turing proved to be impossible). Like this:

bool my_halting_problem_analyzer(Code analyzedProgram) {
   Code modifiedProgram = analyzedProgram + "; someCodeWithSillyProperty();"
   return my_silly_property_analyzer(modifiedProgram);
}

If the code in analyzedProgram always terminates, then the code in modifiedProgram will always reach the part that has the silly property, so my_silly_property_analyzer will return true, and my_halting_problem_analyzer returns true as well. If there is some input that makes the analyzedProgram hang in a loop, that means there’s some input that makes the silly property fail, resulting in false. Yay, we solved the Halting Problem using the silly property analyzer! Not.

Of course, this explanation is quite simplified, so head to Wikipedia and your favorite formal languages book for the precise details. But the point stands that general semantic analysis of programs is impossible.

In particular, you can’t write a program that takes versions 1.0 and 1.1 of any program X and answers the question: “do they behave the same?”. In other words, it’s impossible to write an analyzer that looks at your master branch before you make a release and answers the question “should your new release tag be a major, minor or tiny release” according to the rules of SemVer (or any other API-compatibility-bound set of rules, for that matter).

This is because API compatibility is not only based on syntactically-expressible issues (that is, type signatures for functions and data structures). Any semantic changes to the code also break compatibility. A function may change its behavior but not its type signature (it still returns a string, but it used to be lower-case and it’s now upper-case), a struct can change the way it is used while the fields remain the same (field foo returned numbers from 0 to 10 and -1 when executed on Sundays, now it returns -1 on Saturdays as well). An automated tool won’t catch all this.

So, it is possible only to write a “pessimistic” tool, that may detect lots of situations syntactically and give the bad news: “hey, you must increment the major version here!”. But you can’t write a tool that is always able to look at code semantically and give the good news: “I assure you that no API behaviors have changed, you can safely name this a tiny version increase.”
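A toy version of such a pessimistic tool could look like the sketch below. It assumes the API is just the set of top-level public Python functions and their parameter names; the functions public_signatures and pessimistic_bump and the sample sources are made up for this example. Note what it returns when the signatures match: not “tiny”, only “unknown”.

```python
import ast

def public_signatures(source):
    """Map each top-level public function name to its list of parameter names."""
    signatures = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef) and not node.name.startswith("_"):
            signatures[node.name] = [arg.arg for arg in node.args.args]
    return signatures

def pessimistic_bump(old_source, new_source):
    """Give bad news when it is syntactically visible; never promise good news."""
    old = public_signatures(old_source)
    new = public_signatures(new_source)
    for name, params in old.items():
        if new.get(name) != params:  # function removed or its parameters changed
            return "major"
    if set(new) - set(old):          # something was added to the API
        return "at least minor"
    return "unknown"                 # same signatures, behavior may still differ

V1_0 = "def greet(name):\n    return 'hello ' + name.lower()\n"
V1_1 = "def greet(name):\n    return 'HELLO ' + name.upper()\n"  # same signature, new behavior
V2_0 = "def greet(name, punctuation):\n    return 'hello ' + name + punctuation\n"
```

The lower-case to upper-case change between V1_0 and V1_1 is exactly the kind of break that slips through: the signatures are identical, so the tool can only shrug.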

Yes, you can use test suites as an approximation for detecting semantic changes in API behaviors beyond type signatures and data structures. That would certainly improve your pessimistic analyzer: you’d be able to detect more situations where “you must increment major”. But even then it can only go so far, because in practice one can’t test for every possible input/output combination, so you still can’t be 100% sure. Fuzz testing has uncovered bugs and unexpected behaviors even in programs with extensive test suites; as Dijkstra famously said, “Testing shows the presence, not the absence of bugs.” Likewise, test suites can show inconsistencies with the API specification, but not adherence to it. So they can’t be taken to represent the semantics of a program entirely.

Anyway, at the end of the day, Rice’s Theorem shows us that general bullet-proof analysis of program behavior is not attainable, so no tool will ever be able to compare codebases and always tell us precisely that a new release is really “tiny-safe”. Semantic versioning just can’t be automated.


htop 2.0 released!

This week I finally released htop 2.0.0!

What’s new in htop 2.0

As of version 2.0, htop is cross-platform! Check out the video and slides of my presentation at FOSDEM 2016 about how this came to be. This release includes code supporting Linux, FreeBSD, OpenBSD and Mac OS X.

There are also, of course, some new features:

…And of course, lots of other tweaks and fixes!

Changelog

The changelog with the main new changes follows below. Special thanks
to everyone who contributed to this release, through bug reports, bug
fixes, new features and financial support for the platform abstraction
layer project!


String interpolation in Lua

Lua is known for having a very lean standard library, and for providing mechanisms to do things instead of a ton of features.

String interpolation isn’t available out of the box, but doing it in Lua isn’t a new trick. In fact, the manual includes it as an example of string.gsub:

local t = {name="lua", version="5.3"}
x = string.gsub("$name-$version.tar.gz", "%$(%w+)", t)
--> x="lua-5.3.tar.gz"

This applies to members of a table only, though. Python is introducing a general string-interpolation syntax:

a = "Hello"
b = "World"
f"{a} {b}"
f"{a + ' ' + b}"

Given that Lua supports the f"str" syntax for functions with a single string argument, I thought it would be nice to put its Lua-provides-the-mechanisms ethos to the test by trying to write my own Python-like f-string formatter.

And here it is, in all its 28-line glory (and I went for readability, and not to write it as short as possible):

function f(str)
   local outer_env = _ENV
   return (str:gsub("%b{}", function(block)
      local code = block:match("{(.*)}")
      local exp_env = {}
      setmetatable(exp_env, { __index = function(_, k)
         local stack_level = 5
         while debug.getinfo(stack_level, "") ~= nil do
            local i = 1
            repeat
               local name, value = debug.getlocal(stack_level, i)
               if name == k then
                  return value
               end
               i = i + 1
            until name == nil
            stack_level = stack_level + 1
         end
         return rawget(outer_env, k)
      end })
      local fn, err = load("return "..code, "expression `"..code.."`", "t", exp_env)
      if fn then
         return tostring(fn())
      else
         error(err, 0)
      end
   end))
end

It works just like the Python example:

a = "Hello"
b = "World"
print(f"{a} {b}")

Unlike the one-liner from the Lua manual, it also works with local variables:

local c = "Hello"
local d = "World"
print(f"Also works with locals: {c} {d}")

do
   local h = "Hello"
   do
      local w = "World"
      print(f"Of any scope level: {h} {w}")
   end
end

Some more demos:

print(f"Allows arbitrary expressions: one plus one is {1 + 1}")

local t = { foo = "bar" }
print(f"And values: t.foo is {t.foo}; print function is {_G.print}")

local ok, err = pcall(function()
   print(f"This fails: { 1 + } ")
end)
print("Errors display nicely: ", err)
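For comparison, the core trick (evaluating each {...} block in the scope of whoever called f) can be sketched in Python as well. This is not how Python's built-in f-strings work and not a port of the Lua module; it is a minimal sketch that assumes CPython, where inspect.currentframe() is available, and it does not handle nested braces the way %b{} does. It also uses eval, so it should only ever see trusted strings.

```python
import inspect
import re

def f(template):
    """Replace each {expr} with the value of expr, evaluated in the caller's scope."""
    frame = inspect.currentframe().f_back  # the frame of whoever called f()
    try:
        caller_globals, caller_locals = frame.f_globals, frame.f_locals

        def evaluate(match):
            # Locals are looked up first, then globals, then builtins.
            return str(eval(match.group(1), caller_globals, caller_locals))

        return re.sub(r"\{([^{}]*)\}", evaluate, template)
    finally:
        del frame  # avoid keeping a reference cycle through the frame object

def demo():
    h = "Hello"
    w = "World"
    return f("{h} {w}: one plus one is {1 + 1}")

if __name__ == "__main__":
    print(demo())
```

Python hands over the caller's locals as a ready-made mapping, whereas the Lua version has to walk the stack with debug.getlocal to find them.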

If there’s interest, I can make this a module in LuaRocks (probably calling it F rather than f).

Update! This is now available in LuaRocks as a module! Install it with:

luarocks install f-strings

More info at the f-strings GitHub page. Enjoy!


How to make a pull request on GitHub - a quick tutorial

So you made changes to a project — a bugfix or maybe a new feature — and you want to send it for inclusion in the official (“upstream”) sources. Perhaps you sent an email or opened an issue in the bugtracker, and the project maintainers asked you to send a Pull Request (PR) on GitHub. But how to do this? Here’s a quick how-to guide!

Step 0 - Have a GitHub account

Before anything, you need to have a GitHub account! If you don’t have one already, go to github.com and sign up. Just follow the instructions, it’s easy and free.

Step 1 - “Fork the repository”

“Forking a repository” on GitHub means creating your own Git repository, which is a copy of the original.

Let’s visit a repository and fork it. Start by visiting https://github.com/hishamhm/pull-request-tutorial

In the upper-right there’s a button named “Fork”. (It also shows a number: how many times this repository was forked by other people.)

Press it, and it will create your own copy of the pull-request-tutorial repository, at https://github.com/YOUR_USERNAME/pull-request-tutorial (the real URL will, of course, contain your own username).

Step 2 - Download your fork and create a branch

Now, it’s time for you to make your changes in the source code (your bugfix or new feature). Start by downloading your repository to your computer. Go to the terminal, make sure git is installed on your computer and type:

git clone https://github.com/YOUR_USERNAME/pull-request-tutorial.git

This will download the files and create a directory called pull-request-tutorial that is linked to your fork (i.e. the copy of the repository under your control).

To avoid trouble later, let’s create a new “branch” in our repository so that the work on our bugfix or feature is stored separately. Pick a meaningful name that represents the changes you plan to make in your code. In our example, I’ll call it “fix-typo”. Enter the directory of your fork and create the branch:

cd pull-request-tutorial
git checkout -b fix-typo

Step 3 - Make your changes in your fork

Now, inside the directory of your local fork (type cd pull-request-tutorial if you are not there yet), edit the files at will, implementing your bugfix or feature.

If you create a new file, remember to add it with git add:

git add new_file.txt

Commit your changes, adding a description of what was added. If you’re not used to Git, the simplest way is to commit all modified files and add a description message of your changes in a single command like this:

git commit -a -m "Fix typo in README file"

(But there are lots of ways to choose which files (and even parts of files) to commit and to edit the commit message. Look at the Git documentation for details.)

Once your changes are committed, “push” the changes: send them to your GitHub repository using git push:

git push

(The first time you push from a branch, Git will complain that your local branch on your computer is not connected to a branch in the GitHub server. Just do what the command tells you to do:

git push --set-upstream origin fix-typo

Next time you push again to this repository, just “git push” will do fine.)

Now, when you visit https://github.com/YOUR_USERNAME/pull-request-tutorial again, you should see your changes there.
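If you like to see the whole sequence at a glance, here is a small recap sketch in Python. It does not run Git; it only builds and prints the commands from Steps 2 and 3 for a given username, branch and commit message. The helper name pull_request_commands is made up for this example, and git -C is used so each command names the directory it runs in.

```python
import shlex

def pull_request_commands(username, repo, branch, message):
    """Return the Git commands from Steps 2 and 3 as argument lists, in order."""
    fork_url = f"https://github.com/{username}/{repo}.git"
    return [
        ["git", "clone", fork_url],                                       # download your fork
        ["git", "-C", repo, "checkout", "-b", branch],                    # create the branch
        ["git", "-C", repo, "commit", "-a", "-m", message],               # commit modified files
        ["git", "-C", repo, "push", "--set-upstream", "origin", branch],  # first push
    ]

if __name__ == "__main__":
    commands = pull_request_commands(
        "YOUR_USERNAME", "pull-request-tutorial", "fix-typo", "Fix typo in README file"
    )
    for command in commands:
        print(shlex.join(command))  # shell-quoted, ready to copy and paste
```

The editing itself still happens between the checkout and the commit, of course.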

Step 4 - Make the Pull Request

This is the simplest step! On your repository page, the next time you open it after pushing to a new branch, there’s a big green button saying “Compare & pull request”. Press it!

This will open a page in which you’ll be able to further edit the description for your proposed changes. Write down a nice report explaining why these changes should be included in the official sources of your project, and then confirm.

The project authors will receive an email notification that you sent them a PR. Then it’s their turn to read it and comment. You will get notifications when they comment. If they suggest any changes to your bugfix or feature, go back to Step 3, edit it and push again: your Pull Request will be automatically updated. If they are happy with the changes and want to integrate your contributions to the project, the maintainers will click “Merge” and your code will become part of the original repository!

If you want to give it a try, feel free to use the repository I created for this tutorial: https://github.com/hishamhm/pull-request-tutorial

Fork it, edit it, commit and push your changes and send me a PR!

If you liked this tutorial, leave a star on its repo. :)