hisham hm

The danger of simple examples

When discussing language syntax, people often resort to small examples using simple variables like foo or x, almost like “meta-syntactic variables”, i.e., to make clear these tokens are outside of the syntax under discussion.

One dangerous side-effect, though, is that these variables are always short and sweet. And syntax that works well with short variables doesn’t always work as well in real-world situations where they have to deal with the rest of the language.

Case the first

Recently we were discussing multiple assignment style in the Lua mailing list. Someone suggested this:

local a, b, c, d =
      e, f, g, h

…which makes the assignments “more parallel” than a single line and avoid writing lots of locals.
I think this a case where the over-simplified example is misleading.
With real-world looking variables, it would look more like

local cfg,              constraints, module_name,          initial_path =
      "default_config", {},          get_module_name(ctx), "/etc/myapp/default.config"

So yeah, It looks pretty with a, b, c but in the real world with significant names, this becomes a pain to maintain, and when we stuff too much in a single line, diffs are harder to read.

Case the second

Things always look good in tiny examples with single-letter variables. Which brings me to a gripe I have with an often-suggested Lua idiom: the famous t[#t+1] = v to append to arrays.

The reason why I think it’s so disengenious to defend t[#t+1] = v as the preferred idiom for appending to an array is because it looks good with a single-letter variable and five-line tutorial examples, but in the real world we use nested tables. In the end, table.insert(my.nested[data], v) is both more readable and avoids repetition:

Note how it’s not even necessarily shorter: in this realistic example the variable name dominates the size of the statement:

table.insert(my.nested[data], val)
my.nested[data][#my.nested[data] + 1] = val

Do I think table.insert is too long? Yes I do, I wouldn’t mind having a shorter idiom (many were proposed in the Lua list over the years, most of them were fine, but I’m not getting into them because we risk delving into syntactic bikeshedding again, so let’s avoid that).

Do I think it’s worth it to add local tinsert = table.insert to every program? No, I think this is worse than the t[#t+1] = v idiom, because I hate having to guess which abbreviation the module author used to write a shorter table.insert in their code (I’ve even seen local append = table.insert in the wild!). And then again, the abbreviation doesn’t gain us much: being comfortable to read is more important than being comfortable to write, but being easy to maintain is just as important if not more.

And yes, it is important to ponder what are the differences between being “easy to read”, “easy to write” and “easy to maintain”. And when pondering those, watch out for misleading short variables in the examples!

Of course, some idioms are advisable specifically for when you have short variables:

local r, g, b = 0, 255, 0

Everyone can easily read what’s going on there. But note that, almost without noticing, I also used a realistic example here! Realistic examples help getting the discussion grounded, and I find that they are often lacking when discussing syntax.


PS: And before someone mentions, the performance gains for localizing such variables as local tinsert = table.insert are overstated:


On the word “latino”

One of my least-favorite American English words is “latino”, for two reasons:

First, a linguistic reason: because it’s not inflected when used. When you’re used to the fact that in Spanish and Portuguese “latino” refers only to men and “latina” only to women, hearing “latino woman” sounds really weird (weirder than, say, “handsome woman”). Even weirder “latino women”, mixing a Spanish/Portuguese word and English grammar. “Bonito girls”? :)

Second, a sociological reason: because using a foreign loanword reinforces the otherness. Nobody calls the Italian community in America “italiano”, although that’s their name in Italian. The alternative “Hispanic” is not ideal because it doesn’t really make sense when including Brazil, which was never a Spanish colony (plus, the colonial past is something most countries want to leave behind).

I can’t change the language by myself, so I just avoid the term and use more specific ones whenever possible (Colombians, Argentines, Brazilians, South Americans, Latin Americans when referring to people from the area in general, etc.)

After writing the above, I checked Wikipedia and it seems the communites in the US agree with me:

« In a recent study, most Spanish-speakers of Spanish or Hispanic American descent do not prefer the term “Hispanic” or “Latino” when it comes to describing their identity. Instead, they prefer to be identified by their country of origin. When asked if they have a preference for either being identified as “Hispanic” or “Latino,” the Pew study finds that “half (51%) say they have no preference for either term.”[43] A majority (51%) say they most often identify themselves by their family’s country of origin, while 24% say they prefer a pan-ethnic label such as Hispanic or Latino. Among those 24% who have a preference for a pan-ethnic label, “‘Hispanic’ is preferred over ‘Latino’ by more than a two-to-one margin—33% versus 14%.” Twenty-one percent prefer to be referred to simply as “Americans.” »

I think the awkwardness in the grammar from point one actually reinforces point two, because it strikes me as something that no Spanish or Portuguese native speaker would come up with by themselves. So it sounds tacked upon.

Don’t get me wrong, I fully identify as a Brazilian, a South American and a Latin American — travellling abroad helps a lot to widen your cultural identity! — and I have no problem when people wear the term “latino” proudly, but I always pay close attention to the power of language and how it represents and propagates ideas.


String interpolation in Lua

Lua is known for having a very lean standard library, and for providing mechanisms to do things instead of a ton of features.

String interpolation isn’t available out of the box, but doing it in Lua isn’t a new trick. In fact, the manual includes it as an example of string.gsub:

local t = {name="lua", version="5.3"}
x = string.gsub("$name-$version.tar.gz", "%$(%w+)", t)
--> x="lua-5.3.tar.gz"

This applies to members of a table only, though. Python is introducing a general string-interpolation syntax:

a = "Hello"
b = "World"
f"{a} {b}"
f"{a + ' ' + b}"

Given that Lua supports the f"str" syntax for functions with a single string argument, I thought it would be nice to put its Lua-provides-the-mechanisms ethos to test by trying to write my own Python-like f-string formatter.

And here it is, in all its 28-line glory (and I went for readability, and not to write it as short as possible):

function f(str)
   local outer_env = _ENV
   return (str:gsub("%b{}", function(block)
      local code = block:match("{(.*)}")
      local exp_env = {}
      setmetatable(exp_env, { __index = function(_, k)
         local stack_level = 5
         while debug.getinfo(stack_level, "") ~= nil do
            local i = 1
            repeat
               local name, value = debug.getlocal(stack_level, i)
               if name == k then
                  return value
               end
               i = i + 1
            until name == nil
            stack_level = stack_level + 1
         end
         return rawget(outer_env, k)
      end })
      local fn, err = load("return "..code, "expression `"..code.."`", "t", exp_env)
      if fn then
         return tostring(fn())
      else
         error(err, 0)
      end
   end))
end

It works just like the Python example:

a = "Hello"
b = "World"
print(f"{a} {b}")

Unlike the one-liner from the Lua manual, it also works with local variables:

local c = "Hello"
local d = "World"
print(f"Also works with locals: {c} {d}")

do
   local h = "Hello"
   do
      local w = "World"
      print(f"Of any scope level: {h} {w}")
   end
end

Some more demos:

print(f"Allows arbitrary expressions: one plus one is {1 + 1}")

local t = { foo = "bar" }
print(f"And values: t.foo is {t.foo}; print function is {_G.print}")

local ok, err = pcall(function()
   print(f"This fails: { 1 + } ")
end)
print("Errors display nicely: ", err)

If there’s interest, I can make this a module in LuaRocks (probably calling it F rather than f).

Update! This is now available in LuaRocks as a module! Install it with:

luarocks install f-strings

More info at the f-strings GitHub page. Enjoy!


Generalized nullable operators

Today I was writing some Lua code and had to use something like this for the millionth time:

logger:fine("my message " .. (extra_data or ""))

Since operators in Lua fail when applied to null (and thankfully don’t do wat-eseque coercions), whenever I want to perform an operation on a value that may be null, I have to add the neutral element of the operation as a fall back:

print(a + (b or 0))
print(x * (y or 1))

This got me thinking of null-conditional operators such as ?. that some other languages such as C# have.

Then I wondered: wouldn’t it be nice if “?” could be a modifier to any operator?

Creating null-checking operators

Here’s the initial sketch of the idea: in case of a “nullable operator”, just cancel the operation when given a null operand: i.e., return the left-hand value in case the right-hand value is null.

Or, expressed in Haskell, here’s a function “rightCheckNullable” that takes a normal operator and converts it to a nullable version checking the right-hand value (”nullable types” are represented as “Maybe” types in Haskell):

rightCheckNullable :: (a -> b -> a) -> (a -> Maybe b -> a)
rightCheckNullable fn = a b ->
   case b of
   Just x  -> fn a x
   Nothing -> a

Let’s create some nullable operators:

(+?) = rightCheckNullable (+) -- nullable addition
(*?) = rightCheckNullable (*) -- nullable multiplication
(++?) = rightCheckNullable (++) -- nullable concatenation

And give them a spin:

main =
   let
      v1 :: Float
      v1 = 123

      v2 = Nothing

      v3 :: Maybe Float
      v3 = Just 456
   in
      do
         print $ show (v1 +? v2) -- prints 123.0
         print $ show (v1 +? v3) -- prints 579.0
         print $ show (v1 *? v2) -- prints 123.0
         print $ "hello" ++? Just "world" -- prints helloworld
         print $ "hello" ++? v2 -- prints hello

With something like the above, instead of a + (b or 0) and x * (y or 1), one could write simply:

print(a +? b)
print(x *? y)

This could give back some of the terseness we have when null auto-coerces to other types, without surprises with various operations. In JavaScript, null coerces to 0 when it is an integer, which gives us a proper neutral element for addition but not for multiplication.

Null-checking in C#

Note, however, that my choice of picking the right-hand value and checking the left-hand value only was arbitrary (though it works well for the examples above).

In C#, operations on nullable types are always lifted: the operators of the original types are extended with a check where, if either of the arguments is null, the result of the operation is null.

In Haskell, this transformation would be the following, taking a function that goes from a’s to b’s producing c’s, and producing an equivalent function that goes from Maybe a’s to Maybe b’s producing Maybe c’s:

bothCheckNullable :: (a -> b -> c) -> (Maybe a -> Maybe b -> Maybe c)
bothCheckNullable fn =
    ma mb ->
      case ma of
         Nothing -> Nothing
         Just a ->
            case mb of
               Nothing -> Nothing
               Just b -> fn a b

(In Haskell, you don’t have to actually write this function, since can use Control.Applicative.liftA2, a generalization of the above, to get the same result)

Checking the left-hand value

This makes me think that my “nullable operator modifier” could be aplied to either side (or both). Note that the syntax for null-conditional in C# is already ?., with the question-mark on the left-hand side, since the value being checked for nullity is the left-hand one. We don’t want x?.y to return y when x is null, though. A more sensible semantics for left-hand ? would be a “short-circuiting” one:

leftCheckNullable :: (a -> b -> c) -> (Maybe a -> b -> Maybe c)
leftCheckNullable fn = a b ->
   case a of
   Just x  -> fn x b
   Nothing -> Nothing

The flood gates are open!

There is still an asymmetry here, as rightCheckNullable is the only one that returns the “other value” when one of them is null.

In fact, we could have six versions of the conversion function: right-check, both-check, left-check, each of them returning the “other value” (as I did with +?) or null. If we called the C#-like version +??, this means addition could be modified into: +?, ?+, ?+?, +??, ??+, ??+??.

(And there could be two more variants of course, ?+?? and ??+?, but coming up with a realistic example using them would be a nice exercise in creative coding.)

But would it make sense to have so many modifiers?

Well, for one, instead of writing this

logger:severe("Error: connection failed" .. (details and (" - details: "..details) or ""))

we could write something like:

logger:severe("Error: connection failed" ..? (" - details: " ..?? details))

I know this is a step into the world of APL, and I’m not arguing this is a great or even good idea, but it was fun to think about, so I thought I’d share.


A small practical example where Lua’s % behavior is better than C’s

These days I saw on Twitter a comment on how the behavior of the % (remainder) operator for negative numbers is weird in C.

I’ve seen this discussion come up numerous times in the Lua mailing list over the years. The reason being because Lua does it different, and most languages simply copy the behavior of C.

Today I saw Etiene’s cool demo of a mini JavaScript Duck Hunt clone that she presented at a “intro to programming” workshop for the Women in Leadership event in Bremen, Germany.

It’s a really nice demo of game behavior in a short span of code, and with the environment of Mozilla Thimble, it instantly enticed me to play around with the code and see what happened.

The first thing that came to my attention was that the ducks spawn at position x=0, and this made them “pop” into the screen. I thought that changing the initial value to something like x=-50 would be a small change to try and would produce a smoother effect (just change 0 to -50 in lines 56 and 116).

When I first tried that, the result was that they would show up, but wouldn’t start flapping their wings until they were at x=0. The reason is because the logic to switch sprites is made testing x % 30 for values 0, 10 and 20… and JavaScript’s % operator, like C’s, returns negative remainders for negative divisors.

My quick hack solution was to calculate

   var absx = Math.abs(this.x);

(which required me a visit to DuckDuckGo to figure out how to properly say “abs(x)” in JavaScript). This made the birds enter the screen flapping their wings. Yay!

Of course, this is not something you’d want to have to explain in an “intro to programming” workshop. It would be better if the animation “just worked” with that change…

But wait! If you have really sharp eyes, you’ll notice that from -50 to 0, the birds are flapping their wings upwards and from 0 on, they do it downwards. The animation is inverted!

The reason is because operating on abs(x) causes this:

Lua 5.3.0  Copyright (C) 1994-2015 Lua.org, PUC-Rio
> for i = -50, 100 do io.write(math.abs(i)%30, " ") end
20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 0 1 2 3 4 5 6 7 8 9 10

If I write a one-liner to simulate the sprite logic in Duck Hunt, I get this:

> for i = -50, 100 do r=math.abs(i)%30; io.write(r==0 and "1" or (r==10 and "2" or (r==20 and "3" or ".") ) ) end
3.........2.........1.........3.........2.........1.........2.........3.........1.........2.........3.........1.........2.........3.........1.........2

Indeed, it’s going 3,2,1, 3,2,1 at the negative numbers and then 1,2,3, 1,2,3 at the positive ones. But let’s just drop the math.abs in Lua and see what happens:

> for i = -50, 100 do r=i%30; io.write(r==0 and "1" or (r==10 and "2" or (r==20 and "3" or ".") ) ) end
2.........3.........1.........2.........3.........1.........2.........3.........1.........2.........3.........1.........2.........3.........1.........2

We get 1,2,3,1,2,3 all the way!

In my experience, the vast majority of times I used %, it was to tell something to “do this every X steps”, like Etiene does in her Duck Hunt. For this kind of purposes, I’m pretty convinced that Lua’s behavior for % is a lot better. It’s unfortunate that most other languages just decided to follow the example of C.

Of course, there are a million other ways to make the ducks flap their wings, with and without %, that’s not the point. But it intrigued me that, if JavaScript had Lua’s behavior for %, my initial tiny change would have “just worked”!