hisham hm

🔗 The danger of simple examples

When discussing language syntax, people often resort to small examples using simple variables like foo or x, almost like “meta-syntactic variables”, i.e., to make clear these tokens are outside of the syntax under discussion.

One dangerous side-effect, though, is that these variables are always short and sweet. And syntax that works well with short variables doesn’t always work as well in real-world situations where they have to deal with the rest of the language.

Case the first

Recently we were discussing multiple assignment style in the Lua mailing list. Someone suggested this:

local a, b, c, d =
      e, f, g, h

…which makes the assignments “more parallel” than a single line and avoid writing lots of locals.
I think this a case where the over-simplified example is misleading.
With real-world looking variables, it would look more like

local cfg,              constraints, module_name,          initial_path =
      "default_config", {},          get_module_name(ctx), "/etc/myapp/default.config"

So yeah, It looks pretty with a, b, c but in the real world with significant names, this becomes a pain to maintain, and when we stuff too much in a single line, diffs are harder to read.

Case the second

Things always look good in tiny examples with single-letter variables. Which brings me to a gripe I have with an often-suggested Lua idiom: the famous t[#t+1] = v to append to arrays.

The reason why I think it’s so disengenious to defend t[#t+1] = v as the preferred idiom for appending to an array is because it looks good with a single-letter variable and five-line tutorial examples, but in the real world we use nested tables. In the end, table.insert(my.nested[data], v) is both more readable and avoids repetition:

  • it’s one thing less to have a typo on
  • one thing less to change
  • and one thing less to get out of sync.

Note how it’s not even necessarily shorter: in this realistic example the variable name dominates the size of the statement:

table.insert(my.nested[data], val)
my.nested[data][#my.nested[data] + 1] = val

Do I think table.insert is too long? Yes I do, I wouldn’t mind having a shorter idiom (many were proposed in the Lua list over the years, most of them were fine, but I’m not getting into them because we risk delving into syntactic bikeshedding again, so let’s avoid that).

Do I think it’s worth it to add local tinsert = table.insert to every program? No, I think this is worse than the t[#t+1] = v idiom, because I hate having to guess which abbreviation the module author used to write a shorter table.insert in their code (I’ve even seen local append = table.insert in the wild!). And then again, the abbreviation doesn’t gain us much: being comfortable to read is more important than being comfortable to write, but being easy to maintain is just as important if not more.

And yes, it is important to ponder what are the differences between being “easy to read”, “easy to write” and “easy to maintain”. And when pondering those, watch out for misleading short variables in the examples!

Of course, some idioms are advisable specifically for when you have short variables:

local r, g, b = 0, 255, 0

Everyone can easily read what’s going on there. But note that, almost without noticing, I also used a realistic example here! Realistic examples help getting the discussion grounded, and I find that they are often lacking when discussing syntax.


PS: And before someone mentions, the performance gains for localizing such variables as local tinsert = table.insert are overstated:

  • When done at the top of modules they become upvalues and not true locals;
  • Most of the modules I’ve seen doing this are far from being aimed at performance-intensive tasks that would warrant this kind of micro-optimization;
  • Optimization advice changes between Lua implementations; while the cached local helps for interpreted Lua, it may actually hurt for LuaJIT. So just aim for clearer code. If you have an optimization problem you can measure and the local variable does bring a benefit, do it in a small scope, close to your performance-sensitive tight loop so that whoever is reading your code can understand what is going on.

Follow

🐘 MastodonRSS (English), RSS (português), RSS (todos / all)


Last 10 entries


Search


Admin