🔗 String interpolation in Lua
Lua is known for having a very lean standard library, and for providing mechanisms to do things instead of a ton of features.
String interpolation isn’t available out of the box, but doing it in Lua isn’t a new trick. In fact, the manual includes it as an example of string.gsub:
local t = {name="lua", version="5.3"}
x = string.gsub("$name-$version.tar.gz", "%$(%w+)", t)
--> x="lua-5.3.tar.gz"
This applies to members of a table only, though. Python is introducing a general string-interpolation syntax:
a = "Hello"
b = "World"
f"{a} {b}"
f"{a + ' ' + b}"
Given that Lua supports the f"str" syntax for functions with a single string argument, I thought it would be nice to put its Lua-provides-the-mechanisms ethos to the test by trying to write my own Python-like f-string formatter.
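As a quick illustration of that call syntax (the function name shout is made up for this sketch): when a function's only argument is a string literal, Lua lets you leave out the parentheses.

```lua
-- `shout` is a hypothetical helper used only for illustration.
local function shout(s)
   return s:upper() .. "!"
end

-- These two calls are equivalent.
local with_parens = shout("hello")
local without_parens = shout"hello"

print(with_parens, without_parens)
```

So a function named f called as f"..." ends up looking a lot like a Python f-string.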
And here it is, in all its 28-line glory (and I went for readability, and not to write it as short as possible):
function f(str)
   local outer_env = _ENV
   return (str:gsub("%b{}", function(block)
      local code = block:match("{(.*)}")
      local exp_env = {}
      setmetatable(exp_env, { __index = function(_, k)
         local stack_level = 5
         while debug.getinfo(stack_level, "") ~= nil do
            local i = 1
            repeat
               local name, value = debug.getlocal(stack_level, i)
               if name == k then
                  return value
               end
               i = i + 1
            until name == nil
            stack_level = stack_level + 1
         end
         return rawget(outer_env, k)
      end })
      local fn, err = load("return "..code, "expression `"..code.."`", "t", exp_env)
      if fn then
         return tostring(fn())
      else
         error(err, 0)
      end
   end))
end
It works just like the Python example:
a = "Hello"
b = "World"
print(f"{a} {b}")
Unlike the one-liner from the Lua manual, it also works with local variables:
local c = "Hello"
local d = "World"
print(f"Also works with locals: {c} {d}")
do
   local h = "Hello"
   do
      local w = "World"
      print(f"Of any scope level: {h} {w}")
   end
end
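The lookup of locals relies on debug.getlocal, which can read the local variables of functions further up the call stack. Here is a much smaller sketch of the same mechanism, looking only one level up; caller_local and demo are hypothetical names, not part of the formatter above.

```lua
-- Hypothetical helper: returns the value of the caller's local variable
-- called `name`, or nil if the caller has no such local.
local function caller_local(name)
   local found = nil
   local i = 1
   while true do
      -- Level 1 is caller_local itself; level 2 is the function that called it.
      local n, v = debug.getlocal(2, i)
      if n == nil then
         break
      end
      if n == name then
         found = v  -- keep scanning: a later local with the same name shadows this one
      end
      i = i + 1
   end
   return found
end

local function demo()
   local greeting = "hello"
   -- Store the result first: `return caller_local(...)` would be a tail call,
   -- which discards demo's stack frame before the lookup runs.
   local value = caller_local("greeting")
   return value
end

print(demo())  --> hello
```

The formatter has to start at a deeper stack level because its lookup runs inside several nested calls (gsub, the callback, the loaded expression), but the idea is the same.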
Some more demos:
print(f"Allows arbitrary expressions: one plus one is {1 + 1}")
local t = { foo = "bar" }
print(f"And values: t.foo is {t.foo}; print function is {_G.print}")
local ok, err = pcall(function()
   print(f"This fails: { 1 + } ")
end)
print("Errors display nicely: ", err)
If there’s interest, I can make this a module in LuaRocks (probably calling it F rather than f).
Update! This is now available in LuaRocks as a module! Install it with:
luarocks install f-strings
More info at the f-strings GitHub page. Enjoy!
🔗 Generalized nullable operators
Today I was writing some Lua code and had to use something like this for the millionth time:
logger:fine("my message " .. (extra_data or ""))
Since operators in Lua fail when applied to null (and thankfully don’t do wat-esque coercions), whenever I want to perform an operation on a value that may be null, I have to add the neutral element of the operation as a fallback:
print(a + (b or 0))
print(x * (y or 1))
This got me thinking of null-conditional operators such as ?. that some other languages such as C# have.
Then I wondered: wouldn’t it be nice if “?” could be a modifier to any operator?
Creating null-checking operators
Here’s the initial sketch of the idea: in case of a “nullable operator”, just cancel the operation when given a null operand: i.e., return the left-hand value in case the right-hand value is null.
Or, expressed in Haskell, here’s a function “rightCheckNullable” that takes a normal operator and converts it to a nullable version checking the right-hand value (“nullable types” are represented as “Maybe” types in Haskell):
rightCheckNullable :: (a -> b -> a) -> (a -> Maybe b -> a)
rightCheckNullable fn = \a b ->
   case b of
      Just x -> fn a x
      Nothing -> a
Let’s create some nullable operators:
(+?) = rightCheckNullable (+) -- nullable addition
(*?) = rightCheckNullable (*) -- nullable multiplication
(++?) = rightCheckNullable (++) -- nullable concatenation
And give them a spin:
main =
   let
      v1 :: Float
      v1 = 123
      v2 = Nothing
      v3 :: Maybe Float
      v3 = Just 456
   in do
      putStrLn $ show (v1 +? v2) -- prints 123.0
      putStrLn $ show (v1 +? v3) -- prints 579.0
      putStrLn $ show (v1 *? v2) -- prints 123.0
      putStrLn $ "hello" ++? Just "world" -- prints helloworld
      putStrLn $ "hello" ++? v2 -- prints hello
With something like the above, instead of a + (b or 0) and x * (y or 1), one could write simply:
print(a +? b) print(x *? y)
This could give back some of the terseness we have when null auto-coerces to other types, without the surprises that coercion brings in various operations. In JavaScript, null coerces to 0 when used as a number, which gives us a proper neutral element for addition but not for multiplication.
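Lua has no such operators, but the right-checking behavior can be sketched with an ordinary higher-order function. The names right_check, add_q and mul_q are hypothetical stand-ins for rightCheckNullable, +? and *?.

```lua
-- Hypothetical helper mirroring rightCheckNullable: wraps a binary function
-- so that a nil right-hand operand cancels the operation.
local function right_check(op)
   return function(a, b)
      if b == nil then
         return a
      end
      return op(a, b)
   end
end

local add_q = right_check(function(a, b) return a + b end)  -- stands in for +?
local mul_q = right_check(function(a, b) return a * b end)  -- stands in for *?

print(add_q(123, nil))  --> 123
print(add_q(123, 456))  --> 579
print(mul_q(123, nil))  --> 123
```

Note that returning the left operand gives the right answer for both addition and multiplication without having to name a neutral element for either.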
Null-checking in C#
Note, however, that my choice of returning the left-hand value and checking only the right-hand value was arbitrary (though it works well for the examples above).
In C#, operations on nullable types are always lifted: the operators of the original types are extended with a check where, if either of the arguments is null, the result of the operation is null.
In Haskell, this transformation would be the following, taking a function that goes from a’s to b’s producing c’s, and producing an equivalent function that goes from Maybe a’s to Maybe b’s producing Maybe c’s:
bothCheckNullable :: (a -> b -> c) -> (Maybe a -> Maybe b -> Maybe c)
bothCheckNullable fn = \ma mb ->
   case ma of
      Nothing -> Nothing
      Just a ->
         case mb of
            Nothing -> Nothing
            Just b -> Just (fn a b)
(In Haskell, you don’t have to actually write this function, since you can use Control.Applicative.liftA2, a generalization of the above, to get the same result.)
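For comparison, here is a rough Lua sketch of the same lifting, with nil playing the role of null. The names both_check and add_lifted are made up for illustration.

```lua
-- Hypothetical helper mirroring the C#-style lifting: if either operand
-- is nil, the result is nil; otherwise the wrapped function is applied.
local function both_check(op)
   return function(a, b)
      if a == nil or b == nil then
         return nil
      end
      return op(a, b)
   end
end

local add_lifted = both_check(function(a, b) return a + b end)

print(add_lifted(1, 2))    --> 3
print(add_lifted(nil, 2))  --> nil
print(add_lifted(1, nil))  --> nil
```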
Checking the left-hand value
This makes me think that my “nullable operator modifier” could be applied to either side (or both). Note that the syntax for null-conditional in C# is already ?., with the question mark on the left-hand side, since the value being checked for nullity is the left-hand one. We don’t want x?.y to return y when x is null, though. A more sensible semantics for left-hand ? would be a “short-circuiting” one:
leftCheckNullable :: (a -> b -> c) -> (Maybe a -> b -> Maybe c)
leftCheckNullable fn = \a b ->
   case a of
      Just x -> Just (fn x b)
      Nothing -> Nothing
The flood gates are open!
There is still an asymmetry here, as rightCheckNullable is the only one that returns the “other value” when one of them is null.
In fact, we could have six versions of the conversion function: right-check, both-check, left-check, each of them returning the “other value” (as I did with +?) or null. If we called the C#-like version +??, this means addition could be modified into: +?, ?+, ?+?, +??, ??+, ??+??.
(And there could be two more variants of course, ?+?? and ??+?, but coming up with a realistic example using them would be a nice exercise in creative coding.)
But would it make sense to have so many modifiers?
Well, for one, instead of writing this
logger:severe("Error: connection failed" .. (details and (" - details: "..details) or ""))
we could write something like:
logger:severe("Error: connection failed" ..? (" - details: " ..?? details))
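To check that the two modifiers combine the way that line suggests, here is a small sketch with plain functions standing in for the operators. The names concat_q, concat_qq and message are hypothetical, and the message is returned instead of logged so the sketch runs on its own.

```lua
-- Stands in for `..?`: returns the left operand when the right one is nil.
local function concat_q(a, b)
   if b == nil then
      return a
   end
   return a .. b
end

-- Stands in for `..??`: the result is nil when either operand is nil.
local function concat_qq(a, b)
   if a == nil or b == nil then
      return nil
   end
   return a .. b
end

local function message(details)
   return concat_q("Error: connection failed", concat_qq(" - details: ", details))
end

print(message(nil))        --> Error: connection failed
print(message("timeout"))  --> Error: connection failed - details: timeout
```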
I know this is a step into the world of APL, and I’m not arguing this is a great or even good idea, but it was fun to think about, so I thought I’d share.
🔗 A small practical example where Lua’s % behavior is better than C’s
The other day I saw a comment on Twitter about how the behavior of the % (remainder) operator for negative numbers is weird in C.
Under what circumstances does someone actually *want* the C behavior, anyway (-5 % 3 == -2; 5 % -3 == 2) ?
— mcc (@mcclure111) April 6, 2015
I’ve seen this discussion come up numerous times in the Lua mailing list over the years. The reason is that Lua does it differently, while most languages simply copy the behavior of C.
Today I saw Etiene’s cool demo of a mini JavaScript Duck Hunt clone that she presented at an “intro to programming” workshop for the Women in Leadership event in Bremen, Germany.
It’s a really nice demo of game behavior in a short span of code, and with the environment of Mozilla Thimble, it instantly enticed me to play around with the code and see what happened.
The first thing that came to my attention was that the ducks spawn at position x=0, and this made them “pop” into the screen. I thought that changing the initial value to something like x=-50 would be a small change to try and would produce a smoother effect (just change 0 to -50 in lines 56 and 116).
When I first tried that, the result was that they would show up, but wouldn’t start flapping their wings until they were at x=0. The reason is that the logic to switch sprites tests x % 30 for the values 0, 10 and 20… and JavaScript’s % operator, like C’s, returns negative remainders for negative dividends.
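The difference can be seen from within Lua itself, because math.fmod truncates toward zero the way % does in C and JavaScript, while Lua's own % rounds toward minus infinity. A tiny sketch (the variable names are just for illustration):

```lua
-- math.fmod behaves like % in C and JavaScript: the result takes the
-- sign of the dividend.
local c_style = math.fmod(-50, 30)

-- Lua's % operator: the result takes the sign of the divisor.
local lua_style = -50 % 30

print(c_style)    --> -20
print(lua_style)  --> 10
```

With the C-style result, a test such as x % 30 == 10 never matches while x is negative, which is why the sprites did not change.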
My quick hack solution was to calculate
var absx = Math.abs(this.x);
(which required a visit to DuckDuckGo to figure out how to properly say “abs(x)” in JavaScript). This made the birds enter the screen flapping their wings. Yay!
Of course, this is not something you’d want to have to explain in an “intro to programming” workshop. It would be better if the animation “just worked” with that change…
But wait! If you have really sharp eyes, you’ll notice that from -50 to 0, the birds are flapping their wings upwards and from 0 on, they do it downwards. The animation is inverted!
The reason is that operating on abs(x) causes this:
Lua 5.3.0 Copyright (C) 1994-2015 Lua.org, PUC-Rio
> for i = -50, 100 do io.write(math.abs(i)%30, " ") end
20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 0 1 2 3 4 5 6 7 8 9 10
If I write a one-liner to simulate the sprite logic in Duck Hunt, I get this:
> for i = -50, 100 do r=math.abs(i)%30; io.write(r==0 and "1" or (r==10 and "2" or (r==20 and "3" or ".") ) ) end
3.........2.........1.........3.........2.........1.........2.........3.........1.........2.........3.........1.........2.........3.........1.........2
Indeed, it’s going 3,2,1, 3,2,1 at the negative numbers and then 1,2,3, 1,2,3 at the positive ones. But let’s just drop the math.abs in Lua and see what happens:
> for i = -50, 100 do r=i%30; io.write(r==0 and "1" or (r==10 and "2" or (r==20 and "3" or ".") ) ) end
2.........3.........1.........2.........3.........1.........2.........3.........1.........2.........3.........1.........2.........3.........1.........2
We get 1,2,3,1,2,3 all the way!
In my experience, the vast majority of times I used %, it was to tell something to “do this every X steps”, like Etiene does in her Duck Hunt. For this kind of purpose, I’m pretty convinced that Lua’s behavior for % is a lot better. It’s unfortunate that most other languages just decided to follow the example of C.
Of course, there are a million other ways to make the ducks flap their wings, with and without %, that’s not the point. But it intrigued me that, if JavaScript had Lua’s behavior for %, my initial tiny change would have “just worked”!
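For languages with the C behavior, a common workaround is to write ((x % n) + n) % n. Here is a sketch of that formula in Lua, with math.fmod playing the role of the C-style %; the helper name floored_mod is made up, and the sketch assumes a positive n.

```lua
-- Hypothetical helper: floored modulo built only from a truncating
-- remainder (math.fmod stands in for the % of C or JavaScript).
-- Assumes n > 0.
local function floored_mod(a, n)
   return math.fmod(math.fmod(a, n) + n, n)
end

-- It agrees with Lua's native % for these sample positions.
for _, x in ipairs({ -50, -40, -30, 0, 10 }) do
   assert(floored_mod(x, 30) == x % 30)
end

print(floored_mod(-50, 30))  --> 10
```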
🔗 How to write Lua modules in a post-module() world
Our beloved function module() is really going away. As of Lua 5.2 it’s only available with compatibility flags on, and the writing’s on the wall: it is going away for good in Lua 5.3. So, in a new Lua project I wrote this past semester, I decided to write it without using module(), while making sure my code runs on both Lua 5.1 and 5.2 (as a side result, I started the compat52 project, which allows you to write code in a well-behaved Lua 5.2 style and make it run on both 5.1 and 5.2).
So, why did I like module() in the first place? While vilified by some, I think the pros of module() largely trumped its cons. It had indeed some nice properties:
- It provided much-needed policy for interoperability between modules - for the first time people were mostly writing Lua modules the same way and they worked with each other
- It encouraged documenting the module name in its argument, which is especially useful in a world without clearly defined path policies. One would often find out where to put the module by looking at its name. Never mind the ill-advised suggestion of writing module(...) “so that you can name the module whatever you want”: users of a module must agree on a name so that all other modules that need it can call require() properly!
- It pushed modules that return a table through require(): while Lua’s mechanisms for modules were too lax and resulted in spilling globals too often, consistent use of module() meant that you could rely on writing local foo = require("foo"), which is an “environmentally” clean idiom, albeit a bit repetitive.
- You could nicely tell visibility through syntax: private functions declared with local function, public functions with function.
- Apart from the awkward package.seeall argument, use of module() was pretty much boilerplate-free (I hate repetition in code, from the ugly local print = print idioms in Lua to the redundancy of .h and .c files in C).
So, how to try to retain some of these properties without module()? The solution I found was to adopt some bits of policy, which I list below.
Yes, bits of policy. I know many in the Lua world hate policies, but of course I’m not putting a gun against anyone’s head to follow them. I’m only sharing what works for me, and hopefully you may find some use in it. And don’t worry: it’s nothing too esoteric, and it’s mostly cherry-picking some established practices.
Starting from the outside in
Keeping in mind that the goal of a module is to be required by client code, this is how a module foo.bar will be used:
local bar = require("foo.bar") -- requiring the module
bar.say("hello") -- using the module
An interesting observation here is that although we have a hierarchical structure of modules, the practice of loading them into locals means that in use they have to be accommodated in a flat namespace. So here’s Policy Bit #1:
Policy Bit #1: always require a module into a local named after the last component of the module’s full name.
Don’t do stuff such as local skt = require("socket"): code is much harder to read if we have to keep going back to the top to check how you chose to call a module.
Naming modules
Now that you know that your module will end up in people’s locals, please take that into consideration when naming your module. (I wish we had a capitalization policy to separate that nicely, but naming things LikeThis in Lua tends to be used only for object-oriented code.)
The idea is to choose a name that strikes a balance between convenience and uniqueness, and that is usable. And what better way to achieve this than by using that name yourself? So, here’s Policy Bit #2: let’s use the module name in its declaration!
Policy Bit #2: start a module by declaring its table using the same all-lowercase local name that will be used to require it.
So, in the beginning of module foo.bar (which will live in foo/bar.lua), we begin with:
local bar = {}
It’s not as nice a self-documenting header as we used to have with module("foo.bar", package.seeall), but it’s something. We can improve that with LDoc comments:
--- @module foo.bar
local bar = {}
Don’t name your module something like “size”.
Declaring functions
When I’m scrolling through source code, I like to be able to tell what’s the sphere of influence of the piece of code I’m looking at. Is this a tiny helper function that’s only used below this line in this file? Is it an important function that’s used by other clients, so that an added or removed argument would mean API breakage? Ideally I like to be able to tell that without running back and forth in the code, so I really like visibility to be explicit in the syntax.
We must not declare global functions (or globals of any type, really!) in our modules, so using “globals vs. locals” to tell the difference won’t cut it. We have some alternatives, though. But first, let’s assert one thing:
Policy Bit #3: Use local function to declare local functions only: that is, functions that won’t be accessible from outside the module.
That is, local function helper_foo() means that helper_foo is really local.
This sounds obvious, but there are advocates of declaring all functions, public and private, as local functions and then writing an “export list” at the bottom of the module. Reading code written like this feels to me like a thriller with a twist ending: “haha, I was a public function all along!”
How to write public functions then? We must not declare global functions, but there are alternatives. Say we’re writing a function that will be used in client code as bar.say("hello"). It’s nice that we can declare it just like that:
function bar.say(greeting)
   print(greeting)
end
Policy Bit #4: public functions are declared in the module table, with dot syntax.
Visibility is made explicit through syntax. This is the same idea advocated by those who tell you to name your module tables “M”, except that you’re eating your own dog food and using the name you expect your users to use. It’s also more consistent, since calls of say() are written bar.say() everywhere, instead of say(), M.say(), etc. (Also, “M.” looks really really ugly and people can’t decide if they want to use “M” or “_M”.)
In case you have speed concerns about having your calls go through the module table: first, this is what your users will go through; second, this is no different than using colon-syntax and dispatching through self and nobody complains about that; third, if you really need it (and have a benchmarked case for it), sure go ahead and make locals for optimization clearly marked as such; fourth, if you’re really after speed you’re probably using LuaJIT and last I heard the value of caching functions into locals is put into question there.
Classes and objects
When talking about classes and objects, it’s then time to talk about things named LikeThis. (If you don’t do OOP, feel free to skip this section!)
As we did above, let’s start by looking from the outside in: how to instantiate an object. There are two common practices (oh why am I not surprised :( )… you either make a class table with a “new” method, or make the “class object” callable (as a function or a table with a __call metamethod; wait, that makes it three practices…)
local myset1 = Set.new() -- style 1
local myset2 = Set() -- style 2.1 (Set is a function)
local myset3 = Set() -- style 2.2 (Set is a table)
If your module represents a class, I tend to like style 1 better because:
- it keeps the invariant that modules are tables
- it’s easy to store “static” methods as the other functions of the table
- it’s less magic: I often run modules through for k,v in pairs(bar) do print(k,v) end in the interactive prompt to get a quick look at what they export.
- it just screams “I’m creating an object”
If all your module does is define a class, I guess it makes sense to name the module file MyClass.lua and have the class table be the module table. But I prefer not to do that, because often what we store as “static” class methods in purely OOP languages are really module functions. I still use the uppercase table when implementing the class, like this:
--- @module myproject.myclass
local myclass = {}

-- class table
local MyClass = {}

function MyClass:some_method()
   -- code
end

function MyClass:another_one()
   self:some_method()
   -- more code
end

function myclass.new()
   local self = {}
   setmetatable(self, { __index = MyClass })
   return self
end

return myclass
It’s easy to see in the code above that the functions with MyClass in their signature are methods. Sometimes it’s nice to declare the functions as fields inside the table declaration, but declaring methods separately as in the example above allows you to keep local helper functions closer to where they’re used.
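Here is a self-contained sketch of how style 1 looks from the client side. The describe method and the name field are made up for illustration, and the module is defined inline rather than loaded through require so the snippet runs as is.

```lua
local myclass = {}

-- class table
local MyClass = {}

-- `describe` and `name` are hypothetical, just to have something to call.
function MyClass:describe()
   return "object named " .. self.name
end

function myclass.new(name)
   local self = { name = name }
   setmetatable(self, { __index = MyClass })
   return self
end

local obj = myclass.new("demo")
print(obj:describe())  --> object named demo

-- Listing the module table shows only what it exports.
for k in pairs(myclass) do
   print(k)  --> new
end
```

The methods live in MyClass, so they don't show up among the module's exports, but every object reaches them through its metatable.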
If all the module does is declare the class, the class and module table may be one and the same. If you want to use style 2, we get something like this:
--- @module myproject.MyClass
local MyClass = {}

function MyClass:some_method()
   -- code
end

function MyClass:another_one()
   self:some_method()
   -- more code
end

local metatable = {
   __call = function()
      local self = {}
      setmetatable(self, { __index = MyClass })
      return self
   end
}
setmetatable(MyClass, metatable)

return MyClass
Both methods are acceptable, as long as it’s easy and obvious to tell you’re doing OOP:
Policy Bit #5: construct a table for your class and name it LikeThis so we know your table is a class.
Policy Bit #6: functions that are supposed to be used as object methods should be clearly marked as such, and the colon syntax is a great way to do it.
Don’t make people reading your function have to guess (or look up) if it is a method, a public module function or a local function.
Wrapping up
Return the module table. It’s a bit of boilerplate, but it’s what we have to deal with in a module()less world:
return bar
Policy Bit #7: do not set any globals in your module and always return a table in the end.
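One way to catch accidental globals while developing is to put a guard on the global table. This is only a sketch of a common idiom, not something the policies above require; it assumes _G has no metatable of its own, since the guard would replace it.

```lua
-- While the guard is installed, assigning to an undeclared global raises an error.
setmetatable(_G, {
   __newindex = function(_, key)
      error("attempt to set global '" .. tostring(key) .. "'", 2)
   end,
})

local ok, err = pcall(function()
   leaked = 42  -- missing `local`: the kind of mistake we want to catch
end)
print(ok, err)

-- Remove the guard again so the rest of the program is unaffected.
setmetatable(_G, nil)
```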
To sum it all up, a complete module foo.bar would look like this:
--- @module foo.bar
local bar = {}

local function happy_greet(greeting)
   print(greeting.."!!!! :-D")
end

function bar.say(greeting)
   happy_greet(greeting)
end

return bar
The result is that we type a bit more than we did with module(), and we risk polluting the global namespace if we’re not careful, but with this set of policies, we have:
- fairly self-documented code
- visibility rules readable through syntax
- modules that predictably return tables
- as much consistency and as little boilerplate as possible
…which mostly matches what I liked about module(), to the extent that can be done without _ENV tricks.
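To try a module written this way without creating foo/bar.lua, one option is to register it in package.preload. In this sketch the functions return the greeting instead of printing it, which is a change from the module shown earlier made only so the result can be inspected.

```lua
-- Register the module body under its full name; require() runs it on first use.
package.preload["foo.bar"] = function()
   --- @module foo.bar
   local bar = {}

   local function happy_greet(greeting)
      return greeting .. "!!!! :-D"
   end

   function bar.say(greeting)
      return happy_greet(greeting)
   end

   return bar
end

local bar = require("foo.bar")
print(bar.say("hello"))  --> hello!!!! :-D
print(bar.happy_greet)   --> nil (the helper stays private)
```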
I’ve been using these policies successfully in a university project, and my plan is to follow them when I update the LuaRocks codebase to drop the use of module(). Consider yourself encouraged to adopt some or hopefully all of them, but most importantly, whatever you do, be consistent! Good luck in this brave post-module() world!
🔗 Java: if you have trouble declaring a static hashmap…
Java (as of version 6, aka 1.6) does not allow you to declare a static HashMap as conveniently as an array. Still, you have the alternative of using a static block in your class to add fields. Take this example:
import java.util.HashMap;

public class StaticHashMapTest
{
   private final static HashMap<String, String> constants = new HashMap<String, String>();

   static
   {
      constants.put("A", "The Letter A");
      constants.put("B", "The Letter B");
      constants.put("C", "The Letter C");
   }

   /* Rest of your class that needs to know the consts */
}
This works fine. But then you want to map something a little more complex than a string to another string. And I don't mean something very complex... just, say, a string to a string and an integer (yes, you'd like to use some kind of "pair object", but it looks like Java does not have it).
So you go and try to do things The Java Way (tm) and create a tiny class just to hold your two values:
import java.util.HashMap;

public class StaticHashMapTest
{
   private class Pair {
      final String name;
      final int number;
      public Pair(String name, int number) {
         this.name = name;
         this.number = number;
      }
   }

   private final static HashMap<String, Pair> constants = new HashMap<String, Pair>();

   static
   {
      constants.put("A", new Pair("The Letter A", 123));
      constants.put("B", new Pair("The Letter B", 456));
      constants.put("C", new Pair("The Letter C", 789));
   }

   /* Rest of your class that needs to know the consts */
}
This should suffice, right? I even made the Pair class private to my class, to ensure good information hiding (that's what Java is all about, right?). Turns out this fails to compile:
StaticHashMapTest.java:18: non-static variable this cannot be referenced from a static context
      constants.put("A", new Pair("The Letter A", 123));
                         ^
StaticHashMapTest.java:19: non-static variable this cannot be referenced from a static context
      constants.put("B", new Pair("The Letter B", 456));
                         ^
StaticHashMapTest.java:20: non-static variable this cannot be referenced from a static context
      constants.put("C", new Pair("The Letter C", 789));
                         ^
3 errors
The error messages say that my "new" operators are failing due to the use of the "this" variable, which is not there at all! But hey, we can call "new" from a static context, can't we? We just did that when declaring the HashMap itself.
It turns out that the problem is that we're using an inner class. Objects from inner classes hold a "this" reference to their parent object (yes, as in myInnerObject.this.myParentAttribute... go figure), hence the trouble with the implicit "this" reference.
You have to make it a static nested class, which means its instances don't hold a reference to an instance of the enclosing class. Yes, that's yet another meaning for the word "static" in programming. Due to this peculiar meaning, nested classes are the only context where you can use the "static" qualifier on a class declaration in Java.
This, therefore, works:
import java.util.HashMap;

public class StaticHashMapTest
{
   private static class Pair {
      final String name;
      final int number;
      public Pair(String name, int number) {
         this.name = name;
         this.number = number;
      }
   }

   private final static HashMap<String, Pair> constants = new HashMap<String, Pair>();

   static
   {
      constants.put("A", new Pair("The Letter A", 123));
      constants.put("B", new Pair("The Letter B", 456));
      constants.put("C", new Pair("The Letter C", 789));
   }

   /* Rest of your class that needs to know the consts */
}
And that's Java for you.