hisham hm

Lua string concatenation considered not harmful

A user on the Lua mailing list recently asked the following question:

yield( splits[i-1][1]..word[i+1]..word[i]..splits[i+2][2] )

I tried table.concat and string.format, but both perform worst. This was
counter-intuitive to me, because Lua string concat generates copies of
intermediate strings. However, seems that for short strings and small number
of concatenated strings, string __concat performs better than string.format
or table.concat. Does anyone know if my observation is true?

The “folk wisdom” about copies of intermediate strings in Lua is often mis-stated, I think.

("aa"):upper() .. ("bb"):upper() .. ("cc"):upper() .. ("dd"):upper()

It translates to a single concatenation bytecode in both Lua and LuaJIT, so it produces the following strings in memory over the course of its execution:

"aa"
"bb"
"cc"
"dd"
"AA"
"BB"
"CC"
"DD"
"AABBCCDD"

This, on the other hand, does generate intermediate strings:

local s = ""
for _, w in ipairs({"aa", "bb", "cc", "dd"}) do
   s = s .. w:upper()  -- builds a new, longer string on each iteration
end

It produces

""
"aa"
"bb"
"cc"
"dd"
"AA"
"BB"
"CC"
"DD"
"AABB"
"AABBCC"
"AABBCCDD"

Notice the little pyramid at the end. This is the pattern people are told to avoid when they hear about “intermediate strings”. For a loop like that, one should do this instead:

local t = {}
for _, w in ipairs({"aa", "bb", "cc", "dd"}) do
   table.insert(t, w:upper())  -- collect the pieces in a table
end
local s = table.concat(t)      -- join them once at the end

That will produce:

"aa"
"bb"
"cc"
"dd"
"AA"
"BB"
"CC"
"DD"
"AABBCCDD"

plus an extra table. Of course this is an oversimplified example for illustration purposes, but often the loop is long and the naive approach above can produce a huge pyramid of intermediate strings.
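
As a rough, self-contained sketch of the two loops above, the following records every value s takes in the naive version, which makes the pyramid visible, and then builds the same result with a table buffer. The names words, seen, parts and joined are illustrative and not part of the original discussion.

```lua
local words = {"aa", "bb", "cc", "dd"}

-- Naive loop: each iteration builds a new, longer string.
local seen = {}   -- hypothetical log of every value s takes
local s = ""
for _, w in ipairs(words) do
   s = s .. w:upper()
   seen[#seen + 1] = s
end
print(table.concat(seen, " "))  --> AA AABB AABBCC AABBCCDD

-- Buffered loop: collect the pieces, then join once at the end.
local parts = {}
for _, w in ipairs(words) do
   parts[#parts + 1] = w:upper()
end
local joined = table.concat(parts)
print(joined)  --> AABBCCDD
```

With four words the pyramid is tiny; with thousands of iterations each step of the naive loop copies everything accumulated so far, which is where the cost comes from.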

Over the years, the sensible advice was somehow distorted into some “all string concatenation is evil” cargo-cult, but that is not true, especially for short sequences of concatenations in an expression. Using a..b..c will usually be cheaper and produce less garbage than either string.format("%s%s%s", a, b, c) or table.concat({a, b, c}).
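
One way to check this on a given interpreter is a small micro-benchmark along these lines. It is only a sketch: the helper measure, the iteration count N, and the sample strings are made up for illustration, and the numbers will vary between Lua versions and LuaJIT (which may optimize parts of the loop away). Because the inputs are constant short strings, the result is likely to be interned, so the memory column mostly reflects the temporary tables rather than string garbage.

```lua
-- Hypothetical helper: runs fn n times with the collector paused and
-- reports CPU time and heap growth.
local function measure(label, fn, n)
   collectgarbage("collect")
   collectgarbage("stop")                 -- keep the collector out of the measurement
   local mem0 = collectgarbage("count")   -- heap size in KB
   local t0 = os.clock()
   local last
   for _ = 1, n do
      last = fn()
   end
   local elapsed = os.clock() - t0
   local grown = collectgarbage("count") - mem0
   collectgarbage("restart")
   print(string.format("%-14s %.3f s  %8.0f KB", label, elapsed, grown))
   return last
end

local a, b, c = "foo", "bar", "baz"
local N = 200000  -- arbitrary; raise it if the timings are too small to compare

local function with_concat() return a .. b .. c end
local function with_format() return string.format("%s%s%s", a, b, c) end
local function with_table()  return table.concat({a, b, c}) end

measure("a..b..c", with_concat, N)
measure("string.format", with_format, N)
measure("table.concat", with_table, N)
```

All three functions return the same string; only the work done to build it differs.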

