Memory "compression" explained
So the memory compression approach turns out to work even better than I'd hoped. It's a bit hard to measure what level of compression is achieved because the data is now in metatables, and I don't think the Simulator's memory profiler can read those. I'm estimating the gains to be about three quarters though.
It relies heavily on string packing
The idea here is that storing small numbers in tables is very inefficient because lua has only a single number type. Whether it's the number 1, 35435636, 3.133333333, 5.13E12 or 202432451351634, it all takes the same amount of memory, depending on the system that would be 32 or 64 bits.
The numbers I use for my tilemap are all positive and smaller than 256. So, 8 bits should be enough. Considering a tile map could have dimensions of 300x300 tiles, with 5 properties each, that does add up to several megabytes. Add to that the overhead of storing it in nested tables instead of a binary array, and I was looking at 7MB of tile data. That's laugably inefficient, that's why I wrote "compression" in quotes: de-bloating might have been a better term.
To the rescue comes string packing, which lets you define a data format and convert the data to raw binary. So my 5 numbers in the range of 0-255 could be stored as 5 bytes written to a file.
When provided with the same format string, the data can be read just as easily. But, you can't read it all in memory, or it'll take up just as much memory as the original table would (remember Lua's inefficient number format).
The solution, then, is to keep these packed strings in memory, and unpack them whenever a tile is needed. The memory requirements for this are the size of all packed data, plus the size of a single unpacked tile.
One problem remains: the existing game logic heavily uses the tile map, and expects it to be a 3 dimensions ([column][row][tile property]) table. Rewriting all that would be a gruesome task.
To solve this last problem, we can turn to metatables
. With metatables, you can define the indexing operation. And in that indexing operation, we can unpack a tile. The implementation of loading a file is:
local brickT = {}
local format = levelProps.packFormat
local packSize = string.packsize(format)
local packSizeOffset = packSize-1
local unpackMeta = {
__index = function(tbl, idx)
-- tbl["compressed"] contains the packed string.
-- idx*5-4: each tile entry is 5 bytes long,
-- for idx 1 we should start reading at pos 1. So, 1*packSize-packSizeOffset = 1
return {unpack(format, tbl["compressed"], idx*packSize-packSizeOffset)}
end
}
local brickFile = file.open(path..".bin")
for x = 1, levelProps.sizeX do
brickT[x]= setmetatable({compressed = brickFile:read(5*levelProps.sizeY)}, unpackMeta)
end
Suppose our code needs the second property of the tile at x=5,y=3. It will be accessed as brickT[5][3][2]
. What happens under the hood?
brickT is an ordinary table. brickT[5]
returns a table containing all data for a column of tiles. This ,however, is a metatable without numerical entries. The only entry it has is called compressed
and it is a string of 5*sizeY bytes representing a column of tiles. When we index it with [3]
, that entry cannot be found, and the metatable is queried. This has the __index
method which calculates the byte-offset of the 3rd tile in the column-string, unpack a tile there and returns it. The returned value is a regular indexed table of size 5*, and the second entry [2]
can be accessed in the regular manner. We're done.
The great thing about the approach above is that the game logic is not aware that the table data is compressed. It requests brickT[5][3][2]
and gets what it expected. Note that if the code needs more of the tile than just entry [2], the tile should be stored in a local variable to prevent unpacking multiple times.
My last remaining fear was about performance: how much cpu time would the unpacking consume?
Luckily this turned out to be very minimal, perhaps 6% of the total cpu time according to the (Simulator!) profiler. There will be a compounding effect of earlier optimisation, because I already optimised the amount of redraw per frame for the tile-map. Not a lot of tile data is needed every frame.
@fosterdouglas and others have confirmed solid fps even with this technique applied; so I couldn't be happier
!
*the actual size is 6, as the unpacking operation inserts and extra number at the end, representing the index of the next byte in the packed data. We don't mind it being there; removing it costs cpu time while it does not bring us any advantage.