Collision checking library (a tale of reinventing the wheel and fighting the garbage collector)

This is the tale of how I implemented separating axis theorem collision checking for the Playdate, learned a bit about garbage collection and reinvented the wheel about 3 times in the process.
This post will be way too long, but hopefully it will be interesting or helpful to you, if you are stuck on collision checking or are having weird lags in your game.

To get your interest here is the end result (running on the Playdate):

I also released the code and the project you see above on Github:

Skip to the bottom for a TL;DR; of lessons learned.

Intro

Anyway, I needed some more complex collision checking for a game I was building. The SDK supports AABB collisions, but I needed full polygons (for reasons!).
I have been developing games on-and-off as a hobby for some years now and often rolled my own "physics", but usually in a crappy makeshift way. This was the time to do it right for once...

Lucky enough many people have done this before me and so I found a very helpful guide, which I was able to reproduce: Separating Axis Theorem (SAT) - Let's Make a Physics Engine [05] - YouTube

The first write

My first implementation was straight in Lua using the playgate.geometry.vector2D and playdate.geometry.polygon classes. After some initial hickups getting the vector directions and scalings right...
playdate-20240218-153134
...it basically worked great
playdate-20240126-223324

The first rewrite

I was however fighting with the SDK a bit:
The polygon class has helpful functions to access the individual vertices (needed for SAT). However these return a Point, not a Vector2D! Annoying! (since a Point does not have magnitude, dotProduct, etc.).
First I built the helper functions I needed for the Point class (leftNormal and dotProduct). Eventually I ditched the Polygon in favour of an array of Vector2Ds and build my own rendering function for it:

function drawV2DListAsPoly(list)
    for i=1, #list do
        local x1,y1 = list[i]:unpack()
        local x2,y2 = list[(i % #list)+1]:unpack()
        gfx.drawLine(x1,y1,x2,y2)
    end
end

-- same as drawV2DListAsPoly, but will draw each edge with increasing line width
-- this makes it easier to spot if poly is described in CW or CCW order
function drawV2DListAsPoly2(list)
    local w = gfx.getLineWidth()
    for i=1, #list do
        gfx.setLineWidth(i)
        local x1,y1 = list[i]:unpack()
        local x2,y2 = list[(i % #list)+1]:unpack()
        gfx.drawLine(x1,y1,x2,y2)
    end
    gfx.setLineWidth(w)
end

Fighting the garbage collector

This is the point when I started to develop the game more and hit a bottleneck pretty quickly: I was getting horrible lags.
You can kinda see it in the following gif. The moving ball stops from time to time and the fps drops from ~38 to ~33. (sorry for the choppy gif, Mirror keeps crashing on me whenever I try to record a video, so I had to do some workarounds).
2024-02-18 16-01-01
You see it even better in the Sampler of the Simulator:


Everybody who played Flipper Lifter will have also noticed this, since the game suffers from the same problem (at least that is my guess, since I am getting similar lags).

The code was generating too many temporary objects and the garbage collector did not get enough time to clean them up. So eventually it has to pause the main loop to clean-up or the Playdate would run out of memory.

We all know the following will create garbage, which needs to be cleaned up:

local objectA = playdate.geometry.vector2D.new(0,0)
-- the following will overwrite objectA, leaving the old data behind as garbage
objectA = playdate.geometry.vector2D.new(100,100)

But the following will also create two instances of garbage:

local objectA = playdate.geometry.vector2D.new(0,0)
local objectB = playdate.geometry.vector2D.new(10,10)
objectA = objectA + objectB * 2
-- objectB * 2 creates a temporary vector, since the multiply operator cannot modify either operand
-- objectA + [result of objectB * 2] creates another vector, which is then assigned to objectA, leaving the old data behind as garbage

This sort of calculation is very common in collision checking algorithms, so there was no way around it. Luckily the vector2D class implements the addVector and scale operators. So the better code would be:

objectB:scale(2)
objectA:addVector(objectB)

Still this is not ideal, since it modifies objectB and two function calls into C are involved. Also there are no in-memory variants of leftNormal and rightNormal for example.

The second rewrite

I ended up writing in-memory variants of the functions I needed for collision checking.

local function leftNormalIM(target, v)
    target.x, target.y = v.y, -v.x
end

local function dirNormalized(target, pa, pb)
    local len = math.sqrt((pa.x - pb.x)^2 + (pa.y - pb.y)^2)
    target.x, target.y = (pa.x - pb.x) / len, (pa.y - pb.y) / len
end

Running the new code did indeed improve the situation. Setting playdate.setCollectsGarbage(false) I used the sampler to inspect the number of frames between emergency garbage collection (basically spikes in the CPU usage like the one above).
Before I had this happen about every ~195th frame, now it was up to only every ~280th (a 40% improvement).

Unfortunately CPU usage was up by about 20% (55% -> 65% CPU time just for the collision checking of my simple example scene).
This was going nowhere, so there was only one solution:

The third rewrite (in C)

I rewrote the whole collision checking code in C. Leading to the result from the beginning of the post.

The example scene with 20 moving balls and one static polygon, all colliding with each other, runs at a pretty stable 50fps on the Playdate hardware (using about 50% CPU time for collision checking). The collision resolution is done in Lua, as is rendering, etc.

You can download a PDX to try the scene from the gif for yourself from the Github releases. It is quite mesmerizing to look at:

I am happy with the results so far, but there are still more improvements to be done. I will probably update the code as I go along. Feel free to contribute.

But for now, back to developing the actual game... :smiley:

Lessons Learned

  • Separating axis theorem is actually pretty straightforward to implement. I probably should have looked at the theory way sooner.
  • Be aware which parts of your code create garbage. Especially anything on the hot path should be avoided (i.e. running every update call or even multiple times per frame).
  • For this reason: avoid using the mathematical operators (+,-,*,/,negation) with playdate.geometry.vector2D and consider using scale, addVector, normalize instead
  • Implementing anything in C will speed it up by a lot (probably 10x or more). On the flipside development becomes harder (unclear error messages, two IDEs, two debuggers, cannot step from Lua into C code and vice-versa, more tools involved, etc.)
11 Likes

Interesting post! Admit I did start reading and wondered when the C coding would appear, since that's the classic Lua performance pattern (escape to C). You mention having to use two IDEs at the end though? Just wondering why that is reqd (time to try the C API myself I guess). I'm using Vscode and was assuming it could just handle Lua and C...

I probably should have reworded that, because it depends on some choices I made:

  • Develop and build on Windows
  • Using VisualStudio instead of NMake

If you are on Linux or go the NMake route, you can probably do everything in VSCode or Vim.
I just went with the VisualStudio route, because it seemed like the easier route and worked out of the box (including debugging). I never went back and tried NMake.
Honestly the two IDE thing is not the biggest deal. Being able to step from Lua to C and vice versa would be a bigger plus (not sure if that is possible when you do everything in one IDE)

Did you know the SDK has playdate.geometry.polygon:containsPoint(x, y) and polygon:intersec(p)? It is very fast.

Anyway, I made an attempt of a poly:poly SAT function using pure Lua and managed about 75% of the speed of containsPoint but far slower than just calling polygon:intersect(p) . You can also remove all the sqrts from the SAT algo if you're just looking for boolean collision not depth of intersection, as with containsPoint.