Lua Stack Overflow without recursion

I'm getting the following, pretty sparse crash message on device:

rendering step	77601	406801
num Notes	0
num Notes	3
num Notes	2
num Notes	2
num Notes	2
num Notes	2
num Notes	1
num Notes	4
num Notes	0
num Notes	0
rendering step	78001	406801
num Notes	0
num Notes	1
num Notes	2
num Notes	2
num Notes	2
num Notes	2
num Notes	1
num Notes	2
num Notes	0
num Notes	0
stack overflow at Core/minilua/lapi.c:510
stack traceback:
	[C]: in ?

On simulator, the code works well.

Here is the function where the stack overflow happens. Its general purpose is to draw MIDI notes as pixels onto images called strips. A step is a time unit in MIDI, so the larger stepWindow is, the more notes are processed during each coroutine invocation. It is a very demanding job that takes many seconds to complete, which is why it runs in a coroutine. There is no recursion, but there are some nested for-loops.

function View:buildStripsYielding()
    local stepWindow = lume.clamp(viewModel:getTempo(), 1, 400)
    local numTracks = viewModel.numTracks
    local numSteps = viewModel:getNumSteps()
    local stepsPerPixel = lume.clamp(
        numSteps / screenW,
        1, 4
    )
    if numSteps / stepsPerPixel > maxStripWidth then
        -- this would be too wide for comfort, bitmap size is limiting factor
        stepsPerPixel = numSteps / maxStripWidth
    end
    stripWidth = numSteps / stepsPerPixel
    print("steps per pixel", stepsPerPixel, "stripWidth", stripWidth)
    listView:setNumberOfRows(numTracks)
    for i = 1, numTracks do
        coroutine.yield()
        local curStrip = gfx.image.new(stripWidth, rowHeight)
        trackStrips[i] = curStrip
    end
    for step = 1, numSteps, stepWindow do
        coroutine.yield()
        print("rendering step", step, numSteps)
        for i = 1, numTracks do
            local curStrip = trackStrips[i]
            gfx.pushContext(curStrip)

            local notes = viewModel:getNotes(i, step, step + stepWindow)
            for _, note in ipairs(notes) do
                for curStep = note.step, note.step + note.length do
                    gfx.drawPixel(floor(curStep/stepsPerPixel), ((127-note.note) / 127) * rowHeight)
                end
            end
            gfx.popContext()
        end
    end
end

More specifically, when I remove the for step = 1, numSteps, stepWindow do loop and all its nested contents, the crash does not occur.
Lowering the stepWindow maximum (400) might postpone the inevitable crash.
The crash does not happen immediately; the code manages to run for a few seconds first.

Can someone explain what causes the stack overflow and how to resolve it?

The maximum for-loop nesting depth seems to be 4, which I do not consider extreme.
For the song being played, the tempo, and thus stepWindow, is clamped to the maximum of 400. The program runs at 20 fps. We can then deduce that it crashed after about 10 seconds (step 78000 / 400 / 20 = 9.75). That might raise suspicion, as 10 seconds is the limit at which the SDK kills a game for being unresponsive. But I would say that's a red herring in this case, as the game loop is not blocked and the screen is indeed updated 20 times per second while it is running.
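The estimate above can be written out as a quick calculation. This is only a sketch: the values come from the log and the description, and it assumes the coroutine is resumed exactly once per frame.

```lua
-- Values taken from the log output and the description above.
local lastStep = 78000   -- approximate step reached before the crash
local stepWindow = 400   -- steps processed per coroutine resume
local fps = 20           -- assumption: one resume per frame at 20 fps

local resumes = lastStep / stepWindow  -- number of resumes before the crash
local seconds = resumes / fps          -- elapsed time in seconds

print("resumes", resumes, "seconds", seconds)
```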

Full project code: https://github.com/ninovanhooff/MIDI-Master/blob/da865ffb03c730fe675605331dd1e5ae9b2efb75/Source/editor/EditorView.lua#L231

The rather obvious way to remove one level of for-loop nesting was to use drawLine rather than drawPixel, since all pixels of a note are drawn in one horizontal line. This fixed the crash.

I would still be interested to know what the maximum for-loop depth or stack size is.
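Roughly, the change looks like the sketch below. The gfx table here is a stand-in that only records calls so the snippet runs outside the SDK, and drawNote is a hypothetical helper name; in the real project the call would go to playdate.graphics.drawLine inside the existing pushContext/popContext pair.

```lua
-- Stand-in for playdate.graphics (assumption, so this runs without the SDK).
local calls = {}
local gfx = {
    drawLine = function(x1, y1, x2, y2)
        calls[#calls + 1] = { x1, y1, x2, y2 }
    end,
}
local floor = math.floor

-- Hypothetical helper: draws one note as a single horizontal line
-- instead of looping over every step and calling drawPixel.
local function drawNote(note, stepsPerPixel, rowHeight)
    local y = ((127 - note.note) / 127) * rowHeight
    local x1 = floor(note.step / stepsPerPixel)
    local x2 = floor((note.step + note.length) / stepsPerPixel)
    gfx.drawLine(x1, y, x2, y)
end

-- Example note with made-up values.
drawNote({ step = 8, length = 4, note = 127 }, 2, 20)
print("drawLine calls", #calls)
```

This replaces the innermost for curStep loop with one call per note, so the number of graphics calls per note no longer depends on the note length.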

A for loop shouldn’t be overflowing the stack. It uses constant space. My guess is that it’s not your code’s stack that overflows, but the stack of Lua values that the C API uses as scratch space. There’s probably a mismatched push/pop inside one of the functions that’s overflowing it.

When I used to write Lua functions in C, I made a special macro that would make sure the stack didn’t unintentionally grow. It’s super easy to lose track of what you’re doing and leak stack space.
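To illustrate the point about for loops using constant stack space, here is a small plain-Lua sketch (nothing Playdate-specific, and the function name recurse is just for illustration). Four levels of nested loops run without trouble, while unbounded non-tail recursion is what actually exhausts the Lua stack.

```lua
-- Nested for loops: many iterations, but no additional stack frames.
local count = 0
for a = 1, 10 do
    for b = 1, 10 do
        for c = 1, 10 do
            for d = 1, 100 do
                count = count + 1
            end
        end
    end
end
print("loop iterations", count)

-- Unbounded non-tail recursion: every call adds a frame until Lua
-- raises a stack overflow error, which pcall catches here.
local function recurse(n)
    return 1 + recurse(n + 1)
end

local ok, err = pcall(recurse, 1)
print("recursion succeeded?", ok)
print("error message", err)
```

This only demonstrates overflow of the Lua call stack. The overflow reported at lapi.c in the crash log would be on the C API side, which a pure Lua script cannot reproduce.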

All right, I tried again on hardware and this crash still occurs regularly.

As mentioned before, this code works well on the simulator.

@dave can this be promoted to a bug?

Is this still an issue? From the other thread it sounds like the only issue left is re-setting the track instruments while the sequence is playing.

And I remember now what held me up with looking into this, getting toybox.py set up wasn't the easiest thing. If you want me to take a look, can you post a pdx build where you were seeing the problem, and steps I should take to trigger the crash? Thanks!

Thanks for checking in.
I don't think this is still a problem. I didn't track which change or firmware update fixed it.

I'll re-open this if it turns out it is still an issue.