Image Implementation Questions

A couple of questions regarding the Image type's implementation:

  • for image:draw(), is the data immediately sent to the display?

  • for pushContext() + draw() + popContext(), can someone provide intuition on why it's much slower than direct draw() calls? (note: this is in the case of many draw() calls, only calling push/pop once, not wrapping each draw call)

  • is passing Lua image types into C supported?

  • would it be possible to have an API which allows for image:draw( list of positions) for drawing the same image in multiple locations more efficiently?

cc'ing @matt @Nic who may know / or be interested (apologies if not!)


for image:draw(), is the data immediately sent to the display?

No. By default it is drawn into the working frame buffer. You can get a copy of that buffer with playdate.graphics.getWorkingImage().
Before the next playdate.update(), the working frame buffer becomes the display frame buffer, which is then sent to the screen.

for pushContext() + draw() + popContext(), can someone provide intuition on why it's much slower than direct draw() calls? (note: this is in the case of many draw() calls, only calling push/pop once, not wrapping each draw call)

I actually never realised it was significantly slower. It would be interesting to benchmark.
It might be slower for several reasons. Maybe the system frame buffers are stored in a faster part of the RAM. pushContext() also has to set a bunch of states. Did you try calling playdate.graphics.lockFocus() instead?
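
A rough way to benchmark this on device is sketched below. It is only an illustration, not the original poster's test: the off-screen target image, the blank 32x32 test image, the draw count, and the helper names are all assumptions, and it assumes the slow case draws into an image context. It times the same batch of draw() calls directly, inside one pushContext()/popContext() pair, and inside one lockFocus()/unlockFocus() pair.

```lua
import "CoreLibs/graphics"

local gfx = playdate.graphics

-- Assumptions: a blank 32x32 test image and a full-screen off-screen target.
local img = gfx.image.new(32, 32, gfx.kColorBlack)
local target = gfx.image.new(400, 240)
local DRAW_COUNT = 200

-- Issue the same batch of draw calls regardless of the active context.
local function drawMany()
    for i = 1, DRAW_COUNT do
        img:draw((i * 7) % 368, (i * 3) % 208)
    end
end

-- Run fn once and print how long it took, in milliseconds.
local function timeIt(label, fn)
    playdate.resetElapsedTime()
    fn()
    print(string.format("%s: %.2f ms", label, playdate.getElapsedTime() * 1000))
end

local done = false

function playdate.update()
    if done then return end
    done = true

    gfx.clear()
    timeIt("direct draw", drawMany)

    timeIt("pushContext", function()
        gfx.pushContext(target)
        drawMany()
        gfx.popContext()
    end)

    timeIt("lockFocus", function()
        gfx.lockFocus(target)
        drawMany()
        gfx.unlockFocus()
    end)
end
```

A single run is noisy, so averaging over many frames would give more trustworthy numbers.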

is passing Lua image types into C supported?

Yes. You can use playdate->lua->getBitmap() and playdate->lua->pushBitmap().

Would it be possible to have an API which allows for image:draw( list of positions) for drawing the same image in multiple locations more efficiently?

Technically possible, but I don't know if there would be much gain from it. I could imagine that if some of the positions have the same bit alignment, some optimisation would be possible, for example.


Yeah, after reading Benchmarks & Optimisations (Matt's benchmarks) I tried both approaches (I just specified one for brevity), and they had roughly the same performance dip. It's not super relevant since it's already doing a single push to the screen; I was interested in it as a form of batching in case that wasn't happening.

I agree there wouldn't be much gain for the final question.

A bit more context: the game I'm working on is sitting pretty steadily at the 48fps full-screen-redraw capacity of the system (on device), but it's getting close to the limit (teetering around 80% CPU). This is with ~30-50 draw calls per frame (which comprise ~50% of the total frame), with a fair bit of other computation going on (collision detection, animation stepping, enemy spawning, level tile paging, sound tracks, sound fx). Does this seem like a reasonable ballpark for the cap on how much drawing can be done per frame?

It depends. :upside_down_face:

Are you using sprites? They have their own overhead. But they offer drawing optimisations by tracking dirty areas and allow partial refresh of the display.

If you're not using sprites how exactly are you drawing? A little more detail may help.

Also the frame rate for full screen (48 at the moment but could be 50 I believe if Dave makes some optimisations) is only if all pixels are redrawn (for example clear screen before your draw). Without a clear screen the performance is based only on the number of changed rows that need to be sent to the display.

So a few recommendations. Try the sprite system to do partial redraws. Measure with an unlocked frame rate, setRefreshRate(0), to see how high you can go. When I did this for Daily Driver I got to between 70 and 90 FPS, which is why I decided to write a simple frame limiter to target 60fps.

Display can be driven at 200fps with only 75 rows changing, and 60 with 175 rows changing. I think I'm remembering the numbers correctly.


For sure, I'll try to answer each.

For how drawing works, it's just:

--lua  
-- info type for each entity set is just a packed set of [x, y, anim_index]
for i=1,#e.info do
   local d = e.info[i]
   e_img_set[d.anim_index]:draw(d.x, d.y)
end

I'm not using sprites. Early on I measured sprites vs. images for my use case, and sprites were significantly slower due to a few factors. The overhead of moveTo/moveBy was pretty significant (things are moving all over the screen every frame), and sprites also preclude organizing data in a more efficient way (SOA or hot/cold splitting, for example). The associated collision detection was also far too slow for my needs (this isn't a knock on the SDK, they just solved a far more general problem than I needed).

Edit: an additional note. Things are added and removed often, and this also factored into the performance when measuring images vs. sprites early on. For all the associated operations, a much simpler, faster thing could be done. Again, this comes down to control of data layout.
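
To illustrate what control of data layout can look like, here is a minimal sketch of a structure-of-arrays (SOA) entity set with O(1) swap-removal. It is not the actual game code: the names new_set, add, remove and draw_all are hypothetical, and only the x, y and anim_index fields come from the loop shown earlier.

```lua
-- Hypothetical structure-of-arrays entity set: one array per field
-- instead of one table per entity.
local function new_set()
    return { n = 0, x = {}, y = {}, anim_index = {} }
end

-- Append an entity and return its slot index.
local function add(set, x, y, anim_index)
    local i = set.n + 1
    set.n = i
    set.x[i], set.y[i], set.anim_index[i] = x, y, anim_index
    return i
end

-- Swap-remove: move the last entity into slot i, then shrink by one.
-- This is O(1) but does not preserve ordering.
local function remove(set, i)
    local last = set.n
    set.x[i], set.y[i], set.anim_index[i] =
        set.x[last], set.y[last], set.anim_index[last]
    set.x[last], set.y[last], set.anim_index[last] = nil, nil, nil
    set.n = last - 1
end

-- Draw every live entity. `frames` is an array of objects with a
-- :draw(x, y) method (on device, playdate.graphics.image instances).
local function draw_all(set, frames)
    local xs, ys, anims = set.x, set.y, set.anim_index
    for i = 1, set.n do
        frames[anims[i]]:draw(xs[i], ys[i])
    end
end
```

Because removal swaps the last entity into the freed slot, slot indices are not stable, so anything that holds on to an index would need extra bookkeeping.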

For partial screen redraws, this isn't really possible in this case (possibly why sprites performed so poorly), as every row necessarily changes by design (beyond the action on screen, a vertically scrolling bg is drawn). So my question is more about how far things can go within the ~20ms frame at 48 (possibly 50) fps.

Forgot to note on the last bit: I currently do run with an uncapped framerate.

A typical frame looks like (on device)

These draw_ calls (just a local alias for image:draw) correspond to the various entity types, drawn in the fashion described above and split by group (this layout provides a few advantages, but that's probably a chat for another day :))

My questions were answered (thank you again v much @Nic).

I just wanted to add this for posterity, in case someone comes poking through curious about image drawing vs sprite drawing (or maybe @matt will find it interesting for their benchmarking work) in the kinds of scenarios I described above (lots of removals, or lots of movement on many sprites). I dug up a small test which exercises the latter (movement). Raw numbers aren't particularly important (since nothing else is going on, i.e. it's not a real game); rather, the scaling properties were the focus here.

Tested on device, on newest firmware.

  • at 50 images (sprites = 47fps, images = 48fps)
  • at 100 images (sprites = 37fps, images = 48fps), ~30% faster
  • at 150 images (sprites = 27fps, images = 36fps), ~33% faster
  • at 200 images (sprites = 20fps, images = 29fps), ~45% faster
  • at 250 images (sprites = 18fps, images = 27fps), ~50% faster
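
For reference, the "faster" percentages are the ratio of the two frame rates. The small snippet below reproduces that arithmetic from the FPS figures in the list above and also converts them to per-frame times; the function name speedup_percent is just an illustrative choice.

```lua
-- FPS figures copied from the list above: { image count, sprite fps, image fps }.
local results = {
    { 50, 47, 48 },
    { 100, 37, 48 },
    { 150, 27, 36 },
    { 200, 20, 29 },
    { 250, 18, 27 },
}

-- How much faster (in percent) the image path is than the sprite path.
local function speedup_percent(sprite_fps, image_fps)
    return (image_fps / sprite_fps - 1) * 100
end

for _, r in ipairs(results) do
    local count, sprite_fps, image_fps = r[1], r[2], r[3]
    print(string.format(
        "%3d images: sprites %.1f ms/frame, images %.1f ms/frame, images ~%.0f%% faster",
        count, 1000 / sprite_fps, 1000 / image_fps,
        speedup_percent(sprite_fps, image_fps)))
end
```

Frame time is often easier to reason about than FPS here, since the cost of each extra draw call adds up in milliseconds rather than in frames per second.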

image used:
image-1

import "CoreLibs/graphics"
import "CoreLibs/animation"
import "CoreLibs/sprites"
import "CoreLibs/timer"
import "CoreLibs/utilities/sampler"

gfx = playdate.graphics
snd = playdate.sound
dsp = playdate.display

gfx.setColor(gfx.kColorWhite)
dsp.setRefreshRate(0)

num_sprites = 250

images = {}
x_min = 10
x_max = 300
y_ = 30
x_ = 10
for i=1,num_sprites do 
    table.insert(images, {gfx.image.new("image-1.png"), x_, y_, 1})
    x_ = x_+3
    y_ = y_+1
end

sprites = {}

local image = gfx.image.new("image-1.png")
y_ = 30
x_ = 10
for i=1,num_sprites do 
    local sp = gfx.sprite.new(image)
    sp:add()
    sp:moveTo(x_, y_)
    table.insert(sprites, sp)
    x_ = x_+3
    y_ = y_+1
    sp.dir = 1
end

for i=1,num_sprites do
    assert(images[i][4] == sprites[i].dir)
    assert(images[i][3] == sprites[i].y)
    assert(images[i][2] == sprites[i].x)
end

local use_sprites = true
function playdate.update()
    gfx.clear()
    if playdate.buttonJustPressed(playdate.kButtonA) then
        use_sprites = not use_sprites 
    end

    if use_sprites == true then
        for i=1,#sprites do
            local spr = sprites[i]
            spr:moveTo(spr.x+10*spr.dir, spr.y)
            if spr.x >= x_max then
                spr.dir = -1
            elseif spr.x <= x_min then
                spr.dir = 1
            end
        end

        gfx.setColor(gfx.kColorBlack)
        --sample("drawing", function() 
        gfx.sprite.update()
        --end)

        gfx.setLineWidth(0.5)
        gfx.drawLine(0, 220, 400, 220, 2.0)
        gfx.drawText("sprite mode", 0, 0)
        playdate.drawFPS(340, 225)
    else
        for i=1,#images do
            local img = images[i]
            img[2] = img[2]+10*img[4]
            if img[2] >= x_max then
                img[4] = -1
            elseif img[2] <= x_min then
                img[4] = 1
            end
        end
 
        --sample("drawing", function() 
        for i=1,#images do
            local img = images[i]
            img[1]:draw(img[2], img[3])
        end
        --end)

        gfx.setLineWidth(0.5)
        gfx.drawLine(0, 220, 400, 220, 2.0)
        gfx.drawText("image mode", 0, 0)
        playdate.drawFPS(340, 225)
    end
end