Benchmarks & Optimisations

While making my game, in my endless quest for 60fps, I have found some interesting performance hacks.

I have a small benchmarking "game" that I developed to inspect performance of various commands between simulator and device, but have also used it to profile different versions of the system software.

The raw numbers (how many times a command can be called in a single frame/update) matter less than how they compare with each other; those comparisons can lead to some less-than-obvious coding choices.

Example

Approaches to drawing a single-pixel horizontal line, compared with drawLine, using Lua, on Device:

0.9.1 (I have older versions if needed)

  • 5% faster to use fillRect
  • 45% slower to use drawRect

0.10.0 (bizarrely slow compared to previous and following system software!?)

  • 15% faster to use fillRect
  • 29% slower to use drawRect

0.10.2 (much faster than previous system software)

  • 27% faster to use fillRect
  • 43% slower to use drawRect

0.11.1 (slightly slower than previous system software)

  • 31% faster to use fillRect
  • 41% slower to use drawRect

0.12.0 (generally faster than previous system software, almost as fast as it's ever been)

  • 46% faster to use fillRect
  • 42% slower to use drawRect
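For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of the arithmetic, assuming the percentages are the relative difference in calls per frame against drawLine as the baseline. The function name and the call counts are made up for illustration; they are not measured results.

```lua
-- Percentage change in calls-per-frame relative to a baseline command.
-- Positive means the candidate fitted more calls into a frame (faster),
-- negative means it fitted fewer (slower).
local function percentVsBaseline(baselineCalls, candidateCalls)
    return (candidateCalls - baselineCalls) * 100 / baselineCalls
end

-- Hypothetical counts, for illustration only (not real benchmark output).
local drawLineCalls = 1000
local fillRectCalls = 1460
local drawRectCalls = 580

print(string.format("fillRect: %+.0f%%", percentVsBaseline(drawLineCalls, fillRectCalls)))
print(string.format("drawRect: %+.0f%%", percentVsBaseline(drawLineCalls, drawRectCalls)))
```

Because everything is relative to a baseline measured on the same firmware, the comparison stays meaningful even when the absolute counts shift between system software versions.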

Here's a table for a bunch of functions:

OUTDATED, see below

0.12.0 vs 0.10.2 and 0.12.0 vs 0.9.1


edited: 2020-12-07

Others

Has anybody else found any other unexpected performance tweaks?


Here's the source for my benchmarking app: bench.zip (8.3 KB)

runs at 50fps by default

edit: 2020-12-19, added

  • fSpriteSetVisible

edit: 2020-10-27, added

  • fSpriteSetImage
  • fSpriteSetCenterStatic
  • fSpriteSetCenterToggle
  • fSpriteSetCenterRandom

Code improvements accepted!

import "CoreLibs/graphics"
import "CoreLibs/sprites"
import "CoreLibs/timer"

local gfx <const> = playdate.graphics
local format <const> = string.format
local wait <const> = playdate.wait
local update <const> = playdate.graphics.sprite.update
local updateTimers <const> = playdate.timer.updateTimers
local cos <const> = math.cos
local sin <const> = math.sin
local floor <const> = math.floor
local random <const> = math.random
local drawLine <const> = playdate.graphics.drawLine
local drawText <const> = playdate.graphics.drawText
local drawTextInRect <const> = playdate.graphics.drawTextInRect
local drawRect <const> = playdate.graphics.drawRect
local fillRect <const> = playdate.graphics.fillRect
local drawCircleAtPoint <const> = playdate.graphics.drawCircleAtPoint
local fillCircleAtPoint <const> = playdate.graphics.fillCircleAtPoint
local drawCircleInRect <const> = playdate.graphics.drawCircleInRect
local fillCircleInRect <const> = playdate.graphics.fillCircleInRect
local lockFocus <const> = playdate.graphics.lockFocus
local unlockFocus <const> = playdate.graphics.unlockFocus
local pushContext <const> = playdate.graphics.pushContext
local popContext <const> = playdate.graphics.popContext

local now = playdate.getCurrentTimeMilliseconds

math.randomseed(0)    -- same random numbers every time
local function rnd(x)
    return random(0,x)
end

local playerSprite = nil
local FPS = 50
local frameMS = 1000/FPS
playdate.display.setRefreshRate(FPS)

-- keep this local too, rather than leaking a global
local location <const> = playdate.isSimulator and "SIMULATOR" or "DEVICE"

local W <const> = playdate.display.getWidth()
local H <const> = playdate.display.getHeight()
local CW <const> = W/2
local CH <const> = H/2

local start = 0
local count = 0
local cmd = 1
local runs = 5
local cmdName = ""
local done = false
local testImage <const> = gfx.image.new( "Images/background" )
local testBack = gfx.image.new( "Images/background" )
local testSprite <const> = gfx.image.new("Images/playerImage")

local function fNil() end
local function fDrawLineDiagonal() drawLine(0, 0, W, H) return "drawLine - Diagonal" end
local function fDrawLineHorizontal() drawLine(0, CH, W, CH) return "drawLine - Horizontal" end
local function fDrawLineVertical() drawLine(CW, 0, CW, H) return "drawLine - Vertical" end
local function fDrawLineRandomDiagonal() drawLine(0, rnd(H), W, rnd(H)) return "drawLine - Random Diagonal" end
local function fDrawLineFillRect() fillRect(0, CH, W, 1) return "drawLine - fillRect" end
local function fDrawLineDrawRect() drawRect(0, CH, W, 1) return "drawLine - drawRect" end
local function fMathRandom() math.random(0,999) return "math.random" end
local function fMathRandomLocal() random(0,999) return "math.random - local" end
local function fMathSin() sin(90) return "math.sin" end
local function fMathSinRandom() sin(rnd(359)) return "math.sin - random" end
local function fMathCos() cos(90) return "math.cos" end
local function fMathCosRandom() cos(rnd(359)) return "math.cos - random" end
local function fMathFloor() floor(1.23) return "math.floor - local" end
local function fImageSample() testImage:sample(0, 0) return "image:sample" end
local function fDrawText() drawText("TEST!", CW-22, CH+50) return "drawText - local" end
local function fDrawTextInRect() drawTextInRect("TEST!", CW-25, CH+50, 50, 50, nil, nil, kTextAlignment.center) return "drawTextInRect" end
local function fDrawRect() drawRect(CW-50, CH-50, 100, 100) return "drawRect" end
local function fFillRect() fillRect(CW-50, CH-50, 100, 100) return "fillRect" end
local function fDrawCircleAtPoint() drawCircleAtPoint(CW, CH, 100) return "drawCircleAtPoint" end
local function fFillCircleAtPoint() fillCircleAtPoint(CW, CH, 100) return "fillCircleAtPoint" end
local function fDrawCircleInRect() drawCircleInRect(CW-50, CH-50, 100, 100) return "drawCircleInRect" end
local function fFillCircleInRect() fillCircleInRect(CW-50, CH-50, 100, 100) return "fillCircleInRect" end
local function fSpriteMoveToStatic() playerSprite:setVisible(true) playerSprite:moveTo(CW, CH) return "sprite:moveTo - static" end
local function fSpriteMoveToRandom() playerSprite:setVisible(true) playerSprite:moveTo(rnd(W), rnd(H)) return "sprite:moveTo - random" end
local function fSpriteSetImage() playerSprite:setImage( testSprite ) return "sprite:setImage" end
local function fSpriteSetCenterStatic() playerSprite:setCenter(0, 0) return "sprite:setCenter - static" end
local function fSpriteSetCenterToggle() playerSprite:setCenter(playerSprite.x and 400 or 0, 0) return "sprite:setCenter - toggle" end
local function fSpriteSetCenterRandom() playerSprite:setCenter(rnd(399), 0) return "sprite:setCenter - random" end
local function fSpriteSetZIndex() playerSprite:setZIndex(1) return "sprite:setZIndex" end
local function fDraw() testSprite:draw(0,0) return "image:draw" end
local function fDrawLocked() gfx.lockFocus(testBack) testSprite:draw(0,0) gfx.unlockFocus() return "image:draw - locked" end
local function fDrawLockedLocal() lockFocus(testBack) testSprite:draw(0,0) unlockFocus() return "image:draw - locked local" end
local function fDrawPushContext() pushContext(testBack) testSprite:draw(0,0) popContext() return "image:draw - pushcontext local" end
-- local function () return "" end

local funcs = {
    fNil,
    fDrawLineDiagonal,
    fDrawLineHorizontal,
    fDrawLineVertical,
    fDrawLineRandomDiagonal,
    fDrawLineFillRect,
    fDrawLineDrawRect,
    fMathRandom,
    fMathRandomLocal,
    fMathSin,
    fMathSinRandom,
    fMathCos,
    fMathCosRandom,
    fMathFloor,
    fImageSample,
    fDrawText,
    fDrawTextInRect,
    fDrawRect,
    fFillRect,
    fDrawCircleAtPoint,
    fFillCircleAtPoint,
    fDrawCircleInRect,
    fFillCircleInRect,
    fSpriteMoveToStatic,
    fSpriteMoveToRandom,
    fSpriteSetImage,
    fSpriteSetCenterStatic,
    fSpriteSetCenterToggle,
    fSpriteSetCenterRandom,
    fSpriteSetZIndex,
    fDraw,
    fDrawLocked,
    fDrawLockedLocal,
    fDrawPushContext,
}
local max = #funcs

function myGameSetUp()

    local playerImage = gfx.image.new("Images/playerImage")
    assert( playerImage ) -- make sure the image was where we thought

    playerSprite = gfx.sprite.new()
    playerSprite:setImage( playerImage )
    playerSprite:setCenter( 0.5, 0.5 )
    playerSprite:moveTo( CW, CH )
    playerSprite:setVisible(false)
    playerSprite:add() -- This is critical!

    local backgroundImage = gfx.image.new( "Images/background" )
    assert( backgroundImage )
    lockFocus(backgroundImage)
    playdate.graphics.drawTextInRect("*BENCH*", CW-50, CH-7, 100, 50, nil, nil, kTextAlignment.center)
    unlockFocus() -- takes no arguments

    gfx.sprite.setBackgroundDrawingCallback(
        function( x, y, width, height )
            gfx.setClipRect( x, y, width, height ) -- just draw what we need
            backgroundImage:draw( 0, 0 )
            gfx.clearClipRect()
        end
    )
end

myGameSetUp()

function playdate.gameWillResume()
    cmd = 1
    start = 0
    wait(300)
end

function playdate.update()
    -- playdate.graphics.sprite.update()
    -- playdate.timer.updateTimers()
    update()
    updateTimers()

    if start == 0 then
        start = now()
    end

    if cmd == 1 then
        print(location.." ("..runs.." RUN AVE)\n")
        print("#,"," BENCH,","CALL")
    end

    -- run command for a frame worth of milliseconds
    if start > 0 and cmd > 0 and cmd <= max then -- approx calls: 104000 sim, 3000 device
        for i = 1,runs do
            while now() < start+frameMS do
                cmdName = funcs[cmd]()

                count = count + 1
            end
            start = now()
        end
        done = true
    end

    if done == true then
        print(format("%02d,\t%6d,", cmd, count//runs), cmdName)

        done = false
        count = 0
        start = 0
        cmd = cmd + 1
    end

    if cmd == max+1 then
        print("\nEND")
        cmd = cmd + 1
        playerSprite:setVisible(false)
        playdate.display.setRefreshRate(30)
    end
end
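If you want to collect the console output for comparison later, here is a rough sketch of parsing one result row in plain Lua. It assumes the rows look like the format string in the code above (index, averaged call count, name, separated by commas and whitespace). `parseResultLine` is a hypothetical helper, not part of the benchmark or the SDK.

```lua
-- Parse one result row printed by the benchmark, for example:
--   "02,\t  3000,\tdrawLine - Diagonal"
-- Returns index, averaged call count and name, or nil for non-result lines
-- such as the header or the final "END".
local function parseResultLine(line)
    -- %s* tolerates whatever mix of tabs and spaces print() produced
    local index, calls, name = line:match("^(%d+),%s*(%d+),%s*(.-)%s*$")
    if not index then
        return nil
    end
    return tonumber(index), tonumber(calls), name
end

local index, calls, name = parseResultLine("02,\t  3000,\tdrawLine - Diagonal")
print(index, calls, name)
```

The exact whitespace may differ between simulator and device consoles, which is why the pattern only relies on the commas.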

To talk a bit about optimisations.

local

Variables benefit from being local, and so does every function you call! A call like playdate.graphics.drawLine is a chain of table lookups that Lua repeats on every call, whereas a local is looked up once and then reused.

So you can see in my code above that every function I use is redeclared as a shorter-named, local version. This results in a considerable performance gain. That includes doing it to SDK functions like playdate.graphics.sprite.update()!
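Here is a small plain-Lua sketch of the same idea that runs outside the Playdate SDK. It counts how many calls fit in a fixed time budget, much like the benchmark does, but uses the standard `os.clock()` as the timer. `callsWithin` is a hypothetical helper, and the 0.05 second budget is an arbitrary choice.

```lua
-- Count how many times fn can be called within a time budget.
-- os.clock() is standard Lua CPU time, standing in for the SDK's millisecond timer.
local function callsWithin(seconds, fn)
    local calls = 0
    local deadline = os.clock() + seconds
    while os.clock() < deadline do
        fn()
        calls = calls + 1
    end
    return calls
end

-- Cached once; inside the closures below it is an upvalue, so each call
-- skips the global lookup of "math" and the field lookup of "sin".
local sin = math.sin

local viaTable = callsWithin(0.05, function() return math.sin(1.5) end)
local viaLocal = callsWithin(0.05, function() return sin(1.5) end)

print("math.sin:", viaTable, "local sin:", viaLocal)
```

On a desktop the timer call itself takes up much of each loop iteration, so the gap may be small or noisy. Treat the counts as a relative indication only, and expect different results on device.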

(I've also updated the table in the OP)


Matt, this is really cool! We are now talking about doing this sort of analysis internally for all of our functions.


Wow! Great to hear that. This will be a benefit to all games and the ecosystem as a whole. That's really cool too :sunglasses:

I've never written a benchmark before, so I'm happy it's been useful. I would think there are better ways to benchmark, but this was the first way I came up with and it seemed to work :smiling_face:

I still haven't had a chance to step back from development and bug fixes to dig into this, but when I was looking at optimizing the image rotation code I'd see variation of +/- 10% between builds that didn't have any obvious cause. The best I could figure was changes in memory and code layout causing different caching behavior.


You're absolutely right that there are factors at play outside of the code.

I run the benchmarks a few times and take an average, but even then, if I press menu a couple of times to re-run the benchmark, I see slightly varying results. More so if I quit and rerun the app, and if I restart the device they can change yet again. But, in my testing, the general pattern of the results stays much the same. I made sure to test all firmware versions in sequence, exactly the same way every time, so each is tested immediately after a fresh install.
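To put a number on that run-to-run variation, something like the following sketch could work. It takes the averaged call counts from several re-runs of one command and reports the mean and the spread between the best and worst run. `summarise` is a hypothetical helper and the sample counts are invented for illustration.

```lua
-- Mean and relative spread (max minus min, as a percentage of the mean)
-- for repeated runs of a single benchmark command.
local function summarise(samples)
    local sum, lo, hi = 0, math.huge, -math.huge
    for _, v in ipairs(samples) do
        sum = sum + v
        if v < lo then lo = v end
        if v > hi then hi = v end
    end
    local mean = sum / #samples
    return mean, (hi - lo) * 100 / mean
end

-- Hypothetical call counts from five re-runs (not real measurements).
local mean, spread = summarise({ 3000, 3060, 2940, 3030, 2970 })
print(string.format("mean %.0f calls, spread %.1f%%", mean, spread))
```

A difference between two commands that is smaller than this spread is probably not worth acting on.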

So, it's a complicated situation. You have my utmost respect working on this :raised_hands:


Updated to allow comparisons between:

  • sprite:moveTo
  • sprite:setImage
  • sprite:setCenter

Latest code above.

Table not updated.


Latest figures show that 0.12.0 is looking good!

Generally as fast as it's ever been with some small outliers (red and yellow) but mostly the same (grey) compared with 0.10.2 and 0.9.1


Matt, I'm finally getting around to setting up some automation around your benchmark suite. Have you updated your suite since you attached the zip above?

Sorry, I see now you've been editing previous posts with updates.


I doubt my local version is much different, I think this is up-to-date.

Matt, would you mind sending me your historical data?

We really appreciate this work you've done!


Sure, I'll send it when I next get to my Mac.

Have just emailed you 2x CSV files @james

Also slightly updated code above, one new benchmark (sprite:setVisible) had been added locally by me since October.


2 years later...

Comparing 1.13.0 vs 1.12.3 vs 1.11.1 (recovery firmware)

Wondering if there have been compilation settings changes?

On 1.13.0 I'm seeing a drop in performance almost right across the board, about 6% slower on average.

1.12.3 also had some changes: maths was slower but sprites were faster, and overall it was about as fast as 1.11.1, roughly 1% slower on average.

Since 1.12.3

  • "empty" update loop is ~12% slower
  • math functions are 10–30% slower
  • image draw ~20% slower
  • draw text is ~10% slower

Since 1.13.0

  • image sample is ~35% slower
  • sprite set visible is ~30% slower
  • sprite set image is ~25% slower
  • sprite set properties is ~10–15% slower
  • sprite set zindex ~20% slower
  • draw text is ~10% slower

but

  • drawing circles ~10% faster (since 1.12.3)
  • sprite move ~10% faster (since 1.13.0)
  • math random ~20% faster (since 1.13.0)
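A rough sketch of how two firmware runs could be compared automatically, assuming each run is a table of averaged calls per frame keyed by benchmark name. `compareRuns`, the 5% threshold and the sample numbers are all assumptions for illustration, not my actual data or method.

```lua
-- Compare two result sets and flag changes at or beyond a threshold.
-- Fewer calls per frame on the newer firmware means the command got slower.
local function compareRuns(old, new, thresholdPercent)
    local report = {}
    for name, oldCalls in pairs(old) do
        local newCalls = new[name]
        if newCalls then
            local change = (newCalls - oldCalls) * 100 / oldCalls
            local verdict = "same"
            if change <= -thresholdPercent then
                verdict = "slower"
            elseif change >= thresholdPercent then
                verdict = "faster"
            end
            report[name] = { change = change, verdict = verdict }
        end
    end
    return report
end

-- Hypothetical calls-per-frame figures (not real measurements).
local older = { ["image:sample"] = 2000, ["math.random"] = 5000, ["drawRect"] = 1000 }
local newer = { ["image:sample"] = 1300, ["math.random"] = 6000, ["drawRect"] = 1020 }

local report = compareRuns(older, newer, 5)
for name, r in pairs(report) do
    print(string.format("%-14s %+6.1f%%  %s", name, r.change, r.verdict))
end
```

The threshold should sit above the normal run-to-run noise, otherwise ordinary variation gets reported as a regression.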


Thanks for doing this Matt!

Very curious that maths would be so much slower. You would think there is not much code to implement or change for those.


fwiw, I see a regression on the C side as well (running the same binary, not recompiled, on 1.12.3 vs 1.13)

notably, audio seems to have gotten even more expensive: ~8% up to ~11% on average

1.12.3

1.13

On a more personal note, this stuff can be somewhat demoralizing when I work so hard on nailing down performance. I know it's hard to track everything, but I feel perf testing should gate releases.

To not end on a downer, the frame rate seems to have ticked up a half frame, and become a bit more stable, thank you!


I've added results for 1.12.3 to my table; in hindsight, that release is to blame for some of what I'm seeing.

I also hadn't bothered to run the benchmark for some time.

Fingers crossed it's easy to spot the problem(s).

I know how hard the 1.13.0 release has been for the team, and you have my utmost respect!

Agree, you can't have it all, and priorities are a fact of life.

I'm glad the Catalog sdk release is out the door, that should hopefully bring release frequency back to a bi-monthly schedule.

It would be great if performance would be the focus for one of the upcoming releases.
