Is there a C version of this benchmark? I seem to remember some discussion about porting it to the C API, but I'm not sure it was ever done.
Has anyone done a Rev A vs Rev B comparison using this benchmark?
Comparing to Benchmarks & Optimisations - #27 by matt
I'm getting much better results on rev A hardware, firmware 2.1.1.
Any explanation for this?
Has the testing time per function been increased maybe?
These are the timing values I see in the script I used, which come down to 100 ms per test.
local FPS = 50
local frameMS = 5000/FPS
playdate.display.setRefreshRate(FPS)
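For readers who want to see why the time window matters, here is a minimal sketch in C of a "count calls inside a fixed window" loop, which is how the thread describes the benchmark. It is not the benchmark's actual source: `count_calls`, `window_for`, `now_ms` and `noop` are hypothetical names, and the standard `clock()` stands in for the device timer.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for one benchmarked call; not part of any SDK. */
typedef void (*bench_fn)(void);

/* Empty workload, comparable in spirit to the "nil" row in the results. */
void noop(void) {}

/* Milliseconds of CPU time used so far, via the standard C clock(). */
double now_ms(void) {
    return (double)clock() * 1000.0 / (double)CLOCKS_PER_SEC;
}

/* Time window per test: a millisecond budget divided by the refresh rate. */
double window_for(double budget_ms, double fps) {
    assert(fps > 0.0);
    return budget_ms / fps;
}

/* Call fn repeatedly until window_ms has elapsed; return completed calls. */
unsigned long count_calls(bench_fn fn, double window_ms) {
    assert(fn != NULL && window_ms > 0.0);
    unsigned long calls = 0;
    double start = now_ms();
    while (now_ms() - start < window_ms) {
        fn();
        calls++;
    }
    return calls;
}
```

With the values above, `window_for(5000.0, 50.0)` gives the 100 ms per test mentioned in the post, so a longer budget directly means a higher call count for the same hardware.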
#, BENCH, CALL
nil 14854
drawLine - Diagonal 387
drawLine - Horizontal 1846
drawLine - Vertical 454
drawLine - Random Diagonal 687
drawLine - fillRect 4537
drawLine - drawRect 935
math.random 6736
math.random - local 6398
math.sin 5814
math.sin - random 2848
math.cos 5818
math.cos - random 2598
math.floor - local 6259
image:sample 1680
drawText - local 444
drawTextInRect 55
drawRect 454
fillRect 849
drawCircleAtPoint 221
fillCircleAtPoint 280
drawCircleInRect 379
fillCircleInRect 479
sprite:moveTo - static 1206
sprite:moveTo - random 795
sprite:setImage 1750
sprite:setCenter - static 1477
sprite:setCenter - toggle 1353
sprite:setCenter - random 1125
sprite:setZIndex 3054
image:draw 1451
image:draw - locked 782
image:draw - locked local 746
image:draw - pushcontext local 816
Not sure it's related to these tests, but Dave said the new hardware (Rev B units) is more sensitive to cache misses. It has caused performance issues before with drawBlurred and with rotating bitmaps by 90 degrees: Performance of graphics.image:drawBlurred method on Rev B hardware - #3 by dave
Edit: I just ran the benchmark on my Rev A unit, compiling the source code from the first post (Lua) with the latest pdc.exe from the latest SDK version, and these were my results:
They seem to be more in line with the values posted for other firmware versions, although some seem lower now, while yours actually seems to have run more calls?
DEVICE (5 RUN AVE)
#, BENCH, CALL
01, 1590, nil
02, 73, drawLine - Diagonal
03, 301, drawLine - Horizontal
04, 85, drawLine - Vertical
05, 124, drawLine - Random Diagonal
06, 584, drawLine - fillRect
07, 161, drawLine - drawRect
08, 791, math.random
09, 749, math.random - local
10, 655, math.sin
11, 428, math.sin - random
12, 822, math.cos
13, 547, math.cos - random
14, 1345, math.floor - local
15, 258, image:sample
16, 84, drawText - local
17, 11, drawTextInRect
18, 85, drawRect
19, 152, fillRect
20, 41, drawCircleAtPoint
21, 52, fillCircleAtPoint
22, 69, drawCircleInRect
23, 85, fillCircleInRect
24, 181, sprite:moveTo - static
25, 130, sprite:moveTo - random
26, 260, sprite:setImage
27, 222, sprite:setCenter - static
28, 221, sprite:setCenter - toggle
29, 175, sprite:setCenter - random
30, 399, sprite:setZIndex
31, 220, image:draw
32, 137, image:draw - locked
33, 131, image:draw - locked local
34, 128, image:draw - pushcontext local
END
Edit 2: I see a difference in the timings. The original sources I just downloaded showed this:
while you do 5000 / FPS
So the benchmark runs 5 times longer for you and can make more calls per benchmark because the time frame is different. In theory, your results posted above need to be divided by 5.
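The "divide by 5" correction can be written as a small helper. This is only a sketch: `normalise_calls` is a hypothetical name, and the 20 ms reference window is an assumption (it is what a 5x shorter window than 100 ms would be), not something quoted from the original sources.

```c
#include <assert.h>

/* Scale a call count measured over measured_ms to the count expected over
   reference_ms, assuming throughput stays constant across the window. */
double normalise_calls(double calls, double measured_ms, double reference_ms) {
    assert(measured_ms > 0.0 && reference_ms > 0.0);
    return calls * (reference_ms / measured_ms);
}
```

For example, `normalise_calls(1846.0, 100.0, 20.0)` turns the 1846 calls reported above for drawLine - Horizontal into roughly 369, which is the figure to compare against results from the shorter window.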
Edit 3: it seems you downloaded Dave's version, whereas the earlier numbers were based on the original.
For reference, here are my Rev A numbers as well (I think nino's are Rev A too):
echo off
time and date set
#, BENCH, CALL
nil 14831
drawLine - Diagonal 390
drawLine - Horizontal 1833
drawLine - Vertical 449
drawLine - Random Diagonal 683
drawLine - fillRect 4611
drawLine - drawRect 954
math.random 6028
math.random - local 7295
math.sin 7293
math.sin - random 3169
math.cos 7167
math.cos - random 3049
math.floor - local 8664
image:sample 1948
drawText - local 455
drawTextInRect 55
drawRect 456
fillRect 848
drawCircleAtPoint 219
fillCircleAtPoint 262
drawCircleInRect 361
fillCircleInRect 467
sprite:moveTo - static 1259
sprite:moveTo - random 782
sprite:setImage 1802
sprite:setCenter - static 1750
sprite:setCenter - toggle 1444
sprite:setCenter - random 1212
sprite:setZIndex 2879
image:draw 1381
image:draw - locked 745
image:draw - locked local 752
image:draw - pushcontext local 759
END
@joyrider3774 asked if I could run these benchmarks on my Rev B Playdate, and I'm happy to, especially as I'm also trying to figure out Rev A vs Rev B performance differences in my music player app. I'll post about that in a separate thread soon.
Results from original version and Dave's version
Original (with 5 second delay, crank enabled)
DEVICE (5 RUN AVE)
#, BENCH, CALL
01, 1299, nil
02, 70, drawLine - Diagonal
03, 337, drawLine - Horizontal
04, 83, drawLine - Vertical
05, 118, drawLine - Random Diagonal
06, 783, drawLine - fillRect
07, 177, drawLine - drawRect
08, 849, math.random
09, 798, math.random - local
10, 906, math.sin
11, 470, math.sin - random
12, 892, math.cos
13, 461, math.cos - random
14, 860, math.floor - local
15, 449, image:sample
16, 110, drawText - local
17, 14, drawTextInRect
18, 89, drawRect
19, 159, fillRect
20, 46, drawCircleAtPoint
21, 56, fillCircleAtPoint
22, 73, drawCircleInRect
23, 101, fillCircleInRect
24, 375, sprite:moveTo - static
25, 215, sprite:moveTo - random
26, 489, sprite:setImage
27, 457, sprite:setCenter - static
28, 470, sprite:setCenter - toggle
29, 309, sprite:setCenter - random
30, 686, sprite:setZIndex
31, 354, image:draw
32, 184, image:draw - locked
33, 179, image:draw - locked local
34, 185, image:draw - pushcontext local
END
Dave version (with 5 second delay, crank enabled)
#, BENCH, CALL
nil 7730
drawLine - Diagonal 368
drawLine - Horizontal 1791
drawLine - Vertical 437
drawLine - Random Diagonal 619
drawLine - fillRect 4492
drawLine - drawRect 945
math.random 4838
math.random - local 5080
math.sin 5478
math.sin - random 2594
math.cos 5606
math.cos - random 2604
math.floor - local 5659
image:sample 2525
drawText - local 577
drawTextInRect 74
drawRect 451
fillRect 773
drawCircleAtPoint 238
fillCircleAtPoint 297
drawCircleInRect 418
fillCircleInRect 550
sprite:moveTo - static 1938
sprite:moveTo - random 1149
sprite:setImage 2177
sprite:setCenter - static 2456
sprite:setCenter - toggle 2421
sprite:setCenter - random 1688
sprite:setZIndex 3774
image:draw 1952
image:draw - locked 962
image:draw - locked local 957
image:draw - pushcontext local 963
END
While testing my app, I noticed that connecting my Playdate to the macOS simulator and clicking 'Control Device with Simulator' improved performance by a significant amount (iirc around 15%, enough to make Opus audio playback real time). The responsible serial command is `disablecrank`. I added a 5 second delay before starting the benchmark, so I could disable the crank via the Simulator option.
Results from both versions with crank disabled
Original (with 5 second delay, crank disabled)
DEVICE (5 RUN AVE)
#, BENCH, CALL
01, 1298, nil
02, 78, drawLine - Diagonal
03, 357, drawLine - Horizontal
04, 92, drawLine - Vertical
05, 129, drawLine - Random Diagonal
06, 835, drawLine - fillRect
07, 180, drawLine - drawRect
08, 893, math.random
09, 880, math.random - local
10, 969, math.sin
11, 496, math.sin - random
12, 973, math.cos
13, 513, math.cos - random
14, 946, math.floor - local
15, 462, image:sample
16, 127, drawText - local
17, 15, drawTextInRect
18, 96, drawRect
19, 172, fillRect
20, 51, drawCircleAtPoint
21, 62, fillCircleAtPoint
22, 87, drawCircleInRect
23, 112, fillCircleInRect
24, 384, sprite:moveTo - static
25, 238, sprite:moveTo - random
26, 500, sprite:setImage
27, 468, sprite:setCenter - static
28, 494, sprite:setCenter - toggle
29, 353, sprite:setCenter - random
30, 684, sprite:setZIndex
31, 384, image:draw
32, 190, image:draw - locked
33, 197, image:draw - locked local
34, 201, image:draw - pushcontext local
END
Dave version (with 5 second delay, crank disabled)
#, BENCH, CALL
nil 7836
drawLine - Diagonal 401
drawLine - Horizontal 1826
drawLine - Vertical 477
drawLine - Random Diagonal 666
drawLine - fillRect 4958
drawLine - drawRect 1017
math.random 5109
math.random - local 5574
math.sin 6054
math.sin - random 2645
math.cos 6038
math.cos - random 2639
math.floor - local 6284
image:sample 2757
drawText - local 664
drawTextInRect 79
drawRect 493
fillRect 895
drawCircleAtPoint 261
fillCircleAtPoint 324
drawCircleInRect 451
fillCircleInRect 591
sprite:moveTo - static 2090
sprite:moveTo - random 1260
sprite:setImage 2691
sprite:setCenter - static 2654
sprite:setCenter - toggle 2634
sprite:setCenter - random 1845
sprite:setZIndex 3824
image:draw 2112
image:draw - locked 1038
image:draw - locked local 1058
image:draw - pushcontext local 1055
END
@dave, do you happen to know why the option is making such a difference in performance?
I've put the Rev B results with and without the crank in an Excel sheet and compared the values to my Rev A results. It's indeed weird that disabling the crank on a Rev B seems to have an impact
original version. Overall, Rev B also seems faster than Rev A, except on some math stuff
Original version (https://1drv.ms/x/s!AmOKrDXp7rpjlOZdcHmWJTxAuSoBYw?e=vOa1Zk)
Dave's version ( https://1drv.ms/x/s!AmOKrDXp7rpjlOZbB1yiegFswN3yXQ?e=4OtrfA )
Holy crap, you're right. For some reason on the rev B board the crank sampling is taking way longer than it does on rev A. I'm looking into this now, not sure whether we can get a fix in for 2.2, but if not it'll be soon after.
Here's an interesting Lua thing I discovered whilst looking into constant folding and propagation during compilation, which is when a compiler replaces constants with their values if it sees that they won't change at runtime (there are exceptions). I don't pretend to know how this applies to the Lua bytecode compilation of the Playdate SDK, or to running that bytecode in the interpreter. Regardless...
So, let's say in your game you do some maths and stuff across a bunch of variables, some of which are based on constants defined earlier in the program flow. I do this all the time.
I also have values written out as maths, like `gfx.drawText("hello", 20+10+2-6, 120+10)`, arrived at as I adjust the position of things on screen (so I can remove the most recent addition/subtraction to get back to the previous value), and which I never get around to shortening. Good to know they get optimised out during compilation!
Anyway, consider these three functionally identical (almost exactly identical) snippets:
-- math
local a = 30
local b = 9 - (a / 5)
local c
local d
c = b * 4
if (c > 10) then
    c = c - 10
end
d = 60 / a
-- a = 30, c = 2, res = 4
local function fMathOrderA()
    res = c * (60 / a)
    return "math orderA"
end
local function fMathOrderB()
    res = (60 / a) * c
    return "math orderB"
end
local function fMathOrderC()
    res = d * c
    return "math orderC"
end
- Device (rev A): fMathOrderB is ~10% slower
- Device (rev A): fMathOrderC is ~2.5% slower
Lesson: the order of execution of your maths seems to matter more than expected
Could it be summarised as "put your constants first (towards the left)"? Not sure.
Or could all this just be caching symptoms?
I tested the above code with
which = 0
function playdate.update()
    local start = playdate.getCurrentTimeMilliseconds()
    if which == 0 then
        for i=1,100000 do fMathOrderA() end
    elseif which == 1 then
        for i=1,100000 do fMathOrderB() end
    else
        for i=1,100000 do fMathOrderC() end
    end
    print("test "..which.." elapsed: "..(playdate.getCurrentTimeMilliseconds()-start).." ms")
    which = (which+1)%3
end
and I get C 17% faster than A (makes sense) and B 5% faster. On an H7 unit that's 10% and 4%, respectively.
Here's what I get from `echo "res = c * (60 / a)" | luac -l -p -`:
main <stdin:0,0> (10 instructions at 0x600000d0c080)
0+ params, 3 slots, 1 upvalue, 0 locals, 3 constants, 0 functions
1 [1] VARARGPREP 0
2 [1] GETTABUP 0 0 1 ; _ENV "c"
3 [1] GETTABUP 1 0 2 ; _ENV "a"
4 [1] LOADI 2 60
5 [1] DIV 1 2 1
6 [1] MMBIN 2 1 11 ; __div
7 [1] MUL 0 0 1
8 [1] MMBIN 0 1 8 ; __mul
9 [1] SETTABUP 0 0 0 ; _ENV "res"
10 [1] RETURN 0 1 1 ; 0 out
and here's `echo "res = (60 / a) * c" | luac -l -p -`:
main <stdin:0,0> (10 instructions at 0x600003194080)
0+ params, 2 slots, 1 upvalue, 0 locals, 3 constants, 0 functions
1 [1] VARARGPREP 0
2 [1] GETTABUP 0 0 1 ; _ENV "a"
3 [1] LOADI 1 60
4 [1] DIV 0 1 0
5 [1] MMBIN 1 0 11 ; __div
6 [1] GETTABUP 1 0 2 ; _ENV "c"
7 [1] MUL 0 0 1
8 [1] MMBIN 0 1 8 ; __mul
9 [1] SETTABUP 0 0 0 ; _ENV "res"
10 [1] RETURN 0 1 1 ; 0 out
The code is pretty much the same, but the first one does use one more stack slot. Maybe that's the difference? Not sure what's going on in your test, though..
Thanks Dave, particularly for the decompilation dumps.
At this point, me neither
Nim Bindings vs Lua
I've been trying Nim Bindings lately. Was very interested in the performance comparison of Nim vs Lua. If we assume Nim to be very close to C performance, you can also see this as a rough indication of Lua vs. native performance.
For a fair comparison, I ported Matt's benchmark, as posted in this thread, to Nim.
As noted before, there is some variability between runs, and that also holds for the Nim results. For Nim, I would expect timing accuracy to account for the variability. I'd say the variability is lower for Nim than for Lua.
I'm using the latest published Nim bindings. @samdze showed me some improved results using experimental compiler optimizations, which widen the performance gap between Lua and Nim. His results indicate a performance increase of 7.5x. This is a combination of (unexplained) worse Lua performance on his device compared to mine, plus a 1.6x performance increase from the Nim bindings I used to his experimental LTO branch. The results by @samdze can be viewed here
Source (on a branch for another project, sorry for the messy organisation): use C init event for bench · ninovanhooff/Nim-Snake-Playdate@0247678 · GitHub
Here are the results for the current Nim bindings version.
Nim benchmark (rev A, 2.1.1) | Nim calls | Nim/Lua ratio | Lua calls | Lua benchmark (rev A, 2.1.1) |
---|---|---|---|---|
nil | 11187 | 3.755287009 | 2979 | nil |
drawDiagonal | 70 | 0.9210526316 | 76 | drawLine - Diagonal |
drawHorizontal | 380 | 1.035422343 | 367 | drawLine - Horizontal |
drawVertical | 82 | 0.9111111111 | 90 | drawLine - Vertical |
drawRandomDiagonal | 156 | 1.130434783 | 138 | drawLine - Random Diagonal |
drawLineFillRect | 1567 | 1.748883929 | 896 | drawLine - fillRect |
drawLineDrawRect | 193 | 1.09039548 | 177 | drawLine - drawRect |
mathRandomSugar | 1879 | 1.378576669 | 1363 | math.random |
mathRandomProc | 1668 | 1.319620253 | 1264 | math.random - local |
mathSin | 10951 | 9.734222222 | 1125 | math.sin |
mathSinRandom | 1816 | 3.472275335 | 523 | math.sin - random |
mathCos | 11307 | 9.988515901 | 1132 | math.cos |
mathCosRandom | 1587 | 2.994339623 | 530 | math.cos - random |
mathFloor | 11476 | 9.453047776 | 1214 | math.floor - local |
imageSample - Fast | 10697 | 34.39549839 | 311 | image:sample |
drawText | 101 | 1.188235294 | 85 | drawText - local |
drawTextInRect not in C API | | | 11 | drawTextInRect |
drawRect | 101 | 1.16091954 | 87 | drawRect |
fillRect | 167 | 1.024539877 | 163 | fillRect |
drawEllipse | 81 | 1.88372093 | 43 | drawCircleAtPoint |
fillEllipse | 112 | 2 | 56 | fillCircleAtPoint |
drawEllipse | 82 | 1.138888889 | 72 | drawCircleInRect |
fillEllipse | 112 | 1.191489362 | 94 | fillCircleInRect |
spriteMoveToStatic | 332 | 1.509090909 | 220 | sprite:moveTo - static |
spriteMoveToRandom | 199 | 1.309210526 | 152 | sprite:moveTo - random |
spriteSetImage | 5958 | 16.87818697 | 353 | sprite:setImage |
spriteSetCenterStatic - center not implemented in Nim | | | 272 | sprite:setCenter - static |
spriteSetCenterToggle - center not implemented in Nim | | | 254 | sprite:setCenter - toggle |
spriteSetCenterRandom - center not implemented in Nim | | | 212 | sprite:setCenter - random |
spriteSetZIndex | 9578 | 16.3447099 | 586 | sprite:setZIndex |
draw | 1178 | 4.252707581 | 277 | image:draw |
drawLockedLocal - lockFocus not implemented in C | | | 156 | image:draw - locked |
drawLockedLocal - local is a lua-concept | | | 151 | image:draw - locked local |
drawPushContext | 1156 | 7.09202454 | 163 | image:draw - pushcontext local |
imageSample - Slow | 163 | 0.5241157556 | 311 | image:sample |
average perf increase | | 4.856087018 | | |
Observations:
- Are we comparing apples to apples? Looking at the original lua benchmark; Joyrider, Samdze and my results differ significantly. My results, which are used in this comparison, are most favourable to Lua
- All functions except the diagonal and vertical drawLine tests are faster in Nim than in Lua. Those are roughly equally slow on both systems; the differences are not consistent between runs.
- `ImageSample - Slow` can be ignored because...
- It seems that the C implementation of imageSample is so slow that there are already some optimisations done in Lua. This will have something to do with the generation of BitmapData for every invocation. Still, when working directly with BitmapData, a dramatic increase compared to Lua can be achieved (34x). If you are making an image editing app (Playmaker etc.), I would strongly consider C or Nim
- Over all functions, the performance increase averages out to Nim being 5x faster than Lua when ignoring the unoptimized image:sample function
- The performance increase differs greatly per function. To estimate the benefit for your project, look at the functions it uses most heavily every frame. Math is about 10x faster in Nim. When you are using the sprite system for drawing, the performance increase might not be so great (not directly measured by the benchmark) compared to drawing directly to the screen (4.3x). If you do a lot of drawing to off-screen images, the performance increase is even more significant, at 7.1x
- Whereas the Lua SDK is mature and its performance has stabilized, the performance gap will likely widen because the Nim bindings can be tweaked further to be more performant (5x -> 7.5x, as shown by preliminary results)
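As a sketch of how the ratio column and the "average perf increase" figure can be derived from two sets of call counts, here is a small C helper. The names `bench_row`, `speedup` and `mean_speedup` are hypothetical, and skipping rows that lack one of the two counts is an assumption about how the missing entries were handled.

```c
#include <assert.h>
#include <stddef.h>

/* One benchmark row: call counts for the native (Nim or C) and Lua runs.
   A count of 0 marks a missing result, so the row is skipped in the mean. */
typedef struct {
    const char *name;
    double native_calls;
    double lua_calls;
} bench_row;

/* Speedup of the native run over the Lua run for a single row. */
double speedup(const bench_row *row) {
    assert(row != NULL && row->lua_calls > 0.0);
    return row->native_calls / row->lua_calls;
}

/* Arithmetic mean of the per-row speedups, using only complete rows. */
double mean_speedup(const bench_row *rows, size_t n) {
    double sum = 0.0;
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        if (rows[i].native_calls > 0.0 && rows[i].lua_calls > 0.0) {
            sum += speedup(&rows[i]);
            used++;
        }
    }
    return used > 0 ? sum / (double)used : 0.0;
}
```

Feeding it the "nil" row (11187 vs 2979) gives about 3.755, matching the table. Because an arithmetic mean is pulled up by a few very large ratios such as imageSample, the per-function ratios are the more useful numbers when judging a specific project.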
Here are my results on my rev A device, using the experimental optimized version of the Nim bindings.
I'd also want to point out that the comparison done here is almost purely in terms of how fast a language can interact with the SDK (and it is often the SDK itself that bottlenecks the tests).
So it is not a general speed comparison and standard algorithms/pure logic implemented in Nim vs. Lua would show much more difference.
That said, here you go:
function | revA 2.1.1 Nim | revA 2.1.1 Lua | perf. increase |
---|---|---|---|
nil | 12778 | 2077 | x6.152142513 |
drawDiagonal | 70 | 71 | x0.985915493 |
drawHorizontal | 411 | 289 | x1.422145329 |
drawVertical | 84 | 84 | x1 |
drawRandomDiagonal | 168 | 122 | x1.37704918 |
drawLineFillRect | 1744 | 601 | x2.901830283 |
drawLineDrawRect | 200 | 165 | x1.212121212 |
mathRandomSugar | 7800 | 984 | x7.926829268 |
mathRandomProc | 9663 | 1104 | x8.752717391 |
mathSin | 13458 | 939 | x14.33226837 |
mathSinRandom | 5117 | 428 | x11.95560748 |
mathCos | 13156 | 1071 | x12.28384687 |
mathCosRandom | 11077 | 568 | x19.50176056 |
mathFloor | 13018 | 1298 | x10.02927581 |
imageSample - fast | 13411 | 290 | x46.24482759 |
drawText | 102 | 83 | x1.228915663 |
drawTextInRect | 29 | 11 | x2.636363636 |
drawRect | 100 | 87 | x1.149425287 |
fillRect | 180 | 156 | x1.153846154 |
drawEllipse | 87 | 43 | x2.023255814 |
fillEllipse | 126 | 52 | x2.423076923 |
drawEllipse | 86 | 70 | x1.228571429 |
fillEllipse | 120 | 87 | x1.379310345 |
spriteMoveToStatic | 3242 | 171 | x18.95906433 |
spriteMoveToRandom | 658 | 128 | x5.140625 |
spriteSetImage | 2956 | 274 | x10.78832117 |
spriteSetCenterStatic - still to test in Nim | | 235 | |
spriteSetCenterToggle - still to test in Nim | | 247 | |
spriteSetCenterRandom - still to test in Nim | | 183 | |
spriteSetZIndex | 7927 | 413 | x19.1937046 |
draw | 1125 | 224 | x5.022321429 |
drawLockedLocal - not implemented in C | | 135 | |
drawLockedLocal - local is a lua-concept | | 135 | |
drawPushContext | 1091 | 128 | x8.5234375 |
imageSample - slow, can be ignored | 199 | 290 | x0.6862068966 |
average perf increase | | | x7.587159451 |
average, ignoring imageSample slow | | | x7.825123332 |
I checked the branch, as I'm trying to port the Nim code to C, but I noticed this:
Is that normal? I mean, is the Playdate case-sensitive when it comes to directory names or not? Because if it is, it might have failed to load the background image in certain cases.
Edit: come to think of it, it's probably not case-sensitive, as it runs from a FAT file system, if I'm not mistaken.
Here's the C port: https://github.com/joyrider3774/benchmark_c_playdate
I'm not entirely sure about certain functions, like the random ones (randomSugar / proc), or about the values used for the sin and cos random tests; it would be great if someone verified these. I also added a setcenter(0,0) on the background image, otherwise I would not see the "bench" text on the screen. I'm also not sure whether I'm supposed to see the benchmark actually running. I think I did with the Lua version, but here it almost seems as if it's being run in the background somehow.
So here are the results from running on my Playdate Rev A:
#, BENCH, CALL
0,nil,12732
1,drawDiagonal,71
2,drawHorizontal,415
3,drawVertical,84
4,drawRandomDiagonal,209
5,drawLineFillRect,2005
6,drawLineDrawRect,206
7,mathRandomSugar,13000
8,mathRandomProc,12422
9,mathSin,11731
10,mathSinRandom,13213
11,mathCos,12809
12,mathCosRandom,12727
13,mathFloor,11630
14,imageSample - Fast,12011
15,drawText,103
16,drawTextInRect not in C API,0
17,drawRect,104
18,fillRect,176
19,drawEllipse,87
20,fillEllipse,119
21,drawEllipse,87
22,fillEllipse,119
23,spriteMoveToStatic,2771
24,spriteMoveToRandom,2415
25,spriteSetImage,6130
26,spriteSetCenterStatic - center not implemented in C,0
27,spriteSetCenterToggle - center not implemented in C,0
28,spriteSetCenterRandom - center not implemented in C,0
29,spriteSetZIndex,8921
30,draw,1279
31,drawLockedLocal - lockFocus not implemented in C,0
32,drawLockedLocal - local is a lua-concept,0
33,drawPushContext,1221
34,imageSample - Slow,4806
Edit: on GitHub (source code) I have now also added the SpriteSetCenter* functions, as the C API does have these.
I wrote a benchmark for Opus audio decoding, as I found that Rev A was too slow for real time playback, while Rev B was just barely fast enough, and I wanted precise performance numbers. I posted more details along with source code and download links for the benchmark in a new thread, but here are the results:
(Rev A: 0.84x, Rev B w/ crank sampling: 1.13x, Rev B w/o crank sampling: 1.24x)
Rev B without crank sampling is nearly 50% faster than Rev A. Rev A and Rev B are supposed to have roughly equivalent performance, but that's clearly not the case with all workloads.
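The "nearly 50%" figure follows from the three real-time factors above. A minimal sketch of that arithmetic in C, assuming a real-time factor means seconds of audio decoded per second of decoding time (the helper names are hypothetical):

```c
#include <assert.h>

/* Real-time factor: seconds of audio produced per second spent decoding.
   A value of 1.0 or more means playback can keep up. */
double realtime_factor(double audio_seconds, double decode_seconds) {
    assert(audio_seconds > 0.0 && decode_seconds > 0.0);
    return audio_seconds / decode_seconds;
}

/* How much faster one factor is than a baseline factor, in percent. */
double percent_faster(double factor, double baseline) {
    assert(baseline > 0.0);
    return (factor / baseline - 1.0) * 100.0;
}
```

`percent_faster(1.24, 0.84)` comes out at about 47.6, and `percent_faster(1.24, 1.13)` at about 9.7, which is the share attributable to crank sampling alone on Rev B in this workload.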
From the thread (rev A was 84% real time and runs at 168 MHz):
Any update on this? Disabling crank sampling still results in a speed increase on 2.3.0.
Sorry, this got buried in the dozens of open dev forum tabs. Yeah, seems pretty important. I've targeted it for 2.5. I looked into it earlier and didn't see any obvious reason it was taking longer, but hopefully I'll have better luck this time.
The problem with the C benchmark is that the effort to call `getCurrentTimeMilliseconds` and call a function can start to cost significant time. Also, the compiler is very good at knowing when something isn't used, like a loop of `sin` that does nothing.
We can run this 300k iteration loop in 0.02 sec.
pd->system->resetElapsedTime();
for (i=0;i<300000;i++) { sinf(i); }
pd->system->logToConsole("elapsed %f\n", pd->system->getElapsedTime());
But only 1800 times in 0.02 sec like this
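int i;
float j = 0.0f; // start from a defined value; reading an uninitialised float is undefined
pd->system->resetElapsedTime();
for (i=0;i<1800;i++) {
    j+=sinf(i);
}
// printing j keeps the sinf() results in use, so the loop cannot be optimised away
pd->system->logToConsole("elapsed %f %f \n", pd->system->getElapsedTime(), j);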
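One common way to keep the compiler from discarding the measured work, without printing the result, is to store it in a `volatile` variable. The sketch below is desktop C under stated assumptions: `time_calls`, `bench_sink` and `halve` are hypothetical names, `clock()` replaces the Playdate timer, and `halve` is only a stand-in workload so the example has no libm dependency.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Signature shared by sinf() and the stand-in workload below. */
typedef float (*unary_fn)(float);

/* Stores to a volatile object may not be removed by the optimiser. */
volatile float bench_sink;

/* Stand-in workload (hypothetical); pass sinf from <math.h> for the real test. */
float halve(float x) { return x * 0.5f; }

/* Call fn `iterations` times, keep the results live, return CPU seconds. */
double time_calls(unary_fn fn, int iterations) {
    assert(fn != NULL && iterations > 0);
    float acc = 0.0f;                 /* accumulator starts at a defined value */
    clock_t start = clock();
    for (int i = 0; i < iterations; i++) {
        acc += fn((float)i);
    }
    clock_t end = clock();
    bench_sink = acc;                 /* publish the sum so the loop has a visible effect */
    return (double)(end - start) / (double)CLOCKS_PER_SEC;
}
```

The timer is read only twice, outside the loop, so its own cost does not get counted once per call. On device, the same shape should work with the elapsed-time calls shown above in place of `clock()`.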
Didn't make it into 2.5.0, sorry. I'll try and get it next time around.