I wanted to have a few basic particle effects in my game. I couldn't find any existing libraries for Playdate so I decided to make my own and well, things got away from me a little. I'm basing it off the Unity particle system, so you start by specifying a sprite, then can set things like:
Emission angle, spread, and initial width
Particle size and opacity over time
Whether the particle inherits the emitter's velocity
Like I said, I couldn't find anything like it (though definitely let me know if there already is one), so if people are interested, I'd be happy to release it as a library. I'd just need to spend some time polishing it up a little, as well as making it more performant (any help in those regards would definitely be appreciated as well).
Yeah so I tried it on device last night and the results… weren’t great. On my computer I can do hundreds of particles at a time, on device it’s more like 20-30. Which is actually enough for some things like a smoke effect, but not enough for sparks etc.
To be honest, I’m not sure what’s the bottleneck at this point. One thought I had is the particles are all sprites, and I could replace them with images. I could also stop adjusting opacity and scale, or bake them into an image table and play an animation instead. I’d love to hear any thoughts people have as to how to improve it, or suggestions for how I could best determine the bottleneck. I’d also be happy to post the source code if it helps!
That's too bad. But some of the fun of Playdate development is finding creative alternatives when the "real" way is too slow! If you need 300 sparks and can only get away with 30, maybe each spark image is a cluster of 10 dots, and you never let them travel far enough to look off? Etc.
Thank you all for the tips! I switched from sprites to images, and the frame rate may have improved slightly. I also tried rounding positions to integers, though just to double check: it's faster to calculate floating point positions then round before drawing vs. drawing at the floating point position?
And thanks for the tip about using the sampler. Here are the top % items:
Line 256 calls the particle update function, which is:
self.position.x = math.floor(self.position.x+.5)
self.position.y = math.floor(self.position.y+.5)
Do you know what the issue might be here? I guess it gets called on every particle every frame so it could add up. Also, do you know what the metamethod _mul, metamethod _index, and metamethod _add reference?
seems to be a lot of vector maths (C add, mul) and table thrashing (C index)?
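For anyone else wondering what those sampler entries refer to: in Lua, `__add`, `__mul` and `__index` are metamethods, the hooks that run when you use `+`, `*`, or a field lookup on a value that has a metatable. Here's a rough sketch in plain Lua with a made-up vector type that counts the calls. To be clear, `new` and `calls` are hypothetical names and this is not how Playdate's own vector type is implemented; it only illustrates where the calls come from.

```lua
-- Hypothetical stand-in for a vector type; counts metamethod calls.
local calls = { add = 0, mul = 0, index = 0 }
local mt = {}

local function new(x, y)
  -- Keep the components in a hidden table so every v.x goes through __index,
  -- much like field access on userdata always goes through a metamethod.
  return setmetatable({ _data = { x = x, y = y } }, mt)
end

mt.__index = function(v, key)
  calls.index = calls.index + 1
  return rawget(v, "_data")[key]
end

mt.__add = function(a, b)
  calls.add = calls.add + 1
  return new(a.x + b.x, a.y + b.y)   -- allocates a new vector every time
end

-- Only handles vector * number (not number * vector) to keep the sketch short.
mt.__mul = function(a, s)
  calls.mul = calls.mul + 1
  return new(a.x * s, a.y * s)
end

local position = new(10, 20)
local velocity = new(1, 2)
position = position + velocity * 0.5
print(calls.add, calls.mul, calls.index)
```

So a single innocent-looking line of vector math can turn into several metamethod calls plus a couple of temporary objects, and that gets multiplied by every particle on every frame.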
Depends, but my understanding is that if you tell a sprite to draw at a float that's different from the last float but still maps to the same int, it will be marked dirty. But if you told it to draw at an int last time and it's the same int next time, it won't be marked dirty.
try this faster version
local floor <const> = math.floor
self.position.x = floor(self.position.x+.5)
self.position.y = floor(self.position.y+.5)
and then this one
self.position.x = (self.position.x+.5)//1
self.position.y = (self.position.y+.5)//1
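If you want to compare the three rounding variants yourself, here's a minimal sketch for a desktop Lua 5.4 interpreter. The `timeIt` helper and the iteration count are made up for illustration, and `os.clock` is standard Lua, so on the device you would likely need to swap in Playdate's own timing instead. Timings vary a lot between machines, so treat the numbers as relative at best.

```lua
local floor <const> = math.floor

local function roundGlobal(x) return math.floor(x + .5) end  -- global table lookup each call
local function roundLocal(x)  return floor(x + .5) end       -- cached local function
local function roundIdiv(x)   return (x + .5) // 1 end       -- floor division, no function call

-- Hypothetical helper: calls f n times and returns elapsed CPU seconds.
local function timeIt(f, n)
  local start = os.clock()
  for i = 1, n do f(i * 0.37) end
  return os.clock() - start
end

-- All three agree numerically. Note that // on a float returns a float (3.0),
-- while math.floor returns an integer (3); they still compare equal.
for _, x in ipairs({ 0.2, 0.5, 1.49, 2.5, -0.5, -1.2 }) do
  assert(roundGlobal(x) == roundLocal(x) and roundLocal(x) == roundIdiv(x))
end

print("math.floor      ", timeIt(roundGlobal, 1000000))
print("local floor     ", timeIt(roundLocal, 1000000))
print("floor division  ", timeIt(roundIdiv, 1000000))
```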
not sure about this
Nice work! Would love to see something like this make its way to a reusable library.
For performance issues, I found success in pre-rendering my particle interactions. I wrote up a little "rendering engine" that would save the contents of the canvas to a pdi animation. Then I would just load those pdi animations into my game. You could even pre-render several variations and randomly playback if you wanted some variety.
Matt, thank you for the benchmarking tool, it's incredibly helpful. I didn't get a chance to try it with your rounding improvements, but I was curious about the vector math you pointed out and indeed there were some interesting findings:
Adding two vectors is significantly slower than adding their individual components. That is,
vectorA.x += vectorB.x
vectorA.y += vectorB.y
is nearly twice as fast as
vectorA = vectorA + vectorB
Unfortunately, after updating all the vector math, the limit still seems to be 20-30 particles. Checking the profiler now does give some different results, namely:
which is odd because the x position and y velocity lines aren't mentioned. Also, it's now mostly under metamethod _index, which is interesting.
And professir, that's actually really interesting. I've disabled scaling and opacity temporarily because I know they're rather slow processes, but if I could pre-render it into a series of images it could be a lot faster. Have you released that rendering engine?
Strange, I can't reproduce that. In a simple test case I'm seeing the opposite, the vector addition is a bit under 3x as fast as adding the components separately, both in the simulator and on the device. I wonder what the difference is?
Interesting... Here are the results I got from Matt's benchmark tool:
add vectors: plus equals
add vectors piecewise
add vectors piecewise: plus equals
add vectors new
add nonvectors piecewise
add nonvectors piecewise, indexed
scale vectors piecewise
I'm assuming the second number is the time to run the function? If so, 35-36 involve adding vectors directly ("plus equals" refers to me testing A+=B vs. A = A+B, where the difference seems negligible), 37-39 involve adding the x and y components separately (which I refer to as "piecewise"), 40-41 involve using a generic Lua table instead of vectors, 42 involves storing the x and y components as individual variables, and 43-44 recreate the initial tests but with scaling instead of addition.
Unless maybe I have the numbers backwards?
I just looked at the original thread and yup, I got it backwards. Higher number = faster, so adding vectors is indeed around twice as fast as adding the components. Technically using generic tables or individual variables is ~50% faster than vectors, but the added convenience may be worth it.
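For reference, the "generic table" approach looks roughly like this. It's only a sketch: `newParticle` and `updateParticle` are hypothetical names, not what the library actually uses, and it's written in standard Lua (no `+=`) so it also runs outside the Playdate SDK.

```lua
-- Hypothetical particle layout: plain number fields instead of vector objects.
local function newParticle(x, y, vx, vy)
  return { x = x, y = y, vx = vx, vy = vy }
end

-- Advance one particle by dt seconds without creating any temporary tables.
local function updateParticle(p, dt)
  p.x = p.x + p.vx * dt
  p.y = p.y + p.vy * dt
end

local p = newParticle(0, 0, 30, -60)
updateParticle(p, 0.5)
print(p.x, p.y)
```

The tradeoff is what you'd expect: you lose the vector helper methods, but the update touches nothing except raw table fields.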
Also, I made a change so that the particles' velocity only updates every other frame, and got it up to ~50 particles, which is a fairly decent number. I've got a few more things I want to try, but if anyone has any suggestions I'd love to hear them!
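In case it helps anyone, the every-other-frame idea can be sketched like this. Big caveat: this assumes the velocity update is a simple constant acceleration, and `GRAVITY`, `updateAll` and the particle fields are all made-up names for illustration, not the library's real code.

```lua
local GRAVITY <const> = 200   -- assumed value, pixels per second squared
local frame = 0

local function updateAll(particles, dt)
  frame = frame + 1
  local updateVelocity = frame % 2 == 0   -- true on every second frame
  for i = 1, #particles do
    local p = particles[i]
    if updateVelocity then
      -- Apply two frames' worth of acceleration in one go.
      p.vy = p.vy + GRAVITY * (2 * dt)
    end
    -- Positions still move every frame so motion stays smooth.
    p.x = p.x + p.vx * dt
    p.y = p.y + p.vy * dt
  end
end

local particles = { { x = 0, y = 0, vx = 10, vy = 0 } }
for _ = 1, 4 do updateAll(particles, 0.25) end
print(particles[1].x, particles[1].y, particles[1].vy)
```

The velocity ends up where it would have been with per-frame updates, but the path in between is slightly coarser, which is usually hard to notice on small, short-lived particles.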
Sorry just to clarify I added the time label since I didn’t see that it was labeled. I think right now it says benchmark or something. And just a quick update: I’ve gotten it up to 100 particles which I think is decent enough to release. I’ll clean it up a little and post it here soon.