Working on a little particle system for a game. Would there be interest if I built it out into a library?

Yes, it's integer/floor division. Since Lua 5.3

https://www.lua.org/manual/5.3/manual.html#3.4.1

I use //1 instead of (an alias to) math.floor() as it's quicker and easier, though I can't remember the performance difference off-hand.
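
For anyone following along, here is a quick sketch of how `// 1` compares with `math.floor()`. It is plain Lua 5.3+ with nothing Playdate-specific, and `floorDiv` is just a name made up for the example. One thing to be aware of: on a float input, `// 1` gives back a float, while `math.floor()` gives back an integer when the result fits.

```lua
-- Floor division by 1 rounds toward negative infinity, the same way math.floor does.
local function floorDiv(x)
    return x // 1
end

print(floorDiv(7.5), math.floor(7.5))
print(floorDiv(-7.5), math.floor(-7.5))

-- The numeric values match, but the subtypes differ for float inputs:
-- the first result is a float, the second an integer.
print(math.type(floorDiv(7.5)), math.type(math.floor(7.5)))
```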

Ah ok, much appreciated. And thanks for the link, couldn't find it before.

Strange, I can't reproduce that. In a simple test case I'm seeing the opposite, the vector addition is a bit under 3x as fast as adding the components separately, both in the simulator and on the device. I wonder what the difference is?

main.lua.zip (727 Bytes)

Interesting... Here are the results I got from Matt's benchmark tool:

# time name
35, 11242, add vectors: plus equals
36, 11165, add vectors
37, 4634, add vectors piecewise
38, 4656, add vectors piecewise: plus equals
39, 4789, add vectors new
40, 16838, add nonvectors piecewise
41, 14018, add nonvectors piecewise, indexed
42, 17058, add numbers
43, 12457, scale vectors
44, 6780, scale vectors piecewise

I'm assuming the second number is the time to run the function? If so, 35-36 involve adding vectors directly ("plus equals" refers to me testing A+=B vs. A = A+B, a difference that seems negligible), 37-39 involve adding the x and y components separately (which I refer to as "piecewise"), 40-41 involve using a generic Lua table instead of vectors, 42 involves storing the x and y components as individual variables, and 43-44 recreate the initial tests but with scaling instead of addition.

Unless maybe I have the numbers backwards?

...Welp

I just looked at the original thread and yup, I got it backwards. Higher number = faster, so adding vectors is indeed around twice as fast as adding the components. Technically using generic tables or individual variables is ~50% faster than vectors, but the added convenience may be worth it.

Also, I made a change so that the particles' velocity only updates every other frame, and got it up to ~50 particles, which is a fairly decent number. I've got a few more things I want to try, but if anyone has any suggestions I'd love to hear them!

Nope! It's how many iterations were done in the time available. Bigger is better.

Not sure why I labelled that column time, sorry
(I didn't)

SsirRender-public 3.zip (113.0 KB)

@manalive Here you go. I sanitized the engine a bit and threw in some comments / printed output into the simulator console to help newcomers. Let me know if you have any questions

EDIT: minor changes to the file

Ooh, better correct that. It will be very confusing and possibly a massive time-waster, and optimization is already a huge time sink.

Suggested: add a bit of math to calculate the time that is spent, or change the label to "iterations" or something like that.

Sorry just to clarify I added the time label since I didn’t see that it was labeled. I think right now it says benchmark or something. And just a quick update: I’ve gotten it up to 100 particles which I think is decent enough to release. I’ll clean it up a little and post it here soon.

Yes! I did check but forgot to mention here. I describe it as:

number of times that command can be called in a single frame/update
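
To make the "bigger is better" reading concrete, here is a minimal sketch of an iteration-counting benchmark. It is not the benchmark tool used in this thread; `countIterations` is a hypothetical helper, and it uses `os.clock()` from plain Lua as the timer, which is an assumption about the environment.

```lua
-- Hypothetical helper, not the benchmark tool discussed in this thread.
-- Counts how many times fn can be called within a time budget (in seconds).
local function countIterations(fn, budgetSeconds)
    local clock = os.clock
    local deadline = clock() + budgetSeconds
    local iterations = 0
    while clock() < deadline do
        fn()
        iterations = iterations + 1
    end
    return iterations
end

-- Example: piecewise addition on plain tables, measured against
-- roughly one frame's worth of time at 30 FPS.
local a, b = { x = 1, y = 2 }, { x = 3, y = 4 }
local n = countIterations(function()
    a.x = a.x + b.x
    a.y = a.y + b.y
end, 1 / 30)

print("iterations within the budget:", n)
```

A higher count means the operation is cheaper, which is the opposite of how a "time" column would be read.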

Hey all, thanks so much for all your help and feedback, I finally got it working more or less the way I want it. It's still a little rough but I wanted to share it with you all to hear your thoughts or feedback.

Meet glitterbomb:
sparkloop

The main class is called ParticleEmitter, which you initialize by passing it an image. There's also a class called AnimatedParticleEmitter that does the same thing but with an image table. You can manually position the emitter, or give it a velocity so it moves each frame.

Right now, the default behavior is to emit particles randomly given certain parameters (discussed below) but could easily be changed to whatever pattern you want.

Right now, the settings are:

  • emissionRate - How many particles are created per second
  • emissionForce - Their initial velocity
  • emitterWidth - If greater than 0, will randomize the start position of particles along a line
  • emissionAngle - The direction they are emitted towards (0 degrees is right, 90 is down, 180 is left, etc.)
  • emissionSpread - If greater than 0, will randomize the angle at which particles are emitted within +/- the spread amount
  • particleLifetime - How long particles will exist (in seconds) before being destroyed (note, this along with emissionRate combine to give you the maximum number of particles that will exist at one time. So emitting 100 particles per second with a 1 second lifetime will have the same number of particles as 25 particles per second with a 4 second lifetime)
  • particleUpdateDelay - Rather than update particles' velocity every frame, you can insert a delay to improve performance (at the cost of physical accuracy)
  • inheritVelocity - Boolean. If set to true, particles will have the emitter's velocity added to their own when spawned.
  • gravity - The amount of gravity particles experience (can set to 0 to turn off)
  • worldScale - A helpful variable that lets you convert from your game's scale to the real world (for example, lets you think of particle velocity as m/s, 9.8 as physically accurate gravity, etc.)
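
As a rough illustration of how a few of these settings interact, here is a sketch in plain Lua. The function names are made up for the example and are not part of glitterbomb's API; the angle convention (0 degrees is right, 90 is down) follows the list above, assuming y grows downward on screen.

```lua
-- Hypothetical helpers for illustration only; not glitterbomb's actual API.

-- Upper bound on live particles: particles per second times seconds alive.
local function maxLiveParticles(emissionRate, particleLifetime)
    return emissionRate * particleLifetime
end

-- Initial velocity for one particle. With y growing downward,
-- 0 degrees points right and 90 degrees points down.
local function initialVelocity(emissionForce, emissionAngle, emissionSpread)
    local offset = 0
    if emissionSpread > 0 then
        -- random offset within +/- emissionSpread degrees
        offset = (math.random() * 2 - 1) * emissionSpread
    end
    local radians = math.rad(emissionAngle + offset)
    return emissionForce * math.cos(radians), emissionForce * math.sin(radians)
end

print(maxLiveParticles(100, 1), maxLiveParticles(25, 4))
print(initialVelocity(10, 90, 0))
```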

You can either set these variables on initialization by passing them as a table, or can set them manually with functions like setEmissionRate, setEmissionForce, etc. I've included a demo app so you can see some of the functionality and examples of how to use it.

Right now, I've found that I can get over 100 particles on device at 30FPS, and closer to 50 particles if they're animated (note: if anyone has thoughts on how to improve it I would love to hear it!)

I've found that animating scale and opacity was too expensive to do, so instead I recommend baking any size or opacity changes into your animation. To that end, I also created a second mini "app" called glitterglue. With glitterglue, you can create a ParticleEmitter, but rather than draw onscreen it draws a particle frame by frame to a sprite sheet so you can import it as an ImageTable to an AnimatedParticleEmitter later (I'm sure there's a more elegant way of going about this but that's what I have so far). Right now, it just animates size and opacity, but I'm sure it could be extended to rotation or other animations.

Anyways, would love for you all to take a look, try it out, and let me know what you think!

Hey all, so I've made a few more improvements and feel like it's ready for people to use. There are a few more features I'm trying to get in, but in the meantime I wanted to share some findings I made during the process of optimizing it to run on device (big thanks to Matt for his benchmarking tool):

  • When working with coordinates, storing them as numbers (self.xPosition, self.yPosition) is faster than storing them in a table (self.position.x, self.position.y), which in turn is faster than storing them as Playdate vectors. For most use cases I’d say the convenience of vectors is worth the tradeoff, but if you’re running hundreds of calculations every frame, storing them as tables or individual values makes a significant difference (almost 3x faster).
  • It’s slightly faster (20%) to access class values directly than it is to have a getter function.
  • It is faster (25%) to create a local variable every frame to store a value than it is to directly access that value multiple times a frame. For example, if you need to access a particle’s position multiple times per frame, just create a local variable to store it once rather than accessing it directly from the particle object every time.
  • It is faster (7%) to calculate a particle’s position every frame than it is to calculate and store all positions ahead of time and access them from a table every frame. I figured “pre-baking” the animation would be faster, but apparently it’s faster to do basic addition and multiplication than it is to access a value from a table.
  • It is very slightly faster (2%) for the particle spawner object to store the particle image than it is for each particle to store their own image. This was a little surprising to me as I figured each particle storing the image would be significantly slower, but since it isn’t it might give more flexibility in the future for each particle to store its own image.
  • Some other tricks that I didn’t benchmark but improved performance when testing on device:
    • Moving and drawing images was faster than moving and drawing sprites. I tested this early in the process and it’s something I’d like to revisit to see if it’s true in all circumstances, since sprites are easier to work with.
    • Many people will tell you this, but it’s best practice to save references to commonly used functions (playdate.graphics, math.random, etc.) as a const at the start of your file
    • DrawCentered() is actually very expensive, and it’s better to save the image’s dimensions and use it to offset a regular Draw()
    • Object pooling is your friend: rather than create a new particle object every time an old one is destroyed, I simply stored the old one and reinitialized it.
    • I added the option to only update a particle’s velocity every other frame, every third frame, etc. which improved performance (at the expense of accuracy). I also tried batching particles that were created at the same time to update them together, however this actually made things slower. I want to revisit this as well since it doesn’t entirely make sense to me, but my guess is the benefit of only accessing the particle’s properties once is outweighed by the overhead for managing batches.
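
Two of the tricks above, object pooling and caching values in locals, can be sketched in plain Lua roughly like this. This is not glitterbomb's code; the names (`pool`, `active`, `spawn`, `update`) and the particle fields are assumptions made for the example.

```lua
-- Minimal object pool sketch; names and fields are hypothetical.
local pool = {}     -- dead particles waiting to be reused
local active = {}   -- particles currently alive

local function spawn(x, y, vx, vy, lifetime)
    -- Reuse a dead particle if one is available, otherwise allocate.
    local p = table.remove(pool) or {}
    p.x, p.y, p.vx, p.vy, p.life = x, y, vx, vy, lifetime
    active[#active + 1] = p
    return p
end

local function update(dt, gravity)
    -- Iterate backwards so swap-removal never skips an unprocessed particle.
    for i = #active, 1, -1 do
        local p = active[i]
        local life = p.life - dt
        if life <= 0 then
            -- Move the last particle into this slot and recycle the dead one.
            active[i] = active[#active]
            active[#active] = nil
            pool[#pool + 1] = p
        else
            -- Read the field once into a local, then reuse the local.
            local vy = p.vy + gravity * dt
            p.life = life
            p.vy = vy
            p.x = p.x + p.vx * dt
            p.y = p.y + vy * dt
        end
    end
end
```

Because a recycled particle is the same table as the dead one, no new garbage is created when particles respawn.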

Curious to hear if you all have any thoughts!

Great! Thanks for the detailed summary.

Does this increase the number of particles you're pushing around?

edit: from 30 particles up to 100! Nice.

Oh sorry yes! Forgot to mention it here, but originally it could support around 30 particles at a time and now it’s over 100. I also added the ability to load imagetables for animated particles (as well as a utility to create imagetables that animate a particle’s size/opacity) and even there it can support 50+ animated particles.

The one remaining issue is there is a framerate spike every 5-10 seconds which seems to be due to the garbage collector? But other than that I’m very pleased with where it’s at.

You can change the frequency of the garbage collector, or trigger it yourself. If you're reusing particles then the question is "what garbage is building up?"

Triggering the GC yourself allows you to see how much has been freed. Let's track it down!
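
A minimal sketch of what "trigger it yourself and see how much was freed" could look like, using only the standard `collectgarbage` function. `collectAndMeasure` is a made-up helper name, and the loop that creates throwaway tables is only there to give the collector something to free.

```lua
-- Runs a full collection and returns how many kilobytes it freed.
local function collectAndMeasure()
    local before = collectgarbage("count")  -- memory in use, in KB
    collectgarbage("collect")               -- full collection cycle
    return before - collectgarbage("count")
end

-- Pause automatic collection so the garbage below is still around to measure.
collectgarbage("stop")
for i = 1, 10000 do
    local temp = { x = i, y = i }  -- becomes garbage on the next iteration
end
local freedKB = collectAndMeasure()
collectgarbage("restart")

print(string.format("freed %.1f KB", freedKB))
```

Calling something like this once per frame around suspect code, and watching which call sites make the freed amount jump, is one way to narrow down where the garbage comes from.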

Did you see pdParticles? pdParticles - Playdate particle system

Interesting! Is there a way to see what garbage the GC is collecting?

And I did see that! It’s funny that I built this because there was no particle system solution, then the next day someone else made one :joy:

I really like the stylized look of their particles, and I may try adding something like that here.

Hey all, so I managed to figure out the GC issue and I'm up to about 150 particles which is awesome. I'm working on adding some additional features, and I had a question I wanted to run by you all:

I want to be able to apply other forces besides constant downwards gravity -- for example, a variable wind force, or a relative orbital gravity based on the particle's position. I wanted to make this part as extensible and generalizable as possible to make it easy for other people to select which force they want to use, or implement a new one of their own. However in doing so, I've found that it negatively impacted performance and I wanted to get your input on what you'd prefer if you were using this in your own game:

Option 1
Gravity logic is implemented directly in the particle update loop:

Update Loop
    Check if particle should be destroyed
    Apply force
    Update position

This is how it's currently implemented.

Option 2
Gravity logic is moved to a separate function to make it easier to customize

Update Loop
    Check if particle should be destroyed
    Call ForceFunction()
    Update position
ForceFunction()
    Apply force

That way, you can easily write multiple force functions (gravity, wind, etc.) and let users choose which they want to use or write their own. I even created a wrapper function so all a user would need to do is set the wrapper to a different function to apply a different force.

The downside is because you're calling a function 100+ times a frame, it impacts performance, with a max of 140 particles instead of 150.

Option 3
Instead of calling the force function within the update loop, we just mark particles that need to be updated and then call the force function once at the end. The force function then loops through all the particles that need to be updated and applies force to them:

Update Loop
    Check if particle should be destroyed
    Add particle to update list

Call ForceFunction()
ForceFunction()
    Loop through particles that need updating
        Apply force
        Update position

Since you're only calling the update function once, it's slightly faster than Option 2 (145 max particles instead of 140)

However it makes writing a new force function more complex since you need to add logic to loop through all the particles, as well as to update the particle's position.

I'm curious if you all think the ease of updating is worth the performance tradeoff for option 2, or if you'd prefer option 3 which is a tradeoff between the two.
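
To make the comparison easier to picture, here is a rough Lua sketch of Options 2 and 3. It is not the library's actual code: the function names and particle fields are assumptions, the gravity value is hard-coded, and the "should this particle be destroyed" check is left out to keep it short.

```lua
-- Option 2: the force function handles one particle and is called per particle.
local function gravityForce(p, dt)
    p.vy = p.vy + 9.8 * dt
end

local function updatePerParticle(particles, dt, forceFunction)
    for i = 1, #particles do
        local p = particles[i]
        forceFunction(p, dt)        -- one function call per particle
        p.x = p.x + p.vx * dt
        p.y = p.y + p.vy * dt
    end
end

-- Option 3: the force function is called once per frame and owns the loop,
-- so every custom force has to repeat the loop and the position update.
local function gravityForceBatch(particles, dt)
    for i = 1, #particles do
        local p = particles[i]
        p.vy = p.vy + 9.8 * dt
        p.x = p.x + p.vx * dt
        p.y = p.y + p.vy * dt
    end
end

local function updateBatched(particles, dt, forceFunction)
    forceFunction(particles, dt)    -- one function call per frame
end
```

Both versions move the particles identically; the difference is only in how much boilerplate a custom force function has to carry.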

150, nice!

Interested to hear about your garbage collection changes.

Just checking, is ForceFunction() defined as local?

  • local is faster
  • local <const> perhaps even faster
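
For reference, a small sketch of what those two declarations look like. The `<const>` attribute needs Lua 5.4; `randomCell` is a made-up function for the example, and whether either form is measurably faster in a given game is something to benchmark rather than assume.

```lua
-- Cache library functions in locals once, at the top of the file.
local floor <const> = math.floor
local random <const> = math.random

-- A local function avoids a table lookup on every call,
-- unlike a method looked up on a class table.
local function randomCell(size)
    return floor(random() * size)
end

print(randomCell(10))
```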

The issue with the GC turned out to be that I had several functions that returned values as tables which I wrote in the form of return {x=0,y=0}. As it turns out, this creates a new table every time it was called and, like the force function here, it was being called 100+ times a frame which is what caused the GC build up. It wasn’t the prettiest, but to solve it I created a “holder” table at init that gets reused instead of creating a new one every time (e.g. holderTable.x=0 holderTable.y=0 return holderTable). Not only did it solve the GC issue but it also made things run ~50% faster.
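
In sketch form, the change looks something like this. The function names are made up for the example; the pattern is the one described above.

```lua
-- Before: allocates a new table on every call, which becomes garbage.
local function getOffsetAllocating(x, y)
    return { x = x, y = y }
end

-- After: one "holder" table created up front and overwritten on each call.
local holder = { x = 0, y = 0 }

local function getOffsetReusing(x, y)
    holder.x = x
    holder.y = y
    return holder
end
```

The catch is that every caller gets the same table back, so the result has to be read before the next call and must not be stored for later. Returning two values (`return x, y`) is another way to avoid the allocation if the call sites allow it.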

And right now the force function is a class method, but maybe I’ll try declaring it as a const and see what happens!

For your force function, have you considered having a callback one could specify which defaults to nil? That way your update function could check for the existence of the callback, call it if it exists, and otherwise run the internal force logic.
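
Roughly what I have in mind, as a sketch. The `Emitter` class and the `forceCallback` field are hypothetical names, not the library's actual API.

```lua
-- Hypothetical emitter; forceCallback defaults to nil.
local Emitter = {}
Emitter.__index = Emitter

function Emitter.new(gravity)
    return setmetatable({ gravity = gravity, forceCallback = nil, particles = {} }, Emitter)
end

function Emitter:update(dt)
    -- Read the fields once per frame rather than once per particle.
    local callback = self.forceCallback
    local gravity = self.gravity
    local particles = self.particles
    for i = 1, #particles do
        local p = particles[i]
        if callback then
            callback(p, dt)                -- user-supplied force
        else
            p.vy = p.vy + gravity * dt     -- built-in gravity, no function call
        end
        p.x = p.x + p.vx * dt
        p.y = p.y + p.vy * dt
    end
end
```

That way the default path keeps the inlined gravity logic and only users who set a callback pay for the extra function call per particle.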