Hello! Recently it was brought to my attention that the intro segment to my game Reel-istic Fishing was lagging pretty heavily.
This was odd to me, as I'd tested it before and it ran perfectly on my Rev A device (30 FPS). I tested it out again, and it still ran really well, even in Mirror.
Plenty of people who own the game were kind enough to help me test this out, and we eventually noticed a pattern!
8 people total tested out the intro.
All 4 people with Rev B devices experienced heavy lag.
All 4 people with Rev A devices had a smooth experience.
(thank you aperson, citrus, gamma, tavi, nicksr, kiwi, and rae for taking the time to test this)
There should not be any meaningful difference between our intros. This segment happens right after the title screen, and the only difference our games would have is the difficulty option we picked (which has no effect on the intro).
There is also no randomization of the intro, so all 8 of us would be seeing the same thing.
I'm not sure what could be causing such a big performance difference D:
Here are two graphs! (Apologies for the cropping of the Rev A one, I took it yesterday and don't have a way to get a fresh screenshot of everything)
On Rev B you can see a shaky 13-17 FPS (25% of CPU is idle)
Rev A is 30FPS
Trying to represent this via gifs. Hopefully they display accurately:
A laggy Rev B video
How it looks on a Rev A unit
It's probably not needed, but here's the whole intro from the start of the lag to the end, recorded with OBS and Mirror (to show inputs) on my Rev B device:
Oof. That doesn't look good. Could you DM me a build I can profile, to try and figure out what's going on? I can't think of any hardware differences that would cause that.
Could the downclock of the CPU affect FPU performance? If that intro uses floating-point math and Rev A CPUs were doing FPU calculations faster than Rev B CPUs (currently), could something like that affect it?
Turns out this is a surprise performance regression in the "draw image rotated 90 degrees" function. Rotating an image is a notoriously hard thing to optimize because it accesses data in strides across the image, which causes lots of cache misses if you can't store the entire image in cache. I had tested this with dcache on vs off for every reasonable width/height combination and came up with a heuristic for when we should disable the cache--sometimes cache hurts here, sometimes it helps--but either this is a weird aberrant case or something changed and my heuristic is wrong now. I'll rerun those tests over the weekend and see where that needs updating or if I should just drop it on the H7.
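To make the strided-access problem concrete, here is a minimal C sketch. It is not the system's actual rotation code: it assumes one byte per pixel for readability (the device's real image format isn't described in this thread), and the function names and `TILE` size are made up for illustration. The first function reads the source row by row but writes the destination `h` bytes apart, so every write can land on a different cache line. The second one works through the image in small square tiles, a commonly used mitigation that keeps both the reads and the writes within a small working set.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tile edge; a real value would be tuned to the cache. */
#define TILE 16

/* Rotate a w x h image 90 degrees clockwise into dst, which is h wide
 * and w tall. Source pixel (x, y) lands at row x, column (h - 1 - y).
 * Consecutive writes are h bytes apart: this is the strided access. */
void rotate90_naive(const uint8_t *src, uint8_t *dst, size_t w, size_t h)
{
    for (size_t y = 0; y < h; y++)
        for (size_t x = 0; x < w; x++)
            dst[x * h + (h - 1 - y)] = src[y * w + x];
}

/* Same result, but processed in TILE x TILE blocks so that each block's
 * source rows and destination rows stay close together in memory. */
void rotate90_tiled(const uint8_t *src, uint8_t *dst, size_t w, size_t h)
{
    for (size_t ty = 0; ty < h; ty += TILE) {
        size_t y_end = (ty + TILE < h) ? ty + TILE : h;
        for (size_t tx = 0; tx < w; tx += TILE) {
            size_t x_end = (tx + TILE < w) ? tx + TILE : w;
            for (size_t y = ty; y < y_end; y++)
                for (size_t x = tx; x < x_end; x++)
                    dst[x * h + (h - 1 - y)] = src[y * w + x];
        }
    }
}
```

Whether tiling actually wins on a given device depends on the image size and the cache, which is the same reason a fixed "cache on / cache off" heuristic can go stale.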
What's going on at a higher level is that the boss sprite is off-screen during that sequence but is still being animated, with its image getting set every three or four frames. Normally that wouldn't be a problem, but since the sprite is rotated, the system has to redraw the rotated image into its own image buffer every time. One easy fix is to rotate those images ahead of time, either in code or in your image editor. We'd considered adding a cache mechanism to help with this sort of case, but decided it only adds unnecessary complexity since it's easy enough to implement (and customize) at the game level.
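As a rough illustration of what "rotate ahead of time" could look like at the game level, here is a small C sketch. Everything in it is hypothetical (`RotatedAnim`, `FRAME_COUNT`, the one-byte-per-pixel buffers, the function names); it does not use any SDK call. The idea is simply to pay for the rotation once per animation frame at load time, then hand back an already-rotated buffer on every animation tick.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FRAME_COUNT 4 /* hypothetical animation length */

typedef struct {
    size_t w, h;                   /* size of each rotated frame */
    uint8_t *rotated[FRAME_COUNT]; /* rotated once, reused afterwards */
} RotatedAnim;

/* 90 degrees clockwise; dst is h wide and w tall. */
static void rotate90(const uint8_t *src, uint8_t *dst, size_t w, size_t h)
{
    for (size_t y = 0; y < h; y++)
        for (size_t x = 0; x < w; x++)
            dst[x * h + (h - 1 - y)] = src[y * w + x];
}

void rotated_anim_free(RotatedAnim *anim)
{
    for (size_t i = 0; i < FRAME_COUNT; i++) {
        free(anim->rotated[i]);
        anim->rotated[i] = NULL;
    }
}

/* Rotate every w x h source frame up front (w and h assumed non-zero).
 * Returns 0 on success, -1 if an allocation fails. */
int rotated_anim_init(RotatedAnim *anim, const uint8_t *frames[FRAME_COUNT],
                      size_t w, size_t h)
{
    anim->w = h;
    anim->h = w;
    for (size_t i = 0; i < FRAME_COUNT; i++)
        anim->rotated[i] = NULL;
    for (size_t i = 0; i < FRAME_COUNT; i++) {
        anim->rotated[i] = malloc(w * h);
        if (anim->rotated[i] == NULL) {
            rotated_anim_free(anim);
            return -1;
        }
        rotate90(frames[i], anim->rotated[i], w, h);
    }
    return 0;
}

/* Per-tick lookup: no rotation work, just an index into the cache. */
const uint8_t *rotated_anim_frame(const RotatedAnim *anim, unsigned tick,
                                  unsigned ticks_per_frame)
{
    if (ticks_per_frame == 0)
        ticks_per_frame = 1;
    return anim->rotated[(tick / ticks_per_frame) % FRAME_COUNT];
}
```

With the image changing every three or four display frames, as in the intro described above, `ticks_per_frame` would be 3 or 4, and the per-tick cost drops to an array lookup. The trade-off is memory: one extra buffer per animation frame.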
Came across the same issue (horrible performance on a Group 5 device when rotating sprites 90 degrees, while a Group 1 device had no issue at all) and wanted to post my solution to hopefully help anyone stumbling across similar issues.
It seems like this issue only occurs when big sprites get rotated (sprites which are 240x240 or bigger, from what I've experienced in my game).
Rotating smaller sprites (240x80, for example) tends not to cause issues (unless done en masse).
I encountered the performance issues when drawing the text for a menu.
The issues could be resolved by simply limiting the size of the image/sprite the text was drawn in. Maybe similar approaches will help other devs stumbling across this issue for now.
Previously the text was drawn on a sprite the size of the red rectangle. Now it's drawn on a sprite the size of the green rectangle and the difference is night & day.
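For scale, a 240x240 buffer is 57,600 pixels while a 240x80 one is 19,200, so shrinking the buffer cuts the rotation work to a third before any cache effects are counted. One possible way to find the smaller rectangle automatically is sketched below in C. It is only an illustration, assuming a one-byte-per-pixel buffer where 0 means "nothing drawn"; the `Rect` type and `tight_bounds` name are invented here and are not part of any SDK.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t x, y, w, h;
} Rect;

/* Smallest rectangle that contains every non-zero pixel of a w x h
 * image. Returns a rectangle with w == 0 and h == 0 if the image is
 * entirely blank. */
Rect tight_bounds(const uint8_t *px, size_t w, size_t h)
{
    size_t min_x = w, min_y = h, max_x = 0, max_y = 0;
    int found = 0;

    for (size_t y = 0; y < h; y++) {
        for (size_t x = 0; x < w; x++) {
            if (px[y * w + x] == 0)
                continue;
            if (x < min_x) min_x = x;
            if (x > max_x) max_x = x;
            if (y < min_y) min_y = y;
            if (y > max_y) max_y = y;
            found = 1;
        }
    }

    Rect r = {0, 0, 0, 0};
    if (found) {
        r.x = min_x;
        r.y = min_y;
        r.w = max_x - min_x + 1;
        r.h = max_y - min_y + 1;
    }
    return r;
}
```

In practice it is usually simpler to size the sprite from the measured text width and height up front, which is effectively what the change from the red rectangle to the green one does.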
Hope this helps others!