Device crash for C/Nim based game

I'm working on a game using Nim, which compiles down to C. I'm experiencing a crash on device only that I'm struggling to debug. SDK version 2.3.1.

When I run crashlog.txt through the symbolizer, it just gives me question marks:

--- crash at 2024/02/26 17:45:58---
   r0:20010e3c    r1:0000003f     r2:00000411    r3: 20010e44
  r12:20009a98    lr:2405d17d     pc:2405cca8   psr: 91010000
 cfsr:00000082  hfsr:00000000  mmfar:00000047  bfar: 00000047
heap allocated: 3638912
Lua totalbytes=0 GCdebt=0 GCestimate=0 stacksize=0 

?? ??:0
?? ??:0

A few notes from my investigation so far:

  • This thread indicates the 24* memory space is for firmware
  • I've tried just running the simulator with valgrind, but it isn't reporting any problems in my code
  • This thread mentions using an Address Sanitizer, but this thread on Stack Overflow indicates it isn't supported on ARM
  • The last device crash I debugged I was able to repro in the simulator. The error message was "unaligned fastbin chunk detected". I resolved that error by deleting code, not by solving the underlying issue, so I'm suspicious this is the same problem. But like I said, I'm unable to repro in the simulator this time.

I'm about to switch to more tedious debugging options (printf, custom malloc), so I'm looking for ideas on anything else I should be doing before I go down that path.

It may still be of value to run asan on your simulator code. If you're trashing memory it will show up with asan in the sim.

As alluded to in File open crashes on device C API - #11 by dave, we actually do have symbols for (some of) the Playdate firmware, located in the symbols.db file that's included with the SDK, but I never could find the memory addresses from my crash logs in the file. It just now occurred to me to convert the hexadecimal addresses from the logs into decimal (tool) to match the ones in the database, and sure enough, I can see the file name corresponding to the pc and lr values in your crash log.
Your pc and lr values are 2405cca8 and 2405d17d, which are 604359848 and 604361085 in decimal. If I search for those numbers in the lines table, it tells me that the pc value was found in file_id 166 (at line 3496). The lr value isn't found, but the surrounding addresses are in the same file. If I search for that file id in the files table, it tells me that file is /Core/dlmalloc.c. It's worth noting that while I wasn't able to find those addresses in the functions table, I could find addresses within a couple hundred of yours.
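For anyone repeating this lookup, the hex-to-decimal step can be sketched in a few lines of C. This is only an illustration: the helper names parse_crash_addr and strip_thumb_bit are made up for this example. One assumption worth flagging: on Cortex-M cores the low bit of lr marks Thumb state rather than being part of the address, so an odd lr such as 2405d17d may only match the database once that bit is cleared. That might be why the lr value was not found directly, though I have not confirmed it against symbols.db.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a register value exactly as it is printed
 * in crashlog.txt (hexadecimal, no "0x" prefix required). */
uint32_t parse_crash_addr(const char *hex)
{
    return (uint32_t)strtoul(hex, NULL, 16);
}

/* Hypothetical helper: clear bit 0 of a code address. On Cortex-M the
 * low bit of lr flags Thumb state, so the instruction address itself
 * is the even value just below an odd lr. */
uint32_t strip_thumb_bit(uint32_t addr)
{
    return addr & ~(uint32_t)1;
}
```

With the values from the log above, parse_crash_addr("2405cca8") gives 604359848, and strip_thumb_bit(parse_crash_addr("2405d17d")) gives 604361084, which is the number I would try in the lines table in place of 604361085.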
Hopefully this is somewhat helpful.


Are there any instructions on how to do this on Linux? I spent a bit of time working on it this evening, but eventually hit this: Workarounds for #837 (Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.) · Issue #856 · google/sanitizers · GitHub

It just now occurred to me to convert the hexadecimal addresses from the logs into decimal (tool) to match the ones in the database, and sure enough, I can see the file name corresponding to the pc and lr values in your crash log

Great find! It does a bit to confirm that this is likely a memory issue.
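The fault registers in the crash log seem to point the same way. The sketch below assumes the cfsr field is the standard ARMv7-M Configurable Fault Status Register, whose low byte holds the MemManage flags; the macro and function names are invented for this example. Under that assumption, cfsr 00000082 has DACCVIOL (bit 1) and MMARVALID (bit 7) set, meaning a data access violation with a valid fault address in mmfar.

```c
#include <stdbool.h>
#include <stdint.h>

/* MemManage status bits in the low byte of the ARMv7-M CFSR. */
#define CFSR_IACCVIOL  (1u << 0) /* instruction access violation */
#define CFSR_DACCVIOL  (1u << 1) /* data access violation */
#define CFSR_MUNSTKERR (1u << 3) /* fault while unstacking on exception return */
#define CFSR_MSTKERR   (1u << 4) /* fault while stacking on exception entry */
#define CFSR_MLSPERR   (1u << 5) /* fault during lazy FP state preservation */
#define CFSR_MMARVALID (1u << 7) /* mmfar holds the faulting address */

/* True when the fault was caused by a load or store to a bad address. */
bool cfsr_is_data_access_violation(uint32_t cfsr)
{
    return (cfsr & CFSR_DACCVIOL) != 0;
}

/* True when the mmfar value in the log can be trusted. */
bool cfsr_mmfar_valid(uint32_t cfsr)
{
    return (cfsr & CFSR_MMARVALID) != 0;
}
```

Both helpers return true for 0x00000082, and mmfar is 00000047. A load or store to an address that small usually suggests a null or corrupted pointer plus a small offset, which would fit heap metadata being trashed before dlmalloc touched it. That reading is an inference from the registers, not something the log states.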

Do we ship a simulator with asan enabled? As I recall, the simulator has to be built with asan enabled if it's going to run an asan-enabled game.

Or is it the other way around, a non-asan sim can run an asan game but an asan sim can't run a non-asan game?

Noting this for tracking: there were discussions and an example cmake file on the dev Discord on how to enable asan for a project on Linux.

I always enable asan when building for the simulator. It works great if I run the simulator + game from Xcode.

However it seems to crash if I double click on a PDX built with asan in the Finder (asan_die something something…)


To close this loop:

I never was able to get ASan working for me on Ubuntu because of the issue I linked above. Valgrind seemed to work well enough, but it didn't help much because I was still unable to repro the crash on the simulator.

Instead, I edited the setup.c file that ships with the Playdate SDK to have it log everything it was doing. With this, I discovered that the Nim integration layer wasn't fully using the Playdate allocator. Under the covers, it was directly calling malloc in the C standard library, which apparently still works.

I was able to get rid of the device crashes by ensuring that Nim itself was using playdate.system.realloc. (Caveat: I reserve the right to find more debilitating crashes that wind up being difficult to debug)
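To illustrate the idea in C terms, here is a minimal sketch of funnelling every allocation through a single realloc-style function. It is not the code from the pull request. The mem_* names and host_realloc are hypothetical, and the sketch assumes the Playdate realloc convention that a NULL pointer means "allocate" and a size of 0 means "free".

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as the SDK's system realloc: (ptr, size) -> ptr. */
typedef void *(*realloc_fn)(void *ptr, size_t size);

/* The one allocator every wrapper below goes through. */
static realloc_fn g_realloc = NULL;

/* Hypothetical setup hook: call once before any allocation happens. */
void mem_set_allocator(realloc_fn fn) { g_realloc = fn; }

void *mem_alloc(size_t size) { return g_realloc(NULL, size); }

void *mem_calloc(size_t count, size_t size)
{
    /* Refuse requests whose byte count would overflow size_t. */
    if (size != 0 && count > SIZE_MAX / size) return NULL;
    size_t total = count * size;
    void *p = g_realloc(NULL, total);
    if (p != NULL) memset(p, 0, total);
    return p;
}

void *mem_resize(void *p, size_t size) { return g_realloc(p, size); }

void mem_free(void *p)
{
    /* Assumed convention: size 0 releases the block. */
    if (p != NULL) g_realloc(p, 0);
}

/* Host-side stand-in with the same convention, for trying this off device. */
void *host_realloc(void *ptr, size_t size)
{
    if (size == 0) { free(ptr); return NULL; }
    return realloc(ptr, size);
}
```

On device, the setup call would presumably be mem_set_allocator(pd->system->realloc) during the init event, with the rest of the program calling only the mem_* wrappers. The point of the single function pointer is that nothing can quietly fall back to the C library's malloc, which is the mismatch described above.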

If you're interested, the pull request is here: Use playdate realloc for memory management by Nycto · Pull Request #60 · samdze/playdate-nim · GitHub
