Today was a tremendous day. I received my Playdate device!
Unfortunately, my C-based game is crashing the device. I get a message to press A to restart. I've confirmed that I can build and run the C API examples on device. This same C code runs without issue in the simulator (with the limited malloc pool), as well as on the Android platform (with a few platform-specific differences, of course). This occurs when building with SDK 1.10.0 on OSX and Windows.
What would be the best approach to debug this? I've been commenting out large pieces of the engine, but iteration time is very slow without any other clues. Logging to console seems very unreliable before a crash.
I suppose that's possible, but I don't have any recursion, and I don't think I have any large allocations on the stack. I'll double check. Thanks for the suggestion.
The first thing you want to look at is pc and lr there: pc should be the address where the crash happened (or possibly the address after it), and lr points to the calling function, unless the current function is using it for something else. If you load your pdex.elf file into gdb and run info line *0x<address>, it'll tell you what the functions are, provided they're in the scope of the elf file.
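To make the lookup step concrete, here is a minimal sketch in C of a helper that turns a raw pc or lr value from the crash log into the gdb query described above. The function names are hypothetical and not part of the Playdate SDK. It also assumes the usual Cortex-M behaviour where return addresses in lr carry bit 0 set as a Thumb marker, so it clears that bit before formatting; gdb will often resolve the odd address anyway, so treat this as a convenience rather than a requirement.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: clear bit 0, which Cortex-M sets in lr to mark
 * Thumb state, so the value lines up with an instruction address. */
static uint32_t strip_thumb_bit(uint32_t addr)
{
    return addr & ~UINT32_C(1);
}

/* Hypothetical helper: write the gdb command for addr into buf.
 * Returns whatever snprintf returns (characters that would be written). */
static int format_gdb_query(char *buf, size_t size, uint32_t addr)
{
    return snprintf(buf, size, "info line *0x%08" PRIx32,
                    strip_thumb_bit(addr));
}
```

Passing an lr value such as 0x0802f343 yields a command string you can paste at the (gdb) prompt after loading pdex.elf.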
Sorry for the dumb question, but how do I build a pdex.elf? I tried running gdb on my pdex.bin, but it complained "pdex.bin: not in executable format: File format not recognized."
The Makefiles in the example projects leave a pdex.elf behind in the build folder and then use objcopy to generate the pdex.bin files. If you're using plain make, look for it there. I haven't found a way to get an elf file from our cmake setup, unfortunately. (I'm sure it's possible, I just don't know the first thing about cmake.)
Reading symbols from pdex.elf...done.
(gdb) info line *0x0802f302
No line number information available for address 0x802f302
(gdb) info line *0x0802f343
No line number information available for address 0x802f343
I added -g to my UDEFS, just in case. Do I need to make any other configuration changes to access this info?
The values of pc and lr seem reasonable, compared to the crashlog you shared, but all the negative values for heap and lua are disconcerting. I would expect my lua numbers to be zero, as I have no lua code. Any clues here?
I should have mentioned before: if the address is in the 0x08000000-0x08100000 range, that's firmware code. Your game will be running in the 0x60000000-0x61000000 range. There's a symbols.db sqlite database that has symbols for the public side of the API, which we use for the profiler. I don't think we have a tool for looking up addresses in there, but if you're familiar with sqlite you could look around in it. You still wouldn't find these numbers, though, because they're in the low-level driver code. This crash log says it's crashing in the eMMC (flash storage) driver, and that address isn't doing anything unusual. It's dereferencing $r3, but the value there is the device instance, which should be fine. And the $cfsr value looks like random bits to me, so it doesn't look like it logged the crash correctly. (The Lua stuff is also uninitialized data, so don't worry about those.)
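As a quick sanity check before reaching for gdb, the two address ranges mentioned above can be encoded in a small C helper. This is only a sketch: the enum and function names are made up for illustration, and treating the upper bound of each range as exclusive is my assumption, since the post only gives the endpoints.

```c
#include <stdint.h>

/* Hypothetical labels for where a crash address lives. */
typedef enum { ADDR_FIRMWARE, ADDR_GAME, ADDR_OTHER } addr_region;

/* Classify a pc/lr value using the ranges quoted in the thread.
 * Upper bounds are treated as exclusive (an assumption). */
static addr_region classify_address(uint32_t addr)
{
    if (addr >= 0x08000000u && addr < 0x08100000u)
        return ADDR_FIRMWARE;   /* firmware code, not in pdex.elf */
    if (addr >= 0x60000000u && addr < 0x61000000u)
        return ADDR_GAME;       /* game code, should resolve in pdex.elf */
    return ADDR_OTHER;          /* neither range; possibly a bad value */
}

/* Human-readable name for logging or printing. */
static const char *region_name(addr_region region)
{
    switch (region) {
    case ADDR_FIRMWARE: return "firmware";
    case ADDR_GAME:     return "game";
    default:            return "other";
    }
}
```

Both addresses from the gdb session earlier in the thread (0x0802f302 and 0x0802f343) land in the firmware range, which matches gdb finding no line information for them in pdex.elf.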
One last thing to try: If you send up the crash report I can try and find it in Memfault and see what information they've got. In Settings, go to the bottom of the first menu to System, then go to the bottom of that menu to "Send Crash Report". If you give me your serial number (DM if you don't want to share in public) I'll look it up and let you know what it says.
Just to confirm that I could take an address from crashlog and look it up successfully, I put a null pointer dereference in the Hello World example. Here is the resulting crashlog:
@dave This sounds like great information and super useful. Is there any progress on providing more information in the event of a crash? All of my crashes appear in the firmware. Further, my logToConsole() calls must be getting buffered, as I'm unable to see even the message immediately before the crashing code. Any additional information would be helpful. Being able to look up symbols in a public map file, or get a stack trace from my own code, would be helpful. It would be ideal if I could get a remote debug session running. Thanks.
One thing that's coming in 2.1 is that the values in crashlog.txt are going to be accurate again. Right now they're getting overwritten before we have a chance to write them out to disk.
Once that's in place you should be able to use the firmware_symbolizer.py script in the SDK to find out where the crash was, if it was in your game code. I'll file a feature request to have that query symbols.db instead if the crash location is in the firmware.