Tips and tricks for processing audio in C

dustin · August 8, 2022, 4:27pm

I thought I would start a thread to bring together some tips on audio processing with the Playdate's C SDK. Feel free to share things you've learned here, whether general tips on digital audio processing or Playdate-specific ones.

Some helpful terminology:

Sample: a moment of a sound wave digitally sampled and stored as a value (8-bit or 16-bit on Playdate).
Sample Rate: the number of samples per second of audio
16-bit Audio: audio sampled to shorts (value range of -32768...32767)
8-bit Audio: audio sampled to chars (value range of -128...127)
Channels: on the Playdate, samples can be stored with 1 (mono) or 2 (stereo) channels.
Frame: all of the data that defines a full mono or stereo sample—which with stereo includes a sample per each channel. Data length for a frame is: 1 byte for 8-bit mono, 2 bytes for 8-bit stereo, 2 bytes for 16-bit mono, 4 bytes for 16-bit stereo.

Let me know if these definitions seem off or if I'm missing anything useful here.
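
To make the frame arithmetic above concrete, here is a minimal sketch. The helper names (frame_size_bytes, frame_count, duration_ms) are hypothetical and not part of the Playdate SDK; they only restate the definitions in plain C.

```c
#include <stdint.h>

// Hypothetical helper (not part of the Playdate SDK): bytes in one frame.
// bits_per_sample is 8 or 16; channels is 1 (mono) or 2 (stereo).
static uint32_t frame_size_bytes(int bits_per_sample, int channels) {
    return (uint32_t)(bits_per_sample / 8) * (uint32_t)channels;
}

// Number of whole frames in a buffer of byte_length bytes.
static uint32_t frame_count(uint32_t byte_length, int bits_per_sample, int channels) {
    return byte_length / frame_size_bytes(bits_per_sample, channels);
}

// Duration in milliseconds of `frames` frames played at `sample_rate` frames per second.
static uint32_t duration_ms(uint32_t frames, uint32_t sample_rate) {
    // Widen to 64 bits so frames * 1000 cannot overflow.
    return (uint32_t)(((uint64_t)frames * 1000u) / sample_rate);
}
```

For example, 176400 bytes of 16-bit stereo audio is 44100 frames, which lasts one second at a 44100 Hz sample rate.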

dustin · August 8, 2022, 4:57pm

Here is a general example of processing the audio data stored in a Playdate AudioSample. It should be a good place to start if you want to mess around with Playdate audio data in C.

I will try to circle back and make this example more general by showing how to handle stereo sound, and ideally make it a little more compact.

Please let me know if you see any issues here and I'll fix them.

// A few helpful macros and values
#define MIN(A, B) ((A) < (B) ? (A) : (B))
#define MAX(A, B) ((A) > (B) ? (A) : (B))
#define MAX_INT16_F 32767.0f
#define MIN_INT16_F -32767.0f
#define MAX_INT8_F 127.0f
#define MIN_INT8_F -127.0f

// In this case, I'm reading an AudioSample passed from the lua runtime.
AudioSample* sample = pd->lua->getArgObject(1, "playdate.sound.sample", NULL);

// Fetch a ptr to the sample's data, data length, format, and sample rate.
uint8_t* sound_data = NULL;
SoundFormat sound_format;
uint32_t sound_sample_rate;
uint32_t sound_data_length;

pd->sound->sample->getData(sample, &sound_data, &sound_format, &sound_sample_rate, &sound_data_length);

// Determine number of channels (stereo or mono).
const int sound_channels = SoundFormatIsStereo(sound_format) ? 2 : 1;

// For now this example only processes mono (8/16-bit) frames.
if(sound_channels > 1) {
	return;
}

// Get length of each frame (if stereo, a frame includes a sample per channel).
const int bytes_per_frame = SoundFormat_bytesPerFrame(sound_format);

// Determine the number of samples in the AudioSample's data.
const int sample_count = sound_data_length / bytes_per_frame;

// Allocate an output buffer we'll use to store our processed audio.
uint8_t* output = malloc(sound_data_length);

// Now, you don't need to handle all sound formats if you know what data you're working with.
// However, if you want to create something more general, you could handle both 16 and 8 bit audio samples.
// In each of these cases we cast our output and input buffers to the correct type.
// For now I am only handling mono here, but stereo samples sit side by side so you could have a subloop per sample to iterate over the number of channels.
if(SoundFormatIs16bit(sound_format)) {
	int16_t* output16 = (int16_t*)output;
	int16_t* samples16 = (int16_t*)sound_data;
	const int16_t clip_value = 5000;

	for(int i = 0; i < sample_count; i++) {
		const int16_t sample = samples16[i]; // Grab the current 16-bit sample.
		const float sample_f = (float)sample / 32767.0f; // Not used but an example if you want to work with a normalized float (-1.0 – 1.0).

		// A very simple effect: hard-clip the 16-bit audio data.
		output16[i] = MIN(clip_value, MAX(-clip_value, sample));
	}
}
else {
	int8_t* output8 = (int8_t*)output;
	int8_t* samples8 = (int8_t*)sound_data;
	const int8_t clip_value = 50;

	for(int i = 0; i < sample_count; i++) {
		const int8_t sample = samples8[i]; // Grab the current 8-bit sample.
		const float sample_f = (float)sample / 127.0f; // Not used but as above just an example if you want to work with a normalized float (-1.0–1.0)

		// The same hard clip applied to 8-bit audio data.
		output8[i] = MIN(clip_value, MAX(-clip_value, sample));
	}
}

// Wrap our processed data in an AudioSample again.
// The assumption here is that the sound format, sample rate, and data length haven't changed.
// If they have, you'll want to pass in those values.
AudioSample* new_sample = pd->sound->sample->newSampleFromData(output, sound_format, sound_sample_rate, sound_data_length);

// You do not need to free the output buffer, as the AudioSample has now taken ownership and will free it when it is freed.

// And if you wanted to push this new AudioSample back to the lua runtime.
pd->lua->pushObject(new_sample, "playdate.sound.sample", 0);
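
As a rough sketch of the stereo sub-loop mentioned in the comments above, here is the 16-bit hard clip with an inner loop over channels. It assumes stereo data is interleaved (L, R, L, R, ...), as "side by side" suggests. hard_clip_int16 is a hypothetical helper, not an SDK function, and it takes plain buffers so it can be tried off-device.

```c
#include <stdint.h>

// Hypothetical helper: hard-clip interleaved 16-bit PCM.
// Assumes stereo data is laid out L, R, L, R, ... one frame after another,
// and that clip_value is not negative.
static void hard_clip_int16(const int16_t *in, int16_t *out,
                            int frame_count, int channels, int16_t clip_value) {
    for (int frame = 0; frame < frame_count; frame++) {
        // Sub-loop over the samples that make up one frame.
        for (int ch = 0; ch < channels; ch++) {
            const int idx = frame * channels + ch;
            int16_t s = in[idx];
            if (s > clip_value) s = clip_value;
            if (s < -clip_value) s = (int16_t)-clip_value;
            out[idx] = s;
        }
    }
}
```

With the variables from the example above, a call might look like hard_clip_int16((int16_t*)sound_data, (int16_t*)output, sample_count, sound_channels, 5000), where sample_count is really a frame count.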

dustin · August 8, 2022, 5:06pm

Using the example above to grab samples from an AudioSample, here is how to reverse those samples so the audio plays backwards.

const int bytes_per_frame = SoundFormat_bytesPerFrame(sound_format);
const int sample_count = sound_data_length / bytes_per_frame;

// Note that because this loop simply reverses whole frames, we do not need to know the
// sample size or channel count. We just copy each frame to its mirrored position in
// the output buffer. This loop should handle 8-bit and 16-bit plus mono and stereo...
// I say "should" because I have not tested it. ;)
int8_t* output = malloc(sound_data_length);
for(int i = 0; i < sample_count; i++) {
	// Indexing by frame avoids unsigned underflow when the data is shorter than one frame.
	memcpy(output + (sample_count - 1 - i) * bytes_per_frame, sound_data + i * bytes_per_frame, bytes_per_frame);
}

AudioSample* new_sample = pd->sound->sample->newSampleFromData((uint8_t*)output, sound_format, sound_sample_rate, sound_data_length);
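
If a second buffer is not wanted, the same reversal can be done in place by swapping frames from both ends. This is a sketch of a different technique from the copy above; reverse_frames_in_place is a hypothetical helper. Note that running it on the pointer returned by getData would presumably change the original sample as well.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical helper: reverse audio frame by frame, in place.
// bytes_per_frame is 1, 2, or 4 for the formats discussed in this thread.
static void reverse_frames_in_place(uint8_t *data, uint32_t data_length,
                                    uint32_t bytes_per_frame) {
    if (bytes_per_frame == 0 || bytes_per_frame > 4) return;

    const uint32_t frames = data_length / bytes_per_frame;
    if (frames < 2) return; // nothing to swap

    uint8_t tmp[4]; // large enough for a 16-bit stereo frame
    uint32_t lo = 0;
    uint32_t hi = frames - 1;
    while (lo < hi) {
        uint8_t *a = data + lo * bytes_per_frame;
        uint8_t *b = data + hi * bytes_per_frame;
        // Swap whole frames so the L/R order inside each frame is preserved.
        memcpy(tmp, a, bytes_per_frame);
        memcpy(a, b, bytes_per_frame);
        memcpy(b, tmp, bytes_per_frame);
        lo++;
        hi--;
    }
}
```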

dustin · August 8, 2022, 5:15pm

Adjusting gain of 16-bit audio samples:

#define MIN(A, B) ((A) < (B) ? (A) : (B))
#define MAX(A, B) ((A) > (B) ? (A) : (B))
#define MAX_INT16_F 32767.0f
#define MIN_INT16_F -32767.0f

float gain = 2.0f; // Increase gain on audio by 2x

int16_t* output = (int16_t*)malloc(sound_data_length);
int16_t* samples = (int16_t*)sound_data;
for(int i = 0; i < sample_count; i++) {
	// Because we're casting our sample to float, adjusting it, then casting back to a short,
	// we clamp to the max/min short before writing back to output. The clamp is necessary:
	// converting an out-of-range float to an integer type is undefined behavior in C.
	output[i] = (int16_t)MIN(MAX_INT16_F, MAX(MIN_INT16_F, (float)samples[i] * gain));
}
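
As an alternative to the float version above, here is a sketch of the same gain adjustment done with integer fixed-point math, which avoids the float round trip entirely. The Q8 convention (256 = 1.0x) and the name apply_gain_q8 are my own choices for illustration, not anything from the SDK.

```c
#include <stdint.h>

// Hypothetical helper: apply a Q8 fixed-point gain to one 16-bit sample.
// gain_q8 is the gain times 256, so 256 = 1.0x, 512 = 2.0x, 128 = 0.5x.
// Assumes 0 <= gain_q8 < 65536 so the 32-bit product cannot overflow.
static int16_t apply_gain_q8(int16_t sample, int32_t gain_q8) {
    // Widen to 32 bits before multiplying; division truncates toward zero.
    int32_t scaled = ((int32_t)sample * gain_q8) / 256;

    // Saturate to the same symmetric range used by the macros above.
    if (scaled > 32767) scaled = 32767;
    if (scaled < -32767) scaled = -32767;
    return (int16_t)scaled;
}
```

Inside the loop above this would read output[i] = apply_gain_q8(samples[i], 512); for a 2x gain.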

GuyPerfect · August 9, 2022, 2:05pm

Audio can be used without hooking into the Lua runtime via playdate->sound->addSource(). There is also a version of this function that operates on channel objects, but I won't be covering it at this time. Refer to the Channels section of Inside Playdate with C for more information if you're curious. When not used in the context of a channel, the new source is associated with the "default channel", which is a resource managed by the audio runtime.

addSource() has the following arguments:

SoundSource* playdate->sound->addSource(
    AudioSourceFunction *callback,
    void *context,
    int stereo
);

callback is a pointer to a handler function that will be described in just a moment.
context is any pointer defined by the programmer. This same pointer will be used in calls to the handler callback.
stereo is a boolean value for the number of channels in the source: 0=mono, 1=stereo.
The return value is a pointer to the new SoundSource object.

AudioSourceFunction has the following arguments:

int AudioSourceFunction(
    void *context,
    int16_t *left,
    int16_t *right,
    int len
);

context is the same pointer that was used in the call to addSource(). Every time the handler is called, context will contain this pointer.
left and right are both pointers to sample buffers that receive the output audio. They can be accessed like arrays, such as left[0], left[1] and so on.
len is the number of audio frames to render. This many samples should be stored to left and (if stereo) right every time the handler is called.
The return value indicates whether there is any meaningful audio data: 0=all audio is silent, 1=data in left and right is meaningful.

Question: Is it true that mono sources only need to store to left? I feel like that's the case but I can't say with absolute certainty.

Once addSource() has been called, the runtime will automatically invoke the callback whenever it needs more samples, until playdate->sound->removeSource() has been called for that source object. This happens in a manner similar to the program's update callback, but in a distinct context. The pointer to the PlaydateAPI object is not passed to the audio handler, so if any runtime features (such as access to removeSource()) are needed inside the handler, the program will need to make use of the context argument.

Question: Is it safe to use removeSource() from inside the audio callback? I've been doing this without issue so far, but I don't know if it's guaranteed to continue working in future SDK versions.

The way I recommend to associate state with the context argument is to define a simple structure type. Consider the following:

// Context state for audio callback
typedef struct {
    PlaydateAPI *pd;
    SoundSource *source;
} AudioState;

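// Forward declaration so addSource() can reference the handler defined below
int AudioHandler(void *context, int16_t *left, int16_t *right, int len);

// Create a new audio source with a state context
AudioState *state = pd->system->realloc(NULL, sizeof (AudioState));
state->pd     = pd;
state->source = pd->sound->addSource(&AudioHandler, state, 1);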

// Audio callback routine
int AudioHandler(void *context, int16_t *left, int16_t *right, int len) {

    // Resolve state fields as needed
    AudioState *state = (AudioState *) context;
    // Do not use the audio callback for system tasks:
    //   spend as little time here as possible

    // Process all requested samples
    for (int x = 0; x < len; x++) {
        left [x] = rand(); // White noise
        right[x] = rand(); // Narrowing conversion is automatic, so don't @ me
    }

    // Audio data is meaningful, so return 1
    return 1;
}

Here, a structure type AudioState is defined that points to both the PlaydateAPI object and a SoundSource object. This can be used to store any number of fields applicable to the program. For instance, if a sound effect is to be played and its length in samples is known, the state structure can store the number of remaining samples, allowing removeSource() to be called after the sound effect has finished playing.
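
The remaining-samples idea can be sketched roughly as follows. OneShotState and OneShotHandler are hypothetical names, the handler is written for a stereo source fed from 16-bit mono data, and the SDK types are left out so it can be tried off-device. Rather than calling removeSource() from inside the callback (the open question above), this version only sets a flag for the update callback to act on.

```c
#include <stdint.h>

// Hypothetical context for a one-shot sound effect.
typedef struct {
    const int16_t *data;   // 16-bit mono samples still to be played
    int remaining;         // number of samples not yet delivered
    volatile int finished; // set to 1 once everything has been delivered
} OneShotState;

// Same shape as AudioSourceFunction; intended for a source added with stereo=1.
static int OneShotHandler(void *context, int16_t *left, int16_t *right, int len) {
    OneShotState *state = (OneShotState *)context;

    // Nothing left: report silence and leave cleanup to the update callback.
    if (state->remaining <= 0) {
        state->finished = 1;
        return 0;
    }

    for (int x = 0; x < len; x++) {
        int16_t s = 0; // pad with silence past the end of the effect
        if (state->remaining > 0) {
            s = *state->data++;
            state->remaining--;
        }
        left[x]  = s;
        right[x] = s;
    }

    if (state->remaining == 0) {
        state->finished = 1;
    }
    return 1; // this buffer holds meaningful audio
}
```

The update callback can then check state->finished and call removeSource() there, which keeps the audio callback limited to delivering samples.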

It's important to remember that the audio callback is not intended for anything except delivering audio samples to the runtime. Don't use it for any other purpose: use the program's update callback for everything else.