GetDunne Wiki

Notes from the desk of Shane Dunne, software development consultant


This is an old revision of the document!


SARAH Oscillator Details

About SARAH's oscillators

The oscillators are the only interesting aspect of SARAH's design. All of the other elements shown in the signal-flow diagram above (Envelope Generators, LFOs, summing and scaling) are entirely conventional.

SARAH uses two oscillator instances per voice. The two are identically structured, but their settings are independent: each can be set to a different waveform, detuned, and mixed, which provides at least a basic set of options for creating composite timbres. Each oscillator's inherent “harmonic shaping” can also be set up differently, to provide further control. The following explanation (which is just an overview) applies equally to OSC1 and OSC2.

SARAH's oscillators are essentially wave-table based. They simply play out samples from a 1024-element digitized representation of one cycle of the selected waveform—sine, triangle, square, or sawtooth. What is interesting and new is how the 1024-element wave tables are populated.

Sine waves are a special, simple case

There is a common sine wave table which is generated once and shared by all oscillator instances (including the LFOs in sine-wave mode). This is adequate, because the sine wave has no higher-order harmonics which need to be suppressed to avoid aliasing.
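A minimal sketch of such a shared table, in plain standard C++. The names kTableSize, makeSineTable, sharedSineTable and sineAt are hypothetical, and the function-local static is just one simple way to get "generated once, shared by all"; SARAH's actual code may do this differently.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

constexpr std::size_t kTableSize = 1024; // table length given in the text

// Compute one full cycle of a sine wave, sampled at kTableSize points.
inline std::array<float, kTableSize> makeSineTable()
{
    std::array<float, kTableSize> table{};
    const double twoPi = 2.0 * std::acos(-1.0);
    for (std::size_t i = 0; i < kTableSize; ++i)
        table[i] = static_cast<float>(
            std::sin(twoPi * static_cast<double>(i) / static_cast<double>(kTableSize)));
    return table;
}

// Hypothetical accessor: the function-local static is built once, on the
// first call, and every caller (audio oscillator or LFO) sees the same table.
inline const std::array<float, kTableSize>& sharedSineTable()
{
    static const std::array<float, kTableSize> table = makeSineTable();
    return table;
}

// Read one element; the index must already be in the range 0-1023.
inline float sineAt(std::size_t index)
{
    assert(index < kTableSize);
    return sharedSineTable()[index];
}
```

Because the table is immutable after construction, any number of oscillator instances can read it concurrently without copying it.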

Reconstructing band-limited triangle, square and sawtooth waves

The triangle, square, and sawtooth wave tables are generated dynamically, as follows:

  1. 1024 samples of each waveform (one cycle) are generated once, using exactly the same mathematical expressions used in the LFOs, resulting in mathematically “exact” waveforms having 512 harmonics.
  2. Each mathematically-exact waveform is transformed using juce::dsp::FFT to produce a frequency-domain representation: a new 1024-element array whose elements (“coefficients”) together represent the relative amplitudes and phases of all 512 harmonics. (The mathematics of the FFT are such that each element is a complex number having real and imaginary components, and each harmonic other than the 0th and 512th is represented twice, for positive and negative frequencies.)
  3. After the initial forward FFT operations, one copy of each of the resulting three complex, frequency-domain arrays (one each for triangle, square, and sawtooth waveforms) is kept in memory, shared by all oscillator instances.
  4. Each oscillator instance has its own 1024-element complex array. In preparation to sound a note, it copies the coefficient data out from the appropriate common frequency-domain table to its own array, then performs an in-place inverse FFT to re-create the appropriate time-domain wave table.
  5. To sound a note, the oscillator resamples its own 1024-element array (wave table).
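Steps 1 to 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not SARAH's code: a small hand-written radix-2 FFT stands in for juce::dsp::FFT so the example needs no JUCE, the waveform formulas are guesses (the text does not give the LFO expressions), and all names (Shape, fftInPlace, makeExactCycle, makeSpectrum, makeWaveTable) are hypothetical. The band-limiting part of step 4 is left out here.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using Complex = std::complex<double>;
enum class Shape { Triangle, Square, Sawtooth };

// Plain radix-2 FFT (stand-in for juce::dsp::FFT). The size must be a power
// of two. The inverse transform also applies the 1/N scaling.
inline void fftInPlace(std::vector<Complex>& a, bool inverse)
{
    const std::size_t n = a.size();
    assert(n != 0 && (n & (n - 1)) == 0);

    // Bit-reversal permutation.
    for (std::size_t i = 1, j = 0; i < n; ++i)
    {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1)
            j ^= bit;
        j ^= bit;
        if (i < j)
            std::swap(a[i], a[j]);
    }

    // Butterfly passes.
    const double pi = std::acos(-1.0);
    for (std::size_t len = 2; len <= n; len <<= 1)
    {
        const double angle = (inverse ? 2.0 : -2.0) * pi / static_cast<double>(len);
        const Complex wLen(std::cos(angle), std::sin(angle));
        for (std::size_t start = 0; start < n; start += len)
        {
            Complex w(1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k)
            {
                const Complex u = a[start + k];
                const Complex v = a[start + k + len / 2] * w;
                a[start + k] = u + v;
                a[start + k + len / 2] = u - v;
                w *= wLen;
            }
        }
    }

    if (inverse)
        for (Complex& x : a)
            x /= static_cast<double>(n);
}

// Step 1: one cycle of the "exact" waveform (assumed formulas).
inline std::vector<Complex> makeExactCycle(Shape shape, std::size_t size)
{
    std::vector<Complex> cycle(size);
    for (std::size_t i = 0; i < size; ++i)
    {
        const double phase = static_cast<double>(i) / static_cast<double>(size);
        double value = 0.0;
        switch (shape)
        {
            case Shape::Triangle: value = 1.0 - 4.0 * std::fabs(phase - 0.5); break;
            case Shape::Square:   value = phase < 0.5 ? 1.0 : -1.0;           break;
            case Shape::Sawtooth: value = 2.0 * phase - 1.0;                  break;
        }
        cycle[i] = Complex(value, 0.0);
    }
    return cycle;
}

// Steps 2-3: forward-transform once; the result is the shared spectrum.
inline std::vector<Complex> makeSpectrum(Shape shape, std::size_t size = 1024)
{
    std::vector<Complex> spectrum = makeExactCycle(shape, size);
    fftInPlace(spectrum, false);
    return spectrum;
}

// Step 4 (without band-limiting): copy the shared spectrum into a private
// array, inverse-transform in place, and keep the real part as the wave table.
inline std::vector<float> makeWaveTable(const std::vector<Complex>& sharedSpectrum)
{
    std::vector<Complex> work = sharedSpectrum;
    fftInPlace(work, true);
    std::vector<float> table(work.size());
    for (std::size_t i = 0; i < work.size(); ++i)
        table[i] = static_cast<float>(work[i].real());
    return table;
}
```

With nothing zeroed, makeWaveTable(makeSpectrum(Shape::Sawtooth)) simply reproduces the original sawtooth cycle; the point of keeping the spectrum around is that coefficients can be altered between the copy and the inverse transform.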

Avoiding aliasing by reconstructing band-limited wave tables

The interesting stuff happens at step 4, but to understand it we must first discuss step 5. When the oscillator is assigned a note frequency, the note's frequency in cycles per second (Hertz) is divided by the sampling rate in use (typically 44100 Hz) to yield a float-valued “phase increment” in cycles per sample. The oscillator also has a float-valued “phase” variable, restricted to the range 0.0 to 1.0, where 0.0 represents the beginning of the cycle and 1.0 represents the end. Each time the oscillator generates a new sample, it multiplies the phase by 1024 and rounds the result to obtain a wave-table index in the range 0-1023, plucks that sample out of its wave-table and outputs it, then adds the phase increment to the phase, ensuring the result “wraps around” if necessary so that it remains in the range 0.0 to 1.0. Each output sample therefore advances the wave-table index by 1024 times the phase increment.
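The playback loop described here can be sketched as below. TableOscillator and its members are hypothetical names, double is used rather than float purely to keep the arithmetic in the example exact, and the modulo on the rounded index is an assumption added so that a phase just below 1.0 cannot round up to index 1024.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a phase-accumulator wave-table oscillator.
struct TableOscillator
{
    std::vector<float> table;    // one cycle of the waveform, e.g. 1024 elements
    double phase = 0.0;          // position within the cycle, 0.0 <= phase < 1.0
    double phaseIncrement = 0.0; // cycles advanced per output sample

    void setFrequency(double frequencyHz, double sampleRateHz)
    {
        assert(sampleRateHz > 0.0);
        phaseIncrement = frequencyHz / sampleRateHz;
    }

    float nextSample()
    {
        assert(!table.empty());
        const std::size_t size = table.size();

        // Scale the phase to table units and round to the nearest element.
        // The modulo (an assumption) wraps a rounded-up index back to 0.
        const std::size_t index =
            static_cast<std::size_t>(std::lround(phase * static_cast<double>(size))) % size;
        const float out = table[index];

        // Advance and wrap so the phase stays in the range 0.0 to 1.0.
        phase += phaseIncrement;
        phase -= std::floor(phase);
        return out;
    }
};
```

At 44100/1024 Hz and a 44100 Hz sampling rate, this reads table elements 0, 1, 2, ... in order; at twice that frequency it reads every second element.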

When the wave-table index advances by exactly 1.0 per output sample (a phase increment of 1/1024), the oscillator replays its wave-table exactly. At a sampling rate of 44100 Hz, this happens at an oscillator frequency of 44100/1024 = 43.066 Hz. This is a bit below F1, way down in the bottom octave of the piano range. Even lower notes advance the index by a bit less than 1.0 per sample, in which case wave-table samples are occasionally repeated, resulting in slight quantization artifacts which are not very noticeable at such low notes.

For all notes above F1 (i.e., just about every note you'll ever play) the wave-table index will advance by more than 1.0 per output sample, meaning that some wave-table samples will be skipped. Basically, you are trying to replay the basic 43 Hz note at the higher pitch, and so the frequencies of all harmonics of the original tone are multiplied by that same step size. The 512th harmonic of the basic tone is (44100/1024) x 512 = 22050 Hz, exactly half the 44100 Hz sampling rate, the so-called Nyquist frequency. Playing F2 means a step of around 2.0, so all harmonics above roughly the 256th would be above the Nyquist frequency and would be “aliased” to lower frequencies. Up near the top of the piano range, even very low-numbered harmonics (which have substantial energy) would be aliased, resulting in very noticeable artifacts.
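To make "aliased to lower frequencies" concrete, here is a small helper (hypothetical, not taken from SARAH) that computes where a sinusoid of a given frequency ends up after sampling, using the usual fold-around-Nyquist rule.

```cpp
#include <cassert>
#include <cmath>

// Frequency at which a sinusoid of frequencyHz appears once sampled at
// sampleRateHz. Components at or below the Nyquist frequency are unchanged;
// anything higher is folded back into the range 0 to sampleRateHz / 2.
inline double aliasedFrequency(double frequencyHz, double sampleRateHz)
{
    assert(frequencyHz >= 0.0 && sampleRateHz > 0.0);
    const double folded = std::fmod(frequencyHz, sampleRateHz); // 0 <= folded < sampleRateHz
    return folded <= sampleRateHz / 2.0 ? folded : sampleRateHz - folded;
}
```

For example, for a tone at twice the basic frequency (2 x 44100/1024 = 86.13 Hz, close to F2), the 300th harmonic would sit at about 25839.8 Hz, and aliasedFrequency(25839.84375, 44100.0) gives about 18260.2 Hz: a frequency that is not a harmonic of the note at all, which is why it sounds wrong.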

In SARAH, this is avoided at step 4 above, by working out the highest-numbered harmonic which will still be below the Nyquist frequency at the note's pitch, and setting the coefficients of all higher-numbered harmonics to zero. When the resulting table is inverse-FFT-transformed, we obtain a version of the original waveform which is almost perfectly band-limited, so that none of its remaining harmonics can alias when it is played back at the new rate.
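The zeroing step might look something like the sketch below. Both function names are hypothetical, and the bin layout assumed is the standard complex-FFT one, in which harmonic k occupies bins k and N - k; whether SARAH computes the cutoff in exactly this way is not stated in the text.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Largest harmonic number h for which h * noteHz is strictly below the
// Nyquist frequency, capped at tableSize / 2 (the most the table can hold).
inline std::size_t highestHarmonicBelowNyquist(double noteHz, double sampleRateHz,
                                               std::size_t tableSize)
{
    assert(noteHz > 0.0 && sampleRateHz > 0.0);
    const double nyquist = sampleRateHz / 2.0;
    const double ratio = std::floor(nyquist / noteHz);
    const double cap = static_cast<double>(tableSize / 2);
    std::size_t h = static_cast<std::size_t>(ratio < cap ? ratio : cap);
    if (h > 0 && static_cast<double>(h) * noteHz >= nyquist)
        --h; // a harmonic landing exactly on Nyquist is dropped too
    return h;
}

// Zero every coefficient whose harmonic number exceeds maxHarmonic. Bin k and
// bin N - k carry the positive- and negative-frequency halves of harmonic k,
// so the cleared range runs from maxHarmonic + 1 to N - maxHarmonic - 1.
inline void zeroHarmonicsAbove(std::vector<Complex>& spectrum, std::size_t maxHarmonic)
{
    const std::size_t n = spectrum.size();
    for (std::size_t k = maxHarmonic + 1; k + maxHarmonic < n; ++k)
        spectrum[k] = Complex(0.0, 0.0);
}
```

For A4 (440 Hz) at 44100 Hz this keeps harmonics 1 to 50, since 50 x 440 = 22000 Hz is below 22050 Hz and 51 x 440 = 22440 Hz is not. The oscillator would apply zeroHarmonicsAbove to its private copy of the spectrum and then run the inverse FFT on it.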

sarah_oscillator_details.1504649600.txt.gz · Last modified: 2017/09/05 22:13 by shane