This is an old revision of the document!
The oscillators are the only interesting aspect of SARAH's design. All of the other elements shown in the signal-flow diagram above (Envelope Generators, LFOs, summing and scaling) are entirely conventional.
SARAH uses two oscillator instances per voice. The oscillators themselves are identically structured, but their settings are independent, i.e., they can be set to produce different waveforms, detuned, and mixed so as to provide at least a basic set of options to create composite timbres, and each oscillator's inherent “harmonic shaping” can also be set up differently, to provide further control. The following explanation (which is just an overview) applies equally to OSC1 and OSC2.
SARAH's oscillators are essentially wave-table based. They simply play out samples from a 1024-element digitized representation of one cycle of the selected waveform—sine, triangle, square, or sawtooth. What is interesting and new is how the 1024-element wave tables are populated.
There is a common sine wave table which is generated once and shared by all oscillator instances (including the LFOs in sine-wave mode). This is adequate, because the sine wave has no higher-order harmonics which need to be suppressed to avoid aliasing.
The triangle, square, and sawtooth wave tables are generated dynamically, as follows:
The interesting stuff happens at step 4, but to understand it we must first discuss step 5: When the oscillator is assigned a note frequency, the note's frequency in cycles per second (Hertz) is divided by the sampling rate in use (typically 44100 Hz) to yield a float
-valued “phase increment” in samples per cycle. The oscillator also has a float
-valued “phase” variable, restricted to the range 0.0 to 1.0, where 0.0 represents the beginning of the cycle and 1.0 represents the end. Each time the oscillator generates a new sample, it multiplies the phase by 1024 and rounds the result, to obtain a wave-table index in the range 0-1023, plucks that sample out of its wave-table and outputs it, then adds the phase increment to the phase, ensuring the result “wraps around” if necessary, so it remains in the range 0.0 to 1.0.
When the phase-increment is exactly 1.0, the oscillator replays its wave-table exactly. At a sampling rate of 44100 Hz, this would happen at an oscillator frequency of 44100/1024 = 43.066 Hz. This is a bit below F1, way down in the bottom octave of the piano range. Even lower notes result in a phase-increment a bit lower than 1.0, in which case, wave-table samples are occasionally repeated, resulting in slight quantization artifacts which are not very noticeable at such low notes.
For all notes above F1—i.e., just about every note you'll ever play—the phase increment will be greater than 1.0, meaning that some wave-table samples will be skipped. Basically, you are trying to replay the basic 43 Hz note at the higher pitch, and so all harmonics of the original tone will be multiplied by the phase increment value. The 512th harmonic of 43.066 Hz is 43.066 x 512 = 22049.79 Hz, just below half the 44100 Hz sampling rate—the so-called Nyquist frequency. Playing F2 means a phase-increment of around 2.0, so all harmonics above the 256th would be above the Nyquist frequency and would be “aliased” to lower frequencies. Up near the top of the piano range, the aliasing of even very low-numbered harmonics (which have substantial energy) will result in very noticeable aliasing artifacts.
In SARAH, this is avoided at step 4 above, by figuring out the highest-numbered harmonic which will still be less than the Nyquist frequency, and setting all higher-numbered harmonic coefficients to zero. When the resulting table is inverse-FFT-transformed, we obtain a version of the original waveform which is almost perfectly band-limited, and plays back at the new rate with no aliasing whatsoever.