====== SARAH Oscillator Details ====== The oscillators are the only interesting aspect of SARAH's design. All of the other elements shown in the [[sarah|signal-flow diagram]] (Envelope Generators, LFOs, summing and scaling) are entirely conventional. SARAH uses two oscillator instances per voice. The oscillators themselves are identically structured, but their settings are independent, i.e., they can be set to produce different waveforms, detuned, and mixed so as to provide at least a basic set of options to create composite timbres, and each oscillator's inherent "harmonic shaping" can also be set up differently, to provide further control. The following explanation (which is just an overview) applies equally to OSC1 and OSC2. ===== About SARAH's oscillators ===== SARAH's oscillators are essentially wave-table based. They simply play out samples from a 1024-element digitized representation of one cycle of the selected waveform---sine, triangle, square, or sawtooth. What is interesting and new is how the 1024-element wave tables are populated. ===== Sine waves are a special, simple case ===== There is a common sine wave table which is generated once and shared by all oscillator instances (including the LFOs in sine-wave mode). This is adequate, because the sine wave has no higher-order harmonics which need to be suppressed to avoid aliasing. ===== Reconstructing band-limited triangle, square and sawtooth waves ===== The triangle, square, and sawtooth wave tables are generated dynamically, as follows: - 1024 samples of each waveform (one cycle) are generated once, using exactly the same mathematical expressions used in the LFOs, resulting in mathematically "exact" waveforms having 512 harmonics. - Each mathematically-exact waveform is transformed using //juce::dsp::FFT// to produce a frequency-domain representation, which is a new 1024-element array, where each element ("coefficient") represents the relative amplitude and phase of all 512 harmonics. (The mathematics of the FFT are such that each element is a [[wp>Complex_number|complex number]] having real and imaginary components, and each harmonic other than the 0th and 512th is represented twice, for positive and negative frequencies.) - After the initial //forward// FFT operations, one copy of each of the resulting three complex, frequency-domain arrays (one each for triangle, square, and sawtooth waveforms) is kept in memory, shared by all oscillator instances. - Each oscillator instance has its own 1024-element complex array. In preparation to sound a note, it copies the coefficient data out from the appropriate common frequency-domain table to its own array, then performs an //in-place **inverse** FFT// to re-create the appropriate time-domain wave table. - To sound a note, the oscillator resamples its own 1024-element array (wave table). ===== Avoiding aliasing by reconstructing band-limited wave tables ===== The interesting stuff happens at step 4, but to understand it we must first discuss step 5: When the oscillator is assigned a note frequency, the note's frequency in cycles per second (Hertz) is divided by the sampling rate in use (typically 44100 Hz) to yield a ''float''-valued "phase increment" in //samples per cycle//. The oscillator also has a ''float''-valued "phase" variable, restricted to the range 0.0 to 1.0, where 0.0 represents the beginning of the cycle and 1.0 represents the end. Each time the oscillator generates a new sample, it multiplies the phase by 1024 and rounds the result, to obtain a wave-table index in the range 0-1023, plucks that sample out of its wave-table and outputs it, then adds the phase increment to the phase, ensuring the result "wraps around" if necessary, so it remains in the range 0.0 to 1.0. When the phase-increment is exactly 1.0, the oscillator replays its wave-table exactly. At a sampling rate of 44100 Hz, this would happen at an oscillator frequency of 44100/1024 = 43.066 Hz. This is a bit below F1, way down in the bottom octave of the piano range. Even lower notes result in a phase-increment a bit lower than 1.0, in which case, wave-table samples are occasionally repeated, resulting in slight //quantization artifacts// which are not very noticeable at such low notes. For all notes above F1---i.e., just about every note you'll ever play---the phase increment will be greater than 1.0, meaning that some wave-table samples will be skipped. Basically, you are trying to replay the basic 43 Hz note at the higher pitch, and so //all harmonics// of the original tone will be multiplied by the phase increment value. The 512th harmonic of 43.066 Hz is 43.066 x 512 = 22049.79 Hz, just below half the 44100 Hz sampling rate---the so-called //Nyquist frequency//. Playing F2 means a phase-increment of around 2.0, so all harmonics above the 256th would be above the Nyquist frequency and would be "aliased" to lower frequencies. Up near the top of the piano range, the aliasing of even very low-numbered harmonics (which have substantial energy) will result in very noticeable aliasing artifacts. In SARAH, this is avoided at step 4 above, by figuring out the highest-numbered harmonic which will still be less than the Nyquist frequency, and setting all higher-numbered harmonic coefficients to zero. When the resulting table is inverse-FFT-transformed, we obtain a version of the original waveform which is almost perfectly //band-limited//, and plays back at the new rate with no aliasing whatsoever. ===== Format of frequency-domain coefficient tables ===== In order to write the code for step 4, it's critical to understand the format of data produced by //juce::dsp::FFT// after a forward FFT operation. This is not published, and it's possible that some FFT implementations may differ (the JUCE code is structured so as to support multiple implementations in future), but I wrote my initial code based on my experience working with other FFT code, and my guess turned out to be correct. **I'm not going to give a full explanation of discrete Fourier transforms and the FFT here.** Refer to [[wp>Discrete_Fourier_transform|Wikipedia]] or (preferably) any good Digital Signal Processing textbook to learn more. The following description concentrates entirely on pragmatic details relevant to C++ coding. //juce::dsp::FFT// requires arrays that are exactly a power of 2 in size. The specific power is referred to as the //order// of the FFT, and must be specified when declaring a //juce::dsp::FFT// object. In SARAH, we use order 10, implying arrays with 2^10 = 1024 data points. The Fourier Transform is defined only for Complex Numbers, which are ordered pairs (re, im) consisting of a //real// and an //imaginary// component. These are represented in C++ as something like struct Complex { float re; float im; }; such that ''sizeof(Complex)'' is two times ''sizeof(float)''. However, //juce::dsp::FFT// itself is declared so that all array arguments are treated as ''float[]'', and the documentation states that all arrays must be twice the size given by the //order//, e.g., ''float[2048]'' for order 10. This is because, internally, each ''float[2048]'' array is interpreted as a ''Complex[1024]'' array. After a forward FFT operation, the frequency-domain data are again in a ''Complex[1024]'' array, arranged as follows: * Element 0 represents (gives the coefficient of) the 0th harmonic, aka DC component * Element 1 represents the 1st harmonic, aka fundamental * Element 2 represents the 2nd harmonic (1 octave above fundamental) * Element 3 represents the 3rd harmonic (1 octave plus a fifth above fundamental) * ... * Element 511 represents the 511th harmonic * Element 512 represents the 512th harmonic * Element 513 represents the 511th harmonic * ... * Element 1022 represents the 2nd harmonic * Element 1023 represents the 1st harmonic (fundamental) //Why are the 1st through 511th harmonics represented twice?// This is because the Fourier Transform works in terms of both //positive// and //negative frequencies//, where a negative frequency is interpreted as the Nyquist frequency minus the absolute frequency value. (This is related to the phenomenon of aliasing. When a positive frequency is beyond the Nyqyist frequency, it is "aliased" to the corresponding negative frequency. The human ear hears this negative frequency just the same as the corresponding positive one.) The 0th harmonic, whose frequency is zero Hz, is its own negative. Owing to the arcane mathematics of the FFT, so is the 512th harmonic. ===== The resulting code ===== The result of the above analysis---which mercifully seems to be correct for the default //juce::dsp::FFT// implementations on both Windows and Macintosh---is that the code for reconstructing an anti-aliased and harmonic-shaped time-domain wave table is as follows: void SynthOscillator::setFilterCutoffDelta(float fcDelta) { if (waveform.isSine()) return; zeromem(waveTable, sizeof(waveTable)); int maxHarmonicToRetain = int(1.0 / (2.0 * phaseDelta)); if (maxHarmonicToRetain >= fftSize / 2) maxHarmonicToRetain = fftSize / 2; float cutoffHarmonic = (filterCutoff + fcDelta) * fftSize / 2; for (int ip = 1; ip < maxHarmonicToRetain; ip++) { int in = fftSize - ip; float* fftBuffer = SynthOscillatorBase::getFourierTable(waveform); memcpy(waveTable + ip, fftBuffer + ip, 2 * sizeof(float)); // positive frequency coefficient memcpy(waveTable + in, fftBuffer + in, 2 * sizeof(float)); // negative frequency coefficient float fv = 1.0f; if (ip > int(cutoffHarmonic)) fv = std::pow(ip / cutoffHarmonic, -filterSlope / 6.0f); waveTable[ip + 0] *= fv; waveTable[ip + 1] *= fv; waveTable[in + 0] *= fv; waveTable[in + 1] *= fv; } inverseFFT->performRealOnlyInverseTransform(waveTable); } This code works as follows: * If the selected waveform is sine, nothing need be done; the common sine wave table will be used instead of this oscillator's ''waveTable[]'' array. * ''waveTable[]'' is cleared to all zeroes, and we compute the index of the maximum harmonic to retain, as the reciprocal of twice the //phase delta// (oscillator frequency in cycles per sample) * We then compute ''cutoffHarmonic'' which represents the cutoff frequency of our simulated low-pass filter. This is based on the fixed ''filterCutoff'' parameter and a dynamic ''fcDelta'' which is the sum of the shape-LFO and scaled shape-EG outputs. (Both these quantities have the range 0.0 to 1.0.) * The ''for'' loop runs through the harmonic indices 1 through ''maxHarmonicToRetain'', copies both the positive and negative-frequency coefficients out of the master table (a pointer to which is returned by //SynthOscillatorBase::getFourierTable(waveform)//, and then scales the resulting copies by a factor ''fv'' computed by an expression which models the response curve of a simple low-pass filter with a given cutoff frequency (expressed in cycles per sample) and slope (expressed in dB per octave). * Finally, we request an inverse FFT on the ''waveTable[]'' array. The astute reader will have noted that the ''for'' loop skips the 0th harmonic. The 0th harmonic of an digitized AC signal represents the //DC component// (net offset from zero) which is neither audible nor desirable in a digital signal processing system. Ensuring that the 0th-harmonic coefficient is always zero results in output waveforms which are guaranteed to be symmetric about zero---yet another free gift from the FFT. ===== "Real-Only" FFT ===== As I said earlier, the Fourier Transform is defined over the Complex domain. An ordinary sequence of time-domain samples is "real-only"---corresponding to Complex numbers having zero imaginary components. After a forward FFT operation, the Complex frequency components will be such that for each //positive-harmonic// coefficient //(re, im)//, the corresponding //negative-harmonic// coefficient will be //(re, -im)//; they are so-called //Complex conjugates//. These special properties can be utilized to define "real-only" variants of the forward and inverse FFT algorithms which use less CPU than the "full Complex" version, and the //juce::dsp::FFT// module includes these. Any CPU efficiency is a bonus, but for practical purposes, what you need to know is that the "real-only" FFTs interpret the time-domain data not as an array of 1024 ''Complex'' values (a real ''float'' followed by the corresponding imaginary ''float''), but rather as an array of 1024 real ''float''s followed by an array of 1024 imaginary ''float''s (which are all zeros). Hence, even though the array is declared as size 2048, we simply ignore the second half, and use only the first 1024 ''float''s. //SynthOscillator::getSample()// is simply float SynthOscillator::getSample() { int sampleIndex = int(phase * fftSize + 0.5f); if (sampleIndex >= fftSize) sampleIndex = 0; phase += phaseDelta; while (phase > 1.0) phase -= 1.0; if (waveform.isSine()) return sineTable[sampleIndex]; else return waveTable[sampleIndex]; }