GetDunne Wiki

Notes from the desk of Shane Dunne, software development consultant

Differences

This shows you the differences between two versions of the page.

sarah_oscillator_details [2017/09/05 23:33]
shane [The resulting code]
sarah_oscillator_details [2017/09/05 23:42] (current)
shane [The resulting code]
Line 1: Line 1:
 ====== SARAH Oscillator Details ====== ====== SARAH Oscillator Details ======
  
-===== About SARAH's oscillators ===== +The oscillators are the only interesting aspect of SARAH's design. All of the other elements shown in the [[sarah|signal-flow diagram]] (Envelope Generators, LFOs, summing and scaling) are entirely conventional.
-The oscillators are the only interesting aspect of SARAH's design. All of the other elements shown in the signal-flow diagram above (Envelope Generators, LFOs, summing and scaling) are entirely conventional.+
  
 SARAH uses two oscillator instances per voice. The oscillators themselves are identically structured, but their settings are independent, i.e., they can be set to produce different waveforms, detuned, and mixed so as to provide at least a basic set of options to create composite timbres, and each oscillator's inherent "harmonic shaping" can also be set up differently, to provide further control. The following explanation (which is just an overview) applies equally to OSC1 and OSC2. SARAH uses two oscillator instances per voice. The oscillators themselves are identically structured, but their settings are independent, i.e., they can be set to produce different waveforms, detuned, and mixed so as to provide at least a basic set of options to create composite timbres, and each oscillator's inherent "harmonic shaping" can also be set up differently, to provide further control. The following explanation (which is just an overview) applies equally to OSC1 and OSC2.
  
 +===== About SARAH's oscillators =====
 SARAH's oscillators are essentially wave-table based. They simply play out samples from a 1024-element digitized representation of one cycle of the selected waveform---sine, triangle, square, or sawtooth. What is interesting and new is how the 1024-element wave tables are populated. SARAH's oscillators are essentially wave-table based. They simply play out samples from a 1024-element digitized representation of one cycle of the selected waveform---sine, triangle, square, or sawtooth. What is interesting and new is how the 1024-element wave tables are populated.
  
-==== Sine waves are a special, simple case ====+===== Sine waves are a special, simple case =====
  
 There is a common sine wave table which is generated once and shared by all oscillator instances (including the LFOs in sine-wave mode). This is adequate, because the sine wave has no higher-order harmonics which need to be suppressed to avoid aliasing. There is a common sine wave table which is generated once and shared by all oscillator instances (including the LFOs in sine-wave mode). This is adequate, because the sine wave has no higher-order harmonics which need to be suppressed to avoid aliasing.
  
-==== Reconstructing band-limited triangle, square and sawtooth waves ====+===== Reconstructing band-limited triangle, square and sawtooth waves =====
  
 The triangle, square, and sawtooth wave tables are generated dynamically, as follows: The triangle, square, and sawtooth wave tables are generated dynamically, as follows:
Line 21: Line 21:
   - To sound a note, the oscillator resamples its own 1024-element array (wave table).   - To sound a note, the oscillator resamples its own 1024-element array (wave table).
  
-==== Avoiding aliasing by reconstructing band-limited wave tables ====+===== Avoiding aliasing by reconstructing band-limited wave tables =====
  
 The interesting stuff happens at step 4, but to understand it we must first discuss step 5: When the oscillator is assigned a note frequency, the note's frequency in cycles per second (Hertz) is divided by the sampling rate in use (typically 44100 Hz) to yield a ''float''-valued "phase increment" in //cycles per sample//. The oscillator also has a ''float''-valued "phase" variable, restricted to the range 0.0 to 1.0, where 0.0 represents the beginning of the cycle and 1.0 represents the end. Each time the oscillator generates a new sample, it multiplies the phase by 1024 and rounds the result, to obtain a wave-table index in the range 0-1023, plucks that sample out of its wave-table and outputs it, then adds the phase increment to the phase, ensuring the result "wraps around" if necessary, so it remains in the range 0.0 to 1.0. The interesting stuff happens at step 4, but to understand it we must first discuss step 5: When the oscillator is assigned a note frequency, the note's frequency in cycles per second (Hertz) is divided by the sampling rate in use (typically 44100 Hz) to yield a ''float''-valued "phase increment" in //cycles per sample//. The oscillator also has a ''float''-valued "phase" variable, restricted to the range 0.0 to 1.0, where 0.0 represents the beginning of the cycle and 1.0 represents the end. Each time the oscillator generates a new sample, it multiplies the phase by 1024 and rounds the result, to obtain a wave-table index in the range 0-1023, plucks that sample out of its wave-table and outputs it, then adds the phase increment to the phase, ensuring the result "wraps around" if necessary, so it remains in the range 0.0 to 1.0.
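
The playback loop described in this paragraph can be sketched roughly as follows. This is only an illustration, not SARAH's actual source: the names ''PhaseOscillator'', ''kTableSize'' and ''phaseDelta'' are hypothetical, and the sketch truncates the scaled phase instead of rounding it (an assumption, not necessarily what SARAH does), because rounding a phase just below 1.0 would yield index 1024, one past the end of the table.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical names for illustration; not taken from SARAH's source.
constexpr std::size_t kTableSize = 1024;

struct PhaseOscillator
{
    std::array<float, kTableSize> table{};  // one cycle of the waveform
    float phase = 0.0f;                     // position in the cycle, 0.0 <= phase < 1.0
    float phaseDelta = 0.0f;                // cycles advanced per output sample

    void setFrequency(float frequencyHz, float sampleRateHz)
    {
        assert(sampleRateHz > 0.0f);
        phaseDelta = frequencyHz / sampleRateHz;  // (cycles/sec) / (samples/sec)
    }

    float getSample()
    {
        // Truncate (assumption: the text says "rounds") and guard with a
        // modulo so the index always stays within 0..kTableSize-1.
        const std::size_t index =
            static_cast<std::size_t>(phase * static_cast<float>(kTableSize)) % kTableSize;
        const float sample = table[index];

        phase += phaseDelta;
        phase -= std::floor(phase);  // wrap around into the range 0.0 to 1.0
        return sample;
    }
};
```

At 44100 Hz, a note of 11025 Hz gives a phase increment of exactly 0.25, so successive calls to ''getSample()'' read table elements 0, 256, 512, 768 and then 0 again.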
Line 31: Line 31:
 In SARAH, this is avoided at step 4 above, by figuring out the highest-numbered harmonic which will still be less than the Nyquist frequency, and setting all higher-numbered harmonic coefficients to zero. When the resulting table is inverse-FFT-transformed, we obtain a version of the original waveform which is almost perfectly //band-limited//, and plays back at the new rate with no aliasing whatsoever. In SARAH, this is avoided at step 4 above, by figuring out the highest-numbered harmonic which will still be less than the Nyquist frequency, and setting all higher-numbered harmonic coefficients to zero. When the resulting table is inverse-FFT-transformed, we obtain a version of the original waveform which is almost perfectly //band-limited//, and plays back at the new rate with no aliasing whatsoever.
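
As a minimal sketch of "figuring out the highest-numbered harmonic which will still be less than the Nyquist frequency": harmonic k of a note sits at k times the note frequency, so we want the largest k for which that product stays strictly below half the sampling rate. The function name ''highestAllowedHarmonic'' and the ''maxHarmonic'' parameter (the most harmonics the table can hold) are hypothetical, not SARAH's own.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: largest harmonic number k such that
// k * noteHz is strictly below the Nyquist frequency, clamped to
// the number of harmonics the coefficient table can represent.
int highestAllowedHarmonic(double noteHz, double sampleRateHz, int maxHarmonic)
{
    assert(noteHz > 0.0 && sampleRateHz > 0.0);
    const double nyquistHz = 0.5 * sampleRateHz;

    // ceil(x) - 1 is the largest integer strictly less than x.
    int highest = static_cast<int>(std::ceil(nyquistHz / noteHz)) - 1;

    if (highest > maxHarmonic) highest = maxHarmonic;
    if (highest < 0) highest = 0;
    return highest;
}
```

For example, at a 44100 Hz sampling rate an A-440 note keeps harmonics 1 through 50 (50 x 440 = 22000 Hz, just under the 22050 Hz Nyquist frequency), and every higher-numbered coefficient would be set to zero.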
  
-==== Format of frequency-domain coefficient tables ====+===== Format of frequency-domain coefficient tables =====
 In order to write the code for step 4, it's critical to understand the format of data produced by //juce::dsp::FFT// after a forward FFT operation. This is not published, and it's possible that some FFT implementations may differ (the JUCE code is structured so as to support multiple implementations in future), but I wrote my initial code based on my experience working with other FFT code, and my guess turned out to be correct. In order to write the code for step 4, it's critical to understand the format of data produced by //juce::dsp::FFT// after a forward FFT operation. This is not published, and it's possible that some FFT implementations may differ (the JUCE code is structured so as to support multiple implementations in future), but I wrote my initial code based on my experience working with other FFT code, and my guess turned out to be correct.
  
Line 55: Line 55:
   * Element 1023 represents the 1st harmonic (fundamental)   * Element 1023 represents the 1st harmonic (fundamental)
  
-//Why are the 1st through 511th harmonics represented twice?// This is because the Fourier Transform works in terms of both //positive// and //negative frequencies//, where a negative frequency is interpreted as the sampling rate minus the absolute frequency value. (This is related to the phenomenon of aliasing. When a positive frequency is beyond the Nyquist frequency, it is "aliased" to the corresponding negative frequency. The human ear hears this negative frequency just the same as the corresponding positive one.) The 0th harmonic, whose frequency is zero Hz, is its own negative. So is the 512th harmonic, owing to the arcane mathematics of the FFT.+//Why are the 1st through 511th harmonics represented twice?// This is because the Fourier Transform works in terms of both //positive// and //negative frequencies//, where a negative frequency is interpreted as the sampling rate minus the absolute frequency value. (This is related to the phenomenon of aliasing. When a positive frequency is beyond the Nyquist frequency, it is "aliased" to the corresponding negative frequency. The human ear hears this negative frequency just the same as the corresponding positive one.) The 0th harmonic, whose frequency is zero Hz, is its own negative. Owing to the arcane mathematics of the FFT, so is the 512th harmonic.
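
Because each harmonic appears twice, any code that zeroes a harmonic has to clear both copies. The sketch below assumes the mirrored layout described above (harmonic k at element k and again at element N-k, so that with N = 1024 the 1st harmonic is also at element 1023). It uses a plain ''std::vector'' of ''std::complex<float>'' rather than the raw ''float'' array that //juce::dsp::FFT// works on, and the function name ''zeroHarmonicsAbove'' is hypothetical.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: clear every harmonic numbered above highestHarmonic,
// in both its positive-frequency slot (k) and its mirrored slot (n - k).
void zeroHarmonicsAbove(std::vector<std::complex<float>>& bins,
                        std::size_t highestHarmonic)
{
    const std::size_t n = bins.size();
    assert(n >= 2 && n % 2 == 0);

    for (std::size_t k = highestHarmonic + 1; k <= n / 2; ++k)
    {
        bins[k]     = std::complex<float>(0.0f, 0.0f);
        bins[n - k] = std::complex<float>(0.0f, 0.0f);  // same element when k == n/2
    }
}
```

The 0th harmonic is never touched by this loop, since it has no mirrored copy and is not above any harmonic limit.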
  
-==== The resulting code ====+===== The resulting code =====
 The result of the above analysis---which mercifully seems to be correct for the default //juce::dsp::FFT// implementations on both Windows and Macintosh---is that the code for reconstructing an anti-aliased and harmonic-shaped time-domain wave table is as follows: The result of the above analysis---which mercifully seems to be correct for the default //juce::dsp::FFT// implementations on both Windows and Macintosh---is that the code for reconstructing an anti-aliased and harmonic-shaped time-domain wave table is as follows:
 <code cpp> <code cpp>
Line 94: Line 94:
   * Finally, we request an inverse FFT on the ''waveTable[]'' array.   * Finally, we request an inverse FFT on the ''waveTable[]'' array.
  
-==== "Real-Only" FFT ====+The astute reader will have noted that the ''for'' loop skips the 0th harmonic. The 0th harmonic of an digitized AC signal represents the //DC component// (net offset from zero) which is neither audible nor desirable in a digital signal processing system. Ensuring that the 0th-harmonic coefficient is always zero results in output waveforms which are guaranteed to be symmetric about zero---yet another free gift from the FFT. 
 + 
 +===== "Real-Only" FFT =====
 As I said earlier, the Fourier Transform is defined over the Complex domain. An ordinary sequence of time-domain samples is "real-only"---corresponding to Complex numbers having zero imaginary components. After a forward FFT operation, the Complex frequency components will be such that for each //positive-harmonic// coefficient //(re, im)//, the corresponding //negative-harmonic// coefficient will be //(re, -im)//; they are so-called //Complex conjugates//. These special properties can be utilized to define "real-only" variants of the forward and inverse FFT algorithms which use less CPU than the "full Complex" version, and the //juce::dsp::FFT// module includes these. As I said earlier, the Fourier Transform is defined over the Complex domain. An ordinary sequence of time-domain samples is "real-only"---corresponding to Complex numbers having zero imaginary components. After a forward FFT operation, the Complex frequency components will be such that for each //positive-harmonic// coefficient //(re, im)//, the corresponding //negative-harmonic// coefficient will be //(re, -im)//; they are so-called //Complex conjugates//. These special properties can be utilized to define "real-only" variants of the forward and inverse FFT algorithms which use less CPU than the "full Complex" version, and the //juce::dsp::FFT// module includes these.
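
The Complex-conjugate property is easy to check numerically. The following sketch uses a deliberately naive O(N^2) discrete Fourier transform rather than //juce::dsp::FFT//, purely to illustrate the symmetry on a small real-only input; ''naiveDft'' and ''isConjugateSymmetric'' are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive forward DFT of a real-only sequence (illustration only, O(N^2)).
std::vector<std::complex<double>> naiveDft(const std::vector<double>& x)
{
    const std::size_t n = x.size();
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<std::complex<double>> bins(n);

    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t t = 0; t < n; ++t)
            bins[k] += x[t] * std::polar(1.0, -twoPi * static_cast<double>(k * t)
                                                   / static_cast<double>(n));
    return bins;
}

// True if every bin k equals the Complex conjugate of bin n - k,
// i.e. (re, im) at k is matched by (re, -im) at n - k.
bool isConjugateSymmetric(const std::vector<std::complex<double>>& bins, double tolerance)
{
    const std::size_t n = bins.size();
    for (std::size_t k = 1; k < n; ++k)
        if (std::abs(bins[k] - std::conj(bins[n - k])) > tolerance)
            return false;
    return true;
}
```

Feeding any real-only sequence through ''naiveDft'' should make ''isConjugateSymmetric'' return true (within a small tolerance), which is exactly the redundancy the "real-only" FFT variants exploit.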
  
-Any CPU efficiency is a bonus, but for practical purposes, what you need to know is that the "real-only" FFTs interpret the time-domain data not as an array of 1024 ''Complex'' values (a real ''float'' followed by the corresponding imaginary ''float''), but rather as an array of 1024 real ''float''s followed by an array of 1024 imaginary ''float''s (which are all zeros). Hence, even though the array is declared as size 2048, we simply ignore the second half, and use only the first 1024 ''float''s. //SynthOscillator::getSample()// is then simply+Any CPU efficiency is a bonus, but for practical purposes, what you need to know is that the "real-only" FFTs interpret the time-domain data not as an array of 1024 ''Complex'' values (a real ''float'' followed by the corresponding imaginary ''float''), but rather as an array of 1024 real ''float''s followed by an array of 1024 imaginary ''float''s (which are all zeros). Hence, even though the array is declared as size 2048, we simply ignore the second half, and use only the first 1024 ''float''s. //SynthOscillator::getSample()// is simply
 <code cpp> <code cpp>
 float SynthOscillator::getSample() float SynthOscillator::getSample()
sarah_oscillator_details.1504654399.txt.gz · Last modified: 2017/09/05 23:33 by shane