Creative Coding

Fundamentals of Signal Processing for Musicians

A free, text-based primer on the signal processing behind music production — waveforms, amplitude, frequency and phase, wave superposition, spectrum and harmonicity, sound pressure and dB, and the analog-to-digital path — with diagrams and audio examples throughout.

Level

Beginner

Duration

Self-paced

Format

Self-paced video

Added

14/03/2023

Course overview

A self-paced introduction to the signal processing concepts every music producer benefits from understanding. Starting from what an audio signal is, it works through waves and waveforms, amplitude and DC offset, period, frequency and phase, and how waves combine into complex tones. Later lessons cover the spectrum, harmonicity and pitch, sound pressure level and decibels, and the difference between analog and digital signals, before an overview of common signal-processing techniques. Illustrated with diagrams and sine, saw and square-wave audio examples.

Learning outcomes

Understand and explain the fundamental principles of sound, its physical properties, and how it interacts with the human auditory system.

Identify and operate various audio tools and equipment, such as microphones, preamps, and amplifiers, and understand their role in the audio signal processing chain.

Apply various audio signal processing techniques, including sound synthesis, dynamic effects, filters, distortions, and advanced techniques like pitch-shifting and looping, to create, manipulate, and enhance audio signals.

Understand the principles of digital audio, including analog-to-digital and digital-to-analog conversion, and effectively use digital audio processing software for audio production and manipulation.

Who is this course for?

  • Music Producers and Sound Engineers: Those already working in the field can deepen their understanding of the science behind their craft and learn new techniques.
  • Musicians and Composers: Those seeking to better understand the science of sound for composing and creating music will benefit greatly from this course.
  • Students and Academics: Students studying physics, music, acoustics, or related fields can enhance their knowledge with practical applications of theory.
  • Audio Enthusiasts and Hobbyists: Those with a passion for sound, music, and audio technology, who want to understand more about how it works and how to manipulate it for their own projects.
  • Aspiring Audio Professionals: Individuals who aim to start a career in the music or sound industry, such as sound design for games, film, television, and music production.
  • Professionals in Related Fields: Professionals in areas such as game design, film production, or multimedia arts, who want to improve their understanding of sound to enhance their primary skills.

Requirements

  • Basic familiarity with physics: An understanding of basic concepts of physics, especially waves and vibrations, will be beneficial for fully grasping the principles of sound and audio signal processing.
  • Interest in audio production: While not a strict requirement, an interest in audio production, sound design, or music will make the course more engaging and enjoyable.
  • Basic computer skills: As this course involves digital processing and software use, basic computer skills are necessary. Knowledge of audio processing software can be beneficial but is not a prerequisite as the course will provide introductions to these tools.

Course content

Fundamentals of Signal Processing for Musicians

10 lessons

+
  • Introduction to signal processing


    Signal processing is a fundamental aspect of the modern music production process, encompassing a wide range of techniques and tools that manipulate, analyze, and transform audio signals. These techniques can be used to create new sounds, enhance the quality of recordings, and facilitate the seamless integration of various audio elements in a mix.

    But first of all: what is an audio signal? An audio signal is an electronic representation of sound. In music production, this representation takes two broad forms: the time-domain representation, called the waveform, which shows the shape of the oscillation over time, and the frequency-domain representation, called the spectrum, which shows all the frequencies contained in the signal.

    Waveform and spectrum of a vocal signal — source: Wikipedia

    Spectrogram of the spoken words "nineteenth century" (source: Wikipedia). A spectrogram represents the evolution of the spectrum over time.

    Similarly, signal processing techniques can operate in the time domain or in the frequency domain and, depending on their application, they affect characteristics of the signal that are visible in its waveform and spectrum. They can be summarized as follows: sound synthesis (additive, subtractive, FM, granular), dynamic effects (gain, preamp, compressor), filters (low-pass, equalizer), distortions (fuzz, overdrive, clipping, bit crushing), modulations (flanger, chorus, phaser, tremolo), delay and reverb (tape, echo, spring, plate), imaging and spatialization (panning, balance, stereo width, mono sum), and some more complex techniques that require buffers and spectral processing (pitch-shifting, time-stretching, looping).

    Signal processing techniques can occur in two complementary fields, called analog and digital. Analog signal processing uses continuous electrical signals to represent and manipulate audio, while digital signal processing relies on discrete, numerical representations of audio signals to perform manipulations and transformations.

    The development and popularization of digital signal processing have revolutionized the music production landscape, allowing for more precise control and manipulation of audio signals, as well as the creation of entirely new sounds and effects. Today, signal processing is an integral component of music production, enabling musicians and producers to shape, sculpt, and refine their audio creations in ways that were previously unimaginable.

    In the context of this course, the focus will be on the principles of real-time digital audio processing, covering various synthesis techniques and digital audio effects as well. By understanding the underlying concepts and applications of these techniques, musicians can operate on them independently, to achieve the desired outcome in fundamental professional contexts like sound design, production, mixing, and mastering.

  • Waves, Sound Waves, and Waveforms



    A wave is a disturbance that travels through space and time, carrying energy and momentum without transporting matter. In physics, a disturbance is a change from the current state of a measurable quantity at some location. Waves come in a huge variety, classified mainly by their type (electromagnetic waves, mechanical waves) and by their motion (longitudinal waves, transverse waves, etc.). Wave motion is defined by the direction in which the wave propagates relative to the direction in which its particles move: if the two directions are parallel, the motion is longitudinal; if they are perpendicular, the motion is transverse. Acoustic waves are pressure variations. They are mechanical, meaning they require a medium such as air to propagate, and their motion is longitudinal. These pressure variations are caused by a vibrating physical source, such as a guitar string or a vocal cord. Ultimately, this motion can be detected by a receiver, such as the human ear. Sound is the term we use to describe an acoustic wave that is audible to humans.


    Longitudinal Plane Wave (source: Wikipedia)
    Transverse Plane Wave (source: Wikipedia)

    The first image above, showing a 2D longitudinal wave, helps us understand what happens to the air particles affected by acoustic wave propagation. Each vertical line can be considered a column of particles that moves back and forth and transfers this motion to the following line. The spacing between lines varies as they move: when a line moves towards the following one, the spacing between the two is reduced and the pressure reaches a maximum. Then, when the line bounces backward and moves away from the following one, the pressure reaches a minimum. These alternating movements are called compression and rarefaction, respectively.

    We can study waves and their motion by measuring their corresponding signal. But what is a signal then? A signal is a mathematical function, representing a physical quantity that varies in relation to another physical quantity. For example, the pressure that varies over time describes what we call an audio signal.

    Waveforms, as mentioned in the first lesson, are used to plot a representation of sound (i.e. an audio signal) on a 2D Cartesian coordinate system. Conventionally, they depict the particle displacement on the vertical axis and time on the horizontal one. The following image (source: Wikipedia) draws a continuous curve that traces all the positions the blue particle occupies as it moves up and down over time. In simple terms, that is how a waveform is built.



    By dissecting the shape and size of a waveform, we can retrieve basic information about the sound that originated it, such as amplitude, period, frequency, and phase.

  • Sound Parameters pt. 1: DC Offset and Amplitude

    Waves are characterized by motion that varies over time (and also across space). In acoustic waves, this variation is represented by the motion of the air particles, which continuously change direction. This results in a sequence of high and low peaks, corresponding to the points of maximum and minimum pressure on the physical wave. In the following plot, we can easily identify these peaks: they are represented by the highest and lowest points of the waveform.

    Waveform of a Violin Sound — Source: Wikipedia


    A waveform can provide us with a wealth of information. Typically, it oscillates around a central value known as the DC offset, which represents the equilibrium point around which the particle moves and is normally equal to zero. Points above the DC offset (upper part of the plot) have positive values, while those below it (lower part of the plot) have negative values. The maximum displacement of the wave from its equilibrium point defines its amplitude, which is denoted by the letter A. This is the first crucial parameter of our signal. To calculate the amplitude, we flip the negative part of the waveform (i.e., the oscillation below the DC offset) to create a sequence of positive values. This process is called rectification, and the amplitude of the signal corresponds to its highest peak after rectification.


    The black dotted curve below the x-axis is flipped over it. The red line represents the rectified signal, whose amplitude corresponds to the blue dotted line that touches its peaks. Here the DC offset is represented by the horizontal x-axis.
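
    As a rough illustration of rectification, the sketch below flips the negative samples of a waveform and reads the peak. The function names, the 0.8 amplitude, and the use of the mean value as a DC offset estimate are assumptions made for this example only.

```python
import math

def peak_amplitude(samples):
    """Peak amplitude: the largest value after full-wave rectification."""
    rectified = [abs(s) for s in samples]  # flip the negative half upward
    return max(rectified)

def dc_offset(samples):
    """Estimate the DC offset as the mean value of the samples (an assumption)."""
    return sum(samples) / len(samples)

# One cycle of a sine wave with amplitude 0.8, sampled at 1000 points
# (illustrative values, not taken from the course).
N = 1000
wave = [0.8 * math.sin(2 * math.pi * n / N) for n in range(N)]

print("peak amplitude:", peak_amplitude(wave))  # close to 0.8
print("DC offset:", dc_offset(wave))            # close to 0
```

    For a signal that oscillates symmetrically around zero, the mean is close to 0 and the rectified peak equals the amplitude A.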


    When the DC offset of a signal is different from zero, things get more complicated. Generally speaking, audio signals that are reproduced through loudspeakers should have a DC offset of 0. Other values will not only compromise our measurements and the resulting sound but can also damage our PA system.

    The amplitude of an audio signal increases when more energy is involved in the wave motion. For perceptual and technical reasons, amplitude variations can be very relevant even when they occur within small ranges of values. This forces us to deal with numbers that span several orders of magnitude, with varying numbers of digits after the decimal point. To handle these values more efficiently, we will later introduce a fundamental tool that math gave to the sound engineering world: decibels.

  • Sound Parameters pt. 2: Period, Frequency, and Phase

    A fundamental characteristic of some audio signals is called periodicity. A periodic signal exhibits a repeating pattern of oscillation at regular intervals of time, with each repetition maintaining the same shape. The time it takes for one complete cycle of the pattern to occur is called the period of the signal and is denoted by the letter T.

    A sinusoidal signal with its amplitude and period quantities denoted respectively by the letters A and T (Source: Wikipedia) 

    The period in physics tells us about the time it takes something to happen. In this case, the time it takes for a signal to complete a full cycle before starting a new one. Another important parameter related to period is frequency — denoted by the letter f — which tells us how often something happens. The frequency of a signal tells us how many cycles per second are drawn, and it is measured in Hertz (Hz). 1 Hz is one oscillation per second. By knowing the period, we can retrieve the frequency of a signal by finding its reciprocal, that is 1 divided by the period itself, 1/T. When the wave is a sound wave, a higher frequency corresponds to a higher pitch.
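
    The reciprocal relationship between period and frequency can be sketched in a couple of lines. The function names and the example values are illustrative assumptions.

```python
def frequency_from_period(period_s):
    """Frequency in Hz from a period in seconds: f = 1 / T."""
    return 1.0 / period_s

def period_from_frequency(freq_hz):
    """Period in seconds from a frequency in Hz: T = 1 / f."""
    return 1.0 / freq_hz

# A signal that repeats every 10 ms completes about 100 cycles per second.
print(frequency_from_period(0.010), "Hz")

# A 440 Hz tone has a period of roughly 2.27 ms.
print(period_from_frequency(440.0) * 1000, "ms")
```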

    The simplest periodic signal describes a point moving around a unit circle at a constant speed. When a full circle is completed, the motion repeats itself, taking the same amount of time for each lap. This is called simple harmonic motion, and when we plot the variation of its y-coordinate we draw the most basic periodic waveform: a sinusoid, or sine wave.

    Source: Wikipedia


    The image above shows two waves, a blue one and a red one, respectively related to the variation on the x-axis and y-axis of the point moving on the circle plot. The growing angle drawn by this anti-clockwise motion is denoted by θ (theta) and computed starting from a 0° value placed on the point (1, 0) on the unit circle plot. Let's focus on the red wave: The upper half of the circle corresponds to the positive semi-oscillation of the sine wave (first half of the period), while the lower half corresponds to its negative semi-oscillation (second half of the period). It takes a full cycle on the circle to complete a full sine wave period, including both positive and negative oscillations. If we want to describe a specific point within one cycle, we use the term phase, usually denoted by the Greek letters φ or θ. Since phase is equal to the position of the point on the circle, it can also be seen as a growing angle. That is why we use angle units of measurement to describe phases, like degrees and radians.

    Let's now find the matches between the oscillation points and their phase values: 

    • The very first point of the oscillation is placed at phase 0°, or 0.
    • The positive peak is placed at phase 90°, or π/2.
    • The half-period point, equal to zero, is placed at phase 180°, or π.
    • The negative peak is placed at phase 270°, or 3π/2.
    • The end of the period, equal to 0 and coincident with the start of the following period, is 360° or 2π, and at the same time 0°.

    Hence, each start of a new period on a periodic signal corresponds to a multiple of 2π.

    Here is an example of a one-period-long analog sine wave, together with all its quantities and its phase values. (Source: Wikipedia)
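
    The phase values listed above can be checked numerically. This is a small sketch; the helper name and the rounding to 12 decimals (used only to hide floating-point noise) are choices made for this example.

```python
import math

def sine_at_phase(phase_deg):
    """Value of a unit-amplitude sine wave at a phase given in degrees."""
    value = math.sin(math.radians(phase_deg))
    # Round away tiny floating-point errors; "+ 0.0" turns -0.0 into 0.0.
    return round(value, 12) + 0.0

for deg in (0, 90, 180, 270, 360):
    radians_over_pi = deg / 180  # phase expressed as a multiple of pi
    print(f"{deg:3d} deg = {radians_over_pi:.1f} * pi rad -> {sine_at_phase(deg):+.1f}")
```

    The start, half-period, and end points evaluate to 0, while the peaks at 90° and 270° evaluate to +1 and -1.
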
  • Wave superposition: Interference and Complex Waveforms

    In the last lesson, we discovered the simple harmonic motion as the most fundamental oscillation possible — i.e. the sinusoidal one. This means that any other waveform is more complex, and a good way of looking at them is to consider them as a composition of multiple simple oscillations. But how are simple waves summed together? The principle of superposition answers this question:


    When two or more waves cross at a point, the displacement at that point is equal to the sum of the displacements of the individual waves.


    In the following image, we see waves resulting from the superposition of several waves from different sources, producing a complex pattern. Their high peaks and low peaks occur at specific points depending on the sum of the waves' displacement values at that location — at that very moment of course.


    Waves interference pattern on a water surface, source: Wikimedia Commons

    To better understand the result, let's see how the sum between two sine waves looks with waveforms. Here are four basic cases:


    1. The first two images below show two identical sine waves drawn in black and blue: same period, same amplitude, and the same phase — when this happens we say that they are in phase. The sum between them is shown in the third picture below, drawn in red: it is still a sine wave, but its amplitude doubled. We call this case constructive interference.


                                       
      As in algebra, adding an object to itself gives that same object multiplied by 2. If the two waves had different amplitudes, let's say 1 and 0.5, the resulting sum would be a sine wave of amplitude 1.5.

    2. The first image below on the left shows a sine wave drawn in black. The second image on the right shows a sine wave drawn in blue. They have the same amplitude and period, but the second sine wave has its phase shifted by 180° (180° out of phase, or half an oscillation): it starts with the semi-oscillation made of only negative values. The resulting sum of the two waves is shown in the third image below in red, and it equals 0. Opposite oscillations cancel each other, resulting in the absence of motion: a straight horizontal line with no displacement whatsoever. We call this destructive interference.

                                             


      If we just pick the high peak of the black wave, which equals 1, we easily notice how it occurs at the same time as the low peak of the blue wave, which equals -1. Knowledge of basic algebra helps us also in this case, as we are basically summing perfectly opposite numbers, and any number summed with its opposite gives 0 as a result.

      From the two cases above we can also work out what happens when two identical waves with a phase offset other than 180° are added together. Depending on the size of the offset, we obtain a third sine wave with an amplitude between 0 and 2 (image below).

      The blue and green waves are summed together, resulting in the red line. The animation shows an increasing phase offset between the two waves, that each time produces a red wave of different amplitude. (Source: Wikipedia)
    3. The following image shows a wave of a given period (black dashed line) and a second wave whose period is half that of the first (blue dashed line); the second wave's frequency is therefore a multiple of the first wave's frequency, namely its double. Their sum is a composite waveform (red continuous line), which is no longer a sine wave.


      In this specific case, since the period of one wave is a multiple of the other, the two waves line up at regular points: the zero crossings of the long wave (at its start, half period, and end) coincide with zero crossings of the short wave.
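
    The interference cases above can be reproduced by summing two sampled sine waves point by point and measuring the peak of the result. This is a minimal sketch: the function names, the 1 Hz frequency, and the 10,000-step resolution are arbitrary choices for illustration.

```python
import math

def sine(amplitude, freq_hz, phase_rad, t):
    """Displacement of a sine wave at time t (seconds)."""
    return amplitude * math.sin(2 * math.pi * freq_hz * t + phase_rad)

def peak_of_sum(phase_offset_rad, amp_a=1.0, amp_b=1.0, freq_hz=1.0, steps=10000):
    """Peak amplitude of the sum of two same-frequency sines over one period."""
    peak = 0.0
    for n in range(steps):
        t = n / steps / freq_hz
        # Superposition: the displacements simply add up at every instant.
        total = sine(amp_a, freq_hz, 0.0, t) + sine(amp_b, freq_hz, phase_offset_rad, t)
        peak = max(peak, abs(total))
    return peak

print("in phase:          ", peak_of_sum(0.0))            # close to 2
print("180 deg offset:    ", peak_of_sum(math.pi))        # close to 0
print("90 deg offset:     ", peak_of_sum(math.pi / 2))    # between 0 and 2
print("amplitudes 1 + 0.5:", peak_of_sum(0.0, 1.0, 0.5))  # close to 1.5
```

    For two unit-amplitude waves, the peak moves smoothly from 2 (constructive interference) down to 0 (destructive interference) as the phase offset grows from 0° to 180°.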

  • Spectrum, Harmonicity, and Pitch

    The frequency of a periodic signal directly affects what we call the pitch of a note. Let's listen to an example with two sawtooth waves:

    C4 sawtooth (261.6Hz)

    A4 sawtooth (440Hz)

    A4 is a higher note than C4, and we clearly hear this difference. Since the frequency of A4 is also higher than that of C4, we understand that pitch and frequency grow together. However, this growth is not linear: it follows a more sophisticated rule called a logarithmic function.

    The plot of the function y = ln(x)
     

    On the logarithmic plot above, the growth of the quantity on the vertical axis slows down as the quantity on the horizontal axis reaches higher values. This is exactly how pitch relates to frequency. To better understand it, we can take a step back and examine a specific musical interval: the octave. An octave is the distance between two notes with the same name on a musical scale. The Western musical scale C D E F G A B repeats itself after seven notes, which is why we call this interval an octave: we need 8 steps to get from C4 to C5. This is not just an arbitrary definition; it is grounded in Western music history, physics, and perception. Let's listen to the interval A3-A4:

    A3 sawtooth (220Hz)

    A4 sawtooth (440Hz)

    We can clearly hear that, despite sitting at different heights on the pitch scale, these notes share the same name and sound in tune with each other. This is why, for instance, we can sing the same melody together with a person who has a lower or higher vocal range: if we sing an octave apart, it works well. Their frequencies have a ratio of 2:1, as A4 is 440 Hz and A3 is 220 Hz.
    If we want to find A5, the jump becomes larger, as its frequency is 880 Hz: 4:1 relative to A3 and 2:1 relative to A4. A6 is 1760 Hz: 8:1 relative to A3, 4:1 relative to A4, and 2:1 relative to A5. We see that to keep the octave interval we must double the frequency at each repetition. Pitch therefore grows with the logarithm of frequency, as shown in the plot below:


    The frequencies corresponding to musical pitches/notes and pitch/note intervals (interval: pitch distance) increase logarithmically. Per the current standard, pitch A4 corresponds to 440Hz. (Source: Acousticslab.org) 
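
    The doubling rule can be turned into a short calculation. The sketch below assumes twelve-tone equal temperament (12 equal semitones per octave), which the text above does not state explicitly; the function names are illustrative.

```python
import math

A4_HZ = 440.0  # reference pitch, as stated in the caption above

def octave_shift(freq_hz, octaves):
    """Frequency after moving up (positive) or down (negative) by whole octaves."""
    return freq_hz * 2 ** octaves

def semitones_from_a4(freq_hz):
    """Pitch distance from A4 in semitones, assuming 12 equal semitones per octave."""
    return 12 * math.log2(freq_hz / A4_HZ)

for name, freq in [("A3", 220.0), ("C4", 261.6), ("A4", 440.0), ("A5", 880.0), ("A6", 1760.0)]:
    print(f"{name}: {freq:7.1f} Hz, {semitones_from_a4(freq):+6.2f} semitones from A4")
```

    Each doubling of the frequency adds the same 12 semitones, so equal steps in pitch correspond to equal ratios in frequency rather than equal differences.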

    Waveforms can be further analyzed through sophisticated methods that give us information about their frequencies. We call the frequency content of a waveform its "spectrum", and it is shown as a diagram that maps intensity (amplitude or dB) on the vertical axis and frequency on the horizontal one. For example, the range of frequencies to which we are sensitive is called the audible spectrum, and it goes from 20 Hz to 20,000 Hz; everything below (infrasound) and above (ultrasound) is inaudible to us. The spectrum of a sine wave is made of a single line placed at its frequency value, whose height equals its amplitude. That is also why we call the sound of a sine wave a pure tone: it represents a single pure oscillation, and so a single, precise frequency value. Non-sinusoidal signals have a higher spectral complexity, as they can be seen as the sum of several different sine waves; the superposition principle studied in the previous lesson shows how that works with an example of two sines summed together. Our hearing is more sensitive to the spectrum of a sound than to its waveform. Here is an example of three different waveforms having the same frequency, but different spectra:

    A4 sinewave (440Hz)

    A4 sawtooth (440Hz)

    A4 squarewave (440Hz)

    The sawtooth and the square are in tune with the sine, but they sound more complex because of their richer frequency content. Below, there is a comparison of the three waves' respective spectra:


    (Source: Perfect Circuit) 

    The square and the sawtooth have more than one line in their spectrum. Non-pure tones that have a periodic waveform can have a richer spectrum, with a large number of different sinusoidal components named partials. As long as the signal is periodic, the frequency of each partial follows a fundamental rule that places it at an integer ratio to the lowest partial: 2:1, 3:1, 4:1, 5:1, 6:1; the partials are essentially multiples of it. We call this configuration a harmonic spectrum, its first partial the fundamental, and the following ones overtones or harmonics. A harmonic spectrum reinforces the perception of a note placed at the fundamental frequency. In the image above we can see that the sawtooth spectrum decreases in amplitude as the order of its partials increases, and so does the square wave spectrum. However, the spacing between partials is larger in the square wave spectrum, as it contains only odd-numbered harmonics (multiples 3, 5, 7, and so on of the fundamental). Spectral features, together with other factors, contribute to defining a different timbre for each instrument.
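
    The harmonic rule can be sketched by listing the partials of an idealized sawtooth and square wave. The 1/n amplitude profile used here is the textbook result for ideal waveforms and is stated as an assumption; real oscillators and the plot above may differ in detail. The function names are illustrative.

```python
def sawtooth_partials(fundamental_hz, count):
    """(frequency, relative amplitude) pairs: every harmonic n, amplitude 1/n."""
    return [(n * fundamental_hz, 1.0 / n) for n in range(1, count + 1)]

def square_partials(fundamental_hz, count):
    """(frequency, relative amplitude) pairs: odd harmonics only, amplitude 1/n."""
    return [(n * fundamental_hz, 1.0 / n) for n in range(1, 2 * count, 2)]

print("A4 sawtooth:")
for freq, amp in sawtooth_partials(440.0, 5):
    print(f"  {freq:7.1f} Hz  amplitude {amp:.3f}")

print("A4 square:")
for freq, amp in square_partials(440.0, 5):
    print(f"  {freq:7.1f} Hz  amplitude {amp:.3f}")
```

    Both lists start at the 440 Hz fundamental, but the square wave skips every even multiple, which is why its spectral lines are spaced twice as far apart.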


  • Sound Pressure Level and dB

    We encounter logarithmic scales very often when dealing with sound. Another important case is related to intensity. Things can get confusing here because there are a lot of related quantities to deal with (sound intensity level, acoustic power, etc.). For the sake of simplicity, let's stick to a single scale that we will call sound pressure level, which is all we need to relate to the amplitude of our signal.

    Imagine we are measuring the amplitude of a sound that is changing in time. We need to go a step beyond the simple rectification of a wave (as we saw in the third lesson) and take a more meaningful average amplitude over a certain interval of time. This is done through a signal averaging technique called root mean square, or RMS. The result is a single pressure value, measured in pascals (Pa).
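
    As the name suggests, the RMS value is obtained by squaring the samples, taking their mean, and then taking the square root. The sketch below is a minimal illustration; the function name and the test signal are assumptions for this example.

```python
import math

def rms(samples):
    """Root mean square: square each sample, average, then take the square root."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square)

# One full cycle of a unit-amplitude sine wave, sampled at 1000 points.
N = 1000
sine_wave = [math.sin(2 * math.pi * n / N) for n in range(N)]

print("peak:", max(abs(s) for s in sine_wave))  # close to 1
print("RMS: ", rms(sine_wave))                  # close to 1 / sqrt(2), about 0.707
```

    Squaring plays the role of rectification (all values become positive), while averaging turns the changing waveform into a single representative number.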

    The range of intensities to which we are sensitive is impressively wide, encompassing several orders of magnitude. It goes from the threshold of hearing, which is the quietest sound we can perceive and is equal to 0.00002 Pa, to the threshold of pain, which is the minimum intensity at which a person begins to perceive a stimulus as painful and is above 20 Pa. To deal with a more convenient range of numbers, we compress this scale by calculating the logarithm of the ratio between our measured RMS pressure and the threshold of hearing.

    Source: Wikipedia

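    This is the formula for the decibel SPL, or dB SPL: 20 times the base-10 logarithm of the ratio between the RMS pressure and the threshold of hearing, 20 · log10(p_rms / p_0) with p_0 = 0.00002 Pa. Why is it convenient? Let's look at another image, comparing values in micropascals with the corresponding amounts in dB SPL.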

    Source: The Pro Audio Files

    Everything falls within a range that goes from 0 dB to 130 dB. When we use dB SPL, each doubling or halving of the signal becomes a simple addition or subtraction of approximately 6 dB. This is very convenient and also matches the way we perceive intensity: we are sensitive to intensity variations that multiply the signal by a factor (×2 or ×1/2, for instance) rather than to linear changes. Turning multiplications of big numbers with many digits into sums of small numbers is also far more intuitive.
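
    The conversion from pascals to dB SPL can be sketched as follows, using the standard definition of 20 times the base-10 logarithm of the pressure ratio, with the 0.00002 Pa threshold of hearing as the reference. The constant and function names are illustrative.

```python
import math

P_REF = 20e-6  # threshold of hearing in pascals (0.00002 Pa)

def db_spl(p_rms):
    """Sound pressure level in dB SPL for an RMS pressure given in pascals."""
    return 20 * math.log10(p_rms / P_REF)

for pressure in (20e-6, 0.002, 0.2, 20.0):
    print(f"{pressure:10.6f} Pa -> {db_spl(pressure):6.1f} dB SPL")

# Doubling the pressure adds about 6 dB, whatever the starting level.
print("doubling adds", round(db_spl(0.4) - db_spl(0.2), 2), "dB")
```

    Six orders of magnitude in pressure, from 0.00002 Pa to 20 Pa, collapse into a range of roughly 0 to 120 dB.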

  • Analog Signal

    Electronic systems for sound can operate on an electric "copy" of the real acoustic wave thanks to a basic operation called transduction. A transducer is a device that converts energy from one form to another. The human ear is the most familiar transducer to us. Transducers that convert acoustic energy into electricity mostly belong to the wide and diverse set of devices that we know as microphones.


    AKG C451B small-diaphragm condenser microphone (Source: Wikipedia) 


    After the transduction stage, we are dealing with an oscillating voltage that can be conditioned by a sequence of elements that apply signal processing techniques to better render our final sound. Together they form what we call a sound system, or electroacoustic chain. Many configurations are possible, but we can say that a simple, generic live sound system has the following structure:

    Transducer
    Microphones and pick-ups, operating in mono, stereo, or more channels. The analog signal journey starts here. Let's assume we are working on the sound for a live act by a voice and guitar duo. We will have two distinct signals to deal with: one coming from the singer's microphone, and the other coming from the transduction of the guitar sound. Let's consider a very simple case, with the acoustic guitar picked up by another microphone. These two signals travel on parallel routes, treated on separate lanes of our electroacoustic chain until the point where we finally merge them. These lanes are usually represented by the channel strips of a mixing console. The way to get there is through cables and input connections.

    Preamp
    Here the analog signals of vocals and guitar are amplified to a proper reference level through a gain stage, a fundamental operation that gives us stronger signals with a better signal-to-noise ratio, suitable for further processing and at a level comparable with the other signals they will be mixed with. The preamp is often embedded in professional mixing consoles, placed right after the input connections. The gain amount is regulated through a knob called trim.

    Mark Levinson Preamp (Source: Wikipedia) 

    Channel Strip Processors
    Here come the adjustments, usually handled by a series of different knobs. Depending on the complexity of the mixer, several processors can be available here to modify our two signals independently, enhancing their respective features so that the subsequent merging is constructive. For instance, we can even out the guitar dynamics with a compressor and emphasize the vocal range frequencies with an equalizer. Then, we can turn the pan knob to position each source (guitar and voice) in the stereo image, made of left and right channels. An additional knob might send our sound to an auxiliary output for further processing with outboard gear (a reverb, for instance), from which we send the processed sound back into the mixer, for example as an additional channel. Finally, the fader regulates the final level of a signal before it is merged with the other signals. An additional fader dedicated to the sum that goes to the output helps us keep the amplitude of the sound within a controlled range.

    A Yamaha MG20XU Live Analog Mixer with a Digital FX Bus. We can see the parallel vertical channel strips per each input channel presenting several knobs and switches. From the top: combo connector (xlr+jack cable), pad attenuation and hp filter (grey buttons), preamp gain (white knob), compressor (yellow knob), 3-band semi-parametric EQ (green knobs), aux sends (blue knobs), fx send (white knob at the bottom), pan (red knob), on/off switch, fader with output bus assignment and pre-fader listening switches on its right. (Source: Wikipedia) 

    Amplifier
    Two channels containing a different balance of the sum between voice and guitar are sent to an amplifier, which brings them to a range of values that is adequate for driving enough energy to the speakers.

    A McIntosh stereo audio amplifier with output power of 50 watts per channel used in home component audio systems in the 1970s. (Source: Wikipedia)

    Loudspeakers
    Finally, our electrical signal is sent to another transducer, which turns it back into acoustic pressure. We can now listen to a balanced mix of voice and guitar, panned across the horizontal plane consistently with the positions of the two musicians. There are several kinds of loudspeaker systems, depending on the size of the room or concert venue and on other features.

    A line-array cluster of speakers used in large concert venues (Source: Wikipedia)


  • Digital Signal

    Nowadays, signal processing is mostly done with digital devices, whether for editing, producing, mixing, or broadcasting. Technology lets us store and preserve accurate copies of our signals in the form of digital data, which we can process through non-destructive editing. To understand how digital audio works, we can draw an analogy with moving images in films. Human vision registers visual information at a limited rate, corresponding to approximately 24 or 25 images per second. This means that video equipment able to record and play back a sequence of images at a similar rate (24 frames per second, or 24 fps, for instance) can create the illusion of smooth motion with a finite number of stills.

    Digital audio works in a similar way. The preamplified signal is first filtered to limit its spectral range, which avoids representation errors (called aliasing or foldover). Then a processing block called an analog-to-digital converter, or ADC, starts "taking snapshots" of the oscillation. This follows the Nyquist-Shannon sampling theorem: to represent the full audible spectrum, we need to sample the audio signal at a rate that is more than double the highest frequency in that range. As the audible range goes from 20 Hz to 20 kHz, we need more than 40,000 snapshots per second to capture all these oscillations.


    The grey curve represents a continuous oscillation, and the red arrows show the sampling points. (Source: Wikipedia)

    Different sampling rates are chosen depending on the context. The most common in professional work are 44.1 kHz and 48 kHz. At this point, the oscillation has been turned into a sequence of numbers.
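
    The sampling step and the foldover it can cause may be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 48 kHz rate is one of the rates named above, and the function names `sample_sine` and `apparent_frequency` are hypothetical helpers, not part of any audio library.

```python
import math

SAMPLE_RATE = 48_000  # samples per second (assumed for this sketch)


def sample_sine(freq_hz, num_samples, sample_rate=SAMPLE_RATE):
    """Return num_samples snapshots of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]


def apparent_frequency(freq_hz, sample_rate=SAMPLE_RATE):
    """Frequency a pure tone appears to have once it has been sampled.

    Anything above sample_rate / 2 folds back into the range
    0 .. sample_rate / 2 (aliasing, or foldover).
    """
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)


samples = sample_sine(1_000, 48)    # one full period of a 1 kHz tone
print(len(samples))                 # 48 numbers describe that period
print(apparent_frequency(1_000))    # below half the rate: stays 1000
print(apparent_frequency(30_000))   # above half the rate: folds to 18000
```

    A 30 kHz tone sampled at 48 kHz produces the same snapshots (with inverted sign) as an 18 kHz tone, which is why the signal is filtered before it reaches the ADC.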

    In order to preserve detailed information, it is also fundamental to use enough points on the vertical axis. This is the job of quantization, whose resolution is set by the bit depth. Two common bit depths are 16 and 24 bits, which give 2^16 (65,536) and 2^24 (16,777,216) possible values, respectively, to approximate each point of the original oscillation. It's a lot!
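
    To make the effect of bit depth concrete, here is a minimal Python sketch of quantization. It assumes samples in the range -1.0 to 1.0 and a simple symmetric rounding scheme; real converters differ in detail, and the names `quantization_levels` and `quantize` are hypothetical.

```python
def quantization_levels(bit_depth):
    """Number of distinct values available on the vertical axis."""
    return 2 ** bit_depth


def quantize(sample, bit_depth):
    """Round a sample in -1.0 .. 1.0 to the nearest available level.

    Assumption: levels are spread symmetrically around zero, using the
    largest positive integer code as the scale factor.
    """
    max_code = 2 ** (bit_depth - 1) - 1       # 32767 for 16 bits
    clipped = max(-1.0, min(1.0, sample))     # keep the input in range
    code = round(clipped * max_code)          # integer stored on disk
    return code / max_code                    # back to a float for comparison


print(quantization_levels(16))  # 65536
print(quantization_levels(24))  # 16777216

original = 0.3
for bits in (4, 8, 16):
    error = abs(quantize(original, bits) - original)
    print(bits, error)          # the error shrinks as the bit depth grows
```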

    Now that the signal has become a stored sequence of numerical values, we can process it without fear of compromising the data: the original numbers can be kept untouched while we work on copies, so edits can be undone. The sequence of numbers is rendered as an oscillation again by the digital-to-analog converter, or DAC, which also smooths the stepped sequence of numbers through an interpolation filter. We finally have an analog signal again.

    Linear interpolation (blue line) of discrete data (red dots). (Source: Wikipedia)
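
    The linear interpolation shown in the figure can be sketched as follows. This is the simplest possible way to fill in values between samples and is meant only as an illustration; the interpolation filter inside an actual DAC is more elaborate. The helper names `linear_interpolate` and `upsample` are hypothetical.

```python
def linear_interpolate(samples, position):
    """Estimate the value at a fractional sample position (position >= 0)."""
    index = int(position)
    if index >= len(samples) - 1:
        return samples[-1]                    # nothing to the right: hold last value
    fraction = position - index               # how far we are towards the next sample
    return samples[index] * (1 - fraction) + samples[index + 1] * fraction


def upsample(samples, factor):
    """Insert factor - 1 interpolated points between neighbouring samples."""
    count = (len(samples) - 1) * factor + 1
    return [linear_interpolate(samples, i / factor) for i in range(count)]


stepped = [0.0, 1.0, 0.0, -1.0]
print(upsample(stepped, 2))
# [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0]
```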

  • Overview of Signal Processing Techniques

    There are numerous signal processing techniques used in music production. Some of the most common include:

    • Sound synthesis. It generates sounds from scratch or manipulates existing ones using techniques such as additive, subtractive, frequency modulation (FM), granular, and wavetable synthesis.

      Basic waveforms. They can be generated through synthesis techniques like wavetable, polynomial functions, and additive synthesis — source: Wikipedia.

    • Dynamic effects. They mainly affect the amplitude of the signal and related aspects such as intensity, volume, etc. Some examples are gain stages, preamps, gates, and compressors.

      Upward compression — source: Wikipedia
    • Filters. They can change the energy amount of some frequency regions, which can be enhanced, diminished, and even suppressed. Some examples are low-pass, band-pass, high-pass, notch, resonant, low and high shelf, graphic and parametric equalizers, and crossovers.

      Parametric equalizer — source: Wikipedia
    • Distortions. Like dynamic processors, they also operate on the amplitude of the signal, but in a way that heavily affects its spectrum as well. Some examples are electric guitar distortions (fuzz, overdrive, tube, boost, scream, crunch), saturation effects, wave-folding, and clipping.


      Clipping examples (Source: Wikipedia)
    • Modulations. These effects aren't static but vary over time: they apply oscillations that set signal parameters in motion, affecting the character of the signal (pitch, frequency content, amplitude) as well as its overall shape in time. Some examples are tremolo, wah, flanger, chorus, vibrato, and phaser.

      Spectrogram of a phaser effect applied to white noise — source: Wikipedia.  

    • Delay and Reverb effects. These techniques layer time-shifted copies of the original signal together, either to create rhythmical patterns or ambiance effects, often simulating natural reverberation. Some examples are tape delay, ping-pong, slapback, echo, spring or plate reverb, and convolution reverb.


      Diagram representing the reverberation time — source: Wikipedia
    • Imaging and spatialization techniques. They affect the localization of the signal in the listening space, the stereo mixing of multiple sounds, and the nature of the listening space itself (they are often combined with other kinds of techniques for this kind of purpose). Some examples are panning, balance, stereo width, mono sum, binaural, and multi-channel spatialization.

      Audio phase-meter or goniometer, to measure the phase balance between the left and right channels of a stereo track (Source: Wikipedia)
    • Buffer-based and spectral processing techniques. They are often more sophisticated than the previous ones, and sometimes hardly fit into a single category. They typically rely on an audio buffer in which we can store portions of the signal, and on operations that give us direct access to its spectrum. Some examples are pitch-shifting, time-stretching, and looping.

      A spooky image of a creature with a diabolical grin appears when the spectrogram of a track from Aphex Twin's Windowlicker single is visualized. This evolving spectrum was designed with the software Metasynth, which can generate spectral content from scratch. (Source: Wired)
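
    Several of the categories above reduce to very small operations on a list of samples. The following Python sketch is a rough illustration only: every function name is hypothetical, the 48 kHz rate is assumed, the linear pan law is just one simple choice among several, and filters, reverb, and spectral processing are left out because they need more machinery.

```python
import math

SAMPLE_RATE = 48_000  # assumed for this sketch


def sine(freq_hz, num_samples):
    """Synthesis: generate a sine tone from scratch."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(num_samples)]


def gain(signal, factor):
    """Dynamics: scale the amplitude of every sample."""
    return [x * factor for x in signal]


def hard_clip(signal, threshold):
    """Distortion: flatten everything beyond +/- threshold."""
    return [max(-threshold, min(threshold, x)) for x in signal]


def tremolo(signal, rate_hz, depth):
    """Modulation: a slow oscillator moves the amplitude up and down."""
    out = []
    for n, x in enumerate(signal):
        lfo = 0.5 * (1 + math.sin(2 * math.pi * rate_hz * n / SAMPLE_RATE))
        out.append(x * (1 - depth * lfo))   # lfo runs between 0 and 1
    return out


def echo(signal, delay_samples, level):
    """Delay: add one time-shifted, quieter copy of the signal."""
    out = list(signal) + [0.0] * delay_samples
    for n, x in enumerate(signal):
        out[n + delay_samples] += x * level
    return out


def pan(signal, position):
    """Imaging: place a mono signal between left (0.0) and right (1.0)."""
    left = [x * (1 - position) for x in signal]
    right = [x * position for x in signal]
    return left, right


tone = sine(440, 480)                            # 10 ms of a 440 Hz tone
driven = hard_clip(gain(tone, 4.0), 1.0)         # boost, then clip at +/- 1
wobbling = tremolo(driven, rate_hz=5, depth=0.5)
with_echo = echo(wobbling, delay_samples=240, level=0.5)
left, right = pan(with_echo, 0.25)               # mostly in the left channel
print(len(tone), len(with_echo))                 # the echo tail lengthens the signal
```

    Chaining the functions mirrors how effects are chained in a mixer or a plugin rack: the output of one stage becomes the input of the next.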

    The list continues, as signal processing techniques are many and growing in number. Depending on which branch of the sound professions we specialize in, we might need to become very skilled in applying some of these: sound designers, composers, and other creatives can make good use of all of them, whereas mixing and mastering engineers might not need to deal with synthesis, for example. We hope this first chapter has intrigued you enough to continue studying signal processing.

Instructors

Massimiliano Cerioni

Instructor

Massimiliano Cerioni is a Berlin- and Rome-based AV composer, performer, sound designer and engineer. He is the founder of the independent audio software project Culto, which released its first M4L device, Simbiosi, in 2021. He uses coding, augmented instruments and electronics to create compositions, performances and installations.