Building a Synthesizer in Swift

SwiftMoji · Published in Better Programming · Jul 25, 2019 · 11 min read


Making audio waveforms with AVAudioEngine

A random synthesizer…

During WWDC ’19, Apple quietly announced some updates to the AVAudioEngine with a short video.

Included in these updates were two brand new AVAudioNodes called AVAudioSinkNode and AVAudioSourceNode. In this piece, we’ll be focusing on how the AVAudioSourceNode can be used to build a musical synthesizer for iOS.

In Apple’s slides, the two new nodes are collectively defined as a wrapper for “a user-defined block that allows apps to send or receive audio from AVAudioEngine.” In our case, we’ll be sending audio data to the output of the audio signal processing network. The AVAudioSourceNode is initialized with a render block, which we’ll write as a trailing closure taking four parameters. We will only need the last two, which are of type AVAudioFrameCount and UnsafeMutablePointer (pointing to a list of audio buffers). The closure’s return type is OSStatus, which will be used to indicate whether or not the DSP (Digital Signal Processing) code for our synthesizer’s oscillator is running smoothly.

The AVAudioSourceNode can be used in realtime and manual rendering mode, which means that it can be used to write audio straight to an audio file or, in our case, create sounds in a live context.

Lastly, Apple gives us an extremely important warning: the code you write in the AVAudioSourceNode’s block must be realtime compliant. This means that while on the audio thread, no objects should be initialized and no memory should be allocated. The realtime aspect of audio programming is especially important because lag in audio code can result in buffer underflows or overflows, which create clicks and pops. These clicks are devastating in a live performance: not only could they damage a user’s hearing, they could also damage the user’s speakers.

Now, with the introduction out of the way, let’s get building our synthesizer!

A quick video of the app we will be building.

To follow along with this tutorial you’ll need Xcode 11 or later. If you want to run your finished application on a device instead of a simulator, that device needs to have iOS 13 or later installed.

Once your Xcode is up to date, open it and navigate to File -> New -> Project. Select iOS and Single View Application. Set the Product Name to “Swift Synth” or whatever name you like. Click Next, navigate to the directory you want to save your application in, and click Create.

Let’s take care of some maintenance. For this tutorial, I’ll be laying out the UI programmatically. Feel free to use storyboards if you like — it should be easy to follow along either way.

The project should have opened up to the project settings. If you don’t want to use Storyboards, scroll down to the Deployment Info section of the General tab and clear the text field to the right of Main Interface. Next, locate the Main.storyboard file in the Project Navigator and delete it. Finally, open the Info.plist file and delete the entry for the Storyboard Name.

Follow these steps to clear the Main Interface.
Click the minus button where the arrow is pointing to remove the indicated field. You will have to drill down through the structure to reach the specified field.

One of the updates in iOS 13 and Xcode 11 was the introduction of SceneDelegate.swift. This manages the various UIScenes in your app and specifically interacts with the top-level UIWindowScene to manage multiple windows.

To set the root view controller at configuration time, we now use the scene(_:willConnectTo:options:) function. First, we should attempt to cast the scene to a UIWindowScene. If that succeeds, we can continue by initializing the window property of the SceneDelegate.

We will pass the bounds of the windowScene’s coordinateSpace into the UIWindow initializer that takes a frame. Next, set the window’s windowScene property to the windowScene constant we created earlier, and set the rootViewController of the window to a new ViewController.

Lastly, make sure to call the makeKeyAndVisible method on window to present the UI. To ensure everything is working, build and run the application. You should see a blank screen and no errors in the console.
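
To see it all together, here is a rough sketch of the SceneDelegate code (using the plain ViewController class, which we’ll rename in the next step):

import UIKit

class SceneDelegate: UIResponder, UIWindowSceneDelegate {

    var window: UIWindow?

    func scene(_ scene: UIScene, willConnectTo session: UISceneSession,
               options connectionOptions: UIScene.ConnectionOptions) {
        // Only proceed if the scene is actually a window scene.
        guard let windowScene = scene as? UIWindowScene else { return }

        // Create the window from the scene's coordinate space bounds.
        window = UIWindow(frame: windowScene.coordinateSpace.bounds)
        window?.windowScene = windowScene
        window?.rootViewController = ViewController()
        window?.makeKeyAndVisible()
    }
}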

Now, open ViewController.swift. For the sake of clarity, Command-click the class name and choose the Rename option from the dropdown. Type SynthViewController in the text box to rename the file and class. The first thing I like to do when programming a new view controller is to stub it out with mark comments:
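
As a sketch, the stubbed-out view controller might look like this (the exact MARK sections are just one reasonable layout):

import UIKit

class SynthViewController: UIViewController {

    // MARK: Properties

    // MARK: Lifecycle

    override func viewDidLoad() {
        super.viewDidLoad()
    }

    // MARK: Setup

    // MARK: Selector Functions
}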

It’ll be easier to create the audio part of the app first. Create a new Swift file (File -> New -> File) and call it Synth. Start by importing AVFoundation and Foundation:

import AVFoundation
import Foundation

Then create a class called Synth and stub it out with the same mark comments. You’ll also need to create an initializer, although we’ll be modifying its parameters later:

We’ll be making Synth a singleton by adding a static shared instance property to its definition. This will allow us to access it from any view controller with ease:

Now we’ll add a few properties pertaining to the actual audio of our synthesizer. The first is a volume property, which will allow us to simulate turning the synthesizer on and off. The most important variable in our synthesizer is the engine, which is an AVAudioEngine. The AVAudioEngine will host the sound-making AVAudioNodes that we add to our signal chain. The last three are timing variables, which we will discuss in more detail later.
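
As a sketch, the property section of Synth might look like the following. Implementing volume as a computed property that forwards to the main mixer’s outputVolume is one reasonable approach, not the only one:

// MARK: Properties

public static let shared = Synth()

public var volume: Float {
    get { return audioEngine.mainMixerNode.outputVolume }
    set { audioEngine.mainMixerNode.outputVolume = newValue }
}

private var audioEngine: AVAudioEngine

private var time: Float = 0
private let sampleRate: Double
private let deltaTime: Float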

To initialize these properties, head to the initializer and start by initializing the audioEngine. Next, create two constants that reference the mainMixerNode and outputNode of the audioEngine.

The mainMixerNode is a singleton that is connected to the outputNode on first access. It acts as an intermediary between the source nodes and the outputNode.

Then create a format constant by calling the inputFormat function for bus 0 on the outputNode. The format will provide us with the default audio settings for the device we are working with. For instance, we can set our sampleRate property by accessing the format’s sampleRate property. If the concept of sample rate is new to you, I suggest you take a quick detour and check out The Audio Programmer’s video about the fundamentals of audio software on YouTube.

Next, set the deltaTime float to one over the sampleRate. Delta time is the duration each sample is held for. For instance, if the sampleRate was 44,100 Hz, you would divide one second into 44,100 slices, one for each sample.
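
Putting those steps together, the first part of the initializer could look roughly like this (we’ll add a signal parameter and the node connections later in the article):

// MARK: Init

init() {
    audioEngine = AVAudioEngine()

    let mainMixer = audioEngine.mainMixerNode
    let outputNode = audioEngine.outputNode
    let format = outputNode.inputFormat(forBus: 0)

    sampleRate = format.sampleRate
    deltaTime = 1 / Float(sampleRate)

    // The source node setup and engine start are added later in the article.
}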

We’ve now reached the point where we can start diving into AVAudioSourceNode.

Start by defining a typealias called Signal outside the Synth class definition. Signal will be a closure type which takes in one float to represent time and returns one float for the audio sample:

typealias Signal = (Float) -> (Float)

Now, inside Synth add a variable called signal of type Signal.

Next, add a sourceNode variable of type AVAudioSourceNode. Make sure to lazily initialize sourceNode, as we will be referencing self within its trailing closure. You can hit Enter to auto-complete a basic closure structure. For the first two parameters, type an underscore, as we won’t need them.

The last two should be named frameCount and audioBufferList. Within the block, start by defining a pointer called ablPointer as a UnsafeMutableAudioBufferListPointer with audioBufferList in the initializer. The audioBufferList holds an array of audio buffer structures that we will fill with our custom waveforms. Buffers are used in audio to give applications more than 1/44,100 of a second to generate samples within the render block. Audio buffers generally contain between 128 and 1024 samples.

Next, we create a for-loop to iterate through index values between 0 and our frameCount variable.

In audio, frames are sets of samples that occurred at the same time. In stereo audio, each frame contains two samples: one for the left channel and another for the right. In our case, we’ll be setting both samples to the same value because our synth is monophonic.

Inside of the for-loop, we will obtain the sampleVal by calling our signal closure with Synth’s time property, then advance time with deltaTime.

ablPointer behaves like a collection of audio buffers, which means we can iterate through its contents within a nested for-loop.

For each buffer element, we must cast it to a float pointer. We can then index the buffer at the current frame and set it to the sampleVal found earlier.

Lastly, don’t forget to return noErr if everything succeeds.
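
Here is a sketch of the complete sourceNode property based on the steps above, closely following the render block pattern from Apple’s sample code:

lazy var sourceNode = AVAudioSourceNode { _, _, frameCount, audioBufferList -> OSStatus in
    let ablPointer = UnsafeMutableAudioBufferListPointer(audioBufferList)
    for frame in 0..<Int(frameCount) {
        // Get the next sample from the signal closure, then advance time.
        let sampleVal = self.signal(self.time)
        self.time += self.deltaTime
        // Write the same sample to every channel buffer (our synth is mono).
        for buffer in ablPointer {
            let buf: UnsafeMutableBufferPointer<Float> = UnsafeMutableBufferPointer(buffer)
            buf[frame] = sampleVal
        }
    }
    return noErr
}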

We can now return to the initializer and finish setting up Synth.

Start by adding a parameter to the initializer called signal of type Signal, marked @escaping. Then, inside the initializer, set self.signal to the signal argument.

Create an inputFormat of type AVAudioFormat, filling in the arguments with format’s properties and limiting the channels to one. Then, attach sourceNode to the audioEngine to introduce it to the audio graph.

We can now connect the sourceNode to the mainMixer using inputFormat.

The last step is to connect the mainMixer to the outputNode. You can set mainMixer’s outputVolume to a starting value of 0 as it shouldn’t make sound without a user first requesting to do so.

Finally, we need to call the start method on the audioEngine to initialize the hardware I/O nodes. This function can throw an error, so you need to precede the line with the try keyword and wrap the entire statement in a do-catch block.
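
With those additions, the completed initializer might look something like this. The Oscillator.sine default argument is my own addition so that the parameterless Synth() call behind the shared singleton still compiles (Oscillator.sine is defined later in the article):

// MARK: Init

init(signal: @escaping Signal = Oscillator.sine) {
    audioEngine = AVAudioEngine()

    let mainMixer = audioEngine.mainMixerNode
    let outputNode = audioEngine.outputNode
    let format = outputNode.inputFormat(forBus: 0)

    sampleRate = format.sampleRate
    deltaTime = 1 / Float(sampleRate)

    self.signal = signal

    // Keep the hardware sample rate but limit the source node to one channel (mono).
    let inputFormat = AVAudioFormat(commonFormat: format.commonFormat,
                                    sampleRate: format.sampleRate,
                                    channels: 1,
                                    interleaved: format.isInterleaved)

    audioEngine.attach(sourceNode)
    audioEngine.connect(sourceNode, to: mainMixer, format: inputFormat)
    audioEngine.connect(mainMixer, to: outputNode, format: nil)
    mainMixer.outputVolume = 0

    do {
        try audioEngine.start()
    } catch {
        print("Could not start audioEngine: \(error.localizedDescription)")
    }
}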

Since we have encapsulated all the properties of Synth fairly well, we will need one public accessor method to set the signal of Synth.

// MARK: Public Functions

public func setWaveformTo(_ signal: @escaping Signal) {
    self.signal = signal
}

To start producing audio, we need to make a set of closure expressions which conform to the Signal type. Let’s start by creating a Swift file called Oscillator. Feel free to move the typealias for Signal over to this file, as it should fit in nicely.

Start by creating a struct called Oscillator. Add two static Float variables to Oscillator called amplitude and frequency, giving them initial values of 1 and 440 (Concert A) respectively.

import Foundation

struct Oscillator {
    static var amplitude: Float = 1
    static var frequency: Float = 440
}

Let’s start by building the sine oscillator, as it is the simplest.

Inside Oscillator, create a constant static closure expression called sine with a Float parameter for time and a return type of Float for the samples it will output. Within the closure block, calculate the sine of 2 * pi * Oscillator.frequency * time. Then multiply the output by Oscillator.amplitude and return the result.

If you took trigonometry you’ll know that sine is a periodic function of time with a period equal to (2 * pi) over b, where b is the factor that time or x is being multiplied by before being passed into the function. In our case, b is equal to (2 * pi * Oscillator.frequency). That means the period of our sine wave is (1 / Oscillator.frequency). This makes perfect sense because our frequency is in Hz or cycles per second. If there are 440 cycles per second in a sine wave, each cycle is allotted 1 / 440th of a second.
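
A minimal version of that closure might look like this:

static let sine: Signal = { time in
    return Oscillator.amplitude * sin(2.0 * Float.pi * Oscillator.frequency * time)
}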

The next oscillator we’ll build is the triangle wave. This will also be a constant static closure expression, with the same parameters.

For a triangle wave, we separate the wave into three parts: the initial incline, the decline, and the final incline. We first calculate the period of the triangle wave by dividing one by Oscillator.frequency. However, because there is no triangle function built into the standard library, we have to calculate where the current sample is located relative to the current cycle.

We can find the currentTime constant by taking the floating point remainder of the total time elapsed divided by the period. For example, if time currently equals 17.5 and the period is 5, the remainder of 17.5 / 5 will be 2.5.

With this information, we can create a value constant that holds the progress percentage of the current cycle. This percentage is found by dividing the currentTime (2.5) by the period (5). This shows us that we are exactly 50% of the way through the current waveform.

We can use the value constant to calculate the result sample value. If value is less than 0.25, we are in the first fourth of the triangle and are inclining from 0 to 1. For that reason, we set result equal to 4 times the current value. If value is greater than or equal to 0.25 and less than 0.75, the waveform dips from 1 to -1, so result will equal value multiplied by 4 and subtracted from two. We know from the last if-statement that (value * 4) ended at 1, therefore we will be starting at (2 - 1 = 1), which is the peak. The last else-statement works in a similar way to the previous two. The last step is to convert result back into a Float and multiply by Oscillator.amplitude.
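
A sketch of the triangle oscillator following that logic:

static let triangle: Signal = { time in
    let period = 1.0 / Double(Oscillator.frequency)
    let currentTime = fmod(Double(time), period)
    let value = currentTime / period

    var result = 0.0
    if value < 0.25 {
        result = value * 4
    } else if value < 0.75 {
        result = 2.0 - (value * 4.0)
    } else {
        result = value * 4 - 4.0
    }

    return Oscillator.amplitude * Float(result)
}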

The next two oscillators we will build are the sawtooth and square. Following the same convention as before, these will both be static closure expressions.

We will also reuse the same mathematics applied in the triangle wave to find the location of each sample in the sawtooth and square waves. In the sawtooth oscillator, the percentage value is simply multiplied by 2 (giving a range of 0 to 2). The final result is that value minus 1 (a range of -1 to 1), multiplied by Oscillator.amplitude.

In the square wave, if value is less than 0.5 the closure returns the negative of Oscillator.amplitude. Otherwise, it simply returns Oscillator.amplitude. This limitation to only two states creates abrupt, high-energy transitions, which add a series of upper odd harmonics.
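
Sketches of both oscillators, reusing the same period and progress calculation:

static let sawtooth: Signal = { time in
    let period = 1.0 / Double(Oscillator.frequency)
    let currentTime = fmod(Double(time), period)
    let value = currentTime / period
    return Oscillator.amplitude * (Float(value) * 2 - 1)
}

static let square: Signal = { time in
    let period = 1.0 / Double(Oscillator.frequency)
    let currentTime = fmod(Double(time), period)
    let value = currentTime / period
    return value < 0.5 ? -Oscillator.amplitude : Oscillator.amplitude
}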

Last but not least, we will create a whiteNoise oscillator. This is by far the easiest to program and wrap one’s mind around, as it is simply random Float sample values.

With the release of Swift 4.2, primitive numerical types such as Float got a new static method called random(in:). With this new feature we can simply pass in a closed range from -1 to 1 and multiply the result by Oscillator.amplitude to obtain the sample.
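
Which makes whiteNoise the shortest oscillator of the bunch:

static let whiteNoise: Signal = { _ in
    return Oscillator.amplitude * Float.random(in: -1...1)
}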

Awesome, we did it!

We now have a 5-waveform oscillator built entirely with Swift. Let’s head back to SynthViewController so that we can give it a proper user interface.

For starters, we’ll need a UISegmentedControl to switch between our waveforms. The square icons I used can be found here. I did not create these icons, so please don’t use them commercially without first purchasing them from this artist’s page on TheNounProject.

The other component we will add is a UILabel to display the frequency and amplitude of the current waveform being produced. We’ll initialize these private variables lazily so that we can access self within the closure expression, as seen below:
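
Here is a rough sketch of those two lazy properties. The asset names in the waveforms array and the parameterLabel name are placeholders, so substitute whatever you used in your project (the updateOscillatorWaveform selector is implemented a little later):

private lazy var waveformSelectorSegmentedControl: UISegmentedControl = {
    // Placeholder asset names for the five waveform icons.
    let waveforms = ["Sine", "Triangle", "Sawtooth", "Square", "Noise"]
    let images = waveforms.compactMap { UIImage(named: $0) }
    let control = UISegmentedControl(items: images)
    control.selectedSegmentIndex = 0
    control.addTarget(self, action: #selector(updateOscillatorWaveform), for: .valueChanged)
    control.translatesAutoresizingMaskIntoConstraints = false
    return control
}()

private lazy var parameterLabel: UILabel = {
    let label = UILabel()
    label.text = "Frequency: \(Oscillator.frequency) Hz  Amplitude: \(Oscillator.amplitude)"
    label.textColor = .white
    label.translatesAutoresizingMaskIntoConstraints = false
    return label
}()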

Then we can create two new private functions called setUpView and setUpSubviews. They’ll both be called at the start of our app’s lifecycle in the viewDidLoad function.
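
A minimal version of those functions might look like this; the background color and constraints are just one reasonable layout:

// MARK: Lifecycle

override func viewDidLoad() {
    super.viewDidLoad()
    setUpView()
    setUpSubviews()
}

// MARK: Setup

private func setUpView() {
    view.backgroundColor = .black
    view.isMultipleTouchEnabled = false
}

private func setUpSubviews() {
    view.addSubview(waveformSelectorSegmentedControl)
    view.addSubview(parameterLabel)

    NSLayoutConstraint.activate([
        waveformSelectorSegmentedControl.topAnchor.constraint(equalTo: view.safeAreaLayoutGuide.topAnchor, constant: 16),
        waveformSelectorSegmentedControl.centerXAnchor.constraint(equalTo: view.centerXAnchor),
        parameterLabel.topAnchor.constraint(equalTo: waveformSelectorSegmentedControl.bottomAnchor, constant: 16),
        parameterLabel.centerXAnchor.constraint(equalTo: view.centerXAnchor)
    ])
}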

Let’s implement the updateOscillatorWaveform function that we added as a selector to our waveformSelectorSegmentedControl.

First, to interface between the UISegmentedControl’s index values and the oscillator type, create an enum of type Int called Waveform somewhere within your Oscillator.swift file.

Then add the following cases: sine, triangle, sawtooth, square, and whiteNoise. They should be in exactly that order, as long as you copied the images array exactly the way I wrote it within the declaration of waveformSelectorSegmentedControl.

enum Waveform: Int {
    case sine, triangle, sawtooth, square, whiteNoise
}

struct Oscillator {
...

The actual implementation of updateOscillatorWaveform will involve calling the rawValue initializer on Waveform with waveformSelectorSegmentedControl’s selectedSegmentIndex property.

Then, a switch statement can be defined with the resulting waveform. For each of the five cases, call Synth.shared.setWaveformTo with the respective Oscillator waveform.
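
A sketch of that selector, using a guard instead of force-unwrapping the optional returned by the rawValue initializer:

// MARK: Selector Functions

@objc private func updateOscillatorWaveform() {
    guard let waveform = Waveform(rawValue: waveformSelectorSegmentedControl.selectedSegmentIndex) else { return }
    switch waveform {
    case .sine: Synth.shared.setWaveformTo(Oscillator.sine)
    case .triangle: Synth.shared.setWaveformTo(Oscillator.triangle)
    case .sawtooth: Synth.shared.setWaveformTo(Oscillator.sawtooth)
    case .square: Synth.shared.setWaveformTo(Oscillator.square)
    case .whiteNoise: Synth.shared.setWaveformTo(Oscillator.whiteNoise)
    }
}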

We should also quickly implement a setPlaybackStateTo(state:) function to turn our synth on and off. This function will simply take in a boolean called state and set Synth.shared.volume to 0.5 or 0 using a ternary operator.
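
A minimal sketch:

private func setPlaybackStateTo(_ state: Bool) {
    Synth.shared.volume = state ? 0.5 : 0
}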

Finally, we use the touch-related methods already included in UIViewController to allow the user to manipulate the oscillator’s pitch.
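
One possible implementation is below. The setSynthParametersFrom(_:) helper and the exact mapping from touch position to frequency and amplitude are my own choices, so adjust them to taste:

// MARK: Touches

override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
    guard let touch = touches.first else { return }
    setPlaybackStateTo(true)
    setSynthParametersFrom(touch.location(in: view))
}

override func touchesMoved(_ touches: Set<UITouch>, with event: UIEvent?) {
    guard let touch = touches.first else { return }
    setSynthParametersFrom(touch.location(in: view))
}

override func touchesEnded(_ touches: Set<UITouch>, with event: UIEvent?) {
    setPlaybackStateTo(false)
}

// Map the x position to frequency and the y position to amplitude,
// then refresh the label so the user can see the current values.
private func setSynthParametersFrom(_ location: CGPoint) {
    Oscillator.frequency = Float(location.x / view.bounds.width) * 1000 + 32
    Oscillator.amplitude = 1 - Float(location.y / view.bounds.height)
    parameterLabel.text = String(format: "Frequency: %.0f Hz  Amplitude: %.2f",
                                 Oscillator.frequency, Oscillator.amplitude)
}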

That’s it! Build it and run the app on an iOS 13 device or simulator to test it.

I hope you had a great time learning how to build synths in Swift. If you have any bugs or if I made any mistakes, please feel free to leave a comment below.

If you want to download the final Xcode project, you can find it on GitHub here. Also, a lot of the code I used in this tutorial was found in Apple’s sample project here. If you want to learn more about what’s new in AVAudioEngine, you can find the WWDC video here.

Until next time!
