One-Bit Symphony

Photo by Denisse Leon / Unsplash

Who doesn't know one-bit music! Well, actually, um, all the Atari and Commodore owners, whose computers had multi-channel sound chips. It was us on the Spectrum who had to make do with a single wire that carried either a 0 or a 1, and the sound you got depended on how fast you toggled it in software…

Of course, it wasn't particularly glorious, but it worked, and if you put a little effort into it, the results were really interesting. So how do you do that?

A little theory to start with… Yeah, it's necessary!

Sound, as is well known, is a wave transmitted through air or another medium. The basic type of wave is a sine wave.

A tone is the basic building block of music. Its most important parameter is frequency, given in hertz (Hz), i.e. the number of cycles per second.

Concert A (A4) is a tone that usually has a frequency of 440 Hz. (Why usually? Because there are tunings that put concert A elsewhere, e.g. at 442 Hz; Handel used 409 Hz, La Scala in the 18th century had its concert A at 451 Hz…)

According to classical music theory, an octave is an interval spanning eight notes of the scale, or 12 semitones. Physically, a tone an octave higher has twice the frequency in Hz. If concert A has a frequency of 440 Hz, the A an octave higher has 880 Hz. Since the frequency of vibration is inversely proportional to the length of the vibrating object, a string of length L produces some tone and a string of length L/2 produces a tone an octave higher. If you have a guitar, measure the distance from the nut to the bridge and check that the fret at half that distance (the twelfth) produces a tone an octave higher.

Sampling is the process of turning a continuous waveform, such as a sine wave, into a discrete one. This is done by taking a sample of the current value as a number at regular intervals and storing it. If we then set the output to those values at the same intervals, we get a waveform that matches the original within some tolerance.

The sampling frequency (sample rate) is the frequency at which we sample. For us, with one-bit music, it is the rate at which we are able to change the value at the output. It follows that if we alternate ones and zeros at this rate, we get a rectangular waveform of half that frequency, and this is also the highest frequency we can play. (Which, among other things, is a nice illustration of the Nyquist-Shannon sampling theorem.)

The sinusoid is the basic sound waveform. Fortunately, no instrument produces an exactly sinusoidal sound, which is good, because a pure sinusoid sounds very flat, dull, and colorless.

For us, amplitude is the same as loudness: the greater the deviation of the wave from the neutral level, the louder the sound we perceive. The ear perceives only the absolute deviation; it does not distinguish its direction, positive or negative (for sound travelling through air: it does not distinguish rarefaction from compression, it only perceives the difference from normal pressure).

The timbre (colour) of a sound is determined in the real world by the construction of the instrument, the material, the soundboard. Each of these factors adds some other component to the basic tone, a different vibration with a different intensity, giving the sound its characteristic colour. In complex musical instruments, different components resonate at different frequencies, and the combination of these resonances is unique and characteristic of the instrument. Simple instruments, such as various pipes, have a simple sound. Complex instruments, such as those where the tone is produced by plucking, have a sound composed of a variety of frequencies whose amplitudes change over time…

A sawtooth is a waveform that looks, well, like the teeth of a saw. Oddly enough, a sawtooth can easily be constructed from sine waves: we take the fundamental, say 440 Hz, add one twice as fast (880 Hz) with half the amplitude, add one three times as fast (1320 Hz) with one-third the amplitude… And if we carry on indefinitely, we get a sawtooth waveform.

A harmonic frequency is one that is an integer multiple of the fundamental frequency. The first harmonic is equal to the fundamental frequency, the second harmonic is double, the third triple… We could simplify the previous definition of the sawtooth by saying that it is the sum of all harmonics (first, second, third, …) with amplitude inversely proportional to the harmonic number (1, 1/2, 1/3, …). See also tuning.

The triangle is another popular waveform. Like a sawtooth, a triangle can be constructed by summing harmonics, this time only the odd ones (1st, 3rd, 5th, …), with amplitude falling off with the square of the harmonic number (1, 1/9, 1/25, …) and with alternating sign.

The square is the last of the holy quartet of fundamental waveforms. Instead of a smooth descent or ascent, it has only two values, a maximum and a minimum. It is constructed in the same way as a triangle, by summing odd harmonics, but this time the amplitude does not fall off with the square of the harmonic number, only in inverse proportion to it (1, 1/3, 1/5, …).

Noise is a random signal.

CC-BY-SA, source: Wikimedia https://commons.wikimedia.org/wiki/File:Waveforms.svg

The decomposition of basic periodic functions into sine sums is itself a very interesting part of mathematics, and if you're more interested, the keywords are Fourier transform and Fourier series.

Source: https://cs.wikibooks.org/wiki/Soubor:Synthesis_square.gif
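To get a feel for how these sums behave, here is a minimal Python sketch that adds up the first few harmonics of each waveform. It has nothing to do with any Spectrum routine; the function name `partial_sum`, the default of 50 harmonics and the 440 Hz fundamental are assumptions made only for illustration. The weights are the standard Fourier series ones: 1/n for the sawtooth and the square, 1/n² with alternating sign for the triangle.

```python
import math

def partial_sum(kind, t, f0=440.0, n_harmonics=50):
    """Value at time t (seconds) of a truncated Fourier series.

    kind is "sawtooth", "square" or "triangle" (hypothetical labels).
    The result is not normalised to the range -1..1.
    """
    total = 0.0
    for n in range(1, n_harmonics + 1):
        phase = 2 * math.pi * n * f0 * t
        if kind == "sawtooth":
            total += math.sin(phase) / n            # every harmonic, weight 1/n
        elif kind == "square" and n % 2 == 1:
            total += math.sin(phase) / n            # odd harmonics, weight 1/n
        elif kind == "triangle" and n % 2 == 1:
            sign = (-1) ** ((n - 1) // 2)           # +, -, +, - for n = 1, 3, 5, 7
            total += sign * math.sin(phase) / n**2  # odd harmonics, weight 1/n^2
    return total

# One quarter of a period into the cycle:
quarter = 1 / (4 * 440.0)
square_value = partial_sum("square", quarter)       # tends to pi/4 as harmonics are added
triangle_value = partial_sum("triangle", quarter)   # tends to pi^2/8 (the triangle's peak)
```

Plotting `partial_sum` over one period with 3, 10 and 50 harmonics shows the shape gradually sharpening into the familiar waveform.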

Square rules them all

Single-bit music is generated, as you either know from experience or can guess because you're smart, by changing the value of a single bit: it's either 0 or 1. The result, when you think about it, is a square (or, better said, rectangular) waveform. It won't end up being exactly rectangular, because there are various circuits on the way from the processor to the amplifier that distort the rectangle a little, but that's not important for us right now. The important thing is that we can forget all about sinusoids and triangles, because we will always and only generate a dull rectangle.

If we want to play, for example, a tone with a frequency of 1000 Hz, it means that a thousand times per second (i.e. every millisecond) both halves of the cycle, the 1 and the 0, must take place. So the easiest way to do this is to hold logic 1 for half a millisecond, then logic 0 for another half a millisecond, for a total of one millisecond, and if we keep repeating this, we get a perfect thousand-hertz rectangle!

When the signal is 1 half the time and 0 half the time, we say it has a 1:1 duty cycle (i.e. 50%). It's on for the same amount of time as it is off. The Spectrum 48 will generate such a signal for us on demand: yes, this is what BEEP plays.

But what if we halve the time spent at logic 1? The result has a 1:3 duty cycle (25%, i.e. one quarter of the time at 1, three quarters at 0) and sounds different, sharper, although we still perceive it as "just as high". Jonathan Smith's playing routine used in Ping Pong, for example, works with the duty cycle.
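The difference between the two duty cycles is easy to see if we write the output out as a list of bits. The following is only a sketch in Python, not Spectrum code: the function name `pulse_wave` and the 8000 Hz sample rate are assumptions chosen so that one cycle of a 1000 Hz tone is exactly eight samples long.

```python
def pulse_wave(freq_hz, duty, sample_rate, n_samples):
    """Return a list of 0/1 output levels, one per sample.

    duty is the fraction of each cycle spent at logic 1
    (0.5 for a 1:1 ratio, 0.25 for 1:3).
    """
    period = sample_rate / freq_hz             # samples per cycle
    bits = []
    for i in range(n_samples):
        position = (i % period) / period       # 0.0 .. 1.0 within the current cycle
        bits.append(1 if position < duty else 0)
    return bits

one_to_one = pulse_wave(1000, 0.50, 8000, 8)    # [1, 1, 1, 1, 0, 0, 0, 0]
one_to_three = pulse_wave(1000, 0.25, 8000, 8)  # [1, 1, 0, 0, 0, 0, 0, 0]
```

Both lists repeat after eight samples, so the pitch is the same; only the share of ones within the cycle differs.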

Heartland and Special FX's games (Firefly, for example) had strange buzzing music. BTW, the routine for Special FX was again written by Jonathan Smith (he died a few years ago), and it was also used in the famous Orpheus editor. It uses a 1:N duty cycle, i.e. we turn on logic 1 only for a brief moment at the beginning of each cycle and then hold logic 0 until the end of the period. (How brief that moment is depends on the sampling rate, i.e. the rate at which our routine is able to switch between 1 and 0.)

Frequency division

So how do we play each note? If I want to play 440 Hz, what exactly does that mean?

Let's say we have a Spectrum 48 whose CPU clock frequency is 3.5 MHz. That's three and a half million hertz. The familiar T, the duration of one clock cycle, is therefore 0.285714 microseconds.

The frequency of the tone f is 440 Hz. Thus, 440 cycles take place in one second, which means that one cycle of the tone lasts 3500000/f = 7954.545454 T.

Since you're assembly programmers, we won't explain here how to work out how many T an instruction takes; just know that if we flip the output from logic 0 to logic 1 (or vice versa) every 3977 T or so, the Spectrum will play a beautiful rectangular concert A.
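The arithmetic above fits in a few lines of Python. This is just a calculator sketch; the constant and function names are made up for illustration, and only the 3.5 MHz clock and the 440 Hz tone come from the text.

```python
CLOCK_HZ = 3_500_000                      # Spectrum 48 CPU clock from the text

def t_state_microseconds():
    """Duration of one T in microseconds."""
    return 1_000_000 / CLOCK_HZ

def t_states_per_cycle(freq_hz):
    """How many T one full cycle of the tone lasts."""
    return CLOCK_HZ / freq_hz

def t_states_per_toggle(freq_hz):
    """How many T pass between two flips of the output (half a cycle)."""
    return t_states_per_cycle(freq_hz) / 2

cycle = t_states_per_cycle(440)           # about 7954.5 T
toggle = round(t_states_per_toggle(440))  # 3977 T
```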

But we can't change the output at the clock frequency: the fastest instruction takes 4 T, plus we need an OUT, which takes at least 11 T, plus some counting… Well, let's suppose that one pass through the loop of our playing program, the part where it decides whether the output is now 0 or 1, takes, say, 120 T. By simply dividing 3500000/120 we find that this loop runs roughly 29166 times per second, giving us a sampling rate of about 29 kHz. If we simply alternate 1s and 0s, we generate a tone at about 14.6 kHz, which is a pretty high squeal. It's not ultrasonic, but it's very high, and one should still hear it (the upper limit of audible frequency is about 20 kHz, more in youth, getting worse towards old age).

The playing loop works, roughly and with some simplification, by counting its own passes, i.e. how many times it has run. When the counter reaches the value D (the divisor), it starts again from zero, because that means a new cycle of the tone begins. During that time the output has to be changed according to the desired waveform. Example: for the "Heartland" 1:N waveform, the output will be logic 1 only when the counter is equal to 0, otherwise it will be logic 0. Thus, logic 1 is output for a single pass of the loop, i.e. 120 T.
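Here is the counting idea as a small Python simulation. It is only a model of the logic described above (one list element per pass of the loop), not a real playing routine, and the function name `play_one_to_n` is an assumption.

```python
def play_one_to_n(divisor, passes):
    """Simulate `passes` runs of the loop; return the output bit of each run."""
    counter = 0
    output = []
    for _ in range(passes):
        output.append(1 if counter == 0 else 0)  # pulse only at the start of a cycle
        counter += 1
        if counter == divisor:                   # cycle finished, start a new one
            counter = 0
    return output

bits = play_one_to_n(4, 8)   # [1, 0, 0, 0, 1, 0, 0, 0]
```

With a loop that takes 120 T, every element of the list stands for 120 T of real time.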

What is the divisor, then? It is proportional to the period of the tone (1/f), and the formula is: D = M / f / Tcyc. M is the clock frequency (3500000), f is the desired tone frequency, Tcyc is the duration of one pass through the loop in clock cycles (T), and the result is the number of passes through the loop per cycle of the resulting tone. For concert A, this gives us 66.287878…, and since we're on the Spectrum and have integer registers, 66. A tone an octave higher will have a divisor of 33 (the smaller the divisor, the higher the tone), and one an octave lower will have a divisor of 133 (rounded).
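The formula D = M / f / Tcyc, written as a Python sketch. The constant and function names are assumptions; the 120 T loop is the hypothetical one from the text, and rounding to the nearest integer is my choice to match the 66, 33 and 133 quoted above.

```python
CLOCK_HZ = 3_500_000      # M
LOOP_T_STATES = 120       # Tcyc of our hypothetical loop

def divisor(freq_hz):
    """Number of loop passes per cycle of the tone, rounded to an integer."""
    return round(CLOCK_HZ / freq_hz / LOOP_T_STATES)

a4 = divisor(440)   # 66
a5 = divisor(880)   # 33
a3 = divisor(220)   # 133
```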

What are the maximum and minimum frequencies of our hypothetical routine? The maximum frequency is the one we get with a divisor of 2 (if it were 1, the value at the output would not change): plugged into the formula, it comes out to 14.6 kHz. The minimum frequency, with an 8-bit counter, will be 114 Hz.
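The same formula turned around gives the frequency for a given divisor. Again only a sketch with made-up names; taking 255 as the largest 8-bit divisor is an assumption (a counter that wraps at 256 would give almost the same result).

```python
CLOCK_HZ = 3_500_000
LOOP_T_STATES = 120       # the hypothetical loop length from the text

def frequency(divisor):
    """Tone frequency in Hz produced by a given divisor."""
    return CLOCK_HZ / LOOP_T_STATES / divisor

highest = frequency(2)     # about 14583 Hz
lowest = frequency(255)    # about 114 Hz
```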

Now all that remains is to calculate the values for the individual semitones. The ratio of the frequencies of two adjacent semitones is the twelfth root of two; fortunately, you don't have to do the math and can use the table. (From it we also learn that our lowest frequency lies between A2 and A♯2 (B♭2), and the highest is about A9.) We get a table of divisors, [something like this](https://docs.google.com/spreadsheets/d/1fSbAVHtOm1pXRo90QyolqM66l7d_hT51omESPoz3Ack/pubhtml).
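If you would rather compute the table than look it up, a sketch like this will do. It assumes equal temperament with the A at 440 Hz, English note names (so B is B natural), the hypothetical 120 T loop and rounding to the nearest integer; the function names are mine and the result is not guaranteed to match the linked spreadsheet cell for cell.

```python
CLOCK_HZ = 3_500_000
LOOP_T_STATES = 120
A4_HZ = 440.0
SEMITONE = 2 ** (1 / 12)          # frequency ratio of two adjacent semitones
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_frequency(semitones_from_a4):
    """Equal-tempered frequency a given number of semitones above (or below) A4."""
    return A4_HZ * SEMITONE ** semitones_from_a4

def divisor_table(octave):
    """Map note name -> divisor for one octave (octave 4 contains the 440 Hz A)."""
    table = {}
    for i, name in enumerate(NOTE_NAMES):
        semitones = (octave - 4) * 12 + (i - 9)   # A is index 9 in NOTE_NAMES
        table[name] = round(CLOCK_HZ / note_frequency(semitones) / LOOP_T_STATES)
    return table

octave_4 = divisor_table(4)       # octave_4["A"] is 66, octave_4["C"] is 111
```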

Since we're using integer arithmetic, the frequencies won't be exact, and the detuning becomes very audible where many notes crowd into a few small divisor values (i.e., at the high notes). In the seventh octave there are already several notes per divisor, and the sixth octave is also audibly out of tune.
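How far out of tune? Musicians measure that in cents (a hundredth of a semitone), so here is a rough sketch that compares the frequency we asked for with the one the integer divisor really produces. The names and the 120 T loop are assumptions as before; 3520 Hz and 3729.31 Hz are the equal-tempered A and A♯ of the seventh octave.

```python
import math

CLOCK_HZ = 3_500_000
LOOP_T_STATES = 120

def divisor(freq_hz):
    return round(CLOCK_HZ / freq_hz / LOOP_T_STATES)

def detune_cents(target_hz):
    """Error of the played tone against the wanted one; positive means sharp."""
    actual_hz = CLOCK_HZ / LOOP_T_STATES / divisor(target_hz)
    return 1200 * math.log2(actual_hz / target_hz)

low_error = detune_cents(440)      # a few cents, hard to notice
high_error = detune_cents(3520)    # well over half a semitone
shared = divisor(3520) == divisor(3729.31)   # True: two notes, one divisor
```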

Multiple voices

So far we have covered the theory of a single note. But one-bit music can play multiple notes at once, even three, or five in the hands of madmen like Tim Follin. How do you do it?

Well, you can either switch between notes quickly, playing a C for a moment, then an E for a moment, and keep alternating like that, as the very oldest BASIC games do, for example. Or you can layer the two tones on top of each other, i.e. you have two counters in the loop, one for each tone, so you get two waveforms, and the output is then the OR or the AND of those waveforms. (For the 1:N duty cycle, the AND makes no sense.)
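The two-counter variant can be modelled the same way as a single voice. This is a sketch under the assumption that both voices use short 1:N pulses; the function name `mix_two_voices` and the tiny divisors 3 and 4 are only there to keep the lists readable.

```python
def mix_two_voices(div_a, div_b, passes, combine="or"):
    """Two 1:N voices in one loop; returns the combined output bit per pass."""
    counter_a = 0
    counter_b = 0
    output = []
    for _ in range(passes):
        bit_a = 1 if counter_a == 0 else 0
        bit_b = 1 if counter_b == 0 else 0
        if combine == "or":
            output.append(bit_a | bit_b)
        else:
            output.append(bit_a & bit_b)
        counter_a = (counter_a + 1) % div_a
        counter_b = (counter_b + 1) % div_b
    return output

mixed_or = mix_two_voices(3, 4, 12, "or")    # [1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0]
mixed_and = mix_two_voices(3, 4, 12, "and")  # [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

The AND result shows why it is useless with short pulses: a 1 appears only when both pulses happen to coincide, so almost nothing is left of either tone.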

Is it that simple? Well, yes, it is. Theoretically. In practice, you run into a lot of snags. For example, if you use a generator with a 1:N duty cycle and play two notes that are exactly an octave apart, the pulses of the lower one will coincide with pulses of the higher one and you'll only hear the higher one (this is what the huby routine suffers from, for example).

You can solve it: either you detune one tone slightly, i.e. shift its frequency a bit, e.g. by adding 1 to the divisor (which you do with the lower tone), or you shift its phase (i.e. you don't send logic 1 when the counter is 0, but when it is, say, 4). While this makes both tones audible, it also adds parasitic frequencies to the result. On the other hand, those get lost in the rectangular noise…
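Both the problem and the two cures can be seen on a toy example. The sketch below uses divisors 4 and 8 for the octave pair, and its helper names are hypothetical. Note one catch it makes visible: the phase shift only helps if it is not a multiple of the higher tone's divisor, otherwise the pulses land on top of each other again.

```python
def one_to_n(divisor, passes, phase=0):
    """1:N pulse train: logic 1 only when the pass counter equals `phase`."""
    return [1 if (i % divisor) == phase else 0 for i in range(passes)]

def mix(voice_a, voice_b):
    """OR two bit lists together, pass by pass."""
    return [a | b for a, b in zip(voice_a, voice_b)]

high = one_to_n(4, 16)
low = one_to_n(8, 16)                   # exactly an octave lower

merged = mix(high, low)                 # identical to `high`: the low voice vanished
shifted = mix(high, one_to_n(8, 16, phase=2))   # low pulses now fall between high ones
detuned = mix(high, one_to_n(9, 16))    # divisor + 1: pulses drift apart over time
```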

Another snag: with other duty cycles (1:1, for example), two tones played together sound as if they had half the volume of a single tone.

One more detail: when we change the first number in a 1:N duty cycle (i.e. 2:N, 3:N, 4:N), we can simulate a change in volume, up to a certain limit. But if we choose the first number too large, then at higher frequencies the duty cycle turns into something close to 1:1, and again we have a different problem.
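A quick back-of-the-envelope sketch of that limit: the share of each cycle spent at logic 1 is simply the pulse length divided by the divisor. The function name is made up, and the divisors 66 and 8 are just examples of a low and a high note for a 120 T loop.

```python
def duty_fraction(pulse_passes, divisor):
    """Share of one cycle spent at logic 1 for a pulse lasting `pulse_passes` passes."""
    return min(pulse_passes, divisor) / divisor

quiet_low = duty_fraction(1, 66)    # about 1.5 % of the cycle
loud_low = duty_fraction(4, 66)     # about 6 %, still a narrow pulse
loud_high = duty_fraction(4, 8)     # 0.5, i.e. an ordinary 1:1 square
```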

If two tones are played that are only slightly out of tune with each other, the resulting waveform exhibits what are called beats. This is because when two frequencies sound at once, we also perceive a third one equal to their difference. This can also be used in generating sounds, but we'll save that for another time.
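To see how fast such beats are, take two neighbouring divisors and subtract the resulting frequencies. A sketch with assumed names and the hypothetical 120 T loop:

```python
CLOCK_HZ = 3_500_000
LOOP_T_STATES = 120

def beat_frequency(div_a, div_b):
    """Difference of the two tone frequencies in Hz."""
    loop_rate = CLOCK_HZ / LOOP_T_STATES
    return abs(loop_rate / div_a - loop_rate / div_b)

beats = beat_frequency(66, 67)   # roughly 6.6 Hz, a clearly audible wobble
```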

To be continued...

What haven't we discussed yet? For example, how to make percussion, i.e. drums (using noise and slides). How to make sound effects (glissando, tremolo, ornaments, arpeggios). What detune is and what phase is. What the envelope is for. Which playing routines, editors… are available and how each of them works. And maybe a lot of other things. And since I know there are experienced composers of one-bit symphonies among the readers, this time I'll ask them: contribute a topic, a piece of advice, a note… Thank you!

Martin Maly

Programmer, journalist, writer and electronic hobbyist. Vintage CPU lover. Creating new computers with the spirit of 80's.
Czechia