Modeling rhythms using numbers – part 2

This is a continuation of my previous post on modeling rhythms using numbers.

Euclidean rhythms

The Euclidean rhythm in music was identified by Godfried Toussaint in 2004 and is described in his 2005 paper “The Euclidean Algorithm Generates Traditional Musical Rhythms”. The Euclidean algorithm, which normally computes the greatest common divisor of two numbers, is used rhythmically: the two numbers give the number of beats and silences, and the resulting patterns generate a large share of the important world-music rhythms.

Do it yourself

You can play with a slightly generalized version of Euclidean rhythms in your browser, using a p5js-based sketch I made to test my understanding of the algorithms involved. If it doesn't work in your preferred browser, retry with Google Chrome.

The code

The code may still evolve in the future. There are some possibilities not explored yet (e.g. using ternary number systems instead of binary to drive 3 sounds per circle). You can download the full code for the p5js sketch on GitHub.
Screenshot of the p5js sketch running.

The theory

So what does it do and how does it work? Each wheel contains a number of smaller circles. Each small circle represents a beat. With the length slider you decide how many beats are present on a wheel.
Some beats are colored dark gray (these can be seen as strong beats), whereas other beats are colored white (weak beats). Strong and weak beats can each be assigned a different instrument. The target pattern length determines how many weak beats exist between the strong beats. Of course it's not always possible to honor this request: in a cycle of 5 beats with a target pattern length of 3 beats (left wheel in the screenshot) we get one phrase of 3 beats that conforms to the target pattern length, and one phrase consisting of the 2 remaining beats that makes a “best effort” to comply with the target pattern length. 
Technically this is accomplished by running Euclid's algorithm. This algorithm is normally used to calculate the greatest common divisor of two numbers, but here we are mostly interested in the intermediate results of the algorithm. In Euclid's algorithm, to calculate the greatest common divisor of an integer m and a smaller integer n, the smaller number n is repeatedly subtracted from the greater number until what is left is zero or smaller than n; this leftover is called the remainder. The remainder is then repeatedly subtracted from the smaller number to obtain a new remainder. This process is continued until the remainder is zero. When that happens, the smaller number of that step is the greatest common divisor of the original two numbers n and m.
Let's try it out on the situation of the left wheel in the screenshot. The greater number m is 5 (length) and the smaller number n is 3 (target pattern length). Now the recipe says to repeatedly subtract 3 from 5 until you get something smaller than 3. We can do this exactly once:
5 − (1)·3 = 2
We can rewrite this as:
5 = (1)·3 + 2
This we can interpret as: the cycle of 5 beats is to be decomposed into 1 phrase of 3 beats, followed by a phrase of 2 beats (the remainder). Each phrase consists of a single strong beat followed by all weak beats. In a symbolic representation more easily read by musicians one might write: x..x. (In the notation of the previous part of this article one could also write 10010.)
Euclid's algorithm doesn't stop here. Now we have to repeatedly subtract the remainder 2 from the smaller number 3:
3 = (1)·2 + 1
This in turn can be read as: the phrase of 3 beats can be further decomposed into 1 phrase of 2 beats followed by a phrase consisting of 1 beat. In a symbolic representation this is x.x, and Euclid continues:
2 = (2)·1 + 0
The phrase of two beats can be represented symbolically as xx. We've reached remainder 0 and Euclid stops: apparently the greatest common divisor of 5 and 3 is 1.
Now it's time to realize what we really did: 
  • We decomposed a phrase of 5 beats into a phrase of 3 beats and a phrase of 2 beats, making the rhythm x..x. 
  • Then we further decomposed the phrase of 3 beats into a phrase of 2 beats followed by a phrase of 1 beat. 
  • We can substitute this refined 3-beat phrase in our original rhythm of 5 = 3+2 beats to get a rhythm consisting of 5 = (2 + 1) + 2 beats: x.xx. 
I hope it's clear by now that by choosing how long to continue running Euclid's algorithm, we can decide how fine-grained we want our rhythms to become. This is where the max pattern length slider comes into play.

The length slider and the target pattern length slider determine a rough division between strong and weak beats by running Euclid's algorithm just once, whereas the max pattern length slider lets you decide how long to carry on Euclid's algorithm to further refine the generated rhythm.
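The refinement procedure described above can be condensed into a short Python sketch. The function names are my own and this is only an illustration of the idea; the real implementation is the p5js sketch on GitHub:

```python
def euclidean_phrases(length, target, max_depth):
    """Split `length` beats into phrases by iterating Euclidean division.

    Each pass rewrites the current longer phrase length m as m = q*n + r,
    replacing every m-beat phrase by q phrases of n beats plus, if r > 0,
    one phrase of r beats. More passes give a finer-grained rhythm.
    """
    phrases = [length]
    m, n = length, target
    for _ in range(max_depth):
        if n == 0:
            break  # remainder reached zero: Euclid's algorithm has finished
        q, r = divmod(m, n)
        phrases = [p2 for p in phrases
                   for p2 in (([n] * q + ([r] if r else [])) if p == m else [p])]
        m, n = n, r
    return phrases

def to_pattern(phrases):
    # each phrase: one strong beat 'x' followed by only weak beats '.'
    return "".join("x" + "." * (p - 1) for p in phrases)
```

For the left wheel of the screenshot (length 5, target pattern length 3), one pass yields x..x. and a second pass refines it to x.xx., matching the worked example above.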


Modeling rhythms using numbers

Context

I like to dabble a bit in generative music from time to time. While thinking about how to generate percussion patterns I was wondering about compact representations of rhythm. This blog entry documents my current approach (which, as usual, may or may not exist already, and may or may not be useful to you).
(Note to self: this blog entry lacks some pictures for clarity.) 

Encoding rhythm in a number

Consider the following simple rock beat:
This beat has 3 voices: 
  • the upper voice represents the hi-hat
  • the middle voice represents the snare drum
  • the lower voice represents the bass drum (kick)

Simple case: a voice has notes with equal duration (e.g. upper staff)

In the upper staff, there's an easy conversion between notes being present/absent and the bits in a binary number.

For the hi-hat, consider each measure as consisting of 8 8th notes. To each 8th note we can associate a bit in a binary number. Since an 8th note is played on every subdivision, all bits are set to one. The hi-hat in the first measure can therefore be represented as the binary number (1111 1111), which can be written as the decimal number 255, with a resolution of 2 (the resolution is the number of bits per beat; here it's 2 because there are 2 8th notes per beat).
The bass drum can be seen as consisting of 4th notes. There's a kick on the first and third beats, but not on the second and fourth beats of the measure. The bass drum voice in the first measure can therefore be represented as the binary number (1010) (decimal 10) with a resolution of 1. 
The snare drum also consists of 4th notes. There's a snare hit on beats 2 and 4, but not on beats 1 and 3 of the measure. For this reason, the snare drum can be represented as the binary number (0101) (decimal 5) with a resolution of 1. 
The complete first measure therefore can be summarized as:
  • hi-hat: 255 (resolution: 2)
  • snare drum: 5 (resolution: 1)
  • bass drum (kick): 10 (resolution: 1)
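These binary encodings can be reproduced with Python's built-in int, which accepts an explicit number base. A small sketch (variable names are mine, not from the post):

```python
def voice_to_int(pattern):
    # read a presence/absence string such as "1010" as a binary number
    return int(pattern, 2)

# first measure of the rock beat, as described in the prose above
hihat = voice_to_int("11111111")  # every 8th note played, resolution 2
kick  = voice_to_int("1010")      # kick on beats 1 and 3, resolution 1
snare = voice_to_int("0101")      # snare on beats 2 and 4, resolution 1
```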

Second case: a voice has notes with unequal duration (e.g. lower staff)

The simple notation we used before no longer suffices. The bass drum voice has a kick on the first beat, and one on the second half of the second beat. As a first idea, we can pretend that the notes are written as 8th notes that are tied together. In that case the bass drum could almost be modeled as (1101) with a resolution of 2, except that this number doesn't express that the first two 8th notes are tied together to form one longer kick.
To overcome this limitation, we introduce a new digit, 2, which indicates that the current note is present and tied to the next note (whereas the digit 1 indicates that the current note is present but not tied to the next one). An accurate representation for the bass drum in the lower staff therefore is (2101) with a resolution of 2. Because of the digit 2, this is no longer a binary number, but it can be interpreted as a ternary number (a number in number base 3). (2101) in number base 3 corresponds to the decimal number 64.
Since one cannot tie a note to a rest, the number combination 20 doesn't make any sense and for all practical purposes can be replaced with 10. 
Without loss of generality, we can also interpret the numbers of the upper staff as numbers in number base 3. The upper staff therefore is modeled as:
  • hi-hat: (1111 1111) in number base 3, or 3280 in number base 10, (resolution: 2)
  • snare drum (0101) in number base 3, or 10 in number base 10, (resolution: 1)
  • bass drum (kick): (1010) in number base 3, or 30 in number base 10, (resolution: 1)
The lower staff is modeled as:
  • hi-hat: (1111 1111) in number base 3, or 3280 in number base 10, (resolution: 2)
  • snare drum (0101) in number base 3, or 10 in number base 10, (resolution: 1)
  • bass drum (2101) in number base 3, or 64 in number base 10, (resolution: 1)

What's the point?

Any decimal number can be rewritten in number base 3 and vice versa, so any integer represents a drum pattern voice, and every drum pattern voice can be written as a single integer. So drum pattern voices can be enumerated and constructed systematically.
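The round trip between digit strings and integers is short in Python. A sketch (the helper names are my own):

```python
def pattern_to_int(pattern, base=3):
    # read a digit string such as "2101" as an integer in the given base
    return int(pattern, base)

def int_to_pattern(n, width, base=3):
    # write n back as a digit string, zero-padded to `width` digits
    digits = ""
    while n:
        n, d = divmod(n, base)
        digits = str(d) + digits
    return digits.rjust(width, "0")
```

With these, enumerating drum pattern voices is just counting: every integer maps to a pattern and back.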

Hah! I bet you can't do triplets, can you?

Why not? Of course I can. Suppose you have a drum pattern voice that mixes 8th notes with 8th-note triplets. You can again consider each 8th note as consisting of 3 tied triplet 16th notes, and each triplet 8th note as consisting of 2 tied triplet 16th notes. The resolution is 6 (since there are 6 triplet 16ths per beat), the pattern for a beat of 8th-note triplets is (212121) (decimal: 637), and the pattern for a beat of straight 8th notes is (221221) (decimal: 700). 
How did I know I had to use a resolution of 6? A single beat has 3 triplet 8th notes, or 2 8th notes. The least common multiple of 2 and 3 is 6. Therefore I had to subdivide the beat into 6 equal parts (which corresponds to using a triplet 16th as reference length).
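The choice of resolution can be sketched with Python's math.lcm (the variable names are mine):

```python
from math import lcm

# one beat holds 2 straight 8ths or 3 triplet 8ths,
# so subdivide it into lcm(2, 3) = 6 triplet 16ths
resolution = lcm(2, 3)
triplet_8ths = "21" * 3    # each triplet 8th = 2 tied triplet 16ths
straight_8ths = "221" * 2  # each straight 8th = 3 tied triplet 16ths
```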

Is this system general enough to encode mixtures of different tuplets?

It is, but if you get into very exotic rhythms, you may end up with large resolutions and many digits. Rest assured: in popular practice most rhythms don't need very complex encodings.

Can you convert representations between different resolutions?

To some extent, yes, but not every pattern can be expressed at every resolution without loss of information. Once you see what we're doing here, working in number base 3 makes this rather easy. E.g.
  • (1010) with resolution 2 corresponds to (21002100) in resolution 4. What we did here is replace every 8th note with tied 16th notes. This boils down to applying rewriting rules 1 -> 21 and 0 -> 00. This is “up”sampling the rhythmic representation, and it may be a preparation step for other transformations later on.
  • Similarly, we can upsample (21002100) in resolution 4 to (2221000022210000) in resolution 8. Here we replaced every 16th note with tied 32nd notes. This boils down to applying the rewriting rules 2 -> 22, 1 -> 21, 0 -> 00.
  • If you want to halve the resolution, you process the base-3 digits two at a time: 
    • Take the drum pattern represented by 2221000022210000 in number base 3, and group the digits by 2: (22,21,00,00,22,21,00,00)
    • Then substitute: 22 -> 2, 21 -> 1, 00 -> 0 (this is a form of “down”-sampling without loss of information)
    • If you encounter pairs other than 22, 21 or 00 you cannot reduce the resolution without mutilating the rhythm. In that case you down-sample while losing some information (a kind of low-pass filtering).
    • If you downsample the rhythm to a lower resolution, and while doing so are forced to mutilate the rhythm, you can upsample it again, then subtract the resulting number from the original number to get an error rhythm (a kind of high-pass filtered rhythm).
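The up- and down-sampling rules above are plain string rewriting. A Python sketch (the lossy fallback in downsample is my own choice, one of several possible):

```python
def upsample(pattern):
    # double the resolution: 2 -> 22, 1 -> 21, 0 -> 00
    rules = {"2": "22", "1": "21", "0": "00"}
    return "".join(rules[d] for d in pattern)

def downsample(pattern):
    # halve the resolution; exact only for the pairs 22, 21 and 00
    rules = {"22": "2", "21": "1", "00": "0"}
    out = []
    for i in range(0, len(pattern), 2):
        pair = pattern[i:i + 2]
        if pair in rules:
            out.append(rules[pair])
        else:
            # lossy fallback (my own rule): keep a plain hit if the
            # pair contains any onset, otherwise a rest
            out.append("1" if pair.strip("0") else "0")
    return "".join(out)
```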

Are digits 0,1,2 enough to notate any rhythm?

Yes and no. Yes: you can form any rhythm using this system. No: you cannot accurately annotate certain expressive marks (e.g. staccato, marcato, ghost notes) using this system. Adding such information should be possible by introducing new digits (which themselves can encode different dimensions of information, e.g. by forming the digits as products of prime factors, where the presence of a given prime factor indicates the presence of a certain expressive mark). In that case not every conceivable number is a valid rhythm anymore and things may get hairy. Instead of absorbing the expressive marks directly in the rhythm model, they can also be added as meta-information, e.g. in the form of a second (binary) number where each bit represents the presence or absence of a given expressive mark.

So how do I use this in my generative music?

It's up to you how you use the representation to create music. Here are some possibilities.
  • You can generate random integers and interpret them as drum pattern voices.
  • You can start from an integer, and use rewriting rules like the ones shown above to upsample to a higher resolution. By using rewriting rules other than the ones in the previous section you can systematically calculate variations on the starting pattern. E.g. try 21 -> 11, 22 -> 11 or 22 -> 21 to break ties, or 21 -> 10 to replace a longer duration with a shorter one.
  • Instead of using rewriting rules, you can also perform systematic calculations on the decimal representations (or the representation in any other number base, really), and interpret the results as rhythms again. In that case the variations are still systematic, but most likely more unpredictable to an observer.
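Such rewriting rules can be applied with plain string replacement. A tiny sketch (the function name is mine):

```python
def vary(pattern, rules):
    # apply (old, new) rewriting rules in order; note that str.replace
    # rewrites every occurrence of `old` in one pass
    for old, new in rules:
        pattern = pattern.replace(old, new)
    return pattern

vary("2101", [("21", "11")])  # -> "1101": the tied kick becomes two separate hits
vary("2101", [("21", "10")])  # -> "1001": the long kick becomes a short one
```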

Fear of change and its influence on the practice of music composition

Introduction

In this blog entry I will formulate some thoughts about how fear of change can explain a number of principles in music composition. It's very well possible that all this has been written before, and much better explained than I will ever be able to, but I'm in a philosophical mood today, and perhaps you'll start to think differently about some things after reading this text. If you experience a feeling of skepticism while reading this article, and feel like it's written by an internet crackpot theorist, rest assured that this is in complete correspondence with what the article predicts will happen 🙂

Fear of change as an organizing principle in the universe

Fear of change, while sounding specific to human psychology, really permeates the universe. In Newtonian physics, any action will cause a counter-action that resists the original action. If you push your table top down, the table top pushes back and cancels out your intention to change it (until you hit it so hard that it breaks or deforms). Dynamic processes strive for equilibrium, that is, a state in which all changes cancel each other out perfectly and nothing happens anymore. Exactly why all things strive for minimal energy is, to the best of my knowledge, not known to anyone, but it's an empirical observation that has held science together for a few centuries already and has been confirmed over and again in experiments and observations.

In psychology, “fear of change” is a well-known topic. When Copernicus claimed that the earth revolves around the sun, it caused massive resistance. When confronted with the implications of quantum theory (which he helped to establish), Einstein resisted the change in world view it would bring about and declared that “God does not play dice”. Announcements of big changes in an organization, e.g., are typically met with skepticism, and quickly resistance and conservatism will pop up to cancel out the announced change. Just google for “change management” to find a myriad of books explaining how to reorganize a corporation. As I will argue in this blog entry, this same mechanism of fear of change (or better, “resistance to change”) also permeates music theory. 

At the same time the universe doesn't appear to like complete rest. Quantum physics (a revolutionary theory that of course was met with a lot of skepticism at first!) predicts that there's no such thing as complete “rest”. The Heisenberg uncertainty relation implies that even at a temperature of 0 Kelvin (the lowest possible temperature in the universe) there must still be a small residual energy, the zero-point energy. In nature and technology we also observe constant evolution. Change, it seems, is inevitable. Similarly, in music, listening to a piece that consists of a single note, without volume or rhythmic variation, lasting forever, is not a pleasant experience. Ask anyone who suffers from tinnitus what it's like…

Finally, I want to stress that this fear of change is not a bad thing per se. After all, it has helped us survive since the stone age. It was probably safer to eat the berries that your parents ate than to try new berries every day. And it continues until today, since not all big changes or revolutionary “new insights” really have the merit they claim they have (and that may well apply to this blog entry too!)

Fear of change in music

In this section I will list some places where I see fear of change in action in music composition. If you know about more examples, by all means, comment!


3.1 Music style

In modern classical music, certain experiments have been branded “interesting”, whereas other experiments have proved to be wildly successful with wide audiences. 

The “12-tone” music style introduced by Schoenberg resolutely throws away the organizing principle of “sounding good” and replaces it with a different organizing principle of “using all available notes and treating them without differences”. It results in music that has many leaps and bounds and, let's face it, has failed to attract a significant audience. On the other hand, “minimal music”, with composers like Philip Glass, Steve Reich, Brian Eno, Michael Nyman, Terry Riley and a myriad of others, makes slowly evolving music and continues to be wildly successful with wide audiences. Compared to 12-tone music, minimal music minimizes change. It also offers just enough change to keep it from being boring. As such it avoids complete rest.


3.2 Writing melody

When writing melody, e.g. in the context of counterpoint, or in the context of a song, it is advised to avoid big leaps. The reason given by the ancient theorists is that smaller leaps are easier to sing. But what makes a bigger leap harder to sing accurately than a smaller one? Is the larger change of pitch a cause for distress in our brains? At the same time, in an introductory counterpoint class I took, I was warned to avoid “turbulence”, i.e. writing a flurry of notes that doesn't seem to go anywhere. Minimize the change, while avoiding a complete lack of direction (lack of direction would be a form of equilibrium or rest).


3.3 Voice leading

When moving from chord to chord, voice leading is the principle that makes you do these movements while minimizing the changes in notes. Voice leading is an important topic in many courses on harmony and jazz theory. It is perhaps the most common principle that governs modern music styles (apart from those styles that avoid it on purpose, like the 12-tone music mentioned earlier). Voice leading is a direct application of minimizing change between chords. The fact that you move between chords and don't just stay on the same chord all the time, is a direct application of avoiding complete rest. 

Minimizing changes between chords historically probably also has a second reason: when playing chords on a keyboard it is easiest to play chords that are close together, i.e. where you minimize the required hand and finger movements. Minimizing unneeded movements is absolutely required when learning to play an instrument at the level of a virtuoso. This synergy between physical minimization of change and psychological minimization of change has probably led voice leading to the huge role it plays in music composition.


3.4 Fugue construction

While constructing a fugue according to the classical rules, the composer first states the theme, then restates the theme a fifth away from the original theme (but without introducing new accidentals, a so-called modal transposition), then returns to the original theme. This is the so-called “exposition” part of the fugue. During the exposition, the listener is taught the theme that will return in all kinds of variations later on. The theme is taught three times (minimize change), but the second time a fifth away compared to the first and third time (no complete rest). Why a fifth away? At first sight, a fifth seems like a large jump. Why didn't the composer just write the theme a second higher?

There's again an application of the principle here and it requires some explanation. If you transpose all notes from the C major key a perfect fifth up, you get the notes from G major. If you compare the notes of C major and G major, you will notice that they share all the same notes, except for the note f (in C major) compared to a note f# (in G major). G major therefore represents a key that is as close as possible to C major (since it differs in only one accidental) while not being completely the same (since it differs in at least one accidental). A theme written in C major that is modally transposed from c to g will therefore sound maximally the same as the original theme (minimize change). Next time you wonder why moving along the circle of fifths is so popular, fear of change may be the answer you look for.
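The closeness of C major and G major can be checked with a few lines of Python, representing notes as pitch classes 0-11 (a sketch of my own, names included):

```python
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]  # whole/half-step layout of a major scale

def major_scale(root):
    # pitch classes (0 = C, 7 = G, ...) of the major scale on `root`
    return {(root + s) % 12 for s in MAJOR_STEPS}

c_major = major_scale(0)
g_major = major_scale(7)
# the two keys share 6 of their 7 notes;
# the odd ones out are f (pitch class 5) and f# (pitch class 6)
shared = c_major & g_major
```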

3.5 Modulation

Modulation is the art of moving from musical key to musical key. When you move from one key to another, you want to gently guide the listener towards this change. When you read about modulation, you will often be advised to modulate to “near” keys, that is, musical keys that do not differ in number of accidentals too much. This is a direct application of minimization of change. 

One can also modulate to more distant keys. In those cases a sudden change, known as direct modulation, is frowned upon by composers and theorists. To modulate between keys, especially distant ones, several advanced techniques have been invented; they involve clever voice leading, sometimes going as far as substituting enharmonic equivalents for chromatic notes, towards a cadence that confirms the new key. These techniques incrementally introduce small changes so the audience is guided from the old key into the new key without sudden jumps.


3.6 Writing hit songs

Commercial pop music often reuses the same chord progressions. During the 1980s, these chords were typically I, IV, V (think: C F G). Nowadays, the chord progression used in virtually all hit songs is I, V, vi, IV (think: C G Am F). Why is that? Why exactly those progressions? Why did I, V, vi, IV come after I, IV, V?

Look at I, IV, V. Remember from the section about fugue construction that transposing a theme a fifth up will maximally retain the existing melody notes. The same is true when transposing a theme a fifth down (note “c” transposed a fifth down gives an “f”. This can just as well be thought of as transposing it a fourth up). When transposing something a fifth up, you need an extra sharp (or one less flat) to completely preserve the same melody. Similarly, when transposing a theme a fifth down, or equivalently a fourth up, you need an extra flat (or one less sharp) to completely preserve the melody. This means that by playing with the chords I, IV, V you have minimized the changes in the set of notes that need to be recognized by an audience, and maximally preserved the possible melodies that can be written on top of these chords.

Complete rest is still not desirable, and so after 20 years of I, IV, V the time was ripe for a new chord progression that finds a way to minimize change while avoiding complete rest. And this new chord progression appears to be I, V, vi, IV. It's an evolution of I, IV, V in that it introduces an extra chord. Because the audience is already very used to I, IV, V, this extra chord can inject a bit of much-needed change into the music again. The new chord of course is not chosen arbitrarily. It's chosen in such a way that it minimizes change with respect to the old chord progression. 

As before, when going from I to V, you need only one extra accidental. When going from V to vi, you need one less accidental (since vi is the minor equivalent of major I). Then when going from vi to IV, you need one less accidental again, and finally when going back from the final IV to I to sing the next verse, you need a single extra accidental again, making the circle round. Changes have been minimized between every two consecutive chords, and total rest is avoided by moving between different chords.


I'm afraid we're still stuck with I, V, vi, IV for a while, but if you want to define the future, grab your chance, and design a new chord progression that minimizes change while avoiding complete rest 🙂 Unfortunately, you may have trouble selling it to the music publishing companies, since they will probably resist these sudden changes you try to introduce 😉 “Never change a winning team/theme!”



More notes to self on treating vocals

Steps to follow:

Remove background noise using gate

  • e.g.: threshold: -17dB, reduction: -100dB, attack: 5ms, hold: 30ms, release: 60ms, hysteresis: -3dB, lookahead: 0, high cut: 20kHz, low cut: 20Hz
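As an illustration of what the gate is doing, here is a heavily simplified sample-by-sample gate in Python. Hold, hysteresis, lookahead and the high/low cut filters are omitted; this is a sketch of the concept, not the actual plugin:

```python
import numpy as np

def noise_gate(x, sr, threshold_db=-17.0, reduction_db=-100.0,
               attack_ms=5.0, release_ms=60.0):
    threshold = 10 ** (threshold_db / 20)  # linear amplitude threshold
    floor = 10 ** (reduction_db / 20)      # gain applied when the gate is closed
    attack = np.exp(-1.0 / (sr * attack_ms / 1000.0))
    release = np.exp(-1.0 / (sr * release_ms / 1000.0))
    gain = floor
    out = np.empty_like(x)
    for i, sample in enumerate(x):
        # open towards gain 1 when the signal exceeds the threshold,
        # close towards the reduction floor otherwise
        target = 1.0 if abs(sample) > threshold else floor
        coeff = attack if target > gain else release
        gain = coeff * gain + (1.0 - coeff) * target  # smooth the gain change
        out[i] = sample * gain
    return out
```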

Corrective Equalization

see “Remove Rumble” and “Sweep Sound” part of previous post

Normalize gain to -3dB

De-esser (some say it should come after compression)

  • e.g.: detection frequency: 9800Hz, sensitivity: 26%
  • e.g.: suppressor: 9300Hz, strength: -9dB

Compression

  • e.g.: attack: 2ms, knee: 1, threshold: -22dB RMS, gain: 8dB, limiter threshold: -0.5dB

Equalization and Enhancing

see “Give Glitter” part of previous post.
Note: to make sure that all similar vocals are treated similarly, route them through a common bus and apply the effects on the bus.

Add reverb and delay

use send/return configuration for all time-based effects

Last-minute fix-ups

Autotune + see “Special fx” part of previous post

Note to self about equalizing vocals

Since I'm sure I will forget this information, I'm putting it online where I know I will find it back. After each step, also check the effect in the mix (i.e. together with other instruments). A subtle effect is usually better than an over-the-top effect. The information here is summarized from https://www.youtube.com/watch?v=qdDDVortvRU . Be sure to check out their video for sound examples.

Step 1: remove rumble

Use a high-pass filter (aka low-cut filter). Increase the cut-off frequency until you just start to hear the difference, then reduce it a bit. That's right, aim for not hearing the effect. This ensures that you only remove rubbish, and don't remove valuable data. A typical cut-off frequency will be around 80Hz-120Hz.

When done, check the effect in the mix.
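The effect of such a low-cut can be sketched with a one-pole high-pass filter in Python. This has a much gentler slope than a real EQ's low-cut, but it shows the idea of removing content below the cutoff (my own sketch, not the plugin):

```python
import numpy as np

def highpass(x, sr, cutoff_hz=100.0):
    # first-order (one-pole) high-pass: attenuates rumble below the cutoff
    rc = 1.0 / (2.0 * np.pi * cutoff_hz)
    dt = 1.0 / sr
    alpha = rc / (rc + dt)
    y = np.zeros_like(x)
    for i in range(1, len(x)):
        y[i] = alpha * (y[i - 1] + x[i] - x[i - 1])
    return y
```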

Step 2: give glitter

For this purpose use a high shelving filter. Try to boost frequencies above 8kHz by anything from 1dB to about 6dB. If you want a more subtle effect, boost above 12kHz-16kHz instead.
When done, check the effect in the mix.

Step 3: sweep sound

Use a narrow bandpass filter, vary its center frequency and search for frequency bands that obviously stand out compared to other frequency bands. You can attenuate these a bit. A typical action is to attenuate around 800Hz-1kHz.
When done, check the effect in the mix.

Step 4: special fx

This step is optional.

  • To make the sound brighter, try boosting 2kHz-5kHz.
  • To make the vocal sit better in the mix in quieter passages, try cutting between 100Hz-250Hz.
When done, check the effect in the mix.