Beatles Genome Project Part II


Cluster Analysis


We can’t call this a Beatles Genome Project without invoking some bioinformatics, can we? A common problem in the genomic field is to identify groups of genetic markers and associate them with certain expressions. What if we could do the same with chords in The Beatles corpus?

The cluster analysis above uses, as its genetic marker, the percentage of each song spent on a particular chord. Each row is normalized to account for the fact that I, IV, and V are by far the most common chords, which brings out the role of “surprising” chords. Unsurprisingly, these three chords are the least relevant to forming clusters, since they are so common.
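A minimal sketch of this kind of normalization follows. It is not the project's code. It assumes each song is a list of (chord label, duration in beats) pairs, and that "normalized" means dividing each chord's share of a song by that chord's mean share across the corpus. The function names and the two toy songs are invented for illustration.

```python
from collections import defaultdict

def chord_shares(song):
    """song: list of (chord_label, duration_in_beats) -> {chord: fraction of the song}."""
    totals = defaultdict(float)
    for chord, duration in song:
        totals[chord] += duration
    song_length = sum(totals.values())
    return {chord: t / song_length for chord, t in totals.items()}

def surprise_profiles(corpus):
    """corpus: {title: song} -> {title: {chord: share / corpus mean share}}."""
    shares = {title: chord_shares(song) for title, song in corpus.items()}
    chords = sorted({c for s in shares.values() for c in s})
    n_songs = len(shares)
    # Mean share of each chord over all songs (a song that lacks the chord counts as 0).
    mean = {c: sum(s.get(c, 0.0) for s in shares.values()) / n_songs for c in chords}
    return {
        title: {c: s.get(c, 0.0) / mean[c] for c in chords}
        for title, s in shares.items()
    }

# Toy data (hypothetical, not real transcriptions).
corpus = {
    "song_a": [("I", 8), ("IV", 4), ("V", 4)],
    "song_b": [("I", 8), ("IV", 4), ("bIII", 4)],
}
profiles = surprise_profiles(corpus)
print(profiles["song_b"])
```

In this toy corpus, I and IV come out at 1.0 in both songs (exactly average), while bIII stands out in song_b. A value near 1.0 carries little information for clustering, which is consistent with the common chords mattering least.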

When we examine chords outside the major key, however, a remarkable phenomenon emerges: in the large majority of cases, when a Beatles song uses an unusual chord like bIII (for instance, “I Am the Walrus”), that chord appears on its own or alongside just one other unusual chord (see “Strawberry Fields Forever”, which invokes both the minor v and the major VI). Moreover, songs emphasizing unusual chords form surprisingly definite clusters that span multiple albums; it’s extremely unusual to find a chord cluster with multiple members on any one album. What this tells us is that Lennon and McCartney were quite deliberate in spacing out their use of unusual harmonic structures. Roughly three-quarters of their songs use a chord outside of the major key, and of those, the vast majority emphasize only one or two.
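One rough way to tally this kind of statistic is sketched below. It assumes chords are labeled as bare Roman-numeral triads and treats anything outside the seven diatonic triads of the major key as "unusual". It also counts mere presence of a chord, which is a cruder test than emphasis. The song names and chord lists are made up for illustration.

```python
# Diatonic triads of a major key; everything else counts as "out of key" here.
DIATONIC_MAJOR = {"I", "ii", "iii", "IV", "V", "vi", "vii°"}

def out_of_key_chords(chords):
    """Return the set of distinct chord labels that fall outside the major key."""
    return {c for c in chords if c not in DIATONIC_MAJOR}

def summarize(songs):
    """songs: {title: [chord labels]}.

    Returns (fraction of songs with any out-of-key chord,
             fraction of those songs that use only one or two such chords).
    """
    unusual = {title: out_of_key_chords(chords) for title, chords in songs.items()}
    with_any = [t for t, u in unusual.items() if u]
    one_or_two = [t for t in with_any if len(unusual[t]) <= 2]
    share_with_any = len(with_any) / len(songs)
    share_one_or_two = len(one_or_two) / len(with_any) if with_any else 0.0
    return share_with_any, share_one_or_two

# Hypothetical songs, not real transcriptions.
songs = {
    "song_a": ["I", "IV", "V", "bIII"],
    "song_b": ["I", "vi", "IV", "V"],
    "song_c": ["I", "v", "VI", "IV"],
    "song_d": ["I", "bVII", "IV", "II", "bVI"],
}
share_with_any, share_one_or_two = summarize(songs)
print(share_with_any, share_one_or_two)
```

Note that with this simple label set, a seventh chord such as V7 would be counted as out of key unless labels are reduced to triads first.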


Surprisingly, a similar approach can be applied to the beat on which each melody note falls. A common practice taught in songwriting is to vary the onset of musical phrases between the verse and the chorus. For instance, while the first note of each phrase in the verse may fall on beat two, phrases in the chorus may anticipate beat one to give the chorus an extra nudge. While our analysis doesn’t currently distinguish between sections, it does profile each song to identify the most common beat for phrase onset.

To do this, I measure the percentage of melody notes that fall on each beat and each half-beat. To emphasize notes that begin musical phrases, I weight each melody note by the time elapsed since the previous note in the song. I then divide by the average for the whole corpus to identify surprising beats for phrase onsets.
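The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: onsets are given in beats from the start of the song (0 is the downbeat of bar one), the meter is 4/4, "time since the previous note" is measured onset to onset, and the first note gets an arbitrary weight of 1.0. All names are hypothetical.

```python
from collections import defaultdict

def onset_profile(onsets, beats_per_bar=4, first_weight=1.0):
    """onsets: non-empty, sorted note onset times in beats.

    Returns {beat_in_bar: weighted share}, with beats counted from 1
    (1.0 = beat one, 1.5 = the "and" of one, 2.0 = beat two, ...).
    """
    weights = defaultdict(float)
    previous = None
    for onset in onsets:
        # A long gap before a note suggests it starts a phrase, so it weighs more.
        weight = first_weight if previous is None else onset - previous
        # Snap to the nearest half-beat, then wrap into the bar.
        beat = (round(onset * 2) / 2) % beats_per_bar + 1
        weights[beat] += weight
        previous = onset
    total = sum(weights.values())
    return {beat: w / total for beat, w in sorted(weights.items())}

def relative_to_corpus(profiles):
    """Divide each song's share per beat by the corpus mean share for that beat."""
    beats = sorted({b for p in profiles.values() for b in p})
    n_songs = len(profiles)
    mean = {b: sum(p.get(b, 0.0) for p in profiles.values()) / n_songs for b in beats}
    return {
        title: {b: p.get(b, 0.0) / mean[b] for b in beats}
        for title, p in profiles.items()
    }

# Hypothetical melodies: song_a's phrases start on beat two, song_b's on beat one.
profiles = {
    "song_a": onset_profile([1, 1.5, 2, 5, 5.5, 6]),
    "song_b": onset_profile([0, 0.5, 1, 4, 4.5, 5]),
}
surprise = relative_to_corpus(profiles)
print(profiles["song_a"])
print(surprise["song_a"])
```

In song_a, the three-beat gap before each phrase pushes most of the weight onto beat two, even though each beat position carries the same number of notes.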

Beautifully, on-beat melody notes cluster with one another, and so do off-beat notes. This suggests that syncopation (think most George Harrison songs, like “If I Needed Someone”) is a defining feature of a song: if one part of the song syncopates, other parts are more likely to syncopate too.

As in the chord analysis, though, we discover that most songs emphasize one particular beat onset as the signature for the song. For instance, in “I’ve Just Seen A Face”, the verse melody is essentially non-stop, but almost all phrases in the chorus (“falling, yes I am falling, and she keeps calling”) start on beat two, as reflected in the cluster analysis. It’s fairly shocking how each song tends to choose one of these beat onsets as its unique signature, and like the chord clusters, these clusters tend to span multiple albums rather than grouping in time.

Scale Degree

Identifying clusters by scale degree could at first seem to reveal little more than each song’s musical mode. The vast majority of Beatles songs are in major mode, with a few in mixolydian (identified by the flat seven) and minor (the flat three), as evidenced by the strong clusters for those two scale degrees in the above cluster analysis. However, it turns out that within each mode, the emphasis on particular scale degrees also forms a unique fingerprint, like the chord or beat onset.

In the above analysis, I identify the percentage of notes in each song that fall on each scale degree relative to the key of each song. Unlike the beat-onset analysis, I perform no weighting, except to account for the length of time each note is played. As in the other analyses, I normalize by the corpus averages to identify the element of surprise. Interestingly, the seventh scale degree (the most dissonant and tension-producing note in the scale) forms a huge cluster with subclusters on the second and fourth scale degrees.
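As a rough illustration of the per-song step, the sketch below computes a duration-weighted scale-degree profile. It assumes notes are given as (MIDI pitch, duration in beats) pairs and that the song's tonic is known as a pitch class (0 = C, 7 = G, and so on). The degree names and function name are illustrative choices, and the division by corpus averages is left out.

```python
from collections import defaultdict

# Semitones above the tonic -> scale-degree label (major-scale spelling).
DEGREE_NAMES = ["1", "b2", "2", "b3", "3", "4", "#4", "5", "b6", "6", "b7", "7"]

def scale_degree_profile(notes, tonic_pitch_class):
    """notes: non-empty list of (midi_pitch, duration_in_beats).

    Returns {scale_degree: share of total sounding time}.
    """
    totals = defaultdict(float)
    for pitch, duration in notes:
        semitones = (pitch - tonic_pitch_class) % 12
        # Weight by how long the note is held, not by note count.
        totals[DEGREE_NAMES[semitones]] += duration
    total_time = sum(totals.values())
    return {degree: t / total_time for degree, t in totals.items()}

# Hypothetical melody fragment in G: G (2 beats), B (1), D (0.5), F natural (0.5).
notes = [(67, 2.0), (71, 1.0), (74, 0.5), (65, 0.5)]
profile = scale_degree_profile(notes, tonic_pitch_class=7)
print(profile)
```

Here the F natural shows up as the flat seven, the note that marks a mixolydian song.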

At first, I did not expect the stable tones (1, 3, and 5; think do, mi, sol) to produce identifiable clusters, since I assumed that they would be equally distributed among all the songs. However, this turned out to be wrong: songs clearly emphasize one of these stable tones, as shown by the strong clusters in the upper-right corner of our chart.
