Unnatural Keys
Nature doesn’t come with identifiers.

At time of writing, I am working in the music industry. And as part of that work, we want a database of all of the songs in the world so that we can properly identify unknown songs and provide attribution so that folks can get paid appropriately. It is a noble goal with some interesting engineering challenges.
There are also some… less interesting engineering challenges.
One is a bit self-inflicted. The first instinct for every DB person when faced with the “database of all the songs in the world” problem is to go with a natural key. They think: “There’s a bunch of IDs we have to store anyway that the business cares about. That’s the definition of a natural key! Let’s just use them.” After all, there are a lot of songs in the world: slightly more than 100 million, depending on who you ask and what they consider to be a song. Adding our own surrogate key means a few hundred megabytes of overhead, excluding indexes on the other IDs that the business cares about.
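For a rough sense of where that overhead figure comes from, here is a back-of-envelope sketch. The row count of exactly 100 million and the 4-byte and 8-byte key widths are assumptions for illustration, not numbers from a real schema. Actual storage also depends on the database engine, per-row overhead, and the primary-key index.

```python
# Back-of-envelope estimate of raw surrogate-key storage.
# Assumptions (illustrative only): 100 million rows and a fixed-width
# integer key, ignoring per-row overhead and the primary-key index.

SONG_COUNT = 100_000_000  # "slightly more than 100 million" songs

KEY_WIDTHS_BYTES = {
    "32-bit integer": 4,
    "64-bit integer": 8,
}


def surrogate_overhead_mb(rows: int, key_bytes: int) -> float:
    """Return the raw size of the key column in megabytes (10**6 bytes)."""
    return rows * key_bytes / 1_000_000


for name, width in KEY_WIDTHS_BYTES.items():
    print(f"{name}: {surrogate_overhead_mb(SONG_COUNT, width):,.0f} MB")
# prints:
# 32-bit integer: 400 MB
# 64-bit integer: 800 MB
```

Under these assumptions the key column alone costs somewhere between 400 and 800 MB, which is real but small next to the indexes you would need on the business IDs either way.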
There are even industry standards that should take care of this for us. ISRC (the International Standard Recording Code) is literally the ISO standard (ISO 3901) for “uniquely identifying sound recordings”. And if you’ve worked in software for any length of time, you know that it does not.
For example:
- Not all sound recordings are songs. Is that recording of rain hitting off of a window a song? No. Does it have an ISRC? Oh yeah. Millions of them.
- Not all songs have recordings. People have been making songs for millennia. People have only been recording things for about 150 years. ISRC has existed for about 30 years. There are gaps there. There are race conditions between writing a song and playing a song and recording that song and getting an ISRC allocated for that song. There are a whole lot of people making unrecorded music every day and a bunch of people recording music that they aren’t bothering to register with an ISRC.
- ISRC only cares about the recording. I talk about the music industry like one homogeneous thing, when in reality it is a conglomeration of sub-industries all fighting with one another to extract the most money possible from people listening to music. For songwriters, there is a separate ID for a song’s composition (ISWC). There is…