Spotify Has A (meta)Data Issue

Publication Date: December 3, 2025

Spotify logo turned into the 'This is Fine' meme.

I’m relatively new to Spotify, or more so, to the world of music streaming services. It’s not that I’m against digital music; in fact, I love it. I’m just one of those people who has long maintained their own digital music collection. For years I’ve been buying CDs and ripping them to FLAC, hoarding bootleg recordings of artists I appreciate, etc. I have my own home server, so I’ve been able to enjoy the convenience of my music wherever I go through things like Plexamp, and more recently, Navidrome.

But since early last year, I’ve been pretty much exclusively using Spotify for my music listening needs.

I’ve honestly been tempted by Spotify over the years. It’s nice to have a wide variety of artists instantly available, and the recent release of lossless streaming has removed one of my last perceived “barriers”; I have some good speakers, and I want my music to sound as good as it can.

Spotify Wrapped is also a great piece of marketing, and I’ve admittedly gotten a little FOMO from it. It’s a great and accessible way to share what you’ve been listening to with friends, and you do feel like you’re “missing out” when all your friends’ social feeds have their Spotify Wrapped results every December.

The problem is…my Spotify Wrapped is always wrong. It’s wrong for my friends as well.

Last.fm - A Starting Point

For years now I’ve been using Last.fm to ‘scrobble’ my music. Basically the service allows you to track each song you play, and it maintains an ongoing list on your profile. Eventually, you have a pattern of your music listening habits over time.

The LastFM weekly report

I originally started doing this way back in 2007, when I was scrobbling music from my iPod Mini using Rockbox. There have been massive gaps in my scrobbling over the years (I basically didn’t track anything from mid-2011 through 2018), so my aggregate listening data is massively skewed towards the music collection I possessed when I was a teenager. But I’ve been dedicated again to scrobbling since mid-2020.

And this is where my issue with Spotify has started.

Last.fm aims to track every piece of music you listen to, and to facilitate this, it allows you to integrate itself into various services. By configuring Navidrome on my home server, I can have the Last.fm API see what I’m listening to there. I can also do the same with Spotify.

And the thing is, my Spotify Wrapped is always wrong. Even when factoring in other music sources, my “most listened to” song and artists are never correct on my end of year Spotify Wrapped. I have the data, it’s all in my Last.fm profile.

Part of me thinks this is either because Spotify is overly-reliant on their algorithms, or they have an incentive to promote specific artists.

But after using Spotify for close to 18 months now and getting two Spotify Wrapped reports… I’m beginning to think Spotify has a massive metadata issue. If anything, it’s got a consistency issue.

There’s No Metadata Standard

As someone who has spent too much time hoarding bootlegs, I typically want to assign correct metadata to each song. This of course extends outside of bootlegs and to all songs in my collection, and you’d see a similar approach in the various media services you use. Netflix for instance has the Netflix Metadata Template they make available for partners. This template allows you to label items such as actors, movie genre, etc. For example, every movie on the service with Tom Hanks would have had his name included under the “actors” metadata field. This then allows the user to type “Tom Hanks” into the search and get a list of all those movies. You can see a sample from the template below:

The Netflix Metadata template

For music, you do something similar. For example, Navidrome has tagging guidelines which include the following key fields:

  • Title: The name of the song. (Example: “Imagine”)
  • Artist: The performing artist(s) for the song. (Example: “John Lennon”) If a track has multiple artists, include all of them here.
  • Album: The name of the album the song belongs to. All tracks in the same album should have exactly the same Album tag.
  • Album Artist: The primary artist for the album. This is usually the album’s main artist or group, or “Various Artists” for a compilation. Every track in an album should share the same Album Artist so Navidrome knows they belong to one album. For example, on a soundtrack or compilation album, set Album Artist to “Various Artists”. If a track has multiple album artists (like collaboration albums), include all of them here (see Handling Multiple Artists below).
  • Track Number: The song’s track number on the album. This can be just the track number (like “5”) or a fraction like “5/12” to indicate track 5 of 12. Use leading zeros if your tag editor requires (e.g., “05”). Proper track numbers help Navidrome sort songs in the album’s order.
  • Disc Number: If an album spans multiple discs, use this to differentiate disc 1, disc 2, etc. For example, “1/2” for Disc 1 of 2. Ensure all tracks that are on the same disc have the same disc number, and all tracks share the Album name. Navidrome will group multi-disc albums together and may show disc divisions.
  • Year/Date: The year (or full date) of the album’s recording. While not strictly required, the year is useful information and some views or clients might use it. Formats accepted are: YYYY (for YEAR and DATE) and YYYY-MM-DD or YYYY-MM (for DATE). For a more precise date information, you can leverage other Date fields:
    • DATE/YEAR: The date of the track recording.
    • ORIGINALDATE/ORIGINALYEAR: The original release date of the album.
    • RELEASEDATE/RELEASEYEAR: The release date of the album.
  • Genre: The genre of the music (e.g., Rock, Jazz). This is a multi-valued field and can help when browsing or creating genre-based playlists.
  • Compilation (Part of a Compilation): A special flag for various-artists albums. For a “Various Artists” compilation album, set this tag on all its tracks so Navidrome treats them as one album. In MP3/ID3 tagging, this is often labeled “Part of a Compilation” (technically the TCMP frame) which should be set to “1” (true). In FLAC/Vorbis tags, use a tag named COMPILATION with value “1”. Not all editors show this field explicitly, but many (like iTunes or Picard) will mark an album as a compilation for you if you specify it. If you can’t find this tag, simply ensuring Album Artist is “Various Artists” usually works, but using the compilation tag is a best practice.
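To make the “every track in an album should share the same Album Artist” rule concrete, here is a minimal sketch of a consistency check. It does not read real audio files; it assumes the tags have already been loaded into plain dictionaries, and the album names, track names, and the `albums_with_mixed_album_artist` helper are all made up for illustration.

```python
from collections import defaultdict

# Illustrative tag dictionaries, one per track. The keys mirror the fields
# listed above; the album and track names are made-up placeholders.
tracks = [
    {"title": "Track One", "artist": "Band A", "album": "Album X",
     "albumartist": "Band A", "tracknumber": "1/2"},
    {"title": "Track Two", "artist": "Band A", "album": "Album X",
     "albumartist": "Band A", "tracknumber": "2/2"},
    {"title": "Song One", "artist": "Band B", "album": "Album Y",
     "albumartist": "Band B", "tracknumber": "1/2"},
    {"title": "Song Two", "artist": "Band B feat. Singer C", "album": "Album Y",
     "albumartist": "Band B feat. Singer C", "tracknumber": "2/2"},
]


def albums_with_mixed_album_artist(tracks):
    """Return the albums whose tracks disagree on the Album Artist tag."""
    seen = defaultdict(set)
    for tags in tracks:
        seen[tags["album"]].add(tags["albumartist"])
    return sorted(album for album, artists in seen.items() if len(artists) > 1)


print(albums_with_mixed_album_artist(tracks))
```

Here “Album Y” gets flagged because one of its tracks copied the featured artist into the Album Artist field, which is exactly the kind of slip that splits an album in two.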

And these fields are a way to both tag music correctly, and to establish commonality across the industry. Similar tags would already be added for music you buy on iTunes, Amazon Music, etc.

Now, Spotify does have metadata guidelines, but they don't seem to be enforced. More so, they're lax enough that they allow for incredible variability.

Song Titles

From what I’ve seen, song titles seem to be a massive issue in Spotify. Let’s use Killing Joke’s 1985 album ‘Night Time’ as an example, as I was listening to it when this entire post came into conception.

If I go into Spotify, the album looks about what I’d expect from the preview. It’s got the album art, the album name, and it’s been assigned the release date.

A display of several Killing Joke albums with their original release dates underneath

But if I open it up, the song titles are wrong. Multiple tracks have had ’- 2007 Digital Remaster’ added to them. But I’m not here to listen to the song ‘Eighties - 2007 Digital Remaster’, because a song by that title has never existed. The song is titled ‘Eighties’.

A display of the songs from Killing Joke's 'Night Time', each with '- 2007 Digital Remaster' appended to the song titles

This would be like me saying I read ‘To Kill a Mockingbird - 2021 Harper Perennial Modern Classics’ or watched ‘Casablanca - 2022/80th Anniversary 4K Edition’. Were these items re-released? Absolutely. But Harper Lee wrote ‘To Kill a Mockingbird’ and not ‘To Kill a Mockingbird - 2021 Harper Perennial Modern Classics’.

For Spotify’s listing of Killing Joke’s 1985 album ‘Night Time’, all the tracks seem to be from a 2007 CD release. Except for track 3, which is simply given its actual title (‘Love Like Blood’). Is it from the 2007 re-release, and has it been labelled incorrectly? Or is it from the original release/mastering process? Who knows.

Why bother with consistency?
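For what it’s worth, recovering the real title from one of these labels is not hard. The sketch below strips a trailing “- YEAR ... Remaster” suffix with a regular expression. The pattern is my own guess based on the titles quoted in this post; it is not anything Spotify or the labels publish, and it would miss other naming variants.

```python
import re

# Assumed suffix shape: " - <4-digit year> [optional words] Remaster(ed)"
# at the very end of the title. Other variants are not handled.
REMASTER_SUFFIX = re.compile(
    r"\s+-\s+\d{4}\s+(?:\w+\s+)*Remaster(?:ed)?$",
    re.IGNORECASE,
)


def canonical_title(title: str) -> str:
    """Return the title with any trailing remaster label removed."""
    return REMASTER_SUFFIX.sub("", title)


for title in [
    "Eighties - 2007 Digital Remaster",
    "Love Like Blood",
    "Running Up That Hill (A Deal With God) - 2018 Remaster",
]:
    print(repr(title), "->", repr(canonical_title(title)))
```

Titles without the suffix, like ‘Love Like Blood’, pass through untouched, so the same function could be run over a whole album regardless of how each track was labelled.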

Album Tagging

If you go to listen to the White Stripes though, they seem to be doing it somewhat correctly. There’s multiple releases, and the individual tracks don’t have the “remaster/release date/etc.” added to the song names.

A display of some White Stripes albums

I can listen to the White Stripes album Elephant (2003) or Elephant (Deluxe) (2023). We’re differentiating the different versions here through the album-title itself, and once I go in the tracks are simply labelled as their correct song titles. After all, the song’s not called ‘Seven Nation Army - 2023 Deluxe Edition’. It’s ‘Seven Nation Army’.

A display of Elephant (Deluxe) (2023) showcasing the song titles are the original song titles.

But there’s obviously a large lack of consistency across the service.

If I want to listen to Soundgarden, it’s just a mess.

A display of all Soundgarden albums

Three editions of Superunknown are labelled as 1994 (the original release date of the album), but there’s no original version of the album available.

But for some reason Badmotorfinger (1991), Ultramega OK (1988), and Scream Life/Fopp (1990) all are labelled with their re-release date, and not the original release date.

The easiest way to handle this would be to embrace what’s outlined in the Navidrome guide. That is, utilizing both ORIGINALDATE / ORIGINALYEAR (the original release date of the album) and RELEASEDATE / RELEASEYEAR (the release date of the album). For Superunknown (20th Anniversary) that would be:

  • ORIGINALDATE / ORIGINALYEAR: 1994
  • RELEASEDATE / RELEASEYEAR: 2014

For Soundgarden, they may very well be using that ORIGINALDATE / ORIGINALYEAR for all these versions of Superunknown. They're all showing as 1994. That is after all the date that the album originally released.

But the White Stripes seem to be using RELEASEDATE / RELEASEYEAR for their albums. Elephant was released in 2003 and Elephant (Deluxe) was released in 2023. And to further confuse the issue, Soundgarden seems to be using ORIGINALDATE / ORIGINALYEAR or RELEASEDATE / RELEASEYEAR, with seemingly no thought given to how or why.

The easiest fix to this would be to enforce both fields, and expose them to the user. Have Elephant (Deluxe) show that it was originally released in 2003 and that this deluxe version was released in 2023. But again, why bother with consistency?
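As a rough sketch of what “enforce both fields and expose them” could look like, the snippet below models a release with both years required and builds a display label from them. The `Release` class and its `label` method are hypothetical names of mine; the years are the ones mentioned in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """Hypothetical album record where both dates are mandatory."""
    title: str
    original_year: int  # ORIGINALDATE / ORIGINALYEAR
    release_year: int   # RELEASEDATE / RELEASEYEAR

    def label(self) -> str:
        """Show one year for an original release, both years for a reissue."""
        if self.original_year == self.release_year:
            return f"{self.title} ({self.original_year})"
        return (
            f"{self.title} (originally {self.original_year}, "
            f"this release {self.release_year})"
        )


for release in [
    Release("Elephant", 2003, 2003),
    Release("Elephant (Deluxe)", 2003, 2023),
    Release("Superunknown (20th Anniversary)", 1994, 2014),
]:
    print(release.label())
```

Because neither field has a default, a record cannot be created with only one of the two dates, which is the whole point: the user never has to guess which year they are looking at.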

Artist Tagging

This issue with tagging also extends to the artists themselves.

If you look at the Run the Jewels album RTJ4 (2020) you can see some songs list ‘Run the Jewels’ as the first cited artist. Others list El-P as the first artist.

A track listing for RTJ4

There’s no explicit reason for this, and it’s not tied to writing credits. For example, the song ‘a few words for the firing squad (radiation)’ was written by ‘Meline, Render, T Schwartz, W. Schwartz, and Matt Sweeney’.

At first glance, Spotify is listing El-P, Killer Mike, and Run the Jewels here in a manner similar to the Tom Hanks / Netflix example I used earlier. If I type in ‘El-P’ I’d then get a list of both his solo music and his work with Run the Jewels (for those unfamiliar, Run The Jewels consists of El-P and Killer Mike). Again, there’s no consistency to the tagging here, but I understand the intent.

The issue is that lack of consistency though, and I have a feeling Spotify only logs the first artist internally for my end of year Spotify Wrapped.

Why do I think that? Because they only look at that first listed artist for their ‘Liked Songs’ feature.

Liked Songs

I’ll keep using Run the Jewels - ‘a few words for the firing squad (radiation)’ as they’re the first artist I noticed this with.

For those that don’t use Spotify, you can add songs to your ‘Liked Songs’ at any time and this adds them to a dedicated playlist called ‘Liked Songs’. Fairly self-explanatory. But it also allows you to go to an artist and just listen to the songs you’ve “liked” by that artist, instead of having to manually select tracks or listen through an entire album. In the screenshot below, you can see a green checkmark next to ‘a few words for the firing squad (radiation)’, which indicates it’s one of my liked tracks.

'a few words for the firing squad (radiation)' with a checkmark next to it

However, if I go to my “Liked Songs by Run the Jewels”…it’s not there.

List of Run the Jewels Favourites

It is under my “Liked Songs by El-P”…because he’s the first artist listed. Likewise, no songs credited first to Run the Jewels are listed under El-P (despite him being listed on the tracks and vice-versa).

List of El-P favourites

And this applies to multiple other artists. Jack White has at some point gone back and added himself as an artist for all of his White Stripes output. Which makes sense, because now if I search for Jack White, the White Stripes discography also comes up. But again, if I add a White Stripes song to my favourites, it won’t be listed under my ‘Liked Songs by Jack White’. That said, he’s done so at the album level, and not at the song-level (Run the Jewels does both).

Elephant listing The White Stripes and Jack White as the artists.

So as a user, it’s incredibly frustrating, as there’s no consistency across the service. Why don’t my searches for Jack White (solo output + White Stripes) match how the artists’ “Liked Songs” (separate lists for Jack White and White Stripes) present data? It’s annoying, and I’m fairly sure it’s skewing my end of year Spotify Wrapped report.
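The difference between the two behaviours is tiny when written out. This is only a sketch of what I observe from the outside, not Spotify’s actual logic: the data structure and both function names are my own, the second track is a placeholder, and the order of the artists after the first one is assumed.

```python
# Illustrative liked tracks. Only the first-listed artist of the RTJ4 song
# is taken from what the app shows; everything else is an assumption.
liked = [
    {"title": "a few words for the firing squad (radiation)",
     "artists": ["El-P", "Killer Mike", "Run the Jewels"]},
    {"title": "(placeholder RTJ4 track)",
     "artists": ["Run the Jewels", "El-P", "Killer Mike"]},
]


def liked_by_first_artist(liked, artist):
    """Mimic the observed behaviour: only the first credited artist counts."""
    return [t["title"] for t in liked if t["artists"][0] == artist]


def liked_by_any_artist(liked, artist):
    """What I would expect: any credited artist counts, as search does."""
    return [t["title"] for t in liked if artist in t["artists"]]


print(liked_by_first_artist(liked, "Run the Jewels"))
print(liked_by_any_artist(liked, "Run the Jewels"))
```

With first-artist matching, each artist page only shows half of the liked tracks. With any-artist matching, both tracks show up under El-P, Killer Mike, and Run the Jewels, which lines up with how search already behaves.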

Hidden Data

Here’s the thing as well though, there are actually (likely) hidden metadata fields that allow uploaders to list artists. By hidden, I of course mean not exposed or directly viewable to the user.

If I type in Meg White (drummer for the White Stripes), the White Stripes are shown as the first results. Meg’s not listed as an artist on any of the songs, but she comes up anyway. So, somewhere in the backend, someone has tagged her there. It’s similar if I search for Keith Richards; his solo stuff is presented, and then an entry for ‘The Rolling Stones’ appears. He’s not explicitly credited on any of The Rolling Stones’ tracks, but somewhere in the backend they’ve connected it together with tags.

Spotify search for Meg White that shows the White Stripes as a suggested artist for said search.

So some people are using the songs themselves to tag and connect artists, some are using the backend, and it’s generally an inconsistent mess.

Play Counts + Hidden Albums

I’ve already covered that song titles are a mess, and that will lead to this last section on Spotify. As part of Spotify Wrapped, you’ll sometimes be presented with your most played songs. And, if you go to an artist page, you can see the most played tracks for that artist. For example, if I go to Kate Bush’s page, I can see that ‘Running Up That Hill (A Deal With God)’ has 1,512,766,747 listens as of my writing this.

List of top-played Kate Bush songs on Spotify.

But if I go to her discography, I’m presented with ‘Hounds of Love (2018 Remaster)’ which lists the song as ‘Running Up That Hill (A Deal With God) - 2018 Remaster’. What?

Kate Bush's 'Hounds of Love' album with '2018 Remaster' added to the name and song titles

If I then go back to the most played songs and click the context menu next to the song, select ‘go to album’…I’m then brought to ‘Hounds of Love’. The ‘2018 Remaster’ moniker is gone, and all the tracks are labelled as they should be. This original version of the album isn’t listed under the discography section though, and seems to only exist somewhere hidden in the backend. Weird.

Kate Bush's 'Hounds of Love' album.

Looking at the play counts though, they are the same for both tracks. So one of two things is happening:

  • In terms of the actual digital file sitting on Spotify’s servers, we’re being fed the same file across multiple versions of an album. This is what most likely is happening, and the user is just presented a different label depending on where they listen to it.
  • Someone’s linked the separate files somewhere, and a play in either album counts as the same song. Which would make sense to me, but in this case there’s a single entry for ‘Running Up That Hill’, and that should be the song title presented to the user regardless of which version of the album they listen to.

So at the end of the year, is Spotify tracking that digital file as a single song? Or do they track the label? Have I listened to 'Running Up That Hill (A Deal With God)' 20 times? Or have I listened to 'Running Up That Hill (A Deal With God)' six times, and 'Running Up That Hill (A Deal With God) - 2018 Remaster' fourteen times?
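To show why the answer matters, here is a small sketch that tallies the same 20 plays two ways: once by an underlying track identifier and once by the title shown to the user. The identifier `track-001` is invented, and the idea that both labels point at one shared identifier is an assumption for the sake of the example, not something I can see from the outside.

```python
from collections import Counter

ORIGINAL = "Running Up That Hill (A Deal With God)"
REMASTER = "Running Up That Hill (A Deal With God) - 2018 Remaster"

# Each play is (track_id, displayed_title). "track-001" is a made-up ID;
# sharing it across both labels is an assumption for illustration.
plays = [("track-001", ORIGINAL)] * 6 + [("track-001", REMASTER)] * 14

plays_by_id = Counter(track_id for track_id, _ in plays)
plays_by_label = Counter(title for _, title in plays)

print(plays_by_id["track-001"])
print(plays_by_label[ORIGINAL], plays_by_label[REMASTER])
```

Counted by identifier, this is one song with 20 plays. Counted by label, it becomes two songs with 6 and 14 plays, and either one could drop out of a “top songs” list that the combined total would have made.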

I do know I can add these "different" tracks to my "Liked Songs", effectively giving me repeats.

Kate Bush 'Liked Songs' showcasing songs added multiple times (Under Ice + Under Ice - 2018 Remaster).

But who knows how they're actually tracking it in the back end. I just know my end of year Spotify Wrapped is always wrong.

Wrapping it Up

At the end of the day, it’s not all on Spotify. Music labels are adding these tracks, and they’re obviously not doing a great job of it. Naming conventions don’t exist across the board, and the labelling of items isn’t even consistent across a single album.

But as a user, I find it increasingly frustrating. Part of this stems from it messing up my Last.fm data, but it’s mostly annoying because it’s just so inconsistent across the service. If I’m able to look up a song using a specific artist’s name, at the very least that song should then be listed under the ‘Liked Songs’ section of that artist.

And I think this is all why my Spotify Wrapped is wrong every year. I may have listened to a lot of a specific band, but the songs are all split across whatever individual artist was assigned first under the song listing. I listened to ‘Superunknown’ a lot this year, but did I do it across three versions of the album and are they all tracked separately? I know they’re tracking ‘Running Up That Hill’ as a single song in terms of “top listens”, but is my Spotify Wrapped counting the “different” versions as separate entries?

Maybe I’m asking a lot, but for a company that had a revenue of €15.67 billion and a net income of €1.138 billion in 2024, you’d think they’d be able to hire some people to expand and enforce their metadata standards.