Get all artists and their tags

As you understand by now, the dataset is mostly built on the song level. However, some information is on the artist level, for example tags. (The Echo Nest tags are called 'terms' and the MusicBrainz tags are called 'mbtags' in the dataset.)

If we consider Radiohead for instance, the tags for Radiohead are stored in every song by Radiohead. Note that this is the case for all artist features: similar artists, artist location, etc.

One solution is to go through the files until you find one song by Radiohead. That is obviously slow, which is why we did it for you. We created a text file, unique_artists.txt, containing one line per artist, including one randomly chosen track by that artist. The format is:
artist id<SEP>artist mbid<SEP>track id<SEP>artist name

A solution is now to go through the file (approx. 42K lines), find the one where the 4th field is 'Radiohead', get its track from the 3rd field, and read it. We remind you that if a track is named TRAXLZU12903D05F94.h5, its path in the dataset is A/X/L/TRAXLZU12903D05F94.h5.
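
The naming rule above can be written as a small helper. This is only a sketch: the function name track_path is ours (it is not part of the dataset code), and it assumes that the three directory letters are the 3rd, 4th and 5th characters of the track ID, as in the example just given.

```python
def track_path(track_id):
    """Return the relative path of a track file inside the dataset.

    Assumption: the three nested directories are named after the
    3rd, 4th and 5th characters of the track ID.
    """
    # track_id[2:5] is e.g. 'AXL' for 'TRAXLZU12903D05F94'
    return '/'.join([track_id[2], track_id[3], track_id[4], track_id + '.h5'])


print(track_path('TRAXLZU12903D05F94'))
```

The returned path is relative to the root of the dataset, so you still need to prepend the directory where your copy is stored.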

The following Python code summarizes what we just said.

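# Scan unique_artists.txt for the line whose 4th field is 'Radiohead'.
with open('unique_artists.txt', 'r') as f:
    for line in f:
        # Fields: artist id, artist mbid, track id, artist name
        parts = line.strip().split('<SEP>')
        if len(parts) > 3 and parts[3] == 'Radiohead':
            print('Radiohead track:', parts[2])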

It outputs:

Radiohead track: TRMMNCI128F9310D00

Problems with the previous method
Going through a text file is not very fast if we must do it often. Also, there is a problem with the artist names. Many 'names' can be associated with one Echo Nest artist ID, for instance 'A' would have the same id as 'A feat. B' and 'A / B'.

Let's assume you are now looking for an artist by its Echo Nest ID, which is the right way to work. Radiohead's ID is: 'ARH6W4X1187B99274F'. We could also work with its MusicBrainz ID: 'a74b1b7f-71a5-4011-9441-d0b5e4122711'.
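
If you want to stay with unique_artists.txt, one way to avoid both problems is to read the file once and keep a dictionary keyed by Echo Nest artist ID. The code below is a minimal sketch, not part of the dataset code: the function name load_artist_tracks is hypothetical, and the sample line is assembled from the Radiohead values quoted in this tutorial.

```python
def load_artist_tracks(lines):
    """Map Echo Nest artist ID -> (track ID, artist name).

    `lines` is any iterable of lines in the unique_artists.txt format:
    artist id<SEP>artist mbid<SEP>track id<SEP>artist name
    """
    tracks = {}
    for line in lines:
        parts = line.strip().split('<SEP>')
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        artist_id, artist_mbid, track_id, name = parts[:4]
        tracks[artist_id] = (track_id, name)
    return tracks


# Sample line built from the Radiohead values given in this tutorial.
sample = ['ARH6W4X1187B99274F<SEP>a74b1b7f-71a5-4011-9441-d0b5e4122711'
          '<SEP>TRMMNCI128F9310D00<SEP>Radiohead\n']
artist_tracks = load_artist_tracks(sample)
print(artist_tracks['ARH6W4X1187B99274F'][0])
```

With the real file, you would pass the open file object instead of the sample list. After that single pass, each lookup by ID is a dictionary access instead of a new scan of the text file.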

The previous tutorial on how to find a song with a specific name or feature should have taught you how to use the summary file or the SQL database to quickly spot a song from an artist ID. Here is a short reminder.