We explain how to efficiently get all the songs from a list of artists using SQLite temporary tables. It is particularly useful for tasks such as tag prediction where train and test lists have been established.
We assume we have the file 'artists.txt' whose content is:
AR00DDV1187B98B2BF AR00J9R1187B98D920 AR00P3M1187FB5428C AR00RBU1187B9B3F1F AR00Y9I1187B999412
We load it with the following code using iPython:
In [2]: artists = [] In [3]: f = open('artists.txt','r') In [4]: for line in f.xreadlines(): ...: if line == '' or line.strip() == '': ...: continue ...: artists.append( line.strip() ) ...: In [5]: f.close()
We then connect to the SQLite database track_metadata.db:
In [7]: import sqlite3 In [8]: conn = sqlite3.connect('/track_metadata.db')
We create a temporary table with our list of artists. Temporary tables are kept in memory, and disappear when the connection is closed. This table will be called 'myartists' and have one column: 'artist_id'.
In [12]: q = "CREATE TEMP TABLE myartists (artist_id TEXT)" In [13]: res = conn.execute(q) In [14]: conn.commit() In [15]: for aid in artists: ....: q = "INSERT INTO myartists VALUES ('"+aid+"')" ....: res = conn.execute(q) ....: In [16]: conn.commit()
Now, we select from the main tables all the track_id whose artist_id is also in 'myartists'. We use an SQL JOIN for this.
In [17]: q = "SELECT track_id FROM songs JOIN myartists" In [18]: q += " ON myartists.artist_id=songs.artist_id" In [19]: res = conn.execute(q) In [20]: data = res.fetchall()
We know have the information in data. We found 131 tracks:
In [21]: len(data) Out[21]: 131 In [22]: data[:3] Out[22]: [(u'TRNPEOL12903CBC2B8',), (u'TRPVEJE12903CBC2BC',), (u'TRXICUN12903CEE131',)]
Don't forget to close the connection!
conn.close()
And that's it! Of course, this could be modified to search for other things, for instance all the tracks from specific years.
- Login to post comments