millionsong's blog

Loudness in the MSD

Submitted by millionsong on Mon, 07/25/2011 - 10:57

Following a few questions we received (most recently from Sam Ferguson, thanks!), here is a somewhat detailed account of how loudness is computed in the Million Song Dataset. What follows is a (slightly modified) answer from Tristan Jehan:

MSD papers are coming out!

Submitted by millionsong on Fri, 07/15/2011 - 14:45

Today happens to be the acceptance date for both ISMIR (nope, delayed!) and WASPAA, and we are very excited to see publications using the MSD finally being released.

10,000 page views for the MSD subset on the UCI ML repo

Submitted by millionsong on Tue, 07/05/2011 - 10:00

As of today, the subset of the MSD on the UC Irvine Machine Learning repository has been viewed more than 10,000 times!

MSD crash course for Hack/Reduce

Submitted by millionsong on Sat, 06/25/2011 - 06:39

Three hours before Hack/Reduce, I've decided to write down a few ideas for participants who want to play with the MSD. Half of this is a list of resources, half is a crash course on the MSD. I reserve the right to update this info during the day.

MSD at Hack/Reduce Boston & S3 bucket

Submitted by millionsong on Wed, 06/22/2011 - 10:08

This weekend I'll head down to the NERD Center for the 3rd edition of Hack/Reduce along with colleagues from the Echo Nest.

musiXmatch data visualized

Submitted by millionsong on Wed, 06/22/2011 - 08:19

Simple post to mention the great visualization of lyrics per genre developed by Andrew Clegg (Last.fm), available on the official Last.fm blog.

The MSD in a Relational Database format

Submitted by millionsong on Sat, 06/11/2011 - 15:29

The intern team at Infobright ported the Million Song Dataset to a relational database, and you can easily get it for yourself!

ISMIR 2011 - MSD tutorial

Submitted by millionsong on Sat, 05/14/2011 - 08:46

A few words on the upcoming ISMIR conference in Miami: we just received the news that our tutorial on the Million Song Dataset was accepted!

The musiXmatch dataset: connecting lyrics

Submitted by millionsong on Mon, 04/11/2011 - 17:32

Quick reminder: 237,662 bags-of-words, each restricted to the top 5,000 words, out of the ~779K MSD tracks matched with the musiXmatch API.
http://millionsongdataset.com/musixmatch
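
For readers new to the term, here is a minimal sketch of what a bag-of-words restricted to the most frequent words looks like. Everything in it is an assumption made for illustration: the toy lyrics, the track IDs, the function names, and the plain whitespace tokenization are made up and are not the procedure used to build the musiXmatch dataset.

```python
from collections import Counter


def build_vocabulary(lyrics_by_track, top_n):
    """Return the top_n most frequent words across all tracks."""
    totals = Counter()
    for text in lyrics_by_track.values():
        # Naive tokenization (lowercase + whitespace split), an assumption.
        totals.update(text.lower().split())
    return [word for word, _ in totals.most_common(top_n)]


def bag_of_words(text, vocabulary):
    """Count only the words of `text` that belong to `vocabulary`."""
    allowed = set(vocabulary)
    counts = Counter(w for w in text.lower().split() if w in allowed)
    return dict(counts)


# Made-up toy lyrics and hypothetical track IDs, not taken from the dataset.
toy_lyrics = {
    "TRACK_A": "love me love me do",
    "TRACK_B": "all you need is love",
}

vocab = build_vocabulary(toy_lyrics, top_n=2)
bags = {tid: bag_of_words(text, vocab) for tid, text in toy_lyrics.items()}
```

With a vocabulary of only two words, each track is reduced to the counts of those words; word order and every word outside the vocabulary are discarded, which is the trade-off a bag-of-words makes.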

Cover Songs in the SQLite database

Submitted by millionsong on Sun, 03/27/2011 - 12:42

Following advice from our Quality Assessment office (i.e. Dan), we included the information from the SecondHandSongs dataset in the track_metadata.db SQLite database. You can download the new version from this site (not from infochimps!).
