Million Song Dataset Benchmarks - Collection Characteristics
We successfully downloaded 994,960 audio samples from the content provider.
The list of samples from the original MSD that we were unable to retrieve is provided in missing_samples.txt.
A full list of all sample properties (track_id, title, artist_name, duration, 7digital_Id, sample_bitrate, sample_length, sample_rate, sample_mode, sample_version, filesize) is provided in sample_properties.csv.gz.
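As a minimal sketch of how the property list could be used to reproduce per-column statistics, the snippet below tallies the values of one column in a gzipped CSV file and converts the counts to percentages. It assumes that sample_properties.csv.gz is comma-separated, UTF-8 encoded, and starts with a header row that uses the property names listed above. None of this is confirmed here, and the helper names `tally_column` and `as_percentages` are hypothetical.

```python
import csv
import gzip
from collections import Counter


def tally_column(path, column):
    """Count how often each value of `column` occurs in a gzipped CSV file.

    Assumption: the file has a header row and is comma-separated UTF-8 text.
    Values are kept as the raw strings found in the file.
    """
    counts = Counter()
    # newline="" is the mode the csv module expects for text files
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            counts[row[column]] += 1
    return counts


def as_percentages(counts):
    """Convert raw counts to percentages of the total, rounded to two decimals."""
    total = sum(counts.values())
    return {value: round(100 * n / total, 2) for value, n in counts.items()}
```

Under those assumptions, `as_percentages(tally_column("sample_properties.csv.gz", "sample_rate"))` would give the share of samples per sampling rate. The way values are spelled in the file (for example `22050` versus `22`) is not specified here, so the keys may need grouping before comparison with the tables.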
Song-length statistics as CSV file.
Please note that the scale is logarithmic. There are two peaks, at sample lengths of 30 and 60 seconds, with 366,130 and 596,630 samples respectively. Together they account for 96.76% of all samples.
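The combined share of the two peaks follows directly from the counts given above. A short arithmetic check, using only the figures stated in this document:

```python
TOTAL_SAMPLES = 994_960   # audio samples successfully downloaded
PEAK_30S = 366_130        # samples with a length of 30 seconds
PEAK_60S = 596_630        # samples with a length of 60 seconds

combined = PEAK_30S + PEAK_60S
share = 100 * combined / TOTAL_SAMPLES

print(f"{combined:,} samples, {share:.2f}% of the collection")
# prints "962,760 samples, 96.76% of the collection"
```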
|Sampling rate (kHz) ||# samples ||% samples |
|22 ||768,710 ||77.26% |
|44 ||226,169 ||22.73% |
|other ||81 ||0.01% |
Sample rate statistics as CSV file.
|Bitrate (kbps) ||# samples ||% samples |
|64 ||343,344 ||34.51% |
|128 ||646,120 ||64.94% |
|other (VBR) ||5,494 ||0.55% |
Bitrate statistics as CSV file.
|Channel mode ||# samples ||% samples |
|Mono ||6,342 ||0.6% |
|Stereo ||150,779 ||15.2% |
|Joint stereo / dual channel ||5,494 ||0.55% |
Channel statistics as CSV file.