#nowplaying

public datasets for music information retrieval and recommendation tasks

#nowplaying dataset

#nowplaying is a dataset that leverages social media to describe the music listening behavior of users. To create it, we rely on Twitter, which is frequently used to post which music the respective user is currently listening to. From such tweets, we extract track and artist information and further metadata. Currently, the dataset contains tweets from 2012 to 2018 and was resolved against the MusicBrainz database dump of 2019/01/12, from which we extract record IDs. It contains a total of 126 million listening events.

Download a CSV representation of the dataset here (37GB).
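Because the CSV is large (37GB), it will generally not fit into memory at once. The following is a minimal sketch of streaming it row by row with Python's standard `csv` module and counting listening events per value of one column. It assumes the file has a header row; the function name `count_events`, the file name, and the column name used in the usage note are hypothetical and should be adapted to the actual header of the downloaded file.

```python
import csv
from collections import Counter
from typing import Optional


def count_events(path: str, column: str, limit: Optional[int] = None) -> Counter:
    """Stream a CSV file and count rows (listening events) per value of `column`.

    `column` must match a name in the file's header row; the column names of
    the #nowplaying CSV are NOT assumed here and have to be looked up first.
    `limit` optionally stops after that many data rows, for quick inspection.
    """
    counts: Counter = Counter()
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)  # first row is read as the header
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in header {reader.fieldnames}")
        for index, row in enumerate(reader):
            if limit is not None and index >= limit:
                break
            counts[row[column]] += 1
    return counts
```

For example, `count_events("nowplaying.csv", "user_id", limit=1_000_000)` would count events per user over the first million rows, assuming a column named `user_id` exists. Only one row is held in memory at a time, although the counter itself grows with the number of distinct values.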

If you use our dataset, want further details, or wish to refer to it, please cite the accompanying paper.

playlists dataset

The playlists dataset is based on the subset of users in the #nowplaying dataset who publish their #nowplaying tweets via Spotify. However, it is built from these users' playlists rather than from their tweets. A description of how the dataset was generated, along with the dataset itself, can be found in the accompanying paper.

Download a CSV representation of the playlist dataset here (last updated: 2015-12-31 00:00:00 (CET)).

If you use our dataset, want further details, or wish to refer to it, please cite the accompanying paper.

#nowplaying-RS dataset

The #nowplaying-RS dataset provides context and content features of listening events. It contains 11.6 million music listening events of 139K users and 346K tracks collected from Twitter. The dataset comes with a rich set of item content features and user context features, as well as timestamps of the listening events. Moreover, some of the user context features imply the cultural origin of the users, and others, such as hashtags, give clues to the emotional state of a user underlying a listening event.

Feel free to download the dataset as well as the training and test splits. Reference implementations of the conducted experiments are available in Asmita’s GitHub repository.

If you use our dataset, want further details, or wish to refer to it, please cite the accompanying paper.