- Talks
- Datasets
- EMOPIA+: an extended version
of EMOPIA that comes with a functional representation-based tokenization
- EMOPIA: a multimodal dataset
comprising audio+MIDI of emotion-annotated pop piano solo pieces
- EGDB+PG: the EGDB dataset rendered
with the Positive Grid BIAS FX2 Plugin, published at DAFx’24
- EGDB: a dataset that contains transcriptions of the electric guitar performance
of 240 tablatures rendered with different tones, published at ICASSP’22
- AILabs.tw Pop1K7: a
dataset comprising 1747 transcribed piano performances of Western,
Japanese and Korean pop songs, compiled in the Compound Word Transformer
paper (AAAI’21)
- DadaGP: a dataset of ~26k
GuitarPro songs in ~800 genres, converted to a token sequence format for
generative language models like GPT2, TransformerXL, etc.
- Corpora of Western classical music excerpts (WCMED) and Chinese classical
music excerpts (CCMED) annotated with emotional valence and arousal
values (ICASSP’20 paper-a)
- #nowplaying-RS:
a new benchmark dataset for building context-aware
music recommender systems
(SMC’18 paper)
- Symbolic-Musical-Datasets: list of symbolic musical datasets, including lead
sheets and MIDIs
- Lakh Pianoroll Dataset (LPD): a collection
of 174,154 unique multi-track piano-rolls derived from the MIDI files in
Lakh MIDI Dataset (LMD), used in our MuseGAN paper (AAAI’18 paper)
- iKala: 252 30-second
excerpts sampled from 206 iKala songs (plus 100 hidden excerpts reserved
for MIREX SVS 2014-2016) (ICASSP’15 paper)
- Su
Dataset for automatic music transcription in piano solo, piano
quintet, string quartet, violin sonata, choir, and symphony
(ISMIR’16 and ISMIR’15 papers)
- MACLab Dataset for violin offset
detection (ISMIR’15 paper)
- MACLab Dataset for guitar playing
techniques (ISMIR’15 and ISMIR’14 papers)
- Dataset for expression analysis in violin (ISMIR’15 paper)
- Octave dual-tone dataset (SMC’14 paper)
- The
AMG1608 dataset
for personalized
music emotion recognition (ICASSP’15 paper)
- The
CH818 dataset for music emotion recognition in Chinese Pop songs
- The
DEAM and MediaEval dataset for dynamic and static music
emotion recognition
(used in the ‘Emotion in Music’ Task in MediaEval 2013-2015)
- CAL500exp
for time-varying music auto-tagging (ICME’14 paper)
- CAL10k:
10k songs with 140 genre tags (TMM’13 paper)
- LiveJournal:
40k blog articles with user mood labels and music tags (TMM’13