

Music and AI Lab, NTU GICE (臺灣大學 音樂與人工智慧實驗室)

 

We use musical knowledge to build deep learning models that analyze and/or create music.

 

Example topics in music analysis include automatic music transcription, melody extraction, structural analysis, and source separation.

 

Example topics in music creation include symbolic-domain MIDI generation, audio-domain singing voice synthesis, text-to-music, loop generation, tone creation, and automatic music mixing (see here for a curated list of models for music creation).

 

In the past few years, we have placed more focus on music creation, building models such as the Pop Music Transformer and Compound Word Transformer for MIDI generation, and KaraSinger for musical audio generation.

 

In partnership with Taiwan AI Labs and the renowned singer Sandee Chan, we also built Taiwan's first deep learning-based singing voice synthesis model with human-level performance (AI Sandee; AI陳珊妮) (ref).

 

The lab was affiliated with the Research Center for IT Innovation at Academia Sinica from Sept. 2011 to Jan. 2023. See here for some good old memories.

 


 

Location: BL-505 (博理館505; map)  

 


 

Information for prospective students joining the lab

 

Lab intro: link

 

Prerequisites

  • Deep interest in music
  • Good background in machine learning and mathematics (e.g., have taken courses such as Machine Learning, Deep Learning, Signals and Systems, Digital Signal Processing, Linear Algebra, Probability and Statistics)
  • Good coding experience in Python and a deep learning framework such as PyTorch

 

Suggested readings

 

Lab style

  • Work as a team rather than on many separate individual projects
  • Work on meaningful projects that relate to the lab's research directions and can have real impact, rather than on random ideas
  • Aim high, yet set milestones along the way
  • Open-source our code
  • Open to collaboration with industry
  • Open to collaboration with labs around the world (via internships or remote collaboration; we have many papers with international collaborators)

 


 

Current Members

 

PhD students

  • Yu-Hua Chen / 陳宥華 (class of 2020) / guitar effect modeling
  • Chih-Pin Tan / 譚至斌 (class of 2023) / music cover generation
  • Wei-Han Hsu / 許巍瀚 (class of 2024) / source separation

 

Master students

  • Fischer Yeh / 葉軒瑜 (class of 2023) / singing voice synthesis
  • Wei-Jaw Lee / 李維釗 (class of 2023) / text-to-music
  • Albert Hsu / 許庭肇 (class of 2023) / singing voice conversion
  • Hsin Ai / 艾芯 (class of 2023) / MIDI generation
  • Ting-Kang Wang / 王庭康 (class of 2023; co-advised) / MIDI generation
  • Fang-Duo Tsai / 蔡芳鐸 (class of 2023; co-advised) / text-to-music
  • Yen-Tung Yeh / 葉彥東 (class of 2024) / guitar effect modeling & automatic mixing
  • Thomas Lin / 林宗易 (class of 2024) /
  • Chi-En Dai / 戴麒恩 (class of 2024) /
  • Fang-Chih Hsieh / 謝方智 (class of 2024) /

 

Full-time Admin Assistant

  • Soo Hiok-bûn / 蘇郁雯

 

Collaborators

  • Bo-Yu Chen / 陳柏昱 /
  • Fei-Yueh Chen / 陳飛岳 / MIDI generation
  • Rinni Fang / 方品涵 /
  • Jingyue Huang / 黃婧越 (from China) / MIDI generation
  • Haven Kim (from Korea) / text-to-music

 


Former PhD students

  • Ching-Yu Chiu / 邱晴瑜 (2023). Human-Inspired Methods for Music Information Retrieval (MIR): Taking Music Source Separation and Beat Tracking as Examples. PhD Thesis, Joint PhD Program between National Cheng-Kung University and Academia Sinica, January 2023.
  • Chih-Ming Chen / 陳志明 (2022). Exploring High-Order Relations for Recommender Systems. PhD Thesis, Joint PhD Program between National Chengchi University and Academia Sinica, September 2022.
  • Fernando Henrique Calderon Alvarado (2022). Human Expressions: A Study on the Valuable Insights Embedded in Human Generated Content. PhD Thesis, Joint PhD Program between National Tsing-Hua University and Academia Sinica, May 2022.
  • Zhe-Cheng Fan / 范哲誠 (2019). Vector-neuron-based Learning with Arbitrary Bilinear Products. PhD Thesis, National Taiwan University, July 2019.
  • Szu-Yu Chou / 周思瑜 (2019). Attention-based Sound Event Recognition using Weakly Labeled Data. PhD Thesis, National Taiwan University, January 2019.
  • Jen-Yu Liu / 劉任瑜 (2018). Weakly-supervised Event Detection for Music Audios and Videos Using Fully-convolutional Networks. PhD Thesis, National Taiwan University, June 2018.

 

Former master students

  • Cyan Lan / 藍雲瀚 (2022-2024) / text-to-music

 

Former research assistants

  • Josh Chen / 陳仲桓 (2023-2024) / text-to-music

 

Alumni before 2022