Music and AI Lab, NTU GICE (臺灣大學音樂與人工智慧實驗室)
We use musical knowledge to build deep learning models that analyze and/or create music.
Example topics in music analysis include automatic music transcription, melody extraction, structural analysis, and source separation.
Example topics in music creation include symbolic-domain MIDI generation, audio-domain singing voice synthesis, text-to-music, loop generation, tone creation, and automatic music mixing (see here for a curated list of models for music creation).
We have focused somewhat more on music creation in the past few years, building models such as the Pop Music Transformer and Compound Word Transformer for MIDI generation, and KaraSinger for musical audio generation.
In partnership with Taiwan AI Labs and the famous singer Sandee Chan, we also built the first deep learning-based singing voice synthesis model in Taiwan with human-level performance (AI Sandee; AI陳珊妮) (ref).
The lab was affiliated with the Research Center for IT Innovation at Academia Sinica from Sept. 2011 to Jan. 2023. See here for the good old memories.
Location: BL-505 (Barry Lam Hall, Room 505; 博理館505; map)
Information for prospective students joining the lab
Lab intro: link
Prerequisites
- Deep interest in music
- Good background in machine learning and mathematics (e.g., having taken courses such as Machine Learning, Deep Learning, Signals and Systems, Digital Signal Processing, Linear Algebra, Probability and Statistics)
- Good coding experience in Python and a deep learning framework such as PyTorch
Suggested readings
Lab style
- Work as a team rather than on many individual projects
- Work on meaningful projects that are related to the lab and can have some impact, rather than on random ideas
- Aim high, yet set milestones along the way
- Open source code
- Open to collaboration with industry
- Open to collaboration with labs around the world (via internships or remote collaboration; we have many papers with international collaborators)
Current Members
PhD students
- Chih-Pin Tan / 譚至斌 (class of 2023) / symbolic music generation
- Wei-Han Hsu / 許巍瀚 (class of 2024; co-advised) / music transcription
- Yen-Tung Yeh / 葉彥東 (class of 2025) / audio effect modeling & automatic mixing
- Fang-Duo Tsai / 蔡芳鐸 (class of 2025) / text-to-music generation
- Wei-Jaw Lee / 李維釗 (class of 2025) / text-to-music generation
Master's students
- Fang-Chih Hsieh / 謝方智 (class of 2024) / text-to-music generation
- Chi-En Dai / 戴麒恩 (class of 2024) / music emotion recognition
- Josh Chen / 陳仲桓 (class of 2024)
- Bo-Rui Chen / 陳柏睿 (class of 2025) / symbolic music generation
- Yi-An Lai / 賴奕銨 (class of 2025) / text-to-music generation
- Yun-He Lin / 林昀禾 (class of 2025) / text-to-music generation
- Hsueh-Wei Fu / 傅學惟 (class of 2025)
- Ting-Yi Hu / 胡庭翊 (class of 2025; co-advised)
- Yi-Chen Kao / 高翊禎 (class of 2025; co-advised)
Full-time Admin Assistant
Visiting Scholars
- Dinh-Viet-Toan Le / PhD student from Univ. Lille / April-July 2024
Collaborators
- Bo-Yu Chen / 陳柏昱
- Fei-Yueh Chen / 陳飛岳 / MIDI generation
- Rinni Fang / 方品涵
- Jingyue Huang / 黃婧越 (from China) / MIDI generation
- Haven Kim (from Korea) / text-to-music
Former PhD students
- Yu-Hua Chen / 陳宥華 (2025). From Audio to Score and Tone: Exploring Representations and Transformations in Guitar-Oriented Music Information Retrieval. PhD Thesis, National Taiwan University, June 2025.
- Ching-Yu Chiu / 邱晴瑜 (2023). Human-Inspired Methods for Music Information Retrieval (MIR)— Taking Music Source Separation and Beat Tracking as Examples. PhD Thesis, Joint PhD Program between National Cheng-Kung University and Academia Sinica, January 2023.
- Chih-Ming Chen / 陳志明 (2022). Exploring High-Order Relations for Recommender Systems. PhD Thesis, Joint PhD Program between National Chengchi University and Academia Sinica, September 2022.
- Fernando Henrique Calderon Alvarado (2022). Human Expressions: A Study on the Valuable Insights Embedded in Human Generated Content. PhD Thesis, Joint PhD Program between National Tsing-Hua University and Academia Sinica, May 2022.
- Zhe-Cheng Fan / 范哲誠 (2019). Vector-neuron-based Learning with Arbitrary Bilinear Products. PhD Thesis, National Taiwan University, July 2019.
- Szu-Yu Chou / 周思瑜 (2019). Attention-based Sound Event Recognition using Weakly Labeled Data. PhD Thesis, National Taiwan University, January 2019.
- Jen-Yu Liu / 劉任瑜 (2018). Weakly-supervised Event Detection for Music Audios and Videos Using Fully-convolutional Networks. PhD Thesis, National Taiwan University, June 2018.
Former master's students
- Fischer Yeh / 葉軒瑜 (2023-2025) / MediaTek
- Albert Hsu / 許庭肇 (2023-2025)
- Hsin Ai / 艾芯 (2023-2025)
- Ting-Kang Wang / 王庭康 (2023-2025)
- Hsien-Chen Yeh / 葉咸辰 (2022-2025)
- Cyan Lan / 藍雲瀚 (2022-2024) / Qualcomm
Former research assistants
- Josh Chen / 陳仲桓 (2023-2024) / text-to-music
Alumni before 2022