Vienna Talk 2015 on Music Acoustics
“Bridging the Gaps”     16–19 September 2015


Automatic music transcription using spectrogram factorization methods

Benetos, Emmanouil 

Proceedings of the Third Vienna Talk on Music Acoustics (2015), p. 302


Automatic music transcription (AMT) is the process of converting an acoustic music signal into some form of human- or machine-readable musical notation. It can be divided into several subtasks, including multi-pitch detection, note onset/offset detection, instrument recognition, pitch/timing quantisation, extraction of rhythmic information, and extraction of dynamics and expressive information. AMT is considered a key enabling technology in music signal processing but, despite recent advances, it remains an open problem, especially for multiple-instrument music. A large part of current AMT research focuses on spectrogram factorization methods, which decompose a time-frequency representation of a music signal into a set of note templates and their corresponding note activations over time. This has led to music transcription systems that are computationally efficient, robust, and interpretable. In this talk, I will present recent advances in AMT, focusing on proposed systems that can detect multiple pitches and instruments and that support tuning changes and frequency modulations. Recent work on a transcription system that models the temporal evolution of each note as a succession of sound states (such as attack, sustain, and decay) will also be presented. The final part of the talk will address the applicability of AMT methods to fields beyond music signal processing, namely musicology, performance science, and music education. Specific examples of the use of AMT technology will be given for problems related to the analysis of temperament, the analysis of non-Western music, and the creation of systems for automated piano tutoring.
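
As an illustration of the general idea of spectrogram factorization, the following minimal sketch factorizes a non-negative matrix V (frequency bins by time frames) into a template matrix W and an activation matrix H, so that V is approximately W times H. It is not the system presented in the talk: it assumes plain non-negative matrix factorization with the standard multiplicative updates for the Euclidean cost, and the function name `nmf`, the toy templates, and all parameter values are hypothetical choices made for this example only.

```python
import numpy as np

def nmf(V, n_components, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (freq x time) as W @ H.

    Uses the standard multiplicative updates for the Euclidean cost.
    W holds one spectral template per column; H holds one activation
    row per template. This is a generic sketch, not a specific AMT system.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        # Update activations with templates fixed, then the reverse.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "spectrogram" (hypothetical data): two harmonic note templates
# that sound at different, partly overlapping times.
n_freq, n_frames = 40, 30
templates = np.zeros((n_freq, 2))
templates[[4, 8, 12], 0] = [1.0, 0.5, 0.25]   # note A: partials at bins 4, 8, 12
templates[[6, 12, 18], 1] = [1.0, 0.5, 0.25]  # note B: partials at bins 6, 12, 18
activations = np.zeros((2, n_frames))
activations[0, 0:20] = 1.0    # note A active in frames 0..19
activations[1, 10:30] = 1.0   # note B active in frames 10..29
V = templates @ activations

W, H = nmf(V, n_components=2)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

In a transcription setting, the columns of W would be read as note templates and the rows of H as note activations; thresholding H would then give a rough estimate of when each note is active. Real systems typically start from an actual time-frequency representation of audio and add further constraints, which this sketch omits.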


  • music signal analysis
  • automatic music transcription
  • multi-pitch detection
  • instrument recognition
  • computational musicology

  • Status
    Invited Paper
    not reviewed
