Silvet is a Vamp plugin for note transcription in polyphonic music.
** What does it do?
Silvet listens to audio recordings of music and tries to work out what
notes are being played.
To use it, you need a Vamp plugin host (such as Sonic Visualiser).
How to use the plugin will depend on the host you use, but in the case
of Sonic Visualiser, you should load an audio file and then run Silvet
Note Transcription from the Transform menu. This will add a note
layer to your session with the transcription in it, which you can
listen to or export as a MIDI file.
** How good is it?
Silvet performs well for some recordings, but the range of music that
works well is quite limited at this stage. Generally it works best
with piano or acoustic instruments in solo or small-ensemble music.
Silvet does not transcribe percussion and has a limited range of
instrument support. It does not technically support vocals, although
it will sometimes transcribe them anyway.
You can usually expect the output to be reasonably informative and to
bear some audible relationship to the actual notes, but you shouldn't
expect to get something that can be directly converted to a readable
score. For much rock/pop music in particular the results will be, at
To summarise: try it and see.
** Can it be used live?
In theory it can, because the plugin is causal: it emits notes as it
hears the audio. But it has to operate on long blocks of audio with a
latency of many seconds, so although it will work with non-seekable
streams, it isn't in practice responsive enough to use live.
** How does it work?
Silvet uses the method described in "A Shift-Invariant Latent Variable
Model for Automatic Music Transcription" by Emmanouil Benetos and
Simon Dixon (Computer Music Journal, 2012).
It uses probabilistic latent-variable estimation to decompose a
Constant-Q time-frequency matrix into note activations using a set of
spectral templates learned from recordings of solo instruments.
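
To give a feel for what "decomposing into note activations using
spectral templates" means, here is a minimal sketch in plain Python.
It is not Silvet's code and it is much simpler than the published
method: it handles a single spectrum frame, uses made-up toy templates
rather than ones learned from recordings, and leaves out the
shift-invariance and the per-instrument modelling entirely. The
function name, the template values and the iteration count are all
assumptions chosen for illustration.

```python
def estimate_activations(spectrum, templates, iterations=50):
    """EM estimate of per-note weights for one non-negative spectrum frame.

    templates[p][f] is the fixed spectral template for note p, normalised
    to sum to 1 over frequency bins f. Returns weights that sum to 1.
    (Illustrative simplification only, not the Silvet implementation.)
    """
    n_notes = len(templates)
    n_bins = len(spectrum)
    weights = [1.0 / n_notes] * n_notes  # uniform starting guess
    for _ in range(iterations):
        new_weights = [0.0] * n_notes
        for f in range(n_bins):
            if spectrum[f] == 0.0:
                continue  # empty bins contribute nothing
            # E-step: how much of bin f each note currently explains
            joint = [weights[p] * templates[p][f] for p in range(n_notes)]
            total = sum(joint)
            if total == 0.0:
                continue  # no template can explain this bin
            for p in range(n_notes):
                # M-step accumulation: energy in bin f, shared out
                # according to each note's posterior responsibility
                new_weights[p] += spectrum[f] * joint[p] / total
        norm = sum(new_weights)
        if norm == 0.0:
            break  # nothing explainable; keep the previous weights
        weights = [w / norm for w in new_weights]
    return weights


# Toy data: three hypothetical "note" templates over six frequency bins.
templates = [
    [0.6, 0.0, 0.3, 0.0, 0.1, 0.0],
    [0.0, 0.6, 0.0, 0.3, 0.0, 0.1],
    [0.0, 0.0, 0.6, 0.0, 0.3, 0.1],
]
# A frame that mixes note 0 (weight 0.7) with note 2 (weight 0.3).
frame = [0.7 * a + 0.3 * c for a, c in zip(templates[0], templates[2])]
activations = estimate_activations(frame, templates, iterations=200)
```

With this toy input the three activations should come out close to
0.7, 0 and 0.3, that is, the mixture that was used to build the frame.
In a real transcriber the frame would be one column of the Constant-Q
matrix and the activations would still need thresholding and tracking
over time to become notes.
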
For a formal evaluation, please refer to the 2012 edition of MIREX,
the Music Information Retrieval Evaluation Exchange, where the basic
method implemented in Silvet formed the BD1, BD2 and BD3 submissions
in the Multiple F0 Tracking task.