Fabian-Robert Stöter edited this page Mar 1, 2017 · 35 revisions

Python in Audio Research

The Python ecosystem is huge

The idea of this repository is to create a comprehensive, curated list of Python software and packages related to scientific research in audio.

Related lists

There is already this list, but it is not up to date and includes too many special-interest packages that are mostly not relevant for scientific applications. Awesome-Python is a large curated list of Python packages; however, its audio section is very small.

Read/Write

  • (Py)Soundfile - an audio library based on libsndfile, CFFI, and NumPy
  • (Py)Soundcard - an audio library based on PortAudio for realtime audio processing
  • pySox - Python wrapper around sox
  • pyAV - Pythonic binding for FFmpeg or Libav
  • tinytag - reads music metadata of MP3, OGG, FLAC and Wave files.
  • audiolazy - Expressive Digital Signal Processing (DSP) package for Python.
  • audioread - Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.
  • python-sounddevice - another PortAudio wrapper

Transformations, General DSP

  • FFT - available in scipy.fftpack (fast) and numpy.fft (slower)
  • pyFFTW3
  • NSGT - non-stationary Gabor transform, constant-Q
  • MDCT - modified discrete cosine transform
  • STFT - standalone STFT package
  • Gammatone - Gammatone filterbank implementation
  • Sidekit - Speaker and Language recognition
  • Resampy - sample rate conversion.
  • PyRubberband - A python wrapper for rubberband to do pitch-shifting and time-stretching.
  • pydub - Manipulate audio with a simple and easy high level interface.
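
As a small illustration of the FFT entry in the list above, here is a minimal sketch that finds the strongest frequency component of a signal using numpy's real-input FFT. The function name `dominant_frequency` and the test tone are illustrative assumptions, not part of any package listed here.

```python
import numpy as np


def dominant_frequency(signal, sample_rate):
    """Return the frequency in Hz of the strongest bin in the magnitude spectrum.

    Hypothetical helper for illustration only.
    """
    # rfft computes the FFT of a real signal and returns only non-negative frequencies
    spectrum = np.abs(np.fft.rfft(signal))
    # Matching frequency axis in Hz for each rfft bin
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(freqs[np.argmax(spectrum)])


# One second of a 440 Hz sine tone sampled at 8 kHz (assumed test signal)
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

peak_hz = dominant_frequency(tone, sample_rate)
print(peak_hz)
```

With a one-second signal the bin spacing is 1 Hz, so the tone falls exactly on a bin. For shorter or non-periodic excerpts, a window function would usually be applied before the transform.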

Speech Processing

  • pyAudioAnalysis - feature Extraction, Classification, Diarization
  • SIDEKIT - Speaker and Language recognition.
  • py-webrtcvad - interface to the WebRTC Voice Activity Detector
  • talkbox - General speech/signal processing algorithms. ⚠️ Not maintained.

Perceptual/Auditory Models

  • Loudness - perceived loudness, includes Zwicker, Moore/Glasberg model
  • Sound Field Synthesis Toolbox for Python
  • BrianHears - General Auditory Models

Realtime applications

  • PYO - realtime audio engine similar to SuperCollider

Source Separation

  • NUSSL - common source separation algorithms + framework
  • pyFASST - Flexible Audio Source Separation Toolbox
  • commonfate - Common Fate Transform
  • beta_ntf - Non-Negative Tensor factorisation using PARAFAC
  • Simfa, NMF flavors - Several NMF flavors
  • NTFLib - Sparse Beta-Divergence Tensor Factorization Library

Music Information Retrieval

  • librosa - general audio and music analysis
  • mir_eval - common heuristic accuracy scores for various MIR tasks.
  • essentia - C++ based feature extractor + general purpose audio/MIR related DSP algorithms like pitch tracking, beat detection.
  • Madmom - MIR package with a strong focus on beat detection, onset detection and chord recognition.
  • dejavu - Audio fingerprinting and recognition.
  • Catchy - Corpus Analysis Tools for Computational Hook Discovery

Feature extraction

  • pyYAAFE - Python bindings for YAAFE
  • aubio - feature extractor, written in C, with a Python interface
  • audiolazy - general-purpose realtime audio processing library

Web + Audio

  • TimeSide - Open web audio processing framework.

Packages to access public APIs / Parse Datasets

Machine-Learning / Deep Neural Networks

  • Scikit-Learn
  • Keras
  • Lasagne
  • Tensorflow

Optimization

Symbolic Music / MIDI

  • Music21 - a Toolkit for Computer-Aided Musicology
  • Mido - Realtime MIDI wrapper
  • Pretty-MIDI - utility functions for handling MIDI data in a nice/intuitive way
  • mingus - An advanced music theory and notation package with MIDI file and playback support.

Bindings/Wrappers to other languages

  • VamPy Host - interface to compiled Vamp plugins
  • PyAU - Python Audio Unit Host
  • rpy2 - call R from Python
  • Matlab_Wrapper - runs code in MATLAB and returns results to Python
  • CFFI - easily interface C libraries
  • pybind11 - interface C++ code

Tutorials/Books
