Survey of compressed domain audio features and their expressiveness
10 January 2003
Silvia Pfeiffer, Thomas Vincent
Proceedings Volume 5021, Storage and Retrieval for Media Databases 2003; (2003) https://doi.org/10.1117/12.476300
Event: Electronic Imaging 2003, 2003, Santa Clara, CA, United States
Abstract
We give an overview of existing approaches to audio analysis in the compressed domain and incorporate them into a coherent formal structure. After examining the kinds of information accessible in an MPEG-1 compressed audio stream, we describe a consistent approach to deriving features from that information and report on a number of applications these features enable. Most of these applications aim to create an index to the audio stream by segmenting it into temporally coherent regions, which may then be classified into pre-specified types of sound such as music, speech, speakers, animal sounds, sound effects, or silence. Other applications centre on recognition tasks such as gender, beat, or speech recognition.
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Silvia Pfeiffer and Thomas Vincent "Survey of compressed domain audio features and their expressiveness", Proc. SPIE 5021, Storage and Retrieval for Media Databases 2003, (10 January 2003); https://doi.org/10.1117/12.476300
CITATIONS
Cited by 5 scholarly publications.
KEYWORDS
Analytical research
Speech recognition
Image segmentation
Video
Feature extraction
Statistical analysis
Interference (communication)