Multimedia Retrieval: Audio, Speech, Images and Video


Presenters: Dulce Ponceleon and Malcolm Slaney

Dulce B. Ponceleon holds a Ph.D. degree in computer science from Stanford University. She was previously in the Advanced Technology Group at Apple Computer, Inc., where she worked on information retrieval and on video and audio compression technologies for QuickTime. She was a key contributor to the first software-only videoconferencing system, holds several patents, and has numerous publications in video and audio compression, multimedia information retrieval, numerical linear algebra, and non-linear programming. She is currently at IBM Almaden Research Center, where she has worked on multimedia content analysis and indexing, video summarization, and applications of speech recognition. She serves on the program committees for WWW, ACM Multimedia, and SPIE.

Malcolm Slaney holds a Ph.D. degree in electrical engineering from Purdue University. He is an author (with A.C. Kak) of the book "Principles of Computerized Tomographic Imaging," which was recently republished by SIAM as one of their "Classics in Applied Mathematics." He is an editor (with Steven Greenberg) of "Computational Models of Auditory Function." He is an instructor at Stanford University's CCRMA (Center for Computer Research in Music and Acoustics) and has organized the Hearing Seminar for the last 15 years. He has been employed at Bell Laboratories, Schlumberger Palo Alto Research, Apple's Advanced Technology Group, and Interval Research, and is currently at IBM's Almaden Research Center.


This half-day introductory tutorial teaches information-retrieval practitioners the tools and techniques that make multimedia different from conventional IR (and more interesting!). We assume a general background in text information retrieval, and we will describe and demonstrate aspects of multimedia retrieval that are new and different for text IR professionals. The tutorial will 1) provide an overview of the multimedia retrieval field, 2) describe what is different about multimedia retrieval, query formulation, and evaluation, 3) talk about file formats and low-level features, and 4) demonstrate multimedia retrieval tools. We will provide context for all the important ideas, with as much depth as possible, and provide pointers to the literature for those who want more details. The tutorial will include many audio and video examples.

The tutorial handouts will include slides, an annotated bibliography of leading references, a CD-ROM with URLs and copies of papers (those for which we can obtain permission), and screenshots of multimedia systems.

