Image and Signal Processing Research Group
Speech and Audio Processing for Ubiquitous Communications, Industrial Short Course,
February 20-21, 2006
About the course
The ubiquity of communication networks has led to a wide range in network properties,
terminal capabilities, and user environments, and to the demand for context-aware services.
As a result, a new generation of signal-processing and coding algorithms for audio-visual
communication is emerging. These algorithms target scalability, noise suppression, real-time
quality estimation, emotional-state detection, and robustness against packet loss and bit
errors. This course discusses processing techniques that facilitate effective and efficient
communication services and adapt to the physical user environment and other context, with a
focus on speech and audio signals.
Who should attend?
This course is aimed at engineers, scientists and
managers who need to understand the basic concepts of digital
signal processing techniques and their applications. Participants are
expected to have some background in basic electronics, mathematics,
and programming. Prior background in digital signal processing would be
helpful but is not required.
Format
The course will consist of a series of well-illustrated lectures, taught
over two days in eight 90-minute sessions. Handouts of relevant overview papers
and of the slide presentations will be provided.
Course dates and venue
20th - 21st February, 2006 (9.00 am - 5.00 pm), Massey University,
Palmerston North Campus. Registration from 8.30 am on Monday 20th February.
Course Fee
$650 (includes course material,
handouts, lunches & teas). To enrol, please complete the attached enrolment form, enclose the
seminar fee, and mail them prior to 13th February 2006 to:
Christine Allport, Institute of Information Sciences & Technology
Massey University, Private Bag 11222
Palmerston North, NEW ZEALAND
Ph: 06 350 5799 extn 2438, Fax: 06 350 2259
Email: c.allport@massey.ac.nz
Course Content
Day 1
Speech and audio perception and speech production:
- state-of-the-art models of auditory perception
- speech production mechanism
Source coding for heterogeneous networks:
- basics of quantization, results of rate-distortion theory
- high-rate quantization theory
- transforms and coding
- multiple-description coding
- adaptation to network and context
- jitter buffers
Day 2
Modelling and coding of speech and audio signals:
- a principle for generic modelling
- transforms and prediction
- application to speech
Signal enhancement:
- spectral subtraction / Wiener filtering
- noise estimation by minimum statistics
- usage of prior information
- impact of nonstationarity
Detection of emotion:
- archetypical emotion types
- features for emotion classification
- subband-based emotion classification
- bimodal emotion classification
Signal quality estimation:
- subjective quality estimation
- objective quality estimation
Course Presenters
Bastiaan Kleijn
Bastiaan is a Professor at the School of Electrical Engineering at KTH (the
Royal Institute of Technology) in Stockholm, Sweden, where he heads the Sound and Image
Processing Laboratory. He is also a founder and former Chairman of Global IP Sound, the
company that supplies the audio processing for the Voice over IP (VoIP) systems of Skype,
Google, and Microsoft; he remains Chief Scientist there. He has written about 150 papers
and holds 23 patents. He has Ph.D. degrees from Delft University of Technology (Netherlands)
and the University of California. He has held visiting Professor positions at Delft University
of Technology, Vienna University of Technology, Graz University of Technology, and Massey
University. Before entering academia, he worked at AT&T Bell Laboratories (Research). He has
been on the Editorial Boards of the IEEE Transactions on Speech and Audio Processing, IEEE
Signal Processing Letters, IEEE Signal Processing Magazine, and the EURASIP Journal on Applied
Signal Processing. He is a Fellow of the IEEE.
Liyanage C De Silva
Liyanage received a BSc Eng. (Hons) degree from the University of Moratuwa, Sri Lanka,
in 1985, an M.Phil. degree from The Open University of Sri Lanka in 1989, and MEng. and PhD degrees
from the University of Tokyo, Japan, in 1992 and 1995 respectively. He was with the University of
Tokyo, Japan, from 1989 to 1995. From April 1995 to March 1997 he pursued postdoctoral
research at ATR (Advanced Telecommunication Research) Laboratories, Kyoto, Japan.
In March 1997 he joined The National University of Singapore as a Lecturer, where he was an
Assistant Professor until June 2003. He is currently a Senior Lecturer at Massey University.
He has expertise in digital image processing, speech processing and communication theory.
He has published over 100 technical papers in these areas in international conferences, journals
and Japanese national conventions, and holds one Japanese national patent, which was
sold to Sony Corporation Japan for commercial utilization. This patent is in the area
of bimodal emotion recognition and will be utilized in human-computer interaction in computer
game interfaces. He received the Best Student Paper Award from SPIE (The International Society
for Optical Engineering) for an outstanding paper contribution to the International Conference
on Visual Communication and Image Processing (VCIP) in 1995. He is a Senior Member of the IEEE, USA.
Massey University
Massey University is one of New Zealand’s leading educational institutions. It has four campuses,
and provides a choice of over 200 degrees, certificates and diplomas. In 2003 Massey had a total
of 40,000 students, 21,000 of whom were studying by distance. The university has a proud 76-year
tradition of academic and research excellence combined with a strong national and international
reputation. The School of Engineering and Advanced Technology is Massey University’s focal point
for quality education and research in the broad areas of Information & Telecommunications, Computer
Systems, and Software Engineering.