Interspeech

Name	Interspeech
Status	Active
Genre	Academic conference
Discipline	Speech processing
Frequency	Annual
First	2000
Organizer	International Speech Communication Association

Contents

History
Conference Format and Organization
Topics and Technical Scope
Awards and Competitions
Publication and Proceedings
Related Events and Community Impact

Interspeech is an annual international conference dedicated to the science and technology of speech, spoken language processing, and audio communication. It gathers researchers, practitioners, and industrial partners for the presentation of peer-reviewed papers, tutorials, and demonstrations. The conference has become a central venue, alongside events such as ICASSP, ACL, NeurIPS, and EMNLP, for cross-disciplinary exchange between industry and academia.

History

The conference grew out of a series of earlier meetings, notably Eurospeech and the International Conference on Spoken Language Processing (ICSLP), whose traditions merged in the late 1990s, culminating in the launch of the modern conference in 2000 under the auspices of the International Speech Communication Association (ISCA). Early editions built on legacies from conferences tied to organizations such as the IEEE Signal Processing Society and regional bodies, drawing participants from centers such as Bell Labs, MIT, the Cambridge University Engineering Department, and the ATR research laboratories. Over successive years Interspeech has rotated through host cities including Beijing, Lisbon, San Francisco, Stockholm, Hyderabad, and Dublin, reflecting growth in scale similar to that of SIGGRAPH and ICASSP. Key historical milestones paralleled breakthroughs by research groups at IBM Research, Microsoft Research, Google Research, and Facebook AI Research, and at universities such as Stanford University, Carnegie Mellon University, the University of Edinburgh, and the University of Tokyo, that influenced areas such as automatic speech recognition, speaker diarization, and speech synthesis. The venue has hosted demonstrations related to technologies championed by companies such as Nuance Communications, Amazon, and Apple Inc., as well as by labs such as DeepMind and OpenAI.

Conference Format and Organization

The conference is organized annually by the International Speech Communication Association, with local organizing committees drawn from host institutions, often involving departments such as the Centre for Speech Technology Research at the University of Edinburgh or labs at KTH Royal Institute of Technology. Typical program elements include oral sessions, poster sessions, keynote talks by prominent figures from the IEEE, the Royal Society, or corporate research groups, tutorials by teams from Google DeepMind or Microsoft Research Cambridge, and special sessions co-chaired by committees with experience from ICASSP and ACL. Steering and technical program committees include members affiliated with ETH Zurich, the Max Planck Society, Johns Hopkins University, and Tsinghua University. Venues have included conference centers in cities where infrastructure is supported by bodies such as the City of Stockholm, the Government of India, and local universities, while sponsorship often comes from corporations such as NVIDIA, Intel, and Qualcomm, and from research funding agencies such as the European Research Council.

Topics and Technical Scope

The technical scope spans automatic speech recognition, text-to-speech, speaker recognition, speech enhancement, prosody, multilingual processing, and paralinguistics, overlapping with research agendas at ACL, NAACL, and ICASSP. Other focal areas include end-to-end neural approaches popularized by teams at Google Brain, Facebook AI Research, Baidu Research, and Tencent AI Lab; signal processing contributions associated with IEEE Transactions on Audio, Speech, and Language Processing; and linguistic analyses by scholars from the University of Cambridge, Yale University, Harvard University, and the University of Oxford. Datasets introduced or discussed at the conference originate from, or parallel, corpora distributed by LDC, ELRA, and Common Voice, and national initiatives in countries such as China, India, Japan, and Germany. Evaluation campaigns and benchmarks discussed at the conference often reference the CHiME challenge, ASR evaluations run by NIST, and shared tasks organized with partners such as EMNLP and ICASSP.

Awards and Competitions

Interspeech recognizes outstanding contributions through awards and competition tracks that mirror practices at ACL and NeurIPS, including best paper awards, best student paper awards, and special prizes sponsored by corporations such as Amazon Web Services, Google, and Microsoft. Competitive challenges held in conjunction with the conference have included speaker recognition challenges with participation from teams at SRI International, the Idiap Research Institute, and university consortia; speech separation and enhancement tracks attracting entries from Mila, the University of Pennsylvania, and the University of Maryland; and low-resource language tasks engaging groups supported by UNESCO and national research councils. Prize committees commonly feature academics and industry researchers affiliated with the University of Illinois Urbana-Champaign, RWTH Aachen University, Peking University, and Cornell University.

Publication and Proceedings

Accepted papers are published in conference proceedings overseen by the International Speech Communication Association and indexed in digital libraries alongside proceedings from ICASSP and ACL. Proceedings include full papers, short papers, demos, and workshop reports, with archival records mirrored in databases and indexing services such as IEEE Xplore, Scopus, and Google Scholar. Authors frequently expand conference papers into journal articles for outlets such as IEEE/ACM Transactions on Audio, Speech, and Language Processing, Computer Speech & Language, and Speech Communication, with peer-review mechanisms coordinated with the editorial boards of those journals.

Related Events and Community Impact

Interspeech interfaces with a broader ecosystem of workshops, summer schools, and industry tracks, similar to the activities around EMNLP, ICASSP, LREC, and NeurIPS, that foster cross-pollination among research groups at CMU, MIT CSAIL, and the Oxford Robotics Institute, and corporate labs including Apple Machine Learning Research and Samsung Research. The conference has influenced standards and deployments in products by Google, Amazon, and Apple Inc., and by telecommunications vendors such as Ericsson and Huawei Technologies. Training programs and collaborations that began at Interspeech have supported initiatives funded by agencies and programs such as NSF and Horizon 2020, and community-building efforts have engaged organizations such as the IETF and W3C on speech-related interoperability.

