| Centre for Speech Technology Research | |
|---|---|
| Name | Centre for Speech Technology Research |
| Established | 1984 |
| Type | Research centre |
| City | Edinburgh |
| Country | Scotland |
| Affiliations | University of Edinburgh |
The Centre for Speech Technology Research (CSTR) is a multidisciplinary research centre at the University of Edinburgh focused on speech science, speech technology, and spoken language processing. It brings together researchers from linguistics, engineering, computer science, and cognitive science to develop technology for speech synthesis, automatic speech recognition, and spoken dialogue systems. The centre has contributed foundational tools, corpora, and evaluation campaigns that shape international work in phonetics, signal processing, and machine learning.
The centre was founded during a period of expansion in computational linguistics and speech engineering at the University of Edinburgh. Early collaborators included researchers from the Max Planck Institute for Psycholinguistics, the University of Cambridge, the Massachusetts Institute of Technology, Carnegie Mellon University, the University of Tokyo, the University of California, Berkeley, and Harvard University. The centre grew through participation in European Union funded projects with industrial partners such as SRI International, Philips, Siemens, and Nokia, and by contributing to international standards bodies including the International Telecommunication Union and the World Wide Web Consortium. Over time the centre's staff have engaged with funding streams associated with DARPA, the ESRC, the EPSRC, the Royal Society, and the European Research Council.
Research spans automatic speech recognition, text-to-speech synthesis, speaker recognition, prosody modelling, and dialogue management, engaging with the communities around the Hidden Markov Model Toolkit (HTK), Kaldi, WaveNet, Tacotron, and the Transformer architecture. The work intersects with computational approaches to phonology, morphology, and semantics associated with researchers such as Noam Chomsky, Daniel Jurafsky, Steven Pinker, Geoffrey Hinton, Yoshua Bengio, and Andrew Ng. Projects address low-resource languages with partners in regions represented by UNESCO, the World Health Organization, the African Union, the British Council, and the Commonwealth of Nations. The centre contributes to benchmarks and shared tasks organised by bodies such as NIST and ISCA and presented at venues including Interspeech, ACL, EMNLP, and ICASSP.
Facilities include anechoic recording studios, computational clusters, and language-corpus repositories used alongside software from the GNU Project, Linux, Python, TensorFlow, PyTorch, Kaldi, and HTK. The centre curates speech databases comparable to resources from the Linguistic Data Consortium, ELRA, VoxForge, and Common Voice. High-performance computing is provisioned through collaborations with ARCHER, UKRI, and EPSRC national facilities, and through cloud services from Amazon Web Services, Google Cloud Platform, and Microsoft Azure.
The centre supervises postgraduate students on programmes affiliated with the School of Informatics at the University of Edinburgh, Edinburgh College of Art, Edinburgh Napier University, and Heriot-Watt University, and offers modules that connect to curricula at the University of Oxford, the University of Cambridge, Imperial College London, and UCL. It co-delivers courses that draw on textbooks by Jurafsky and Martin and by James Allen, with practical labs using tools such as CMU Sphinx and Kaldi. Students engage with conferences such as Interspeech, ACL, EMNLP, and ICASSP, and receive training via workshops linked to Mozilla Foundation community programmes.
The centre has formal partnerships with industry players including Google, Amazon, Apple, IBM, Microsoft, Nuance Communications, Baidu, Tencent, Samsung Electronics, and Sony. It participates in consortia with standards and policy organisations such as the ITU, W3C, ISO, and IEEE, and collaborates with healthcare and assistive-technology partners including NHS Scotland, the National Health Service (England), the Royal National Institute of Blind People, and Sense. Research projects have run with telecommunications and consumer-electronics companies including Nokia, Ericsson, Huawei, Vodafone, and Orange.
The centre contributed to the development of synthesis systems related to waveform-modelling techniques from WaveNet and to neural-vocoder advances inspired by work at DeepMind, as well as recognition technologies influenced by Kaldi and HTK. Its corpora and annotation practices have informed shared tasks organised by NIST and CHiME, and corpora such as the AMI Meeting Corpus, CALLHOME, Switchboard, and GlobalPhone. Staff have taken leading roles in evaluation campaigns such as the Blizzard Challenge and the CHiME Challenge, and have influenced toolchains used by Mozilla Common Voice and the Linguistic Data Consortium. Contributions include prosody modelling, multi-speaker synthesis, robust front ends for noisy settings, and multilingual acoustic models used in projects funded by Horizon 2020 and the European Research Council.
Leadership and researchers have included figures associated with the University of Edinburgh as well as visiting scholars from the University of Cambridge, MIT, Stanford University, Carnegie Mellon University, the Max Planck Society, and the Chinese Academy of Sciences. Senior academics and postdoctoral fellows include awardees of the Royal Society, the Royal Academy of Engineering, the European Research Council, and Marie Skłodowska-Curie Actions, and have published in venues such as Nature Communications, Science Advances, IEEE Transactions on Audio, Speech, and Language Processing, and Computational Linguistics.
The centre and its members have received grants, fellowships, and awards from the EPSRC, the ERC, the Royal Society, the Royal Society of Edinburgh, the British Academy, Marie Skłodowska-Curie Actions, and NATO Science for Peace, along with best-paper awards at Interspeech, ICASSP, ACL, and EMNLP. Its datasets and software are cited across international community resources and have been acknowledged in reports by UK Research and Innovation and in policy documents involving the Department for Business, Energy and Industrial Strategy.