- Professor of Speech Science, University College London
- Senior Lecturer in Speech Sciences, University College London
- Lecturer in Speech Sciences, University College London
- Technical Director, Marquesa Search Systems Ltd
- SERC Advanced Research Fellow, UCL
- Research Assistant, UCL
- Software Engineer, Scicon Consultancy International
- Postgraduate Research Student, UCL
- 1984 PhD University of London
- Title: "An Interactive Speech Pattern Audiometer"
- 1976 BSc (Hons) University of Warwick
Prizes, Awards And Other Honours
- Provost's Teaching Award, UCL
- Acoustic Modelling of Phonetic Segments, MoD University Research Agreement with RSRE, £100k.
- Fundamental relations between acoustic and phonetic descriptions of speech, SERC Advanced Fellowship, £125k.
- Computational modelling of aspects of speech perception, MRC Cognitive Science Initiative, £150k. (with Andrew Faulkner and Stuart Rosen)
- Development of an automatic parsing system, EPSRC £176k. (with Sydney Greenbaum)
- Integrated prosodic approach to speech synthesis, EPSRC £190k. (with Jill House, York University and Cambridge University)
- Automatic enhancement of speech, EPSRC £170k. (with Valerie Hazan)
- Enhanced Language Modelling, EPSRC £160k.
- Marie Curie Fellowship €110k. (with Santi Fernandez)
- Centre for Law-Enforcement Audio Research (CLEAR), Home Office £1.1m (£400k to UCL). (with Mike Brookes and Patrick Naylor at Imperial College)
- Performance-based Measures of Speech Quality, Research in Motion, £70k.
- Avatar Therapy, Wellcome Trust, £1.3M.
- Integrated voice analysis of satellite communications embedded in time and safety-critical environment (iVOICE), European Space Agency, €250k. (with Iya Whiteley)
- E-LOBES, EPSRC, £1.2M (£500k to UCL). (with Stuart Rosen, and also Mike Brookes and Patrick Naylor at Imperial College)
My research has been at the intersection of Phonetics and Speech Technology: looking at how technological solutions to speech processing problems can improve our understanding of human speech processing, and how modern phonological theories might be applied in speech synthesis and recognition. Recently I have been exploiting speech technologies in novel clinical applications.
- Speech recognition
- I have been concerned with how phonological knowledge is exploited in speech recognition (Huckvale, 1990; Huckvale, 1998) and whether more modern non-linear phonological representations could be used as the basis for speech recognition (Huckvale, 1993; Huckvale, 1995). I have considered why particular technological solutions to speech recognition are successful (Holmes & Huckvale, 1994; Huckvale, 1996; Huckvale, 1998), and what this tells us about the human speech recognition task (Huckvale, 1997). I was able to show how the introduction of a morpho-phonological component improved a speech recognition system’s vocabulary (Huckvale & Fang, 2002).
- Speech synthesis
- I developed a speech synthesis-by-rule system within the ProSynth project (Hawkins et al, 1998; House et al, 1999; Huckvale, 1999; Ogden et al, 2000; Heid et al, 2000), where I was responsible for system design and implementation. The system used novel representational structures for data and knowledge for “all-prosodic” synthesis. I was vice-chair of the COST 258 project “Naturalness of Synthetic Speech”, helping to direct the consortium's research efforts into improving the expressiveness of synthetic speech (Keller et al, 2001; Huckvale, 2001). The outcome of this work led me to question whether speech synthesis technology was less concerned with human speech production than with the separate problem of simulating human speech (Huckvale, 2002).
- I contributed to the scientific study of accents through a metric which computes an accent similarity measure even across two different speakers (Huckvale, 2004; Huckvale, 2007a; Huckvale, 2007b). The ACCDIST algorithm has been shown to give state-of-the-art performance on accent recognition (Hanani et al, 2013), and has also been shown to be useful in adapting speech recognition systems to accented speech (Najafian, 2014). My colleague Paul Iverson has shown how ACCDIST can be used to predict the mutual intelligibility of second-language learners of English (Pinet et al, 2011). With Kayoko Yanagisawa, I helped create the concept of accent morphing, in which the accent of a speaker can be modified without affecting their speaker identity (Huckvale & Yanagisawa, 2007; Yanagisawa & Huckvale, 2008; Yanagisawa & Huckvale, 2010).
- Infant speech acquisition
- With Ian Howard I showed how modern machine learning methods could be applied to the computational modelling of infant speech acquisition (Huckvale & Howard, 2005; Howard & Huckvale, 2005). My frustration with the difficulty of performing experiments in the social acquisition of language led to the development of KLAIR: a virtual infant for speech acquisition research (Huckvale, Howard & Fagel, 2009; Huckvale, 2011; Huckvale & Sharma, 2013). KLAIR is a 3D animated head with the ability to hear through a real-time auditory analysis system and speak through a real-time articulatory synthesizer. It is designed to be the computer’s "interface" with caregivers for machine learning of language through social interactions. See www.phon.ucl.ac.uk/project/klair.
- Speech signal enhancement
- In collaboration with Imperial College London I helped set up the Centre for Law-Enforcement Audio Research (CLEAR) with funding from the UK Home Office. Here we studied methods for the enhancement of poor-quality speech recordings found in law enforcement. While Imperial College focussed on the speech signal processing aspects, UCL studied the effects of noise and signal enhancement on the intelligibility of speech to human listeners. The CLEAR centre has led to many publications, which can be seen at www.clear-labs.com. UCL's contribution was to the methodology used for collecting and modelling intelligibility data (Hilkhuysen et al, 2012; Hilkhuysen et al, 2014), and also to the idea of performance-based measures of speech quality (Huckvale & Leak, 2009; Huckvale & Frasi, 2010; Huckvale & Hilkhuysen, 2012).
- Avatar therapy
- With Julian Leff and Geoff Williams I designed and developed an avatar system for use within a novel therapy for patients with schizophrenia who hear voices. The system was highly innovative in that both the avatar face and the avatar voice could be customised to suit each patient. As part of this I had to develop a flexible real-time voice conversion system (Huckvale & Williams, 2013). Avatar therapy was shown to be highly effective in a pilot study (Leff et al, 2013; Leff et al, 2014), with some patients losing their voice hallucinations even after suffering them for many years. A larger trial, funded by the Wellcome Trust, is currently ongoing, and I am now responsible for developing the technical elements into a portable product. See www.phon.ucl.ac.uk/project/avtherapy.
- Voice analysis
- With the UCL Centre for Space Medicine, I have recently set up a new activity in voice analysis. With funding from the European Space Agency and in collaboration with the Russian Gagarin Cosmonaut Training Centre and the Russian Institute of Biomedical Problems, we are currently investigating how characteristics of the voice change with stress and cognitive load (Huckvale, 2014). The goal is to determine the feasibility of tracking the physical and psychological state of aerospace crew on long-term missions. My voice analysis work now promises to make contributions in the clinical domain, with interest in using the voice to help with the diagnosis and rehabilitation of patients with Parkinson’s Disease, Stroke and Dementia.
For a list of publications, see my publications page.