Hearing is a complex sensory function that converts mechanical waves (sound) into electrical patterns on the auditory nerve. The sound receptor is the inner ear, or cochlea. Each year more than 5,000 babies are born deaf in Europe (i.e., one out of 1,500 newborns). Left untreated, this leads to deaf-muteness. In adults, the prevalence of ‘cochlear’ hearing loss increases with age. Approximately 20% of people over 75 years of age have moderate hearing loss and 3% have severe hearing loss. Severe and profound hearing losses can be treated with cochlear implants (CIs). When a CI is implanted, some 10–20 electrode contacts are surgically placed into the cochlea. Sound is analyzed by an external speech processor, which resembles a classical behind-the-ear hearing aid. The implanted electrodes are then stimulated to generate an electrical field that passes the information directly to the auditory nerves in the cochlea (see Figure 1).
Figure 1. Components of a cochlear implant. (1) An external behind-the-ear processor receives sound through a microphone, processes it and (2) delivers it to the internal components through a radio frequency link. (3) An implanted coil receives the signal, and (4) an internal device converts it to electrical pulses that are delivered to (5) an array of electrodes in the cochlea.
The working of a CI is controlled by roughly 200 tunable parameters that determine the ‘input-output’ behaviour of the speech processor: sensitivity levels at different frequencies, electrical dynamic ranges for each electrode, characteristics of amplifiers and strategies for stimulating electrodes.
Following implantation, a CI must be programmed or ‘fitted’ to optimize the hearing sensation of the individual patient. This is a challenging and time-consuming task that is typically performed by highly trained engineers, audiologists or medical doctors. One reason is that the outcome of the optimization process is difficult to measure. Tone and speech audiometry are the only outcome measures that are used clinically all over the world, but they cover only a small part of overall auditory performance and provide little analytical feedback to the fitter. Consequently, many fitters rely on instantaneous feedback from patients. However, since patients are often very young or may never have heard ‘normally’ before, this feedback often relates more to comfort than to the intrinsic accuracy of sound coding.
As a result, CI centres and manufacturers have developed their own heuristics, usually in the form of simple ‘if-then’ rules that are applied in a very flexible, individual and uncontrollable way. At present, no universal standards or well-defined good clinical practices exist to guide the fitters. With more than 200,000 CI users worldwide and an annual increase of over 30,000, this lack of feedback represents an ever-increasing problem and a real bottleneck to further implementation.
Opti-Fox (OPTImization of the automated Fitting to Outcomes eXpert) is a project funded by the European Commission's Seventh Framework Programme (FP7) of research under the Small and Medium-Sized Enterprises (SMEs) initiative. Our objective for this project is to develop an intelligent, self-learning agent (or system) for CI fitting (see Figure 2). We will combine the latest technologies from linguistics, automatic speech recognition, machine learning and optimization. The consortium consists of SMEs and research institutes from Belgium, the Netherlands and Germany, in close collaboration with the CI manufacturer Advanced Bionics. Details are available elsewhere.1–3
Figure 2. The Opti-Fox logo, which combines the logos of the FOX (Fitting to Outcomes eXpert) intelligent-agent software and the European Commission's Seventh Framework Programme.
Prior to this project, we developed several psycho-acoustic tests to better monitor the auditory performance of CI users and to provide feedback to the fitter.4 These tests make it possible to break down the coding of sound into its different components, such as intensity, spectral content and temporal content. When the results are compared with those of hearing subjects (‘the norm’), deviations can be linked directly to the way the CI processes the particular sound component.
Ongoing research is focused on a new speech-understanding test that is language independent, provides automatic scoring by means of automatic speech recognition (ASR) technology and allows detailed spectral analysis to feed back to CI fitters. Together with linguists and computational scientists, we are developing grapheme (symbol)- and phoneme (sound)-based inventories that characterize languages. They will be validated on large corpora of several European languages and will allow us to draw customized samples for individual patients that are representative of the entire language in terms of phonetics, typology, morpho-syntax (word and sentence structure) and so forth.
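One way to picture what a phonetically representative sample could mean is sketched below. This is an illustration only, not the project's method: it assumes a hypothetical `lexicon` that maps each word to a phoneme sequence, counts each word once (a real corpus would weight words by frequency), and greedily picks the words that keep the sample's phoneme distribution closest to that of the whole lexicon. It covers phonetics only and ignores typology, morpho-syntax and any per-patient customization.

```python
from collections import Counter
from typing import Dict, List


def distribution(counts: Counter) -> Dict[str, float]:
    """Turn raw phoneme counts into relative frequencies."""
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()} if total else {}


def distance(target: Dict[str, float], counts: Counter) -> float:
    """L1 distance between the target distribution and a sample's distribution."""
    sample = distribution(counts)
    keys = set(target) | set(sample)
    return sum(abs(target.get(k, 0.0) - sample.get(k, 0.0)) for k in keys)


def draw_sample(lexicon: Dict[str, List[str]], size: int) -> List[str]:
    """Greedily select `size` words whose phonemes mirror the whole lexicon.

    `lexicon` is a hypothetical word -> phoneme-list mapping (an assumption).
    """
    corpus_counts = Counter(p for phones in lexicon.values() for p in phones)
    target = distribution(corpus_counts)
    chosen: List[str] = []
    counts: Counter = Counter()
    remaining = sorted(lexicon)  # sorted for deterministic tie-breaking
    for _ in range(min(size, len(remaining))):
        # Pick the word that brings the sample closest to the target.
        best = min(
            remaining,
            key=lambda w: distance(target, counts + Counter(lexicon[w])),
        )
        chosen.append(best)
        counts += Counter(lexicon[best])
        remaining.remove(best)
    return chosen
```

A greedy selection is used here only because it is short and easy to follow; it is not guaranteed to find the best possible sample.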
We are also creating automated scoring strategies using ASR. Algorithms are being developed and investigated to compare reference utterances with test utterances that may contain errors due to the test person's hearing deficit. In a first stage, we recorded 30 hearing volunteers whose mother tongue is Flemish, Dutch or German. The subjects were asked to (re)produce about 400 words 20 times each, yielding around 240,000 .wav files. This vast data set will serve as the first validation of the algorithms under construction. Future stages will include validation based on hearing-impaired subjects.
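To make the idea of comparing a reference utterance with a test utterance concrete, here is a minimal sketch. It assumes that both utterances have already been transcribed into phoneme sequences (for instance by an ASR front end, which is not shown) and aligns them with a standard unit-cost edit distance. The function names `align` and `phoneme_score` are hypothetical, and the text does not state that the project's algorithms work this way.

```python
from typing import List, Sequence, Tuple


def align(reference: Sequence[str], response: Sequence[str]) -> List[Tuple[str, str, str]]:
    """Align two phoneme sequences with unit-cost Levenshtein edit distance.

    Returns (operation, reference_phoneme, response_phoneme) tuples, where
    operation is "match", "substitute", "delete" or "insert".
    """
    n, m = len(reference), len(response)
    # cost[i][j] = edit distance between reference[:i] and response[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (reference[i - 1] != response[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover the operations.
    ops: List[Tuple[str, str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (reference[i - 1] != response[j - 1])):
            op = "match" if reference[i - 1] == response[j - 1] else "substitute"
            ops.append((op, reference[i - 1], response[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("delete", reference[i - 1], ""))  # phoneme was dropped
            i -= 1
        else:
            ops.append(("insert", "", response[j - 1]))  # extra phoneme produced
            j -= 1
    ops.reverse()
    return ops


def phoneme_score(reference: Sequence[str], response: Sequence[str]) -> float:
    """Fraction of reference phonemes reproduced correctly (0.0 to 1.0)."""
    if not reference:
        return 1.0 if not response else 0.0
    matches = sum(1 for op, _, _ in align(reference, response) if op == "match")
    return matches / len(reference)
```

For example, `align(["k", "a", "t"], ["k", "a", "p"])` reports a substitution of "t" by "p". Collecting such substitutions per phoneme is one conceivable way to obtain the kind of detailed feedback a fitter could act on.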
Finally, we plan to devise self-learning tuning strategies that will minimize the number of ‘test-tune’ steps in the fitting process by combining machine learning with a variety of optimization and search methods. The result will be a self-learning agent that will employ both historical and very recent data to continually improve the tuning strategy.
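The ‘test-tune’ loop can be illustrated with a deliberately simple sketch. It is not the agent described above and contains no machine learning: it is a plain greedy local search that changes one parameter at a time, keeps a change only if a measured outcome improves, and stops after a fixed budget of tests. The outcome function `measure_outcome`, the starting parameters and the step size are all hypothetical placeholders. In practice, the starting point might be taken from historical fittings.

```python
import random
from typing import Callable, List, Tuple


def tune(
    start: List[float],
    measure_outcome: Callable[[List[float]], float],
    step: float = 1.0,
    max_tests: int = 50,
    seed: int = 0,
) -> Tuple[List[float], float, int]:
    """Greedy local search over a parameter vector (higher outcome is better).

    Returns (best_parameters, best_outcome, number_of_tests_used).
    """
    rng = random.Random(seed)
    best = list(start)
    best_score = measure_outcome(best)  # the first "test"
    tests = 1
    while tests < max_tests and step > 1e-3:
        improved = False
        order = list(range(len(best)))
        rng.shuffle(order)  # visit parameters in random order
        for index in order:
            for delta in (step, -step):
                if tests >= max_tests:
                    break
                candidate = list(best)
                candidate[index] += delta           # "tune"
                score = measure_outcome(candidate)  # "test"
                tests += 1
                if score > best_score:
                    best, best_score, improved = candidate, score, True
                    break
        if not improved:
            step /= 2  # no parameter helped: search more finely
    return best, best_score, tests


# Toy usage: the "outcome" is highest at a hidden optimum of (3, -2).
def toy_outcome(params: List[float]) -> float:
    return -((params[0] - 3.0) ** 2 + (params[1] + 2.0) ** 2)


best_params, best_outcome, tests_used = tune([0.0, 0.0], toy_outcome)
```

Each call to `measure_outcome` stands for one test session with a patient, which is why the search is budgeted by `max_tests` rather than run to convergence.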