Microsoft’s AI research team has made a major breakthrough in speech recognition, developing a system that recognizes the words in a conversation as well as a person does. The group published its findings in a paper, with the company’s chief speech scientist Xuedong Huang calling it a historic achievement.
The Microsoft Artificial Intelligence and Research Group claims that the system has a word error rate (WER) of just 5.9%, down from the 6.3% the team reported a month ago. The 5.9% figure is approximately equal to that of people who were asked to transcribe the same conversation as the machine.
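For readers unfamiliar with the metric, WER is conventionally computed as the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only; the function name `word_error_rate` and the sample sentences are made up for illustration, and this is not Microsoft's evaluation code, which would also normalize text before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative helper (hypothetical name): splits on whitespace and
    applies no text normalization such as lowercasing or punctuation removal.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference
    # words processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one wrong word out of five.
    wer = word_error_rate("the quick brown fox jumps", "the quick brown box jumps")
    print(f"WER: {wer:.1%}")
```

On this scale, a WER of 5.9% means roughly six word errors for every hundred words in the reference transcript.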
It is also the lowest WER ever recorded on the industry-standard Switchboard speech recognition task. Microsoft considers this a major milestone, claiming that it marks the first time a computer has identified the words in a conversation as well as a person would.
Microsoft plans to take advantage of the breakthrough by implementing the system across its product portfolio. This includes consumer entertainment offerings such as Xbox and digital assistants like Cortana. The company believes the technology will make such assistants more powerful, with an eye on crafting a truly intelligent assistant.
However, the team has a number of challenges ahead of it. Achieving human parity is only the first step of many. One major obstacle is making sure recognition holds up in real-life situations, where several people may be talking at once or there may be loud background noise.
Moving forward, the researchers also plan to deepen their focus on teaching machines to go beyond transcription and actually understand the words they hear and the context behind them.