A.I. in Focus: Translating natural language

A big challenge in the development of artificial intelligence is the programming of computers to respond in a human-like way to input from the senses, including touch, vision and hearing.

Dr Reza Haffari is making inroads in the study of natural language processing (NLP). His goal is to enable computers to perform language-related human tasks such as translation and ‘summarisation’, the distillation of detailed information into the essence of the knowledge it contains.

The prime goal in the field of NLP is to develop models of language such that computers will be able to recognise patterns, then use them to make correct predictions based on input from the real world with minimal human involvement.

As with other applications that rely on MASSIVE (see the story on Professor Tom Drummond) for supercomputing power, Dr Haffari is using deep learning to train computers.

The concept is not new. Google Translate is a widely used platform that allows users to move back and forth between two languages.

But Dr Haffari’s model is very different. Whereas Google relies on a vast database of translated examples (so-called ‘annotated data’) for the computer to draw on when solving a problem of translation, Dr Haffari’s model is based on a relatively small pool of annotated data and a much bigger volume of ‘unannotated’ data.
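
The principle can be made concrete with a toy Python sketch. The snippet below is an illustration of the general semi-supervised idea, not Dr Haffari’s actual system: a small pool of annotated (parallel) sentence pairs is stretched by adding synthetic pairs produced from a much larger pool of unannotated target-language text. The `reverse_translate` function and the word-for-word lexicon are hypothetical stand-ins for a real trained model.

```python
# Toy illustration of semi-supervised machine translation via back-translation.
# A small annotated (parallel) corpus is combined with synthetic pairs created
# from a larger unannotated (monolingual) corpus. The "reverse model" below is
# a hypothetical word-for-word dictionary, standing in for a real trained model.

# Small annotated pool: (source, target) sentence pairs a human has translated.
annotated = [
    ("the cat sleeps", "le chat dort"),
    ("the dog runs",   "le chien court"),
]

# Much larger unannotated pool: target-language sentences with no translation.
unannotated_targets = [
    "le chat court",
    "le chien dort",
    # in practice, millions of sentences
]

# Hypothetical reverse model (target -> source), e.g. learned from the annotated pool.
reverse_lexicon = {"le": "the", "chat": "cat", "chien": "dog",
                   "dort": "sleeps", "court": "runs"}

def reverse_translate(target_sentence: str) -> str:
    """Back-translate a target sentence into a synthetic source sentence."""
    return " ".join(reverse_lexicon.get(w, w) for w in target_sentence.split())

# Synthetic pairs: (back-translated source, real target).
synthetic = [(reverse_translate(t), t) for t in unannotated_targets]

# The forward translation model is then trained on both pools together.
training_data = annotated + synthetic
for src, tgt in training_data:
    print(f"{src!r} -> {tgt!r}")
```

In real systems the lexicon is replaced by a neural model, but the principle is the same: plentiful unannotated text is used to stretch a small annotated pool.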

Access to MASSIVE’s supercomputing power, and in particular its all-important graphics processing units (GPUs), is critical to Dr Haffari’s research into enabling computers to make sense of linguistic patterns in untranslated data. GPUs are costly to purchase and otherwise not readily available to Australian scientists.

“We use MASSIVE all the time,” Dr Haffari said. “For our research, MASSIVE is the air that we breathe.”

It’s not only the computing power per se that is so important. MASSIVE also provides much-needed support for the software and hardware that make up the enabling technological package behind Dr Haffari’s research.

As well as machine translation, Dr Haffari is working on the problem of summarisation, which requires training a computer to apply understanding and reasoning to grasp the essential message in a body of information and present it succinctly.
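
To make the task itself concrete, the sketch below shows a deliberately simple frequency-based extractive summariser. It is not Dr Haffari’s approach, which uses deep learning to generate summaries rather than merely select sentences; it only illustrates what distilling a document down to its essential message looks like computationally. The example document and sentence count are invented for the illustration.

```python
# A deliberately simple extractive summariser, shown only to make the task
# concrete. It scores sentences by word frequency and keeps the top ones;
# deep-learning ("abstractive") approaches instead generate new sentences.
from collections import Counter
import re

def summarise(text: str, num_sentences: int = 2) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(words)

    def score(sentence: str) -> int:
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    # Keep the highest-scoring sentences, presented in their original order.
    top = sorted(sorted(sentences, key=score, reverse=True)[:num_sentences],
                 key=sentences.index)
    return " ".join(top)

document = ("Natural language processing lets computers work with human language. "
            "Translation and summarisation are two such language tasks. "
            "Summarisation distils a long document into its essential message. "
            "The weather was pleasant on the day of the interview.")
print(summarise(document))
```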

Behind his approach to both problems is Dr Haffari’s rationale that humans can learn more than one linguistic skill, so suitably trained computers should be able to do the same. Such computers would need to apply previous learning to unfamiliar problems.
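
A shared-encoder architecture is one common way this idea is realised in code. The sketch below, assuming PyTorch and hypothetical layer sizes and names, shows a single encoder feeding two task-specific output heads, so whatever the encoder learns about language for translation is also available for summarisation. It illustrates the general multi-task pattern rather than Dr Haffari’s actual models.

```python
# Sketch of multi-task learning, assuming PyTorch: one encoder is shared
# between a translation head and a summarisation head, so representations
# learned for one task can help the other. Sizes and names are hypothetical.
import torch
import torch.nn as nn

VOCAB_SIZE, EMB_DIM, HID_DIM = 10_000, 128, 256

class SharedEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.rnn = nn.GRU(EMB_DIM, HID_DIM, batch_first=True)

    def forward(self, token_ids):
        _, hidden = self.rnn(self.embed(token_ids))
        return hidden[-1]                      # one vector per input sequence

class MultiTaskModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = SharedEncoder()         # shared across both tasks
        self.translation_head = nn.Linear(HID_DIM, VOCAB_SIZE)
        self.summarisation_head = nn.Linear(HID_DIM, VOCAB_SIZE)

    def forward(self, token_ids, task: str):
        features = self.encoder(token_ids)
        head = self.translation_head if task == "translate" else self.summarisation_head
        return head(features)                  # next-token scores for the chosen task

model = MultiTaskModel()
batch = torch.randint(0, VOCAB_SIZE, (4, 12))   # 4 sequences of 12 token ids
print(model(batch, task="translate").shape)     # torch.Size([4, 10000])
print(model(batch, task="summarise").shape)
```

Because the encoder is updated by both tasks during training, learning gained on one task can carry over to the other, which is the sense in which a suitably trained computer applies previous learning to an unfamiliar problem.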