Multilingual Search Engine

Researchers from the Validation and Business Applications Group (VAI) at the 's School of Computing have developed a multilingual search engine that queries a content repository written in an interlingua, using questions formulated in any language. The search engine returns a precise answer in the language in which the question was formulated.

An "interlingua" is a language-independent representation of content. The United Nations' Universal Networking Language (UNL) is the only general-purpose interlingua specified by standards, handbooks, and governing organizations. UNL was created to break Internet language barriers, and the VAI is the UNL support group for the Spanish language. The multilingual search engine is a question-answering system that aims to return precise answers to factual questions formulated in the user's mother tongue.

The novelty of this system is that the question can be formulated in English, French, Spanish, or any other language, and the system returns an answer in that same language without any translation between source and target languages, because the information base that the system searches is written in UNL.

Supposing that the answer is implicit in the question, the system exploits the features of the UNL representation of the user's question to find the answer. The search engine works by deducing the answer from the question rather than "finding" the answer to the question.

How It Works

The search engine finds the answer in the UNL text corpus in three steps. First, it searches the corpus for statements that could contain the answer. Second, it determines which of these statements contains the answer, and what the answer is. Third, it generates the answer in the language in which the question was formulated.
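The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' implementation: the corpus, the relation labels, the tiny per-language lexicon, and every function name here are hypothetical, and a real UNL statement is far richer than a set of three-part relations.

```python
from typing import NamedTuple, Optional


class Relation(NamedTuple):
    label: str  # toy relation label (assumption, not taken from the article)
    head: str   # concept the relation starts from
    tail: str   # concept the relation points to


# Hypothetical toy corpus: each statement is a set of relations between concepts.
CORPUS = [
    frozenset({Relation("agt", "build", "engineer"),
               Relation("obj", "build", "bridge"),
               Relation("tim", "build", "1930")}),
    frozenset({Relation("agt", "teach", "engineer"),
               Relation("plc", "teach", "school")}),
]

# Hypothetical lexicon: one wording per language for each answer concept.
LEXICON = {
    "en": {"1930": "In 1930", "school": "At the school"},
    "es": {"1930": "En 1930", "school": "En la escuela"},
}


def concepts(statement):
    """Collect every concept mentioned in a statement."""
    return {c for r in statement for c in (r.head, r.tail)}


def find_candidates(corpus, question_concepts):
    """Step 1: keep statements that mention every concept in the question."""
    return [s for s in corpus if question_concepts <= concepts(s)]


def extract_answer(candidates, head, wanted_label) -> Optional[str]:
    """Step 2: find the concept linked to `head` by the relation asked about."""
    for statement in candidates:
        for r in statement:
            if r.label == wanted_label and r.head == head:
                return r.tail
    return None


def generate(concept, language):
    """Step 3: word the language-independent concept in the user's language."""
    return LEXICON[language][concept]


def answer(question_concepts, head, wanted_label, language):
    candidates = find_candidates(CORPUS, question_concepts)
    concept = extract_answer(candidates, head, wanted_label)
    return None if concept is None else generate(concept, language)


# "When was the bridge built?" asked in Spanish and in English.
print(answer({"build", "bridge"}, "build", "tim", "es"))
print(answer({"build", "bridge"}, "build", "tim", "en"))
```

Because the stored statement is language-independent, the same match serves both questions; only the final generation step depends on the user's language.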

In response to the question "Why was Aubert awarded the Camere prize?", for example, the search engine searches the repository and locates the relevant UNL graph. From this graph, it deduces the answer to the question: "For a new type of movable dam."
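One way to picture this deduction is as graph matching: the question is represented as the same kind of graph as the stored statement, with a single unknown node, and the answer is whatever fills that node. The sketch below is an assumption-laden toy, not the actual graph from the research: the triples, the relation labels, and the deduce function are all hypothetical.

```python
UNKNOWN = "?"

# Hypothetical stand-in for the stored statement: (relation, from_node, to_node).
statement = {
    ("obj", "award", "Camere prize"),
    ("gol", "award", "Aubert"),
    ("rsn", "award", "new type of movable dam"),
}

# The question has the same shape, with the reason left unknown.
question = {
    ("obj", "award", "Camere prize"),
    ("gol", "award", "Aubert"),
    ("rsn", "award", UNKNOWN),
}


def deduce(question, statement):
    """Return the node that fills the UNKNOWN slot, or None if the graphs differ."""
    answer = None
    for label, head, tail in question:
        if tail == UNKNOWN:
            # Look for a stored relation with the same label and starting node.
            fillers = [t for (l, h, t) in statement if l == label and h == head]
            if not fillers:
                return None
            answer = fillers[0]
        elif (label, head, tail) not in statement:
            # A known part of the question is absent, so this graph is not a match.
            return None
    return answer


reason = deduce(question, statement)
print(f"For a {reason}.")
```

A question about a different person or prize shares no matching triples with this statement, so deduce returns None and the system would move on to another candidate graph.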

Promising Results

For the exercise, which concerned the French engineer Jean Aubert (1894-1984), researchers used the UNESCO biographical encyclopedia as the information base. This encyclopedia has 25 articles, which have been translated into UNL and contain 101 UNL expressions and 2,534 universal words.

The results of this research are promising: 82% of the answers were precise. A total of 75 different questions (when, how, who) were formulated, for which the right answer was known beforehand. Other questions, for which the repository contained no answer, were also formulated to examine system behavior in such cases. The results confirm the validity of this search engine for developing multilingual question-answering systems.

The complete findings of this research, conducted by Jesus Cardenosa (VAI director), Carolina Gallardo and Miguel A. de la Villa, were presented at the 8th International Conference FQAS 2009 in Denmark, in October 2009, and are available in Springer's Lecture Notes in Artificial Intelligence 5822.

