Researchers at Xerox Corporation have unveiled new document search software, FactSpotter, which they claim goes beyond the conventional ‘keyword’ search.
Representatives say that the new text-mining software combines a powerful linguistic engine with an easy-to-use interface so that anyone can query the system in everyday language. Unlike traditional enterprise search tools, FactSpotter looks at both the keywords contained in a query and the context of the document in which those words appear.
According to Xerox, the search engine can comb through almost any document regardless of language, location, format or type; take advantage of the way humans think, speak and ask questions; and discriminate among results, highlighting just a handful of relevant answers instead of returning thousands of unrelated responses.
Frederique Segond, manager of parsing and semantics research at Xerox Research Centre Europe (XRCE), said, “Our advanced search engine goes beyond today’s typical ‘keyword’ search or current data-mining programs, which typically end up searching only 40 percent of all the documents that are relevant because the keywords are too limiting. Xerox’s tool is more accurate because it delves into documents, extracting the concepts and the relationships among them. By ‘understanding’ the context, it returns the right information to the searcher, and it even highlights the exact location of the answer within the document.”
FactSpotter’s interface allows users to express their queries naturally instead of forcing them to adapt their questions to the logic of computers. Traditional systems, on the other hand, split a query into isolated words and return only documents that contain exactly those words. Also, unlike traditional search engines, which return the entire document and force the user to find the relevant information manually, FactSpotter returns the specific portion of a searched document that is relevant to the query.
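The contrast described above can be illustrated with a toy sketch in Python. This is not Xerox's implementation, which has not been published in this article; the function names (`keyword_search`, `passage_search`) and the sample documents are invented for illustration. The baseline splits a query into isolated words and returns whole documents containing all of them, while the second function returns only the matching sentences.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_search(query, documents):
    """Baseline: return whole documents that contain every query word."""
    terms = tokenize(query)
    return [doc for doc in documents if terms <= tokenize(doc)]

def passage_search(query, documents):
    """Return only the sentences that contain every query word."""
    terms = tokenize(query)
    hits = []
    for doc in documents:
        # Naive sentence split on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            if terms <= tokenize(sentence):
                hits.append(sentence)
    return hits

# Hypothetical sample data, made up for this example.
documents = [
    "The merger closed in May. Smith signed the contract in Paris. Fees were paid later.",
    "Jones reviewed the budget. No contract was discussed.",
]

whole_docs = keyword_search("Smith contract", documents)
passages = passage_search("Smith contract", documents)
natural = keyword_search("Who signed the contract", documents)
```

Here `whole_docs` holds the entire first document, `passages` holds only the sentence about Smith, and `natural` is empty because the word "who" never appears in the text. That last case shows why exact word matching handles everyday questions poorly; a linguistic engine of the kind Xerox describes would need to go well beyond this sketch.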
FactSpotter takes into account the context of the entire document instead of just a cluster of nearby words. It introduces the concept of ‘relation,’ searching within and across sentences and paragraphs. The software also recognizes abstract concepts, like ‘people’ or ‘building,’ and will retrieve all the words that fit within that category.
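As a rough illustration of category-based retrieval, the Python sketch below expands an abstract term such as ‘people’ or ‘building’ into the words that belong to it before matching. The `CONCEPTS` table, the `concept_search` function and the sample sentences are assumptions made up for this example; how FactSpotter actually builds and applies its categories is not described here.

```python
import re

# Hypothetical hand-written lexicon; a real system would presumably derive
# categories from linguistic analysis rather than a fixed table.
CONCEPTS = {
    "people": {"smith", "jones", "lawyer", "manager"},
    "building": {"warehouse", "office", "tower"},
}

def expand(term):
    """Return the term itself plus any words filed under it as a concept."""
    term = term.lower()
    return CONCEPTS.get(term, set()) | {term}

def concept_search(query_terms, sentences):
    """Return sentences in which every query term, or a member of its concept, appears."""
    hits = []
    for sentence in sentences:
        words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        if all(expand(term) & words for term in query_terms):
            hits.append(sentence)
    return hits

# Hypothetical sample data, made up for this example.
sentences = [
    "Smith entered the warehouse at noon.",
    "The lawyer left the office early.",
    "The shipment arrived on Tuesday.",
]

matches = concept_search(["people", "building"], sentences)
```

With this data, `matches` contains the first two sentences, neither of which includes the literal words "people" or "building". A plain keyword search for those two words would return nothing.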
Xerox plans to launch FactSpotter next year as part of its Xerox Litigation Services offerings.