Quest for next-generation search engine

Friday, 20 March, 2009


A computer grid designed to analyse data that will be generated by the world’s biggest scientific experiment is being used by two high-tech companies to help build a next-generation internet search engine.

The Cambridge start-up firms, Imense and iLexIR, have created a joint venture called Camtology that combines their expertise and products to enable online searching of both text and images.

Their aim is to provide a British-based search engine capable of competing with the best providers on the world stage, capturing a large share of the huge market for search services.

Imense is already building the next generation of image searching, developing innovative, more powerful solutions that make retrieval of images easier than with any existing system on the internet.

iLexIR focuses on natural language processing, aimed at identifying relevant information, as opposed to just looking at individual words in a document.

The Camtology team is using GridPP to test and enhance its software. Funded by Britain’s Science & Technology Facilities Council (STFC), this computer grid was built to handle and analyse Britain’s share of the petabytes of data generated annually by the Large Hadron Collider project at the European Organisation for Nuclear Research (CERN) in Switzerland, a task that requires huge data storage and processing capabilities. (One petabyte is one quadrillion bytes.)

With the aim of becoming 'the Google of image searching', Imense has developed a search engine that will make sense of the huge numbers of pictures on the web.

Although images and video make up more than 70% of digital data available on the internet, traditional software cannot index this information directly and relies entirely on text descriptions entered by hand.

Imense’s software can look at a photo and recognise the colours, shapes, objects and scenes and retrieve images based on their content, without the need for human-generated captions.

It also uses a query language — the user just types in a few key words and the software can interpret the request and match it to relevant images on the basis of their visual content.

The Grid’s vast processing power has enabled Imense to test and demonstrate its software on millions upon millions of images, a scale that would otherwise have been impossible.

iLexIR is focusing on natural language processing. Current search engines present pages of results in order of expected relevance to a query, based on key words typed in by the user. This usually returns vast numbers of irrelevant pages and often omits some important results.

Natural language processing can help both with interpreting the query and, crucially, with interpreting the pages that may contain the answers.
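The limitation of pure key-word matching described above can be illustrated with a toy example. This is only a sketch under stated assumptions: it is not how iLexIR, Camtology or any real search engine works, and the function names, page texts and query are all hypothetical. It ranks pages simply by counting how often the query's words occur in them.

```python
from collections import Counter


def keyword_score(query: str, page: str) -> int:
    """Count occurrences of the query's words in the page (case-insensitive)."""
    words = Counter(page.lower().split())
    return sum(words[term] for term in query.lower().split())


def rank(query: str, pages: dict[str, str]) -> list[str]:
    """Return page names ordered by descending key-word score."""
    return sorted(pages, key=lambda name: keyword_score(query, pages[name]), reverse=True)


# Hypothetical pages: only "company" is actually about corporate profits.
pages = {
    "fruit": "apple pie recipe with fresh apple slices",
    "company": "the iphone maker reported record quarterly profits",
    "orchard": "apple orchard sells apple juice and apple jam",
}

ranking = rank("apple profits", pages)
print(ranking)
```

In this sketch the pages that merely repeat the word "apple" score highest, while the one page about a company's profits comes last because it never uses the word. A system that interprets the meaning of both the query and the page, rather than counting words, is aimed at exactly this kind of mismatch.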

