What you will find here

Tuesday, October 21, 2008

Project - consideration


As I said in my last article, the topic of my final essay is search engines, with a special focus on natural language search engines.

First and second generation of search engines
Technologies should help people do things more easily and faster. The first generation of search engines was quite complicated, and it was usually necessary to have the help of a third party (a librarian or information specialist), because searching required a special, prescribed query language. It was necessary to know how ... Then came the second generation of search engines, which is based on Boolean logic and uses a graphical, user-friendly interface. That is actually the search engine as we know it today. I don't think it is necessary to give an example here, because everyone knows them: Google or Yahoo. They are certainly not the only ones, but they are the biggest and the most famous. Nowadays users can use this kind of interface quite easily, and most users are able to work with Boolean logic. That means using keywords together with the basic operators AND, OR and NOT, combined with the full-text search that common search engines offer today.
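To make the Boolean operators a bit more concrete, here is a minimal sketch in Python. It is only my own illustration, not how Google or Yahoo actually work: the three tiny documents, their ids and the function names (`build_index`, `search_and`, `search_or`, `search_not`) are all invented for this example. The idea is that every keyword points to a set of documents, and AND, OR and NOT are simply set operations on those sets.

```python
from collections import defaultdict

# Hypothetical toy corpus; ids and texts are invented for illustration only.
DOCS = {
    1: "natural language search engines answer questions",
    2: "boolean search engines combine keywords with operators",
    3: "libraries help users search for information",
}


def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def lookup(index, word):
    """Return a copy of the set of document ids for one keyword."""
    return set(index.get(word.lower(), set()))


def search_and(index, a, b):
    """Documents that contain both keywords."""
    return lookup(index, a) & lookup(index, b)


def search_or(index, a, b):
    """Documents that contain at least one of the keywords."""
    return lookup(index, a) | lookup(index, b)


def search_not(index, a, b):
    """Documents that contain keyword a but not keyword b."""
    return lookup(index, a) - lookup(index, b)


INDEX = build_index(DOCS)

print(search_and(INDEX, "search", "engines"))
print(search_or(INDEX, "boolean", "libraries"))
print(search_not(INDEX, "search", "boolean"))
```

A real engine would of course add ranking, stemming and much more, but the basic combination of keywords and operators probably looks something like this.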

In my opinion, this is a good way to search for information. The process of thinking about and searching for information is actually based on simple definitions of keywords and their combinations. Our brain is used to working with these basic forms: whenever we create a sentence or any other utterance, we have to draw on at least two different areas, vocabulary and grammar. How we use them depends on our language or communication skills. The way of searching for information reflects the way of thinking, but the final step of creating the sentence does not take place in the brain; it takes place directly in the search engine. The interface of the search engine makes this process easier. In the better search engines you can use suggestions, for example, to see how other people use a given keyword and in which combinations, which can help you create exactly the right query.
The conclusion is that this way of searching can be used right from the beginning of the thinking process and helps to create the right query or sentence directly during the search.

Third generation ... ?
Nowadays there is a strong trend to develop new engines based on natural language searching. The idea is that the user writes a question in natural language (mostly in English), and the system decodes the keywords from this sentence, goes through the web pages and uses these keywords to find the answer. The problem is that decoding the sentence can be quite difficult because of a basic characteristic of natural language, its asymmetry: there are different ways to express the same thing. As a result the system is often confused and gives a wrong answer or no answer at all.
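The decoding step described above can be sketched very roughly like this. It is only a simplified assumption on my side, not the method of any real natural language engine: the stop-word list is tiny and invented, and the function name `extract_keywords` is hypothetical. The sketch strips the "grammar words" from a question and keeps the rest as keywords, and it also shows the asymmetry problem: two questions that ask the same thing end up with quite different keywords.

```python
import re

# Assumed, deliberately tiny stop-word list; real systems use far larger ones.
STOP_WORDS = {
    "a", "an", "the", "is", "are", "was", "what", "who", "which",
    "how", "of", "in", "to", "do", "does", "for",
}


def extract_keywords(question):
    """Lowercase the question, split it into words and drop stop words."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return [word for word in words if word not in STOP_WORDS]


# Two ways of asking for the same thing.
q1 = "What is the capital of France?"
q2 = "Which city is the seat of government in France?"

k1 = extract_keywords(q1)
k2 = extract_keywords(q2)

print(k1)
print(k2)
# Only the keywords shared by both questions are printed here.
print(set(k1) & set(k2))
```

Both questions share only one keyword, so a system that simply matches keywords would probably return different results for them, even though a human sees the same question.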

If we look at this problem, we can see that there are two stages of coding and decoding information. The first takes place in our brain, when we have to work out how to create a proper sentence that the system will be able to answer; then the system decodes this sentence back into its basic elements and finds the answer. During the first coding stage, making the sentence in our head, we have to rely only on our own knowledge, because these search engines usually can't offer any suggestions. At this point I think the searching is much more exhausting than when you can use the clusters or other aids of common search engines.

What do libraries say about it?
The use of natural language could perhaps help them attract older users, who would feel more comfortable using natural language than traditional Boolean logic. It could be more reminiscent of the traditional conversation with a librarian. And in contrast to Boolean search engines, the system would speak like a normal human. But is that enough?

More interesting would be the implementation of something like Trueknowledge, a search engine that uses natural language but relies on its own knowledge database, which is based on the Wikipedia database. I will say more about this engine later...
Or the Powerset system, which scans web pages, creates abstracts of them and offers brief information about the article. At the moment it covers only Wikipedia articles, but the plan is to extend its scope. This is a really nice engine ... and in contrast to the others it works quite well ... I will also write about it in a separate article ...

Conclusion
This was just a theoretical consideration of the use of natural language in search engines. Just a small theory...
