What you will find here

Tuesday, October 21, 2008

Project - consideration


As I said in my last article, the topic of my final essay is search engines, with a special focus on natural language search engines.

First and second generation of search engines
Technology should help people do things more easily and quickly. The first generation of search engines was quite complicated, and it was usually necessary to have the help of a third party (a librarian or information specialist), because these systems used a special, prescribed query language. It was necessary to know how ... Then came the second generation of search engines, which was based on boolean logic and used a user-friendly graphical interface. That is actually the search engine as we know it today. I don't think it is necessary to give an example here, because everyone knows them: Google or Yahoo. They are certainly not the only ones, but they are the biggest and the most famous. Nowadays users can use this kind of interface quite easily, and most users are able to work with boolean logic. That means using the basic operators AND, OR and NOT together with keywords, combined with the full-text search that common search engines offer today.
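To make the boolean idea concrete, here is a minimal sketch in Python. It is only an illustration and not how Google or Yahoo work internally: the three sample documents, the tiny inverted index and the helper names (`build_index`, `term`, `and_`, `or_`, `not_`) are all my own assumptions.

```python
# Toy document collection; the texts are invented for illustration only.
DOCS = {
    1: "natural language search engines answer questions",
    2: "boolean search engines combine keywords with operators",
    3: "libraries use catalogues and search engines",
}


def build_index(docs):
    """Map each word to the set of ids of the documents that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)
ALL_IDS = set(DOCS)


def term(word):
    """Documents that contain the keyword (empty set if it is unknown)."""
    return INDEX.get(word.lower(), set())


def and_(a, b):
    return a & b  # both keywords must occur


def or_(a, b):
    return a | b  # at least one keyword must occur


def not_(a):
    return ALL_IDS - a  # every document except the matching ones


# Query: search AND engines AND NOT boolean
result = and_(and_(term("search"), term("engines")), not_(term("boolean")))
print(sorted(result))
```

Each operator is just a set operation on the lists of matching documents, which is probably why this kind of query is so easy for a machine to process exactly.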

In my opinion this is a good way to search for information. The process of thinking about and searching for information is actually based on simple definitions of keywords and their combinations. Our brain is used to working with these basic forms: whenever we create a sentence or any other utterance, we have to draw on at least two different areas, vocabulary and grammar. How we use them depends on our language and communication skills. The way we search for information actually reflects the way we think, except that the final step of forming the sentence does not happen in the brain but takes place directly in the search engine. The interface of the search engine makes this process easier. In the better search engines you can use, for example, suggestions to see how other people use a keyword and in which combinations, which can help you to build exactly the query you need.
The conclusion is that this way of searching is used from the very beginning of the thinking process and helps to build the right query, or sentence, directly during that process.

Third generation ... ?
Nowadays there is a strong trend toward developing new engines based on natural language searching. The idea is that the user writes a question in natural language (mostly in English), and the system decodes the keywords from this sentence, goes through the web pages and uses these keywords to find the answer. The problem is that decoding the sentence can be quite difficult because of a basic characteristic of natural language, its asymmetry: there are different ways to express the same thing. As a result, the system is often confused and gives the wrong answer or no answer at all.
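The decoding step can be sketched very roughly in Python. This is a deliberately naive illustration under my own assumptions (a small hand-made stopword list and the function name `extract_keywords`); real natural language engines surely do much more than this.

```python
import re

# Small hand-made list of words that carry no search meaning (an assumption,
# real systems use much larger lists and real linguistic analysis).
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "were", "do", "does", "did",
    "who", "what", "when", "where", "which", "how", "why",
    "of", "in", "on", "to", "for", "and", "or",
}


def extract_keywords(question):
    """Lowercase the question, split it into words and drop the stopwords."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return [w for w in words if w not in STOPWORDS]


# Two ways of asking the same thing
q1 = "Who wrote A Christmas Carol?"
q2 = "Who is the author of A Christmas Carol?"

print(extract_keywords(q1))  # ['wrote', 'christmas', 'carol']
print(extract_keywords(q2))  # ['author', 'christmas', 'carol']
```

Both questions mean the same, but the decoded keywords differ ("wrote" against "author"), so a system that only matches keywords could return different answers for them. That is the asymmetry problem in miniature.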

If we look at this problem, we can see that two processes of coding and decoding information exist here. The first is in our brain, when we have to think out how to create a proper sentence that the system will be able to answer; then the system decodes this sentence back into its basic elements and finds the answer. During the first coding process, making the sentence in the brain, we have to work with our own knowledge alone, because these search engines usually can't offer any suggestions. At this point I think searching is much more exhausting than when you can use the clusters or other aids of common search engines.

What do libraries say about it?
The use of natural language could perhaps help them attract older users, because they would feel more comfortable using natural language than traditional boolean logic. It could be more reminiscent of the traditional conversation with a librarian. And in contrast to boolean search engines, the system would speak like a normal human. But is that enough?

More interesting would be the implementation of something like Trueknowledge, a search engine which uses natural language but relies on its own knowledge database, built on the Wikipedia database. I will say more about this engine later...
Or the system Powerset, which helps to scan web pages, makes an abstract of them and offers brief information about the article. At the moment it is oriented only toward Wikipedia articles, but the plan is to extend the focus. This is a really nice engine ... and in contrast to the others it works quite well ... I will also write about it in a separate article ...

Conclusion
This was just a theoretical consideration of the use of natural language in search engines. Just a small theory...

Monday, October 13, 2008

Project

The main project of our course should be something like an essay, or whatever, which should be somehow connected with library 2.0. It will be quite hard for me, but we will see... You know that I am obsessed with search engines, so it's no surprise that I will write my essay on such a topic. I thought about focusing on natural language searching, which is quite trendy today. I actually don't believe that it is so valuable and necessary, because nowadays people are quite used to using boolean logic and so on. And even if I don't think about HOW to make such a search engine, I still always have the question of WHY... But we will see, maybe I will change my view ...
Some of them. I will say more about them later, but to begin with you can form your own opinion ... :

Sunday, October 5, 2008

How trendy is Christmas?

Blogpulse is a web service which can measure the frequency or popularity of different topics discussed on blogs. The searching is based on full text. That means that it shows all texts which mention the searched word, but without any context or evaluation. A nice example is Christmas. My hypothesis was that there would be nothing interesting about this topic, at least during the summer months. I supposed that there would be a slow increase during September, because Christmas time is coming. But as you can see in the graph, there was a rapid increase of this topic in blog articles at the end of June. It was very surprising for me...
After a deeper look at these blog articles I found out that someone had created and sent around a kind of survey about books. A lot of people copied the list of books, added marks for quality, readability etc. and presented it on their blogs.
And how could that influence the Christmas topic? Easily ... the first book in the list was A Christmas Carol by Charles Dickens.
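The effect is easy to reproduce with a small Python sketch. Everything here is assumed for illustration: the four blog posts are invented, and `mentions_per_month` is my own hypothetical helper, not the way Blogpulse really counts.

```python
from collections import Counter

# Invented sample posts as (month, text) pairs; not real Blogpulse data.
POSTS = [
    ("2008-05", "Planning our summer holiday"),
    ("2008-06", "Book list: 1. A Christmas Carol by Charles Dickens"),
    ("2008-06", "I copied the book list. A Christmas Carol: very readable"),
    ("2008-09", "Christmas is coming, time to plan presents"),
]


def mentions_per_month(posts, word):
    """Count posts per month whose full text contains the word, ignoring context."""
    counts = Counter()
    for month, text in posts:
        if word.lower() in text.lower():
            counts[month] += 1
    return counts


counts = mentions_per_month(POSTS, "Christmas")
for month in sorted(counts):
    print(month, counts[month])
```

The two June posts are about a book list and not about the holiday, but a pure full-text count cannot tell the difference, so June looks "more Christmassy" than September.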

Conclusion
This kind of service is interesting, but it actually doesn't tell us anything about context, which is the most important thing. Blogpulse is not the only service which you can use for such measuring. Google Trends offers a similar service, with one small difference: it measures the popularity of keywords used in the Google search engine.