What you will find here

Tuesday, December 15, 2009

TRENDS 1 - Visual search

Visual search engines

By visual search engines we mean search engines with a visual interface. Image search engines are left out of this overview, but the visual presentation of search results is not.

The visual approach evolved significantly in the 1980s, when the Graphical User Interface (GUI) replaced the traditional textual interface in computer applications. The use of graphical features became the leading approach. Today we may say of web search engines that people use a visual metaphor for their core system interaction - that is, manipulating a mouse to select fields for data entry and submit a query for processing (Newby, 2002).

Cognitive aspects

The idea of visual representation stems from the fact that humans originally acquired all information as symbols or images, that is, before natural language developed and before the amount of information grew so large that it had to be broken down into smaller, separate items. This development led to term-based, or conceptual, thinking, which replaced symbols and images with terms in order to ease processing and extend communication skills. Humans are thus more likely to be familiar with non-visual IR interfaces than with visual ones.

This development could put the visual interface at a disadvantage, or create a need for extensive training (Newby, 2002). Actual retrieval results are still presented as linear text supported by some hyperlinks, which reflects this evolution of the cognitive process.

On the other hand, the use of images and other symbols never disappeared from human thinking and communication (Cejpek, 1998). It relieves short-term memory in low-level cognitive processes, enables faster recall of information already stored in human memory (Loukotová, 2009), and underlies human-computer interaction (Newby, 2002).

Information retrieval

In information retrieval (IR) there has been a long-standing interest in the visualization of documents, collections and retrieval results, as presented in the work of Card et al. (1999).

A visual IR system is based on the idea of an information space, defined as the set of relations among items held by an information system (cf. Ingwersen, 1996). The information space is multidimensional (2D or 3D) and consists of the terms and documents found in the retrieval results, which together create an intuitive landscape (Sabol et al., 2002).

We may think of the structure composed of a collection of documents and their related terms as an information space. The idea is based on vector space modeling, where the document or collection is at the centre (Newby, 2002). The information space lies beyond the representational level of the IR system; however, it may become apparent in different representational approaches:

  • Book House – an extension of the library catalogue. This approach works with items that can be catalogued as traditional documents, and the structure of the catalogue is based on bibliographic data. For web sources, which are not catalogued items, defining such a structure is not that easy; bibliographic data are replaced by metadata.
  • Hyperbolic tree – a tree structure with the focused term at the centre and branches of related terms spreading gradually outwards in hyperbolic space.
  • Visualization of lexical thesaurus data – reflects the structure of the thesaurus. It is not related to the documents; it is based on the hierarchical structure of the thesaurus network.
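
To make the vector space idea behind the information space more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not the method of any system cited above: the toy documents, the plain word counts and the function names are invented for this example. Each document becomes a vector of term counts, and cosine similarity serves as the measure of closeness that a visual interface could turn into distance on the screen.

```python
import math
from collections import Counter


def term_vector(text):
    """Count how often each lower-cased word occurs in the text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy collection, invented for this example.
documents = {
    "doc1": "visual search engine interface",
    "doc2": "visual interface for information retrieval",
    "doc3": "library catalogue of bibliographic data",
}


def rank(query, docs):
    """Return (name, similarity) pairs, most similar document first."""
    q = term_vector(query)
    scores = [(name, cosine(q, term_vector(text))) for name, text in docs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank("visual search", documents)
for name, score in ranking:
    print(name, round(score, 3))
```

A real system would most likely weight the terms (for example by how rare they are) rather than use raw counts; the sketch only shows how documents and queries end up in one shared space.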

Problems

A current question concerning the information space is whether to use 3D rather than 2D. There is no simple recommendation, only a series of situations suited to each approach. Nor is there any study showing that visualization is preferable to text.

A visual structure applied to large data sets may lead to information overload and to results going unnoticed. For that reason there is Shneiderman’s “visualization mantra” (Shneiderman, 1996), which consists of three steps that a visual search engine should support:

1. Overview first

2. Zoom and filter

3. Details on demand

I would suggest adding two further features of a visual search engine:

1) Interactivity – modifying visual presentation of a dataset according to user’s demand

2) Linking – connection to the desired information source/document.
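
The three steps of the mantra and the two additions suggested above can be sketched as a few small functions over a result set. This is a rough illustration only: the `Result` fields, the sample data and the function names are my own assumptions, and a real visual search engine would draw these views rather than return plain data structures.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    topic: str
    url: str
    snippet: str


# Hypothetical result set, invented for this example.
RESULTS = [
    Result("Hyperbolic trees", "visualization", "http://example.org/a",
           "Tree layout in hyperbolic space."),
    Result("Thesaurus maps", "visualization", "http://example.org/b",
           "Drawing a thesaurus network."),
    Result("Boolean queries", "retrieval", "http://example.org/c",
           "AND, OR and NOT operators."),
]


def overview(results):
    """Overview first: the number of results per topic."""
    counts = {}
    for r in results:
        counts[r.topic] = counts.get(r.topic, 0) + 1
    return counts


def zoom_and_filter(results, topic):
    """Zoom and filter: keep only the results for the chosen topic."""
    return [r for r in results if r.topic == topic]


def details(result):
    """Details on demand, with linking: the snippet plus the source URL."""
    return {"title": result.title, "snippet": result.snippet, "link": result.url}


# Interactivity: each call below stands for one user action changing the view.
summary = overview(RESULTS)
selected = zoom_and_filter(RESULTS, "visualization")
detail = details(selected[0])
print(summary, len(selected), detail["link"])
```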

The main problems lie in the data structures (hierarchies, thesauri) that are used as the basis for visual representation. The aforementioned problems of applying such a visual approach to large data sets such as the Web arise mainly from insufficient data structure and data description (indexing) when these are acquired tacitly.

Other IR approaches serve as a background for the visual representation. The three general approaches are Boolean retrieval, probabilistic retrieval and vector retrieval. The probabilistic approach is based on the Bayesian method and is likely to be the leading method for further development, as may be seen in the latent semantic indexing approach, which will be described later.
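
Of the three approaches, Boolean retrieval is the simplest to show in code. The sketch below is a minimal example under my own assumptions (the toy documents and function names are invented): it builds an inverted index that maps each term to the set of documents containing it and answers AND, OR and NOT queries with set operations. It does not cover the probabilistic or the vector approach.

```python
from collections import defaultdict

# Hypothetical toy collection, invented for this example.
docs = {
    1: "visual search engine",
    2: "text search engine",
    3: "visual thesaurus map",
}


def build_index(documents):
    """Map every term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Documents that contain all of the given terms."""
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, *terms):
    """Documents that contain at least one of the given terms."""
    return set().union(*(index.get(t, set()) for t in terms))


def boolean_not(index, term, all_ids):
    """Documents that do not contain the given term."""
    return set(all_ids) - index.get(term, set())


index = build_index(docs)

# "search AND NOT visual"
answer = boolean_and(index, "search") & boolean_not(index, "visual", docs.keys())
print(answer)
```

A Boolean query only splits the collection into matching and non-matching documents, with no ranking, which is one reason why the probabilistic and vector approaches are more useful as a basis for a visual layout.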

Other potential sources beyond the visualized structure might include characteristics of the information seeker, such as standing profiles of information need (Hull, 1999), knowledge of the information seeker’s situation (Schamber et al., 1991), and individual differences among seekers (Borgman, 1989).

Conclusion

At present, web-based visual search engines cannot compete with text-based search engines. The main reason is that development has, from the beginning, favoured the term-based cognitive approach at the higher level of cognition, while visual tools were used for low-level cognition, that is, for basic automatic manipulation of applications. However, the potential hidden in the visual search engine approach is significant, and the realization of a web search engine as a truly visual, interactive and linked network is just a matter of time.

Examples

Search me application – a new generation of visual search engine that combines the tangent and the visual approach. It relies more on the low-level cognitive approach.

Viewzi is similar to the Search me application, but it already offers some structural background. It is heavily designed and offers around 16 patterns of representation, unfortunately at the expense of functionality.

Kartoo is probably the best web-based visual search engine. It offers a structured map of terms, topics and connections between documents.


References

BORGMAN, Christine L. (1989). All Users of Information Retrieval Systems are Not Created Equal: An Exploration into Individual Differences. Information Processing and Management, vol. 25, no.3, pp. 237–251.

CARD, Stuart K., Mackinlay, Jock D., and Shneiderman, Ben. (1999). Readings in Information Visualization: Using Vision to Think. San Francisco: Morgan Kaufmann.

CEJPEK, J. (1998) Informace, komunikace a myšlení. Karolinum, Praha. 178

HULL, David A. (1999). The TREC-7 Filtering Track: Description and Analysis. In Voorhees, Ellen and Harman, Donna (Eds.), Proceedings of the 7th Text REtrieval Conference (TREC-7). Gaithersburg, Maryland: National Institute of Standards and Technology.

INGWERSEN, P. (1996). Cognitive Perspectives in Information Retrieval Interaction: Elements of a Cognitive IR Theory. J. Documentation, vol. 52, no. 1, pp. 3–50.

LOUKOTOVÁ, K. (2009) Úvod do problematiky uživatelského rozhraní. In Červenková, A. & Hořava, M. (Eds.), Uživatelsky přívětivá rozhraní. Horava & Associates, Praha.

NEWBY, G. B. (2002) Empirical Study of a 3D Visualization for Information Retrieval Tasks. Journal of Intelligent Information Systems, vol. 18, pp. 31–53.

SABOL, V. et al. (2002) Applications of a Lightweight, Web-Based Retrieval, Clustering, and Visualization Framework. In D. Karagiannis and U. Reimer (Eds.): PAKM 2002, LNAI 2569, pp. 359–368, 2002.

SHNEIDERMAN, Ben. (1996). The Eyes Have It: User Interfaces for Information Visualization. Technical Report No. CS-TR-3665, Human Computer Interface Laboratory. University of Maryland at College Park. Available at http://www.cs.umd.edu/TRs/groups/HCIL-no-abs.html

SCHAMBER, Linda, Eisenberg, Michael, and Nilan, Michael. (1991). Towards a Dynamic, Situational Definition of Relevance. Information Processing and Management, vol. 26, no. 2, pp. 755–776.

TUTORIAL on Clustering Algorithms (2000) Politecnico di Milano. Available at: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/

ZAÏANE, Osmar R. (1999) Principles of Knowledge Discovery in Databases - Chapter 8: Data Clustering. University of Alberta. Available at: http://www.cs.ualberta.ca/~zaiane/courses/cmput690/slides/Chapter8/index.html

