Modern Information Retrieval
Chapter 10: User Interfaces and Visualization



   
2. TileBars


A more compact form of query term hit display is provided by the TileBars interface. The user enters a query in a faceted format, with one topic per line. After the system retrieves documents (using a quorum or statistical ranking algorithm), a graphical bar is displayed next to the title of each document showing the degree of match for each facet. TileBars thus illustrate at a glance which passages in each article contain which topics and, moreover, how frequently each topic is mentioned (darker squares represent more frequent matches).

Each document is represented by a rectangular bar; the figure below shows an example. The bar is subdivided into rows that correspond to the query facets. In the example, the top row of each TileBar corresponds to 'osteoporosis', the second row to 'prevention', and the third row to 'research'. The bar is also subdivided into columns, where each column refers to a passage within the document. Hits that overlap within the same passage are more likely to indicate a relevant document than hits that are widely dispersed throughout the document [hearst96a]. The patterns are meant to indicate whether terms from a facet occur as a main topic throughout the document, as a subtopic, or are just mentioned in passing.
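The idea that overlapping hits matter more than dispersed ones can be illustrated with a short Python sketch. It assumes the bar is already available as a grid of hit counts with one row per facet and one column per passage; the function name `overlapping_passages` and the grid representation are illustrative choices, not part of the TileBars system itself.

```python
from typing import List


def overlapping_passages(grid: List[List[int]]) -> List[int]:
    """Return the column indices (passages) in which every facet has a hit.

    `grid` is a hypothetical representation of one TileBar:
    grid[row][col] is the hit count of facet `row` in passage `col`.
    """
    if not grid:
        return []
    n_cols = len(grid[0])
    # A passage counts as an overlap only if no facet row is empty there.
    return [col for col in range(n_cols) if all(row[col] > 0 for row in grid)]


# Three facets (rows) over four passages (columns).
example = [
    [0, 2, 1, 0],  # facet 1
    [1, 1, 0, 0],  # facet 2
    [0, 3, 1, 0],  # facet 3
]
print(overlapping_passages(example))
```

In this example only the second passage (index 1) contains hits for all three facets, which corresponds to a column that is shaded in every row of the bar.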

The darkness of each square corresponds to the number of times that facet's query terms occur in that segment of text; the darker the square, the greater the number of hits. White indicates no hits for the facet. Thus, the user can quickly see whether some subset of the terms overlap in the same segment of the document. (The segments for this version of the interface are fixed blocks of 100 tokens each.)
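As a rough illustration of how such a bar could be computed, the following Python sketch counts facet hits in fixed blocks of 100 tokens and maps each count to a gray level. The function names, the exact-match test on lowercased tokens, and the cap of three gray levels are assumptions made for the example; the text does not specify how the original system tokenizes, matches terms, or scales darkness.

```python
from typing import List, Set


def tilebar_grid(tokens: List[str], facets: List[Set[str]],
                 block_size: int = 100) -> List[List[int]]:
    """Count facet hits per fixed-size block of tokens.

    Returns one row per facet and one column per block.
    """
    # Round up so a trailing partial block still gets its own column.
    n_blocks = max(1, (len(tokens) + block_size - 1) // block_size)
    grid = [[0] * n_blocks for _ in facets]
    for position, token in enumerate(tokens):
        word = token.lower()  # assumed matching rule: case-insensitive, exact
        for row, terms in enumerate(facets):
            if word in terms:
                grid[row][position // block_size] += 1
    return grid


def shade(count: int, max_level: int = 3) -> int:
    """Map a hit count to a gray level: 0 is white, max_level is darkest."""
    return min(count, max_level)


# A 151-token toy document: two hits in the first block, one in the second.
tokens = ["osteoporosis"] * 2 + ["x"] * 98 + ["prevention"] + ["x"] * 50
facets = [{"osteoporosis"}, {"prevention"}]
grid = tilebar_grid(tokens, facets)
print(grid)
print([[shade(count) for count in row] for row in grid])
```

Capping the gray level keeps a single very frequent term from washing out the distinction between the other squares; a real display might instead scale by the maximum count in the result set.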

The first document can be seen to have considerable overlap among the topics of interest towards the middle, but not at the beginning or the end (the actual end is cut off). Thus it most likely discusses topics in addition to research into osteoporosis. The second through fourth documents, which are considerably shorter, also have overlap among all terms of interest, and so are also probably of interest to the user. (The titles help to verify this.) The next three documents are all long, and from the TileBars we can tell they discuss research and prevention, but do not even touch on osteoporosis, and so probably are not of interest.


  
Figure: An example of the TileBars retrieval results visualization [hearst95b].

Because the TileBars interface allows the user to specify the query in terms of facets, where the terms for each facet are listed on an entry line, a color can be assigned to each facet. When the user displays a document with query term hits, the user can quickly ascertain what proportion of search topics appear in a passage based only on how many different highlight colors are visible. Most systems that use highlighting use only a single color to bring attention to all of the search terms.
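A minimal Python sketch of per-facet highlighting follows. It assigns one color to each facet and reports which colors would be visible in a given passage. The palette, the function names, and the case-insensitive exact matching are all assumptions for illustration and are not taken from the TileBars implementation.

```python
from typing import Dict, List, Set


def facet_colors(facets: List[Set[str]], palette: List[str]) -> Dict[str, str]:
    """Map every query term to the color of the facet it belongs to.

    If a term appears in more than one facet, the later facet's color wins.
    """
    colors: Dict[str, str] = {}
    for index, terms in enumerate(facets):
        for term in terms:
            colors[term] = palette[index % len(palette)]
    return colors


def colors_in_passage(passage_tokens: List[str],
                      colors: Dict[str, str]) -> Set[str]:
    """Return the set of highlight colors that would be visible in a passage."""
    return {colors[token.lower()] for token in passage_tokens
            if token.lower() in colors}


facets = [{"osteoporosis"}, {"prevention"}, {"research"}]
palette = ["yellow", "green", "pink"]  # hypothetical color choices
colors = facet_colors(facets, palette)

visible = colors_in_passage("Research on osteoporosis".split(), colors)
print(f"{len(visible)} of {len(facets)} facets appear in this passage")
```

Here two of the three colors are visible, so a reader can tell without reading the passage that it touches on two of the three search topics.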

It would be difficult for users to specify in advance which patterns of term hits they are interested in. Instead, TileBars allows users to scan graphic representations and recognize which documents are and are not of interest. TileBars may be most useful for helping users discard misleadingly interesting documents, but only preliminary studies have been conducted to date. Passages can correspond to paragraphs or sections, fixed-size units of arbitrary length, or automatically determined multiparagraph segments [hearst95b].




Modern Information Retrieval © Addison-Wesley-Longman Publishing co.
1999 Ricardo Baeza-Yates, Berthier Ribeiro-Neto