Searching for information in large collections of data, such as the Internet or more specific information repositories, has become an everyday part of life. The Information Retrieval and Visualisation group investigates how to design search interfaces that allow people to search for information in ways that are in line with their natural ideas about information and searching, and how to visually present that information in such a way that people can understand it easily and intuitively.
An important aspect of Information Retrieval is understanding the semantic distinctions that language conveys, and determining the similarity of documents, words and phrases in terms of meaning, not just words. This connects our work on learning syntax (Leibbrandt and Powers, 2010-12) to our work on learning semantic relationships and understanding word similarity (Yang and Powers, 2005-10). For more information on this linguistic aspect of Information Retrieval, see our work on Language and Learning Technology.
Human-Oriented Data Visualisation
This project examines ways to display complex, multi-dimensional data on a two-dimensional computer display in such a way that information users can rapidly understand the data. Our recent experiments have investigated how easily people are able to search for specific features of visual icons such as shape, colour and animation, and the ways in which these dimensions map intuitively onto semantic dimensions such as time, relevance and complexity.
We take a Human Factors approach to evaluation and testing: we construct special-purpose, minimally functional interfaces designed to test the role of a specific variable. Having many different features in an interface, whether innovative or ubiquitous, can confound the effects of the individual features we want to explore by varying them in a controlled way, so we deliberately keep the interfaces very simple. This simplicity is also important for performing a robust analysis of the significance of the relationships we find between interface attributes, task performance and other human factors.
We have also explored the role of redundant and alternate forms of presentation, sequentially and simultaneously, the utility of popups and the impact of transparency, the naturalness or intuitiveness of the assignment of screen attributes to data attributes, the utility of graphical representations versus lists, expanding trees, and concept maps or clouds involving single words or complex terms.
Some of the results are expected and some unexpected, but all give pause for thought as we reevaluate the complexity of cues that we take for granted in the real world. For example, a single item or a related group of items flashing has high salience, but we are very bad at distinguishing items or groups based on different ways or rates of flashing. As another example, transparency is becoming common in user interfaces, but our results so far show only negative effects!
Automatic vs Human Indexing
Categorization and Annotation of Texts
This project investigates how humans differ from automated systems in their ideas on how text documents are related to each other, and aims to bring the automated schemes for classifying, summarizing, annotating and retrieving texts closer in line with human intuition. Results from this project have already shown that people use different keywords to describe (to other people) what a document is about, compared to the terms they would use to search for the document with a search engine, and that the words typically used by search engines to distinguish documents from each other have little relation to the words that people think are relevant to the meaning of the text.
Other interesting observations from the human factors analysis of subject surveys included significant differences between the performance of novice users (junior undergraduates) and experienced users (postgraduates and academics). This makes clear that we learn and adapt our searching technique to the technologies we are using. In addition, increased experience of conventional search makes it more difficult to take full advantage of new visual search paradigms, which complicates the endeavour of improving these interfaces. We also found that small changes in the precise equations used for clustering, dimension reduction, or standardizing words and documents (e.g. TFIDF) could have a high impact on search effectiveness, and that this impact was strongly related to the kind of user.
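To illustrate how sensitive term weighting can be to the exact equation, here is a minimal Python sketch comparing two common textbook TF-IDF variants on a toy corpus. It is not the formulation used in the project: the function names (`tfidf`, `top_term`), the parameters (`tf_mode`, `smooth_idf`) and the four example documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(docs, tf_mode="raw", smooth_idf=False):
    """Return one {term: weight} dict per tokenised document.

    tf_mode="raw" uses the plain term count; tf_mode="log" uses 1 + ln(count).
    smooth_idf=True uses ln((1 + N) / (1 + df)) + 1 instead of ln(N / df).
    These are illustrative textbook variants, not the project's equations.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        weights = {}
        for term, count in Counter(doc).items():
            tf = float(count) if tf_mode == "raw" else 1.0 + math.log(count)
            if smooth_idf:
                idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            else:
                idf = math.log(n_docs / df[term])
            weights[term] = tf * idf
        weighted.append(weights)
    return weighted


def top_term(weights):
    """Highest-weighted term; ties are broken alphabetically."""
    return min(weights, key=lambda t: (-weights[t], t))


# Hypothetical toy corpus: "search" is frequent but common, "icon" is rare.
docs = [
    "search search search search search icon".split(),
    "search visual".split(),
    "search colour".split(),
    "interface colour".split(),
]

raw_weights = tfidf(docs, tf_mode="raw")
log_weights = tfidf(docs, tf_mode="log")

print("raw tf:", top_term(raw_weights[0]))
print("log tf:", top_term(log_weights[0]))
```

Changing only the term-frequency scaling changes which word is treated as most characteristic of the first document: the raw count favours the frequent word, while the logarithmic count favours the rare one. Differences of this kind feed directly into document ranking and clustering.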