16 January 2006 Mapping texts through dimensionality reduction and visualization techniques for interactive exploration of document collections
Alneu de Andrade Lopes, Rosane Minghim, Vinícius Melo, Fernando Vieira Paulovich
Proceedings Volume 6060, Visualization and Data Analysis 2006; 60600T (2006)
Event: Electronic Imaging 2006, San Jose, California, United States
The sheer volume of available information often impairs the tasks of searching, browsing, and analyzing material pertinent to a topic of interest. This paper presents a methodology for creating a meaningful graphical representation of document corpora, aimed at supporting the exploration of correlated documents. The purpose of the approach is to produce a map of a body of documents on a research topic or field, based on an analysis of their contents and of the similarities among articles. After text pre-processing, the document map is generated by projecting the data onto two dimensions using Latent Semantic Indexing. The projection is followed by hierarchical clustering to support sub-area identification. The map can be explored interactively, helping to narrow down the search for relevant articles. Tests were performed on a collection of documents pre-classified into three research subject classes: Case-Based Reasoning, Information Retrieval, and Inductive Logic Programming. The map produced was able to separate the main areas, place documents close together according to their similarity, reveal possible topics, and identify boundaries between them. The tool supports the exploration of inter-topic and intra-topic relationships and is useful in many contexts that require deciding which articles to read, such as scientific research, education, and training.
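The pipeline described in the abstract (pre-processing, a two-dimensional Latent Semantic Indexing projection, then hierarchical clustering) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the pre-processing, term weighting, or linkage criterion, so the stopword list, raw term counts, power-iteration SVD, average linkage, and the six-sentence toy corpus are all assumptions, and every function name is hypothetical.

```python
import math
import re
from collections import Counter

# Assumed stopword list; the paper's actual pre-processing is not specified.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "to", "from", "with", "by", "on"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def count_matrix(docs):
    """Document-term matrix of raw counts (rows = documents, columns = terms)."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    return [[float(c[t]) for t in vocab] for c in counts]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def top_eigenpairs(G, k, iters=200):
    """Top-k eigenpairs of a symmetric PSD matrix via power iteration with deflation."""
    n = len(G)
    G = [row[:] for row in G]  # work on a copy
    pairs = []
    for _ in range(k):
        v = [1.0 + i for i in range(n)]  # fixed start vector, so runs are repeatable
        for _step in range(iters):
            w = matvec(G, v)
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, matvec(G, v)))  # Rayleigh quotient
        pairs.append((lam, v))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                G[i][j] -= lam * v[i] * v[j]
    return pairs

def lsi_project(A, dims=2):
    """LSI document coordinates U_k * Sigma_k, from the eigenpairs of A A^T."""
    G = [[sum(a * b for a, b in zip(r1, r2)) for r2 in A] for r1 in A]
    pairs = top_eigenpairs(G, dims)
    return [[math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
            for i in range(len(A))]

def agglomerative(points, k):
    """Average-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        candidates = ((i, j) for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        i, j = min(candidates, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i].extend(merged)
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels

# Hypothetical toy corpus: two sentences per subject class named in the abstract.
docs = [
    "Case-based reasoning retrieves past cases",
    "Case-based reasoning adapts past cases to new problems",
    "Information retrieval ranks documents for a query",
    "Information retrieval expands a query with documents",
    "Inductive logic programming",
    "Inductive logic programming learns rules",
]
coords = lsi_project(count_matrix(docs), dims=2)
labels = agglomerative(coords, k=3)
for doc, (x, y), label in zip(docs, coords, labels):
    print(f"cluster {label}  ({x:6.3f}, {y:6.3f})  {doc}")
```

In this toy corpus the three topics share no vocabulary, so two topics align with the two map axes and the third falls near the origin. Real collections overlap far more, and a production system would more likely rely on a weighted term matrix and a library SVD routine than on the hand-rolled power iteration used here to keep the sketch dependency-free.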
© (2006) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Alneu de Andrade Lopes, Rosane Minghim, Vinícius Melo, and Fernando Vieira Paulovich "Mapping texts through dimensionality reduction and visualization techniques for interactive exploration of document collections", Proc. SPIE 6060, Visualization and Data Analysis 2006, 60600T (16 January 2006)
Cited by 8 scholarly publications.


Associative arrays

Computer programming

Analytical research

Dimension reduction

Visual analytics


