Data Mining Project
November 28, 2001

UIC Scientists Launch Terra Wide Data Mining Project

CHICAGO, IL -- The wordplay in the title of Robert Grossman's Terra Wide Data Mining Testbed project is the first hint at the scale and scope of the datasets he manages. Tera is the metric prefix meaning one trillion; terra is Latin for the earth. Grossman's Terra Wide project, launched this month at the SC conference in Denver, Colorado, is aimed at remotely exploring globally held terabyte datasets in real time.

Grossman and his University of Illinois at Chicago (UIC) colleague Jason Leigh accessed, correlated and then visualized data drawn from a variety of datasets, including earth science data from the National Center for Atmospheric Research (NCAR), El Niño data from the National Oceanic and Atmospheric Administration (NOAA) and cholera data from the World Health Organization (WHO).

The underlying aim of the technology behind the testbed is to give scientists a means to mine and correlate datasets from different organizations in order to make new discoveries. "Researchers may be able to find a correlation between global weather patterns and the spread of diseases by correlating data from NCAR and the WHO," said Grossman.

The demonstration also showcased PC-based clusters called TeraNodes, now gradually being deployed throughout the world, which will be dedicated to massive computation, data mining or visualization over national and international high-performance networks. In coming years, as optical technology transforms networking capabilities, TeraNodes will become the building blocks for an optically connected web of data. The SC testbed correlated and visualized WHO and NCAR data replicated onto the testbed.
There are TeraNodes in Chicago (at UIC), Amsterdam (at SARA, Holland's supercomputer center), Halifax (Dalhousie University), Denver (the SC show floor), London (Imperial College of Science, Technology and Medicine), Virginia (Virginia Tech and ACCESS DC), Michigan (Internet2), California (UC Davis) and Pennsylvania (University of Pennsylvania).

Given the large and growing scientific and engineering data resources available on the web, there is a growing need for an easy-to-use data web infrastructure. DataSpace, an open-standards-based system for working with data over the web, is Grossman's attempt to provide such an infrastructure. "DataSpace provides a new way for scientists and engineers to work with each others' data," said Grossman. "If organizations publish their data in the DataSpace format, many others could potentially make use of it."

The Terra Wide Data Mining Testbed is an infrastructure built on top of DataSpace for remote analysis, distributed data mining, and real-time exploration of scientific, engineering, defense, business, and other complex data. Tera mining applications are designed to exploit the capabilities provided by emerging domestic and international optical networks so that gigabyte and terabyte datasets can be remotely explored in real time.

Leigh, a scientific visualization expert from UIC's Electronic Visualization Laboratory, and Grossman, head of UIC's National Center for Data Mining, are collaborating to develop such tera mining applications. Their partnership is a natural extension of their research interests. Both work with data-intensive, very-high-bandwidth applications that test even the most advanced networks. Both need to cull specific data from massive datasets stored in widely distributed facilities. Both are seeking a means for researchers to accelerate scientific discovery.

The optical Terra Wide Testbed is now being built in parallel with another UIC-managed project, StarLight.
StarLight is an advanced optical infrastructure and proving ground for network services optimized for high-performance applications, with major funding provided by the National Science Foundation. It is being developed by UIC's Electronic Visualization Laboratory, the International Center for Advanced Internet Research (iCAIR) at Northwestern University, and the Mathematics and Computer Science Division at Argonne National Laboratory, in partnership with Canada's CANARIE and Holland's SURFnet.