Computing Reviews
Scalable access to scientific data
Menasce D. IEEE Internet Computing 9(3): 94-96, 2005. Type: Article
Date Reviewed: Jan 25 2006

The quantity of data stored online quadruples every 18 months. In order to transform this data into information and knowledge, it is essential to develop effective data analysis and extraction capabilities. This clearly written paper presents (using a textual representation on page 94 that is substantially better and semantically richer than the pictorial one in Figure 1) a typical environment in which data is collected, stored, and validated at “ground stations” that send this data to “data product generation centers which generate higher level data products.” The generation cycle of these centers consists of a data product generation phase and a quiet period, during which new data accumulates. The paper does not refer to the semantics of data product generation, or to the semantics of relationships between data, despite the clear need for data collections and data products to be structured, since end users request data products of interest.
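To make the described workflow concrete, the following sketch models ground stations feeding a generation center whose cycle alternates between a quiet period (data accumulates) and a generation phase (higher-level products are produced). It is a schematic illustration only; the class names, methods, and the fixed number of cycles are assumptions of this sketch, not details from the paper.

from dataclasses import dataclass, field


@dataclass
class GroundStation:
    name: str

    def collect(self) -> list[str]:
        # Stand-in for collection, storage, and validation of raw observations.
        return [f"{self.name}-observation"]


@dataclass
class GenerationCenter:
    pending: list[str] = field(default_factory=list)
    products: list[str] = field(default_factory=list)

    def quiet_period(self, stations: list[GroundStation]) -> None:
        # New data accumulates while no products are being generated.
        for station in stations:
            self.pending.extend(station.collect())

    def generation_phase(self) -> None:
        # Turn accumulated raw data into higher-level data products.
        self.products.extend(f"product({item})" for item in self.pending)
        self.pending.clear()


if __name__ == "__main__":
    center = GenerationCenter()
    stations = [GroundStation("station-A"), GroundStation("station-B")]
    for _ in range(3):  # three generation cycles, purely illustrative
        center.quiet_period(stations)
        center.generation_phase()
    print(len(center.products), "data products available to end users")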

A very simple analytical model (based on processor and input/output (I/O) subsystem utilization) is presented to answer such questions as: What is the maximum data product retrieval rate? What is the average data product retrieval time for a given data product retrieval rate? What is the impact of quiet period duration on the data product retrieval time? Some numerical results obtained with this model appear to agree with qualitative considerations. The model and its results are straightforward, but they appear to address only some, and not the most interesting (or important), aspects of scalability: first, data selection of any kind ought to be based on semantic considerations, which the paper does not address; second, concepts such as parallelism and indexing are not mentioned at all.
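For readers unfamiliar with this style of modeling, the sketch below shows what a generic utilization-based capacity model looks like. It is not the model from the paper: it assumes an open system with two resources (processor and I/O subsystem) and illustrative per-request service demands, and it answers the first two questions above using the utilization law and an M/M/1-style residence-time formula. The third question (the effect of quiet period duration) would require adding the alternating generation/quiet cycle to the arrival process, which this sketch deliberately omits.

def max_retrieval_rate(d_cpu: float, d_io: float) -> float:
    """Upper bound on the data product retrieval rate (requests/second).

    By the utilization law, U_k = X * D_k must stay below 1 at every
    resource, so throughput X is capped by the bottleneck resource:
    X_max = 1 / max(D_cpu, D_io).
    """
    return 1.0 / max(d_cpu, d_io)


def avg_retrieval_time(rate: float, d_cpu: float, d_io: float) -> float:
    """Average retrieval time (seconds) at arrival rate `rate` (requests/second).

    Treats each resource as an independent M/M/1 queue, so the residence
    time at resource k is D_k / (1 - U_k) and the total is their sum.
    """
    total = 0.0
    for demand in (d_cpu, d_io):
        utilization = rate * demand
        if utilization >= 1.0:
            raise ValueError("arrival rate saturates a resource")
        total += demand / (1.0 - utilization)
    return total


if __name__ == "__main__":
    # Purely illustrative service demands: 50 ms of CPU and 200 ms of I/O per request.
    d_cpu, d_io = 0.05, 0.20
    print("max rate (req/s):", max_retrieval_rate(d_cpu, d_io))          # 5.0
    print("avg time at 3 req/s:", avg_retrieval_time(3.0, d_cpu, d_io))  # ~0.56 s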

Finally, the author emphasizes the need to provide semantically rich ontologies associated with metadata in order to solve the problem of semantic information integration across different scientific domains, a problem that arises, in particular, from differing terminologies. This important problem, and ways to solve it, have been well known for decades; the paper provides no references.

In summary, the paper may be a good illustration of Hayek’s observation that, in the use of statistics, we “either deliberately ignore or are ignorant of the relations between the individual elements with different attributes,” while, in dealing with complexity, “it is precisely the[se] relations that matter” [1].

Reviewer:  H. I. Kilov Review #: CR132359 (0609-0959)
1) Hayek, F. A. The critical approach to science and technology (In honor of Karl R. Popper). The Free Press of Glencoe, 1964.
Categories:
Abstracting Methods (H.3.1)
Data Manipulation Languages (DML) (H.2.3)
Data Models (H.2.1)
Web-Based Services (H.3.5)
Information Search And Retrieval (H.3.3)
Languages (H.2.3)
Other reviews under "Abstracting Methods":

Hahn U., Reimer U. Type: Article (reviewed Aug 1 1985)

The synthesis of specialty narratives from co-citation clusters
Small H. Journal of the American Society for Information Science 37(3): 97-110, 1986. Type: Article (reviewed Apr 1 1987)

The possible effect of abstracting guidelines on retrieval performance of free-text searching
Fidel R. (ed) Information Processing and Management: an International Journal 22(4): 309-316, 1986. Type: Article (reviewed Mar 1 1987)
