Computing Reviews
Soft error benchmarking of L2 caches with PARMA
Suh J., Manoochehri M., Annavaram M., Dubois M. SIGMETRICS 2011 (Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, San Jose, CA, Jun 7-11, 2011), 85-96, 2011. Type: Proceedings
Date Reviewed: Jan 15 2013

The authors state in their introduction that the precise analytical reliability model for architecture (PARMA) “uses a rigorous fault generation model to address temporal multi-bit errors (MBEs), starting with the probability of SEUs [single event upsets] on a single bit and then expounding the probabilities in both temporal and spatial dimensions.”

They go on to claim,

“Using the insights gained from the comparison of the PARMA model with prior approximate analytical models, we have introduced a new approximate analytical model based on a refined AVF [architectural vulnerability factor] methodology to estimate the DUE FIT rate of SECDED [single error correcting double error detecting] protected caches.”

The introduction explains clearly why the topic is important; the authors cite relevant related papers and state their contributions (p. 86). Topics include goals of performance analysis and measurement, performance metrics, means, modes, measurement tools and techniques, perturbations due to measurement, and the design of experiments and simulation. The paper introduces an original and unified framework for measuring the reliability of static random-access memory (SRAM) arrays protected by any possible error protection scheme, and a new and highly accurate approximate analytical model for measuring the FIT rate of caches protected by word-level SECDED codes. The conclusions follow directly from the body of the paper. They are well structured and introduce no new material.

The technical approach is very courageous. However, to make the model more applicable to real-world applications, the authors should perhaps try in future work to remove some limitations and sources of confusion. First, the probability distribution of SEUs is assumed to be the same after every cycle. Second, and likely to bewilder the student, the model assumes that there is no correlation between SEUs affecting any two cache bits. The results also show that PARMA simulation is about 25 times slower than the basic sim-outorder simulation for 100 million SimPoint simulations. Those interested in this area of research should also study three other papers [1,2,3].
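To make the two assumptions noted above concrete, the following is a minimal sketch of what they imply for a single SECDED-protected word. It is not the PARMA model, which also accounts for when a faulty word is actually read; it only illustrates a constant per-cycle SEU probability and independence between bits. The function names, the 72-bit word size (64 data bits plus 8 check bits), and the numeric values are assumptions chosen for illustration, not figures from the paper.

```python
from math import comb, expm1, log1p


def p_bit_struck(p_cycle: float, cycles: int) -> float:
    """Probability that one bit is struck at least once in `cycles` cycles.

    Assumes the same per-cycle SEU probability `p_cycle` in every cycle.
    Computed as 1 - (1 - p_cycle)**cycles in a numerically stable form.
    """
    return -expm1(cycles * log1p(-p_cycle))


def p_k_bits_struck(n_bits: int, k: int, q: float) -> float:
    """Probability that exactly k of n_bits bits are struck.

    Assumes SEUs on different bits are independent, so the count of
    struck bits is binomial with per-bit probability q.
    """
    return comb(n_bits, k) * q**k * (1.0 - q) ** (n_bits - k)


# Hypothetical values, for illustration only.
WORD_BITS = 72          # assumed 64 data bits + 8 SECDED check bits
P_CYCLE = 1e-25         # assumed per-bit, per-cycle SEU probability
CYCLES = 10**9          # assumed residency time of the word, in cycles

q = p_bit_struck(P_CYCLE, CYCLES)
p_corrected = p_k_bits_struck(WORD_BITS, 1, q)   # single-bit: corrected
p_detected = p_k_bits_struck(WORD_BITS, 2, q)    # double-bit: detected only
print(f"per-bit strike probability: {q:.3e}")
print(f"P(exactly 1 bit struck):    {p_corrected:.3e}")
print(f"P(exactly 2 bits struck):   {p_detected:.3e}")
```

Under these assumptions a double-bit event in a word is far rarer than a single-bit event, which is why temporal accumulation of upsets matters for the DUE rate. If upsets on neighboring bits were correlated, the binomial step would no longer hold.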

Reviewer:  Florin Popentiu-Vladicescu Review #: CR140832 (1305-0393)
1) Etsion, Y.; Feitelson, D. G. Probabilistic prediction of temporal locality. IEEE Computer Architecture Letters 6, 1(2007), 17–20.
2) Cheng, Y.; Ma, A.; Zhang, M. Accurate and simplified prediction of L2 cache vulnerability for cost-efficient soft error protection. IEICE Transactions on Information and Systems E95-D, 1(2012), 56–66.
3) Sun, H.; Zheng, N.; Zhang, T. Leveraging access locality for the efficient use of multibit error-correcting codes in L2 cache. IEEE Transactions on Computers 58, 10(2009), 1297–1306.
 
Reliability, Availability, And Serviceability (C.4 ... )
 
 
Design Studies (C.4 ... )
 
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®