Computing Reviews
Canary Analysis Service
Davidovič Š., Beyer B. Queue 16(1): 35-57, 2018. Type: Article
Date Reviewed: Mar 5 2020

Canary analysis is based on the age-old principle of bringing canaries into coal mines to detect possible air quality issues. Because canaries are more sensitive to pollution than humans, they serve as an early warning sign of possible danger.

Likewise, canary analysis aims to detect the possible negative impacts of changes in information technology (IT) services, for example, during software development, and can alert users to the unforeseen consequences of such changes. The principle is relatively new yet already deployed by leading companies such as Google and Netflix.

Davidovič is a site reliability engineer at Google who developed the canary analysis methodology. Even Wikipedia does not yet have an entry on this topic, making this article a great source for those who want to learn more about canary analysis and how to practically apply it within their own organizations.

Canary Analysis Service (CAS) is a suite of application programming interfaces (APIs) that Google has made freely available through Spinnaker, an open-source continuous delivery platform. In its simplest form, CAS relies on two central APIs: Evaluate and GetResult. The former starts an evaluation of whether the development, or rollout, process is proceeding in a controlled manner, while the latter is polled to retrieve the outcome of that evaluation.

This requires a well-defined baseline against which deviations can be measured. The baseline is especially important because the Evaluate API results in a simple PASS/FAIL verdict. A comprehensive understanding of the underlying process helps explain why a FAIL message shows up and makes it easier to troubleshoot its root cause.
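
The two-call pattern and the PASS/FAIL verdict against a baseline can be illustrated with a small sketch. This is not Google's CAS API: the class `ToyCanaryService`, its method signatures, the `tolerance` parameter, and the mean-based comparison rule are all assumptions made up for illustration, and the latency figures are invented sample data.

```python
import itertools
import statistics
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    FAIL = "FAIL"


class ToyCanaryService:
    """Hypothetical in-memory stand-in for a canary analysis service."""

    def __init__(self, tolerance=0.10):
        # Assumed rule: the canary mean may exceed the baseline mean
        # by at most `tolerance` (as a relative fraction).
        self.tolerance = tolerance
        self._results = {}
        self._ids = itertools.count(1)

    def evaluate(self, baseline, canary):
        """Compare canary samples with baseline samples; return an evaluation id."""
        evaluation_id = next(self._ids)
        limit = statistics.mean(baseline) * (1 + self.tolerance)
        within_limit = statistics.mean(canary) <= limit
        self._results[evaluation_id] = Verdict.PASS if within_limit else Verdict.FAIL
        return evaluation_id

    def get_result(self, evaluation_id):
        """Return the stored verdict, or None if the id is unknown."""
        return self._results.get(evaluation_id)


service = ToyCanaryService(tolerance=0.10)
baseline_latency_ms = [100, 102, 98, 101]  # invented sample data

ok_id = service.evaluate(baseline_latency_ms, [103, 99, 104, 100])
bad_id = service.evaluate(baseline_latency_ms, [150, 160, 155, 149])

print(service.get_result(ok_id).value)
print(service.get_result(bad_id).value)
```

In this toy version the verdict is computed immediately, so a single `get_result` call suffices; a real service would presumably take time to gather data, which is why a caller would poll for the result. The sketch also shows why the baseline matters: the verdict is only as trustworthy as the baseline samples it is compared against.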

The paper clearly describes the process and includes practical advice on how to avoid possible pitfalls, such as incorrect historical data that compromises the baseline, user mistrust of the alerts, or scale limitations on input values that define whether a result is PASS or FAIL.

The CAS suite can play a helpful role when integrated with a service-level objective (SLO) system to obtain immediate insight into whether such objectives are being met. Still, many challenges remain, most of them concerning the development of metrics and evaluation criteria for deciding whether a deviation from the norm warrants a warning or should be considered within acceptable parameters. Once such conditions are met, the authors write, CAS has a valuable role to play in change management processes during software development.

Reviewer:  Riemer Brouwer Review #: CR146919 (2009-0228)
Office Automation (H.4.1)
Data Communications (C.2.0 ...)
Electronic Commerce (K.4.4)
Management Of Computing And Information Systems (K.6)
 
Other reviews under "Office Automation": Date
Office automation
Doswell A., John Wiley & Sons, Inc., New York, NY, 1990. Type: Book (9780471925538)
May 1 1992
Automated office systems
Gibson H., Rademacher R., Holt, Rinehart & Winston, Austin, TX, 1987. Type: Book (9789780030716393)
Sep 1 1987
Executive information systems
Thierauf R., Quorum Books, Westport, CT, 1991. Type: Book (9780899305981)
Oct 1 1992

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®