Cloud computing is increasingly used in information technology to deliver services that are always available. Although this architecture offers users many benefits, concerns remain about privacy, reliability, and compliance.
This book, as the title suggests, attempts to address the reliability and availability issues of this emerging domain. It is targeted at system architects, developers, and engineers on the one hand, and product and quality management professionals on the other. The book is very easy to read and does not require advanced scientific or technical knowledge to grasp its contents. It is clearly focused on a hot topic, with public clouds now reaching into common households.
It is organized into three main parts: “Basics,” “Analysis,” and “Recommendations.” The first part, spanning three chapters, reviews all you need to know about the definitions and terms of cloud computing, virtualization technology, and service reliability and availability. If you’re already knowledgeable in these areas, it still serves as a good refresher. In particular, you will learn about the now-famous software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS) models; the differences between full virtualization, paravirtualization, and operating system (OS) virtualization; and the metrics for measuring service availability, such as the five nines and mean time between failures (MTBF), and service reliability, such as defects per million.
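To make these metrics concrete, here is a minimal sketch of the arithmetic behind them. It is not taken from the book: the function names are hypothetical, the mean time to repair (MTTR) input is an assumption introduced for illustration, and the MTBF formula is the common steady-state approximation that treats MTBF as the mean uptime between failures.

```python
# Illustrative only: names and the MTTR parameter are assumptions,
# not definitions quoted from the book under review.

MINUTES_PER_YEAR = 365.25 * 24 * 60  # average year, including leap days


def downtime_minutes_per_year(availability: float) -> float:
    """Expected yearly downtime for an availability given as a fraction (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def availability_from_mtbf(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability, treating MTBF as mean uptime between failures."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def defects_per_million(failed: int, attempted: int) -> float:
    """Failed operations normalized to one million attempts."""
    return failed * 1_000_000 / attempted


if __name__ == "__main__":
    # "Five nines" means 99.999 percent availability.
    print(round(downtime_minutes_per_year(0.99999), 2))
    print(availability_from_mtbf(mtbf_hours=999.0, mttr_hours=1.0))
    print(defects_per_million(failed=25, attempted=1_000_000))
```

Under these assumptions, five nines leaves a budget of a little over five minutes of downtime per year, which is why the term signals such a demanding target.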
With all this information in mind, you will be able to fully understand the second part of the book, which analyzes cloud computing from the viewpoint of reliability and availability. This part spans six chapters and covers all the issues at stake. It starts with the various risks a cloud is exposed to, including those tied to service models, outages, hardware and software errors and failures, and deployment models. The authors detail the reliability analysis of virtualized hardware, and then discuss capacity and elasticity and their related risks, including denial of service attacks. Service orchestration, requirements, geographic redundancy, and disaster recovery are explained in the last chapters.
With some understanding of what can go wrong and the means to measure the state of a cloud, you’re now ready to apply the recommendations given in the last part of the book. In four chapters, the authors provide insights into architecting reliable systems and properly designing cloud solutions.
Despite its accessibility, this book is focused on a very specific theme. Therefore, it will probably only be of real interest to those who are directly involved in implementing or improving their own systems on a cloud platform.