This paper considers the process of characterizing workloads in different kinds of computing services.
It starts with a brief discussion of what a workload is and how characterizing workloads leads to better models and the potential to generate artificial workloads for performance testing and simulation.
The second section covers the methodology involved, including measurement, analysis, and, to a lesser extent, building the mathematical models that might be used. It briefly mentions that some of these workloads depend on underlying graphs (such as social networks, linked pages in something like Wikipedia, or related products on a shopping site). This treatment feels a bit short, since such graphs are probably a fairly fundamental underlying structure; however, the topic comes up again in several of the other sections.
This is followed by several sections discussing different types of systems and the workloads typical of them, as well as the statistics that are useful in each category.
- “Web Workloads” covers conventional web workloads, including shopping sites, the impact of robots, and the usual kinds of web content.
- “Online Social Network Workloads” is primarily about services like Facebook, Twitter, and similar sites. The underlying social graph is a major component in understanding such workloads and has been extensively studied. Location-based services such as Instagram are also considered here; not only are the user-defined social graphs important, but graphs derived from temporally changing proximity relations are as well.
- “Video Service Workloads” notes that video services are characterized not only by patterns of visits to pages (with underlying content graphs and social networks involved), but also by the distinct size and delivery characteristics of video. There is also the phenomenon of a video “going viral,” as well as less dramatic trends in popularity for individual videos.
- “Mobile Device Workloads” covers things like app stores, but also focuses on bandwidth considerations and user-centric measurements.
- “Cloud Workloads” covers an emerging area that attracts a great deal of interest. Specific subtopics are cloud data centers, infrastructure, and storage services.
This is an interesting topic, and an essential one for many companies providing services to both general users and businesses. Understanding what your workload is and how to model it is the first step in providing good, reliable, predictable service. This survey covers the basics well; while many readers will find that it does not discuss their particular workloads, it provides a solid basis for where to start in measuring and modeling their services.
As befits a good survey paper, there are many references.