This paper attempts to address the performance of operating system workloads on simultaneous multithreaded (SMT) architectures. The authors used an eight-thread SMT simulator for their analysis (the hardware parameters are provided in detail in the paper). They mention two operating systems (OSs), an extension to Alpha Unix (v 4.0d) and SimOS-Alpha; for the experiments, however, I think they used SimOS-Alpha. For comparison purposes, user benchmarks (SpecInt 95) and an OS-intensive workload (an Apache server, with SpecWeb 96 as the client) were run on both SMT and Superscalar. Note that an eight-thread SMT needs eight program counters (PCs) and more registers than a Superscalar.
There should be two goals when conducting this kind of research: the first is to shed light on the challenges faced when developing an OS for SMT, and the second is to show that SMT provides better performance when OS workloads are taken into account. This paper addresses the latter. Toward the end, however, the authors point out that branch prediction performance on SMT with an OS workload is very poor. To me, this is the best part of the paper, and I consider it the paper’s major contribution.
I am not convinced by the authors’ claim that the strength of SMT is better represented by OS workloads. SpecInt 95 represents SMT just as well. In fact, a comparison of the data in the paper itself, for Spec only and Spec plus OS on both SMT and Superscalar, further supports my judgment. Table 4 (p. 250), which shows the most valuable data in the paper, gives various metrics for the different machines under the different workloads. Since the overall performance is not given, I considered the four metrics that matter most in calculating it: instructions per cycle (IPC), average number of fetchable contexts, branch misprediction rate, and cache miss rate. To measure overall performance, I calculated the arithmetic mean of these metrics, using the inverses of the branch misprediction and cache miss rates so that a higher value is better for every metric. It is interesting to note that SMT is 2.3 times faster than Superscalar in both cases (Spec only, and Spec plus OS).
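The calculation described above can be sketched roughly as follows. This is only an illustration of the arithmetic, not the exact procedure used for the 2.3 figure: the names `Metrics`, `composite_score`, and `relative_performance` are hypothetical, the rates are assumed to be expressed as fractions, and the numbers at the bottom are placeholders rather than values from table 4.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Metrics:
    """One machine/workload combination (hypothetical container)."""
    ipc: float                     # instructions per cycle
    fetchable_contexts: float      # average number of fetchable contexts
    branch_mispredict_rate: float  # assumed to be a fraction in (0, 1]
    cache_miss_rate: float         # assumed to be a fraction in (0, 1]


def composite_score(m: Metrics) -> float:
    """Arithmetic mean of the four metrics, with both miss rates inverted
    so that a larger value is better for every term."""
    return mean([
        m.ipc,
        m.fetchable_contexts,
        1.0 / m.branch_mispredict_rate,
        1.0 / m.cache_miss_rate,
    ])


def relative_performance(smt: Metrics, superscalar: Metrics) -> float:
    """Ratio of the two composite scores (SMT over Superscalar)."""
    return composite_score(smt) / composite_score(superscalar)


# Placeholder numbers for illustration only; they are NOT taken from table 4.
example_smt = Metrics(ipc=4.0, fetchable_contexts=6.0,
                      branch_mispredict_rate=0.10, cache_miss_rate=0.05)
example_superscalar = Metrics(ipc=2.0, fetchable_contexts=1.0,
                              branch_mispredict_rate=0.10, cache_miss_rate=0.05)
ratio = relative_performance(example_smt, example_superscalar)
```

Because the four quantities are on different scales, the inverted rates can dominate an unweighted mean, so a composite of this kind is best read as a rough indicator.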
Overall, SMT is a great idea. Since its birth in 1995, a lot of work has been done in exploring the idea. In fact, new-generation architectures are based on SMT, including Intel’s Xeon, IBM’s Genie, and, of course, Alpha 21464 (if it is ever released). Research work, I believe, should be focused on the OS issues of SMT, for example, how to improve the performance of branch prediction, which seems to be a problem for OS workloads on SMT.