Because of the size of the data sets involved (over 128 MB), a data-parallel approach is often preferred in parallel volume rendering. With this approach, the data are partitioned and distributed over the nodes, and tasks are assigned to the nodes that store the relevant data. This paper shows that the alternative, a demand-driven approach that is often used for ray tracing smaller data sets, also works well for these larger sets. In this approach, the data are still partitioned and distributed, but tasks are assigned on request to nodes that are idle. If a node needs data it does not hold, it requests them from neighboring nodes.
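The demand-driven idea can be illustrated with a small simulation. This is only a sketch under assumptions that are not taken from the paper: the function name, the task costs, and the rule that block b lives on node b % num_nodes are all hypothetical. Each task goes to whichever node becomes idle first, and a task whose block is not local counts as one remote request.

```python
import heapq

def demand_driven_schedule(task_costs, task_blocks, num_nodes):
    """Simulate demand-driven assignment of rendering tasks.

    task_costs[i]  -- hypothetical rendering cost of task i
    task_blocks[i] -- index of the data block that task i needs
    Assumption: block b is stored on node b % num_nodes.
    Returns (finish time per node, number of remote block requests).
    """
    # Min-heap of (time at which the node becomes idle, node id).
    idle = [(0.0, node) for node in range(num_nodes)]
    heapq.heapify(idle)
    finish = [0.0] * num_nodes
    remote_requests = 0
    for cost, block in zip(task_costs, task_blocks):
        # The node that is idle earliest requests the next task.
        t, node = heapq.heappop(idle)
        if block % num_nodes != node:
            # The block is not local, so it must be requested from another node.
            remote_requests += 1
        finish[node] = t + cost
        heapq.heappush(idle, (finish[node], node))
    return finish, remote_requests

if __name__ == "__main__":
    finish, remote = demand_driven_schedule([4, 1, 1, 1, 1], [0, 1, 2, 3, 4], 2)
    print(finish, remote)
```

In this toy run, one expensive task occupies the first node while the second node picks up all the cheap ones, so both nodes finish at the same time. The price is that some tasks run on a node that does not own their data, which is the trade-off the demand-driven approach accepts in exchange for load balance.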
The paper describes an implementation of this approach on a 128-node distributed-memory, message-passing parallel MIMD computer. The data are replicated over groups of nodes, called neighborhoods, and the nodes in a neighborhood share one copy of the data. In addition to its persistent data items, each node has an LRU cache that temporarily stores data items from other nodes. Experiments show that the size of the cache is an important factor. For reasonably large cache sizes (greater than 4 MB), the system performs well. For smaller cache sizes, the system starts to thrash. The authors estimate that the largest data set that can be rendered practically is approximately 512 MB, requiring one-fourth of the cumulative memory. Load imbalances are low, and the algorithm scales well as the number of processors is increased, with communication overhead typically less than 20 percent of the total rendering time.
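The sensitivity to cache size can be seen in a minimal LRU cache sketch. The class name `BlockCache`, the `fetch_remote` callback, and the block counts are hypothetical and do not come from the paper; capacity is counted in blocks rather than megabytes. When the cache holds the whole working set, repeated passes hit the cache; when it is even one block too small, a cyclic access pattern misses every time, which is the thrashing behavior described above.

```python
from collections import OrderedDict

class BlockCache:
    """LRU cache for data blocks fetched from other nodes (illustrative only)."""

    def __init__(self, capacity, fetch_remote):
        self.capacity = capacity          # maximum number of cached blocks
        self.fetch_remote = fetch_remote  # hypothetical callback: block id -> data
        self.blocks = OrderedDict()       # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, block_id):
        if block_id in self.blocks:
            # Cache hit: mark the block as most recently used.
            self.blocks.move_to_end(block_id)
            self.hits += 1
            return self.blocks[block_id]
        # Cache miss: request the block from another node.
        self.misses += 1
        data = self.fetch_remote(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            # Evict the least recently used block.
            self.blocks.popitem(last=False)
        return data

def cyclic_passes(capacity, num_blocks=4, passes=3):
    """Access blocks 0..num_blocks-1 in order, several times over."""
    cache = BlockCache(capacity, fetch_remote=lambda block_id: block_id)
    for _ in range(passes):
        for block_id in range(num_blocks):
            cache.get(block_id)
    return cache.hits, cache.misses

if __name__ == "__main__":
    print(cyclic_passes(4))  # cache holds the whole working set
    print(cyclic_passes(3))  # cache is one block too small
```

With four blocks and a capacity of four, only the first pass misses. With a capacity of three, every access evicts the block that will be needed soonest, so no access ever hits.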