The authors of this paper present a simple, elegant top-level business model of visualization using a combination of basic concepts from semiotics and category theory. The model may be used as a framework for understanding visualization and reasoning about it. Category theory (often together with semiotics) has been used for business models of other, much more complex and perhaps more interesting, domains [1-3], and the authors properly observe that “the category-theoretic perspective suggests a set of questions that could be asked [...], and a conceptual framework for analyzing [a domain].”
The paper focuses on those visualizations for which the data can be generalized into a corresponding schema. However, even for those visualizations, the translation from the language of the producer to the language of the consumer is not necessarily trivial. “Visualization requires the producer (the addresser, in semiotic terminology) and the consumer (the addressee) to share contextual knowledge for successful meaning making to take place.” This statement by the authors is an instantiation of Lotman’s famous observation that “the nature of the intellectual act could be described in terms of being a translation, a definition of meaning as a translation from one language to another, whereas extra-lingual reality may be regarded as yet another type of language” [4]. The authors stress that reading the visualization’s representation leads to an “understanding of the system and … knowledge of the truth as it pertains to that system,” so that the irrelevant details are abstracted out. The authors further note that there may be many representations of a single dataset (presumably different viewpoints, each representing only some characteristics of the data), so that a reverse translation may not always be possible.
The authors observe that “visualization starts with a system that we measure in various ways to generate data,” and that the data will always be partial; therefore, “what we are seeking is knowledge of the system,” and “it is by answering questions that one gains knowledge.” Thus, the business model of the visualization process is presented as a category with system as its initial object and knowledge as its terminal object. The need for theories is not explicitly mentioned, although, to quote Popper:
Science … cannot start with observations, or with the “collection of data,” as some students of method believe. Before we can collect data, our interest in data of a certain kind must be aroused: the problem always comes first. [5]
The model of the visualization process is used to “define more formally ... some common intuitions” about visualization, including properties such as sensitivity, non-redundancy, literalness, and non-ambiguity, formulated in terms of properties of the render morphism from data to representation, as well as the distinction between arbitrary and schema-defined (that is, potentially useful) chart junk. Regrettably, almost all examples used in the paper (student scores) are not very interesting and compare unfavorably with examples in other sources [1-3]. The intensional generalization example in the very short section 5.3 is much more interesting, however.
The authors survey related research, including both algebraic semiotics “involving theoretical apparatus taken from mathematical [sic] algebra” and various interesting approaches not based on category theory. They claim that their approach is distinguished from algebraic semiotics by “explicitly incorporat[ing] … the visualization’s context,” although Goguen clearly states that he uses “category theory (pushouts) to study the combination (blending) of structures and the effect of context on meaning” [1].
I think that the authors succeed in their goal of “developing from first principles a formal description of visualization” that can be used for reasoning and understanding.