Following up on work in link mining, in which the links between objects are fully taken into account during the data mining process, this paper addresses the clustering task on heterogeneous networks.
The novelty of the model, called PROCESS (short for probabilistic clustering model for heterogeneous networks), lies in its handling of heterogeneity: in addition to the direct relations between objects (for example, friendship), the authors make it possible to relate objects on the basis of shared properties (for example, both objects are “red” or “married”) and to take relations between properties into account. The parameters are estimated within the expectation-maximization (EM) framework, using a variant of the message passing algorithm; this part is not easy to follow if you are unfamiliar with this kind of optimization procedure. The comparison with other models, including a previous model proposed by the same authors and spectral relational clustering, shows that PROCESS achieves better cluster quality, as measured by normalized mutual information and the F-measure, with a linear runtime.
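For readers unfamiliar with normalized mutual information, the following is a minimal sketch of how it can be computed from two flat label assignments. It is not taken from the paper: the function names are hypothetical, and the normalization by the geometric mean of the two entropies is an assumption, since the paper may use a different variant (for example, the arithmetic mean).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a flat label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of equal length."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (n_xy / n) * math.log(n * n_xy / (count_a[x] * count_b[y]))
        for (x, y), n_xy in joint.items()
    )


def nmi(truth, pred):
    """NMI normalized by the geometric mean of the entropies (an assumption)."""
    h_truth, h_pred = entropy(truth), entropy(pred)
    if h_truth == 0 or h_pred == 0:
        # Degenerate case: at least one side puts everything in one cluster.
        return 1.0 if h_truth == h_pred else 0.0
    return mutual_information(truth, pred) / math.sqrt(h_truth * h_pred)
```

The score depends only on how objects are grouped, not on the cluster names: `nmi([0, 0, 1, 1], [1, 1, 0, 0])` is 1 up to floating-point rounding, while two statistically independent partitions such as `[0, 0, 1, 1]` and `[0, 1, 0, 1]` score 0.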
The experiments were performed on eight artificial datasets and one real dataset, a bibliographic network extracted from the ACM Digital Library. The experiment on the real dataset is not very convincing, because it seems that no relation between the properties (here, terms) is considered. This weakens the demonstration of PROCESS's superiority. Besides, neither the dataset nor the implementation of PROCESS is available to the community.