Despite the sci-fi ring of its name, semantic cloaking is an approach for obscuring a user’s location and preserving anonymity while still supporting many online location-based services. With data breaches becoming more frequent and more harmful, this approach can both hide a user’s mobility data and blend her location into the crowd.
Unlike encryption or location sharing, this form of cloaking completely removes physical locations and replaces them with semantic locations, such as home or work. The stated goal is that the original user’s identity cannot be recovered even by analyzing all the data. Semantic locations are derived by merging consecutive waypoints based on attributes such as time and distance, and by analyzing the stationary portions of the user’s trajectory. The resulting physical trajectories are then correlated with physical points of interest from other data sources. The choice of labels matters, since a user whose label falls outside the popular crowds may end up with reduced privacy.
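The waypoint-merging step described above can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a generic stay-point heuristic, and the names `Waypoint`, `stay_points`, and `label_stay`, as well as the distance and duration thresholds, are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from math import radians, sin, cos, asin, sqrt


@dataclass
class Waypoint:
    lat: float  # degrees
    lon: float  # degrees
    t: float    # timestamp in seconds


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon pairs."""
    r = 6371000.0  # mean Earth radius in meters
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    h = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(h))


def stay_points(points, max_dist_m=100.0, min_duration_s=600.0):
    """Merge consecutive waypoints that stay near an anchor point long enough.

    Thresholds are hypothetical defaults, not values from the paper.
    Returns a list of (mean_lat, mean_lon, start_t, end_t) tuples.
    """
    stays = []
    i, n = 0, len(points)
    while i < n:
        j = i + 1
        # Extend the group while each next point stays close to the anchor.
        while j < n and haversine_m(points[i].lat, points[i].lon,
                                    points[j].lat, points[j].lon) <= max_dist_m:
            j += 1
        group = points[i:j]
        if group[-1].t - group[0].t >= min_duration_s:
            mean_lat = sum(p.lat for p in group) / len(group)
            mean_lon = sum(p.lon for p in group) / len(group)
            stays.append((mean_lat, mean_lon, group[0].t, group[-1].t))
            i = j       # skip past the merged group
        else:
            i += 1      # too short to count as stationary; move on
    return stays


def label_stay(stay, pois, max_dist_m=150.0):
    """Replace a stay's coordinates with the nearest point-of-interest label.

    `pois` maps a semantic label to a (lat, lon) pair; "unknown" is returned
    when no point of interest lies within max_dist_m.
    """
    best_label, best_dist = "unknown", max_dist_m
    for label, (lat, lon) in pois.items():
        d = haversine_m(stay[0], stay[1], lat, lon)
        if d <= best_dist:
            best_label, best_dist = label, d
    return best_label
```

Only the returned labels, not the coordinates, would be passed on to a location service in this sketch; how labels are chosen to keep a user within a sufficiently large crowd is outside its scope.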
The authors developed a four-state semantic labeling framework to explore distinct episodes in the data, including places, transitions, and stationary time. A machine learning classifier then combines these features to provide semantic inference. Their approach relies on training with large ground truth datasets and selecting the best-performing models by accuracy, area under the receiver operating characteristic (ROC) curve, and per-label precision and recall. The strongest semantic location obfuscation can then be chosen based on these user datasets.
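To make the evaluation criteria concrete, the following is a small sketch of how accuracy and per-label precision and recall could be computed for predicted semantic labels. The function name `per_label_metrics` and the example labels are assumptions for illustration; the area under the ROC curve is omitted because it requires classifier scores rather than hard labels.

```python
from collections import Counter


def per_label_metrics(y_true, y_pred):
    """Return (accuracy, {label: {"precision": p, "recall": r}}).

    A ratio with a zero denominator is reported as 0.0.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        metrics[label] = {
            "precision": tp[label] / predicted if predicted else 0.0,
            "recall": tp[label] / actual if actual else 0.0,
        }
    accuracy = sum(tp.values()) / len(y_true) if y_true else 0.0
    return accuracy, metrics
```

Reporting precision and recall per label, rather than accuracy alone, shows whether a classifier performs well only on frequent labels such as home while failing on rarer ones.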
The paper provides an extensive evaluation of the semantic cloaking approach on large user location datasets. The authors show how the number of spatial points affects the accuracy of their algorithms, as does the number of physical and semantic location options. They acknowledge that their approach only works where physical locations can be mapped to semantic locations, so move-by-move navigation is not an appropriate scenario. However, where appropriate, semantic cloaking could provide an important tool for user mobility privacy.