About SemPrivacy

Privacy is a global concern that many researchers have tried to address, often through various forms of technology. The same technologies, however, often create new behaviors and processes that complicate or even damage the privacy of individuals. SemPrivacy intends to be a place for researchers, developers and practitioners to reflect on, discuss and investigate this tension between the positive and negative effects of semantic (web) technologies on various aspects of privacy.

Semantic web and linked data technologies were designed for, and have grown to be used in, a large variety of industries (search engines, commerce, administration, government, education, etc.) as approaches to facilitate the publication, integration and sense making of data and information on the web. They are especially interesting for making the distribution of data happen seamlessly within the web infrastructure, and for enabling such data to carry some elements of meaning with them, simplifying the issue of data interchangeability.

The potential for such technologies to generate new privacy threats, or amplify existing ones, is clear. While the tools and languages used for creating the semantic and linked data web do not, by themselves, pose many privacy issues, the simple fact that they provide easier ways to share information (which might be private), to transfer it and to distribute it naturally generates use cases with ever more complex privacy challenges. Data integration is an especially interesting case here. It is usually understood as taking separate sets of data and putting them together so as to create a global, consistent new dataset from which more information can be inferred. Current practices typically consider the preservation of privacy at the level of individual datasets, without the ability to predict how the integration of those datasets with others might create new, unforeseen privacy threats. On the web of linked data, the semantic web, each piece of data is naturally connected to possibly millions of others from different origins, with different purposes, etc. The web itself becomes an integrated dataset, and understanding what could potentially be extracted from it seems to be an impossible job.

While SemPrivacy is intended to investigate such scenarios where the technology is “amplifying” already complex privacy threats, we also want to address the other, more positive side of the coin: how the same technologies can support the protection of privacy. It is a rather common idea that, as individual users, we can only achieve privacy through control. In the online world (especially the one depicted above), where information is transferred from us towards many other agents that aggregate data beyond our reach, achieving control is a complex task, impossible without some level of technological support. One of the strengths of semantic technologies here is their ability to support sense making of complex, rich and distributed data. This can support privacy by helping users and developers, for example, to create meaningful models of data access and data propagation, making it possible for individuals to understand what’s happening to their personal data and therefore increasing their ability to control it.

Here, we will therefore look into these issues and opportunities mainly from two angles. First, we will look at use cases where semantic technologies are used and either create a privacy risk or support better management of privacy. Second, we will look at the tools currently being developed within various R&D communities to support the management of privacy with semantic web technologies. We wish for SemPrivacy to trigger a dialog between the different actors involved, and we welcome contributions to the discussion (get in touch with one of the “members”). We will also try to maintain a list of resources that can help newcomers to either privacy or semantic technologies get an overview of the area.