The last post, which presented the idea of privacy translucence as a use case, was prompted by the PrivOn workshop on online privacy, which interestingly was co-located with the International Semantic Web Conference (ISWC). More precisely, I had a paper at this workshop, which focused on this use case and on some of the tools we have developed to support it. The argument was that, as described previously, building web privacy mirrors that integrate and make sense of users’ online privacy-related behaviours generates huge challenges, and that these are the kind of challenges generally targeted by semantic web technologies. It is quite clear, however, that even if these challenges are difficult to address, a number of initiatives can start addressing some of the smaller issues, hopefully converging towards a more complete solution.
Indeed, several systems already implement this idea to a certain extent, from tools to ‘track the trackers’ (Ghostery, Collusion and Spy Watch, which was presented at the workshop) to personal analytics services (Wolfram|Alpha Facebook, Moluti, etc.). Here, we presented some of the investigations we have been conducting over the last couple of years, from experiments in ‘Web Lifelogging’ to studies of consumer activity data, identifying the impact that giving users access to such data has on their privacy perceptions and policies. Our latest work in the area narrows the scope to one of the largest sources of ‘privacy ambiguities’ on the Web: Facebook. The tool, called epifabo, is a proof of concept showing how ontology-based and epistemic logic-based reasoning can give users a better view of the privacy consequences of their (and others’) actions on the social networking system.
These tools and prototypes illustrate the notion of semantic web privacy mirrors in different scopes and contexts. They also demonstrate, through their limitations, the need for a more global, coherent initiative addressing the issue, mentioned numerous times in the workshop, of user awareness of and control over online privacy.
Previously, we posted on how semantic technologies play an important role in mitigating privacy risks in both conventional software systems and semantic web applications. However, the immaturity of semantic technologies means they can be easily exploited for nefarious gains. For example, Nasirifard et al. demonstrate how publicly available FOAF profiles allow spammers to send context-aware spam. Unlike online profiles within social networks, FOAF-based structured data provides a more reliable and accessible “food” for spammers and attackers. Current solutions (e.g. digital signatures) and proposed methods to restrict unauthorized access to FOAF files can prevent a subset of such activities, but they are not widely used.
Interestingly, however, FOAF is also a great illustration of the two-sided aspect of technology in privacy. Indeed, while creating such potentially new challenges for privacy, FOAF is also at the basis of WebID, one of the most interesting current developments in giving users back control over their personal information, by enabling a peer-to-peer identity framework where each user is responsible for managing their own identity (see also FOAF+SSL).
While the uptake of semantic technologies continues, albeit rather slowly, recent applications have thrown open new challenges, especially with regard to privacy. In this post we look at some interesting use cases.
SMART HOME: Werner Wilms and his team are developing a home automation middleware in which they decouple sensors and actuators from the applications using ontologies such as the Semantic Sensor Network Ontology, and thereby create a model of the facilities. The requirement is that all the sensors and their observations, including their adjustments, should only be seen and manipulated by the people who are allowed to do so (e.g. some users may not want to have lights blinking in their bedroom). Therefore, all participants, such as facility owners, energy managers or HVAC installers, should have different levels of access to the facilities. To achieve this, Wilms plans to use a model with roles and rights on the one hand and ownership on the other to grant or block access at the REST API level. This works much like Facebook’s OAuth app access.
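To make the roles-plus-ownership idea more concrete, here is a minimal sketch of how such a check might sit in front of a REST endpoint. It is not taken from the project: the role names, the rights table, the `Device` class and the `handle_request` function are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical role -> permitted actions table (an assumption, not the project's model).
ROLE_RIGHTS = {
    "facility_owner": {"read", "write"},
    "energy_manager": {"read"},
    "hvac_installer": {"read", "write"},
}


@dataclass
class Device:
    device_id: str
    owner: str
    # Roles that have been granted to other users for this device (user -> role).
    grants: dict = field(default_factory=dict)


def is_allowed(user: str, action: str, device: Device) -> bool:
    """The owner may do anything; other users are limited by their granted role."""
    if user == device.owner:
        return True
    role = device.grants.get(user)
    return role is not None and action in ROLE_RIGHTS.get(role, set())


def handle_request(user: str, method: str, device: Device) -> int:
    """Map an HTTP method to an action and return an HTTP-style status code."""
    action = "read" if method == "GET" else "write"
    return 200 if is_allowed(user, action, device) else 403


bedroom_light = Device("light-bedroom", owner="alice", grants={"bob": "energy_manager"})
```

In this sketch the owner (`alice`) can read and change the bedroom light, an energy manager (`bob`) can only read its observations, and anyone without a grant is refused. Checking ownership before roles is one possible design choice; a real system may well combine the two differently.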
WEB OF NEEDS: In the “Web of Needs” project, users are allowed to formulate and publish machine-readable ‘needs’ (e.g. ‘I need a hotel room in Paris from 2013/06/01 till 2013/06/09’). These needs are published as linked data. Matching services crawl this subset of linked data and perform matching. Matching needs are sent a ‘hint’ message and their respective owners can connect with each other. Currently this is only possible via a chat connection; in future versions, negotiation and transaction protocols will be in place. Of course, publishing needs can have huge privacy implications: individual needs should not be traceable back to their owner, while at the same time the whole set of needs an individual publishes can create a profile that may be exploited in a number of ways.
GEO DATA: When publishing linked data on geographical information, one can soon start to infer lots of things about the circumstances of individual people. For instance, what if we publish data about land usage together with data about who owns what property? Both are public information, but together they can say a lot about what an individual (e.g. a farmer or a company) is doing.
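The inference risk can be illustrated with a very small sketch. The two dictionaries below stand in for two separately published datasets keyed by the same parcel identifier; the identifiers, values and the `profile_owners` function are invented for the example.

```python
from collections import defaultdict

# Invented sample records standing in for two independently published datasets.
land_use = {"parcel-1": "vineyard", "parcel-2": "fallow", "parcel-3": "solar farm"}
ownership = {"parcel-1": "owner-A", "parcel-2": "owner-A", "parcel-3": "owner-B"}


def profile_owners(land_use: dict, ownership: dict) -> dict:
    """Join the two datasets on the parcel identifier to get land uses per owner."""
    profiles = defaultdict(set)
    for parcel, owner in ownership.items():
        if parcel in land_use:
            profiles[owner].add(land_use[parcel])
    return dict(profiles)


profiles = profile_owners(land_use, ownership)
```

Neither dataset alone says what a given owner does with their land, but the join yields a per-owner profile. With linked data the shared identifier is typically a URI, which makes this kind of join especially easy.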
MOBILE APPS: Many services can take advantage of personal information available in RDF. For this reason we developed tools for Android to semantise personal information stored on the phone. However, this is a threat to privacy because it potentially exposes this information. It is thus necessary to define mechanisms allowing the information holder to control its exposure. Jérôme Euzenat and his team are exploring the use of semantic web technologies for this purpose (see project info).
An interesting aspect of semantic technologies is that they are well suited to addressing some of the challenges relating to privacy, especially modelling and reasoning over domain knowledge about information flows that threaten the privacy of users in ‘conventional’ computer systems. By ‘conventional’ we mean all computer systems whose primary focus is not the Semantic Web. In this post we look at some of the uses of ontologies in enhancing privacy.
For example, Tang et al. used a privacy ontology to link real-life court cases to concepts found in privacy directives and principles. Using reasoning over this expressed knowledge, the case-analyser is able to support legal arguments. In another work, Wong et al. propose a high-level system architecture for a ‘Semantic Grid’ to tackle information sharing between law-enforcement agencies from several countries. They claim the agents in the architecture make sense of criminal information by reasoning over ontologies which link heterogeneous data from distributed sources.
In more generic approaches, Sacco and Passant propose the Privacy Preference Ontology (PPO), a lightweight vocabulary on top of the Web Access Control ontology that aims to give users the means to define fine-grained privacy preferences for restricting (or granting) access to specific RDF data. Similarly, Kost et al. propose to use an ontology for the systematic design of privacy requirements and their verification.
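As a rough illustration of what fine-grained preferences over RDF data could look like, the sketch below filters a small FOAF-style profile according to the groups a requester belongs to. It deliberately does not use the actual terms of the Privacy Preference Ontology or Web Access Control: the `Preference` structure, the group names and the `visible_triples` function are assumptions made for this example, and plain tuples stand in for an RDF store.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class Preference(NamedTuple):
    predicate: str             # the property this preference restricts
    allowed_groups: frozenset  # requester groups that may see its values


# Invented profile data using well-known FOAF property names.
PROFILE = [
    Triple("ex:alice", "foaf:name", "Alice"),
    Triple("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    Triple("ex:alice", "foaf:phone", "tel:+000000000"),
]

# Hypothetical preferences: who may see which property.
PREFERENCES = [
    Preference("foaf:mbox", frozenset({"friends", "colleagues"})),
    Preference("foaf:phone", frozenset({"friends"})),
]


def visible_triples(triples, preferences, requester_groups):
    """Return the triples the requester may see; unrestricted properties stay public."""
    restricted = {p.predicate: p.allowed_groups for p in preferences}
    result = []
    for t in triples:
        allowed = restricted.get(t.predicate)
        if allowed is None or allowed & set(requester_groups):
            result.append(t)
    return result
```

For example, `visible_triples(PROFILE, PREFERENCES, {"colleagues"})` keeps the name and mailbox but drops the phone number. Treating properties without a preference as public is a simplification made here; a deny-by-default policy would be an equally plausible choice.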
While most of them remain preliminary, these initiatives demonstrate some of the added value that ontologies can bring to the management of privacy, both generally and in various domains. The most obvious benefit tends to be the ability to reason about information access and the policies that govern it. It is, however, also interesting to see how ontologies can provide more flexible structures for expressing privacy policies and requirements than the usual, rather rigid formats employed for this.
PriSet-2013, the First International Workshop on Privacy in Semantic Technologies, unfortunately could not be held on 23 June 2013, as the main K-CAP conference was cancelled due to extreme weather conditions in Banff, Canada.
However, we thank all those who submitted contributions to the survey we conducted earlier, which was to be used in the workshop discussion. In the coming weeks we hope to analyse the responses and produce a technical report, which will be made available on this website.