Re: SPARQL Protocol Privacy section



On Wednesday 16 November 2005 13:30, Eric Prud'hommeaux wrote:
> He suggested that Rigo should review the Privacy section.
> I propose that Thomas and Rigo send any suggested changes to the
> comments list.
>   mailto:[hidden email]

My comments on Privacy:

Dear all, I have discussed this a bit with Eric Prud'hommeaux and found
the following points interesting and worth a remark to the Group.

I think the current section is a template text of the kind found in many
places: it carries little meaning, yet it has huge hidden requirements.
What would it mean for every SPARQL request or engine to comply with
Directive 95/46/EC of 24 October 1995 on the protection of individuals
with regard to the processing of personal data and on the free movement
of such data, or with Directive 2002/58/EC of 12 July 2002 concerning
the processing of personal data and the protection of privacy in the
electronic communications sector? There are a lot of rules ... The
conformance of a SPARQL engine would depend on such rules.

On the other hand, the real issues are not addressed by the current
text. Most queries coming from natural persons who are identifiable
(via IP address or other IDs) will be personal data and thus should be
treated with care. Data about companies, by contrast, is protected by
data protection laws only in Italy. It shouldn't trigger too much data
protection/privacy attention and is dealt with in the
security/confidentiality area.

Now consider a query sent over the wire by the individual. Imagine for
a moment someone looking for information about AIDS or other medical
topics. This is highly sensitive information that passes through all
those hubs, routers etc.

Another example that Eric showed me involves three parties: 1/ the
requester, 2/ some SPARQL service and 3/ some content/RDF repository.
In this case, it might be a question of privacy whether the personal
data (e.g. login name) of the requester 1/ is passed on by 2/ to the
repository. Normally in those setups, 2/ and 3/ have some business
agreement. This means that 3/ normally does not need to know the
identity of 1/ to fulfil the request. Such a scenario could be very
privacy enhancing, as an individual might even be able to access the
repository 3/ via two different SPARQL services, thus making tracking

All those considerations lead me to suggest the following paragraph for
the SPARQL Protocol Specification:

<header>Privacy Considerations</header>

Query strings and the URIs attached to them can reveal very sensitive
information. If this sensitive information is linked or linkable to a
company, we normally speak about security. If it is linked or linkable
to a person, we talk about privacy. This section gives recommendations
and hints on how to treat the latter. Cases where sensitive personal
data might appear in SPARQL queries sent over the SPARQL Protocol
require special attention.

If a setup mostly concerns consumers and natural persons, the personal
information should be protected in some way. This can be achieved using
SSL. But not every setup is so sensitive that it needs the burden of
the full encryption engine. Nevertheless, in cases involving personal
data, this personal data should be obfuscated in some way in the query
string. This could be done using some known techniques like base-64 or
Rot13. It might also be possible to use must-understand symmetric
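
A minimal sketch of the base-64 option, in Python. The parameter name
"user" and the helper names are assumptions made for illustration; only
"query" is a parameter name taken from the SPARQL Protocol. Base-64 is
a reversible encoding, so this hides the value from casual inspection
only and is not a substitute for SSL.

```python
import base64
from urllib.parse import urlencode


def obfuscate(value: str) -> str:
    """Base-64 encode a personal value with the URL-safe alphabet.

    Anyone can reverse this; it is obfuscation, not encryption.
    """
    return base64.urlsafe_b64encode(value.encode("utf-8")).decode("ascii")


def reveal(token: str) -> str:
    """Reverse obfuscate() on the receiving side."""
    return base64.urlsafe_b64decode(token.encode("ascii")).decode("utf-8")


def build_query_string(query: str, login: str) -> str:
    """Build a request query string with the login name obfuscated.

    "user" is a hypothetical parameter name, not part of the protocol.
    """
    return urlencode({"query": query, "user": obfuscate(login)})
```

The receiving service would call reveal() on the "user" value after
normal URL decoding.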

In cases where more than two parties are involved and the party making
the request is a consumer or a natural person (acting on their own
behalf), the party that receives the first request MUST NOT pass
personal data on to subsequent services UNLESS this data is necessary
for the completion of the request.
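
The rule above could be sketched as follows for the service that
receives the first request. This is an illustration only: the set of
personal parameter names and the function name are hypothetical, and a
real service would need its own way to decide what counts as personal
data and what is necessary for the request.

```python
from typing import AbstractSet, Dict

# Hypothetical names of request parameters that carry personal data.
PERSONAL_PARAMS = frozenset({"user", "login", "email"})


def params_to_forward(
    params: Dict[str, str],
    needed: AbstractSet[str] = frozenset(),
) -> Dict[str, str]:
    """Return the parameters that may be passed on to the next service.

    Personal parameters are dropped unless they are listed in `needed`,
    i.e. unless they are necessary to complete the request.
    """
    return {
        name: value
        for name, value in params.items()
        if name not in PERSONAL_PARAMS or name in needed
    }
```

With the default arguments, a request carrying "query" and "user" is
forwarded with "query" only.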

Personal data might be found in query strings but also in URIs that are
sent with the query. Personal data as understood here is any
information that is linked or reasonably linkable to a natural person.

If the query also serves as a point of data collection, describing the
data handling practices via P3P is recommended. The P3P generic
attribute can be used in Schemata to link a data handling policy (P3P
Policy) to a certain XML element. See the [P3P generic attribute 1] for
more information.



Rigo Wenning            W3C/ERCIM
Staff Counsel           Privacy Activity Lead
mail:[hidden email]     2004, Routes des Lucioles
                        F-06902 Sophia Antipolis