Comments on TF-Graphs/Minimal-dataset-semantics


Graham Klyne-2
Ref. http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics

I understand this proposal is due to be discussed soon by the RDF working group,
and would like to offer some comments based on my work with the W3C provenance
WG.  (Although derived from my contact with the provenance WG work, these are my
personal comments, and have not been discussed with or endorsed by the
provenance WG.)

I am particularly keen that RDF Datasets can represent the kind of situation
that is intended to be addressed by the provenance "mention" construct; cf.
http://www.w3.org/TR/2012/WD-prov-dm-20120724/#term-mention,
http://lists.w3.org/Archives/Public/public-prov-comments/2012Aug/0001.html.


First, my thanks to the authors of this proposal; generally, it seems to me to
be a nicely crafted and useful proposal for underpinning semantically
justifiable uses of RDF Datasets.

I'll respond to the discussion points raised in the proposal.  Some of my
responses are marked "(preference)", indicating that I don't currently think the
choice made is critical - that I see possible workarounds if the opposite choice
is adopted.  I regard my responses on DD0 and DD5 as the important ones.


DD0:
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD0:_Do_we_define_a_semantics_for_RDF_datasets.3F

Yes, please define semantics for datasets.  I feel that to fail to provide some
level of framework for associating semantics with datasets would be a failure of
the working group's charter.  Even if relatively few people actually read or
understand the formal semantics, I feel they provide a "centre of gravity" that
helps to promote consistent treatment of RDF constructs, particularly where
subtle alternative uses are possible.


DD1:
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD1:_Different_regime_for_default_graphs_and_named_graphs.3F

(preference)

I don't feel strongly about this, but on balance I feel that applying a single
entailment regime across all graphs is simpler and easier to understand, hence less
likely to lead to divergent understanding or expectations.  I'm not aware of any
compelling case for supporting multiple entailment regimes in a dataset.


DD2:
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD2:_No-Semantics

(preference)

It's not clear to me what purpose is served by a weakened entailment regime.
Depending on how and when it might be applied, it could even be considered
contrary to the current RDF semantics, which requires all semantic extensions to
be consistent with base RDF semantics.


DD3:
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD3:_Let_the_dataset_announce_its_assumed_entailment_regime.3F

(preference)

I am inclined to respond "no" to this, for 2 reasons:
(1) it is a new feature that might introduce further complications and
difficulties for implementations and modellers.  As far as I can tell, not
defining it now does not preclude defining such a feature as a future semantic
extension when its implications are better understood.
(2) many applications will not support entailment regimes, or may have their own
local ones, and defining a dataset to depend on them could limit its utility.  Failure
to implement an entailment regime should not, as I understand, lead to incorrect
results, just incomplete ones.
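
To illustrate the "incomplete, not incorrect" point in (2), here is a minimal
sketch.  It assumes a toy triple store and only the two RDFS subclass rules
(rdfs9, rdfs11); the data and the helper names are invented for the example and
are not taken from the proposal.

```python
TYPE, SUB = "rdf:type", "rdfs:subClassOf"

def rdfs_subclass_closure(triples):
    """Apply rdfs9 (type propagation) and rdfs11 (subclass transitivity) to a fixpoint."""
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            if p in (TYPE, SUB):
                for s2, p2, o2 in closed:
                    if p2 == SUB and s2 == o:
                        new.add((s, p, o2))
        if new <= closed:
            return closed
        closed |= new

def instances_of(triples, cls):
    """Subjects explicitly typed as cls in the given set of triples."""
    return {s for s, p, o in triples if p == TYPE and o == cls}

# Hypothetical data.
data = {
    ("ex:rex", TYPE, "ex:Dog"),
    ("ex:Dog", SUB, "ex:Animal"),
    ("ex:tom", TYPE, "ex:Animal"),
}

asserted_only = instances_of(data, "ex:Animal")
with_entailment = instances_of(rdfs_subclass_closure(data), "ex:Animal")
```

An application that ignores the regime returns only ex:tom; one that applies it
also returns ex:rex.  The first answer is a subset of the second, so nothing it
reports is wrong.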


DD4, DD5:
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD4:_Does_the_graph_extension_assign_graphs_to_resources_or_to_IRIs.3F

I'm treating these together, because I think my response to DD5 renders DD4
somewhat moot.

I think it would be very useful if a graph name n *does* denote the IGEXT(n)
graph, as this would provide a hook for future semantic extensions.  In the
context of provenance, we want to be able to express contexts/situations that
are specializations of others (e.g. when talking about a web document on a
particular date as a particular instance of that document during a particular
year).  While I would not (necessarily) expect the specifics of such a mechanism
to be part of the RDF Dataset semantics, having a name for talking about the
graphs leaves open the possibility of introducing new properties with their own
extension semantics.  The inconsistencies that would arise if the URI is used as
some other kind of resource seem to me to be quite benign (i.e. "don't do that").

BUT: this raises a further question: is there any way to refer to the default
graph (or some graph that entails the default graph) in an RDF Dataset?


DD6:
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD6:_Open-graph_or_closed-graph_semantics

(preference)

I favour open graph semantics.  I think that is more consistent with current
RDF, and less likely to lead to surprises.  (Based on my understanding of RDF, I
find some of the illustrated results of closed graph semantics to be surprising.)


DD7:
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD7:_Is_the_default_graph_universally_true.3F

(preference)

Nit:  I don't know what is meant by a graph satisfying another graph.  I assume
that "Should the truth of a named graph require that the named graph satisfies
the default graph?" is asking "Should the truth of a named graph require that
any interpretation satisfying the named graph also satisfies the default graph?"

My response to this would be "no".  I think this kind of additional semantic
constraint should be for an extension to introduce (see my response above to DD5).

I feel that requiring this constraint universally could make it harder to say
things about hypothetical or fictitious contexts.

...

#g





Re: Comments on TF-Graphs/Minimal-dataset-semantics

Ivan Herman-2
Hi Graham,

some questions...

On Sep 18, 2012, at 11:22 , Graham Klyne wrote:

> Ref. http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics
>

[snip]

>
>
> DD4, DD5: http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD4:_Does_the_graph_extension_assign_graphs_to_resources_or_to_IRIs.3F
>
> I'm treating these together, because I think my response to DD5 renders DD4 somewhat moot.
>
> I think it would be very useful if a graph name n *does* denote the IGEXT(n) graph, as this would provide a hook for future semantic extensions.  In the context of provenance, we want to be able to express contexts/situations that are specializations of other (e.g. when talking about a web document on a particular date as a particular instance of that document during a particular year).  While I would not (necessarily) expect the specifics of such a mechanism to be part of the RDF Dataset semantics, having a name for talking about the graphs leaves open the possibility of introducing new properties with their own extension semantics.  The inconsistencies that would arise if the URI is used as some other kind of resource seem to me to be quite benign (i.e. "don't do that").
>

I am not sure I understand the argumentation.

The present proposal has a strong analogy to the way properties are modeled in the current RDF semantics. If a property has the URI 'p', 'p' does not 'denote' that property's extension, because I(p) is not a set of pairs itself but, rather, IEXT(I(p)) is. That provides a smoother way to talk about 'p' or I(p). The current IGEXT approach is fully analogous; I(g) is not a graph, but IGEXT(I(g)) is.

What you favour would mean that the IGEXT is defined on the URIs themselves. Why would that "... provide a hook for future semantic extensions" as opposed to the current situation? For practical purposes 'n' in a named graph is, shall we say, 'associated' with the graph, and that seems to be enough for the kind of additional properties you are referring to. Again, just as it is perfectly possible to make all kinds of statements about property 'p', in spite of the fact that, strictly speaking, 'p' does not denote the set of pairs either...


> BUT: this begs a further question: is there any way to refer to the default graph (or some graph that entails the default graph) in an RDF Dataset?
>

Not that I know of.

Ivan

----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf







Re: Comments on TF-Graphs/Minimal-dataset-semantics

Graham Klyne-2
In reply to this post by Graham Klyne-2
On 18/09/2012 11:17, Ivan Herman wrote:

> On Sep 18, 2012, at 11:22 , Graham Klyne wrote:
>
>> Ref. http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics
>>
>
> [snip]
>
>>
>>
>> DD4, DD5: http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics#DD4:_Does_the_graph_extension_assign_graphs_to_resources_or_to_IRIs.3F
>>
>> I'm treating these together, because I think my response to DD5 renders DD4 somewhat moot.
>>
>> I think it would be very useful if a graph name n *does* denote the IGEXT(n) graph, as this would provide a hook for future semantic extensions.  In the context of provenance, we want to be able to express contexts/situations that are specializations of other (e.g. when talking about a web document on a particular date as a particular instance of that document during a particular year).  While I would not (necessarily) expect the specifics of such a mechanism to be part of the RDF Dataset semantics, having a name for talking about the graphs leaves open the possibility of introducing new properties with their own extension semantics.  The inconsistencies that would arise if the URI is used as some other kind of resource seem to me to be quite benign (i.e. "don't do that").
>>
>
> I am not sure I understand the argumentation.
>
> The present proposal has a strong analogy to the way properties are modeled in the current RDF semantics. If a property has the URI 'p', 'p' does not 'denote' that property, because I(p) is not set of pairs itself but, rather, IEXT(I(p)) is. That provides a smoother way to talk about 'p' or I(p). The current IGEXT approach has a full analogy to this; I(g) is not a graph, but IGEXT(I(g)) is.
>
> What you favour would mean that the IGEXT is defined on the URI-s themselves. Why would that "...would provide a hook for future semantic extensions" as opposed to the current situation? For practical purposes 'n', in a named graph is, shall we say, 'associated' with the graph, and that seems to be enough for the kind of additional properties you are referring to. Again, just as it is perfectly possible to make all kinds of statement on property 'p', in spite of the fact that, strictly speaking, 'p' does not denote the Property either...


Regarding the analogy:

By my understanding, the reason for using ICEXT for class extensions is that it
allows a class name to be associated with the set of its members without running
into problems with the set theoretic logic - http://www.w3.org/TR/rdf-mt/#technote

Further, the use of property extensions allows a non-extensional (intensional)
notion of identity for properties (i.e. two different properties may hold for the
same set of value pairs, yet retain their distinct identity.)
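
As a toy illustration of the parenthetical point above, the sketch below keeps
the denotation of a property name separate from its extension.  The resource
names (r_author, r_creator) and the ex: properties are invented for the example.

```python
# I maps IRIs to resources; IEXT maps property resources to sets of pairs.
I = {"ex:author": "r_author", "ex:creator": "r_creator"}
IEXT = {
    "r_author": frozenset({("ex:doc1", "ex:alice")}),
    "r_creator": frozenset({("ex:doc1", "ex:alice")}),
}

def same_extension(p, q):
    """True if the two property names hold for exactly the same pairs."""
    return IEXT[I[p]] == IEXT[I[q]]

def same_property(p, q):
    """True if the two property names denote the same resource."""
    return I[p] == I[q]
```

Here same_extension("ex:author", "ex:creator") holds while same_property does
not: the two properties agree on every pair yet remain distinct resources.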

In the case of IGEXT, as I read it, that's a mapping from URIs to *single*
graphs, not sets of graphs.  And there does not seem to be any possibility here
of self-referentiality.  So I'm not seeing any compelling reason to use IGEXT
here rather than direct denotation.  Unless I'm mistaken (which is entirely
possible), the role of IGEXT could be subsumed by the interpretation Id itself.

Thus, accepting the suggestion "DD5: Does the graph name denote the graph?" the
part of graph interpretation:

"for an IRI n and RDF graph g, I(<n,g>) is true iff IGEXT(Id(n)) is defined and
E-entails g;"

might become

"for an IRI n and RDF graph g, I(<n,g>) is true iff Id(n) is a graph that E-entails g;"

For expositional purposes, it might be useful to keep the IGEXT function (or
similar), but add additional semantic constraints on the interpretation to
require that the graph names also denote the corresponding graphs.
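
The two truth conditions can be compared side by side in a small sketch.  It
rests on simplifying assumptions that are mine, not the proposal's: graphs are
ground (no blank nodes) and are modelled as frozensets of triples, and
E-entailment is approximated by simple entailment, which for ground graphs
reduces to the subset test.  All function and variable names are hypothetical.

```python
def entails(g, h):
    """Simple entailment between ground graphs: g entails h iff h is a subset of g."""
    return h <= g

def true_via_igext(Id, IGEXT, n, g):
    """Wording quoted from the proposal: IGEXT(Id(n)) is defined and entails g."""
    resource = Id.get(n)
    return resource in IGEXT and entails(IGEXT[resource], g)

def true_via_denotation(Id, n, g):
    """Suggested variant: Id(n) is itself a graph that entails g."""
    denoted = Id.get(n)
    return isinstance(denoted, frozenset) and entails(denoted, g)

g = frozenset({("ex:a", "ex:p", "ex:b")})
bigger = g | {("ex:a", "ex:q", "ex:c")}

# Indirect: the name denotes an opaque resource, which IGEXT maps to a graph.
Id_indirect = {"ex:n": "r_n"}
IGEXT = {"r_n": bigger}

# Direct: the name denotes the graph itself.
Id_direct = {"ex:n": bigger}
```

Both readings make <ex:n, g> true for their respective interpretations.  The
difference is that under the direct reading a name that denotes anything other
than a graph can never make a named graph true.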


Turning to the second part of your question.

If the graph URIs do denote graphs, then one can imagine that new specifications
can define new vocabularies with associated semantic extensions that constrain
the graph interpretations in interesting ways.

I'm thinking that something like:

   g1 rdfgr:contextSpecializationOf g2

would come with an additional semantic constraint

   Id(g1) entails Id(g2)
   for all <g1,g2> in IEXT(rdfgr:contextSpecializationOf)
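
A rough sketch of checking such a constraint against a toy interpretation
follows.  It assumes graph names denote graphs directly, models ground graphs
as frozensets of triples, and uses the subset test as a stand-in for
entailment; the resource name r_spec and the example graphs are invented.

```python
def entails(g, h):
    """Simple entailment between ground graphs: g entails h iff h is a subset of g."""
    return h <= g

def specialization_constraint_holds(Id, IEXT, prop="rdfgr:contextSpecializationOf"):
    """True iff, for every pair (x, y) in the property's extension, x entails y."""
    return all(entails(x, y) for x, y in IEXT.get(Id[prop], set()))

# Hypothetical graphs: a document over a year, and the same document on one date.
doc_year = frozenset({("ex:doc", "ex:title", "Spec")})
doc_date = doc_year | {("ex:doc", "ex:status", "draft")}

Id = {
    "ex:g1": doc_date,
    "ex:g2": doc_year,
    "rdfgr:contextSpecializationOf": "r_spec",
}
IEXT_ok = {"r_spec": {(Id["ex:g1"], Id["ex:g2"])}}
IEXT_bad = {"r_spec": {(Id["ex:g2"], Id["ex:g1"])}}
```

The first extension satisfies the constraint because the dated graph contains
everything in the yearly one; the reversed pair does not.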

Thinking about your question, it occurs to me that maybe something like this can
be achieved without having Id(n) = IGEXT(n), but I'd still ask why bother with
the extra machinery and indirection, which seems to be used to avoid what seems
to me to be a pretty harmless potential problem (i.e. the inconsistency if n is
used in a way that implies it's something other than a graph).

...

I clearly haven't worked out the technical details, but I'm hoping this is
sufficient to explain my motivation for offering the response that I did.

#g