Re: semantics of OPTIONAL/LeftJoin

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: semantics of OPTIONAL/LeftJoin

Roman Kontchakov
Dear Andy

It seems to me that the fix to the "written in full" definition of LeftJoin as described in http://www.w3.org/2013/sparql-errata#sparql11-query does not quite work.

For example, let Ω1 = { (?X -> :a) }, Ω2 = { (?X -> :a,  ?A -> 20), (?X -> :b, ?A -> 30) } and F = ?A = 30.

Using the normative definition, LeftJoin(Ω1,Ω2,F) = Filter(F, Join(Ω1,Ω2)) union Diff(Ω1,Ω2,F), we obtain

  - Join(Ω1,Ω2) = { (?X -> :a, ?A -> 20) } and so Filter(F, Join(Ω1,Ω2)) is empty;

  - on the other hand, Diff(Ω1,Ω2,F) = { (?X -> :a) } because for the only mapping μ1 in Ω1, the first mapping in Ω2 is compatible with μ1 but the effective boolean value of F is false and the second mapping in Ω2 is not compatible with μ1.

Thus,  LeftJoin(Ω, Ω2, F) = { (?X -> :a) }       (1)

Using the "written in full" definition from the errata,

LeftJoin(Ω1, Ω2, F) =   { merge(μ1, μ2) | μ1 in Ω1 and μ2 in Ω2, μ1 and μ2 are compatible and F(merge(μ1, μ2)) is true }
    ∪  { μ1 | μ1 in Ω1, ∀ μ2 in Ω2, μ1 and μ2 are not compatible, or Ω2 is empty }
    ∪  { μ1 | μ1 in Ω1, ∀ μ2 in Ω2, μ1 and μ2 are compatible and F(merge(μ1, μ2)) is false }

we obtain, respectively,  

   - the empty set (see above),
   - the empty set (because Ω2 is not empty and contains a mapping compatible with the only mapping in Ω1),
   - the empty set (because the first mapping in Ω2 is compatible with the only mapping in Ω1 but F is false).

The result, therefore, is the empty set, which is not equal to (1).

In fact, I suspect, there is no simple way of translating the two components in the union of the normative definition into some sort of `three-component' union. So, in my view, it is easier to replace the current "written in full" version of LeftJoin with Filter over Join + Diff (or omit it altogether to avoid any confusion).

Best regards
Roman

--
Roman Kontchakov
Birkbeck, University of London

Reply | Threaded
Open this post in threaded view
|

Re: semantics of OPTIONAL/LeftJoin

Andy Seaborne-4
Roman,

Thank you for the comment.  I've recorded this as query-errata-7a.

I agree that it may be impossible to write an explanatory expansion because the second two elements of the union can't be considered separately.  (When combined, they seem to form the diff part of the definition.)

     Andy

On 16/06/15 16:42, Roman Kontchakov wrote:

> Dear Andy
>
> It seems to me that the fix to the "written in full" definition of LeftJoin as described in http://www.w3.org/2013/sparql-errata#sparql11-query does not quite work.
>
> For example, let Ω1 = { (?X -> :a) }, Ω2 = { (?X -> :a,  ?A -> 20), (?X -> :b, ?A -> 30) } and F = ?A = 30.
>
> Using the normative definition, LeftJoin(Ω1,Ω2,F) = Filter(F, Join(Ω1,Ω2)) union Diff(Ω1,Ω2,F), we obtain
>
>    - Join(Ω1,Ω2) = { (?X -> :a, ?A -> 20) } and so Filter(F, Join(Ω1,Ω2)) is empty;
>
>    - on the other hand, Diff(Ω1,Ω2,F) = { (?X -> :a) } because for the only mapping μ1 in Ω1, the first mapping in Ω2 is compatible with μ1 but the effective boolean value of F is false and the second mapping in Ω2 is not compatible with μ1.
>
> Thus,  LeftJoin(Ω, Ω2, F) = { (?X -> :a) }       (1)
>
> Using the "written in full" definition from the errata,
>
> LeftJoin(Ω1, Ω2, F) =   { merge(μ1, μ2) | μ1 in Ω1 and μ2 in Ω2, μ1 and μ2 are compatible and F(merge(μ1, μ2)) is true }
>      ∪  { μ1 | μ1 in Ω1, ∀ μ2 in Ω2, μ1 and μ2 are not compatible, or Ω2 is empty }
>      ∪  { μ1 | μ1 in Ω1, ∀ μ2 in Ω2, μ1 and μ2 are compatible and F(merge(μ1, μ2)) is false }
>
> we obtain, respectively,
>
>     - the empty set (see above),
>     - the empty set (because Ω2 is not empty and contains a mapping compatible with the only mapping in Ω1),
>     - the empty set (because the first mapping in Ω2 is compatible with the only mapping in Ω1 but F is false).
>
> The result, therefore, is the empty set, which is not equal to (1).
>
> In fact, I suspect, there is no simple way of translating the two components in the union of the normative definition into some sort of `three-component' union. So, in my view, it is easier to replace the current "written in full" version of LeftJoin with Filter over Join + Diff (or omit it altogether to avoid any confusion).
>
> Best regards
> Roman
>
> --
> Roman Kontchakov
> Birkbeck, University of London
>


Reply | Threaded
Open this post in threaded view
|

Re: semantics of OPTIONAL/LeftJoin

Roman Kontchakov
In reply to this post by Roman Kontchakov
Dear Andy

I  wonder if you could also clarify the official formal semantics of OPTIONAL. If I understand the intention correctly, then LeftJoin(Ω1, Ω2, expr) should contain a representation of **each** solution mapping μ1 from Ω1 (either extended with some information from Ω2 or not). More precisely,

 - if there is μ2 in Ω2 such that μ2 is compatible with μ1 and their merge satisfies expr then each such merge is included in LeftJoin(Ω1, Ω2, expr);

 - otherwise, if there is no μ2 in Ω2 such that μ2 is compatible with μ1 and their merge satisfies expr, then μ1 is included in LeftJoin(Ω1, Ω2, expr).

If this understanding is correct then there is a slight problem with the definition of Diff: the condition in Diff(Ω1, Ω2, expr) does not appear to be the complement of the condition in Filter(expr, Join(Ω1, Ω2)). More precisely, if we denote the merge of μ1 and μ2 by μ, then `expr(μ) is an expression that has an effective boolean value of false' is not the complement of `expr(μ) has an effective boolean value of true': according to Section 17.2.2, if an argument of a subexpression is unbound then the EBV is a type error. To illustrate, consider

Ω1 = { (?X -> :a) }, Ω2 = { (?X -> :a), (?X -> :b, ?Y -> 30) } and F = ?Y = 30.

Then Join(Ω1,Ω2) = { (?X -> :a) } and so Filter(F, Join(Ω1,Ω2)) is empty because the EBV is a type error (not true). Also, Diff(Ω1, Ω2, F) is empty because μ1 = (?X -> :a) in Ω1 has a solution mapping in Ω2 that is compatible with it (the merge coincides with μ1) but F(?X -> :a) has an EBV of a type error because ?Y is not bound. Therefore, the solution mapping μ1 from Ω1 is not represented in any way in LeftJoin(Ω1, Ω2, F).

Such behaviour does not seem to correspond to the intention (as described above) and it also differs from LEFT JOIN in SQL. I was wondering whether this is a mistake or my understanding of the intention in incorrect. If it is a mistake then it could be corrected by replacing `expr(merge(μ, μ')) has an effective boolean value of false' in the definition of Diff by the following

'expr(merge(μ, μ')) does not have an effective boolean value of true'

Probably, it is also worth explaining that `does not have an EBV of true' means that the EBV is either false or a type error. It might also make sense to slightly rephrase the whole definition of Diff:

Diff(Ω1, Ω2, expr) = { μ | μ in Ω1 and expr(merge(μ, μ')) does not have an effective boolean value of true for all μ′ in Ω2 such that μ and μ′ are compatible  }

Best regards
Roman


Reply | Threaded
Open this post in threaded view
|

Re: semantics of OPTIONAL/LeftJoin

Axel Polleres-4
In reply to this post by Andy Seaborne-4
Dear Roman, Andy,

not having looked in detail into it yet (travelling), admittedly, but in our original suggestion for an erratum,
we had suggested

>>>>>>> LeftJoin(Ω1, Ω2, expr) =
>>>>>>> { merge(mu1, mu2) | mu1 in Omega1 and mu2 in Omega2, mu1 and mu2 are compatible and expr(merge(μ1, μ2)) is true }
>>>>>>> UNION
>>>>>>> { μ1 | μ1 in Ω1, *forall* mu2 in Omega2, either mu1 and mu2 are not compatible
>>>>>>>                                         or mu1 and mu2 are compatible and expr(merge(mu1, mu2)) is false }

which doesn't separate the 2nd and 3rd branch of the UNION.

Axel
--
Prof. Dr. Axel Polleres
Institute for Information Business, WU Vienna
url: http://www.polleres.net/  twitter: @AxelPolleres

> On 22 Jun 2015, at 12:46, Andy Seaborne <[hidden email]> wrote:
>
> Roman,
>
> Thank you for the comment.  I've recorded this as query-errata-7a.
>
> I agree that it may be impossible to write an explanatory expansion because the second two elements of the union can't be considered separately.  (When combined, they seem to form the diff part of the definition.)
>
>    Andy
>
> On 16/06/15 16:42, Roman Kontchakov wrote:
>> Dear Andy
>>
>> It seems to me that the fix to the "written in full" definition of LeftJoin as described in http://www.w3.org/2013/sparql-errata#sparql11-query does not quite work.
>>
>> For example, let Ω1 = { (?X -> :a) }, Ω2 = { (?X -> :a,  ?A -> 20), (?X -> :b, ?A -> 30) } and F = ?A = 30.
>>
>> Using the normative definition, LeftJoin(Ω1,Ω2,F) = Filter(F, Join(Ω1,Ω2)) union Diff(Ω1,Ω2,F), we obtain
>>
>>   - Join(Ω1,Ω2) = { (?X -> :a, ?A -> 20) } and so Filter(F, Join(Ω1,Ω2)) is empty;
>>
>>   - on the other hand, Diff(Ω1,Ω2,F) = { (?X -> :a) } because for the only mapping μ1 in Ω1, the first mapping in Ω2 is compatible with μ1 but the effective boolean value of F is false and the second mapping in Ω2 is not compatible with μ1.
>>
>> Thus,  LeftJoin(Ω, Ω2, F) = { (?X -> :a) }       (1)
>>
>> Using the "written in full" definition from the errata,
>>
>> LeftJoin(Ω1, Ω2, F) =   { merge(μ1, μ2) | μ1 in Ω1 and μ2 in Ω2, μ1 and μ2 are compatible and F(merge(μ1, μ2)) is true }
>>     ∪  { μ1 | μ1 in Ω1, ∀ μ2 in Ω2, μ1 and μ2 are not compatible, or Ω2 is empty }
>>     ∪  { μ1 | μ1 in Ω1, ∀ μ2 in Ω2, μ1 and μ2 are compatible and F(merge(μ1, μ2)) is false }
>>
>> we obtain, respectively,
>>
>>    - the empty set (see above),
>>    - the empty set (because Ω2 is not empty and contains a mapping compatible with the only mapping in Ω1),
>>    - the empty set (because the first mapping in Ω2 is compatible with the only mapping in Ω1 but F is false).
>>
>> The result, therefore, is the empty set, which is not equal to (1).
>>
>> In fact, I suspect, there is no simple way of translating the two components in the union of the normative definition into some sort of `three-component' union. So, in my view, it is easier to replace the current "written in full" version of LeftJoin with Filter over Join + Diff (or omit it altogether to avoid any confusion).
>>
>> Best regards
>> Roman
>>
>> --
>> Roman Kontchakov
>> Birkbeck, University of London
>>
>
>


Reply | Threaded
Open this post in threaded view
|

Re: semantics of OPTIONAL/LeftJoin

Andy Seaborne-4
On 25/06/15 13:02, Axel Polleres wrote:

> Dear Roman, Andy,
>
> not having looked in detail into it yet (travelling), admittedly, but in our original suggestion for an erratum,
> we had suggested
>
>>>>>>>> LeftJoin(Ω1, Ω2, expr) =
>>>>>>>> { merge(mu1, mu2) | mu1 in Omega1 and mu2 in Omega2, mu1 and mu2 are compatible and expr(merge(μ1, μ2)) is true }
>>>>>>>> UNION
>>>>>>>> { μ1 | μ1 in Ω1, *forall* mu2 in Omega2, either mu1 and mu2 are not compatible
>>>>>>>>                                          or mu1 and mu2 are compatible and expr(merge(mu1, mu2)) is false }
>
> which doesn't separate the 2nd and 3rd branch of the UNION.

Yes - that's writing out

LeftJoin = Filter(expr, Join(Ω1, Ω2)) ∪ Diff(Ω1, Ω2, expr)

based on the diff definition (box above in the spec)

        Andy

>
> Axel
> --
> Prof. Dr. Axel Polleres
> Institute for Information Business, WU Vienna
> url: http://www.polleres.net/  twitter: @AxelPolleres
>
>> On 22 Jun 2015, at 12:46, Andy Seaborne <[hidden email]> wrote:
>>
>> Roman,
>>
>> Thank you for the comment.  I've recorded this as query-errata-7a.
>>
>> I agree that it may be impossible to write an explanatory expansion because the second two elements of the union can't be considered separately.  (When combined, they seem to form the diff part of the definition.)
>>
>>     Andy
>>
>> On 16/06/15 16:42, Roman Kontchakov wrote:
>>> Dear Andy
>>>
>>> It seems to me that the fix to the "written in full" definition of LeftJoin as described in http://www.w3.org/2013/sparql-errata#sparql11-query does not quite work.
>>>
>>> For example, let Ω1 = { (?X -> :a) }, Ω2 = { (?X -> :a,  ?A -> 20), (?X -> :b, ?A -> 30) } and F = ?A = 30.
>>>
>>> Using the normative definition, LeftJoin(Ω1,Ω2,F) = Filter(F, Join(Ω1,Ω2)) union Diff(Ω1,Ω2,F), we obtain
>>>
>>>    - Join(Ω1,Ω2) = { (?X -> :a, ?A -> 20) } and so Filter(F, Join(Ω1,Ω2)) is empty;
>>>
>>>    - on the other hand, Diff(Ω1,Ω2,F) = { (?X -> :a) } because for the only mapping μ1 in Ω1, the first mapping in Ω2 is compatible with μ1 but the effective boolean value of F is false and the second mapping in Ω2 is not compatible with μ1.
>>>
>>> Thus,  LeftJoin(Ω, Ω2, F) = { (?X -> :a) }       (1)
>>>
>>> Using the "written in full" definition from the errata,
>>>
>>> LeftJoin(Ω1, Ω2, F) =   { merge(μ1, μ2) | μ1 in Ω1 and μ2 in Ω2, μ1 and μ2 are compatible and F(merge(μ1, μ2)) is true }
>>>      ∪  { μ1 | μ1 in Ω1, ∀ μ2 in Ω2, μ1 and μ2 are not compatible, or Ω2 is empty }
>>>      ∪  { μ1 | μ1 in Ω1, ∀ μ2 in Ω2, μ1 and μ2 are compatible and F(merge(μ1, μ2)) is false }
>>>
>>> we obtain, respectively,
>>>
>>>     - the empty set (see above),
>>>     - the empty set (because Ω2 is not empty and contains a mapping compatible with the only mapping in Ω1),
>>>     - the empty set (because the first mapping in Ω2 is compatible with the only mapping in Ω1 but F is false).
>>>
>>> The result, therefore, is the empty set, which is not equal to (1).
>>>
>>> In fact, I suspect, there is no simple way of translating the two components in the union of the normative definition into some sort of `three-component' union. So, in my view, it is easier to replace the current "written in full" version of LeftJoin with Filter over Join + Diff (or omit it altogether to avoid any confusion).
>>>
>>> Best regards
>>> Roman
>>>
>>> --
>>> Roman Kontchakov
>>> Birkbeck, University of London
>>>
>>
>>
>


Reply | Threaded
Open this post in threaded view
|

Re: semantics of OPTIONAL/LeftJoin

Andy Seaborne-4
In reply to this post by Roman Kontchakov
On 25/06/15 12:51, Roman Kontchakov wrote:

> Dear Andy
>
> I  wonder if you could also clarify the official formal semantics of OPTIONAL. If I understand the intention correctly, then LeftJoin(Ω1, Ω2, expr) should contain a representation of **each** solution mapping μ1 from Ω1 (either extended with some information from Ω2 or not). More precisely,
>
>   - if there is μ2 in Ω2 such that μ2 is compatible with μ1 and their merge satisfies expr then each such merge is included in LeftJoin(Ω1, Ω2, expr);
>
>   - otherwise, if there is no μ2 in Ω2 such that μ2 is compatible with μ1 and their merge satisfies expr, then μ1 is included in LeftJoin(Ω1, Ω2, expr).
>
> If this understanding is correct then there is a slight problem with the definition of Diff: the condition in Diff(Ω1, Ω2, expr) does not appear to be the complement of the condition in Filter(expr, Join(Ω1, Ω2)). More precisely, if we denote the merge of μ1 and μ2 by μ, then `expr(μ) is an expression that has an effective boolean value of false' is not the complement of `expr(μ) has an effective boolean value of true': according to Section 17.2.2, if an argument of a subexpression is unbound then the EBV is a type error. To illustrate, consider
>
> Ω1 = { (?X -> :a) }, Ω2 = { (?X -> :a), (?X -> :b, ?Y -> 30) } and F = ?Y = 30.
>
> Then Join(Ω1,Ω2) = { (?X -> :a) } and so Filter(F, Join(Ω1,Ω2)) is empty because the EBV is a type error (not true). Also, Diff(Ω1, Ω2, F) is empty because μ1 = (?X -> :a) in Ω1 has a solution mapping in Ω2 that is compatible with it (the merge coincides with μ1) but F(?X -> :a) has an EBV of a type error because ?Y is not bound. Therefore, the solution mapping μ1 from Ω1 is not represented in any way in LeftJoin(Ω1, Ω2, F).
>
> Such behaviour does not seem to correspond to the intention (as described above) and it also differs from LEFT JOIN in SQL. I was wondering whether this is a mistake or my understanding of the intention in incorrect. If it is a mistake then it could be corrected by replacing `expr(merge(μ, μ')) has an effective boolean value of false' in the definition of Diff by the following
>
> 'expr(merge(μ, μ')) does not have an effective boolean value of true'
>
> Probably, it is also worth explaining that `does not have an EBV of true' means that the EBV is either false or a type error. It might also make sense to slightly rephrase the whole definition of Diff:
>
> Diff(Ω1, Ω2, expr) = { μ | μ in Ω1 and expr(merge(μ, μ')) does not have an effective boolean value of true for all μ′ in Ω2 such that μ and μ′ are compatible  }
>
> Best regards
> Roman
>
>

(this is a comment on SPARQL 1.0 and SPARQL 1.1)

Hi Roman,

Yes, that is the intuition.

When in a filter, an evaluation exception is considered to be false and
this could be brought out more clearly in diff.

I've recorded your suggestion as query-errata-12 [2].

("Effective boolean value" is terminology from XQuery [1])

        Andy

[1] http://www.w3.org/TR/xquery/#id-ebv
[2] http://www.w3.org/2013/sparql-errata#errata-query-12