Supporting incremental-definition of a type?

Matt Warden-2
First, I want to say that while I have only subscribed today, I have
already gotten a lot out of this list over the years, having ended up
at your list archives 30 or so times to get answers to my questions.
This is an extremely valuable resource, so thank you for the
contributions you have all made over the years on this list. I could
not find the answer to this question, though, which is likely because
I don't really know what to call it.

Our education data standard is made up of 1 "core" XSD, which contains
only types and no root element, and "interchange" XSDs which include
the core and use the types to define specific messages.

As you can imagine, many of these types (like Student) are rather
large. So far, the standard has been used to load data in batch mode,
but we have increasingly seen interest in using it for REST style
interfaces. As I understand the use case, if you imagine a 7-page
wizard collecting student data, the app developer wants to be able to
use our types in our XSD as the basis to define the messages for each
page. But the problem is that after page one, only 1/7 of the student
information is known, yet we have a single type called Student with
lots of mandatory elements... elements that are collected on
subsequent pages.

The technical request we have received is to make nearly ALL elements
optional in the types (e.g. Student) in the "core" XSD, and then
restrict those types when they are used in specific interchanges to
specify which elements are required.

Have any of you seen this done before? Do you have alternative
suggestions on how to accommodate this "incremental definition"
scenario, where the application cannot possibly fill all mandatory
elements for our type (e.g., Student) yet?

Thanks for any insight,

--
Matt Warden
http://mattwarden.com


This email proudly and graciously contributes to entropy.



Re: Supporting incremental-definition of a type?

Michael Kay
I have seen exactly this problem with a couple of my consulting clients,
and there's no completely straightforward solution.

The approach of having an abstract schema in which everything is
optional (minOccurs="0") and then restricting it for specific messages
to make selected things mandatory certainly seems the right way to think
about it. The interesting part is then how to implement the restriction
process.
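
For illustration only, here is a minimal sketch of what derivation by restriction could look like in this scenario. The element names (Name, BirthDate, Address) and the type name StudentPage1 are hypothetical, and everything is placed in one schema document for brevity; in the core/interchange layout described above, the restricted type would presumably live in an interchange XSD that includes the core.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- "Core" type: every element is optional (names are illustrative). -->
  <xs:complexType name="Student">
    <xs:sequence>
      <xs:element name="Name" type="xs:string" minOccurs="0"/>
      <xs:element name="BirthDate" type="xs:date" minOccurs="0"/>
      <xs:element name="Address" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Message-specific type: Name becomes mandatory (default minOccurs is 1).
       The whole content model is restated, including the unchanged parts. -->
  <xs:complexType name="StudentPage1">
    <xs:complexContent>
      <xs:restriction base="Student">
        <xs:sequence>
          <xs:element name="Name" type="xs:string"/>
          <xs:element name="BirthDate" type="xs:date" minOccurs="0"/>
          <xs:element name="Address" type="xs:string" minOccurs="0"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="StudentPage1" type="StudentPage1"/>
</xs:schema>
```

Even in this three-element sketch the restricted type repeats the two particles it does not change, which hints at the maintenance cost for a type as large as Student.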

One alternative is to add constraints in the form of Schematron or XSD
1.1 assertions (but this isn't useful if you want to use the schema for
a message as input to a data binding tool).
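
As a rough sketch of the Schematron variant, assuming an all-optional Student element in no namespace with a hypothetical Name child, a page-one rule might look like this:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern>
    <!-- Applies to every Student element in the message. -->
    <sch:rule context="Student">
      <!-- The assertion fails when no Name child is present. -->
      <sch:assert test="Name">A page-one Student message must contain a Name.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

The XSD itself is untouched here, so a data binding tool reading the XSD would still see Name as optional.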

I've found that when producing a "restricted" schema for specific
message types, transforming the schema using XSLT to produce a new
schema can be a lot simpler than defining new types derived by
restriction from the existing types, even with the help of xs:redefine.
Of course, transforming a schema is a lot easier if the schema is
designed with this in mind, e.g. with generous use of id attributes.
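
A minimal sketch of such a schema-to-schema transformation, assuming the core schema marks particles with id attributes. The id values 'student-name' and 'student-birthdate' are hypothetical:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Identity template: copy the schema unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- For the particles this message requires, drop the minOccurs
       attribute so that the default value of 1 applies. -->
  <xsl:template
      match="xs:element[@id = 'student-name' or @id = 'student-birthdate']/@minOccurs"/>

</xsl:stylesheet>
```

Each message type would get its own small stylesheet (or its own list of ids), and the core schema remains the single source for everything else.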

One potential way of doing this is to make the schema document take the
form of a stylesheet:

<xsl:param name="history-is-optional" as="xs:boolean"/>

<xsl:template name="make-schema">
  <xs:schema>
    ....
    <xs:complexType ...>
      <xs:element ref="history"
                  minOccurs="{if ($history-is-optional) then 0 else 1}">
        ...
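
To make the idea concrete, here is one way the fragment above might be filled out into a complete XSLT 2.0 stylesheet. The student and name declarations, the default parameter value, and the xsl:output setting are assumptions added for illustration; only the parameter, the named template, and the minOccurs attribute value template come from the fragment.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xsl:output indent="yes"/>

  <!-- Default value is an assumption; callers can override it. -->
  <xsl:param name="history-is-optional" as="xs:boolean" select="true()"/>

  <!-- Run with make-schema as the initial named template;
       the output is an ordinary schema document. -->
  <xsl:template name="make-schema">
    <xs:schema>
      <xs:element name="history" type="xs:string"/>
      <xs:element name="student">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="name" type="xs:string"/>
            <xs:element ref="history"
                minOccurs="{if ($history-is-optional) then 0 else 1}"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>
  </xsl:template>

</xsl:stylesheet>
```

Because the result is a plain XSD, it can be handed to validators and data binding tools that know nothing about the parameterization.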


I've been experimenting in Saxon with the concept of "parameterized
schemas", in which the assertions can refer to variables/parameters
whose values are supplied at the time a validation episode is initiated.
It wouldn't be an inconceivable extension of this to parameterize the
minOccurs and maxOccurs values (though at the moment, I don't know how
to create a parameterized finite state machine to implement this idea!)

Michael Kay
Saxonica



Re: Supporting incremental-definition of a type?

Andrew Welch
In reply to this post by Matt Warden-2
On 11 June 2012 19:04, Matt Warden <[hidden email]> wrote:

> Have any of you seen this done before? Do you have alternative
> suggestions on how to accommodate this "incremental definition"
> scenario, where the application cannot possibly fill all mandatory
> elements for our type (e.g., Student) yet?

Instead of validating on Student after page 1, could you validate
whatever subtypes you have at that point, maybe 'name', 'address' or
'id' etc. then only validate <Student> after page 7 ?
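
A sketch of what such a page-one message could look like, assuming the core schema (called core.xsd here) defines reusable complex types named Name and Address that Student is built from. All names, including StudentWizardPage1 and StudentId, are hypothetical:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Assumed file name; the core is taken to define Name and Address. -->
  <xs:include schemaLocation="core.xsd"/>

  <!-- The page-one message is its own element, not a partial Student. -->
  <xs:element name="StudentWizardPage1">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="StudentId" type="xs:string"/>
        <xs:element name="Name" type="Name"/>
        <xs:element name="Address" type="Address" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

The Student type keeps its mandatory elements and is only validated once all pages have been collected.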



--
Andrew Welch
http://andrewjwelch.com


Re: Supporting incremental-definition of a type?

Matt Warden-2
On Mon, Jun 11, 2012 at 5:50 PM, Michael Kay <[hidden email]> wrote:
> I have seen exactly this problem with a couple of my consulting clients, and
> there's no completely straightforward solution.
>
> The approach of having an abstract schema in which everything is optional
> (minOccurs="0") and then restricting it for specific messages to make
> selected things mandatory certainly seems the right way to think about it.
> The interesting part is then how to implement the restriction process.

And:

On Tue, Jun 12, 2012 at 3:16 AM, Andrew Welch <[hidden email]> wrote:
> Instead of validating on Student after page 1, could you validate
> whatever subtypes you have at that point, maybe 'name', 'address' or
> 'id' etc. then only validate <Student> after page 7 ?


Thank you Michael and Andrew for your thoughts. I am leaning toward
the view that the "incremental messages" describing a student are not
themselves a Student, but a small subset of elements that may or may
not ultimately be capable of describing a student if and when the
other 6 wizard screens are filled out. It would therefore be
appropriate to use the subtypes, as Andrew points out, to define these
messages.

We will try this approach and see how it works out. I think it brings
up some change management questions as the definition of Student
evolves over time, which would impact the subtype-based messages, even
though it may not be very easy to see that.

If it doesn't work out, we will pursue the technique Michael outlined
around generating the all-optional schema from the main schema. This
is a brilliant idea and handles all the primary maintenance burden
concerns that were floating in my head, and relegates these to
implementation details that we don't even have to think about during
design time.

Thanks for your help,

--
Matt Warden
http://mattwarden.com


This email proudly and graciously contributes to entropy.


Re: Supporting incremental-definition of a type?

Michael Kay
> We will try this approach and see how it works out. I think it brings
> up some change management questions as the definition of Student evolves
> over time, which would impact the subtype-based messages...

Yes, this is a serious problem with the xs:restriction mechanism for
defining subtypes: you have to declare all the parts of the content
model that you want to leave unchanged, rather than just the parts you
are restricting away. Restriction by assertion in XSD 1.1 is one way
around this. It also allows the restrictions to penetrate deep into the
hierarchy rather than requiring new restricted types at every level. For
example, to specify that all money amounts in a financial transaction
must be in US dollars, you only need one assertion at the top level,
rather than restricted types all the way down.
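
As a small illustration of how far a single assertion can reach, here is a sketch with a hypothetical Transaction element whose money amounts carry a currency attribute. For brevity the assertion sits directly on the top-level type instead of on a type derived from it, and all names are made up:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- A decimal amount with a required currency code. -->
  <xs:complexType name="Money">
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <xs:element name="Transaction">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Total" type="Money"/>
        <xs:element name="Item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Price" type="Money"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <!-- One XSD 1.1 assertion covers currency attributes at every depth. -->
      <xs:assert test="every $c in .//@currency satisfies $c = 'USD'"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Neither Money nor the nested Item type needs a restricted variant; the constraint is stated once at the top.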

(There's an unfortunate usability glitch in XSD 1.1, which we noticed
too late to fix: when you restrict by assertion you still need to
repeat the entire content model. But you can get around this if you use
named model groups. In fact careful use of named model groups is
generally a good idea whenever you are planning to use restriction of
complex content.)

Michael Kay
Saxonica


Re: Supporting incremental-definition of a type?

Mukul Gandhi
On Mon, Jun 18, 2012 at 4:21 PM, Michael Kay <[hidden email]> wrote:
> Yes, this is a serious problem with the xs:restriction mechanism for defining subtypes, that you have to declare all the
> parts of the content model that you want to leave unchanged, rather than the parts you are restricting away.

XSD has its own notion of how complexType restrictions work, and you
may find it quite strange if you're expecting it to work like OO class
inheritance, for example. If an XSD complexType X restricts complexType
Y, then informally the number of XSD particles in type X should be the
same as in Y, and a particle at some offset in X must be a valid
restriction of the particle at the same offset in Y. But I do find this
notion of complexType restriction sensible.

> There's an unfortunate usability glitch in XSD 1.1 which we noticed too late to fix it, that when you restrict by assertion you still
> need to repeat the entire content model.

I think XSD assertions are orthogonal (co-)constraints to the
particle restriction constraints. So when restricting XSD complex
types, in addition to obeying the particle restriction rules, an
assertion enforces constraints via XPath expressions on the tree
rooted at an element (whose complexType is the context type). I
personally don't see the way assertions work with complexType
restrictions as a glitch -- it's just that we can use assertions during
complexType restrictions if we want to apply an orthogonal
co-occurrence constraint in addition to the particle restriction
constraints.




--
Regards,
Mukul Gandhi


Re: Supporting incremental-definition of a type?

Michael Kay


On 18/06/2012 21:22, Mukul Gandhi wrote:
> On Mon, Jun 18, 2012 at 4:21 PM, Michael Kay<[hidden email]>  wrote:
>> Yes, this is a serious problem with the xs:restriction mechanism for defining subtypes, that you have to declare all the
>> parts of the content model that you want to leave unchanged, rather than the parts you are restricting away.
> XSD has its own unique notion of how complexType restrictions work,
> and I would say you may find it quite strange if you're expecting it
> to work like an OO class inheritance for example.

I'm working at the moment on the schema for XSLT. I'm trying to add
assertions so that the schema catches more static errors than it does at
present, for example the rule that an xsl:param child of xsl:function
must have no select attribute. This turns out to be surprisingly
troublesome. It isn't possible to simply add the assertion to the type,
because the type (like most types in this schema) is derived by
extension from a type with empty content having a number of attributes.
So we need to change this:

<xs:element name="function">
  <xs:complexType>
    <xs:complexContent>
      <xs:extension base="X">
        <xs:sequence>
          <xs:element name="param"/>
          <xs:group ref="sequence-constructor"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:element>
to this:

<xs:element name="function">
  <xs:complexType>
    <xs:complexContent>
      <xs:restriction base="function-base-type">
        <xs:group ref="function-model"/>
        <xs:assert test="empty(xsl:param/@select)"/>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:element>

<xs:complexType name="function-base-type">
  <xs:complexContent>
    <xs:extension base="X">
      <xs:group ref="function-model"/>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:group name="function-model">
  <xs:sequence>
    <xs:element name="param"/>
    <xs:group ref="sequence-constructor"/>
  </xs:sequence>
</xs:group>

which is pretty disruptive just to add an assertion. It would be even
more disruptive if the content model of the type had been built up by
several extension steps, because the need to restate the content model
as part of the restriction step would essentially destroy all the
benefits of building it up incrementally.

Of course there would be other ways of doing it, like having local
element declarations for xsl:param so the type depends on its parentage.
But that's not really the point.

It's actually quite tempting to put all the assertions at the level of
the xsl:stylesheet element, but I'm resisting that because it's likely
to be expensive and give poor diagnostics.

I think it would have been much better if we had allowed assertions to
be defined within an <xs:extension> - the semantics being to define an
anonymous type by extension and then restrict this by assertion.

Michael Kay
Saxonica


Re: Supporting incremental-definition of a type?

Mukul Gandhi
Hi Mike,
   I wish assertions allowed you to define the schema for XSLT the way
you want. But I believe XSD assertions have many uses other than
defining the schema for XSLT :)

On Mon, Jun 18, 2012 at 9:53 PM, Michael Kay <[hidden email]> wrote:
> I think it would have been much better if we had allowed assertions to be
> defined within an <xs:extension>

I think this is already allowed by the XSD 1.1 language. The following
are the relevant grammar fragments for the XSD complexType (copied from
the spec):

<complexContent
     id = ID
     mixed = boolean
     {any attributes with non-schema namespace . . .}>
     Content: (annotation?, (restriction | extension))
</complexContent>

<extension
    base = QName
    id = ID
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, openContent?, ((group | all | choice |
sequence)?, ((attribute | attributeGroup)*, anyAttribute?), assert*))
</extension>

It seems an xs:assert in a complexType xs:extension has the same
semantic meaning as an xs:assert within xs:restriction (i.e. an
orthogonal co-occurrence constraint in addition to the content model
derivation constraint).



--
Regards,
Mukul Gandhi


Re: Supporting incremental-definition of a type?

Michael Kay

>> I think it would have been much better if we had allowed assertions to be
>> defined within an <xs:extension>
> I think, this is allowed presently by the XSD 1.1 language.
Of course. Not sure how I overlooked that. So if I have a type A and I
want to define a type B that restricts A with an assertion, the simplest
way is to define B as a "vacuous extension" of A. Non-obvious, but very
useful.
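
A sketch of that "vacuous extension", applied to the kind of all-optional type discussed earlier in the thread. The element names are hypothetical, and the extension adds no particles or attributes, only an assertion:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Type A: everything optional. -->
  <xs:complexType name="A">
    <xs:sequence>
      <xs:element name="Name" type="xs:string" minOccurs="0"/>
      <xs:element name="BirthDate" type="xs:date" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Type B: extends A with nothing but an XSD 1.1 assertion,
       so the content model of A is not restated. -->
  <xs:complexType name="B">
    <xs:complexContent>
      <xs:extension base="A">
        <xs:assert test="exists(Name)"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

Formally B is derived by extension rather than by restriction, and its content model still shows Name as optional, so this helps validation more than it helps tools that generate code from the content model.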

Michael Kay
Saxonica


Re: Supporting incremental-definition of a type?

Mukul Gandhi
On Mon, Jun 18, 2012 at 10:47 PM, Michael Kay <[hidden email]> wrote:
> So if I have a type A and I want to define a type B that restricts A with an assertion, the simplest way is
> to define B as a "vacuous extension" of A. Non-obvious, but very useful.

Here's a simple schema example I came up with:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:element name="X">
      <xs:complexType>
         <xs:complexContent>
            <xs:extension base="T1">
               <xs:sequence>
                  <xs:element name="y" type="xs:integer"/>
               </xs:sequence>
               <xs:assert test="x gt y"/>
            </xs:extension>
         </xs:complexContent>
      </xs:complexType>
   </xs:element>

   <xs:complexType name="T1">
      <xs:sequence>
         <xs:element name="x" type="xs:integer"/>
      </xs:sequence>
   </xs:complexType>

</xs:schema>

An assertion, being a co-occurrence constraint, in this case enforces
a relational constraint between two sibling elements (and to me, this
doesn't produce a restriction effect between complex types, but rather
is an orthogonal constraint on top of extension).

Assertions (for example, during XSD type extensions) must be used
carefully though. If the intention is only to produce an XSD particle
extension effect (as in XSD 1.0), then using an assertion improperly
may inject restriction behavior.



--
Regards,
Mukul Gandhi