The Rule of Least Power

Costello, Roger L.
Hi Folks,

Below is a discussion of the rule of least power and how it applies to XML Schema design. The rule of least power is very cool. Comments welcome.  /Roger


The rule of least power says that given a choice of suitable ways to implement something, choose the least powerful way.

The following example illustrates the rule of least power.

The XML Schema enumeration facets are less powerful than regular expressions (used by the XML Schema pattern facet), which are in turn less powerful than XPath (used by the XML Schema 1.1 assert element):

        enumerations < regular expressions < XPath expressions

Given the task of declaring an element to have a value that is one of a finite list of strings (or some other simple data type), you should declare the element using the least powerful method—enumerations.

Example: Create an XML Schema that declares a Color element to have one of these strings: red, white, or blue. One way to implement Color is with a simpleType that lists the values using enumeration facets:

    <xs:element name="Color">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:enumeration value="red" />
                <xs:enumeration value="white" />
                <xs:enumeration value="blue" />
            </xs:restriction>
        </xs:simpleType>
    </xs:element>

A second way to implement Color is with a simpleType that lists the values using a regular expression in a pattern facet:

    <xs:element name="Color">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:pattern value="red|white|blue" />
            </xs:restriction>
        </xs:simpleType>
    </xs:element>

You should use the first way, not the second.

Why?

Answer: With the first way it is easier for applications to analyze the XML Schema Color element for its list of valid values, because each value is stated directly in its own enumeration facet. With the second way applications must understand the regular expression (regex) language. Although this particular regex is simple, the regex language is complex, and writing an application that determines what set of strings an arbitrary regex accepts is non-trivial.

The regular expression language has more power than enumeration facets. If your task is to constrain the value of an element to a specified list of values, then use enumeration facets, not the pattern facet.
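
To make the "easier to analyze" point concrete, here is a minimal sketch in Python of an application reading the valid values straight out of the enumeration version of the schema. The names SCHEMA and enumerated_values are illustrative assumptions, not part of any XML Schema tooling, and the sketch only handles a simpleType declared inline on a top-level element.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# The enumeration version of the Color element, wrapped in a schema root
# (the wrapper is an assumption added so the snippet parses on its own).
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Color">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:enumeration value="red" />
                <xs:enumeration value="white" />
                <xs:enumeration value="blue" />
            </xs:restriction>
        </xs:simpleType>
    </xs:element>
</xs:schema>
"""


def enumerated_values(schema_text, element_name):
    """Return the enumeration values declared inline for a top-level element."""
    root = ET.fromstring(schema_text)
    for element in root.findall(f"{{{XS}}}element"):
        if element.get("name") == element_name:
            # Every allowed value is a plain attribute; no interpretation needed.
            return [facet.get("value") for facet in element.iter(f"{{{XS}}}enumeration")]
    return []


print(enumerated_values(SCHEMA, "Color"))  # ['red', 'white', 'blue']
```

The whole analysis is a tree walk plus an attribute lookup, which is the property the enumeration approach buys.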

There is a third way to implement this example, using the XML Schema 1.1 assert element:

    <xs:element name="Color">
        <xs:complexType>
            <xs:simpleContent>
                <xs:extension base="xs:string">
                    <xs:assert test=". = ('red', 'white', 'blue')" />
                </xs:extension>
            </xs:simpleContent>
        </xs:complexType>
    </xs:element>

In the assert element the value of the test attribute is an XPath expression:

        <xs:assert test="XPath" />

XPath is a powerful language, even more powerful than regular expressions. To analyze the XML Schema Color element for its list of valid values, applications would have to understand the XPath language, which is a daunting task indeed.
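
By way of contrast, here is a rough Python sketch of what an application faces when the values are hidden in a pattern facet. It recovers the value list only when the pattern is a plain alternation of literal strings and gives up otherwise. The function name values_from_pattern and the conservative metacharacter set are assumptions made for illustration; a real analysis of arbitrary patterns (or of XPath assertions) would need far more than this.

```python
# Characters treated conservatively as regex syntax; "|" is handled separately.
METACHARACTERS = set(".^$*+?{}[]()\\")


def values_from_pattern(pattern):
    """Return the literal alternatives of a pattern, or None if it is not a plain alternation."""
    alternatives = pattern.split("|")
    for alternative in alternatives:
        if any(ch in METACHARACTERS for ch in alternative):
            # A general regex: working out the accepted strings needs real regex analysis.
            return None
    return alternatives


print(values_from_pattern("red|white|blue"))       # ['red', 'white', 'blue']
print(values_from_pattern("re+d|wh(i|y)te|blue"))  # None
```

The sketch works for the Color example only because that regex happens to be trivial; the schema author gives no guarantee that it will stay that way.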

Lesson Learned: When creating an XML Schema, determine the suitable ways of implementing each feature and choose the one with the least power.

W3C Paper: Tim Berners-Lee and Noah Mendelsohn wrote a wonderful paper on the rule of least power:

      http://www.w3.org/2001/tag/doc/leastPower.html

Here are a few fascinating snippets from the paper:

Powerful languages inhibit information reuse.

Expressing constraints, relationships and processing instructions in less powerful languages increases the flexibility with which information can be reused: the less powerful the language, the more you can do with the data stored in that language.

Less powerful languages are usually easier to secure … Because programs in simpler languages are easier to analyze, it's also easier to identify the security problems that they do have.

… characteristics that make languages powerful can complicate or prevent analysis of programs or information conveyed in those languages … Indeed, on the Web, the least powerful language that's suitable should usually be chosen. This is The Rule of Least Power.

… the suggestion to use less powerful languages must in practice be weighed against other factors. Perhaps the more powerful language is a standard and the less powerful language not, or perhaps the use of simple idioms in a powerful language makes it practical to use the powerful languages without unduly obscuring the information conveyed. Overall, the Web benefits when less powerful languages can be successfully applied.

Re: The Rule of Least Power

Michael Kay


On 21/06/2012 17:44, Costello, Roger L. wrote:
> Hi Folks,
>
> Below is a discussion of the rule of least power and how it applies to XML Schema design. The rule of least power is very cool. Comments welcome.  /Roger
>
>
> The rule of least power says that given a choice of suitable ways to implement something, choose the least powerful way.
>
While I can see the arguments, I have to say I am very uncomfortable
with this as an architectural principle. A great deal of software design
is concerned with building systems that have potential for change, and
that means choosing technologies and designs that provide enough
headroom to cope with future requirements as well as current
requirements. I think this "rule" could be used to justify some really
poor design decisions, for example using a text file for data
interchange instead of using XML.

Michael Kay
Saxonica


Re: The Rule of Least Power

Daniel Dui
I second Michael on this one.

I like lean/agile design and development, although I am no fundamentalist. I like ideas such as:

"Do the simplest thing that can possibly work, refactor and refine the design later, if and when needed."

...and...

"Solve today's problems today and tomorrow's problems tomorrow."

Sadly, I don't find that this way of thinking works well for schema design, where evolution and extensibility are often a very real concern and the schema designer needs to predict (or divine?) future requirements.

I find that predictive/proactive design and use of more powerful constructs can lead to more extensible schemas. Unfortunately these schemas will probably also be more difficult to adopt, as they are more difficult to understand and use.

-daniel


--
____________________________________________________________
   Daniel Dui - [hidden email] - skype: danieldui

Re: The Rule of Least Power

Pete Cordell-5
In reply to this post by Michael Kay
Original Message From: "Michael Kay"

> While I can see the arguments, I have to say I am very uncomfortable with
> this as an architectural principle. A great deal of software design is
> concerned with building systems that have potential for change, and that
> means choosing technologies and designs that provide enough headroom to
> cope with future requirements as well as current requirements. I think
> this "rule" could be used to justify some really poor design decisions,
> for example using a text file for data interchange instead of using XML.

Likewise, using a myriad of simple tools may actually be more complicated
than using one powerful tool.  (It would help if the powerful tool is
readily parsable though, unlike C++!)

Like lots of things in life, there are no simple answers!

Pete Cordell
Codalogic Ltd
Twitter: http://twitter.com/petecordell
Interface XML to C++ the easy way using C++ XML
data binding to convert XSD schemas to C++ classes.
Visit http://codalogic.com/lmx/ or http://www.xml2cpp.com
for more info



Re: The Rule of Least Power

Kevin Braun
In reply to this post by Michael Kay

On 6/27/2012 6:14 AM, Michael Kay wrote:

> While I can see the arguments, I have to say I am very uncomfortable
> with this as an architectural principle. A great deal of software
> design is concerned with building systems that have potential for
> change, and that means choosing technologies and designs that provide
> enough headroom to cope with future requirements as well as current
> requirements. I think this "rule" could be used to justify some really
> poor design decisions, for example using a text file for data
> interchange instead of using XML.
>
> Michael Kay
> Saxonica
>
>

In fairness to the authors of the document that Roger mentioned
(http://www.w3.org/2001/tag/doc/leastPower.html), I think it should be
pointed out that they actually were addressing, I think, a more narrow
question.  The referenced paper gives the rule of least power as "Use
the least powerful language suitable for expressing information,
constraints or programs on the World Wide Web."  The paper is concerned
with maximizing the reusability of information published on the Web.  I
don't believe they intended to establish a broad architectural principle
such as "choose the least powerful, suitable way to do anything".

Kevin




Re: The Rule of Least Power (UNCLASSIFIED)

Cheney, Edward A SFC MIL USA FORSCOM
In reply to this post by Michael Kay
Classification: UNCLASSIFIED
I would say, at best, the rule is misguided.  It is more beneficial to all audiences to always use the way of closest intention.  If a tool exists that most closely aligns with the given intention of a data facet or communication instance then this specific tool should be used regardless of whether it is more or less powerful than other ways.

I say this for two reasons:

1. Least powerful is not a well-defined term, and the ambiguity surrounding "least powerful" has failed the web on many occasions.
2. The end result of a given task is most important, but this accomplishment is coupled with cost limitations on how the end result is attained.

In a given economic system costs are fixed.  They can be transferred from one agent in the system to another, but the costs are not diminished from the system as a whole.  When I use the term "least powerful", to whom should this term apply?  Does it mean least powerful to write the instance, to parse the instance, to write the given parser, or even natural language understandability?  It is my experience that software tasks generally have a fixed cost.  Making a task less powerful for the author of a given instance generally requires a more powerful parser to take up the slack of the author's sloppiness.  In other words there is a balance to the system, and when one party is favored, costs are transferred disproportionately to the remaining parties in the system.  This does not hide costs, but instead reduces the obviousness of costs, which creates additional and often unnecessary challenges.

Web technologies are notoriously sloppy.  XML was created to supply a more limited, confined, and terse syntax than that allowed by SGML in order to eliminate much of this sloppiness.  In my opinion this is fantastic and certainly the way to go.  Instance authors have to try a little bit harder to reduce their errors, which means parsers don't have to be nearly as complex as HTML parsers.  Costs are more balanced between given parties and as a result bugs become easier to detect.  HTML permits a higher tolerance of sloppiness, which means HTML parsers are more complex and obscure bugs may never be detected.

Out of the box XML does nothing for accessibility because it is merely a syntax without a schema, while HTML hardly does any better because of its high tolerance for sloppiness.  Achieving accessibility is an additional requirement, which means an additional cost factor.  Accessibility is extremely expensive to achieve on the web, with very expensive penalties, even though the requirements are commonly known and clearly stated.  This is because the associated costs are extraordinarily out of balance.  As stated earlier, HTML permits a high tolerance for sloppiness, which is one disruptive factor.  The other disruptive factor is that in the standardized language, semantics and accessibility are permissively supported but not required by the design of the language.  This means there is little or no motivation for a document author to care.  A more powerful parser will not address the problem.  The costs are transferred to an enforcing party and then transferred back to the document author in the form of government fines, lawsuits, and boycotts.  The question of power quantity or even power distribution completely misses the point and provides no solution.

I have plenty of other examples of how sloppiness in web technologies is disruptive.  The point of "least powerful" has never addressed this disruption, and in my opinion is partially to blame for the presence of the problem in the first place.

Austin




Re: The Rule of Least Power

Noah Mendelsohn
In reply to this post by Kevin Braun


On 6/28/2012 9:16 AM, Kevin Braun wrote:
> In fairness to the authors of the document that Roger mentioned
> (http://www.w3.org/2001/tag/doc/leastPower.html), I think it should be
> pointed out that they actually were addressing, I think, a more narrow
> question.

> The referenced paper gives the rule of least power as "Use the least
> powerful language suitable for expressing information, constraints or
> programs on the World Wide Web."  The paper is concerned with maximizing
> the reusability of information published on the Web.  I don't believe
> they intended to establish a broad architectural principle such as
> "choose the least powerful, suitable way to do anything".


Thank you. That's exactly right. I confess that I find this thread a little
troubling, since the finding seems to be criticized for things it isn't saying.

Let's imagine what could have been a simple Web page with, say, a price
list. If you put it up on the Web in HTML, automated agents like search
engine crawlers have at least a decent chance of parsing the content and
extracting at least some of the intended information. If you write the same
page using (to take an extreme example) an empty <body>, and use
Javascript at runtime to build up a DOM that renders the same price list,
then few crawlers will do as well figuring out what's going on. Indeed, we
know that a crawler can't even determine in all cases whether the
Javascript will complete its work (the halting problem).

So the finding is saying: consider that tradeoff when you publish
information on the Web. It's not saying "don't use Javascript"; it's saying
that when you do there's often a cost. I think that's an important and
valid message.
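
The tradeoff can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two pages, the item names and prices, and the CellCollector class are all made up here, and the "crawler" is just the standard library HTML parser, which reads markup but never executes scripts.

```python
from html.parser import HTMLParser

# Hypothetical price list published as plain HTML.
STATIC_PAGE = """
<html><body>
<table>
<tr><td>Widget</td><td>4.50</td></tr>
<tr><td>Gadget</td><td>12.00</td></tr>
</table>
</body></html>
"""

# Hypothetical page that would render the same table, but only after a script runs.
SCRIPTED_PAGE = """
<html><body>
<script>
var table = document.createElement("table");
/* rows for Widget 4.50 and Gadget 12.00 are appended at runtime */
document.body.appendChild(table);
</script>
</body></html>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, row by row, without running any script."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self.rows:
            self.rows[-1].append(data.strip())


def extract_rows(page):
    collector = CellCollector()
    collector.feed(page)
    return collector.rows


print(extract_rows(STATIC_PAGE))    # [['Widget', '4.50'], ['Gadget', '12.00']]
print(extract_rows(SCRIPTED_PAGE))  # []
```

An agent that wants the prices from the second page has to execute the script, which is a much bigger commitment than parsing markup.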

Roger Costello paraphrased the rule as:

> The rule of least power says that given a choice of suitable ways to implement something, choose the least powerful way.

I'm glad Roger found the rule interesting and useful, but I think the
paraphrase is way too strong, and much of this thread seems (to me) to be
critiquing that stronger formulation. First of all, as already noted the
rule in the Finding is particular to the Web:

"Good Practice: Use the least powerful language suitable for expressing
information, constraints or programs on the World Wide Web."

Furthermore, immediately under that guidance, the Finding goes on to say:

"In aiming for simplicity, one must of course go far enough but not too
far. The language you choose must be powerful enough to successfully solve
your problem, and indeed, complexity and lack of clarity can easily result
from clumsy efforts to patch around use of a language that is too limited.
Furthermore, the suggestion to use less powerful languages must in practice
be weighed against other factors. Perhaps the more powerful language is a
standard and the less powerful language not, or perhaps the use of simple
idioms in a powerful language makes it practical to use the powerful
languages without unduly obscuring the information conveyed (see 3 Scalable
language families). Overall, the Web benefits when less powerful languages
can be successfully applied. "

Maybe I'm too close to this one,  but that seems pretty balanced and
appropriate to me.

Noah


Re: The Rule of Least Power

Pete Cordell-5
Original Message From: "Noah Mendelsohn"

> Furthermore, immediately under that guidance, the Finding goes on to say:
>
> "In aiming for simplicity, one must of course go far enough but not too
> far. The language you choose must be powerful enough to successfully solve
> your problem, and indeed, complexity and lack of clarity can easily result
> from clumsy efforts to patch around use of a language that is too limited.
> Furthermore, the suggestion to use less powerful languages must in
> practice be weighed against other factors. Perhaps the more powerful
> language is a standard and the less powerful language not, or perhaps the
> use of simple idioms in a powerful language makes it practical to use the
> powerful languages without unduly obscuring the information conveyed (see
> 3 Scalable language families). Overall, the Web benefits when less
> powerful languages can be successfully applied. "
>
> Maybe I'm too close to this one,  but that seems pretty balanced and
> appropriate to me.

The clarifying text does seem to change it from a rule to a guideline!

I'm thinking that static data can be represented using XML, or (arguably)
less powerful JSON, or even less powerful .ini file format or even CSV.  But
to me it seems a world using all four of those representations is actually
more complicated than just using XML.

Maybe a "Rule of Not Over Complicating" would be better!

Pete Cordell
Codalogic Ltd



Re: The Rule of Least Power

Noah Mendelsohn


On 7/2/2012 4:29 AM, Pete Cordell wrote:
> I'm thinking that static data can be represented using XML, or (arguably)
> less powerful JSON, or even less powerful .ini file format or even CSV.
> But to me it seems a world using all four of those representations is
> actually more complicated than just using XML.

I think that any of those are in the spirit of the "Finding", but if you
read the introductory text, the focus is mainly on computational power. It
establishes a ladder that has declarative languages at one end (least
powerful and from which it's easiest to extract information), functional
languages somewhere in the middle (e.g. XSLT, XQuery), and imperative
Turing-complete languages as the most "powerful", but tending to hide
information in ways that make it difficult to extract.

My point is that the finding doesn't try to suggest that, within the
declarative languages, you should choose the least expressive or capable.
Yes, it's easier to write a JSON or CSV parser than to write one for XML,
and that may indeed be a good reason for choosing the simpler languages,
but the finding is pretty nearly neutral on that. All of those languages
are declarative, so from the point of view of computability theory, it's
equally possible to extract the data you might encode in any of them. When
you put information on the Web in XML, you're not making it impossible to
extract, as you might be with Java/Javascript/Flash; you are making it just
a bit harder if you don't have the right parser lying around.
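
As a rough illustration of that point, the Python sketch below encodes the same two records in JSON, CSV, and XML and recovers identical data from each with standard-library parsers. The records, field names, and helper functions are assumptions made up for the example; the only claim is that each declarative format gives the data back once a parser is at hand.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The same hypothetical price list in three declarative formats.
AS_JSON = '[{"item": "Widget", "price": "4.50"}, {"item": "Gadget", "price": "12.00"}]'
AS_CSV = "item,price\nWidget,4.50\nGadget,12.00\n"
AS_XML = (
    "<prices>"
    "<entry><item>Widget</item><price>4.50</price></entry>"
    "<entry><item>Gadget</item><price>12.00</price></entry>"
    "</prices>"
)


def from_json(text):
    return json.loads(text)


def from_csv(text):
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def from_xml(text):
    # Each <entry> becomes a dict keyed by its child element names.
    return [{child.tag: child.text for child in entry} for entry in ET.fromstring(text)]


records = from_json(AS_JSON)
print(records == from_csv(AS_CSV) == from_xml(AS_XML))  # True
```

The parsers differ in size and effort, but none of them has to run a program to find out what the data is.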

Anyway, I still have the impression that several participants in this
discussion haven't read the finding as a whole. It's maybe 2 pages long,
and you can read it in about 5 mins. I suspect that doing so will alleviate
at least many of the concerns about it.

Thank you.

Noah



Re: The Rule of Least Power

Pete Cordell-5
In reply to this post by Costello, Roger L.
In reply to:
> Anyway, I still have the impression that several participants in this
> discussion haven't read the finding as a whole. It's maybe 2 pages long,
> and you can read it in about 5 mins. I suspect that doing so will
> alleviate at least many of the concerns about it.

What!!!  Read the original documents to get the facts and miss the chance for an argument?  Where's the fun in that!!!

Pete Cordell
Codalogic Ltd