Versioning and HTML

Versioning and HTML

masinter

To kick off a broader discussion about “versioning” and web content (and HTML in particular) (W3C TAG ACTION-259):

 

In the 16 April 2009 HTML weekly conference call, we discussed the issues around versioning and what kinds of things the TAG could do that might be helpful to the HTML working group.

 

Minutes of that discussion are:

http://www.w3.org/2009/04/16-html-wg-minutes.html#item06

 

History:

 

It was suggested we look at the history of CSS and HTML evolution (I would find it interesting if the TAG looked at this from a historical perspective rather than a framework perspective. E.g. look at CSS and HTML in more detail)

For example, the issues around DOCTYPE switching:  http://hsivonen.iki.fi/doctype/

Version indicators used in HTML have included DOCTYPE, namespaces, adding new elements, attributes, new APIs, JavaScript indicators of versions, MIME types, URI schemes….

 

Problem (from Chris Wilson):

“the general problem with how we define HTML today; if HTML5 becomes a Rec and we realize we did something poorly we will cause rampant compatibility problems if we change implementations. There are a whole bunch of versioning mechanism that will address that but also cause their own problems.”

 

I think providing guidance and an analysis of requirements, possible solutions, and evaluation of solutions against requirements would be helpful.

 

 

Framework:

 

Inspired by Jonathan’s posts on a framework for thinking about versioning

  http://lists.w3.org/Archives/Public/www-tag/2009Apr/0042.html

 

I wrote:

 

The general idea of 'versioning' is that you include some indicator of version in the current language that will allow current processors to deal appropriately with future languages and recognize that they don't understand or can process appropriately this future content. The main thing is to categorize or predict the kinds of future content that current implementations should avoid or react to in some appropriate way. What are those categories?

 

Of course, in addition, you want future processors to be able to distinguish between the future languages and the current (and legacy) languages.

 

I am using “version” very broadly here to indicate any evolution, extension, or variation: as languages evolve, how do you indicate which “version” (variant, dialect, extension) of the language is being sent, in a way that lets the receiver know the sender’s intention?

 

And I liked Jonathan’s approach to the “payoff” question, but want to try to see if we could express this in terms of requirements. I.e., any version indicator solution needs to satisfy some technical and non-technical requirements: open extensibility, current and future interoperability, resilience to untrained authors using version indicators incorrectly, ability to copy/paste segments of content of one “version” into another, deployment of version indicators that are outside of the control of typical authors (MIME types), ….   I’d like to elaborate the “requirements” in a document form.

 

I think these questions interact with “Distributed Extensibility” and “Robustness Principle”.

 

Larry

--

http://larry.masinter.net

 


Re: Versioning and HTML

Ian Hickson
On Sat, 18 Apr 2009, Larry Masinter wrote:
>
> Problem (from Chris Wilson): "the general problem with how we define
> HTML today; if HTML5 becomes a Rec and we realize we did something
> poorly we will cause rampant compatibility problems if we change
> implementations. There are a whole bunch of versioning mechanism that
> will address that but also cause their own problems."

Isn't this the problem the Candidate Recommendation stage is supposed to
address? Having serious CR phases, where we aim for two complete
implementations of the entire specification (including all optional parts,
and with no bugs, and with a comprehensive test suite written with the
intent of finding every last edge case bug) seems like it would avoid the
problem of doing things poorly, or at least reduce the likelihood to the
point where it would be rare enough to not be enough to justify adding
syntax-level support for routing around such problems later.

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: Versioning and HTML

Dan Brickley-2
On 19/4/09 10:30, Ian Hickson wrote:

> On Sat, 18 Apr 2009, Larry Masinter wrote:
>> Problem (from Chris Wilson): "the general problem with how we define
>> HTML today; if HTML5 becomes a Rec and we realize we did something
>> poorly we will cause rampant compatibility problems if we change
>> implementations. There are a whole bunch of versioning mechanism that
>> will address that but also cause their own problems."
>
> Isn't this the problem the Candidate Recommendation stage is supposed to
> address? Having serious CR phases, where we aim for two complete
> implementations of the entire specification (including all optional parts,
> and with no bugs, and with a comprehensive test suite written with the
> intent of finding every last edge case bug) seems like it would avoid the
> problem of doing things poorly, or at least reduce the likelihood to the
> point where it would be rare enough to not be enough to justify adding
> syntax-level support for routing around such problems later.

This would be nice. Can you suggest any inspirational precedents for a
comparably-complex technology?

Dan

--
http://danbri.org/



Re: Versioning and HTML

Ian Hickson
On Sun, 19 Apr 2009, Dan Brickley wrote:

> >
> > Isn't this the problem the Candidate Recommendation stage is supposed
> > to address? Having serious CR phases, where we aim for two complete
> > implementations of the entire specification (including all optional
> > parts, and with no bugs, and with a comprehensive test suite written
> > with the intent of finding every last edge case bug) seems like it
> > would avoid the problem of doing things poorly, or at least reduce the
> > likelihood to the point where it would be rare enough to not be enough
> > to justify adding syntax-level support for routing around such
> > problems later.
>
> This would be nice. Can you suggest any inspirational precedents for a
> comparably-complex technology?

It's basically what we've been doing with CSS, to good success.

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: Versioning and HTML

Henri Sivonen
In reply to this post by masinter
> On Sun, 19 Apr 2009, Dan Brickley wrote:
> > >
> > > Isn't this the problem the Candidate Recommendation stage is supposed
> > > to address? Having serious CR phases, where we aim for two complete
> > > implementations of the entire specification (including all optional
> > > parts, and with no bugs, and with a comprehensive test suite written
> > > with the intent of finding every last edge case bug) seems like it
> > > would avoid the problem of doing things poorly, or at least reduce the
> > > likelihood to the point where it would be rare enough to not be enough
> > > to justify adding syntax-level support for routing around such
> > > problems later.
> >
> > This would be nice. Can you suggest any inspirational precedents for a
> > comparably-complex technology?
>
> It's basically what we've been doing with CSS, to good success.

We are coming to a point where doctype sniffing and the modes have  
almost nothing to do with HTML itself[1] and almost everything to do  
with CSS. (WebKit and Opera show that the modes don't really need to  
have much (anything?) to do with the DOM APIs, either.)

Furthermore, CSS has not needed new modes (or 'versions') in browsers  
that don't have "MSIE" in their User-Agent string[2] after the CSS WG  
started being very serious about the two interoperable implementations  
requirement (as opposed to pushing CSS2 to REC years ahead of  
implementation experience in the context of existing Web content).

[1] http://hsivonen.iki.fi/last-html-quirk/
[2] Consider http://hsivonen.iki.fi/chrome-ua/ : a new product was  
launched triggering just about every UA sniffer *except* the scripts  
that look for "MSIE"
--
Henri Sivonen
[hidden email]
http://hsivonen.iki.fi/




RE: Versioning and HTML

Chris Wilson-12
In reply to this post by Ian Hickson
On Sunday, April 19, 2009 Ian Hickson wrote:
>> Problem (from Chris Wilson): "the general problem with how we define
>> HTML today; if HTML5 becomes a Rec and we realize we did something
>> poorly we will cause rampant compatibility problems if we change
>> implementations. There are a whole bunch of versioning mechanism that
>> will address that but also cause their own problems."
>
>Isn't this the problem the Candidate Recommendation stage is supposed to
>address?

Not if vendors have already shipped implementations, in commercial products, prior to CR (let alone *IN* CR), and are unwilling to change (and break compatibility for customers who have shipped products based on those implementations of unratified "standards").  Reference:  the current SQL store issue and Maciej's commentary in public-webapps two weeks ago (sorry, listserv appears to be down right now).

This isn't pointing fingers at Maciej or Apple; it's pointing out that this problem is not just Microsoft's.  It's systemic; browsers need to ship on their own time cycle, and need to provide features for their customers, and when the standards don't keep up, they may ship experimental things.  The only way out of this would be for EVERY browser to very carefully only ship "proprietary-marked" (a la CSS' vendor extensions) versions of APIs/elements until the standard moves OUT of CR, and then add support for the standard naming and deprecate their proprietary-marked versions over time.
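From a page author's point of view, the process described above might look roughly like the following sketch. It is illustrative only: the `resolve_property` helper, the prefix list, and the sets of supported names are assumptions made up for this example, not any browser's actual API.

```python
# Hypothetical vendor prefixes, in the style of CSS vendor extensions.
VENDOR_PREFIXES = ("-ms-", "-moz-", "-webkit-")


def resolve_property(standard_name, supported):
    """Return the name a page should use for a feature, or None.

    Prefer the standard name (shipped once the spec has left CR),
    otherwise fall back to the first proprietary-marked variant found.
    """
    if standard_name in supported:
        return standard_name
    for prefix in VENDOR_PREFIXES:
        candidate = prefix + standard_name
        if candidate in supported:
            return candidate
    return None


# A made-up browser that ships only an experimental, prefixed version.
experimental_browser = {"-moz-border-radius", "color"}
# The same made-up browser after the standard has left CR.
later_browser = {"-moz-border-radius", "border-radius", "color"}

print(resolve_property("border-radius", experimental_browser))
print(resolve_property("border-radius", later_browser))
```

Once the standard name is supported it wins, so the prefixed variant can be deprecated over time without breaking pages written this way.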

-Chris


RE: Versioning and HTML

Ian Hickson
On Fri, 24 Apr 2009, Chris Wilson wrote:
>
> The only way out of this would be for EVERY browser to very carefully
> only ship "proprietary-marked" (a la CSS' vendor extensions) versions of
> APIs/elements until the standard moves OUT of CR, and then add support
> for the standard naming and deprecate their proprietary-marked versions
> over time.

Or at least, to do so with features that don't have obvious ways to be
updated without needing version syntax. Why is that a problem?

In the case of the spec changing while there is already an implementation,
it's not like the spec is going to have BOTH versions defined, and it's
not like other browsers are going to want to implement BOTH versions.
There'll just be one browser with the two versions, and we can quite
easily do that without version syntax (for example, Mozilla implemented
globalStorage, then their feedback resulted in major changes, so we just
renamed the object so that they could have both without compat problems).
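A rough sketch of the renaming approach, under assumed names: `old_store` and `new_store` are hypothetical stand-ins for an original object and its redesigned replacement, and the `Window` class is invented for this example. The point is only that both designs can coexist in one implementation, and content selects a behaviour by the name it uses rather than by a version switch.

```python
class OldStore:
    """Hypothetical first design: values are kept per domain."""

    def __init__(self):
        self._data = {}

    def set(self, domain, key, value):
        self._data.setdefault(domain, {})[key] = value

    def get(self, domain, key):
        return self._data.get(domain, {}).get(key)


class NewStore:
    """Hypothetical redesigned API: a flat key/value store."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class Window:
    """One implementation exposing both designs under different names."""

    def __init__(self):
        self.old_store = OldStore()  # kept so existing content keeps working
        self.new_store = NewStore()  # the renamed, redesigned object


window = Window()
window.old_store.set("example.org", "theme", "dark")  # legacy content
window.new_store.set("theme", "light")                # new content
```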

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


RE: Versioning and HTML

masinter
In reply to this post by Chris Wilson-12
I'm trying to lay out some of the general reasons for "why languages
evolve and why we need versions" with the specifics for HTML.


On Sunday, April 19, 2009 Ian Hickson wrote:
>> Problem (from Chris Wilson): "the general problem with how we define
>> HTML today; if HTML5 becomes a Rec and we realize we did something
>> poorly we will cause rampant compatibility problems if we change
>> implementations. There are a whole bunch of versioning mechanism that
>> will address that but also cause their own problems."
>
>Isn't this the problem the Candidate Recommendation stage is supposed to
>address?

In the history of computer science, I can think of no language that has
not evolved, been extended, or otherwise "versioned" as long as the
language has been in use.  This applies to network protocols, character
encoding standards, programming languages, and certainly to every known
technology found on the web.

And I can think of no cases where a language hasn't gone through at least
some minor incompatible change.

The standards process is established as a way of evolving specifications
and implementations in a way to reduce the likelihood of complete failure
to interoperate, but certainly not to guarantee that no incompatible
changes will be needed in the future.

Reasons why Languages (and HTML in particular) will need changes in the
future:

1. Requirements change: (a) new contexts, (b) competitive pressure
2. Two implementations aren't representative
3. Ambiguities appear


1: Requirements change: This is the main reason for evolution of
languages -- people want the language to support some new feature
that hadn't been thought of at the time of the original language
design. Often requirements can be accommodated without actually
changing the behavior of anything else, but at times, something
resembling a "version" is necessary.

1.a. Requirements can change because of environmental changes -- newer
hardware platforms, operating systems, user interface devices,
and so forth.

1.b. Requirements can change because of competitive pressures -- there are
features in some other competitive "language" that seem desirable
enough to want to evolve the language without making it unreasonably
complex.


What future technologies might appear? 3D browsers? Non-WIMP devices?
For HTML, the current specification has primarily considered only
the needs of "browsers" which have "windows" and something
one can "click" with, a typist, the notion of being able to enter forms,
the idea that the "web page" is being displayed on a two-dimensional
device with rectangular layout, that there is a single "user", and so
on and so forth. Who knows whether these constraints are necessary?

Is there some merger of web and messaging and conferencing that would
work better with highly desirable changes to HTML5? With what confidence
can you say "no"? What is the cost of providing for incompatible changes
against the risk of chaos if they begin to appear?

2. Two implementations aren't representative

"Candidate Recommendation" exit criteria need only two implementations,
and do not even require spanning the breadth of applicable hardware
and software.

Can HTML5/CSS3 work well on an electronic paper display such as Kindle?
Can it work well in a collaborative multi-pointer system?
Is there a single "focus" or "tab order"? Does it work well with
typical "remote control" devices used for TVs? These are current
platforms which are not required to work well, in order to exit CR.

3. Ambiguities appear

This is another common reason why languages evolve. Implementors get
together and write a specification. They're happy because the spec
matches what they implement -- or so they think!

However, all of the implementors were part of the spec development
process and .... amazingly .... there are some things they know and
agree on that aren't part of the spec. (No matter how brilliant and
wise the spec editor).

Later, someone else comes along and implements the spec as written, but,
either because of confusing wording or missing information, their
implementation is incompatible. Then there's a desire to update the
spec to resolve the ambiguity, but there is no way for authors to
create material that acknowledges that the author has chosen the
new (unambiguous) definition over the previous (ambiguous) one.


I'm sure there are other reasons for language evolution and there's
some overlap between these, but I thought this might help in
elaboration of the versioning framework.

Thoughts?

Larry
--
http://larry.masinter.net





RE: Versioning and HTML

masinter
In reply to this post by Chris Wilson-12
Another use case for embedded version indicators is to track versions
during authoring, production and deployment before they are sent over
the wire.  Authors and authoring tools may well know which version of
a language they are editing or producing content for, which features
they are assuming and so forth. Without any way of marking the intended
version in the content itself, it is likely that version indicators will
be carried outside, and subject to loss. As has been seen with MIME types,
external metadata is at risk of being separated from the content, and
authors often lack control over how it is deployed.

Right now, new HTML features seem to be deployed on the web by advanced
sites "sniffing" the User Agent version string and using it to determine
which version of an HTML page should be generated. This process is subject
to some significant failures, mainly because new or otherwise unrecognized
user agents have no way of indicating to such sniffers that they, too, intend
to interpret the same features as one or another proprietary browser.
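The failure mode can be sketched as follows. Everything here is hypothetical: the User-Agent strings, the `KNOWN_BROWSERS` table, and the feature names are invented for illustration and do not describe any real site or browser.

```python
# Hypothetical table a site might keep: UA substring -> features assumed.
KNOWN_BROWSERS = {
    "BrowserA/3": {"canvas", "video"},
    "BrowserB/7": {"canvas"},
}


def features_by_sniffing(user_agent):
    """Guess supported features from the User-Agent string alone."""
    for marker, features in KNOWN_BROWSERS.items():
        if marker in user_agent:
            return features
    # Unrecognized agent: the sniffer has no information, so it falls
    # back to the most conservative page, whatever the agent can do.
    return set()


def choose_page(user_agent):
    """Pick which variant of a page to generate."""
    features = features_by_sniffing(user_agent)
    return "rich page" if "canvas" in features else "basic page"


# The second agent is a made-up new browser that is not in the table.
print(choose_page("Mozilla/5.0 BrowserA/3.1"))
print(choose_page("Mozilla/5.0 NewBrowser/1.0"))
```

The second call yields the basic page even if the new browser implements every feature the rich page needs, because nothing in its User-Agent string lets it say so.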

So I think we need to consider the use cases of language version management
during pre-publication processes, and also the use case of "browser version"
sniffing and the failure cases. This touches on the "content negotiation"
issue (as the sub-case of negotiating versions should be in scope for
the versioning discussion.)

Larry
--
http://larry.masinter.net







Re: Versioning and HTML

Jonathan Rees-3
In reply to this post by masinter
On Apr 18, 2009, at 2:14 PM, Larry Masinter wrote:
> I wrote:
>
> The general idea of 'versioning' is that you include some indicator  
> of version in the current language that will allow current  
> processors to deal appropriately with future languages and recognize  
> that they don't understand or can process appropriately this future  
> content. The main thing is to categorize or predict the kinds of  
> future content that current implementations should avoid or react to  
> in some appropriate way. What are those categories?

I think version indicators should be approached with some amount of  
skepticism.

You are making an assumption. It is empirically true that one can  
version a language without having version indicators. For example,  
Algol 60 and Algol 68 do not have version indicators. You are making a  
design choice, or perhaps attempting to establish 'versioning' as a  
term of art, not articulating a given.

Expanding on what you said (you use "or" very significantly): Version  
indicators can communicate various kinds of information to consumers,  
depending on the design of the versioning regime itself. In particular:

1. To syntactically characterize the text in question.  The same  
information could in principle be determined by scanning the text to  
see whether it syntactically conforms to the corresponding language  
specification; the purpose of the indicator is to make it unnecessary  
for the consumer to do this.

2. To modulate the interpretation of the text in question. That is,  
depending on what the version indicator is, interpreting agents might  
have to interpret the same text in two different ways.

This choice has profound consequences for the design of future  
versions. Suppose that an A (old) text is marked with indicator A. #1  
does not in itself imply that a text generated by an A-interpreter  
will lead to the desired payoff for a B producer, for any text. That  
will only be true if when we designed the versioning regime we made a  
stipulation that all future versions will have this property (new  
producers "must be" happy with what old consumers do with all texts).  
If we stipulate only sense #1, then future version designers do not  
have the freedom to transition a given interpretation of a given text  
from acceptable (in A) to less acceptable (in B) - or vice versa.

Given forwards and backwards compatibility as a language series
policy, there is no real need for a version indicator, other than as a
convenience (so that agents who care don't have to scan the document
to see if it contains constructs they don't understand).

So in any discussion, you need to be clear about the sense of the  
version indicator. Sense 1 is economical in that a consumer can always  
just use a B-interpreter to interpret according to language A. There  
is strong incentive for a consumer to assume it even when doing so  
isn't in spec. Sense 2 is harder to implement since the consumer needs  
two interpreters or two interpretation modes, one for A texts and one  
for B texts.
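To make the two senses concrete, here is a toy sketch. The token sets, the function names, and the division example are all invented for illustration; they are assumptions of this sketch, not part of any real language series.

```python
A_TOKENS = {"1", "2", "3", "+"}  # made-up language A
B_TOKENS = A_TOKENS | {"*"}      # made-up language B extends A


def scan_version(text):
    """Sense 1: the indicator only records what scanning would find."""
    tokens = set(text.split())
    if tokens <= A_TOKENS:
        return "A"
    if tokens <= B_TOKENS:
        return "B"
    return None  # not a text of either language


def divide(text, version):
    """Sense 2: the indicator changes the meaning of the same text.

    The text is 'x / y'; version A truncates, version B does not.
    """
    left, _, right = text.split()
    if version == "A":
        return int(left) // int(right)
    if version == "B":
        return int(left) / int(right)
    raise ValueError(f"unknown version indicator: {version!r}")
```

In sense 1 a consumer could drop the indicator and call `scan_version` instead. In sense 2 it cannot: `divide("7 / 2", "A")` and `divide("7 / 2", "B")` give different answers for the same text, so the consumer needs both interpretation modes.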

Version indicators can be helpful, but they just push off the problem  
one level - they are really part of the language(s) in question, so  
they have to be evaluated according to exactly the same criteria that  
one would apply to a language series that doesn't have them. Suppose  
you have language versions A and B, and then a "sum" language C = A +  
B whose texts consist of a version indicator followed by a text of  
either A or B. (If A and B both already have version indicators you  
*may* be able to take C texts = A texts union B texts.) You still have  
to agree ahead of time - before language B is invented - on how to  
interpret texts of C - that is, everyone concerned needs a priori  
knowledge of how to parse and understand version indicators, even if  
it's just to say that rejecting unknown versions, or unknown texts, is  
OK. When you design a language series initially, you may set aside a  
place for version indicators, and specify that the indicator  
"sublanguage" is extensible (i.e. new indicators may come along). If  
you get the indicator language wrong in the first place, e.g. if you  
define it to specify sense 1 instead of sense 2 or vice versa, then  
you may find yourself stuck, either underconstraining the series (so  
that old consumers can't consume new content with confidence) or  
overconstraining the series (so that new content will be rejected by
conforming old consumers).
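A minimal sketch of an old consumer of the "sum" language C, assuming a made-up `version:` header line as the indicator syntax and assuming the rule agreed ahead of time is to reject unknown versions. The interpreter for A is a placeholder invented for this example.

```python
def interpret_a(body):
    """Made-up language A: the text is upper-cased."""
    return body.upper()


# Interpreters this (old) consumer knows about. B does not exist yet.
KNOWN_VERSIONS = {"A": interpret_a}


def interpret_c(text):
    """Interpret a text of C: a version indicator line, then the body.

    The rule for unknown indicators has to be fixed before B exists;
    here the agreed rule is assumed to be 'reject'.
    """
    indicator, _, body = text.partition("\n")
    if not indicator.startswith("version:"):
        raise ValueError("missing version indicator")
    version = indicator[len("version:"):].strip()
    interpreter = KNOWN_VERSIONS.get(version)
    if interpreter is None:
        raise ValueError(f"unknown version {version!r}; rejecting text")
    return interpreter(body)
```

Had the agreed rule been "process unknown versions as A" instead, the last branch would call `interpret_a`, and future designers of B would be constrained accordingly. Either way the choice is baked into consumers before B is invented.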

So version indicators only support extensibility (or whatever other  
goal you're after) if the future consequences for both old and new  
consumers are articulated and documented before the whole process gets  
started.

Best
Jonathan

---Footnotes---

1. Saying that C = A + B where B is not yet invented is not as
nonsensical as it sounds. An extension may be thought of as a secret
that is somehow known in principle, but not revealed to producers and  
consumers until some future date. I think of versioning and extension  
as being similar to the concept of single assignment or "future" in  
programming languages.

2. For those of you who read my formal stuff, I used a different  
definition of "language" there... I think that language (or language  
version) as class/predicate of interpreters, or equivalently  
requirements/specification/constraints on interpreters, is probably a  
more useful definition than either language as set of strings or
language as single interpretation function on set of strings.



RE: Versioning and HTML

Ian Hickson
In reply to this post by masinter
On Sat, 25 Apr 2009, Larry Masinter wrote:
>
> I'm trying to lay out some of the general reasons for "why languages
> evolve and why we need versions" with the specifics for HTML.

There's no doubt that languages do and often should evolve.

It's also demonstrably the case that languages can and have been evolved
without changes that require implementations to implement forks to consume
content from different versions of the language in different ways. On the
Web, for example, URIs have evolved that way, and with some unnecessary
exceptions, so have CSS, HTML, and the DOM APIs.

In fact, in the case of CSS and HTML, the only versioning has been quirks
vs standards mode, a versioning that wasn't sanctioned by the
specifications contemporary to its introduction, and which would have been
unnecessary had the deviations from the original design required by
deployed content been codified as standard, as we have been doing for the
past few years with CSS 2.1 and HTML5.

Forking the language makes implementations orders of magnitude more
complex. Watching the Internet Explorer engineers' pained expressions when
one discusses the implications of their decision to ship multiple versions
of their rendering engine makes this abundantly clear. It also makes the
language less suitable for constrained devices (instead of one language to
support, one effectively ends up with multiple languages to support),
harder to test (instead of testing one language implementation, one
effectively ends up testing multiple implementations, as well as their
interactions in edge cases), and harder to document (instead of just
specifying the weird behaviours that end up de-facto part of the language
due to wide deployment of implementation bugs, one has to also specify the
other behaviours expected in each version).

Thus my belief that in general, language designers should strive to make
their languages versionless at the syntax level.

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: Versioning and HTML

Norman Walsh
In reply to this post by Jonathan Rees-3
Jonathan Rees <[hidden email]> writes:
> So version indicators only support extensibility (or whatever other  
> goal you're after) if the future consequences for both old and new  
> consumers are articulated and documented before the whole process gets  
> started.

But that's not uniquely true of version indicators, is it? That's true
no matter what technique is used to distinguish one version from
another. The alternative, where there aren't any version identifiers,
requires consumers to deal with both old and new markup as well.

For some languages and some applications, it may be reasonable to
define a universal semantics for all versions, such as the HTML rule
of ignoring wrappers it doesn't recognize. (Not that that hasn't
introduced problems of its own, with special elements created over
time just to work around the consequences of the "ignore wrappers"
rule.)

For other languages and other applications, it may not be reasonable
to define a universal semantics. Applications must be expected in that
case to do something else. Version identifiers offer a convenient
mechanism to help users distinguish between versions, even if machines
don't need them: "Unexpected element 'fribble' encountered in this
V1.2.3 document. The element 'fribble' is not defined in V1.2.3."
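As a rough illustration of the diagnostic quoted above, here is a minimal sketch of a checker that uses a declared version to produce that kind of message. The vocabularies, version numbers, and the function name `check_elements` are hypothetical and chosen only for this example; they do not come from any real specification.

```python
# Hypothetical per-version vocabularies (illustrative only).
KNOWN_ELEMENTS = {
    "1.2.3": {"doc", "para", "title"},
    "1.3.0": {"doc", "para", "title", "fribble"},
}


def check_elements(version, elements):
    """Return human-readable diagnostics for elements unknown in `version`."""
    vocabulary = KNOWN_ELEMENTS.get(version)
    if vocabulary is None:
        # Without a known version there is nothing to validate against.
        return [f"Unknown version V{version}; cannot validate."]
    return [
        f"Unexpected element '{name}' encountered in this V{version} document. "
        f"The element '{name}' is not defined in V{version}."
        for name in elements
        if name not in vocabulary
    ]
```

The same element list yields no diagnostics when the document declares a version whose vocabulary includes 'fribble', which is the sense in which the identifier helps the user even if the machine does not need it.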

                                        Be seeing you,
                                          norm

--
Norman Walsh <[hidden email]> | Do not seek to follow in the footsteps
http://nwalsh.com/            | of men of old; seek what they
                              | sought.--Matsuo Basho


RE: Versioning and HTML

Chris Wilson-12
In reply to this post by Ian Hickson
Ian Hickson [mailto:[hidden email]] wrote:
>On Fri, 24 Apr 2009, Chris Wilson wrote:
>> The only way out of this would be for EVERY browser to very carefully
>> only ship "proprietary-marked" (a la CSS' vendor extensions) versions of
>> APIs/elements until the standard moves OUT of CR, and then add support
>> for the standard naming and deprecate their proprietary-marked versions
>> over time.
>
>Or at least, to do so with features that don't have obvious ways to be
>updated without needing version syntax. Why is that a problem?

I didn't say it was - but aside from CSS, that's not what is happening.  Actually, there is the problem that web developers would have to abstract their code to point to the mostly-interoperable implementations of a feature, until that feature moved out of CR - you shouldn't use <canvas> today, then, you should use <webkit-canvas>, <moz-canvas>, etc.  We've had this question internally - e.g. for rounded border corners, should IE bother doing -ms-border-radius, or just skip straight to border-radius?
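To make the "abstract their code" burden concrete, here is a minimal sketch of the lookup a page author would need under a prefix-until-stable regime. The function name `resolve_property`, the prefix list, and the idea of passing in a set of supported names are assumptions made for this illustration, not any browser's API.

```python
def resolve_property(name, supported, prefixes=("-webkit-", "-moz-", "-ms-")):
    """Pick the name to use for a feature: the standard name if the
    implementation supports it, otherwise the first supported prefixed
    variant, otherwise None."""
    if name in supported:
        return name
    for prefix in prefixes:
        candidate = prefix + name
        if candidate in supported:
            return candidate
    return None
```

For example, `resolve_property("border-radius", {"-moz-border-radius"})` returns the prefixed name, and once an implementation adds the unprefixed name the same call returns `"border-radius"` without any change to the page.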

I'm not against the above as a process, you understand - I'm just pointing out that this isn't the way that vendors are doing things today, outside of CSS.

>In the case of the spec changing while there is already an implementation,
>it's not like the spec is going to have BOTH versions defined, and it's
>not like other browsers are going to want to implement BOTH versions.

It sort of depends what third-party applications are built on that browser-specific implementation, I expect.  Sort of like if GMail is built on Safari's SQL store support, and the spec changes to abstract that more.  What should the spec do, in this case?

-Chris


RE: Versioning and HTML

Ian Hickson
On Mon, 27 Apr 2009, Chris Wilson wrote:

> Ian Hickson [mailto:[hidden email]] wrote:
> >On Fri, 24 Apr 2009, Chris Wilson wrote:
> >> The only way out of this would be for EVERY browser to very carefully
> >> only ship "proprietary-marked" (a la CSS' vendor extensions) versions of
> >> APIs/elements until the standard moves OUT of CR, and then add support
> >> for the standard naming and deprecate their proprietary-marked versions
> >> over time.
> >
> >Or at least, to do so with features that don't have obvious ways to be
> >updated without needing version syntax. Why is that a problem?
>
> I didn't say it was - but aside from CSS, that's not what is happening.

This would be a more convincing argument if Microsoft hadn't agreed to
renaming some of its APIs in IE8 and then forgotten to do so:

   http://lists.w3.org/Archives/Public/public-html-comments/2008Jun/0020.html

...despite being reminded to do so half a dozen times, and despite other
vendors apparently understanding the convention without trouble:

   http://lists.w3.org/Archives/Public/public-html/2008Jun/0294.html

Microsoft could change its behavior, and then it _would_ be happening.


> Actually, there is the problem that web developers would have to
> abstract their code to point to the mostly-interoperable implementations
> of a feature, until that feature moved out of CR - you shouldn't use
> <canvas> today, then, you should use <webkit-canvas>, <moz-canvas>, etc.  

This is why for features like new elements, vendors should always first
get buy-in from the working group.


> We've had this question internally - e.g. for rounded border corners,
> should IE bother doing -ms-border-radius, or just skip straight to
> border-radius?

You use a prefix until it's stable -- meaning in CR, for CSS specs. With
HTML5 I've been annotating the spec on a per-section basis to give
implementors fine-grained detail on what is stable and what is not, and
have been responding to requests for advice from all browser vendors on
this very matter.


> >In the case of the spec changing while there is already an
> >implementation, it's not like the spec is going to have BOTH versions
> >defined, and it's not like other browsers are going to want to implement
> >BOTH versions.
>
> It sort of depends what third-party applications are built on that
> browser-specific implementation, I expect.  Sort of like if GMail is
> built on Safari's SQL store support, and the spec changes to abstract
> that more.  What should the spec do, in this case?

If other browser vendors believe they need to implement the feature, then
it should be a first-class feature and specified, even if there are other
features added that make it less important.

If other browser vendors believe that the feature isn't required to
support Web content, then the spec can die and the feature can just be a
transitory browser-specific feature.

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


RE: Versioning and HTML

Chris Wilson-12
Ian Hickson [mailto:[hidden email]] wrote:

>This would be a more convincing argument if Microsoft hadn't agreed to
>renaming some of its APIs in IE8 and then forgotten to do so:
>
>   http://lists.w3.org/Archives/Public/public-html-comments/2008Jun/0020.html
>
>...despite being reminded to do so half a dozen times, and despite other
>vendors apparently understanding the convention without trouble:
>
>   http://lists.w3.org/Archives/Public/public-html/2008Jun/0294.html
>
>Microsoft could change its behavior, and then it _would_ be happening.

Umm, no, it wouldn't; I said:

>On Fri, 24 Apr 2009, Chris Wilson wrote:
> >> The only way out of this would be for EVERY browser to very
> >> carefully only ship "proprietary-marked" (a la CSS' vendor
> >> extensions) versions of APIs/elements until the standard moves OUT
> >> of CR, and then add support for the standard naming and deprecate
> >> their proprietary-marked versions over time.

Unless I missed something, none of the HTML5 features have moved past CR.  We could have a conversation about DOM storage if you like, but we're talking about implementing features that are not past CR - and I agree with RO'C on the mail you sent, except that I'm suggesting that "accepted in the spec" is likely "past CR" unless a browser is willing to say they won't whip out the "compatibility" card (as Apple is doing with SQL store now, AIUI).

>> Actually, there is the problem that web developers would have to
>> abstract their code to point to the mostly-interoperable implementations
>> of a feature, until that feature moved out of CR - you shouldn't use
>> <canvas> today, then, you should use <webkit-canvas>, <moz-canvas>, etc.
>
>This is why for features like new elements, vendors should always first
>get buy-in from the working group.

Buy-in is kind of irrelevant; buy-in is not a standard or a full spec that you won't have to change.  I'd say instead that

>> We've had this question internally - e.g. for rounded border corners,
>> should IE bother doing -ms-border-radius, or just skip straight to
>> border-radius?
>
>You use a prefix until it's stable -- meaning in CR, for CSS specs. With
>HTML5 I've been annotating the spec on a per-section basis to give
>implementors fine-grained detail on what is stable and what is not, and
>have been responding to requests for advice from all browser vendors on
>this very matter.

That's incorrect, in my opinion, because it presumes that you are a replacement for CR.

>> >In the case of the spec changing while there is already an
>> >implementation, it's not like the spec is going to have BOTH versions
>> >defined, and it's not like other browsers are going to want to implement
>> >BOTH versions.
>>
>> It sort of depends what third-party applications are built on that
>> browser-specific implementation, I expect.  Sort of like if GMail is
>> built on Safari's SQL store support, and the spec changes to abstract
>> that more.  What should the spec do, in this case?
>
>If other browser vendors believe they need to implement the feature, then
>it should be a first-class feature and specified, even if there are other
>features added that make it less important.
>
>If other browser vendors believe that the feature isn't required to
>support Web content, then the spec can die and the feature can just be a
>transitory browser-specific feature.

Here, at least, I think we agree.  But prefixing such things would make it clearer that they're not "standard" until moved past CR.

-Chris


RE: Versioning and HTML

Ian Hickson
On Mon, 27 Apr 2009, Chris Wilson wrote:
>
> [...]

The discussion regarding pre-CR implementations is beside the point,
since versioning wouldn't help with that problem. There's no version of
HTML5 or Web Storage that ever included the features for which Microsoft
committed to using vendor prefixing [1] (despite not actually using it
when the product was finally shipped), so versioning wouldn't affect how
they work. There's similarly no version of HTML5 or Web Storage that ever
included the behaviour IE8 implements [2] with respect to its handling of
concurrent script execution, so versioning wouldn't help with that either.

In practice, as I've said before, the actual cases where language
versioning would be a help at all are few and far between; IMHO they are
rare enough that it is a significantly better use of resources to route
around these problems on a case-by-case basis.

[1] http://lists.w3.org/Archives/Public/public-html-comments/2008Jun/0020.html
[2] http://lists.w3.org/Archives/Public/public-html/2009Mar/0574.html

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


RE: Versioning and HTML

Chris Wilson-12
I think we're talking at cross-purposes.  I am not, in this thread, suggesting versioning in the language to solve any particular problem.  In fact, if you go back to my initial mail, I was 1) pointing out that the CR stage won't resolve problems with defining behavior incorrectly for the long term, or even necessarily being definitive enough in behavior description, but 2) noting that a bigger, but related problem, is that vendors shipping their implementations of in-progress specifications, and then using incompatibilities with shipped implementations as a reasoning for not fixing their behavior, is a problem.  I said that the only way out of this particular problem appears to be to ask all vendors to prefix their non-CR'ed features, so that we don't cycle around with this problem.  Nothing excuses Microsoft from this issue; I didn't say that explicitly, but I would hope it was implied.

In this thread, I've been referring to problems INSIDE one version of the spec (HTML in this case, presumably) - the language versioning need, in my opinion, is for the deltas BETWEEN two major versions of the language.  Our experience there (with multiple versions of large core standards at the heart of the web platform) is knowingly limited, particularly with HTML; there's really only been one version of HTML (4.01) in deployed practice in the past decade, and it did not define behavior in much detail.  HTML5 defines it much more carefully, of course, but the supposition that it will define everything, and define it correctly for the long term, has always struck me as presumptuous at best.  CSS2 defined behavior much more strictly than CSS1; that caused incompatibilities.  CSS 2.1 redefined some behaviors from CSS2; that caused incompatibilities too.  But at any rate, that wasn't the problem I was referring to in this thread.


RE: Versioning and HTML

Ian Hickson
On Mon, 27 Apr 2009, Chris Wilson wrote:
>
> I think we're talking at cross-purposes.  I am not, in this thread,
> suggesting versioning in the language to solve any particular problem.  

Ah. My apologies. (I had assumed this was what you were suggesting because
of the Subject line.)


> In fact, if you go back to my initial mail, I was 1) pointing out that
> the CR stage won't resolve problems with defining behavior incorrectly
> for the long term, or even necessarily being definitive enough in
> behavior description,

Then what is the purpose of the CR stage? I thought the whole point of the
CR stage was to check that the spec is in fact interoperably implementable
(i.e. that it is definitive enough in behavior description), and to gain
enough implementation experience to ensure that problems with defining
behavior incorrectly for the long term are all caught.


> but 2) noting that a bigger, but related problem, is that vendors
> shipping their implementations of in-progress specifications, and then
> using incompatibilities with shipped implementations as a reasoning for
> not fixing their behavior, is a problem.

Feedback from vendors must be taken into account so that the spec is
something they are willing to implement, sure. But I don't think this has
caused any problems for people willing to put interoperability ahead of
theoretical purity.


> I said that the only way out of this particular problem appears to be to
> ask all vendors to prefix their non-CR'ed features, so that we don't
> cycle around with this problem.

I think working closely with implementors to make sure they are aware of
which cases should be prefixed, that they are aware of areas that are
important to consider, and that they know which parts are likely to be
stable and which are not, should be enough, if the browser vendors are
willing to cooperate. It has, in fact, been enough for all but one of the
browser vendors involved in HTML5's development. This doesn't require
prefixing all non-CR'ed features.


> In this thread, I've been referring to problems INSIDE one version of
> the spec (HTML in this case, presumably) - the language versioning need,
> in my opinion, is for the deltas BETWEEN two major versions of the
> language.

I think in practice languages evolve continually, and that the model of
having different major versions (the way the W3C does) is the problem.
Instead of releasing new major versions of HTML or CSS, it would be better
if we had a continually evolving language definition where new features
were slowly added in tandem with implementation work. In such a world,
versioning wouldn't in fact make any sense.


> HTML5 defines it much more carefully, of course, but the supposition
> that it will define everything, and define it correctly for the long
> term, has always struck me as presumptuous at best.

I'm not really sure what you mean by "everything" and "correctly" here.
The definition that matters is what deployed content depends on; the spec
can only ever be an increasingly precise approximation of this.


> CSS2 defined behavior much more strictly than CSS1; that caused
> incompatibilities.  CSS 2.1 redefined some behaviors from CSS2; that
> caused incompatibilities too.

The key is that when defining behaviours of already-implemented features,
we have to do it in a way that IS compatible with deployed content. This
was not done with CSS 2.x.

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


RE: Versioning and HTML -- version indicators

masinter
In reply to this post by Ian Hickson
Version indicators

Version indicators can either be out-of-band (not within the content,
but associated with the content, such as with using file extensions,
MIME types or other indicators) or in-band (contained in the content),
or some combination (out-of-band information overriding in-band or
vice versa, or combined in some other more complex way.)

In-content version indicators can either be global (readily determined
by reviewing the content in a fixed location or within the head
1k bytes of the file, for example) or local.

Languages can change through augmentation (adding new keywords,
features, procedures, available combinations), restriction
(previous options are deprecated, removed, disallowed),
clarification (previously ambiguous features clarified), or
incompatible change in some or all circumstances.

Augmentations increase the set of strings that are valid
or meaningful or useful instances in the language, restrictions
decrease the set. Clarifications generally leave the set
alone, while incompatible changes may or may not modify
the set.
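The set view in the paragraph above can be sketched in a few lines, under the simplifying assumption that a language version is modelled as a small finite set of valid strings (real languages are of course not finite). The function name `classify_change` and the sample strings are hypothetical.

```python
def classify_change(old, new):
    """Classify a language change by comparing the sets of valid instances."""
    if new == old:
        # Same set of valid strings: a clarification, or an incompatible
        # change that alters meaning without altering validity.
        return "same set (clarification or changed meaning)"
    if new > old:
        return "augmentation"       # strict superset: more strings are valid
    if new < old:
        return "restriction"        # strict subset: fewer strings are valid
    return "incompatible change"    # strings both added and removed


v1 = {"<p>", "<b>", "<font>"}       # hypothetical valid instances
```

Note that the "same set" case is exactly the one a scanner cannot detect from the content alone, since nothing in the text distinguishes the old interpretation from the new one.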

Whether language changes can be recognized without version
indicators depends on the type of change:

Some augmentations might be recognized by appearance
of syntax that wasn't previously recognized (i.e., the
"version indicator" is the use of the feature itself).

Augmentations might be ignored or merely processed incorrectly
by old implementations rather than being recognized as
intended with a formerly unimplemented interpretation.

Restrictions, clarifications, incompatible changes cannot
readily be determined by scanning, though.

Even though it is possible to avoid having out-of-band
or global-scope version indicators for augmentations,
this does not mean that there are no advantages or uses
for in-band global indicators.

If there are multiple languages (whether Algol 60 vs Algol 68
or just multiple "modes"), having a global-scope in-band
version indicator allows for switching between one interpreter
and another. Indicating the version in-band but requiring
parsing of the content means that it isn't possible to
evolve syntax or parsers.
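A minimal sketch of a global in-band indicator used to switch interpreters, assuming a made-up marker syntax (`%%version: N`) that must appear within the first 1k bytes. The marker format, the default of version 1, and the two stand-in interpreters are all hypothetical. The point of the sketch is that the indicator is found by a fixed, dumb scan of the head of the file rather than by parsing the content with a version-specific parser.

```python
import re

HEAD_BYTES = 1024  # only the head of the content is scanned

# Hypothetical marker syntax; not taken from any real format.
MARKER = re.compile(rb"%%version:\s*(\d+)")


def sniff_version(content, default=1):
    """Return the declared version, or `default` if no marker is in the head."""
    match = MARKER.search(content[:HEAD_BYTES])
    return int(match.group(1)) if match else default


def interpret_v1(content):
    return "v1 interpreter"  # stand-in for a real version 1 processor


def interpret_v2(content):
    return "v2 interpreter"  # stand-in for a real version 2 processor


INTERPRETERS = {1: interpret_v1, 2: interpret_v2}


def process(content):
    """Dispatch to the interpreter selected by the in-band indicator."""
    version = sniff_version(content)
    handler = INTERPRETERS.get(version)
    if handler is None:
        raise ValueError(f"unsupported version {version}")
    return handler(content)
```

Because the scan does not depend on the grammar of either version, a later version is free to change its syntax entirely; an old processor still recognizes that it is looking at content it cannot handle.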

Larry
--
http://larry.masinter.net



RE: Versioning and HTML -- CR exit criteria

masinter
In reply to this post by Ian Hickson
This is a little off-topic from "versioning and HTML" except
for the assertion that once HTML exits CR, no incompatible
changes will ever be necessary.


IF there are two implementations that are actually built
from reading the spec itself, and the implementations interoperate, then
you have some confidence that the spec isn't incomprehensible and that
it is actually possible to build SOMETHING interoperable based on it.

The process assumes that the assertions that the implementations in
fact match the specification are made in good faith. Unfortunately,
this isn't always the case.

Many specifications unfortunately are completely incomprehensible,
and the CR exit criteria don't explicitly require that the
implementations weren't built using inside knowledge and the
spec written after the fact.

Even if the implementations are written based on the specification
rather than the other way around, there is no process for verifying
that they match. Test cases, even if results are reported honestly,
only verify implementation of the test cases and not of the
specification.

Having only two implementations is hardly a guarantee of the
utility of the specification for wide applicability.  Surely
only two implementations aren't a guarantee that the considerations
of the wide variety of devices, operating systems, usability
concerns, international contexts, networking situations have
really been considered, even for the simplest of specifications.

As noted earlier, even if there are many implementations, all
built based on the specifications, over time requirements change,
and changing requirements might require incompatible changes.

It is never possible to "ensure that problems with defining
behavior incorrectly for the long term are all caught."

Larry
--
http://larry.masinter.net

