HTML5 and XHTML2 combined (a new approach)

HTML5 and XHTML2 combined (a new approach)

Giovanni Campagna
I read the whole thread about HTML5 and XHTML2 integration, and I have always thought it silly that the W3C is spending people, time, money and facilities on a language that its intended audience won't implement.
But I've got an idea: if XHTML2 now defines all the current features of HTML5, without the HTML5 bloat (and the oppressive control of the WHATWG), it will be easier for the HTML WG to adopt these specifications and eventually re-form a single Markup Working Group.
How would this be implemented? Take the features from HTML5 and modularize them.

The biggest advantage of XHTML1.1/2 is modularization. Create the following modules:
- XHTML Web Applications Module (with video, audio, canvas, datalist, header, footer)
- XHTML Backward Compatibility Module (with b, i, font, etc.)
and in the latter, state that conforming authors must not use any feature defined in that module.

Then the remaining steps towards completely abolishing HTML5 start with a specification, "Another serialization for XHTML", that defines a processing model (maybe based on real SGML). It would depend on XHTML2 + XHTMLWebApps for the content model and add rules like "this element has an implied end tag when a start tag not allowed in the content model is found", "this element has an implied end tag when an end tag for an element that is not open is found" and "this element discards mismatched end tags", while allowing other specs to integrate (like XHTML Mod1.1, which defines the Modularization framework and the base modules for XHTML1.1).
This way we can allow modularization and extensibility (since content model and start/end tag handling are not defined in the same specification), although we may need to get DTDs back.
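
To make that split concrete, here is a minimal sketch in Python, not taken from any spec. The content model lives in one table and the end-tag handling lives in a separate function that only consults the table, so either could be replaced independently. The `CONTENT_MODEL` table, the element subset and the `normalize` helper are all hypothetical illustrations. The rule that an end tag for an open ancestor closes everything nested inside it is a simplification added for the sketch, not one of the rules quoted above.

```python
# Hypothetical content model: which child elements each element may contain.
# In the proposal this table would come from XHTML2 + the WebApps module.
CONTENT_MODEL = {
    "body": {"p", "ul"},
    "ul": {"li"},
    "li": {"p", "ul"},
    "p": set(),  # no element children in this toy model
}


def normalize(tokens, root="body", content_model=CONTENT_MODEL):
    """Make implied end tags explicit in a sloppy tag stream.

    tokens is a list of ("start", name), ("end", name) or ("text", data).
    Returns a well-nested token list wrapped in the root element.
    """
    stack = [root]
    out = [("start", root)]
    for kind, value in tokens:
        if kind == "text":
            out.append((kind, value))
        elif kind == "start":
            # A start tag that no open element allows is simply dropped.
            if not any(value in content_model[e] for e in stack):
                continue
            # Implied end tags: close elements until the start tag is
            # allowed by the content model of the current element.
            while value not in content_model[stack[-1]]:
                out.append(("end", stack.pop()))
            stack.append(value)
            out.append(("start", value))
        elif kind == "end":
            # Mismatched end tag (element not open): discard it.
            if value not in stack[1:]:
                continue
            # Simplification for this sketch: closing an open ancestor
            # also closes everything still open inside it.
            while stack[-1] != value:
                out.append(("end", stack.pop()))
            out.append(("end", stack.pop()))
    # Close whatever is still open at end of input, root included.
    while stack:
        out.append(("end", stack.pop()))
    return out
```

Swapping in a different `content_model` argument changes which tags are implied without touching the function, which is the kind of separation I mean.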

All the rest of HTML5 (DOM3 HTML, Scripting Execution Contexts, Persistent Storage API, Advanced Network Communications API) is in the scope of the WebApplications WG, and many people inside and outside the HTML WG would rather have it separated.

In conclusion, the only way to get XHTML widely deployed and implemented (and thus to reach PR) is to get rid of HTML5. Start working immediately on the XHTML WebApps module, before it's too late.

Giovanni

Re: HTML5 and XHTML2 combined (a new approach)

bhawkeslewis

On 21/1/09 20:30, Giovanni Campagna wrote:
> The biggest advantage of XHTML1.1/2 is modularization

Is it?

How does modularization of XHTML help content producers, user agent
developers, or end-users in practice?

Has it ever helped you?

How would modularization of XHTML help achieve the stated goals of the
HTML WG?

I've yet to be persuaded that modularization of XHTML actually works or
that, if it did work, it would make the Web better.

The core idea of modularization, as far as I can gather, was that user
agent developers could declare the modules they support and that content
producers could match their content to that support profile:

http://www.w3.org/TR/1999/WD-html-in-xml-19990224/#mods

Whether this sort of fragmentation of the Web is remotely desirable is
open to debate; personally I want access to _the_ Web on my phone (and
my TV and my fridge and my cyborg cat), not lots of little walled garden
webs.

But in any case, when people actually try to use those modules, they
seem to find they cannot declare a profile in terms of them because it's
rather difficult to carve up XHTML features into useful groups.

The Open Mobile Alliance defined a profile of XHTML for mobile devices
that included various XHTML modules. But we find this statement in their
specification:

"The XHTML Mobile Profile document type could also serve as a host
language, that is, a language containing a mix of vocabularies within
one document type. Those considering its use as a host language should
consider that it is not strictly XHTML Host Language Conforming, as it
only partially includes three modules."

OMA wanted XHTML MP to include "start" and "value" out of the Legacy
module, "b", "big", "hr", "i", and "small" out of the Presentation
module, and "fieldset" and "optgroup" out of the Forms module.

http://www.openmobilealliance.org/tech/affiliates/wap/wap-277-xhtmlmp-20011029-a.pdf

Isn't a lesson to be drawn that language designers are going to
cherrypick features not modules, and that, if they are going to do that,
they are going to need to declare what features they support rather than
what modules?
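
A rough sketch of the difference, in Python. The module membership lists are my assumption of what XHTML Modularization puts in the Presentation and Forms modules (the Legacy module is omitted), and `XHTML_MP_SUBSET` holds only the elements named above, not the full XHTML MP profile. The helper names are hypothetical.

```python
# Assumed module contents; check XHTML Modularization before relying on them.
MODULES = {
    "Presentation": {"b", "big", "hr", "i", "small", "sub", "sup", "tt"},
    "Forms": {"button", "fieldset", "form", "input", "label", "legend",
              "optgroup", "option", "select", "textarea"},
}

# Only the elements mentioned in this post, not the whole XHTML MP profile.
XHTML_MP_SUBSET = {"b", "big", "hr", "i", "small", "fieldset", "optgroup"}


def module_coverage(features, modules=MODULES):
    """Report whether each module is fully, partially or not covered."""
    report = {}
    for name, members in modules.items():
        supported = members & features
        if supported == members:
            report[name] = "full"
        elif supported:
            report[name] = "partial"
        else:
            report[name] = "none"
    return report


def supports(features, element):
    """Feature-level check: the question a content producer actually asks."""
    return element in features
```

With these assumed lists, `module_coverage(XHTML_MP_SUBSET)` reports both modules as "partial", so a module-level declaration says nothing useful, while `supports(XHTML_MP_SUBSET, "fieldset")` answers the real question directly.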

> "Another serialization for XHTML" that defines a
> processing model (maybe based on real SGML)

If it was based on real SGML then it wouldn't be useful for processing
the existing text/html web corpus, and it wouldn't be implemented by
popular user agents.

> This way we can allow modularization and extensibility (since content
> model and start/end tag handling are not defined in the same
> specification), although we may need to get DTDs back.

How does defining the content models of elements in one document and the
implied opening/closing tags of the same elements in another document
allow "modularization and extensibility" in a way that defining them in
the same document does not?

> All the rest of HTML5 (DOM3 HTML - Scripting Execution Contexts -
> Persistent Storage API - Advanced Network Communications API) are in
> scope of WebApplications WG, and many people inside and outside the HTML
> WG would rather have them separated.

Many people certainly would. Indeed HTML WG/WHATWG are taking steps to
spin some of HTML5's tangle out into other specifications. For example,
Hixie has submitted Web Sockets as an IETF Internet-Draft:

http://bgp.potaroo.net/ietf/all-ids/draft-hixie-thewebsocketprotocol-01.txt

However, what is needed is not people saying these components should be
in separate documents but people capable and willing to edit those
documents. It's not like people are very actively editing the Web
Applications drafts at the moment.

> In conclusion, the only way to get XHTML widely deployed and implemented
> (and thus to reach PR) is to get rid of HTML5. Start working immediately
> at the XHTML WebApps module, before it's too late.

This still seems like a proposal to turn XHTML 2 _into_ HTML5, without
any attempt to gauge whether that meets the stated goals of XHTML 2. In
particular, the emphasis on web applications seems at odds with XHTML
2's emphasis on documents.

--
Benjamin Hawkes-Lewis


Re: HTML5 and XHTML2 combined (a new approach)

Brett Patterson-5
> I've yet to be persuaded that modularization of XHTML actually works or that, if it did work, it would make the Web better.

http://www.w3.org/TR/xhtml-modularization/introduction.html#s_intro_whatisxhtml

--
Brett P.



Re: HTML5 and XHTML2 combined (a new approach)

Brett Patterson-5
I do not know if I am reading this wrong, but:

http://www.w3.org/MarkUp/2000/Charter
http://www.w3.org/2002/05/html/charter

According to these, the group's only true goal, "To fulfill the promise of XML for applying XHTML to a wide variety of platforms. To assist W3C's leadership role to support rich Web contents that combine XHTML with other W3C's work on areas such as math, scalable vector graphics, synchronized multimedia, and forms.", seems to remain intact from the earlier charter.

And the scope of the charter is to combine all the languages and drop HTML: the later of the two links covers the transition from HTML to XHTML, and the earlier of the two links covers the transition from XHTML to XML.

At least that's how I read it.

--
Brett P.



Re: HTML5 and XHTML2 combined (a new approach)

bhawkeslewis

On 22/1/09 13:08, Brett Patterson wrote:
> I do not know if I am reading this wrong, but:
>
> http://www.w3.org/MarkUp/2000/Charter
> http://www.w3.org/2002/05/html/charter

Those charters are of purely historical interest, as they've been replaced:

HTML WG: http://www.w3.org/2007/03/HTML-WG-charter.html

XHTML 2 WG: http://www.w3.org/2007/03/XHTML2-WG-charter

--
Benjamin Hawkes-Lewis


Re: HTML5 and XHTML2 combined (a new approach)

Brett Patterson-5
Okay, I see. So, may I ask, what was the real reason for dropping the goals of the past working groups? Such as what is stated in the Mission Statement?

--
Brett P.



Re: HTML5 and XHTML2 combined (a new approach)

bhawkeslewis

On 22/1/09 14:51, Brett Patterson wrote:
> Okay, I see. So, may I ask, what was the real reason for dropping the
> goals of the past working groups? Such as what is stated in the Mission
> Statement?

No better person to explain the creation of the new HTML WG than Tim
Berners-Lee:

http://dig.csail.mit.edu/breadcrumbs/node/166

--
Benjamin Hawkes-Lewis


Re: HTML5 and XHTML2 combined (a new approach)

Mark Birbeck-4

Hi Benjamin,

I think you make some fair points.

However, I don't agree that modularisation is of no value, since not
only is it being used by various language designers for a variety of
uses, but it also opens up the possibility that different interest
groups can work on different parts of the 'larger' vision.

For example, the organisations and individuals involved with XForms
are very different to those involved with XHTML 2, yet XForms began as
an XHTML module. Nowadays though, there are implementations that use
HTML, XHTML 1.x, and of course XHTML 2 as the host language for
XForms, mainly because it was modularised.

An even more striking example is RDFa; it also began life as an XHTML
2 module, yet is now a richer product for having been worked on by a
combination of people from the HTML, XHTML, Microformats and semantic
web communities.

But I do agree with you that the ability to modularise is not the most
significant part of XHTML 2; I think there are other far more powerful
advantages.

In my view, the most important aspect of XHTML 2, and where it breaks
new ground, is that it makes author extensibility a central design
goal, rather than an accidental add-on.

I saw some code yesterday where a site was using the @rel attribute to
hold some JavaScript object definitions:

  <a href="..." rel="{a :b;}">...</a>

And why shouldn't they? If you want to use declarative constructs, or
unobtrusive techniques, you are pretty much forced to do this kind of
thing -- or use other attributes such as @class, with convoluted
formats -- unless you go off and invent new markup.

The early work on XHTML 2 looked again at how a markup language can
provide well-defined points that authors, and the organisations they
work with, can use to enrich their documents -- but without breaking
the markup.

XHTML 2 provides two key extension points, in the form of the role
attribute (now implemented in both Firefox and IE8), and RDFa
(supported by Yahoo!'s SearchMonkey, Creative Commons, a number of UK
government sites, and even being added to Drupal core).

I think what many people don't realise in these discussions about
languages is that this is essentially the core of the difference
between HTML5 and XHTML 2 -- HTML5 is about a monolithic spec,
individually authored, making a *virtue* out of having no extension
points, whilst XHTML 2 is about bending over backwards to add
extension points to the language *itself*, so that authors and
organisations are not beholden to the standards process.

This last point about the standards process is rather ironic, since
when the WHATWG came onto the scene it had some healthy -- and to my
mind reasonable -- criticisms of the W3C process, yet far from
replacing that process with anything new, it now seems to be even more
restrictive.

Anyway, yesterday I happened to blog in a lot more detail about the
extension points that RDFa and @role bring to markup languages,
explaining at the same time why the very act of providing extension
points runs counter to the HTML5 approach to language design. The post
is here, for anyone interested in this topic:

  <http://webbackplane.com/mark-birbeck/blog/2009/01/rdfa-means-extensibility>

Regards,

Mark




--
Mark Birbeck, webBackplane

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)


Re: HTML5 and XHTML2 combined (a new approach)

Ian Hickson

On Thu, 22 Jan 2009, Mark Birbeck wrote:
>
> HTML5 is about a monolithic spec, individually authored, making a
> *virtue* out of having no extension points

HTML5 now consists of at least eight specs:

   http://dev.w3.org/html5/spec/
   http://dev.w3.org/html5/workers/
   http://dev.w3.org/html5/websockets/
   http://dev.w3.org/2006/webapi/XMLHttpRequest/
   http://dev.w3.org/2006/webapi/selectors-api/
   http://www.ietf.org/internet-drafts/draft-abarth-mime-sniff-00.txt
   http://www.ietf.org/internet-drafts/draft-abarth-origin-00.txt
   http://www.ietf.org/internet-drafts/draft-hixie-thewebsocketprotocol-01.txt

...with six credited authors and over 260 acknowledged contributors,
and with at least seven different extension mechanisms:

 * class="" to extend elements
 * data-*="" for script annotations
 * <meta name="" content=""> for document-level name/value pairs
 * rel="" for extending link semantics
 * <script type=""> for embedding data blobs
 * <embed> for adding new native-code-implemented features
 * JS prototyping for extending the APIs
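
As a rough illustration of the first four items, here is a small Python sketch that collects those author-defined values from a document using the standard library's `html.parser`. The `ExtensionPointCollector` class, the `collect` helper and the sample markup are hypothetical; this is not how any browser processes these attributes.

```python
from html.parser import HTMLParser


class ExtensionPointCollector(HTMLParser):
    """Gather class names, data-* attributes, meta pairs and rel values."""

    def __init__(self):
        super().__init__()
        self.found = {"class": [], "data": {}, "meta": {}, "rel": []}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # values may be None for valueless attributes
        self.found["class"].extend((attrs.get("class") or "").split())
        for name, value in attrs.items():
            if name.startswith("data-"):
                self.found["data"][name] = value
        if tag == "meta" and attrs.get("name"):
            self.found["meta"][attrs["name"]] = attrs.get("content")
        if tag in ("a", "link"):
            self.found["rel"].extend((attrs.get("rel") or "").split())


def collect(markup):
    """Parse markup and return the collected extension-point values."""
    collector = ExtensionPointCollector()
    collector.feed(markup)
    collector.close()
    return collector.found


SAMPLE = (
    '<meta name="generator" content="example">'
    '<a href="/next" rel="next license" class="nav big" '
    'data-track-id="42">next</a>'
)
```

Calling `collect(SAMPLE)` gathers the class names, the `data-track-id` value, the `generator` meta pair and the two rel keywords from the sample.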
 
--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: HTML5 and XHTML2 combined (a new approach)

Giovanni Campagna
Actually, only http://dev.w3.org/html5/spec/ can be considered HTML5 (a vocabulary and associated API for (X)HTML): the others are neither dependent on nor related to HTML (the markup language), although they are implemented together, and they are in the scope of the Web Applications WG (and are developed there).

Even the fact that you consider WebWorkers or XMLHttpRequest (or even the WebSocket protocol) as part of HTML5 is against modularity and extensibility: they're independent technologies with different use cases, conformance requirements and designs. Why shouldn't I be able to use WebSocket from a C++ application? Or use XMLHttpRequest with image/png (XHR2 of course)? Or implement the Selectors API in GNOME's LibXML2?

Giovanni


Re: HTML5 and XHTML2 combined (a new approach)

Ian Hickson

On Thu, 22 Jan 2009, Giovanni Campagna wrote:
>
> Actually, only http://dev.w3.org/html5/spec/ can be considered HTML5 (a
> vocabulary and associated api for (X)HTML): the other are not dependent
> or related to HTML (the markup language), although implemented together,
> and are in scope of Web Applications WG (and are developed there).

Well by that definition, any spec is "monolithic", then, so Mark's slur
is meaningless.


> Even the fact that you consider WebWorkers or XMLHttpRequest (or even
> the WebSocket protocol) as part of HTML5 is against modularity and
> extensibility: they're independent technologies with different use
> cases, conformance requirements and designs. Why shouldn't I be able to
> use WebSocket from a C++ application? Or use XMLHttpRequest with
> image/png (XHR2 of course)? Or implement Selectors API in a Gnome's
> LibXML2?

You can use HTML5 from a C++ application (most Web browsers do!). And the
HTML5 APIs and features (e.g. <iframe>) can be used with image/png. And
you could implement HTML5's getElementsByClassName() feature in libxml2
just like the Selectors API.

The specs I mentioned are all specs that have been extracted from HTML5's
core document, illustrating that the HTML5 effort has been very willing to
split things up and spread the language to multiple working groups,
organisations, and editors, as appropriate.

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: HTML5 and XHTML2 combined (a new approach)

Giovanni Campagna


2009/1/22 Ian Hickson <[hidden email]>
On Thu, 22 Jan 2009, Giovanni Campagna wrote:
>
> Actually, only http://dev.w3.org/html5/spec/ can be considered HTML5 (a
> vocabulary and associated api for (X)HTML): the other are not dependent
> or related to HTML (the markup language), although implemented together,
> and are in scope of Web Applications WG (and are developed there).

Well by that definition, any spec is "monolithic", then, so Mark's slur
is meaningless.

No: there are "monolithic" specs, that try to define everything "from producer to consumer", and spec that address very specific use case and technologies. HTML5 is "monolithic" in that sense, behavior in the middle of web page processing, generation, etc. is very difficult to modify. XHTML2 is "modularized": a collection of "specific" specs (each XHTML module could be considered an independent specification, besides of W3C Process Document) that integrate together.

> > Even the fact that you consider WebWorkers or XMLHttpRequest (or even
> > the WebSocket protocol) as part of HTML5 is against modularity and
> > extensibility: they're independent technologies with different use
> > cases, conformance requirements and designs. Why shouldn't I be able to
> > use WebSocket from a C++ application? Or use XMLHttpRequest with
> > image/png (XHR2 of course)? Or implement Selectors API in a Gnome's
> > LibXML2?
>
> You can use HTML5 from a C++ application (most Web browsers do!). And the
> HTML5 APIs and features (e.g. <iframe>) can be used with image/png. And
> you could implement HTML5's getElementsByClassName() feature in libxml2
> just like the Selectors API.

You (I hope not willfully) ignored my point: can I use what you called "HTML5 related" technologies without implementing HTML5? Can I use WebSockets with carefully crafted byte streams and the POSIX sockets API? Can I use XMLHttpRequest to fetch an image/png (or any other type of data), ignoring the Content-Type sniffing (because both Apache and IIS automatically add a Content-Type for static content, even if just application/octet-stream), the text/html parsing, the responseXML implementing the HTMLDocument interface, etc.? Can I put a method called "querySelectorsAll", which processes a CSS selector and returns a DOM NodeList, inside libxml2, in an application that has nothing to do with HTML (or the web in general) but happens to like Selectors more than XPath?
Yes, of course I can. This is modularization.


Giovanni


Re: HTML5 and XHTML2 combined (a new approach)

Ian Hickson

On Thu, 22 Jan 2009, Giovanni Campagna wrote:

> 2009/1/22 Ian Hickson <[hidden email]>
> > On Thu, 22 Jan 2009, Giovanni Campagna wrote:
> > >
> > > Actually, only http://dev.w3.org/html5/spec/ can be considered HTML5
> > > (a vocabulary and associated api for (X)HTML): the other are not
> > > dependent or related to HTML (the markup language), although
> > > implemented together, and are in scope of Web Applications WG (and
> > > are developed there).
> >
> > Well by that definition, any spec is "monolithic", then, so Mark's
> > slur is meaningless.
>
> No: there are "monolithic" specs that try to define everything "from
> producer to consumer", and specs that address very specific use cases
> and technologies. HTML5 is "monolithic" in that sense: behavior in the
> middle of web page processing, generation, etc. is very difficult to
> modify. XHTML2 is "modularized": a collection of "specific" specs (each
> XHTML module could be considered an independent specification, W3C
> Process Document aside) that integrate together.

If by this you mean that the XHTML2 specification doesn't define
processing rules for its language, then I agree. But that's a bug, not a
feature. We need detailed conformance requirements so that we can have
uniformity of implementations.

You could consider each HTML5 chapter or section to be its own independent
specification, if you like. Indeed the WebSocket API spec and the
WebSocket protocol spec are both generated from the HTML5 document's
source file, so they literally are sections of HTML5.


> > > Even the fact that you consider WebWorkers or XMLHttpRequest (or
> > > even the WebSocket protocol) as part of HTML5 is against modularity
> > > and extensibility: they're independent technologies with different
> > > use cases, conformance requirements and designs. Why shouldn't I be
> > > able to use WebSocket from a C++ application? Or use XMLHttpRequest
> > > with image/png (XHR2 of course)? Or implement Selectors API in a
> > > Gnome's LibXML2?
> >
> > You can use HTML5 from a C++ application (most Web browsers do!). And
> > the HTML5 APIs and features (e.g. <iframe>) can be used with
> > image/png. And you could implement HTML5's getElementsByClassName()
> > feature in libxml2 just like the Selectors API.
>
> You (I hope not willfully) ignored my point: can I use what you called
> "HTML5 related" technologies without implementing HTML5?

I don't know what you mean by "HTML5" in that sentence. The Web platform
as a whole -- from the text/html serialisation to the Window object to the
XHR constructor to the value of the Origin header and everything in
between and around it -- are all interrelated in various ways, sometimes
closely, sometimes loosely. There are parts of the HTML5 spec that don't
depend on other parts of the HTML5 spec, and there are parts of Web
Workers that depend on parts of the HTML5 spec. The whole caboodle is
designed from the ground up to be implementable piecemeal, because that's
how the Web grew.


> Can I use WebSockets with carefully crafted byte streams and the POSIX
> sockets API? Can I use XMLHttpRequest to fetch an image/png (or any
> other type of data), ignoring the Content-Type sniffing (because both
> Apache and IIS automatically add a Content-Type for static content,
> even if just application/octet-stream), the text/html parsing, the
> responseXML implementing the HTMLDocument interface, etc.? Can I put a
> method called "querySelectorsAll", which processes a CSS selector and
> returns a DOM NodeList, inside libxml2, in an application that has
> nothing to do with HTML (or the web in general) but happens to like
> Selectors more than XPath? Yes, of course I can. This is modularization.

If that's modularisation, then HTML5 is modular! You can use MessagePorts
without having a Window, you can use addEventSource() in SVG, you can use
the text/html parsing without having a SQL database, you can put a method
called getElementsByClassName() in libxml2...
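
The idea of a class-name lookup living in a generic XML library, outside any browser, can be sketched roughly as follows. This is only an illustration in Python's standard xml.etree.ElementTree, not libxml2 and not the HTML5 algorithm itself; the function name get_elements_by_class_name and the inventory document are made up for the example, and it assumes class names are whitespace-separated tokens in a "class" attribute.

```python
import xml.etree.ElementTree as ET


def get_elements_by_class_name(root, class_names):
    """Return elements (root included) carrying every requested class."""
    wanted = set(class_names.split())
    if not wanted:
        return []
    matches = []
    # iter() walks the tree in document order, starting at root itself.
    for element in root.iter():
        classes = set(element.get("class", "").split())
        if wanted <= classes:
            matches.append(element)
    return matches


# A document that has nothing to do with HTML (hypothetical vocabulary).
doc = ET.fromstring(
    "<inventory>"
    '<item class="fragile heavy">piano</item>'
    '<item class="fragile">vase</item>'
    '<crate class="heavy">anvils</crate>'
    "</inventory>"
)

fragile = [e.text for e in get_elements_by_class_name(doc, "fragile")]
heavy_fragile = [e.text for e in get_elements_by_class_name(doc, "heavy fragile")]
```

Nothing in the sketch depends on a Window object, a text/html parser, or any other part of a browser, which is the sense in which both sides of the thread call such a feature separable.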

I doubt that's what Mark meant, though maybe he could clarify.

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Fwd: HTML5 and XHTML2 combined (a new approach)

Giovanni Campagna
In reply to this post by bhawkeslewis
2009/1/22 Benjamin Hawkes-Lewis <[hidden email]>
On 22/1/09 18:55, Giovanni Campagna wrote:

   How does modularization of XHTML help content producers, user agent
   developers, or end-users in practice?


It helps W3C WGs in defining new languages using previous content
(XHTML3 won't need a Structure Module, HTML6 will).

If XHTML3 resists the urge to change how elements and attributes in a namespace behave in the way that XHTML2 is changing how elements behave from XHTML1 then that might be true, though it's not modularization that facilitates that but rather refusing to change the specified behaviour of elements and attributes in a given namespace.

However, W3C WGs are not part of the set "content producers, user agent developers, or end-users": they are specification creators.


It helps the
implementors, because all modules have strict defined dependencies, so
changes don't affect other modules.

I'm not an implementor. But the impression I have from talking to implementors is that formal distinctions between modules by spec writers make little practical difference to how difficult it is to introduce changes such as adding hyperlinking capability to every element.

If I want to implement XHTML Lists, I don't need to implement XHTML Metainformation Attributes. This is modularization.
Btw, hyperlinking can be "easily" achieved in Gecko using currently available technology (XBL1.0), without adding any feature to the browser.
 

But what implementors do you think modularization has helped in this way?


   How would modularization of XHTML help achieve the stated goals of
   the HTML WG?

   I've yet to be persuaded that modularization of XHTML actually works
   or that, if it did work, it would make the Web better.


The point is: what is modularization supposed to do? It is supposed to
create an extensibility framework for future enhancements of XHTML.

Hmm.

http://www.w3.org/TR/xhtml-modularization/introduction.html#s_intro_xhtml_mods

states two goals:

1. Enable user agent developers to specify what bits of XHTML they support so that content producers can deliver suitable content. It's already failed to achieve that goal as far as I can see (witness OMA's XHTML Mobile Profile, never mind the tattered variation of real-world implementations that don't in the least conform to neat modules). Does anyone accurately declare support in terms of modules not features? Does anyone deliver content that matches modules not features?

2. Extend XHTML's "layout and presentation capabilities". This is a rather odd goal, given the existence of SVG, SMIL, and most importantly CSS (and it seems directly counter to the design aims of XHTML2 which mention removing the layout and presentation capabilities of XHTML!). I'm not really clear how modularization would help with adding layout and presentational features to XHTML anyhow.


   The core idea of modularization, as far as I can gather, was that
   user agent developers could declare the modules they support and
   that content producers could match their content to that support
   profile:

   http://www.w3.org/TR/1999/WD-html-in-xml-19990224/#mods


Yes. More important, if UA get markup they cannot handle, you're quite
sure they'll process only features in supported modules and ignore the
rest, without breaking it all.

That's more the result of a process for handling unrecognized features than XHTML modularization:

   * insert unknown attributes into the DOM
   * insert unknown elements into the DOM
   * apply CSS to unknown elements
   * render content of unknown elements

No. The DOM (and DOM-dependent models, like the CSS rendering model or the XPath data model) is not created after XHTML processing; it is created after Infoset generation (that is, pure XML parsing). You don't need to know about XHTML to successfully build a DOM (but you need to know about HTML and its extended interfaces, quirks, etc.)
 

That works just as well (or badly) for new XML elements that are not part of modules, like the "canvas" element from HTML5.


   Whether this sort of fragmentation of the Web is remotely desirable
   is open to debate; personally I want access to _the_ Web on my phone
   (and my TV and my fridge and my cyborg cat), not lots of little
   walled garden webs.

The fact is that you cannot, because not all UAs can support (or even
find it useful to support) the same things:

I think that's basically true of purely presentational and aesthetic aspects and basically false when it comes to communicating information or providing user interactions. No fragmentation of the web is required for the latter; the same information and interactivity can be restyled as required with CSS or XBL2.

There are some UAs that are not designed for interactivity at all (e.g. a printer), but I'm not persuaded the world is crying out for alternate content for such UAs or that their needs couldn't be mostly met by the existing mechanisms (stylesheets) or by modifying content on their end (with automation, transformations, scripts, and stylesheets).

Being able to author content and interactivity usable on multiple devices is one of the express goals of XHTML2:

"More device independence: new devices coming online, such as telephones, PDAs, tablets, televisions and so on mean that it is imperative to have a design that allows you to author once and render in different ways on different devices, rather than authoring new versions of the document for each type of device."

http://www.w3.org/MarkUp/2009/ED-xhtml2-20090121/introduction.html#aims

It's also a goal for HTML5, of course:

http://www.w3.org/TR/html-design-principles/#media-independence


that's why some modern technologies
defined different conformance levels (profiles).

Why would listing supported features (e.g. elements and attributes) be insufficient for such purposes? Seems to work fine in practice:

http://www.opera.com/docs/specs/presto211/

And does grouping features into XHTML modules actually allow defining profiles? For example, if I were a developer of the text browser ELinks, how would I report that ELinks supports the "color" attribute but not the "face" attribute? They're both in the same module ("Legacy").
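
The granularity problem in the ELinks question can be made concrete with a small sketch. It assumes a deliberately reduced, hypothetical module table that holds only the two attributes named above, and it takes the claim that ELinks supports "color" but not "face" from this message rather than from any ELinks documentation.

```python
# Reduced, illustrative module table: only the two attributes named above.
MODULES = {"Legacy": {"color", "face"}}


def fully_supported_modules(supported_features):
    """Modules a user agent could honestly declare: every feature present."""
    return {
        name
        for name, features in MODULES.items()
        if features <= supported_features
    }


def features_implied_by(declared_modules):
    """Everything a content producer would assume from a module declaration."""
    implied = set()
    for name in declared_modules:
        implied |= MODULES[name]
    return implied


elinks_features = {"color"}  # as stated in the message: color yes, face no

honest_declaration = fully_supported_modules(elinks_features)
overclaim = features_implied_by({"Legacy"}) - elinks_features
```

Under these assumptions the honest module-level declaration is empty, which hides the working "color" support, while declaring "Legacy" anyway overclaims "face". A plain feature list carries neither distortion.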


What is the use of XForms on a TV not connected to the telephone?

I'm not really clear how telephone connectivity is relevant?

At least in Italy, when watching Digital TV, you need a telephone connection to send data back.

TV sets were one of the devices originally envisaged as using XForms:

http://www.w3.org/MarkUp/Forms/2000/Charter.html


Or images and image maps in Lynx?

Lynx works fine with images. You can view the alternative text with the mechanism provided by HTML 4.01, download them, manipulate them, and open them in your client of choice. Lynx works fine with clientside image maps because HTML 4.01 provides a mechanism for providing text equivalents for your navigation. Text alternative interfaces can be provided for many of the potential uses of serverside image maps too. (Look at T. V. Raman's work with an Emacspeak interface to Google Maps, for example.)

Yes, but it doesn't work with image maps (and never will). Similarly, printers will never work with scripting.
 


Or audio in a purring cyborg cat?

I want to play online audio books from Gutenberg on _my_ cyborg cat. You can do what you want with yours. ;)

I don't know what a cyborg cat is; I thought it was just a rhetorical invention (and cats usually don't read books aloud).
 

Anyhow, I don't think content authors trying to author documents matching different groups of XHTML modules is a good approach to catering towards differing user agents compared to approaches like:

  * Media types.
  * Media-independent events (e.g. DOMActivate).
  * Text alternatives.
  * Stylesheets.
  * XBL2 layers on top of essential content and functionality.
  * Providing APIs to get and post data.
  * KISS.
 
If I don't use Scripting, why should I care of Dom / Scripting execution context (window object) / script elements?
If I don't use XBL, why should I care of XBL elements and PIs?


That's a fault of OMA, not modularization idea.

Is it? Are you blaming the user without assessing whether XHTML modularization met their needs? What's wrong with specifying what features you support, rather than what groups of features?
 

Which features of "real" HTML does a correctly set up SGML application
not support?

Well for one thing, an SGML compliant processor would have to interpret "<br />" in HTML 4.01 documents according to HTML 4.01's SGML declaration (that is, as equivalent to "<br>&gt;") - sprinkling the web corpus with misplaced "greater than" signs. Being incompatible with the Web in that way is not viable for software attempting to compete with rival browsers or search engines.

No: with SGML set up correctly, <br/ becomes a NET-enabling start tag, and > its corresponding end tag. Don't forget that XML can be expressed in SGML, but without the need for a DTD.
 

   How does defining the content models of elements in one document and
   the implied opening/closing tags of the same elements in another
   document allow "modularization and extensibility" in a way that
   defining them in same document does not?


You define "Another serialization of XHTML" in one specification, then
define "SGML serialization of SVG" containing only handling for
opening/closing SVG tags. The same for MathML, RDF/XML, OWL, InkML,
XSL:FO: any language that CDF WG will find useful to serialize without
XML draconian error handling. You could even think of "Namespaces in SGML".

Hmm. This doesn't really answer my question; how does putting these serialization definitions into separate documents allow additional "modularization and extensibility"? (I don't have any particular bias against the proliferation of technical documents - I just don't see any necessary connection between this and allowing modularization or allowing extensibility.)

Documents after REC cannot be modified other than through errata. If you need a new language, you need a new spec. Better to define a new spec only for that language than for all languages implemented so far, isn't it?
 

XHTML2 is not specifically targeted to documents

Except that the draft states it is, whereas the HTML5 draft expressly pitches itself at web applications too.


and even if it was,
we could extend it to web applications, and this only because it is
modularized.

I disagree with the "only". You could also "extend it" by just adding features, just like all the other languages that are extended without anything analogous to XHTML modularization. HTML WG is adding features to HTML, yet HTML isn't modularized.

--
Benjamin Hawkes-Lewis

Yes, the fact is that they needed HTML5, a completely new and huge specification, to add a few new features (on the core vocabulary side): video - audio - canvas - datalist - section - etc.
If HTML4 had been modularized, HTML5 would have used HTML4's table module, text module (b-i-span-strong..), metainformation module (html-head-body-meta) etc.

Giovanni

Re: Fwd: HTML5 and XHTML2 combined (a new approach)

Ian Hickson

On Thu, 22 Jan 2009, Giovanni Campagna wrote:
>
> If HTML4 had been modularized, HTML5 would have used HTML4's table
> module, text module (b-i-span-strong..), metainformation module
> (html-head-body-meta) etc.

We actually tried that (using XHTML Modularisation in Web Forms 2), but
browser vendors indicated no interest whatsoever in using this, and in
practice it was more of a hindrance than a help to the specification
process. I agree with Benjamin's assessment in this e-mail:

   http://lists.w3.org/Archives/Public/www-html/2009Jan/0042.html

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: Fwd: HTML5 and XHTML2 combined (a new approach)

David Woolley (E.L)
In reply to this post by Giovanni Campagna

Giovanni Campagna wrote:

> Yes, the fact is that they needed HTML5, a completely new and huge
> specification, to add a few new features (on the core vocabulary side):
> video - audio - canvas - datalist - section - etc.
> If HTML4 had been modularized, HTML5 would have used HTML4's table
> module, text module (b-i-span-strong..), metainformation module
> (html-head-body-meta) etc.

I think a single monolithic specification is more fundamental to HTML5
than that.  Their primary market is people who require that all browsers
behave exactly alike, in particular produce the same displayed result,
which means that sub-setting is strongly discouraged.  Their market is
also assumed not to know what is valid HTML, so can't be expected to
know what is in a particular module.

--
David Woolley
Emails are not formal business letters, whatever businesses may want.
RFC1855 says there should be an address here, but, in a world of spam,
that is no longer good advice, as archive address hiding may not work.


Re: Fwd: HTML5 and XHTML2 combined (a new approach)

bhawkeslewis
In reply to this post by Giovanni Campagna

On 22/1/09 21:48, Giovanni Campagna wrote:

>         It helps the
>         implementors, because all modules have strict defined
>         dependencies, so
>         changes don't affect other modules.
>
>
>     I'm not an implementor. But the impression I have from talking to
>     implementors is that formal distinctions between modules by spec
>     writers make little practical difference to how difficult it is to
>     introduce changes such as adding hyperlinking capability to every
>     element.
>
> If I want to implement XHTML Lists, I don't need to implement XHTML
> Metainformation Attributes. This is modularization.

But we were discussing "changes" to existing implementations, not
selective implementation. If you have an implementation including Lists
and you _do_ implement Metainformation Attributes, you will likely need
to revisit the code implementing Lists, which is my point.

> Btw, hyperlinking can be "easily" achieved in Gecko using currently
> available technology (XBL1.0), without adding any feature to the browser.

href-on-any-element is a feature and would require code changes, even in
Gecko - never mind in user agents that don't implement XBL.

>         Yes. More important, if UA get markup they cannot handle, you're
>         quite
>         sure they'll process only features in supported modules and
>         ignore the
>         rest, without breaking it all.
>
>
>     That's more the result of a process for handling unrecognized
>     features than XHTML modularization:
>
>     * insert unknown attributes into the DOM
>     * insert unknown elements into the DOM
>     * apply CSS to unknown elements
>     * render content of unknown elements
>
> No. DOM (and DOM depending models, like CSS rendering model or XPath
> data model) is not created after XHTML processing, it is created after
> Infoset Generation (that is, pure XML parsing). You don't need to know
> about XHTML to successfully build a DOM (but you need to know about HTML
> and its extended interfaces, quirks, etc.)

Sorry, I can't really follow what you're saying here, or how it's a
reply to what I said.

>     Lynx works fine with
>     clientside image maps because HTML 4.01 provides a mechanism for
>     providing text equivalents for your navigation. Text alternative
>     interfaces can be provided for many of the potential uses of
>     serverside image maps too. (Look at T. V. Raman's work with an
>     Emacspeak interface to Google Maps, for example.)
>
> Yes, but it doesn't work with image maps (and will never work).

A text browser's lack of useful support for serverside image maps is
just one of many reasons why depending on serverside image maps is a bad
idea. Of course, browsers can support the feature as required by the
HTML 4.01 specification - submitting "0,0" if the user cannot select
particular coordinates.
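
The "0,0" fallback can be sketched as below. This assumes the HTML 4.01 description of server-side image maps, under which the user agent appends "?" and the comma-separated x,y coordinates to the link's href; the function name ismap_uri and the example path are hypothetical.

```python
def ismap_uri(href, coordinates=None):
    """Build the URI a user agent requests for a server-side image map.

    coordinates is an (x, y) tuple, or None when the user agent gives the
    user no way to pick a point (a text browser, for instance).
    """
    x, y = coordinates if coordinates is not None else (0, 0)
    return f"{href}?{x},{y}"


# A graphical browser reporting a click, and a text browser falling back.
clicked = ismap_uri("/cgi-bin/navbar.map", (10, 27))
fallback = ismap_uri("/cgi-bin/navbar.map")
```

The fallback keeps the link usable, but the server cannot tell a deliberate click at the origin from "no coordinates available", which is part of why relying on server-side maps is discouraged above.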

As I explained, clientside image maps work fine.

Typically, XHTML modularization fails to allow user agents to declare
support for one and not the other:

http://www.w3.org/TR/xhtml2/mod-csImgMap.html#s_csImgMapmodule

> Similarly printers will never work with scripting

I'm not sure about that. It seems to me a printer could download a
(X)HTML resource, apply included scripts to it, and print the result.
(Granted, you'd need to decide _when_ to print - e.g. what to do with
timer functions.)

Rather amusingly, the actual attempt to define a profile of XHTML for
printers (XHTML-Print) actually includes the Scripting module! Not
because conforming agents were to execute scripts (on the contrary, they
were ultimately required not to do so), but because the Scripting module
includes the "noscript" element and conforming agents were required to
show that alternative content instead:

http://www.w3.org/TR/xhtml-print/#s3.12

This is thus yet another example of how XHTML modules are insufficiently
fine-grained to express critical differences between implementations.
The only thing that clarified what the Scripting module's inclusion
meant was additional specification text:

http://www.w3.org/MarkUp/2006/xhtml-print-pr-doc.html#ssec1

> If I don't use Scripting, why should I care of Dom / Scripting execution
> context (window object) / script elements?

There are plenty of browsers that don't implement scripting and plenty
of users that disable it, with the result that scripts on a page are not
executed. Authors who want their content/functionality to work in those
scenarios don't make their content/functionality rely on scripts. XHTML
modularization doesn't help there (as we've seen above, it doesn't even
let user agent developers declare they don't execute scripts!);
progressive enhancement/unobtrusive scripting does.

 > If I don't use XBL, why should I care of XBL elements and PIs?

What's the relevance of this to XHTML modularization? XBL is a different
language.

>     Well for one thing, an SGML compliant processor would have to
>     interpret "<br />" in HTML 4.01 documents according to HTML 4.01's
>     SGML declaration (that is, as equivalent to "<br>&gt;") - sprinkling
>     the web corpus with misplaced "greater than" signs. Being
>     incompatible with the Web in that way is not viable for software
>     attempting to compete with rival browsers or search engines.
>
> No: with SGML set up correctly, <br/ becomes a NET-enabling start tag,
> and > its corresponding end tag.

When the HTML4 SGML declaration is applied to a document validating as
HTML 4.01 containing "<br />", "<br /" is the null-end start tag. Since
the DTDs define BR as "EMPTY", the end tag /must/ be omitted. Therefore
the subsequent > cannot be parsed as the element's end tag.

See also:

http://www.cs.tut.fi/~jkorpela/html/empty.html

http://www.is-thought.co.uk/book/sgml-9.htm

http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.3.7
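
The reading described above can be mimicked with a toy rewrite. This is not an SGML parser: it is a regular-expression sketch that only handles attribute-less tags for an illustrative subset of EMPTY elements, and the name shorttag_reading is invented for the example.

```python
import re

# Illustrative subset of elements that the HTML 4.01 DTDs declare EMPTY.
EMPTY_ELEMENTS = ("br", "hr")

# Matches only attribute-less tags such as "<br />" or "<hr/>".
_NET_TAG = re.compile(r"<(%s)\s*/>" % "|".join(EMPTY_ELEMENTS), re.IGNORECASE)


def shorttag_reading(markup):
    """Rewrite '<br />' the way the paragraph above says SGML reads it."""
    # "<br /" already closes the start tag, and the element is EMPTY, so the
    # ">" that follows is ordinary character data, written here as "&gt;".
    return _NET_TAG.sub(r"<\1>&gt;", markup)


reading = shorttag_reading("first line<br />second line")
```

Applied to a whole page, every XHTML-style empty tag would leave a stray "greater than" sign in the text, which is the incompatibility being argued about.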

Now *perhaps* it would be legitimate to require HTML processors to use a
different SGML declaration even with documents that validate to HTML
4.01 DTDs and perhaps a different SGML declaration could define "<br />"
as equivalent to "<br>"? At any rate, this doesn't seem much more of a
departure than requiring HTML processors to use completely new
processing rules. The full requirements of acceptably processing the tag
soup web corpus are so tangled, however, that it is not obvious that
_any_ SGML declaration could express them and it seems likely that even
if it did, it would allow syntax to be validated even when it is
non-conforming and broken - which is indeed one of the problems the
original attempt to apply SGML to HTML ran into:

http://www.w3.org/TR/REC-html40/sgml/intro.html#h-19.1

>     Hmm. This doesn't really answer my question; how does putting these
>     serialization definitions into separate documents allow additional
>     "modularization and extensibility"? (I don't have any particular
>     bias against the proliferation of technical documents - I just don't
>     see any necessary connection between this and allowing
>     modularization or allowing extensibility.)
>
> Documents after REC cannot be modified other than errata-ed. If you need
> a new language, you need a new spec. Better to define a new spec only
> for that language than for all languages implemented so far, isn't it?

How do ten new RECs allow more "modularization and extensibility" than
one new REC?

> Yes, the fact is that they needed HTML5, a completely new and huge
> specification, to add a few new features (on the core vocabulary side):
> video - audio - canvas - datalist - section - etc.
> If HTML4 had been modularized, HTML5 would have used HTML4's table
> module, text module (b-i-span-strong..), metainformation module
> (html-head-body-meta) etc.

Some of HTML5 defines processing for existing features that was
undefined in previous specifications (e.g. munging "image" to "img").

Some of HTML5 changes processing for existing features (e.g. the
algorithm for associating table cells and table headers)

So I cannot agree that if HTML4 had defined image and table features in
formally separate "modules", HTML5 as it stands could have merely reused
them.

--
Benjamin Hawkes-Lewis


Re: Fwd: HTML5 and XHTML2 combined (a new approach)

Giovanni Campagna


2009/1/24 Benjamin Hawkes-Lewis <[hidden email]>
On 22/1/09 21:48, Giovanni Campagna wrote:
       It helps the
       implementors, because all modules have strict defined
       dependencies, so
       changes don't affect other modules.


   I'm not an implementor. But the impression I have from talking to
   implementors is that formal distinctions between modules by spec
   writers make little practical difference to how difficult it is to
   introduce changes such as adding hyperlinking capability to every
   element.

If I want to implement XHTML Lists, I don't need to implement XHTML
Metainformation Attributes. This is modularization.

But we were discussing "changes" to existing implementations, not selective implementation. If you have an implementation including Lists and you _do_ implement Metainformation Attributes, you will likely need to revisit the code implementing Lists, which is my point.

Why? Changes to Metainformation Attributes don't change List features. Obviously, for the sake of performance, you could hardcode everything inline, but this is an implementation issue.
 


Btw, hyperlinking can be "easily" achieved in Gecko using currently
available technology (XBL1.0), without adding any feature to the browser.

href-on-any-element is a feature and would require code changes, even in Gecko - never mind in user agents that don't implement XBL.

Href-on-any-element in Gecko could be implemented with code changes or with XBL. This is, once again, an implementation issue. (Btw, XBL2 is a W3C Candidate Recommendation, so all UAs are asked to implement it, sooner or later.)
 

       Yes. More important, if UA get markup they cannot handle, you're
       quite
       sure they'll process only features in supported modules and
       ignore the
       rest, without breaking it all.


   That's more the result of a process for handling unrecognized
   features than XHTML modularization:

   * insert unknown attributes into the DOM
   * insert unknown elements into the DOM
   * apply CSS to unknown elements
   * render content of unknown elements

No. DOM (and DOM depending models, like CSS rendering model or XPath
data model) is not created after XHTML processing, it is created after
Infoset Generation (that is, pure XML parsing). You don't need to know
about XHTML to successfully build a DOM (but you need to know about HTML
and its extended interfaces, quirks, etc.)

Sorry, I can't really follow what you're saying here, or how it's a reply to what I said.

You said that XHTML needs to define how to handle unknown elements. I replied that building the Infoset and then the DOM, CSS rendering tree, XPath data model, etc. is not part of XHTML processing. The XML 1.0 and XML Information Set RECs define how to handle any element, with any attribute, in any namespace, provided the document is well-formed.
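
The claim that a generic XML parser builds a DOM without any knowledge of XHTML can be sketched with Python's standard xml.dom.minidom. The "gadget" element and the http://example.org/ns/unknown namespace are made up for the illustration; the parser is assumed to know nothing about either vocabulary.

```python
from xml.dom import minidom

UNKNOWN_NS = "http://example.org/ns/unknown"  # hypothetical namespace

SOURCE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>known</p>"
    f'<gadget xmlns="{UNKNOWN_NS}" mode="demo">unknown</gadget>'
    "</body></html>"
)

# minidom has no notion of XHTML modules; it only needs well-formed XML.
document = minidom.parseString(SOURCE)

gadget = document.getElementsByTagNameNS(UNKNOWN_NS, "gadget")[0]
gadget_mode = gadget.getAttribute("mode")
gadget_text = gadget.firstChild.data
parent_name = gadget.parentNode.localName
```

The unknown element, its attribute, and its text all end up in the tree. What the sketch does not show is rendering or behaviour, which is where the disagreement in this thread actually lies.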
 

   Lynx works fine with
   clientside image maps because HTML 4.01 provides a mechanism for
   providing text equivalents for your navigation. Text alternative
   interfaces can be provided for many of the potential uses of
   serverside image maps too. (Look at T. V. Raman's work with an
   Emacspeak interface to Google Maps, for example.)

Yes, but it doesn't work with image maps (and will never work).

A text browser's lack of useful support for serverside image maps is just one of many reasons why depending on serverside image maps is a bad idea. Of course, browsers can support the feature as required by the HTML 4.01 specification - submitting "0,0" if the user cannot select particular coordinates.

As I explained, clientside image maps work fine.

Typically, XHTML modularization fails to allow user agents to declare support for one and not the other:

http://www.w3.org/TR/xhtml2/mod-csImgMap.html#s_csImgMapmodule
 

Similarly printers will never work with scripting

I'm not sure about that. It seems to me a printer could download a (X)HTML resource, apply included scripts to it, and print the result. (Granted, you'd need to decide _when_ to print - e.g. what to do with timer functions.)

Rather amusingly, the actual attempt to define a profile of XHTML for printers (XHTML-Print) actually includes the Scripting module! Not because conforming agents were to execute scripts (on the contrary, they were ultimately required not to do so), but because the Scripting module includes the "noscript" element and conforming agents were required to show that alternative content instead:

http://www.w3.org/TR/xhtml-print/#s3.12
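
A minimal sketch of the behaviour described above, assuming a well-formed XHTML fragment and Python's standard xml.etree.ElementTree. The printable_text helper and the sample markup are hypothetical; the point is only that script content is skipped while "noscript" content is rendered like any other element:

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"


def printable_text(el):
    """Collect the text a non-scripting agent would render."""
    if el.tag == XHTML + "script":
        # Scripts are neither executed nor printed.
        return ""
    parts = [el.text or ""]
    for child in el:
        parts.append(printable_text(child))
        # Text following a child belongs to the parent and is kept.
        parts.append(child.tail or "")
    return "".join(parts)


# Hypothetical page fragment.
DOC = (
    '<body xmlns="http://www.w3.org/1999/xhtml">'
    "<script>draw()</script>"
    "<noscript><p>Static chart</p></noscript>"
    "<p>End</p>"
    "</body>"
)
rendered = printable_text(ET.fromstring(DOC))
print(rendered)
```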

This is thus yet another example of how XHTML modules are insufficiently fine-grained to express critical differences between implementations. The only thing that clarified what the Scripting module's inclusion meant was additional specification text:

http://www.w3.org/MarkUp/2006/xhtml-print-pr-doc.html#ssec1


If I don't use Scripting, why should I care about the DOM / scripting execution
context (window object) / script elements?

There are plenty of browsers that don't implement scripting and plenty of users that disable it, with the result that scripts on a page are not executed. Authors who want their content/functionality to work in those scenarios don't make their content/functionality rely on scripts. XHTML modularization doesn't help there (as we've seen above, it doesn't even let user agent developers declare they don't execute scripts!); progressive enhancement/unobtrusive scripting does.
I asked a different question: why should an author who doesn't rely on scripts (or an implementation that cannot, for various reasons, implement scripts) have to learn plenty of DOM interfaces and APIs?
 

> If I don't use XBL, why should I care about XBL elements and PIs?

What's the relevance of this to XHTML modularization? XBL is a different language.
It was just an example of added languages.
 

   Well for one thing, an SGML compliant processor would have to
   interpret "<br />" in HTML 4.01 documents according to HTML 4.01's
   SGML declaration (that is, as equivalent to "<br>&gt;") - sprinkling
   the web corpus with misplaced "greater than" signs. Being
   incompatible with the Web in that way is not viable for software
   attempting to compete with rival browsers or search engines.

No: with SGML set up correctly, <br/ becomes a NET-enabling start tag,
and > its corresponding end tag.

When the HTML4 SGML declaration is applied to a document validating as HTML 4.01 containing "<br />", "<br /" is the null-end start tag. Since the DTDs define BR as "EMPTY", the end tag /must/ be omitted. Therefore the subsequent > cannot be parsed as the element's end tag.

See also:

http://www.cs.tut.fi/~jkorpela/html/empty.html

http://www.is-thought.co.uk/book/sgml-9.htm

http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.3.7
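
The strict reading described above can be illustrated with a toy rewrite. This is not an SGML parser: it only handles attribute-less tags, the EMPTY list is abridged, and the function name is an assumption for illustration:

```python
import re

# Elements declared EMPTY in the HTML 4.01 DTDs (abridged list).
EMPTY_ELEMENTS = {"br", "hr", "img", "input", "meta", "link"}

# An attribute-less tag written XML-style, e.g. "<br />" or "<br/>".
XML_STYLE_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\s*/>")


def strict_sgml_reading(markup):
    """Show how XML-style empty tags read under the HTML 4.01 SGML declaration."""

    def rewrite(match):
        name = match.group(1)
        if name.lower() in EMPTY_ELEMENTS:
            # "<br /" already closes the start tag; EMPTY forbids an end
            # tag, so the ">" that follows is ordinary character data.
            return f"<{name}>&gt;"
        # Non-EMPTY elements are left alone; this toy does not model them.
        return match.group(0)

    return XML_STYLE_TAG.sub(rewrite, markup)


print(strict_sgml_reading("first line<br />second line"))
```

Run over the existing web corpus, a processor following this reading would render a stray "greater than" sign after every such tag.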

Now *perhaps* it would be legitimate to require HTML processors to use a different SGML declaration even with documents that validate to HTML 4.01 DTDs and perhaps a different SGML declaration could define "<br />" as equivalent to "<br>"? At any rate, this doesn't seem much more of a departure than requiring HTML processors to use completely new processing rules. The full requirements of acceptably processing the tag soup web corpus are so tangled, however, that it is not obvious that _any_ SGML declaration could express them and it seems likely that even if it did, it would allow syntax to be validated even when it is non-conforming and broken - which is indeed one of the problems the original attempt to apply SGML to HTML ran into:

http://www.w3.org/TR/REC-html40/sgml/intro.html#h-19.1
I'm not asking to get SGML back. I'm asking to separate syntax from vocabulary, and possibly to apply this new syntax to any XML-based language, whether defined by the W3C or externally, providing an appropriate way to switch between languages (the old DTD).
 

   Hmm. This doesn't really answer my question; how does putting these
   serialization definitions into separate documents allow additional
   "modularization and extensibility"? (I don't have any particular
   bias against the proliferation of technical documents - I just don't
   see any necessary connection between this and allowing
   modularization or allowing extensibility.)

Documents cannot be modified after REC, other than through errata. If you
need a new language, you need a new spec. Better to define a new spec only
for that language than for all languages implemented so far, isn't it?

How do ten new RECs allow more "modularization and extensibility" than one new REC?
Because not all ten RECs must be released at the same time. Actually, the most important stage is CR: implementations should wait until CR to add new features, otherwise they risk having them changed at any time (or they block changes because they have already implemented them). This means that a feature that is dubious or still under heavy discussion cannot block other features.
 

Yes, the fact is that they needed HTML5, a completely new and huge
specification, to add a few new features (on the core vocabulary side):
video - audio - canvas - datalist - section - etc.
If HTML4 had been modularized, HTML5 would have used HTML4's table
module, text module (b-i-span-strong..), metainformation module
(html-head-body-meta) etc.

Some of HTML5 defines processing for existing features that was undefined in previous specifications (e.g. munging "image" to "img").

Some of HTML5 changes processing for existing features (e.g. the algorithm for associating table cells and table headers)

So I cannot agree that if HTML4 had defined image and table features in formally separate "modules", HTML5 as it stands could have merely reused them.
Well, maybe the Image module or the Table module needed a new version, but I'm sure that there are features just copied from HTML4 / XHTML1 / DOM2HTML etc.

--
Benjamin Hawkes-Lewis


In addition, I've thought more about the XHTML2 vs HTML5 problem, and I see two possibilities:

- if you think that XHTML2 and HTML5 have the same use cases and target audience, you surely cannot accept two different, non-interoperable specifications for the same thing. So you must choose one and put all features in it.
Assuming that the choice was HTML5 (preferred by authors and implementers), you would need to port XHTML2 features back to HTML5: i.e. you're stuck with RDFa in HTML (and the XHTML Access Module, and XForms in HTML, etc.)
- if you instead think that XHTML2 is targeted at documents (hypertextual collections of data) while HTML5 is targeted to web applications (binary serializations of user interfaces) as its original name suggests (Web Applications 1.0), then you should purify both specs: remove from XHTML2 what is not strictly necessary to a document, and remove from HTML5 what is not needed by a user interface:
GMail doesn't need a List Module (not in standard mode, at least), while a cookbook doesn't need client-side databases.
What you will get is a very light version of XHTML that would actually mean no XHTML processing at all (just some CSS and XBL). On the other hand, the resulting purified HTML5 would look very much like an XHTML Web Apps module (the one I proposed in my first mail).

But the current situation is not sustainable: the W3C cannot maintain two WGs producing two different, non-interoperable technologies with overlapping features, implementations, use cases and audiences.

Giovanni

Re: Fwd: HTML5 and XHTML2 combined (a new approach)

David Woolley (E.L)

Giovanni Campagna wrote:

> - if you instead think that XHTML2 is targeted at documents
> (hypertextual collections of data) while HTML5 is targeted to web
> applications (binary serializations of user interfaces) as its original

My perception is that HTML5 is also aimed at presentational,
particularly marketing, documents.  One could treat them as user
interface - especially given the stress on consistent presentation - but
they are relatively output only (although with custom controls for
navigation, panning, etc.).

There is a tendency for marketing documents to be applications, because
they often, effectively, contain their own browser, rather than using
native behaviours.

--
David Woolley
Emails are not formal business letters, whatever businesses may want.
RFC1855 says there should be an address here, but, in a world of spam,
that is no longer good advice, as archive address hiding may not work.



Re: HTML5 and XHTML2 combined (a new approach)

Giovanni Campagna
If you think that splitting is a solution, then it will be the author's choice whether the content matters more (as in a book review, a manual, a recipe, a white paper) or its presentation to the user (as in a web marketing page).
My definition of "document" is a page you can save on your desktop as MHTML, print without loading it in a web browser, or send as an email attachment: things that you cannot do with a web app page (nor would it make much sense).

(Actually, carefully using CSS you can get many effects without any JS)

Giovanni

2009/1/25 David Woolley <[hidden email]>
Giovanni Campagna wrote:

- if you instead think that XHTML2 is targeted at documents (hypertextual collections of data) while HTML5 is targeted to web applications (binary serializations of user interfaces) as its original

My perception is that HTML5 is also aimed at presentational, particularly marketing, documents.  One could treat them as user interface - especially given the stress on consistent presentation - but they are relatively output only (although with custom controls for navigation, panning, etc.).

There is a tendency for marketing documents to be applications, because they often, effectively, contain their own browser, rather than using native behaviours.


--
David Woolley
Emails are not formal business letters, whatever businesses may want.
RFC1855 says there should be an address here, but, in a world of spam,
that is no longer good advice, as archive address hiding may not work.
