Thanks to all,
so, if I am right:
1 A generic URI parser can at most identify the components and expose
them in escaped form.
2 Query is opaque to the URI parser; following the general rule, exposing
it in escaped form is the most that can be done.
Additional issues are the separator, which might change, and the
encoding of the parameters (application/x-www-form-urlencoded, for example).
3 A scheme-specific API is needed to compose and decompose query component
instances; for example, an HttpQuery class might help in composing and
decomposing an HTTP query string.
4 The meaning of the components depends on the URI scheme,
apart perhaps from the scheme component, which associates a URI to an
5 Path is not totally opaque, since it can be resolved with a relative
reference and participates in normalization under some rules that are
generic for all URIs.
6 Path is made of a non-empty sequence of segments. Segments can be empty.
Path segments are separated by the character [/]; the escaped form can be
used to encode the char [/] inside a path segment. Segment content is opaque to the
7 There are some other characters, besides [/], that are allowed inside
paths in unescaped form without having a meaning for a generic URI; these
are left to scheme implementations for implementing scheme-specific rules.
9 Scheme-specific implementations are needed to further syntax-check the
sub-components of the URI, at least for all but the scheme.
The same can be said for normalization.
10 Also, path segments cannot in general be made available in unescaped form;
a scheme-specific implementation is needed to build each segment in a form
that obeys the URI syntax rules, possibly using some of the
no-escape-will-be-performed characters available as scheme specific
The meaning of these special characters will be different from their
For example, if through FTP we want to access a folder named [a;b]
(from RFC 1738, and I think it is possible) we must write
If we now unescape the last segment it becomes [a;b;type=d], so after
unescaping we can no longer distinguish the [;] that is part of the FTP path
from the [;] that is part of the FTP type command. So an API with unescaped
segments must be scheme-specific. The FTP API to build an FTP path described
by Jeremy in 4) of  is a good example.
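
To make the FTP case concrete, here is a minimal Python sketch, not a
definitive implementation. The helper names build_ftp_path and
split_ftp_path are hypothetical, and the sketch assumes the RFC 1738 layout
in which the type code follows the last segment as ;type=<typecode>. Only
quote and unquote come from the standard library.

```python
from urllib.parse import quote, unquote


def build_ftp_path(segments, type_code=None):
    """Build an FTP URL path from UNESCAPED segments (hypothetical helper)."""
    # safe="" escapes every reserved character inside the data, ';' and '/' included.
    escaped = [quote(segment, safe="") for segment in segments]
    path = "/" + "/".join(escaped)
    if type_code is not None:
        # This ';' stays unescaped: it is the FTP-specific delimiter.
        path += ";type=" + type_code
    return path


def split_ftp_path(path):
    """Split an FTP URL path into unescaped segments and a type code."""
    # Cut off the type code BEFORE unescaping, while a data ';' is still '%3B'.
    body, sep, type_code = path.partition(";type=")
    if body.startswith("/"):
        body = body[1:]
    segments = [unquote(segment) for segment in body.split("/")]
    return segments, (type_code if sep else None)


path = build_ftp_path(["pub", "a;b"], type_code="d")
print(path)
print(split_ftp_path(path))
# Unescaping the raw last segment loses the distinction between the two ';'.
print(unquote("a%3Bb;type=d"))
```

The order matters: the scheme-specific code splits on its own delimiter
first and unescapes afterwards, which is exactly what a generic parser
cannot know how to do.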
As a confirmation of the above, I would like to ask: can a colon in the first
segment of a relative reference without a slash be written as <./a:b> in
*any case*, or as <a%3ab> *if it is not meant to be used as a control char in
the considered scheme-specific implementation*?
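
A small Python sketch of the <./a:b> form, based on my reading of RFC 3986
section 4.2 (a first segment containing a colon must be preceded by a
dot-segment). The function name safe_relative_path is hypothetical;
urlsplit is the standard library parser and is used only to check the
result.

```python
from urllib.parse import urlsplit


def safe_relative_path(path):
    """Prefix './' when the first segment of a relative path contains ':'."""
    first_segment = path.split("/", 1)[0]
    if ":" in first_segment:
        # Without the prefix, 'a:b' would be read as scheme 'a' plus path 'b'.
        return "./" + path
    return path


print(safe_relative_path("a:b"))
print(safe_relative_path("a/b:c"))
# A generic parser sees a scheme in 'a:b' but none in './a:b'.
print(urlsplit("a:b").scheme, repr(urlsplit("./a:b").scheme))
```

This form works whatever the scheme does with the colon, whereas <a%3ab>
is only equivalent when the scheme gives the unescaped colon no special
meaning.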
I thought a generic URI parser could be a component separator for further
scheme-specific processing of the components. The exception in  of the
URI with a '?' in the path in unescaped form breaks this assumption too.
What remains is recognizing the scheme and whether the URI
is a URI reference or not.
After this analysis I think this is the only thing that a generic URI parser
should do before handing over to a scheme-specific implementation:
find out whether there is a scheme, so that a suitable URI parser
implementation can be selected, and/or find out whether the URI is instead a
relative reference, so that the same implementation as for the current base
URI can be selected.
The algorithm is already there:
RFC 3986 4.1 [..] If the URI-reference's prefix does not match the syntax of
a scheme followed by its colon separator, then the URI-reference
is a relative reference [..]
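
The rule quoted above could be sketched in Python like this. The names
split_scheme and scheme_for are hypothetical, and the regular expression is
my transcription of the RFC 3986 scheme syntax (a letter followed by
letters, digits, "+", "-" or "."), so treat it as an illustration under
those assumptions rather than a complete parser.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
_SCHEME_PREFIX = re.compile(r"([A-Za-z][A-Za-z0-9+.-]*):")


def split_scheme(uri_reference):
    """Return (scheme, rest), or (None, reference) for a relative reference."""
    match = _SCHEME_PREFIX.match(uri_reference)
    if match is None:
        return None, uri_reference
    # Schemes are case-insensitive, so lower-case before any lookup.
    return match.group(1).lower(), uri_reference[match.end():]


def scheme_for(uri_reference, base_scheme):
    """Pick the scheme whose implementation should parse the reference."""
    scheme, _rest = split_scheme(uri_reference)
    # A relative reference is handled by the implementation of the base URI.
    return scheme if scheme is not None else base_scheme


print(split_scheme("FTP://host/pub"))
print(split_scheme("./a:b"))
print(scheme_for("../other", base_scheme="ftp"))
```

Everything after the colon is passed on untouched, so a scheme that treats
'?' or ';' in its own way is free to do so.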
Maybe also the fragment and an isFragmentOnly attribute (no dereference)
could be extracted. It becomes similar to an opaque URI. Haven't thought about
and opacity yet. Do schemes exist that might use opaque URIs in some
situations and hierarchical ones in others?
Or is opacity really URI-scheme-implementation specific?
Are there any exceptions (like that of the ? in the FTP path) to authority
that make parsing scheme-specific?
Normalization, to be effective, is scheme-specific. Might the implementation
of relative reference resolution be URI-generic once the authority is known
and the segment sequence is created (after the [?] inside the segment
of the FTP path of  has been hidden in the opaque value of a segment by
the FTP URL parser)?
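
To illustrate what I mean by generic resolution over segments, a rough
Python sketch follows. It is a simplification of the merge and
remove-dot-segments steps of RFC 3986 section 5.2, limited to a base with
an absolute path and a relative-path reference; the function name
resolve_segments is hypothetical, and the segments are assumed to have been
produced already by a scheme-specific parser.

```python
def resolve_segments(base_segments, ref_segments):
    """Resolve a relative segment list against a base segment list."""
    # Merge: drop the last base segment, then append the reference segments.
    merged = list(base_segments[:-1]) + list(ref_segments)
    resolved = []
    for index, segment in enumerate(merged):
        is_last = index == len(merged) - 1
        if segment == ".":
            if is_last:
                resolved.append("")  # keep the trailing slash
        elif segment == "..":
            if resolved:
                resolved.pop()
            if is_last:
                resolved.append("")  # keep the trailing slash
        else:
            # Segment content is opaque here: a '?' inside it is never inspected.
            resolved.append(segment)
    return resolved


base = ["pub", "a?b", "file"]
print(resolve_segments(base, ["other"]))
print(resolve_segments(base, ["..", "x"]))
print(resolve_segments(base, [".."]))
```

Only "." and ".." are interpreted; every other segment value passes through
untouched, which is what would let this step stay generic.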