(Forwarding this to the list, as it seems Richard is not subscribed and
hence this message didn't show up in the archive.)
Thanks a lot for the comments, Richard - I'll follow up in a separate
message.
On 27 Apr 2011, at 12:05, Richard Cyganiak wrote:
> Hi Erik, hi Michael,
> This is a comment on the first draft of “URI Fragment Identifiers
> for the text/csv Media Type”, announced here:
>  http://www.ietf.org/id/draft-hausenblas-csv-fragment-00.txt
>  http://lists.w3.org/Archives/Public/uri/2011Apr/0003.html
> Section 2
> The draft does not appear to provide a way of addressing the most
> fundamental part of a CSV file: a cell. I find this confusing, as it
> seems like a really obvious and unsurprising use case to me. In fact,
> you say that one use case is “making assertions about a certain
> value”. How is this possible given the current design?
> I guess I'm asking for something like this: #cell:temperature,4 to
> address the value in the temperature column, row 4.
> A less critical but perhaps also interesting feature would be
> Excel-style cell ranges, such as #cells:temperature,4:temperature,6.
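To make the cell request concrete, here is a rough Python sketch of how a `#cell:<column>,<row>` fragment might be resolved. The `#cell:` syntax is only Richard's proposal and is not in the draft; the function name, the sample data, and the choice of 1-based row numbers that skip the header are all assumptions made for illustration.

```python
import csv
import io


def resolve_cell(csv_text, fragment):
    """Resolve a hypothetical '#cell:<column>,<row>' fragment to one value."""
    prefix = "#cell:"
    if not fragment.startswith(prefix):
        raise ValueError("not a cell fragment: " + fragment)
    # Split on the last comma so the row number is always the final part.
    column, row = fragment[len(prefix):].rsplit(",", 1)
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    # Assumption: rows are 1-based and the header row is not counted.
    return data[int(row) - 1][header.index(column)]


# Made-up sample data, not the table from the draft.
sample = "date,temperature,place\n2011-01-01,1,Galway\n2011-01-02,-1,Galway\n"
print(resolve_cell(sample, "#cell:temperature,2"))
```

A range form such as `#cells:...` would need a second separator that cannot be confused with the one inside each cell reference, which this sketch does not attempt.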
> Section 2.1
> This is quite fuzzy on the question of header detection. As the
> draft is currently designed, an implementation has to detect whether
> a header is present or not, otherwise it cannot determine what part
> of the table exactly is being addressed. So is the header=present
> thing in the media type the only and canonical way of determining
> presence of headers?
> If that is the case, then what about non-HTTP protocols, e.g., file:///Users/richard/test.csv?
> What does #head address if the media type does not indicate the
> presence of a header?
> (A possible solution might be to make the addressed part independent
> of the presence of a header. #head would simply address the first
> row, regardless of whether it's actually a header. Same for #row:0.
> #col:2 would be
> place,Galway,Galway,Galway,Berkeley,Berkeley,Berkeley. If the
> example table had no header, then #col:2 would be
> Galway,Galway,Galway,Berkeley,Berkeley,Berkeley. And #col:Galway
> would be the same. And so on.)
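The header-independent reading suggested above could be sketched as follows. This is only an illustration of that suggestion, not behaviour defined by the draft; the function names and the sample table are made up, and indices are taken to be 0-based to match the `#row:0` and `#col:2` examples.

```python
import csv
import io


def physical_rows(csv_text):
    """Parse the CSV text into a list of rows, header (if any) included."""
    return list(csv.reader(io.StringIO(csv_text)))


def select_row(csv_text, n):
    # '#row:n' addresses the n-th physical row, whether or not it is a header.
    return physical_rows(csv_text)[n]


def select_col(csv_text, n):
    # '#col:n' addresses the n-th physical column, first-row cell included.
    return [row[n] for row in physical_rows(csv_text)]


# Made-up sample data; '#head' would simply be select_row(sample, 0).
sample = (
    "date,temperature,place\n"
    "2011-01-01,1,Galway\n"
    "2011-01-02,-1,Galway\n"
    "2011-03-01,17,Berkeley\n"
)
print(select_row(sample, 0))
print(select_col(sample, 2))
```

With this reading the addressed part never depends on the media type parameter, at the price that `#col:2` includes the header cell whenever a header happens to be present.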
> The first paragraph of 2.1 is poorly written.
> Section 2.2
> How does the row:n format interact with presence/absence of header?
> If a header is present, does #row:0 address the same as #head?
> A handy feature would be to allow addressing of the last row using
> #row:-1 (and similar for the second-to-last row etc).
> What is addressed by #row:1000 if the table has only 10 rows?
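One possible answer to both row questions, sketched in Python. The negative-index form is Richard's suggestion, and treating an out-of-range index as "addresses nothing" is purely an assumption here; the draft may well choose differently.

```python
def select_row(rows, n):
    """Return the row addressed by a hypothetical '#row:n', or None."""
    # Assumption: negative n counts from the end, as in Python indexing,
    # so -1 is the last row and -2 the second-to-last.
    if -len(rows) <= n < len(rows):
        return rows[n]
    # Assumption: an out-of-range index addresses nothing instead of failing.
    return None


rows = [["a"], ["b"], ["c"]]
print(select_row(rows, -1))
print(select_row(rows, 1000))
```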
> What is the use case for the #row:* format? It seems a bit obscure
> to me and perhaps might better be dropped.
> Section 2.3
> It appears that the header row, if present, is excluded from
> #col:xxx addressing. Maybe this can be clarified in the text.
> What is addressed by #col:xxx if xxx is neither a number nor a
> column in the table?
> What is addressed by #col:2 if there is a column named "2"?
> What is addressed by #col:xxx if no header is present, or if a
> header is present but not indicated in the media type?
> What is addressed by #col:foo if the header contains a duplicate
> column, like foo,bar,baz,foo?
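The ambiguous cases listed above can be made visible with a small helper that reports every column a `#col:<token>` fragment could refer to, without picking a winner. The function name and the 0-based index convention are assumptions for illustration; which candidate should win is exactly what the draft would need to specify.

```python
def candidate_columns(header, token):
    """Return (name_matches, index_matches) as lists of 0-based positions."""
    # Every column whose header cell equals the token (duplicates included).
    by_name = [i for i, name in enumerate(header) if name == token]
    # The token read as a column index, if it is a valid one.
    is_index = token.isascii() and token.isdigit() and int(token) < len(header)
    by_index = [int(token)] if is_index else []
    return by_name, by_index


print(candidate_columns(["foo", "bar", "baz", "foo"], "foo"))  # duplicate name
print(candidate_columns(["1", "2", "3"], "2"))  # name and index disagree
print(candidate_columns(["foo", "bar"], "xxx"))  # neither name nor number
```

An implementation would have to define what happens when the first list has more than one entry, when both lists are non-empty and disagree, and when both are empty.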
> Section 2.4
> I am unconvinced that the slice-based selection is useful as it is
> described right now. I'd like to understand better what the use case
> is. Personally, I can see more use cases for selecting entire rows
> based on a value match, such as this:
> I would expect the addressed part to be the entire row, including
> the value that was used for the match. Excluding the matched column
> seems a bit strange to me and I just have trouble understanding what
> the motivation is.
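The row-matching behaviour described above (whole rows returned, matched column included) could look roughly like this in Python. The function name and sample data are invented for the sketch, and a header row is assumed to be present.

```python
import csv
import io


def select_where(csv_text, column, value):
    """Return every complete row whose `column` equals `value`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each row is kept whole, so the matched column stays in the result.
    return [row for row in reader if row[column] == value]


# Made-up sample data, not the table from the draft.
sample = (
    "date,temperature,place\n"
    "2011-01-01,1,Galway\n"
    "2011-01-02,-1,Galway\n"
    "2011-03-01,17,Berkeley\n"
)
for row in select_where(sample, "place", "Galway"):
    print(row["date"], row["temperature"], row["place"])
```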
> Independently from that: The name “slice-based” isn't very
> appropriate for the current mechanism. “Slice” implies a complete
> “thin” cut along one dimension. That's how it's used in data
> warehouse speak, anyways. In that sense, both row-based and
> column-based selection are slices, but this “slice-based” selection
> actually is not. More accurate would be “table reduction” or
> “select+project”, but admittedly these are not very snappy. Perhaps
> “value-based selection”?
> Section 3
> URI syntax only allows certain characters. Other characters have to
> be escaped. CSV cells also allow only certain characters, but a
> different set, with different escaping rules. I would expect some
> language here that addresses this. For example, if I have a CSV row:
> 2011-01-01,1,"Galway, Ireland"
> then what exactly would a #where:place=xxx fragment that selects
> this row look like?
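One way the two escaping layers might be combined: first undo the CSV quoting to get the plain cell value, then percent-encode that value for use in the fragment. This is a sketch under assumptions, not something the draft states. In particular, encoding the comma is a choice made here so that it cannot clash with any comma used as fragment syntax; RFC 3986 itself permits an unescaped comma in a fragment, whereas the space must be encoded.

```python
import csv
import io
from urllib.parse import quote, unquote

# Step 1: CSV unescaping. The surrounding quotes are not part of the value.
line = '2011-01-01,1,"Galway, Ireland"'
value = next(csv.reader(io.StringIO(line)))[2]

# Step 2: URI escaping. safe="" percent-encodes every reserved character.
fragment = "#where:place=" + quote(value, safe="")
print(fragment)  # #where:place=Galway%2C%20Ireland

# A consumer would reverse step 2 before comparing against cell values.
assert unquote(fragment.split("=", 1)[1]) == value
```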