[css-text] text-transform: capitalize Word Boundaries too strict

Adam Rich
Hi,

I am using text-transform: capitalize; and I'm finding that it treats punctuation within a word as a word boundary. For example:

word(s) gets transformed into Word(S)
afortunado/a gets transformed into Afortunado/A         

In these cases, the S and the A should not be capitalized. In the first case, the S is there to indicate that the word may be plural. In the second case, the word is Spanish and the A is there because the adjective could be for a male or female reader.

Can the word boundary rules get adjusted to deal with these cases?
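The behaviour reported above can be reproduced with a minimal sketch. It assumes a hypothetical policy in which every non-letter character ends a word; the function name is made up for illustration, and this is not a claim about how any browser actually implements capitalize.

```python
import re

def capitalize_after_punctuation(text: str) -> str:
    """Uppercase each letter that is not preceded by another letter.

    Assumption: any non-letter (space, slash, parenthesis) ends a "word".
    This is a hypothetical model of the reported output only.
    """
    # [^\W\d_] matches one Unicode letter; the negative lookbehind
    # requires that the previous character, if any, is not a letter.
    return re.sub(r"(?<![^\W\d_])[^\W\d_]",
                  lambda m: m.group(0).upper(),
                  text)

print(capitalize_after_punctuation("word(s)"))       # Word(S)
print(capitalize_after_punctuation("afortunado/a"))  # Afortunado/A
```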

Thanks,
- Adam

Re: [css-text] text-transform: capitalize Word Boundaries too strict

Sebastian Zartner-3
CSS Text Module Level 3 doesn't define what counts as a word boundary. It leaves that up to the user agent. The relevant paragraph says:

"For capitalize, what constitutes a “word” is UA-dependent; [UAX29] is suggested (but not required) for determining such word boundaries. Authors should not expect capitalize to follow language-specific titlecasing conventions (such as skipping articles in English)."

Note that Gecko actually capitalizes the words as you expect in your examples.

Sebastian

Re: [css-text] text-transform: capitalize Word Boundaries too strict

Jonathan Kew
On 10/11/15 19:04, Sebastian Zartner wrote:

> CSS Text Module Level 3 doesn't define what to consider as word
> boundary. It lets it up to the user agent to determine this. The related
> paragraph says this:
>
> "For capitalize, what constitutes a “word“ is UA-dependent; [UAX29] is
> suggested (but not required) for determining such word boundaries.
> Authors should not expect capitalize to follow language-specific
> titlecasing conventions (such as skipping articles in English)."
>
> Note that Gecko actually capitalizes the words as you expect in your
> examples.

On the other hand, if you capitalize text with things like "left/right,
forward/back", Gecko gives you "Left/right, Forward/back", which doesn't
look as good as the Webkit result "Left/Right, Forward/Back".

Basically, you can't win -- there's no simple, correct answer without
sophisticated (language- and context-specific) analysis of the content
that would be way out of scope for CSS.
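To make the trade-off concrete, here is a rough sketch comparing two hypothetical policies on the examples from this thread. Both function names and both rules are assumptions chosen to mimic the outputs quoted above; neither is the actual Gecko or WebKit algorithm.

```python
import re

def cap_every_letter_run(text: str) -> str:
    # "Strict" policy (assumption): any non-letter starts a new word.
    return re.sub(r"(?<![^\W\d_])[^\W\d_]",
                  lambda m: m.group(0).upper(), text)

def _raise_first_letter(chunk: str) -> str:
    # Uppercase only the first alphabetic character in the chunk.
    for i, ch in enumerate(chunk):
        if ch.isalpha():
            return chunk[:i] + ch.upper() + chunk[i + 1:]
    return chunk

def cap_per_whitespace_chunk(text: str) -> str:
    # "Lenient" policy (assumption): only whitespace separates words.
    return re.sub(r"\S+", lambda m: _raise_first_letter(m.group(0)), text)

for sample in ("word(s)", "left/right, forward/back"):
    print(cap_every_letter_run(sample), "|", cap_per_whitespace_chunk(sample))
# Word(S) | Word(s)
# Left/Right, Forward/Back | Left/right, Forward/back
```

Each policy gets one sample right and the other wrong, so neither rule is a fix on its own.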

text-transform:capitalize is a quick hack that may sometimes be better
than nothing, but should never be relied on if you want high-quality
typography.

JK

Re: [css-text] text-transform: capitalize Word Boundaries too strict

Robert O'Callahan-3
On Wed, Nov 11, 2015 at 10:28 AM, Jonathan Kew <[hidden email]> wrote:
> On the other hand, if you capitalize text with things like "left/right, forward/back", Gecko gives you "Left/right, Forward/back", which doesn't look as good as the Webkit result "Left/Right, Forward/Back".
>
> Basically, you can't win -- there's no simple, correct answer without sophisticated (language- and context-specific) analysis of the content that would be way out of scope for CSS.

Note that, as I'm sure you know :-), Gecko capitalizes exactly after line-break opportunities, so you can at least work around cases of insufficient capitalization, e.g. by inserting a <wbr> before 'right'. Which is probably a good idea anyway.
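As a rough illustration of that workaround, the sketch below treats whitespace and a literal <wbr> tag as the only break opportunities and capitalizes the first letter after each. This is a deliberately crude, hypothetical model (real line-breaking rules are far richer, and the function names are made up); it only shows why adding <wbr> changes the result.

```python
import re

def _raise_first_letter(segment: str) -> str:
    # Uppercase only the first alphabetic character in the segment.
    for i, ch in enumerate(segment):
        if ch.isalpha():
            return segment[:i] + ch.upper() + segment[i + 1:]
    return segment

def capitalize_at_break_opportunities(markup: str) -> str:
    # Assumption: break opportunities are whitespace runs and "<wbr>" only.
    # The capture group makes re.split keep the separators in the result.
    parts = re.split(r"(\s+|<wbr>)", markup)
    return "".join(
        part if part == "<wbr>" or not part.strip()
        else _raise_first_letter(part)
        for part in parts
    )

print(capitalize_at_break_opportunities("left/right"))       # Left/right
print(capitalize_at_break_opportunities("left/<wbr>right"))  # Left/<wbr>Right
```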

Rob

Re: [css-text] text-transform: capitalize Word Boundaries too strict

Jonathan Kew
On 10/11/15 21:40, Robert O'Callahan wrote:

> On Wed, Nov 11, 2015 at 10:28 AM, Jonathan Kew <[hidden email]> wrote:
>
>     On the other hand, if you capitalize text with things like
>     "left/right, forward/back", Gecko gives you "Left/right,
>     Forward/back", which doesn't look as good as the Webkit result
>     "Left/Right, Forward/Back".
>
>     Basically, you can't win -- there's no simple, correct answer
>     without sophisticated (language- and context-specific) analysis of
>     the content that would be way out of scope for CSS.
>
>
> Note that --- as I'm sure you know :-) --- Gecko capitalizes exactly at
> after line-break opportunities,

That doesn't seem to be strictly accurate. For example, it doesn't
capitalize after an explicit hyphen in the text, although that is
definitely a line-break opportunity.

> so you can at least work around cases of
> insufficient capitalization, e.g. by inserting a <wbr> before 'right'.
> Which is probably a good idea anyway.

Yes, if you control the content. But if you control the content, you'd
do better to avoid relying on the vagaries of text-transform:capitalize
at all, and instead make sure you explicitly capitalize things exactly
as desired.

IMO, the main use-case for text-transform:capitalize would be when
external content with "uncontrolled" casing is being pulled in -- e.g.
headlines, captions, etc. from a variety of data sources -- and you want
to give it all a reasonably uniform-looking, title-like appearance but
don't want ALL UPPERCASE and can't afford the resources to apply
copy-editing by a human (or a high-end natural-language processing system).

JK


Re: [css-text] text-transform: capitalize Word Boundaries too strict

fantasai
On 11/10/2015 02:17 PM, Jonathan Kew wrote:
>
> Yes, if you control the content. But if you control the content, you'd do better to avoid relying on the vagaries of
> text-transform:capitalize at all, and instead make sure you explicitly capitalize things exactly as desired.
>
> IMO, the main use-case for text-transform:capitalize would be when external content with "uncontrolled" casing is being pulled
> in -- e.g. headlines, captions, etc. from a variety of data sources -- and you want to give it all a reasonably
> uniform-looking, title-like appearance but don't want ALL UPPERCASE and can't afford the resources to apply copy-editing by a
> human (or a high-end natural-language processing system).

Yeah, and know that it will often look wrong. Anything you want
capitalized should be capitalized in the source. It can be uppercased or
lowercased as necessary via CSS, but 'text-transform: capitalize' is
really only useful on :first-letter, IMHO.

~fantasai