Bug in fop: bidi direction bleeds forward

Raphael Finkel
Quick summary: RTL inserts in a bi-directional document modify the
directionality of brackets following those inserts.

I am using Apache FOP 1.1 on Ubuntu 12.04.2 to generate PDF output.  My
text includes Greek and Hebrew, so I specify font-family="FreeSans" in
all <fo:inline> sections.  I use a personal configuration file so that fop
registers /usr/share/fonts/truetype/freefont/FreeSans.ttf.
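
For readers who want to reproduce the setup: the configuration file itself is
not included in this post, so the following is only a sketch of what a
fop.xconf that registers FreeSans for PDF output might look like. The font
path is the one given above; the single normal/normal triplet and the
kerning setting are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the poster's actual fop.xconf is not shown in the thread. -->
<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>
        <!-- Path taken from the post; kerning="yes" is an assumption. -->
        <font embed-url="/usr/share/fonts/truetype/freefont/FreeSans.ttf"
              kerning="yes">
          <!-- Makes the font available as font-family="FreeSans". -->
          <font-triplet name="FreeSans" style="normal" weight="normal"/>
        </font>
      </fonts>
    </renderer>
  </renderers>
</fop>
```

Bold or italic text would need further triplets pointing at the matching
font files.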

The bug is elicited by the following piece of text (inside XML) that I
am rendering:

        An etymology for the name of the prophet Habakkuk (in the
        Septuagint, Ambakoum or Avvakoum), based on two Aramaic words found
        in the New Testament. The Suda is drawing from older onomastica; the
        same etymology is found in the Origenistic lexicon (see
        bibliography). <br/>[1] See already alpha 10. The Hebrew/Aramaic אבּא
        means father.<br/>[2] The Hebrew/Aramaic קום kum means arise; it can
        also be used to mean awake.[3] Mark 5:41 (web address 1).<br/>[4]
        The Suda is correct. The doubling of the בּ is indicated by its dot
        (dagesh); unlike Greek, Hebrew and Aramaic do not replicate doubled
        letters.

The XSL template I use for <br> is:

        <xsl:template match="br">
                <fo:block></fo:block>
        </xsl:template>

The bug is that the brackets [ ] and ( ) are often reversed, so the
resulting PDF has this content:

        An etymology for the name of the prophet Habakkuk )in the
        Septuagint, Ambakoum or Avvakoum(, based on two Aramaic words found
        in the New Testament. The Suda is drawing from older onomastica; the
        same etymology is found in the Origenistic lexicon )see
        bibliography(.

        [1] See already alpha 10. The Hebrew/Aramaic אבּא means father.

        [2] The Hebrew/Aramaic קום kum means arise; it can also be used to mean
        awake.

        ]3[ Mark 5:41 )web address 1(.

        [4] The Suda is correct. The doubling of the בּ is indicated by its
        dot (dagesh); unlike Greek, Hebrew and Aramaic do not replicate
        doubled letters.

I do not see this misbehavior when I use xmlroff 0.6.2 instead of fop 1.1.

Raphael


Re: Bug in fop: bidi direction bleeds forward

Glenn Adams-2
Hi Raphael,

First, you need to move to the appropriate mailing list. This list is for general XSL-FO matters, not for issues specific to one implementation. The correct list in this case is [hidden email].

Second, you need to provide a minimal example XSL-FO input file (not an XML file used as input for XSLT processing), the resulting PDF output file, your fop.xconf file, and the relevant console output from running the command.
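
As an illustration of what such a minimal XSL-FO input could look like, here
is a sketch built from two of the numbered notes in the report. It is not the
poster's actual file: the page geometry, the master name "page", and the way
the <fo:inline> runs and the empty <fo:block/> (standing in for the output of
the <br> template) are nested are all assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a minimal test case; structure and page setup are assumed. -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
                           page-width="21cm" page-height="29.7cm"
                           margin="2cm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>
        <!-- A left-to-right run containing a short Hebrew insert. -->
        <fo:inline font-family="FreeSans">[2] The Hebrew/Aramaic קום kum
          means arise; it can also be used to mean awake.</fo:inline>
        <!-- What the br template in the report would emit. -->
        <fo:block></fo:block>
        <!-- Brackets after the insert; the report shows these reversed. -->
        <fo:inline font-family="FreeSans">[3] Mark 5:41 (web address
          1).</fo:inline>
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

Assuming the file is saved as test.fo, it could be rendered with something
like "fop -c fop.xconf -fo test.fo -pdf test.pdf", and the console output of
that run attached along with the PDF.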

Regards,
Glenn



On Mon, Aug 5, 2013 at 2:41 PM, Raphael Finkel <[hidden email]> wrote:
> Quick summary: RTL inserts in a bi-directional document modify the
> directionality of brackets following those inserts.
> [...]