Tom Roche
2016-03-31 22:54:34 UTC
summary
=======
Small reST document (linked and attached) has sections with unique names. When I use docutils/restview to convert it to HTML, all but one section id is created by (speaking `sed`ishly) `s/ /-/g`. However *one* section id is "hashed" (i.e., created like a backref), which breaks an explicit internal anchor. Why not create all section IDs in the same way? More importantly, how to fix this (presuming as I do that it's a problem)?
details
=======
background
----------
I frequently generate HTML from reStructuredText, either directly or indirectly. I also frequently do internal linking: i.e., I create explicit links from text in one section of a document to another section. I'm currently working on a reST document that exhibits the following problem (which I have also experienced previously). I have "boiled it down" to a relatively simple file, problematic_naming_of_internal_anchors.rst, which I have posted at
https://bitbucket.org/!api/2.0/snippets/tlroche/LR9oL/HEAD/files/problematic_naming_of_internal_anchors.rst
That is linked in "raw mode" (i.e., no rendering by Bitbucket) so you should see just the characters in the file, as in a text editor. (If you can't follow the link, note I have also attached the contents of the file to this post, following my .sig.) Please also note that the following is NOT about how Bitbucket renders reST (though BB reproduces the problem), since BB has its own problems with {section naming, internal anchors} as detailed here:
https://bitbucket.org/site/master/issues/11314/restructuredtext-link-fragments-require
However, presuming this problem is caused by docutils (as detailed below), fixing it would also improve the lives of everyone writing reST for display "in the cloud."
problem
-------
The problem I wish to raise here is exhibited by `restview <https://pypi.python.org/pypi/restview>`_, which I believe renders by just driving docutils. (Specifically, my version of `restview` renders with docutils-0.12, per header in generated HTML.) The document (problematic_naming_of_internal_anchors.rst) has the following section names, all of which are unique:
for further processing
integrate
move
short-term
next hardware run
short-term bodywear
long-term
long-term bodywear
long-term house goods
lighting
The problem can be illustrated by comparing the section IDs generated for the section names={long-term bodywear, short-term bodywear} and the success of hand-coded links and generated/TOC links to those sections in the text.
1. reST section name='short-term bodywear' generates HTML=
<div class="section" id="short-term-bodywear">
<h2><a class="toc-backref" href="#id8">short-term bodywear</a></h2>
Note the form of the div attribute='id': it is the section name with all spaces replaced by dashes, aka 's/ /-/g'. This is as I expect (therefore good :-)
1.1. My hand-coded internal link to that section
.. |short-term bodywear| replace:: *short-term bodywear*
.. _short-term bodywear: #short-term-bodywear
*see also* |short-term bodywear|_
<p><em>see also</em> <a class="reference external" href="#short-term-bodywear"><em>short-term bodywear</em></a></p>
(I dunno why 'class="reference external"', since this is an internal link, but that's a quibble.)
<li><a class="reference internal" href="#short-term-bodywear" id="id8">short-term bodywear</a></li>
2. reST section name='long-term bodywear' generates HTML=
<div class="section" id="id1">
<h2><a class="toc-backref" href="#id10">long-term bodywear</a></h2>
Note the form of the div attribute='id', which is NOT as I expect. I expect the generated ID to use the same rule (s/ /-/g) as was used to generate the ID from section name='short-term bodywear'; instead the div/section ID is "hashed" by appending a serial number to string='id'.
2.1. This unexpected behavior breaks my hand-coded internal reference to section name='long-term bodywear'
.. |long-term bodywear| replace:: *long-term bodywear*
.. _long-term bodywear: #long-term-bodywear
*see also* |long-term bodywear|_
<p><em>see also</em> <a class="reference external" href="#long-term-bodywear"><em>long-term bodywear</em></a></p>
<li><a class="reference internal" href="#id1" id="id10">long-term bodywear</a></li>
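One way to confirm which hand-coded links are dangling is to scan the generated HTML for fragment hrefs that match no id. The following is only a minimal sketch using the Python standard library: the names `AnchorAudit`, `dangling_fragments` and `SAMPLE` are made up for illustration (they are not part of docutils or restview), and the sample HTML is assembled from the output lines quoted above, with closing ``</div>`` tags added by hand.

```python
from html.parser import HTMLParser


class AnchorAudit(HTMLParser):
    """Collect every id attribute and every href that points at a fragment."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.fragments = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            if name == "id":
                self.ids.add(value)
            elif name == "href" and value.startswith("#"):
                self.fragments.append(value[1:])


def dangling_fragments(html):
    """Return fragment targets (in document order) that match no id."""
    audit = AnchorAudit()
    audit.feed(html)
    audit.close()
    return [frag for frag in audit.fragments if frag not in audit.ids]


# Illustrative input: lines copied from the generated HTML quoted in this post.
SAMPLE = """
<li><a class="reference internal" href="#short-term-bodywear" id="id8">short-term bodywear</a></li>
<li><a class="reference internal" href="#id1" id="id10">long-term bodywear</a></li>
<div class="section" id="short-term-bodywear">
<h2><a class="toc-backref" href="#id8">short-term bodywear</a></h2>
<p><em>see also</em> <a class="reference external" href="#long-term-bodywear"><em>long-term bodywear</em></a></p>
</div>
<div class="section" id="id1">
<h2><a class="toc-backref" href="#id10">long-term bodywear</a></h2>
<p><em>see also</em> <a class="reference external" href="#short-term-bodywear"><em>short-term bodywear</em></a></p>
</div>
"""

print(dangling_fragments(SAMPLE))
```

On this sample the only fragment reported is the hand-coded ``long-term-bodywear`` link; the TOC links and backrefs all resolve, because they use the serial ids.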
solution/questions
------------------
ISTM docutils should _always_
1. for unique section names: generate `div id`s by `s/ /-/g`
2. for duplicate section names (and all backrefs): generate `div id`s by serial numbering, i.e. appending a serial number to string='id'
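To make the two rules above concrete, here is a rough sketch of the behavior I am proposing. It is NOT docutils' actual algorithm: `propose_ids` is a hypothetical helper, it only handles section names (not backrefs), and it assumes the serial ids it hands out do not collide with ids allocated elsewhere.

```python
import re
from collections import Counter


def propose_ids(section_names):
    """Assign ids by the proposed rule (hypothetical, not docutils' code).

    Unique names get 's/ /-/g'; names that occur more than once get
    'id' plus a serial number, counted in document order.
    """
    counts = Counter(section_names)
    ids = []
    serial = 0
    for name in section_names:
        if counts[name] == 1:
            # Rule 1: unique section name, replace each space with a dash.
            ids.append(re.sub(r" ", "-", name))
        else:
            # Rule 2: duplicate section name, fall back to a serial id.
            serial += 1
            ids.append(f"id{serial}")
    return ids


# Section names from problematic_naming_of_internal_anchors.rst (all unique).
NAMES = [
    "for further processing", "integrate", "move", "short-term",
    "next hardware run", "short-term bodywear", "long-term",
    "long-term bodywear", "long-term house goods", "lighting",
]

for name, section_id in zip(NAMES, propose_ids(NAMES)):
    print(f"{name!r} -> {section_id!r}")
```

Under this rule both 'short-term bodywear' and 'long-term bodywear' would get dashed ids, so both hand-coded links would resolve.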
So my first question is, am I missing something? Is there a reason to *not* behave thusly? If not:
My second question is, is there any reason to believe that docutils is *not* producing the above behavior? If so, please lemme know and I'll put an `issue on restview <https://github.com/mgedmin/restview/issues>`_. If not:
My third question presumes this behavior is due to a problem with docutils: is there anything else I should do to help get this fixed? Do I need to make an issue in a tracker? or do something to further debug the problem? or Something Completely Different?
conclusion/attachment
---------------------
If possible, please reply to me (directly) as well as to the list, and
TIA, Tom Roche <***@pobox.com>
-----problematic_naming_of_internal_anchors.rst follows to EOF
===
foo
===
.. contents:: **Table of Contents**
for further processing
======================
integrate
---------
move
----
short-term
==========
next hardware run
-----------------
short-term bodywear
-------------------
.. howto style a link (e.g., make it italic): see http://docutils.sourceforge.net/FAQ.html#is-nested-inline-markup-possible
.. |long-term bodywear| replace:: *long-term bodywear*
.. _long-term bodywear: #long-term-bodywear
*see also* |long-term bodywear|_
long-term
=========
long-term bodywear
------------------
.. |short-term bodywear| replace:: *short-term bodywear*
.. _short-term bodywear: #short-term-bodywear
*see also* |short-term bodywear|_
long-term house goods
---------------------
lighting
~~~~~~~~
_______________________________________________
Docutils-users mailing list
Docutils-***@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/docutils-users
Please use "Reply All" to reply to the list.