Mark Andrews
2015-02-03 11:54:00 UTC
(I asked the following question on Stack Overflow two days ago. It has not
been given any answers or even comments and very few views, so I was hoping
that it would be OK to ask it again here.)
I would like to extract the source code verbatim from the code directives
in a reStructuredText string.
What follows is my first attempt at doing this, but I would like to know if
there is a better (i.e. more robust, or more general, or more direct) way
of doing it.
Let's say I have the following rst text as a string in python:

s = '''
My title
========

Use this to square a number.

.. code:: python

   def square(x):
       return x**2

and here is some javascript too.

.. code:: javascript

   foo = function() {
       console.log('foo');
   }
'''

To get the two code blocks, I could do

from docutils.core import publish_doctree

doctree = publish_doctree(s)
source_code = [child.astext() for child in doctree.children
               if 'code' in child.attributes['classes']]

Now *source_code* is a list with just the verbatim source code from the two
code blocks. I could also use the *attributes* attribute of *child* to find
out the language of each block too, if necessary.
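
For comparison, here is a rough sketch of a variant that walks the whole
doctree instead of only the top-level children, so code directives nested in
sections or list items would also be picked up, and that returns the language
alongside the source. The helper name extract_code_blocks is my own invention,
not part of docutils, and I am assuming that the code directive always puts
'code' and then the language into the node's classes, as it appears to do
here. Setting syntax_highlight to 'none' is also just my choice, so that
Pygments is not needed to parse the directive.

```python
from docutils import nodes
from docutils.core import publish_doctree


def extract_code_blocks(rst):
    """Return a list of (language, source) pairs, one per code directive.

    extract_code_blocks is a hypothetical helper name, not a docutils API.
    """
    # Assumption: turning highlighting off keeps each block as plain text
    # and avoids requiring Pygments just to parse the directive.
    doctree = publish_doctree(
        rst, settings_overrides={'syntax_highlight': 'none'})

    # findall() exists in newer docutils releases; older ones only offer
    # traverse(). Both accept a node class and walk the whole tree.
    walk = getattr(doctree, 'findall', None) or doctree.traverse

    blocks = []
    for node in walk(nodes.literal_block):
        classes = node['classes']
        if 'code' not in classes:
            continue  # a plain "::" literal block, not a code directive
        # Take the first class other than 'code' as the language, if any.
        language = next((c for c in classes if c != 'code'), '')
        blocks.append((language, node.astext()))
    return blocks
```

Calling extract_code_blocks(s) on the string above should then give one
(language, source) pair per code directive.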
It does the job, but is there a better way?