[Docutils-users] Semantic annotation in reStructuredText
Mantas
2014-01-02 13:52:40 UTC
Is it possible to add semantic annotations in reStructuredText
documents?

For example, using RDFa, one could annotate things in HTML like this::

    <div xmlns:dc="http://purl.org/dc/elements/1.1/"
         about="http://www.example.com/books/wikinomics">
      <span property="dc:title">Wikinomics</span>
      <span property="dc:creator">Don Tapscott</span>
      <span property="dc:date">2006-10-01</span>
    </div>

This can be converted to Turtle format::

    @prefix dc: <http://purl.org/dc/elements/1.1/> .

    <http://www.example.com/books/wikinomics>
        dc:title "Wikinomics" ;
        dc:creator "Don Tapscott" ;
        dc:date "2006-10-01" .


As I understand it, the only way to achieve this in reST is to use roles,
something like this::

    .. rdf:: turtle

       @prefix dc: <http://purl.org/dc/elements/1.1/> .

       <http://www.example.com/books/wikinomics>
           dc:title {wikinomics-title} ;
           dc:creator {wikinomics-creator} ;
           dc:date {wikinomics-date} .

    .. role:: wikinomics-title
    .. role:: wikinomics-creator
    .. role:: wikinomics-date

    `Wikinomics`:wikinomics-title: `Don Tapscott`:wikinomics-creator:
    `2006-10-01`:wikinomics-date:.

Here ``{...}`` would be template variables whose values are taken from
the interpreted text of the matching roles.
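
To make the idea concrete, here is a minimal sketch of that substitution in
plain Python. It is not a Docutils extension: the function names are made up,
and quoting every value as a Turtle string literal is an assumption.

```python
import re

# Matches the suffix form of interpreted text: `text`:role-name:
ROLE_RE = re.compile(r"`([^`]+)`:([A-Za-z0-9][A-Za-z0-9_.-]*):")


def collect_roles(text):
    """Map each role name to the interpreted text it marks (last one wins)."""
    return {role: value for value, role in ROLE_RE.findall(text)}


def fill_template(template, values):
    """Replace {role-name} placeholders with quoted Turtle string literals."""
    def substitute(match):
        value = values[match.group(1)]
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"%s"' % escaped
    return re.sub(r"\{([A-Za-z0-9_.-]+)\}", substitute, template)


body = ("`Wikinomics`:wikinomics-title: `Don Tapscott`:wikinomics-creator:\n"
        "`2006-10-01`:wikinomics-date:.")
template = ("<http://www.example.com/books/wikinomics>\n"
            "    dc:title {wikinomics-title} ;\n"
            "    dc:creator {wikinomics-creator} ;\n"
            "    dc:date {wikinomics-date} .")
print(fill_template(template, collect_roles(body)))
```

A placeholder without a matching role raises ``KeyError`` in this sketch; a
real implementation would more likely report a parser warning.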

The problem is that roles must be defined before use, which in this
case is redundant.

Also, interpreted text can't be nested, while semantic annotation
sometimes needs nested annotations. For example, an RDFa annotation
for "Alice in Wonderland"::

    <div
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:foaf="http://xmlns.com/foaf/0.1/"
        xmlns:place="http://purl.org/ontology/places#"
        about="http://en.wikipedia.org/wiki/Alice_in_Wonderland">
      <span property="dc:title">
        <span property="foaf:givenName">Alice</span> in
        <span property="place:Country">Wonderland</span>
      </span>
    </div>

I think it would be better to have a more flexible form of interpreted
text for this, for example::

    .. rdf:: turtle

       @prefix dc: <http://purl.org/dc/elements/1.1/> .
       @prefix foaf: <http://xmlns.com/foaf/0.1/> .
       @prefix place: <http://purl.org/ontology/places#> .

       <http://en.wikipedia.org/wiki/Alice_in_Wonderland>
           dc:title {alice-in-wonderland} ;
           foaf:givenName {alice} ;
           place:Country {wonderland} .

    {{Alice|alice} in {Wonderland|wonderland}|alice-in-wonderland}.
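
The nested ``{text|name}`` form is easy to parse recursively. Below is a rough
sketch in plain Python, only to show that the syntax is unambiguous. The
function name ``parse_annotated`` is hypothetical, and the sketch assumes that
names contain neither ``|`` nor ``}``.

```python
def parse_annotated(source):
    """Parse nested {text|name} markup.

    Returns (plain_text, annotations), where annotations maps each
    name to the plain text it covers.
    """
    annotations = {}

    def parse(pos, inside_braces):
        parts = []
        while pos < len(source):
            char = source[pos]
            if char == "{":
                # Nested annotation: parse its text, then read its name.
                text, pos = parse(pos + 1, True)
                if pos >= len(source) or source[pos] != "|":
                    raise ValueError("expected '|' after annotated text")
                end = source.find("}", pos)
                if end == -1:
                    raise ValueError("missing closing '}'")
                annotations[source[pos + 1:end]] = text
                parts.append(text)
                pos = end + 1
            elif char == "|" and inside_braces:
                # End of the annotated text; the caller reads the name.
                break
            else:
                parts.append(char)
                pos += 1
        return "".join(parts), pos

    plain, _ = parse(0, False)
    return plain, annotations


plain, found = parse_annotated(
    "{{Alice|alice} in {Wonderland|wonderland}|alice-in-wonderland}.")
print(plain)
print(found)
```

The resulting mapping could then fill the ``{...}`` placeholders of the
``rdf`` block, while the plain text is what the reader sees.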


What do you think?


--
Mantas aka sirex
__o /\
_ \<,_ -- launchpad.net/~sirex -- /\/ \
___(_)/_(_)_____________________________/_/ \
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
engelbert gruber
2014-01-02 22:58:29 UTC
my 2c.

To me reStructuredText is a document formatted for humans and programs.
If you look at old RFCs, they look very similar, and they are for human
readers.

reST is not text with markup, and especially not with markup as excessive
as semantic markup, as this would make it unreadable for ... say, me?

reST differs from, e.g., LaTeX or RTF in that it is two-dimensional
(indentation matters).

Adding annotations to words is a third dimension, which is really difficult
to comprehend on a 2D screen or on paper.

(Very interesting problem, but it beats me.)

just my 2c

e
engelbert gruber
2014-01-03 10:43:42 UTC
hi , i reordered your mail

last thing first
By the way, the "_" prefix is very annoying in hyperlink targets, since I
can't use autocompletion, and it is very unintuitive, because reference
names in text are suffixed with "_", not prefixed.

That is because it is for human readers and writers: a trailing "_" points
to something, a leading "_" is pointed at.
In HTML::

    <a href="#thething">the thing</a>

points to::

    <a id="thething">the thing</a>

You have to distinguish the two for computers as well, because they are
different.
Quoting engelbert gruber (2014-01-03 00:58:29)
Post by engelbert gruber
adding annotations to words is a third dimension which really is
difficult
Post by engelbert gruber
to comprehend on a 2D screen/paper.
I think, this is the same thing, as links, for example annotation, could
`Alice_'s Adventures in Wonderland_`_.
http://dbpedia.org/resource/Alice_(Alice's_Adventures_in_Wonderland)
http://dbpedia.org/resource/Wonderland_(fictional_country)
http://dbpedia.org/resource/Alice's_Adventures_in_Wonderland
But this is only useful for RDF?
Wouldn't it then be better to write an RDF writer, put the links into a
second file or a separate section, and only merge them on output generation?
Keeping separate things separate keeps things simple, and simple things are
less complex, which means less buggy.

Although: semantic information reminds me of a document index, as both
refer to the important words.

all the best
e
This is quite complex example, that also includes nested links. This
way, we only provide links to external resources to separate text from
data description. It is not necessary to include triplets into document,
they can be provided as separate files. The only thing, that is missing
here, that real web links must be distinguished from RDF URI's, because
applications need to know how to render links if they are not links to
pages, but are semantic definitions. To solve this, ``rdf+`` prefix can
`Alice_'s Adventures in Wonderland_`_.
.. _Alice: rdf+
http://dbpedia.org/resource/Alice_(Alice's_Adventures_in_Wonderland)
.. _Wonderland: rdf+
http://dbpedia.org/resource/Wonderland_(fictional_country)
.. _Alice's Adventures in Wonderland: rdf+
http://dbpedia.org/resource/Alice's_Adventures_in_Wonderland
`Alice_'s Adventures in Wonderland_`_.
.. _Alice: dbpedia:Alice_(Alice's_Adventures_in_Wonderland)
.. _Wonderland: dbpedia:Wonderland_(fictional_country)
dbpedia:Alice's_Adventures_in_Wonderland
dbpedia: rdf+http://dbpedia.org/resource/
Guenter Milde
2014-01-03 22:44:41 UTC
Post by Mantas
Is it possible to add semantic annotations in reStructuredText
documents?
Currently not. At least not out of the box.
Post by Mantas
<div xmlns:dc="http://purl.org/dc/elements/1.1/"
about="http://www.example.com/books/wikinomics">
<span property="dc:title">Wikinomics</span>
<span property="dc:creator">Don Tapscott</span>
<span property="dc:date">2006-10-01</span>
</div>
@prefix dc: <http://purl.org/dc/elements/1.1/> .
<http://www.example.com/books/wikinomics>
dc:title "Wikinomics" ;
dc:creator "Don Tapscott" ;
dc:date "2006-10-01" .
As I understand, only way to achieve this in reST, is using rules, some
.. rdf:: turtle
@prefix dc: <http://purl.org/dc/elements/1.1/> .
<http://www.example.com/books/wikinomics>
dc:title {wikinomics-title} ;
dc:creator {wikinomics-creator} ;
dc:date {wikinomics-date} .
.. role:: wikinomics-title
.. role:: wikinomics-creator
.. role:: wikinomics-date
`2006-10-01`:wikinomics-date:.
Where, ``{...}`` would be used as template variables with values taken
from roles interpreted texts.
There is no "rdf" directive. If you are creating a new directive, you can
define a list of options (e.g. the set of dc identifiers) and then write
(and parse) something like::

    .. rdf:: http://www.example.com/books/wikinomics
       :title: Wikinomics
       :creator: Don Tapscott
       :date: 2006-10-01

but maybe I don't understand your problem.
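
For illustration, the option form above could be turned into Turtle roughly
like this. It is a plain-Python sketch rather than a Docutils directive; the
function name ``options_to_turtle`` is made up, and it assumes that every
option name is a Dublin Core element name.

```python
import re

DC_NS = "http://purl.org/dc/elements/1.1/"
# An option line as written in the directive: ":name: value"
OPTION_RE = re.compile(r"^:([A-Za-z][\w-]*):\s*(.+)$")


def options_to_turtle(subject, option_lines):
    """Render ':name: value' option lines as Dublin Core triples in Turtle."""
    statements = []
    for line in option_lines:
        match = OPTION_RE.match(line.strip())
        if match is None:
            continue  # not an option line, skip it
        name, value = match.groups()
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        statements.append('    dc:%s "%s"' % (name, escaped))
    if not statements:
        raise ValueError("no options found")
    return "@prefix dc: <%s> .\n\n<%s>\n%s .\n" % (
        DC_NS, subject, " ;\n".join(statements))


print(options_to_turtle(
    "http://www.example.com/books/wikinomics",
    [":title: Wikinomics", ":creator: Don Tapscott", ":date: 2006-10-01"]))
```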
Post by Mantas
The problem is, that roles, must be defined, before use, which in this
case is redundant.
...
Post by Mantas
What do you think?
Maybe reStructuredText/Docutils is not the right tool for your task?

Günter
Mantas
2014-01-05 15:35:53 UTC
Sorry if this is a duplicate email, but I can't find my last email in
the list archives, so I guess it was lost (Gmail sometimes refuses to
deliver my emails).


---------- Forwarded message ----------
From: Mantas <***@gmail.com>
Date: 2014/1/4
Subject: Re: [Docutils-users] Semantic annotation in reStructuredText
To: Guenter Milde <***@users.sf.net>, docutils-***@lists.sourceforge.net


Quoting Guenter Milde (2014-01-04 00:44:41)
Post by Guenter Milde
Maybe reStructuredText/Docutils is not the right tool for your task?
I'm writing my master's thesis about using linked data in legal documents,
and I'm evaluating reStructuredText as one of the possible formats for
authoring legal documents. So far it seems that reStructuredText works
quite well for this.

In legal documents, annotations would first of all be used in fields, to
link things to URIs for general document metadata. Legal text must be very
explicit, and the ability to annotate things helps a lot. And of course,
annotated text is very friendly to search engines. Given a well-annotated
database of legal documents, one could query for documents in a very
specific way and get exactly what was asked for.

RDFa works very well for HTML, but raw HTML is hard to write, especially
when annotated with RDFa. To use HTML and RDFa for legal documents, complex
tools and editors would have to be implemented first. That's why I'm
looking for ways to avoid complex tooling and still be able to use all the
power that semantic annotation brings.

Another format I'm looking at is `Akoma Ntoso`_, designed specifically for
legal acts, but again, this is XML. reStructuredText converted to XML is
very close to the `Akoma Ntoso`_ XML structure; even some tags are
identical. So I think reStructuredText is best suited for writing the text,
which can then be converted to XML for other tools and libraries, like
`Akoma Ntoso`_.

.. _Akoma Ntoso: http://www.akomantoso.org/

Here is what I have come up with so far. An example document that I want to
annotate::

    My name is Manu Sporny and you can give me a ring via
    1-800-555-0199. My favorite animal is the Liger.

Using RDFa_, the annotated HTML version would be::

    <p vocab="http://schema.org/"
       prefix="ov: http://open.vocab.org/terms/"
       resource="#manu" typeof="Person">
      My name is
      <span property="name">Manu Sporny</span>
      and you can give me a ring via
      <span property="telephone">1-800-555-0199</span>.
      My favorite animal is the
      <span property="ov:preferredAnimal">Liger</span>.
    </p>

.. _RDFa: http://www.w3.org/TR/rdfa-lite/


Now, in reST, annotation needs one custom directive and one custom role;
I will call both ``rdf``. The directive is used to define the base
vocabulary, the prefixes, the resource mentioned in the text, the type of
that resource, and the properties. Properties work the same way as
anonymous hyperlink targets using the ``__`` notation; I will call them
anonymous property targets.

The ``rdf`` role marks the text snippets that are to be annotated; I will
call these annotation references. The target URI of an annotation reference
can be embedded directly or assigned indirectly, through anonymous
properties or the ``rdf`` directive. An annotation reference can embed a
resource URI together with a type and a property, or only a property
assignment.

So an annotated reST document using these features could look like this::

    My name is `Manu Sporny`:rdf: and you can give me a ring via
    `1-800-555-0199`:rdf:. My favorite animal is the `Liger`:rdf:.

    .. rdf::
       :vocab: http://schema.org/
       :prefix: ov: http://open.vocab.org/terms/
       :resource: #manu
       :typeof: Person
       :property: name
       :property: phone
       :property: ov:preferredAnimal

In this form all annotation references are anonymous, and their meaning is
assigned below by the ``rdf`` directive.
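
The pairing itself would be positional, like anonymous hyperlinks. A minimal
sketch in plain Python follows; ``pair_anonymous`` is a hypothetical helper,
and treating a count mismatch as an error is an assumption modelled on how
Docutils handles anonymous hyperlink mismatches.

```python
import re

# An anonymous annotation reference: `text`:rdf: with no embedded <...> part
ANON_REF_RE = re.compile(r"`([^`<>]+)`:rdf:")


def pair_anonymous(text, properties):
    """Pair anonymous `...`:rdf: references with properties, in order."""
    references = ANON_REF_RE.findall(text)
    if len(references) != len(properties):
        raise ValueError(
            "%d references but %d properties"
            % (len(references), len(properties)))
    return list(zip(properties, references))


paragraph = (
    "My name is `Manu Sporny`:rdf: and you can give me a ring via\n"
    "`1-800-555-0199`:rdf:. My favorite animal is the `Liger`:rdf:.")
for prop, value in pair_anonymous(
        paragraph, ["name", "phone", "ov:preferredAnimal"]):
    print(prop, "=", value)
```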

Another example, with a predefined resource and embedded properties::

    .. rdf::
       :vocab: http://schema.org/
       :prefix: ov: http://open.vocab.org/terms/
       :resource: #manu
       :typeof: Person
       :usefor: all

    My name is `Manu Sporny <a name>`:rdf: and you can give me a ring via
    `1-800-555-0199 <a phone>`:rdf:. My favorite animal is the `Liger <a
    ov:preferredAnimal>`:rdf:.

Here the resource and its type are predefined in the ``rdf`` directive, and
all following annotation references can use them, since ``usefor`` specifies
that this resource is the default for all annotation references.

It is possible to override the default resource for anonymous annotation
references::

    .. rdf::
       :vocab: http://schema.org/
       :prefix: ov: http://open.vocab.org/terms/
       :resource: #manu
       :typeof: Person
       :usefor: all

    My name is `Manu Sporny <a name>`:rdf: and you can give me a ring via
    `1-800-555-0199 <a phone>`:rdf:. My friend's favorite animal is the
    `Liger`:rdf:.

    .. rdf::
       :resource: #manu-friend
       :typeof: Person
       :property: ov:preferredAnimal

It is possible to define the default resource and type for all following
annotation references in one paragraph using the embedded form::

    .. rdf::
       :vocab: http://schema.org/
       :prefix: ov: http://open.vocab.org/terms/

    My name is `Manu Sporny <#manu a Person, name>`:rdf: and you can give me
    a ring via `1-800-555-0199 <a phone>`:rdf:. My favorite animal is the
    `Liger <a ov:preferredAnimal>`:rdf:.

If an external triple store already has information about the resource
type, then just the resource URI can be specified::

    .. rdf::
       :vocab: http://schema.org/
       :prefix: ov: http://open.vocab.org/terms/

    My name is `Manu Sporny <#manu, name>`:rdf: and you can give me a ring via
    `1-800-555-0199 <a phone>`:rdf:. My friend's favorite animal is the `Liger
    <#manu-friend, ov:preferredAnimal>`:rdf:.

And finally, it is possible to assign a default property::

    .. rdf::
       :vocab: http://schema.org/
       :prefix: ov: http://open.vocab.org/terms/
       :property: name
       :usefor: all

    My name is `Manu Sporny <#manu>`:rdf: and you can give me a ring via
    `1-800-555-0199 <a phone>`:rdf:. My friend's favorite animal is the `Liger
    <#manu-friend, ov:preferredAnimal>`:rdf:.


What do you think?

I think it is quite readable and fully convertible to RDFa Lite, and of
course one can easily get a set of RDF triples directly from the XML that
Docutils produces from the reST source.
Guenter Milde
2014-01-05 21:03:07 UTC
Dear Mantas,
Post by Mantas
Quoting Guenter Milde (2014-01-04 00:44:41)
Post by Guenter Milde
Maybe reStructuredText/Docutils is not the right tool for your task?
I'm writing my master thesis about using linked data in legal documents.
And evaluating reStructuredText as one of possible formats that could be
used for legal documents authoring. And at least now it seems, that
reStructuredText works for this quite well.
...

OK. I see that while reStructuredText/Docutils will not work out of the
box for your task, it may become a component in a framework for authoring
such documents. You may use a special "writer" or an XML post-processor
to get the desired output.
Post by Mantas
Here is what I came up to at this point. An example document that I
My name is Manu Sporny and you can give me a ring via
1-800-555-0199. My favorite animal is the Liger.
<p vocab="http://schema.org/"
prefix="ov: http://open.vocab.org/terms/"
resource="#manu" typeof="Person">
My name is
<span property="name">Manu Sporny</span>
and you can give me a ring via
<span property="telephone">1-800-555-0199</span>.
My favorite animal is the
<span property="ov:preferredAnimal">Liger</span>.
</p>
.. _RDFa: http://www.w3.org/TR/rdfa-lite/
Now in reST, for annotation one custom directive and one role is needed,
I will call them both with ``rdf``. Custom directive will be used for
defining base vocabulary, prefixes, resource that is mentioned in text,
type of that resource and properties. Properties works same way as
anonymous hyperlink targets using ``__`` notation, here I will call them
anonymous property targets.
Custom role, called ``rdf`` is used to reference text snippets that are
going to be annotated, I will call them annotation references. Target
URI's for Annotation references can embedded directly, can be assigned
indirectly using anonymous properties or ``rdf`` directive. Annotation
references also can embed resource URI together with type and property.
Annotation references can embed only property assignment.
My name is `Manu Sporny`:rdf: and you can give me a ring via
`1-800-555-0199`:rdf:. My favorite animal is the `Liger`:rdf:.
:vocab: http://schema.org/
:prefix: ov: http://open.vocab.org/terms/
:resource: #manu
:typeof: Person
:property: name
:property: phone
:property: ov:preferredAnimal
In this form all annotation references are anonymous and meaning for these
references are assigned below with ``rdf`` directive.
To my eyes, a better match to the "annotated paragraph" object would be a
directive with content::

    .. rdf:: manu
       :vocab: http://schema.org/
       :prefix: ov: http://open.vocab.org/terms/
       :typeof: Person
       :property: name
       :property: phone
       :property: ov:preferredAnimal

       My name is `Manu Sporny`:rdf: and you can give me a ring via
       `1-800-555-0199`:rdf:. My favorite animal is the `Liger`:rdf:.

* It becomes clear, that the paragraph becomes the "rdf resource object".

* The resource tag is specified as directive argument - if it should be
  optional, I'd use the standard option for identifiers, ``:name:``.

* It could be possible to program the directive in a way that for the
  directive content the "default role" becomes "rdf". Then the content
  would be easier to read and write::

      My name is `Manu Sporny` and you can give me a ring via
      `1-800-555-0199`. My favorite animal is the `Liger`.
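
As a rough sketch of what a writer or post-processor for such a directive
might emit, the plain-Python function below renders the argument, options and
content as one RDFa paragraph. It does not use the Docutils API; the name
``render_rdfa`` and its parameters are made up, and it assumes that bare
`` `...` `` spans and `` `...`:rdf: `` spans are both annotation references,
matched to the ``:property:`` options in order.

```python
import re
from html import escape

# A reference in the directive content, with or without the explicit role
REF_RE = re.compile(r"`([^`]+)`(?::rdf:)?")


def render_rdfa(resource, vocab, prefix, typeof, properties, content):
    """Render directive data as a single RDFa-annotated <p> element."""
    remaining = list(properties)
    pieces = []
    last = 0
    for match in REF_RE.finditer(content):
        if not remaining:
            raise ValueError("more references than properties")
        # Escape the plain text before the reference, then wrap the reference.
        pieces.append(escape(content[last:match.start()]))
        pieces.append('<span property="%s">%s</span>'
                      % (escape(remaining.pop(0)), escape(match.group(1))))
        last = match.end()
    if remaining:
        raise ValueError("more properties than references")
    pieces.append(escape(content[last:]))
    return ('<p vocab="%s" prefix="%s" resource="#%s" typeof="%s">%s</p>'
            % (escape(vocab), escape(prefix), escape(resource),
               escape(typeof), "".join(pieces)))


print(render_rdfa(
    "manu", "http://schema.org/", "ov: http://open.vocab.org/terms/",
    "Person", ["name", "ov:preferredAnimal"],
    "My name is `Manu Sporny` and my favorite animal is the `Liger`."))
```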
Post by Mantas
:vocab: http://schema.org/
:prefix: ov: http://open.vocab.org/terms/
:resource: #manu
:typeof: Person
:usefor: all
My name is `Manu Sporny <a name>`:rdf: and you can give me a ring via
`1-800-555-0199 <a phone>`:rdf:. My favorite animal is the `Liger <a
ov:preferredAnimal>`:rdf:.
Here, you could use custom roles inheriting from rdf::

    .. role:: name(rdf)
    .. role:: phone(rdf)
    .. role:: animal(rdf)
       :class: ov:preferredAnimal

and then write::

    My name is `Manu Sporny`:name: and you can give me a ring via
    :phone:`1-800-555-0199`. My favorite animal is the :animal:`Liger`.

Of course, this only makes sense if there is more than one instance using
the custom roles.

...
Post by Mantas
What do you think?
It still looks complicated to me, especially the "usefor" forms and
nesting. This may be due to my lack of experience in the field.
Post by Mantas
I think, it is quite readable and fully compatible and convertible to
RDFa Lite and of course one can easily get set of RDF triples directly
from reST's XML file.
I agree that it looks promising.

Günter
Matěj Cepl
2014-01-05 22:07:53 UTC
Post by Mantas
Sorry if this is duplicate email, but I can't find my last sent email
on list archives, so I guess that my email is lost (gmail some times
refuses to deliver my emails).
You may be interested in the community around LyX for lawyers, I think.
http://wiki.lyx.org/LyX/HumanitiesLyX
https://groups.google.com/forum/#!forum/latex-for-lawyers
http://www.ctan.org/topic/legal
http://www.jurawiki.de/LaTeX (unfortunately, in German only)

Best,

Matěj
--
http://www.ceplovi.cz/matej/, Jabber: ***@ceplovi.cz
GPG Finger: 89EF 4BC6 288A BF43 1BAB 25C3 E09F EF25 D964 84AC

We can tell our level of faith in what God wants to do for us by
our level of enthusiasm for what we want God to do for other.
-- Dave Schmelzer