Apache: Automatically convert rST to HTML

At work, our intranet search engine indexes meeting minutes and documentation that we write in reStructuredText. Clicking such a file in the search results shows it as plain text in the browser, which is not always easy to read.

My goal last Friday was to make Apache automatically convert .rst files to HTML when a browser requests them.

Setup

The python-docutils package contains rst2html, which does what I wanted: it reads rST from stdin and writes HTML to stdout.

Apache, on the other hand, has mod_ext_filter, which was made for piping content through an external tool. The setup was easy:

ExtFilterDefine rst2html mode=output \
    outtype=text/html \
    cmd="/usr/bin/rst2html"
 
<FilesMatch "\.rst$">
    SetOutputFilter rst2html
</FilesMatch>
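
Conceptually, an output filter like this hands the response body to the external command on stdin and sends whatever comes back on stdout to the client. The following is only a rough sketch of that contract, not of how Apache implements it; run_output_filter and DEMO_FILTER are hypothetical names, and the demo command is a stand-in so the sketch runs without docutils installed.

```python
import subprocess
import sys


def run_output_filter(cmd, body):
    """Pipe a response body through an external command.

    cmd  -- argument list of the external program, e.g. ["/usr/bin/rst2html"]
    body -- bytes that would otherwise be sent to the client unchanged
    """
    # The command receives the body on stdin; its stdout becomes the new body.
    result = subprocess.run(cmd, input=body, stdout=subprocess.PIPE, check=True)
    return result.stdout


# Stand-in filter (an assumption for this demo): upper-cases its input.
DEMO_FILTER = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

if __name__ == "__main__":
    print(run_output_filter(DEMO_FILTER, b"hello\n").decode(), end="")
```

With docutils installed, passing ["/usr/bin/rst2html"] as cmd and the contents of an .rst file as body should return the HTML page that the browser would receive.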

This works fine but gives everyone HTML, even tools that only understand plain text. Luckily, we can detect whether an HTTP client supports HTML by reading its Accept header. mod_ext_filter can enable a filter conditionally, based on an environment variable, and mod_setenvif can set environment variables based on HTTP request headers:

ExtFilterDefine rst2html mode=output \
    outtype=text/html \
    EnableEnv=supports_html \
    cmd="/usr/bin/rst2html"
 
<FilesMatch "\.rst$">
    SetEnvIf Accept text/html supports_html
    SetOutputFilter rst2html
</FilesMatch>
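
The decision these two directives make together can be modelled in a few lines. This is a simplified sketch, not Apache's code: choose_response is a hypothetical name, and text/plain as the type of an unfiltered .rst file is an assumption about the server's configuration.

```python
import re


def choose_response(accept_header):
    """Return (apply_filter, content_type) for a request to a .rst file."""
    # SetEnvIf treats "text/html" as a regular expression and matches it
    # anywhere in the header value; a missing header counts as empty.
    supports_html = re.search(r"text/html", accept_header or "") is not None

    # EnableEnv: the filter only runs if the variable was set.
    if supports_html:
        return True, "text/html"   # outtype replaces the content type
    return False, "text/plain"     # assumed default type for .rst files


if __name__ == "__main__":
    print(choose_response("text/html,application/xhtml+xml,*/*;q=0.8"))
    print(choose_response("*/*"))
```

A client that only sends a wildcard such as */* does not match and keeps getting the raw rST file.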

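The HTML support check is very basic: it only looks for the string text/html in the Accept header and does not parse quality values at all, so a client sending text/html;q=0 would still get HTML. It suffices for now, but I'd like something more robust.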
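
To illustrate what taking quality values into account would mean, here is a minimal sketch of a stricter check. It is a standalone illustration, not something wired into the Apache setup above; explicit_html_quality and wants_html are hypothetical names. As an assumption that keeps the behaviour of the simple check, wildcards such as */* are deliberately ignored, whereas full content negotiation would weigh them as well.

```python
def explicit_html_quality(accept_header):
    """Return the q-value of an explicit text/html entry, or 0.0 if absent."""
    for item in (accept_header or "").split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "text/html":
            continue
        q = 1.0  # a media range without a q parameter defaults to 1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        return q
    return 0.0


def wants_html(accept_header):
    """True only if text/html is listed with a quality above zero."""
    return explicit_html_quality(accept_header) > 0.0


if __name__ == "__main__":
    print(wants_html("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
    print(wants_html("text/plain, text/html;q=0"))
```

The second call is the case where the two checks differ: a plain string match on text/html accepts that header, while this version rejects it.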

If you know a better solution, please send me an e-mail.

Written by Christian Weiske.
