How it works

mod_proxy_html is based on a SAX parser: specifically, the HTMLparser module from libxml2 running in SAX mode (any other parse mode would be very much slower, especially for larger documents). It has full knowledge of all URI attributes that can occur in HTML 4 and XHTML 1. Whenever a URL is encountered, it is matched against the applicable ProxyHTMLURLMap directives. If the URL starts with a from-pattern, that prefix is rewritten to the corresponding to-pattern. Rules are applied in the reverse order of their appearance in httpd.conf, and matching stops as soon as a match is found.
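
To make the matching rule concrete, here is a minimal Python sketch of the behaviour just described: starts-with matching, rules tried in reverse order, and stopping at the first match. It is a simplified model, not the module's actual code. The function name rewrite_url and the two sample rules (including the hostname internal.example.com and the /assets target) are hypothetical, chosen only for illustration.

```python
def rewrite_url(url, rules):
    """Return url rewritten by the first matching rule, or unchanged.

    rules is a list of (from_pattern, to_pattern) pairs in the order
    they would appear in the configuration (hypothetical model).
    """
    # Rules are tried in reverse order of appearance.
    for from_pattern, to_pattern in reversed(rules):
        if url.startswith(from_pattern):
            # Only the matched prefix is replaced; the rest is kept.
            return to_pattern + url[len(from_pattern):]
    # No rule matched: leave the URL alone.
    return url


# Hypothetical rules: the second is more specific and appears later,
# so it is tried first.
rules = [
    ("http://internal.example.com", "/app"),
    ("http://internal.example.com/static", "/assets"),
]

print(rewrite_url("http://internal.example.com/static/site.css", rules))
# /assets/site.css
print(rewrite_url("http://internal.example.com/index.html", rules))
# /app/index.html
print(rewrite_url("http://www.example.org/", rules))
# http://www.example.org/
```

Because later rules are tried first, a more specific pattern placed after a general one takes precedence in this model.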

Here's how we set up a reverse proxy for HTML. First, full links to the internal servers should be rewritten wherever they occur, so we have:


ProxyHTMLURLMap http://internal1.example.com /app1
ProxyHTMLURLMap http://internal2.example.com /app2

Note that in this instance we omitted the trailing slash from both patterns. Since the matching logic is starts-with, we use the minimal matching pattern, which also catches a link to the bare hostname. We have now globally fixed case 3 above.
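
The effect of the trailing slash can be checked with a small Python sketch of starts-with matching. This is an illustration only: the helper name rewrite is hypothetical, and the with_slash pair is a made-up alternative to the rule actually used above.

```python
def rewrite(url, from_pattern, to_pattern):
    """Replace from_pattern at the start of url with to_pattern, if present."""
    if url.startswith(from_pattern):
        return to_pattern + url[len(from_pattern):]
    return url


# The rule as configured above, and a hypothetical variant with slashes.
minimal = ("http://internal1.example.com", "/app1")
with_slash = ("http://internal1.example.com/", "/app1/")

bare = "http://internal1.example.com"
deep = "http://internal1.example.com/docs/index.html"

print(rewrite(bare, *minimal))     # /app1
print(rewrite(deep, *minimal))     # /app1/docs/index.html
print(rewrite(bare, *with_slash))  # http://internal1.example.com
print(rewrite(deep, *with_slash))  # /app1/docs/index.html
```

Both forms handle the deep link identically, but only the minimal pattern rewrites a link to the bare hostname.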

Case 2 above requires a little more care. Because the link doesn't include the hostname, the rewrite rule must be context-sensitive. As with ProxyPassReverse above, we deal with that using <Location>:


<Location /app1/>
        ProxyHTMLURLMap / /app1/
</Location>
<Location /app2/>
        ProxyHTMLURLMap / /app2/
</Location>
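
The context-sensitive behaviour can be sketched in Python as follows. This is a simplified model under stated assumptions, not a description of how httpd merges configuration sections: it simply appends the rules of any matching <Location> to the global ones. The names GLOBAL_RULES, LOCATION_RULES, applicable_rules and rewrite_link are hypothetical. The global and scoped patterns in this configuration never match the same link, so the result here does not depend on the merge order.

```python
# Server-wide rules, as configured earlier.
GLOBAL_RULES = [
    ("http://internal1.example.com", "/app1"),
    ("http://internal2.example.com", "/app2"),
]

# Rules that apply only to responses served under a given location.
LOCATION_RULES = {
    "/app1/": [("/", "/app1/")],
    "/app2/": [("/", "/app2/")],
}


def applicable_rules(request_path):
    """Collect global rules plus those of any location containing the path."""
    rules = list(GLOBAL_RULES)
    for location, scoped in LOCATION_RULES.items():
        if request_path.startswith(location):
            rules.extend(scoped)
    return rules


def rewrite_link(link, request_path):
    """Rewrite one link found in the page served at request_path."""
    for from_pattern, to_pattern in reversed(applicable_rules(request_path)):
        if link.startswith(from_pattern):
            return to_pattern + link[len(from_pattern):]
    return link


# The same server-relative link is rewritten differently per application.
print(rewrite_link("/images/logo.png", "/app1/index.html"))
# /app1/images/logo.png
print(rewrite_link("/images/logo.png", "/app2/index.html"))
# /app2/images/logo.png
# Full links are still handled by the global rules.
print(rewrite_link("http://internal2.example.com/help", "/app1/index.html"))
# /app2/help
```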