Fixing HTML Links

As we have seen, ProxyPassReverse remaps URLs in the HTTP headers to ensure they work from outside the company network. A separate problem arises, however, when links appear in the HTML pages being served. Consider the following cases:

  1. <a href="somefile.html">This link will be resolved by the browser and will work correctly.</a>
  2. <a href="/otherfile.html">This link will be resolved by the browser to http://www.example.com/otherfile.html, which is incorrect.</a>
  3. <a href="http://internal1.example.com/">This link will resolve to "no such host" for the browser.</a>

The same problem of course applies to included content such as images, stylesheets, scripts or applets, and other contexts where URLs occur in HTML.
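The three cases can be checked with a minimal Python sketch that resolves each link the way a browser would. It assumes, purely for illustration, that the proxy publishes the internal server under /app1/ on www.example.com; that mount path and the PAGE_URL and resolve names are not part of the text above.

```python
from urllib.parse import urljoin

# Assumption for illustration: the internal server is published under /app1/,
# so this is the URL at which the browser fetched the page.
PAGE_URL = "http://www.example.com/app1/index.html"


def resolve(href: str) -> str:
    """Resolve a link relative to the page's public URL, as a browser does."""
    return urljoin(PAGE_URL, href)


links = ["somefile.html", "/otherfile.html", "http://internal1.example.com/"]
resolved = {href: resolve(href) for href in links}

for href, url in resolved.items():
    print(f"{href} -> {url}")
```

Only the first link stays inside the proxied path. The second drops the /app1/ prefix and lands on the proxy's own document root, and the third still names the internal host, which an outside browser cannot look up.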

Fixing this requires parsing the HTML and rewriting the links. This is the purpose of mod_proxy_html, which works as an output filter, parsing the HTML and rewriting links as the page is served. Two configuration directives are required to set it up:

  • SetOutputFilter proxy-html: This simply inserts the filter, which is what enables ProxyHTMLURLMap.
  • ProxyHTMLURLMap from-pattern to-pattern [flags]: In its basic form, this has a similar purpose and semantics to ProxyPassReverse. Additionally, an extended form is available to enable search-and-replace rewriting of URLs within scripts and stylesheets.
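To make the idea of a from-pattern/to-pattern map concrete, here is a rough Python sketch of prefix-based link rewriting. It is not how mod_proxy_html is implemented, and it does not model the directive's flags or the extended form. The URL_MAP contents, the /app1/ path, and the first-match-wins ordering are assumptions chosen for illustration, and the sketch makes no attempt to round-trip every HTML construct.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical (from, to) prefix pairs; the first matching prefix wins.
URL_MAP = [
    ("http://internal1.example.com/", "/app1/"),
    ("/app1/", "/app1/"),  # already mapped: leave unchanged
    ("/", "/app1/"),
]
URL_ATTRS = {"href", "src"}  # a small subset of the attributes that hold URLs


def map_url(url):
    """Replace the first matching prefix; otherwise return the URL unchanged."""
    if url.startswith("//"):  # protocol-relative URL: names another host
        return url
    for prefix, replacement in URL_MAP:
        if url.startswith(prefix):
            return replacement + url[len(prefix):]
    return url


class LinkRewriter(HTMLParser):
    """Re-emit the parsed HTML, mapping URLs found in URL_ATTRS."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []

    def _tag(self, tag, attrs, close=""):
        parts = [tag]
        for name, value in attrs:
            if value is None:  # bare attribute such as "disabled"
                parts.append(name)
                continue
            if name in URL_ATTRS:
                value = map_url(value)
            parts.append(f'{name}="{escape(value, quote=True)}"')
        return "<" + " ".join(parts) + close + ">"

    def handle_starttag(self, tag, attrs):
        self.out.append(self._tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._tag(tag, attrs, " /"))

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def rewrite_links(html):
    parser = LinkRewriter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Under these assumptions, rewrite_links('<a href="/otherfile.html">x</a>') yields a link to /app1/otherfile.html, a link to http://internal1.example.com/page becomes /app1/page, and a relative link such as somefile.html passes through untouched.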