Fixing HTML Links
As we have seen, ProxyPassReverse remaps URLs in the HTTP headers to
ensure they work from outside the company network. There is, however,
a separate problem when links appear in the HTML pages being served.
Consider the following cases:
- <a href="somefile.html">This link will be resolved by the browser
and will work correctly.</a>
- <a href="/otherfile.html">This link will be resolved by the
browser to http://www.example.com/otherfile.html, which is
incorrect.</a>
- <a href="http://internal1.example.com/">This link will resolve to
"no such host" for the browser.</a>
The same problem, of course, applies to included content such as
images, stylesheets, scripts or applets, and to any other context
where URLs occur in HTML.
Fixing this requires parsing the HTML and rewriting the links, which
is the purpose of mod_proxy_html. It works as an output filter,
parsing the HTML and rewriting links as the page is served. Two
configuration directives are required to set it up:
- SetOutputFilter proxy-html
  This simply inserts the filter, which enables ProxyHTMLURLMap.
- ProxyHTMLURLMap from-pattern to-pattern [flags]
  In its basic form, this has a similar purpose and semantics to
  ProxyPassReverse. Additionally, an extended form is available to
  enable search-and-replace rewriting of URLs within scripts and
  stylesheets.
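
As a minimal sketch of how the two directives might fit together, the
fragment below assumes that the internal server
internal1.example.com is published under the path /app1/ on the
proxy. The /app1/ path is an assumption made for illustration and
should be adapted to the actual proxy mapping.

```apacheconf
# Assumption: internal1.example.com is exposed as /app1/ on the proxy.
ProxyPass /app1/ http://internal1.example.com/

<Location /app1/>
    # Fix URLs in HTTP response headers (e.g. Location on redirects).
    ProxyPassReverse http://internal1.example.com/

    # Insert the mod_proxy_html output filter for this location.
    SetOutputFilter proxy-html

    # Rewrite absolute links that name the internal host.
    ProxyHTMLURLMap http://internal1.example.com/ /app1/

    # Rewrite server-relative links such as /otherfile.html.
    ProxyHTMLURLMap / /app1/
</Location>
```

Under these assumptions, a link to /otherfile.html in a proxied page
would be rewritten to /app1/otherfile.html, and a link to
http://internal1.example.com/ would become /app1/. Relative links such
as somefile.html need no rule, since the browser already resolves them
correctly.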