<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->

<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

<manualpage metafile="advanced.xml.meta">
<parentdocument href="./">Rewrite</parentdocument>

<title>Advanced Techniques with mod_rewrite</title>

<summary>

<p>This document supplements the <module>mod_rewrite</module>
<a href="../mod/mod_rewrite.html">reference documentation</a>. It provides
a few advanced techniques and tricks using mod_rewrite.</p>

<note type="warning">Note that many of these examples won't work unchanged in your
particular server configuration, so it's important that you understand
them, rather than merely cutting and pasting the examples into your
configuration.</note>

</summary>
<seealso><a href="../mod/mod_rewrite.html">Module documentation</a></seealso>
<seealso><a href="intro.html">mod_rewrite introduction</a></seealso>
<seealso><a href="remapping.html">Redirection and remapping</a></seealso>
<seealso><a href="access.html">Controlling access</a></seealso>
<seealso><a href="vhosts.html">Virtual hosts</a></seealso>
<seealso><a href="proxy.html">Proxying</a></seealso>
<seealso><a href="rewritemap.html">Using RewriteMap</a></seealso>
<!--<seealso><a href="advanced.html">Advanced techniques and tricks</a></seealso>-->
<seealso><a href="avoid.html">When not to use mod_rewrite</a></seealso>

<section id="sharding">

<title>URL-based sharding across multiple backends</title>

<dl>
<dt>Description:</dt>

<dd>
<p>A common technique for distributing the burden of
server load or storage space is called "sharding".
When using this method, a front-end server will use the
URL to consistently "shard" users or objects to separate
backend servers.</p>
</dd>

<dt>Solution:</dt>

<dd>
<p>A mapping is maintained, from users to target servers, in
external map files. They look like:</p>

<example>
user1 physical_host_of_user1<br />
user2 physical_host_of_user2<br />
: :
</example>

<p>We put this into a <code>map.users-to-hosts</code> file. The
aim is to map:</p>

<example>
/u/user1/anypath
</example>

<p>to</p>

<example>
http://physical_host_of_user1/u/user1/anypath
</example>

<p>so that a given URL path needs to be valid only on the backend
host that serves that user, not on every backend. The following
ruleset does this for us with the help of the map file, assuming
that <code>server0</code> is a default server which will be used if
a user has no entry in the map:</p>

<example>
RewriteEngine on<br />
<br />
RewriteMap users-to-hosts txt:/path/to/map.users-to-hosts<br />
<br />
RewriteRule ^/u/<strong>([^/]+)</strong>/?(.*) http://<strong>${users-to-hosts:$1|server0}</strong>/u/$1/$2
</example>
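
<p>As a worked illustration, here is a minimal sketch with two
hypothetical users and hosts (<code>alice</code>, <code>bob</code>,
<code>host1.example.com</code> and <code>host2.example.com</code> are
made-up names, not part of the recipe itself). Because the
substitution is an absolute URL and the rule carries no flags, the
front-end answers with an external redirect to the chosen backend:</p>

<example>
# map.users-to-hosts (hypothetical entries)<br />
alice host1.example.com<br />
bob host2.example.com<br />
<br />
# /u/alice/docs/a.html is sent to http://host1.example.com/u/alice/docs/a.html<br />
# /u/bob/ is sent to http://host2.example.com/u/bob/<br />
# /u/carol/x has no map entry, so it is sent to http://server0/u/carol/x
</example>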
</dd>
</dl>

</section>

<section id="on-the-fly-content">

<title>On-the-fly Content-Regeneration</title>

<dl>
<dt>Description:</dt>

<dd>
<p>We wish to dynamically generate content, but store it
statically once it is generated. This rule will check for the
existence of the static file, and if it's not there, generate
it. The static files can be removed periodically, if desired (say,
via cron) and will be regenerated on demand.</p>
</dd>

<dt>Solution:</dt>

<dd>
<p>This is done via the following ruleset:</p>

<example>
# This example is valid in per-directory context only<br />
RewriteCond %{REQUEST_FILENAME} <strong>!-s</strong><br />
RewriteRule ^page\.<strong>html</strong>$ page.<strong>cgi</strong> [T=application/x-httpd-cgi,L]
</example>

<p>Here a request for <code>page.html</code> leads to an
internal run of a corresponding <code>page.cgi</code> if
<code>page.html</code> is missing or is empty (has a file size
of zero). The trick here is that <code>page.cgi</code> is a
CGI script which, in addition to writing its output to
<code>STDOUT</code>, also writes it to the file <code>page.html</code>.
Once the file exists, the server serves the static
<code>page.html</code> directly on subsequent requests. When the
webmaster wants to force a refresh of the contents, they simply remove
<code>page.html</code> (typically from <code>cron</code>).</p>
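
<p>The same idea can be extended from a single page to every
<code>.html</code> file in a directory. The following is only a sketch:
it assumes that each <code>name.html</code> has a matching
<code>name.cgi</code> generator script next to it (a hypothetical
layout, not something the recipe above requires) and that CGI execution
is enabled for the directory:</p>

<example>
# Per-directory context, as above<br />
RewriteCond %{REQUEST_FILENAME} !-s<br />
RewriteRule ^(.+)\.html$ $1.cgi [T=application/x-httpd-cgi,L]
</example>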
</dd>
</dl>

</section>

<section id="load-balancing">

<title>Load Balancing</title>

<dl>
<dt>Description:</dt>

<dd>
<p>We wish to randomly distribute load across several servers
using mod_rewrite.</p>
</dd>

<dt>Solution:</dt>

<dd>
<p>We'll use <directive
module="mod_rewrite">RewriteMap</directive> and a list of servers
to accomplish this.</p>

<example>
RewriteEngine on<br />
RewriteMap lb rnd:/path/to/serverlist.txt<br />
<br />
RewriteRule ^/(.*) http://${lb:servers}/$1 [P,L]
</example>

<p><code>serverlist.txt</code> will contain a list of the servers:</p>

<example>
## serverlist.txt<br />
<br />
servers one.example.com|two.example.com|three.example.com<br />
</example>

<p>If you want one particular server to get more of the load than the
others, add it more times to the list.</p>
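
<p>For instance, a sketch of a weighted list in which
<code>one.example.com</code> appears twice, so it should be picked
roughly half of the time while the other two servers share the rest
(the 2:1:1 weighting is just an illustration):</p>

<example>
## serverlist.txt<br />
<br />
servers one.example.com|one.example.com|two.example.com|three.example.com<br />
</example>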

</dd>

<dt>Discussion:</dt>
<dd>
<p>Apache comes with a load-balancing module,
<module>mod_proxy_balancer</module>, which is far more flexible and
featureful than anything you can cobble together using mod_rewrite.</p>
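
<p>As a rough sketch of the <module>mod_proxy_balancer</module>
equivalent, reusing the three example hosts from above (the balancer
name <code>mycluster</code> and the <code>loadfactor</code> value are
arbitrary choices for illustration, and the proxy modules that the
balancer depends on are assumed to be loaded):</p>

<example>
&lt;Proxy balancer://mycluster&gt;<br />
BalancerMember http://one.example.com loadfactor=2<br />
BalancerMember http://two.example.com<br />
BalancerMember http://three.example.com<br />
&lt;/Proxy&gt;<br />
<br />
ProxyPass / balancer://mycluster/
</example>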
</dd>
</dl>

</section>

<section id="autorefresh">

<title>Document With Autorefresh</title>

<dl>
<dt>Description:</dt>

<dd>
<p>Wouldn't it be nice, while creating a complex web page, if
the web browser would automatically refresh the page every
time we save a new version from within our editor?
Impossible?</p>
</dd>

<dt>Solution:</dt>

<dd>
<p>No! We just combine the MIME multipart feature, the
web server NPH feature, and the URL manipulation power of
<module>mod_rewrite</module>. First, we establish a new
URL feature: Adding just <code>:refresh</code> to any
URL causes the 'page' to be refreshed every time it is
updated on the filesystem.</p>

<example>
RewriteRule ^(/[uge]/[^/]+/?.*):refresh /internal/cgi/apache/nph-refresh?f=$1
</example>

<p>Now when we reference the URL</p>

<example>
/u/foo/bar/page.html:refresh
</example>

<p>this leads to the internal invocation of the URL</p>

<example>
/internal/cgi/apache/nph-refresh?f=/u/foo/bar/page.html
</example>

<p>The only missing part is the NPH-CGI script. Although
one would usually say "left as an exercise to the reader"
;-) I will provide this, too.</p>

<example><pre>
#!/sw/bin/perl
##
##  nph-refresh -- NPH/CGI script for auto refreshing pages
##  Copyright (c) 1997 Ralf S. Engelschall, All Rights Reserved.
##
$| = 1;

#   split the QUERY_STRING variable into %QS; a hash is used
#   so that client-supplied data is never passed to eval
%QS = ();
@pairs = split(/&amp;/, $ENV{'QUERY_STRING'});
foreach $pair (@pairs) {
    ($name, $value) = split(/=/, $pair, 2);
    $name =~ tr/A-Z/a-z/;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    $QS{$name} = $value;
}
$QS_f = $QS{'f'};    # file to watch
$QS_s = $QS{'s'};    # seconds to sleep between checks
$QS_n = $QS{'n'};    # maximum number of refreshes
$QS_s = 1 if ($QS_s eq '');
$QS_n = 3600 if ($QS_n eq '');
if ($QS_f eq '') {
    print "HTTP/1.0 200 OK\n";
    print "Content-type: text/html\n\n";
    print "&lt;b&gt;ERROR&lt;/b&gt;: No file given\n";
    exit(0);
}
if (! -f $QS_f) {
    print "HTTP/1.0 200 OK\n";
    print "Content-type: text/html\n\n";
    print "&lt;b&gt;ERROR&lt;/b&gt;: File $QS_f not found\n";
    exit(0);
}

sub print_http_headers_multipart_begin {
    print "HTTP/1.0 200 OK\n";
    $bound = "ThisRandomString12345";
    print "Content-type: multipart/x-mixed-replace;boundary=$bound\n";
    print_http_headers_multipart_next();
}

sub print_http_headers_multipart_next {
    print "\n--$bound\n";
}

sub print_http_headers_multipart_end {
    print "\n--$bound--\n";
}

sub displayhtml {
    local($buffer) = @_;
    $len = length($buffer);
    print "Content-type: text/html\n";
    print "Content-length: $len\n\n";
    print $buffer;
}

sub readfile {
    local($file) = @_;
    local(*FP, $size, $buffer, $bytes);
    ($x, $x, $x, $x, $x, $x, $x, $size) = stat($file);
    $size = sprintf("%d", $size);
    open(FP, "&lt;$file");
    $bytes = sysread(FP, $buffer, $size);
    close(FP);
    return $buffer;
}

$buffer = readfile($QS_f);
print_http_headers_multipart_begin();
displayhtml($buffer);

#   return the modification time of a file
sub mystat {
    local($file) = $_[0];
    local($mtime);

    ($x, $x, $x, $x, $x, $x, $x, $x, $x, $mtime) = stat($file);
    return $mtime;
}

#   remember the last seen mtime and resend the page when it changes
$mtimeL = mystat($QS_f);
$mtime = $mtimeL;
for ($n = 0; $n &lt; $QS_n; $n++) {
    while (1) {
        $mtime = mystat($QS_f);
        if ($mtime ne $mtimeL) {
            $mtimeL = $mtime;
            sleep(2);
            $buffer = readfile($QS_f);
            print_http_headers_multipart_next();
            displayhtml($buffer);
            sleep(5);
            $mtimeL = mystat($QS_f);
            last;
        }
        sleep($QS_s);
    }
}

print_http_headers_multipart_end();

exit(0);

##EOF##
</pre></example>
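
<p>The script also reads two optional query parameters that the rule
above never passes: <code>s</code> (seconds between checks of the file,
default 1) and <code>n</code> (maximum number of refreshes, default
3600). One possible way to expose them, shown here only as a sketch, is
to add the <code>[QSA]</code> flag so that a query string on the
original request is appended to the rewritten one:</p>

<example>
RewriteRule ^(/[uge]/[^/]+/?.*):refresh /internal/cgi/apache/nph-refresh?f=$1 [QSA]
</example>

<p>With this variant, a request for
<code>/u/foo/bar/page.html:refresh?s=5</code> would be handed to the
script as
<code>/internal/cgi/apache/nph-refresh?f=/u/foo/bar/page.html&amp;s=5</code>.</p>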
</dd>
</dl>

</section>

<section id="structuredhomedirs">

<title>Structured Userdirs</title>

<dl>
<dt>Description:</dt>

<dd>
<p>Some sites with thousands of users use a
structured homedir layout, <em>i.e.</em> each homedir is in a
subdirectory which begins (for instance) with the first
character of the username. So, <code>/~larry/anypath</code>
is <code>/home/<strong>l</strong>/larry/public_html/anypath</code>
while <code>/~waldo/anypath</code> is
<code>/home/<strong>w</strong>/waldo/public_html/anypath</code>.</p>
</dd>

<dt>Solution:</dt>

<dd>
<p>We use the following ruleset to expand the tilde URLs
into the above layout.</p>

<example>
RewriteEngine on<br />
RewriteRule ^/~(<strong>([a-z])</strong>[a-z0-9]+)(.*) /home/<strong>$2</strong>/$1/public_html$3
</example>
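
<p>To make the backreferences concrete, this is how the pattern would
break down the <code>/~larry/anypath</code> request from the description
(a hand-worked trace, not server output):</p>

<example>
# $1 = larry (the whole username)<br />
# $2 = l (its first character, from the nested group)<br />
# $3 = /anypath (the rest of the URL path)<br />
# Result: /home/l/larry/public_html/anypath
</example>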
</dd>
</dl>

</section>

<section id="redirectanchors">

<title>Redirecting Anchors</title>

<dl>
<dt>Description:</dt>

<dd>
<p>By default, redirecting to an HTML anchor doesn't work,
because mod_rewrite escapes the <code>#</code> character,
turning it into <code>%23</code>. This, in turn, breaks the
redirection.</p>
</dd>

<dt>Solution:</dt>

<dd>
<p>Use the <code>[NE]</code> flag on the
<code>RewriteRule</code>. NE stands for No Escape.
</p>
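
<p>A minimal sketch, assuming a hypothetical page
<code>/old.html</code> whose content has moved to the
<code>details</code> anchor of <code>/new.html</code> (both names are
placeholders):</p>

<example>
RewriteEngine on<br />
RewriteRule ^/old\.html$ /new.html#details [NE,R]
</example>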
</dd>

<dt>Discussion:</dt>
<dd>This technique will of course also work with other
special characters that mod_rewrite, by default, URL-encodes.</dd>
</dl>

</section>

<section id="time-dependent">

<title>Time-Dependent Rewriting</title>

<dl>
<dt>Description:</dt>

<dd>
<p>We wish to use mod_rewrite to serve different content based on
the time of day.</p>
</dd>

<dt>Solution:</dt>

<dd>
<p>There are a lot of variables named <code>TIME_xxx</code>
for rewrite conditions. In conjunction with the special
lexicographic comparison patterns <code>&lt;STRING</code>,
<code>&gt;STRING</code> and <code>=STRING</code> we can
do time-dependent redirects:</p>

<example>
RewriteEngine on<br />
RewriteCond %{TIME_HOUR}%{TIME_MIN} &gt;0700<br />
RewriteCond %{TIME_HOUR}%{TIME_MIN} &lt;1900<br />
RewriteRule ^foo\.html$ foo.day.html [L]<br />
RewriteRule ^foo\.html$ foo.night.html
</example>

<p>This provides the content of <code>foo.day.html</code>
under the URL <code>foo.html</code> from
<code>07:01-18:59</code> and, at all other times, the
contents of <code>foo.night.html</code>.</p>
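
<p>Because the comparisons are strict, the minute <code>07:00</code>
itself falls outside the window. If it should be included, one option
is to lower the first bound by a minute; this variant is a sketch
rather than part of the recipe above:</p>

<example>
RewriteEngine on<br />
RewriteCond %{TIME_HOUR}%{TIME_MIN} &gt;0659<br />
RewriteCond %{TIME_HOUR}%{TIME_MIN} &lt;1900<br />
RewriteRule ^foo\.html$ foo.day.html [L]<br />
RewriteRule ^foo\.html$ foo.night.html
</example>

<p>This would serve <code>foo.day.html</code> from 07:00 through 18:59.</p>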

<note type="warning"><module>mod_cache</module>, intermediate proxies
and browsers may each cache responses and cause either page to be
shown outside of the configured time window.
<module>mod_expires</module> may be used to control this
effect. You are, of course, much better off simply serving the
content dynamically, and customizing it based on the time of day.</note>

</dd>
</dl>

</section>

<section id="setenvvars">

<title>Set Environment Variables Based On URL Parts</title>

<dl>
<dt>Description:</dt>

<dd>
<p>At times, we want to maintain some kind of status when we
perform a rewrite. For example, you want to make a note that
you've done that rewrite, so that you can check later to see if a
request came via that rewrite. One way to do this is by setting an
environment variable.</p>
</dd>

<dt>Solution:</dt>

<dd>
<p>Use the <code>[E]</code> flag to set an environment variable.</p>

<example>
RewriteEngine on<br />
RewriteRule ^/horse/(.*) /pony/$1 [E=<strong>rewritten:1</strong>]
</example>

<p>Later in your ruleset you might check for this environment
variable using a RewriteCond:</p>

<example>
RewriteCond %{ENV:rewritten} =1
</example>
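
<p>Put together, a complete ruleset might look like the following
sketch. The second rule and the <code>/pony/stable/</code> path are
invented for illustration, and the example assumes server or virtual
host context, where both rules run in the same pass:</p>

<example>
RewriteEngine on<br />
RewriteRule ^/horse/(.*) /pony/$1 [E=rewritten:1]<br />
<br />
# Hypothetical: refuse /pony/stable/ to requests that arrived via /horse/<br />
RewriteCond %{ENV:rewritten} =1<br />
RewriteRule ^/pony/stable/ - [F]
</example>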

</dd>
</dl>

</section>

</manualpage>