mirror of
https://github.com/apache/httpd.git
synced 2025-08-10 02:56:11 +00:00

* Quoted arguments to Rewrite{Base,Cond,Map,Rule}. git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@1673945 13f79535-47bb-0310-9956-ffa450edef68
191 lines
8.7 KiB
XML
191 lines
8.7 KiB
XML
<?xml version='1.0' encoding='UTF-8' ?>
|
|
<!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd">
|
|
<?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
|
|
<!-- $LastChangedRevision$ -->
|
|
|
|
<!--
|
|
Licensed to the Apache Software Foundation (ASF) under one or more
|
|
contributor license agreements. See the NOTICE file distributed with
|
|
this work for additional information regarding copyright ownership.
|
|
The ASF licenses this file to You under the Apache License, Version 2.0
|
|
(the "License"); you may not use this file except in compliance with
|
|
the License. You may obtain a copy of the License at
|
|
|
|
http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
Unless required by applicable law or agreed to in writing, software
|
|
distributed under the License is distributed on an "AS IS" BASIS,
|
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
See the License for the specific language governing permissions and
|
|
limitations under the License.
|
|
-->
|
|
|
|
<manualpage metafile="tech.xml.meta">
|
|
<parentdocument href="./">Rewrite</parentdocument>
|
|
|
|
<title>Apache mod_rewrite Technical Details</title>
|
|
|
|
<summary>
|
|
<p>This document discusses some of the technical details of mod_rewrite
|
|
and URL matching.</p>
|
|
</summary>
|
|
<seealso><a href="../mod/mod_rewrite.html">Module documentation</a></seealso>
|
|
<seealso><a href="intro.html">mod_rewrite introduction</a></seealso>
|
|
<seealso><a href="remapping.html">Redirection and remapping</a></seealso>
|
|
<seealso><a href="access.html">Controlling access</a></seealso>
|
|
<seealso><a href="vhosts.html">Virtual hosts</a></seealso>
|
|
<seealso><a href="proxy.html">Proxying</a></seealso>
|
|
<seealso><a href="rewritemap.html">Using RewriteMap</a></seealso>
|
|
<seealso><a href="advanced.html">Advanced techniques</a></seealso>
|
|
<seealso><a href="avoid.html">When not to use mod_rewrite</a></seealso>
|
|
|
|
<section id="InternalAPI"><title>API Phases</title>
|
|
|
|
<p>The Apache HTTP Server handles requests in several phases. At
|
|
each of these phases, one or more modules may be called upon to
|
|
handle that portion of the request lifecycle. Phases include things
|
|
like URL-to-filename translation, authentication, authorization,
|
|
content, and logging. (This is not an exhaustive list.)</p>
|
|
|
|
<p>mod_rewrite acts in two of these phases (or "hooks", as they are
|
|
often called) to influence how URLs may be rewritten.</p>
|
|
|
|
<p>First, it uses the URL-to-filename translation hook, which occurs
|
|
after the HTTP request has been read, but before any authorization
|
|
starts. Secondly, it uses the Fixup hook, which is after the
|
|
authorization phases, and after per-directory configuration files
|
|
(<code>.htaccess</code> files) have been read, but before the
|
|
content handler is called.</p>
|
|
|
|
<p>After a request comes in and a corresponding server or
|
|
virtual host has been determined, the rewriting engine starts
|
|
processing any <code>mod_rewrite</code> directives appearing in the
|
|
per-server configuration. (i.e., in the main server configuration file
|
|
and <directive module="core" type="section">Virtualhost</directive>
|
|
sections.) This happens in the URL-to-filename phase.</p>
|
|
|
|
<p>A few steps later, once the final data directories have been found,
|
|
the per-directory configuration directives (<code>.htaccess</code>
|
|
files and <directive module="core"
|
|
type="section">Directory</directive> blocks) are applied. This
|
|
happens in the Fixup phase.</p>
|
|
|
|
<p>In each of these cases, mod_rewrite rewrites the
|
|
<code>REQUEST_URI</code> either to a new URL, or to a filename.</p>
|
|
|
|
<p>In per-directory context (i.e., within <code>.htaccess</code> files
|
|
and <code>Directory</code> blocks), these rules are being applied
|
|
after a URL has already been translated to a filename. Because of
|
|
this, the URL-path that mod_rewrite initially compares <directive
|
|
module="mod_rewrite">RewriteRule</directive> directives against
|
|
is the full filesystem path to the translated filename with the current
|
|
directories path (including a trailing slash) removed from the front.</p>
|
|
|
|
<p> To illustrate: If rules are in /var/www/foo/.htaccess and a request
|
|
for /foo/bar/baz is being processed, an expression like ^bar/baz$ would
|
|
match.</p>
|
|
|
|
<p> If a substitution is made in per-directory context, a new internal
|
|
subrequest is issued with the new URL, which restarts processing of the
|
|
request phases. If the substitution is a relative path, the <directive
|
|
module="mod_rewrite">RewriteBase</directive> directive
|
|
determines the URL-path prefix prepended to the substitution.
|
|
In per-directory context, care must be taken to
|
|
create rules which will eventually (in some future "round" of per-directory
|
|
rewrite processing) not perform a substitution to avoid looping.
|
|
(See <a href="http://wiki.apache.org/httpd/RewriteLooping">RewriteLooping</a>
|
|
for further discussion of this problem.)</p>
|
|
|
|
<p>Because of this further manipulation of the URL in per-directory
|
|
context, you'll need to take care to craft your rewrite rules
|
|
differently in that context. In particular, remember that the
|
|
leading directory path will be stripped off of the URL that your
|
|
rewrite rules will see. Consider the examples below for further
|
|
clarification.</p>
|
|
|
|
<table border="1">
|
|
|
|
<tr>
|
|
<th>Location of rule</th>
|
|
<th>Rule</th>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>VirtualHost section</td>
|
|
<td>RewriteRule "^/images/(.+)\.jpg" "/images/$1.gif"</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>.htaccess file in document root</td>
|
|
<td>RewriteRule "^images/(.+)\.jpg" "images/$1.gif"</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>.htaccess file in images directory</td>
|
|
<td>RewriteRule "^(.+)\.jpg" "$1.gif"</td>
|
|
</tr>
|
|
|
|
</table>
|
|
|
|
<p>For even more insight into how mod_rewrite manipulates URLs in
|
|
different contexts, you should consult the <a
|
|
href="../mod/mod_rewrite.html#logging">log entries</a> made during
|
|
rewriting.</p>
|
|
|
|
</section>
|
|
|
|
<section id="InternalRuleset"><title>Ruleset Processing</title>
|
|
|
|
<p>Now when mod_rewrite is triggered in these two API phases, it
|
|
reads the configured rulesets from its configuration
|
|
structure (which itself was either created on startup for
|
|
per-server context or during the directory walk of the Apache
|
|
kernel for per-directory context). Then the URL rewriting
|
|
engine is started with the contained ruleset (one or more
|
|
rules together with their conditions). The operation of the
|
|
URL rewriting engine itself is exactly the same for both
|
|
configuration contexts. Only the final result processing is
|
|
different.</p>
|
|
|
|
<p>The order of rules in the ruleset is important because the
|
|
rewriting engine processes them in a special (and not very
|
|
obvious) order. The rule is this: The rewriting engine loops
|
|
through the ruleset rule by rule (<directive
|
|
module="mod_rewrite">RewriteRule</directive> directives) and
|
|
when a particular rule matches it optionally loops through
|
|
existing corresponding conditions (<code>RewriteCond</code>
|
|
directives). For historical reasons the conditions are given
|
|
first, and so the control flow is a little bit long-winded. See
|
|
Figure 1 for more details.</p>
|
|
<p class="figure">
|
|
<img src="../images/rewrite_process_uri.png"
|
|
alt="Flow of RewriteRule and RewriteCond matching" /><br />
|
|
<dfn>Figure 1:</dfn>The control flow through the rewriting ruleset
|
|
</p>
|
|
<p>First the URL is matched against the
|
|
<em>Pattern</em> of each rule. If it fails, mod_rewrite
|
|
immediately stops processing this rule, and continues with the
|
|
next rule. If the <em>Pattern</em> matches, mod_rewrite looks
|
|
for corresponding rule conditions (RewriteCond directives,
|
|
appearing immediately above the RewriteRule in the configuration).
|
|
If none are present, it substitutes the URL with a new value, which is
|
|
constructed from the string <em>Substitution</em>, and goes on
|
|
with its rule-looping. But if conditions exist, it starts an
|
|
inner loop for processing them in the order that they are
|
|
listed. For conditions, the logic is different: we don't match
|
|
a pattern against the current URL. Instead we first create a
|
|
string <em>TestString</em> by expanding variables,
|
|
back-references, map lookups, <em>etc.</em> and then we try
|
|
to match <em>CondPattern</em> against it. If the pattern
|
|
doesn't match, the complete set of conditions and the
|
|
corresponding rule fails. If the pattern matches, then the
|
|
next condition is processed until no more conditions are
|
|
available. If all conditions match, processing is continued
|
|
with the substitution of the URL with
|
|
<em>Substitution</em>.</p>
|
|
|
|
</section>
|
|
|
|
|
|
</manualpage>
|