Less magic, more science
Rich Bowen
SLIDES ARE AT: http://boxofclue.com/presentations/
Despite the tons of examples and docs, mod_rewrite is voodoo. Damned cool voodoo, but still voodoo. -- Brian Moore
And numerous websites offer mod_rewrite advice that is much more akin to
magic than science.
RewriteCond %{REMOTE_ADDR} ^205\.209\.177\.
RewriteRule ^(.*)$ - [F]
??? Which is why you see people doing crap like ...
I'd like to show you that mod_rewrite is a clear, concise,
algebraic notation, not a magical incantation.
http://drbacchus.com/books/rewrite
- Atomic description of text patterns
- Start with a small vocabulary and work up
- Essential building block of mod_rewrite syntax
- PCRE
??? Mastering Regular Expressions, Jeffrey Friedl
- . is the wildcard character
- Matches one "atom"
- In mod_rewrite syntax, it matches a 'character'
- Use \. if you want to match a literal "."
- * and + are repetition characters
- Turns an atom into a molecule
- a* matches zero or more 'a' characters
- a
- aaaaa
- Also matches "Fish", which contains zero 'a' characters
- ? makes the preceding atom optional
- That is, it matches zero or one occurrence
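A quick way to try the wildcard, repetition, and optional characters is a scratch script. This sketch uses Python's `re` module, which is an assumption made purely for illustration: mod_rewrite uses PCRE, but these basic constructs behave the same way in both.

```python
import re

# "." matches any one character; "\." matches only a literal dot.
wildcard_hits = [s for s in ("abc", "a.c", "ac") if re.search(r"a.c", s)]
literal_hits = [s for s in ("abc", "a.c", "ac") if re.search(r"a\.c", s)]

# "a*" means zero or more 'a' characters, so even "Fish" matches
# (with an empty match at the very start of the string).
star_hits = [s for s in ("a", "aaaaa", "Fish") if re.search(r"a*", s)]

# "?" makes the preceding atom optional: zero or one occurrence.
optional_hits = [s for s in ("color", "colour", "colouur")
                 if re.fullmatch(r"colou?r", s)]

print(wildcard_hits)   # ['abc', 'a.c']
print(literal_hits)    # ['a.c']
print(star_hits)       # ['a', 'aaaaa', 'Fish']
print(optional_hits)   # ['color', 'colour']
```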
- Anchors
- ^ means "starts with"
- $ means "ends with"
- ^$ is a special case - matches the empty string
- Starts with, ends with, nothing in between
- ^ (all by itself) matches every string (including empty string)
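The anchor behavior can be checked the same way. This is a minimal sketch in Python's `re` module (an illustration only, not mod_rewrite itself); the sample paths are made up.

```python
import re

samples = ["/images/cat.jpg", "/pics/images/", ""]

# ^ anchors the match to the start of the string.
starts = [s for s in samples if re.search(r"^/images", s)]
# $ anchors the match to the end of the string.
ends = [s for s in samples if re.search(r"/$", s)]
# ^$ leaves no room for anything between start and end.
empty = [s for s in samples if re.search(r"^$", s)]
# ^ alone matches everything: every string has a start.
everything = [s for s in samples if re.search(r"^", s)]

print(starts)      # ['/images/cat.jpg']
print(ends)        # ['/pics/images/']
print(empty)       # ['']
print(everything)  # ['/images/cat.jpg', '/pics/images/', '']
```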
- ( ) turns several atoms into a molecule (grouping)
- Can apply repetition characters to this molecule
(ab)+
matches "abababab"
- Also "captures"
- The text matched by the first set of parentheses becomes $1
- The next set becomes $2, and so on
- Examples in just a moment
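As a rough preview of grouping and capturing, here is a sketch in Python's `re` module. Python calls the captures `group(1)` and `group(2)` where mod_rewrite writes $1 and $2; the sample path is hypothetical.

```python
import re

# Repetition applies to the whole group: (ab)+ is "ab" one or more times.
repeated = re.fullmatch(r"(ab)+", "abababab") is not None

# Each set of parentheses captures what it matched, numbered left to right.
match = re.search(r"^/images/(.*)\.(jpg|png)$", "/images/cats/tabby.jpg")
first = match.group(1)    # what mod_rewrite would call $1
second = match.group(2)   # what mod_rewrite would call $2

print(repeated, first, second)  # True cats/tabby jpg
```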
- [ ] defines a character class
- Matches any one of the characters listed inside the brackets
- Any regex can be negated in a RewriteRule or RewriteCond by putting a ! in front of it
- A character class is negated with a ^
[^abc]
Matches any single character EXCEPT a, b, or c
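Character classes and the two kinds of negation can be sketched like this. Python's `re` is used only for illustration; the leading `!` is mod_rewrite syntax rather than regex syntax, so the hypothetical helper `negated_match` stands in for it.

```python
import re

# [aou] matches exactly one of a, o, or u.
class_hits = [w for w in ("cat", "cot", "cut", "cyt")
              if re.fullmatch(r"c[aou]t", w)]

# [^abc] matches any single character that is NOT a, b, or c.
not_abc = re.findall(r"[^abc]", "aXbYcZ")

def negated_match(pattern, text):
    """Mimic a leading "!" in RewriteRule/RewriteCond: true when the regex does NOT match."""
    return re.search(pattern, text) is None

print(class_hits)                              # ['cat', 'cot', 'cut']
print(not_abc)                                 # ['X', 'Y', 'Z']
print(negated_match(r"\.php$", "/index.html"))  # True
```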
- mod_rewrite uses regular expressions to match requests, and modify them in some way
- (And a lot of other things)
RewriteRule PATTERN TARGET [flags]
- PATTERN: If it matches this
- TARGET: Do this instead
- [flags]: With some optional tweaks
RewriteRule PATTERN TARGET
- PATTERN is a regular expression (usually)
- Applied to the REQUEST_URI
- That's everything after http://hostname (not including the query string)
- May be modified by context (eg, .htaccess files) or by earlier rewrite rules
RewriteRule PATTERN TARGET
- TARGET is where you want it to go instead
- File path, or URI, or something else, depending on context and flags
RewriteRule ^/images/(.*)\.jpg \
/pics/$1.gif [R=301]
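The PATTERN and TARGET halves of a rule can be modeled in a few lines. This is a simplified sketch in Python, not how mod_rewrite is implemented: the function name `apply_rule` is hypothetical, and flags, query strings, and redirects are ignored.

```python
import re

def apply_rule(pattern, target, path):
    """Rewrite `path` if `pattern` matches it; otherwise return it unchanged."""
    match = re.search(pattern, path)
    if match is None:
        return path

    def backreference(ref):
        # $N in the target becomes the text captured by group N
        # (empty if that group does not exist or did not take part).
        number = int(ref.group(1))
        if number > match.re.groups:
            return ""
        return match.group(number) or ""

    # The target replaces the whole path, not just the matched part.
    return re.sub(r"\$(\d)", backreference, target)

print(apply_rule(r"^/images/(.*)\.jpg", "/pics/$1.gif", "/images/cat.jpg"))
# /pics/cat.gif
print(apply_rule(r"^/images/(.*)\.jpg", "/pics/$1.gif", "/docs/readme.txt"))
# /docs/readme.txt
```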
- Flags modify the behavior of a RewriteRule
- By default, the target is treated as a file path relative to the current location
- [R]: Forces an external redirect, optionally with the specified HTTP status code.
- Issues a redirect header to the client - URL in browser changes
RewriteRule ^/images/(.*)\.jpg /pics/$1.gif [R=302]
RewriteRule products http://products.example.org/ [R=301]
??? 302=temp, 301=permanent
- [PT]: Forces the resulting URI to be passed back to the URL mapping engine for processing of other URI-to-filename translators, such as Alias or Redirect.
- Treat the target as a URI, processing it for URI-type things
RewriteRule ^/products/(.+?)/ /prod.php?$1 [PT,L]
- [B]: Escape non-alphanumeric characters in backreferences before applying the transformation.
- Preserves special characters in the URI through the rewriting process
- [C]: Rule is chained to the following rule. If the rule fails, the rule(s) chained to it will be skipped.
- Use this when you need to do several transformations in a row as part of a single logical operation.
- [CO]: Sets a cookie in the client browser.
RewriteRule ^/index\.html - [CO=frontdoor:1:example.com]
- Full syntax is: CO=NAME:VAL:domain[:lifetime[:path[:secure[:httponly]]]]
- A "-" as the target means "don't rewrite"
- [DPI]: Causes the PATH_INFO portion of the rewritten URI to be discarded.
- [E=VAR:VAL]: Causes an environment variable VAR to be set (to the value VAL if provided).
- The form !VAR causes the environment variable VAR to be unset.
RewriteRule \.(png|gif|jpg)$ - [E=image:1]
CustomLog logs/access_log combined env=!image
- Example: Don't log images
- [F]: Returns a 403 FORBIDDEN response to the client browser.
RewriteRule \.exe - [F]
- [G]: Returns a 410 GONE response to the client browser.
- I've never actually used this flag.
- [H]: Causes the resulting URI to be sent to the specified Content-handler for processing.
RewriteRule ^(/source/.+\.php)s$ $1 [H=application/x-httpd-php-source]
- Example causes .phps requests to be processed by PHP's syntax-highlighter
- [L]: Stop the rewriting process immediately and don't apply any more rules.
- Probably doesn't do what you expect in per-directory and .htaccess context, where the rewritten URI is run through the rules again
- (see also the END flag).
RewriteBase /
RewriteCond %{REQUEST_URI} !=/index.php
RewriteRule ^(.*) /index.php?req=$1 [L,PT]
- [END]: Stop the rewriting process immediately and don't apply any more rules.
- Also prevents further execution of rewrite rules in per-directory and .htaccess context.
- (Available in 2.3.9 and later)
- Note that a rule issuing an external redirect to its own URL still causes the rules to be re-run, because the redirect produces a new request
- [N]: Re-run the rewriting process, starting again with the first rule, using the result of the ruleset so far as a starting point.
RewriteRule (.*)A(.*) $1B$2 [N]
- Example - global search and replace of A with B, looping until there are no more As
- Use [N=100] to limit to 100 iterations (2.4.8 and later)
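The looping search-and-replace can be traced by hand. This Python sketch only imitates the effect of re-running a single rule; the function name `replace_with_n_flag` and the way the iteration cap is counted are assumptions for illustration, not the server's actual bookkeeping.

```python
import re

def replace_with_n_flag(uri, max_iterations=100):
    """Imitate: RewriteRule (.*)A(.*) $1B$2 [N=max_iterations]"""
    pattern = re.compile(r"(.*)A(.*)")
    iterations = 0
    while iterations < max_iterations:
        match = pattern.search(uri)
        if match is None:
            break  # rule no longer matches, so the loop ends
        # (.*) is greedy, so each pass replaces the LAST remaining "A".
        uri = match.group(1) + "B" + match.group(2)
        iterations += 1
    return uri, iterations

print(replace_with_n_flag("/AxAyA"))                    # ('/BxByB', 3)
print(replace_with_n_flag("/AxAyA", max_iterations=2))  # ('/AxByB', 2)
```

Note that one pass is needed per "A", which is why an iteration cap matters on long or hostile URLs.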
- [NC]: Makes the pattern comparison case-insensitive.
RewriteRule (.*\.(jpg|gif|png))$ http://images.example.com$1 [P,NC]
- [NE]: Prevent mod_rewrite from applying hexcode escaping of special characters in the result of the rewrite.
- Not to be confused with [B]
- [NS]: Causes a rule to be skipped if the current request is an internal sub-request.
- [P]: Force the substitution URL to be internally sent as a proxy request.
RewriteRule (.*\.(jpg|gif|png))$ http://images.example.com$1 [P,NC]
- Proxy to back-end image server
- [QSA]: Appends any query string from the original request URL to any query string created in the rewrite target.
- That is, it preserves the user-submitted query string, in addition to the one you created
RewriteRule /pages/(.+) /page.php?page=$1 [QSA]
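What happens to the query string with and without [QSA] can be summarized in a small function. This is a simplified Python sketch; `rewrite_query` and its arguments are hypothetical names, and other flags are ignored.

```python
def rewrite_query(original_query, new_query, qsa=False):
    """Return the query string left after a rewrite.

    original_query: what the client sent (without the "?")
    new_query:      what the rewrite target created (without the "?")
    """
    if not new_query:
        # The target has no query string of its own: the original passes through.
        return original_query
    if qsa and original_query:
        # [QSA]: keep the new one and append the original.
        return new_query + "&" + original_query
    # Default: the target's query string replaces the original.
    return new_query

# Request: /pages/about?sort=asc with target /page.php?page=about
print(rewrite_query("sort=asc", "page=about"))            # page=about
print(rewrite_query("sort=asc", "page=about", qsa=True))  # page=about&sort=asc
```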
- [QSD]: Discard any query string attached to the incoming URI.
- [S=num]: Tells the rewriting engine to skip the next num rules if the current rule matches.
- Like a GoTo statement for rewrite rules
- Consider using <If> and <Else> instead
# Is the request for a non-existent file?
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# If so, skip these two RewriteRules
RewriteRule .? - [S=2]
RewriteRule (.*\.gif) images.php?$1
RewriteRule (.*\.html) docs.php?$1
- [T]: Force the MIME-type of the target file to be the specified type.
# Serve .pl files as plain text
RewriteRule \.pl$ - [T=text/plain]
- .htaccess files are for local (per-directory) configuration
- mod_rewrite assumes you only care about the current directory
- Leading directory path is stripped off of everything
Syntax in .htaccess files:
# In httpd.conf
RewriteRule ^/images/(.+)\.jpg /images/$1.png
# In .htaccess in root dir
RewriteBase /
RewriteRule ^images/(.+)\.jpg images/$1.png
# In .htaccess in images/
RewriteBase /images/
RewriteRule ^(.+)\.jpg $1.png
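The three variants above differ only in how much of the path the pattern gets to see. A rough Python sketch of that stripping follows; `path_seen_by_rule` is a hypothetical helper, and it assumes the URL path maps directly onto the directory that holds the .htaccess file.

```python
def path_seen_by_rule(request_path, htaccess_dir=None):
    """Return the string a RewriteRule pattern is compared against.

    htaccess_dir is the URL path of the directory holding the .htaccess
    file, or None for rules in httpd.conf (server or virtual host context).
    """
    if htaccess_dir is None:
        return request_path  # full path, leading slash included
    prefix = htaccess_dir.rstrip("/") + "/"
    if not request_path.startswith(prefix):
        raise ValueError("request is outside this directory")
    # The leading directory path is stripped off before matching.
    return request_path[len(prefix):]

print(path_seen_by_rule("/images/cat.jpg"))              # /images/cat.jpg
print(path_seen_by_rule("/images/cat.jpg", "/"))         # images/cat.jpg
print(path_seen_by_rule("/images/cat.jpg", "/images/"))  # cat.jpg
```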
- RewriteCond adds a condition to the RewriteRule that follows it
- Can consult any variable, not just REQUEST_URI
- Can evaluate arbitrary expressions
Redirect based on client address
RewriteCond %{REMOTE_ADDR} ^10\.2\.
RewriteRule (.*) http://intranet.example.com$1
RewriteCond %{HTTP_HOST} (.*)
RewriteRule ^/(.*) /sites/%1/$1
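In the HTTP_HOST example, %1 comes from the RewriteCond and $1 from the RewriteRule. A minimal Python sketch of how the two captures combine, with `vhost_path` as a hypothetical name and the host and path as made-up inputs:

```python
import re

def vhost_path(host, path):
    """Imitate:
    RewriteCond %{HTTP_HOST} (.*)
    RewriteRule ^/(.*) /sites/%1/$1
    """
    # mod_rewrite checks the rule's pattern first, then its conditions.
    rule = re.search(r"^/(.*)", path)
    if rule is None:
        return path
    cond = re.search(r"(.*)", host)
    if cond is None:
        return path
    # %1 = capture from the RewriteCond, $1 = capture from the RewriteRule.
    return "/sites/" + cond.group(1) + "/" + rule.group(1)

print(vhost_path("www.example.com", "/index.html"))
# /sites/www.example.com/index.html
```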
- -f - Is it a file?
- -d - Is it a directory?
RewriteCond /var/www%{REQUEST_URI} !-f
RewriteCond /var/www%{REQUEST_URI} !-d
RewriteRule ^ /index.php [PT,L]
index.php can examine $_SERVER['REQUEST_URI'] for the original request
- -s - is a file with non-zero size
- -U - resolves to a valid URL - This is SLOW
- -x - is an executable file
- %{LA-U:variable} - Look-ahead for a variable that hasn't been set yet
- For example, use this for the authenticated user, which is set after the rewrite phase
RewriteCond %{LA-U:REMOTE_USER} (.+)
RewriteRule (.*) http://people.example.org/%1/$1 [R,L]
- expr - Evaluate arbitrary logical expressions
RewriteCond expr "! %{HTTP_REFERER} -strmatch '*://%{HTTP_HOST}/*'"
RewriteRule ^/images - [F]
- RewriteMap defines an external lookup table or function that can be used in a RewriteRule
- 1-1 mapping
- DB lookup
- Some kind of programmatic thingy
RewriteMap MapName MapType:MapSource
eg.
RewriteMap examplemap txt:/path/to/file/map.txt
RewriteRule ^/ex/(.*) ${examplemap:$1}
RewriteMap example
RewriteMap product2id \
txt:/etc/apache2/productmap.txt
RewriteRule ^/product/(.*) \
/prods.php?id=${product2id:$1|NOTFOUND} [PT]
Map file
##
## productmap.txt - Product to ID map file
##
television 993
stereo 198
fishingrod 043
basketball 418
telephone 328
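The lookup `${product2id:$1|NOTFOUND}` amounts to a dictionary lookup with a default. Here is a minimal Python sketch of that idea; the parser is a simplification that skips blank and comment lines, and `load_txt_map` and `lookup` are hypothetical names.

```python
MAP_TEXT = """\
##
## productmap.txt - Product to ID map file
##
television 993
stereo 198
fishingrod 043
basketball 418
telephone 328
"""

def load_txt_map(text):
    """Parse "key value" lines, skipping blank lines and # comments."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].startswith("#"):
            continue
        mapping[parts[0]] = parts[1]
    return mapping

def lookup(mapping, key, default=""):
    """Imitate ${mapname:key|default}."""
    return mapping.get(key, default)

product2id = load_txt_map(MAP_TEXT)
print("/prods.php?id=" + lookup(product2id, "stereo", "NOTFOUND"))
# /prods.php?id=198
print("/prods.php?id=" + lookup(product2id, "toaster", "NOTFOUND"))
# /prods.php?id=NOTFOUND
```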
- txt: Plain text maps
- rnd: Randomized Plain Text
- dbm: DBM Hash File
httxt2dbm -i rewritemap.txt -o rewritemap.dbm
Map types
- int: Internal Function
- prg: External Rewriting Program
- dbd or fastdbd: SQL Query
Examples are in the bonus slides at the end, if we have extra time
Logging - 2.2 and earlier
RewriteLog /var/log/httpd/rewrite.log
RewriteLogLevel 9
Then ...
tail -f /var/log/httpd/rewrite.log
Logging - 2.4 and later
ErrorLog /var/log/httpd/error.log
LogLevel warn rewrite:trace6
Then ...
tail -f /var/log/httpd/error.log | grep rewrite
Twitter: @rbowen
Slides: http://boxofclue.com/presentations and https://github.com/rbowen/presentations
- RewriteMap examples
- Rewrite recipes
- If, ElseIf, Else syntax
##
## map.txt -- rewriting map
##
static www1|www2|www3|www4
dynamic www5|www6
RewriteMap servers rnd:/path/to/file/map.txt
RewriteRule ^/(.*\.(png|gif|jpg)) http://${servers:static}/$1 [NC,P,L]
RewriteRule ^/(.*) http://${servers:dynamic}/$1 [P,L]
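An rnd: map looks up the key and then picks one of the pipe-separated values at random. A small Python sketch of that selection, where `rnd_lookup` is a hypothetical name and the dictionary mirrors the map file above:

```python
import random

SERVERS = {
    "static": "www1|www2|www3|www4",
    "dynamic": "www5|www6",
}

def rnd_lookup(mapping, key, rng=random):
    """Imitate ${servers:key} for an rnd: map: pick one alternative at random."""
    return rng.choice(mapping[key].split("|"))

# Each request for an image is proxied to one of www1..www4.
backend = rnd_lookup(SERVERS, "static")
print("http://" + backend + "/logo.png")
```

The output varies from run to run, which is the point: over many requests the load spreads across the listed back ends.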
RewriteMap lc int:tolower
RewriteRule (.*?[A-Z]+.*) ${lc:$1} [R]
RewriteMap d2u prg:/www/bin/dash2under.pl
RewriteRule - ${d2u:%{REQUEST_URI}}
#!/usr/bin/perl
$| = 1; # Turn off I/O buffering
while (<STDIN>) {
s/-/_/g; # Replace dashes with underscores
print $_;
}
RewriteMap myquery \
"fastdbd:SELECT destination FROM rewrite WHERE source = %s"
Look somewhere else ...
RewriteEngine on
# first try to find it in dir1/...
# ...and if found stop and be happy:
RewriteCond %{DOCUMENT_ROOT}/dir1/%{REQUEST_URI} -f
RewriteRule ^(.+) %{DOCUMENT_ROOT}/dir1/$1 [L]
# second try to find it in dir2/...
# ...and if found stop and be happy:
RewriteCond %{DOCUMENT_ROOT}/dir2/%{REQUEST_URI} -f
RewriteRule ^(.+) %{DOCUMENT_ROOT}/dir2/$1 [L]
# else go on for other Alias or ScriptAlias directives,
# etc.
RewriteRule ^ - [PT]
Front controller
<Directory /var/www/my_blog>
RewriteBase /my_blog
# In <Directory> context, REQUEST_FILENAME is already the full filesystem path
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php [PT]
</Directory>
Or ...
<Directory /var/www/my_blog>
FallbackResource index.php
</Directory>
Prevent hotlinking
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !example.com [NC]
RewriteRule \.(gif|jpg|png)$ - [F]
or ...
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !example.com [NC]
RewriteRule \.(gif|jpg|png)$ /images/goaway.gif [R,L]
If/Else syntax
<If "%{HTTP_HOST} != 'www.wooga.com'">
RedirectMatch (.*) http://www.wooga.com$1
</If>
Images should be from local pages: (Prevent image "hotlinking")
<FilesMatch \.(jpg|png|gif)$>
<If "%{HTTP_REFERER} !~ /example\.com/">
Require all denied
</If>
</FilesMatch>
- mod_macro
- mod_substitute
- My 'Apache Cookbook' talk in an hour.
@rbowen