Wednesday, 29 November 2017

apache - Reference: mod_rewrite, URL rewriting and "pretty links" explained


"Pretty links" is an often requested
topic, but it is rarely fully explained. mod_rewrite
(https://en.wikipedia.org/wiki/URL_redirection#Apache_HTTP_Server_mod_rewrite) is one way
to make "pretty links", but it's complex and its syntax is very terse, hard to grok, and
the documentation assumes a certain level of proficiency in HTTP. Can someone explain in
simple terms how "pretty links" work and how mod_rewrite can be used to create them?



Other common names, aliases, terms for clean URLs: RESTful URLs
(http://en.wikipedia.org/wiki/Representational_state_transfer#RESTful_web_services),
user-friendly URLs, SEO-friendly URLs (http://en.wikipedia.org/wiki/Search_engine_optimization),
slugging (https://en.wikipedia.org/wiki/Clean_URL#Slug), and MVC URLs (probably a
misnomer)



Answer




To understand what mod_rewrite does you first need to understand how a web server works.
A web server responds to HTTP requests (http://en.wikipedia.org/wiki/Http). An HTTP
request at its most basic level looks like this:



GET /foo/bar.html HTTP/1.1



This
is the simple request of a browser to a web server requesting the
URL /foo/bar.html from it. It is important
to stress that it does not request a file, it requests just some
arbitrary URL. The request may also look like
this:



GET /foo/bar?baz=42 HTTP/1.1


This is just as valid a request for a URL, and it even more obviously has nothing to do
with files.



The web server is an application listening on a port, accepting HTTP requests coming in
on that port and returning a response. A web server is entirely free to respond to any
request in any way it sees fit, or rather in any way you have configured it to respond.
This response is not a file, it's an HTTP response which may or may not have anything to
do with physical files on any disk. A web server doesn't have to be Apache; there are
many other web servers, all of which are just programs that run persistently, are
attached to a port, and respond to HTTP requests. You can write one yourself. This
paragraph was intended to divorce you from any notion that URLs directly equal files,
which is really important to understand. :)




The default configuration of most web servers is to look for a file on the hard disk that
matches the URL. If the document root of the server is set to, say, /var/www, it may
check whether the file /var/www/foo/bar.html exists and serve it if so. If the file ends
in ".php" it will invoke the PHP interpreter and then return the result. All this
association is completely configurable; a file doesn't have to end in ".php" for the web
server to run it through the PHP interpreter, and the URL doesn't have to match any
particular file on disk for something to happen.
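
As a rough sketch of how configurable that association is, the fragment below assumes PHP is installed as an Apache module (mod_php) that registers the application/x-httpd-php handler. The .phtml extension is an arbitrary choice for illustration, not something the setup described above requires.

```apache
# Serve files from /var/www, as in the example above
DocumentRoot "/var/www"

# Run any file ending in .php or .phtml through the PHP interpreter.
# The handler name assumes mod_php; other PHP setups (e.g. PHP-FPM) are wired up differently.
<FilesMatch "\.(php|phtml)$">
    SetHandler application/x-httpd-php
</FilesMatch>
```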



mod_rewrite is a way to
rewrite the internal request handling. When the web server receives
a request for the URL /foo/bar, you can
rewrite that URL into something else before the web server will
look for a file on disk to match it. Simple
example:



RewriteEngine On
RewriteRule /foo/bar /foo/baz


This rule
says whenever a request matches "/foo/bar", rewrite it to
"/foo/baz".
The request will then be handled as if
/foo/baz had been requested instead. This can be used for
various effects, for
example:




RewriteRule (.*) $1.html


This rule matches anything (.*) and captures it (the parentheses around it), then
rewrites it to append ".html". In other words, if /foo/bar was the requested URL, it will
be handled as if /foo/bar.html had been requested. See http://regular-expressions.info
for more information about regular expression matching, capturing and replacements.
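
One caveat, sketched under the assumption that the rule lives in a .htaccess file in the document root: there, a rewritten URL is run through the rules again, so a bare (.*) also matches /foo/bar.html and keeps appending ".html". A common way to guard against that is to rewrite only when a matching .html file actually exists:

```apache
RewriteEngine On

# Skip requests that already point at an existing file (images, CSS, foo/bar.html itself)
RewriteCond %{REQUEST_FILENAME} !-f
# Only rewrite if the requested path plus ".html" exists on disk
RewriteCond %{REQUEST_FILENAME}.html -f
# $1 is whatever the parentheses captured
RewriteRule ^(.*)$ $1.html [L]
```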



Another often encountered rule is
this:



RewriteRule (.*) index.php?url=$1



This,
again, matches anything and rewrites it to the file index.php with the originally
requested URL appended in the url query parameter. I.e., for
any and all requests coming in, the file index.php is executed and this file will have
access to the original request in $_GET['url'], so it can do
anything it wants with it.
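
In practice this kind of catch-all rule is usually paired with conditions so that real files (images, stylesheets, index.php itself) are still served directly. A minimal sketch, assuming a .htaccess file sitting next to index.php:

```apache
RewriteEngine On

# Leave requests for existing files and directories alone
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Everything else goes to index.php; QSA keeps the original query string,
# L stops processing further rules for this pass
RewriteRule ^(.*)$ index.php?url=$1 [QSA,L]
```

With this in place, a request for a made-up URL such as /articles/42?page=2 would reach index.php with $_GET['url'] set to "articles/42" and $_GET['page'] set to "2".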



Primarily you put
these rewrite rules into your web server configuration file. Apache
also allows* you to put them into a file called .htaccess
within your document root (i.e. next to your .php
files).



* If
allowed by the primary Apache configuration file; it's optional, but often
enabled.
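
For reference, whether .htaccess files may contain rewrite rules is controlled by the AllowOverride directive in the primary configuration. A sketch, assuming the document root /var/www used earlier:

```apache
<Directory "/var/www">
    # mod_rewrite in .htaccess needs symlink following enabled
    Options FollowSymLinks
    # FileInfo is the override class that covers the mod_rewrite directives
    AllowOverride FileInfo
</Directory>
```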





mod_rewrite
does not magically make all your URLs "pretty". This is a common misunderstanding. If
you have this link in your web
site:




            <a href="/my/ugly/link.php?is=not&very=pretty">


there's
nothing mod_rewrite can do to make that pretty. In order to make this a pretty link, you
have to:




  1. Change
    the link to a pretty link:



                <a href="/my/pretty/link">


  2. Use
    mod_rewrite on the server to handle the request to the URL
    /my/pretty/link using any one of the methods described
    above.
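
Step 2 could look like the following sketch, which assumes a .htaccess file in the document root and that the old script at /my/ugly/link.php is still in place to do the actual work:

```apache
RewriteEngine On

# Internally map the pretty URL back onto the script and its parameters;
# the browser's address bar keeps showing /my/pretty/link
RewriteRule ^my/pretty/link$ my/ugly/link.php?is=not&very=pretty [L]
```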




(One could use mod_substitute in conjunction to transform outgoing HTML pages and their
contained links, though this is usually more effort than just updating your HTML
resources.)



There's a lot mod_rewrite
can do and very complex matching rules you can create, including chaining several
rewrites, proxying requests to a completely different service or machine, returning
specific HTTP status codes as responses, redirecting requests etc. It's very powerful
and can be used to great good if you understand the fundamental HTTP request-response
mechanism. It does not automatically make your links
pretty.
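
A few of those capabilities, sketched as .htaccess rules. The paths and the backend address are made up for illustration, and the proxy rule assumes mod_proxy and mod_proxy_http are loaded:

```apache
RewriteEngine On

# Redirect: tell the browser to request a different URL (visible to the client)
RewriteRule ^old-page$ /new-page [R=301,L]

# Status code: answer with 410 Gone; "-" means "do not substitute anything"
RewriteRule ^discontinued$ - [G]

# Status code: answer with 403 Forbidden for everything under private/
RewriteRule ^private/ - [F]

# Proxy: hand the request to a different service and relay its response
RewriteRule ^api/(.*)$ http://127.0.0.1:8080/$1 [P]
```

Unlike the internal rewrites shown earlier, the R flag changes what the client sees: the browser receives a redirect response and issues a second request for the new URL.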



See the official documentation (http://httpd.apache.org/docs/current/mod/mod_rewrite.html)
for all the possible flags and options.

