A tiny .htaccess tutorial

This is mostly a "stick this in your application" tutorial. The file only does two things, and only on Apache servers, and I explained them as simply as I could. But they offer HUGE functionality for web developers, so take the time to read about it.

Here is yet another tutorial on a subject about which I know very little. Yet the subject is so powerful that I feel a tutorial might be of help to some of you who hang out at this site. As far as I know, it only works with Apache webservers.

The .htaccess file controls a lot of aspects of how your Apache webserver behaves. I am only going to address two of them here, and I do not even know these all that well. In fact, the sample file was written by someone else three years ago, and the interpretation of the regular expression is beyond my skill. But it works, and you too can just drop it onto your website and get a whole bunch of cool functionality for very little effort.

The sample file


DirectoryIndex index.php
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.+) PageNotFound.php


This file goes in any directory on your webserver, and affects that directory and all directories under it. You may need to adjust your Dreamweaver settings to see it: because the filename starts with a dot, Unix-like servers and many FTP clients treat it as a hidden file. There is an option in your site settings to show hidden files that will make it visible. There are probably other, better ways to do the same things. If someone can enlighten us, please add a comment to this article.

DirectoryIndex filename [filename] [filename]

This command tells the webserver which file to use as the default if no pagename is given. Using this command, you can make any page the default, not just the ones chosen by your web administrator. As long as one of the listed files exists, it also prevents the webserver from showing a directory of the files when no filename is requested. You can list multiple filenames, and it will try them in the order you list them until it finds one that exists.
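
As a rough sketch of the two behaviors just described, the lines below list two candidate default pages and then switch directory listings off explicitly. The filename index.html is only an illustration, and the Options line is an addition that is not part of the sample file. It assumes your host allows Options to be overridden from .htaccess; if it does not, Apache will report a server error instead.

```apache
# Try index.php first, then index.html (hypothetical second choice)
DirectoryIndex index.php index.html

# Assumption: the server's AllowOverride setting permits Options here.
# Turns off the automatic file listing for directories with no index file.
Options -Indexes
```

With the listing turned off, a request for a directory that contains none of the listed files should get a 403 Forbidden response rather than a list of its contents.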

RewriteEngine On

This is the real reason why I am writing this tutorial. This line and the three that follow do the following:

Whenever a requested file or directory is not found, instead of giving out an HTTP 404 error, Apache hands the request to the page specified (in this case PageNotFound.php). This is an internal rewrite rather than a true HTTP redirect, so it is totally transparent to the client, and this is what makes it such a big WOW!! If you use Internet Explorer, you know that you have no control over what the user sees in a Page Not Found condition. Microsoft intercepts the 404 and displays its own error message, which is probably not what you want to happen. Maybe you want to send the user back to the site index, maybe to a sitemap page, or maybe you want to log the URL and referring page for future investigations or updating of links. By including these lines in your .htaccess file, you take total control over this aspect of the user experience, instead of letting BillG do the driving.
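
The same sample file with comments added may help show what each line contributes. This is a sketch based on how mod_rewrite is documented to behave, not a tested replacement. The only change is the [L] flag at the end, which is an addition not in the original file; it tells Apache to stop applying further rewrite rules once this one matches.

```apache
# Serve index.php when a directory is requested without a filename
DirectoryIndex index.php

# Switch on mod_rewrite for this directory
RewriteEngine on

# Condition 1: the requested path is NOT an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
# Condition 2: the requested path is NOT an existing file
RewriteCond %{REQUEST_FILENAME} !-f

# If both conditions hold, ^(.+) matches any non-empty path and the
# request is answered by PageNotFound.php instead.
# [L] is an addition here: stop applying further rules after this one.
RewriteRule ^(.+) PageNotFound.php [L]
```

Because the request is rewritten inside the server, the response status is whatever PageNotFound.php sends, which is normally 200 unless the script sets a 404 header itself.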

Even more cool, I used this feature to significantly affect one application that I wrote a couple of years back. I had built a totally database-driven site that looks a lot like the portals of today. But instead of using the URL?Page=999 syntax, I gave each page in the database a URL-like name. I used this URL in all page links, as well as search engine listings. But many of these pages did not really exist. Instead, when Apache hit a missing pagename, it redirected the client to PageNotFound.php. That page had extensive database logic to do a page lookup based on the URL, and populated a page with everything it needed, including <TITLE> and <META> tags. It looked to the user just as though the page was really there. And more importantly, the pages all registered with the search engines because there was no "?" in the URL to interrupt search engine parsing.

You can see it in action at Netcitizen, an old site that is soon to be retired. Just follow a few of the article links. Everything except for the form and update pages (and index.php) is really a database page.

If you are smarter than me, you can make this work even better. I am sure there are ways to write those regex lines to redirect only certain filenames to specific places. This would be cool in a catalog application, where you could direct itemxxxx.php to ItemNotFound.php and categoryxxx.php to CategoryNotFound.php. The implications for search engines indexing your site are huge.
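
One way the catalog idea might look is sketched below. It assumes item and category pages are requested with names such as item1234.php and category12.php, and that ItemNotFound.php and CategoryNotFound.php are real files in the same directory as the .htaccess file; the example request names are made up for illustration. RewriteCond lines apply only to the RewriteRule that directly follows them, so the two conditions are repeated for each rule.

```apache
RewriteEngine on

# Missing item pages, e.g. item1234.php (hypothetical name)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^item[^/]*\.php$ ItemNotFound.php [L]

# Missing category pages, e.g. category12.php (hypothetical name)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^category[^/]*\.php$ CategoryNotFound.php [L]

# Anything else that is missing still goes to the general page
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.+) PageNotFound.php [L]
```

The first two patterns only match names directly in the directory holding the .htaccess file, since [^/]* does not cross into subdirectories; missing files further down fall through to the last rule.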

You can also cheat like I do, and use different .htaccess files in subdirectories. This gives you similar functionality to tweaking the sample file, but forces you to use multiple directories to get there. I have not fully explored the implications of this, but it seems like it should work just fine. Maybe someday I will learn enough about regex to do it all in one place.
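
A sketch of the subdirectory approach, assuming a hypothetical items/ directory that holds its own ItemNotFound.php: the lines below would be saved as items/.htaccess. By default, rewrite rules in a subdirectory's .htaccess take the place of the parent directory's rules rather than adding to them, so the file has to be complete on its own.

```apache
# items/.htaccess (hypothetical directory name)
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# The target is taken relative to this directory: items/ItemNotFound.php
RewriteRule ^(.+) ItemNotFound.php [L]
```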

As always with my tutorials, anyone who has any comments is encouraged to add them. This is especially true if I have any misinformation here. I will benefit more than anyone by being set on the correct path. As I said, I don't know how it works, I just know what it does. But in the meantime, feel free to cut and paste this .htaccess file into your application, even if the only reason is to present a prettier "OOPS, sorry about that" page to your users.

Comments

what about directory browsing

November 8, 2001 by Josh Crosby
How could you use this method to stop prowling eyes from looking into your directories (i.e. /images/)?

More Info

December 10, 2001 by Matt Jensen

Here is something I have archived, posted by Jeff Samborski somewhere:

If you're on a Unix server you can use .htaccess and .htpasswd.
The .htaccess / .htpasswd method is very secure.

This method consists of two text files that work together to password protect
all of the files in a folder.

In your text editor create a file named .htaccess with the following
content:

AuthUserFile /full/path/to/.htpasswd
AuthName "realm"
AuthType Basic
Require valid-user

and a file named .htpasswd with the following content:

username:[encryptedPassword]

Upload .htaccess to the folder you want to protect (remember, all files in
this folder will be password protected).
Upload .htpasswd to a different folder (it's best to create a folder just for
this file).

Go to one of the Perl script archives and get one of the password management
scripts. This will create the encrypted passwords for the .htpasswd file and
allow you to easily manage, change and delete passwords.

--
Jeff Samborski
Lambert & Samborski Design
http://www.lamsam.com

More Info Again

December 10, 2001 by Matt Jensen

Page not found error

March 28, 2003 by Tracy Watson

The .htaccess is a great tool and I've successfully used it on both Unix and Linux servers mainly for the purposes of avoiding the UGLY 404 errors that inevitably crop up. 

I discovered a nifty piece of information last week in relation to the issue of MSIE displaying its own 404 page. To overcome this and to force MSIE to display your version of the 404 error (and the type I use are just basic HTML), you need to make your 404 file larger than 512 bytes. I'm not sure why this is, but it works.

Hope this can be of help to some of you out there.
