.htaccess file guide

By shubham sharma, in htaccess, February 8, 2017

.htaccess is one of the oldest configuration files on the web. It controls the web server running your website, and it is one of the most powerful configuration files you will ever come across. .htaccess can control access to anything served over the WWW’s HyperText Transfer Protocol (HTTP) using password protection, 301 redirects, rewrites, and much, much more. That reach comes from its age: the file format dates back to the earliest days of the web, when it was written for one of the first web servers ever, NCSA HTTPd, the server Apache later grew out of.

This is not an introduction to .htaccess… This is the evolution of the best of the best.

You’ve come to the right place if you are looking to acquire mad skills for using .htaccess files.

Originally (2003) this guide was known in certain hacker circles and hidden corners of the net as the ultimate .htaccess guide, both for its powerful htaccess tricks and tips for bypassing security on a webhost and because many of the tricks and examples were pretty impressive back then in that group.


^Htaccess – Evolved

The Hypertext Transfer Protocol (HTTP) was initiated at CERN in Geneva, Switzerland, where it emerged (together with the HTML presentation language) from the need to exchange scientific information on a computer network in a simple manner. The first public HTTP implementation only allowed for plain text information, and it quickly became a replacement for the Gopher service. One of the first text-based browsers was Lynx, which still exists today; a graphical HTTP client appeared very quickly under the name NCSA Mosaic, which was a popular browser back in 1994. Soon the need for a richer multimedia experience was born, and the markup language grew to support a growing multitude of media types.

Htaccess file know-how will do several things for you:

  • Make your website noticeably faster.
  • Allow you to debug your server with ease.
  • Make your life easier and more rewarding.
  • Allow you to work faster and more productively.

^AskApache Htaccess Journey

Skip this – still under edit

I discovered these tips and tricks mostly while working as a network security penetration specialist hired to find security holes in web hosting environments. Shared hosting is the most common and cheapest form of web hosting, where multiple customers are placed on a single machine and “share” the resources (CPU/RAM/SPACE). The machines are configured to basically ONLY do HTTP and FTP. No shells or any interactive logins, no ssh, just FTP access. That is when I started examining htaccess files in great detail and learned about the incredible untapped power of htaccess. 99% of the world’s best Apache admins don’t use .htaccess much, if AT ALL. It’s much easier, safer, and faster to configure Apache using the httpd.conf file instead. However, this file is almost never readable on shared hosts, and I’ve never seen it writable. So the only avenue left for those on shared hosting was and is the .htaccess file, and holy freaking fiber-optics.. it’s almost as powerful as httpd.conf itself!

Almost all .htaccess code works in the httpd.conf file, but only around 50% of httpd.conf code works in .htaccess files. So the best Apache admins and programmers never used .htaccess files. There was no incentive for those with access to httpd.conf to use htaccess, and the gap grew. It’s common to see “computer gurus” on forums and mailing lists rail against all uses and users of .htaccess files, smugly announcing the well-known problems with .htaccess files compared with httpd.conf. I wonder if these “gurus” know the history of the htaccess file, like its use in the earliest versions of the HTTP server, NCSA’s HTTPd, which BTW is the code base Apache HTTP grew out of. So you could easily say that htaccess files predate Apache itself.

Once I discovered what .htaccess files could do towards helping me enumerate and exploit security vulnerabilities even on big shared hosts, I focused all my research on .htaccess files, meaning I was reading the venerable Apache HTTP source code 24/7! I compiled every released version of the Apache Web Server, ever, even NCSA’s, and focused on enumerating the most powerful htaccess directives. Good times! Because my focus was on protocol/file/network vulnerabilities instead of web dev, I built up a nice toolbox of htaccess tricks to do unusual things. When I switched over to webdev in 2005 I started using htaccess for websites, not research. I documented most of my favorites and rewrote the htaccess guide for web developers. After some great encouragement on various forums and nets I decided to start a blog to share my work with everyone. AskApache.com was registered, I published my guide, and it was quickly plagiarized and scraped all over the net. Information is freedom, and freedom is information, so this blog has the least restrictive copyright for you. Feel free to modify, copy, republish, sell, or use anything on this site 😉

^What Is .htaccess

Specifically, .htaccess is the default file name of a special configuration file that provides a number of directives (commands) for controlling and configuring the Apache Web Server, and also to control and configure modules that can be built into the Apache installation, or included at run-time like mod_rewrite (for htaccess rewrite), mod_alias (for htaccess redirects), and mod_ssl (for controlling SSL connections).

Htaccess allows for decentralized management of Web Server configurations which makes life very easy for web hosting companies and especially their savvy consumers. They set up and run “server farms” where many hundreds and thousands of web hosting customers are all put on the same Apache Server. This type of hosting is called “virtual hosting” and without .htaccess files would mean that every customer must use the same exact settings as everyone else on their segment. So that is why any half-decent web host allows/enables (DreamHost, Powweb, MediaTemple, GoDaddy) .htaccess files, though few people are aware of it. Let’s just say that if I was a customer on your server-farm, and .htaccess files were enabled, my websites would be a LOT faster than yours, as these configuration files allow you to fully take advantage of and utilize the resources allotted to you by your host. If even 1/10 of the sites on a server-farm took advantage of what they are paying for, the providers would go out of business.

SKIP: History of Htaccess in 1st Apache.

One of the design goals for this server was to maintain external compatibility with the NCSA 1.3 server — that is, to read the same configuration files, to process all the directives therein correctly, and in general to be a drop-in replacement for NCSA. On the other hand, another design goal was to move as much of the server’s functionality into modules which have as little as possible to do with the monolithic server core. The only way to reconcile these goals is to move the handling of most commands from the central server into the modules.

However, just giving the modules command tables is not enough to divorce them completely from the server core. The server has to remember the commands in order to act on them later. That involves maintaining data which is private to the modules, and which can be either per-server, or per-directory. Most things are per-directory, including in particular access control and authorization information, but also information on how to determine file types from suffixes, which can be modified by AddType and DefaultType directives, and so forth. In general, the governing philosophy is that anything which can be made configurable by directory should be; per-server information is generally used in the standard set of modules for information like Aliases and Redirects which come into play before the request is tied to a particular place in the underlying file system.

Another requirement for emulating the NCSA server is being able to handle the per-directory configuration files, generally called .htaccess files, though even in the NCSA server they can contain directives which have nothing at all to do with access control. Accordingly, after URI -> filename translation, but before performing any other phase, the server walks down the directory hierarchy of the underlying filesystem, following the translated pathname, to read any .htaccess files which might be present. The information which is read in then has to be merged with the applicable information from the server’s own config files (either from the <Directory> sections in access.conf, or from defaults in srm.conf, which actually behaves for most purposes almost exactly like <Directory />).

Finally, after having served a request which involved reading .htaccess files, we need to discard the storage allocated for handling them. That is solved the same way it is solved wherever else similar problems come up, by tying those structures to the per-transaction resource pool.

^Creating Htaccess Files

Htaccess files use the default filename “.htaccess”, but any Unix-style file name can be specified from the main server config using the AccessFileName directive. The file isn’t .htaccess.txt, it’s literally just named .htaccess.
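
As a minimal sketch of the AccessFileName directive mentioned above (the name “.config” is made up for illustration), the main server config could tell Apache to look for a differently named per-directory file first:

```apache
# Main server config (httpd.conf) only; AccessFileName is not allowed inside .htaccess.
# Look for ".config" first, then fall back to ".htaccess".
# ".config" is a hypothetical name chosen for this example.
AccessFileName .config .htaccess
```

On shared hosting you normally cannot change this, so assume the name is .htaccess unless your host says otherwise.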

In a Windows environment like the one I use for work, you can change how Windows opens and views .htaccess files by modifying the Folder Options in Explorer. On my computer, files ending in .htaccess are recognized as having the HTACCESS extension and are handled/opened by Adobe Dreamweaver CS4.

^Htaccess Scope

Unlike the main server configuration files like httpd.conf, Htaccess files are read on every request, so changes in these files take effect immediately. For each request, Apache looks for an .htaccess file in every htaccess-enabled directory along the path to the requested file, which costs some performance because of the extra file accesses. I’ve never noticed a performance loss but OTOH, I know how to use them. If you do have access to your main server configuration file, you should, of course, use that instead, and lucky for you ALL the .htaccess tricks and examples can be used there as well (just not vice versa).
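
If you do control the main config, here is a rough sketch of how the per-request .htaccess lookups can be switched off for a directory tree that doesn’t need them. The path is a placeholder, not something from this guide:

```apache
# httpd.conf, not .htaccess. "/var/www/static" is a hypothetical path.
<Directory "/var/www/static">
    # Do not read .htaccess files in this directory tree at all
    AllowOverride None
</Directory>
```

Any settings you still need for that tree then go inside the same <Directory> block.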

^Htaccess File Syntax

Htaccess files follow the same syntax as the main Apache configuration files; for power users, here’s an apache.vim for VI. The one main difference is the context of the directive, which means whether or not that directive is ALLOWED to be used inside of an .htaccess file. Htaccess files are incredibly powerful, and can also be very dangerous, as some directives allowed in the main configuration files would allow users/customers to completely bypass security/bandwidth-limits/resource-limits/file-permissions, etc. About 1/4 of all Apache directives cannot be used inside an .htaccess file (also known as a per-directory context config). The Apache Developers are well-regarded throughout the world as being among some of the best programmers, ever. To enable a disallowed directive inside a .htaccess file would require modifying the source code and re-compiling the server (which they allow and encourage if you are the owner/admin).

^Htaccess Directives

Don’t ask why, but I personally downloaded each major/beta release of the Apache HTTPD source code from version 1.3.0 to version 2.2.10 (all 63 Apache versions!), then I configured and compiled each version for a custom HTTPD installation built from source. This allowed me to find every directive allowed in .htaccess files for each particular version, which has never been done before, or since. YES! I think that is so cool..

A .htaccess directive is basically a command that is specific to a module or builtin to the core that performs a specific task or sets a specific setting for how Apache serves your WebSite. Directives placed in Htaccess files apply to the directory they are in, and all sub-directories. Here are the 3 top links (official Apache Docs) you will repeatedly use, bookmark/print/save them.

  1. Terms Used to Describe Directives
  2. Official List of Apache Directives
  3. Directive Quick-Reference — with Context

^Main Server Config Examples

Now let’s take a look at some htaccess examples to get a feel for the syntax and a general idea of the capabilities. Some of the best examples for .htaccess files are included with Apache for main server config files, so let’s take a quick look at a couple of them on our way down to the actual .htaccess examples further down the page (this site has thousands, take your time). The basic syntax is simple: a line starting with # is a comment, and every other line is a directive name followed by its arguments.
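
To make the comment/directive/argument shape concrete, here is a tiny illustrative .htaccess. The file names are placeholders, and each directive only works if the host’s AllowOverride setting permits it:

```apache
# A line starting with "#" is a comment and is ignored.
# Every other line is a directive name followed by its argument(s).

# Turn off automatic directory listings
Options -Indexes

# Files to try, in order, when a directory is requested
DirectoryIndex index.html index.php

# Page to serve for "404 Not Found" (hypothetical file)
ErrorDocument 404 /404.html
```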

httpd-multilang-errordoc.conf: The configuration below implements multi-language error documents through content-negotiation
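
The listing itself is not reproduced here, so what follows is an abridged sketch in the style of the Apache 2.2 sample file, not a verbatim copy. The install path is an assumption, and the Order/Allow lines are 2.2-era access syntax:

```apache
# Main server config. "/usr/local/apache2/error" is an assumed install path.
Alias /error/ "/usr/local/apache2/error/"

<Directory "/usr/local/apache2/error">
    AllowOverride None
    Options IncludesNoExec
    # Parse the .html error pages for server-side includes
    AddOutputFilter Includes html
    # Treat .var files as type maps for content negotiation
    AddHandler type-map var
    Order allow,deny
    Allow from all
    # Language preference order used when the client expresses none
    LanguagePriority en cs de es fr it ja ko nl pl pt-br ro sv tr
    ForceLanguagePriority Prefer Fallback
</Directory>

# Point error codes at the negotiated documents
ErrorDocument 403 /error/HTTP_FORBIDDEN.html.var
ErrorDocument 404 /error/HTTP_NOT_FOUND.html.var
ErrorDocument 500 /error/HTTP_INTERNAL_SERVER_ERROR.html.var
```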

Here are the rest of them if you wanna take a look. (httpd-mpm.conf, httpd-default.conf, httpd-ssl.conf, httpd-info.conf, httpd-vhosts.conf, httpd-dav.conf)


^Example .htaccess Code Snippets

Here are some specific examples. This is the most popular section of this page and is updated frequently.

^Redirect Everyone Except IP address to an alternate page
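
One way to sketch this with mod_rewrite, assuming a placeholder IP address (203.0.113.10) and a placeholder page (/maintenance.html):

```apache
RewriteEngine On
# Skip the one allowed address (placeholder IP)
RewriteCond %{REMOTE_ADDR} !^203\.0\.113\.10$
# Do not redirect the alternate page itself, or the redirect would loop
RewriteCond %{REQUEST_URI} !^/maintenance\.html$
# Send everyone else to the alternate page with a temporary redirect
RewriteRule ^ /maintenance.html [R=302,L]
```

A 302 is used so browsers don’t cache the redirect once you take it down.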

^When developing sites

This lets google crawl the page, lets me access without a password, and lets my client access the page WITH a password. It also allows for XHTML and CSS validation! (w3.org)
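
A minimal sketch of that setup in Apache 2.2-style syntax. The IP address, realm name and .htpasswd path are placeholders; on Apache 2.4 these access directives need mod_access_compat or a rewrite to Require:

```apache
AuthType Basic
AuthName "Development Site"
# Hypothetical path; keep the file outside the web root
AuthUserFile /home/user/.htpasswd
Require valid-user

Order deny,allow
Deny from all
# My own address (placeholder), the W3C validators, and Google's crawler
Allow from 203.0.113.10 w3.org googlebot.com

# Pass if EITHER the host check OR the password check succeeds
Satisfy Any
```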

^Fix double-login prompt

Redirect non-https requests to https server and ensure that .htpasswd authorization can only be entered across HTTPS
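
Here is a rough sketch of the idea using mod_ssl directives, assuming a placeholder hostname and .htpasswd path. A plain-HTTP request is refused with a 403 before any password prompt appears, and the 403 ErrorDocument bounces the visitor to the HTTPS URL, so credentials are only ever asked for once and only over SSL:

```apache
# Keep the SSL requirement in force even if "Satisfy Any" is used elsewhere
SSLOptions +StrictRequire
# Refuse any request that did not arrive over HTTPS
SSLRequireSSL
# A full URL here makes Apache answer the 403 with a redirect (placeholder host)
ErrorDocument 403 https://www.example.com/

AuthType Basic
AuthName "Protected"
# Hypothetical path
AuthUserFile /home/user/.htpasswd
Require valid-user
```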

^Set Timezone of the Server (GMT)

^Administrator Email for ErrorDocument

^ServerSignature for ErrorDocument

^Charset and Language headers

Article: Setting Charset in htaccess, and article by Richard Ishida
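
A short example of the kind of directives this refers to; UTF-8 and en-US are assumed values, so swap in whatever your content actually uses:

```apache
# Add "charset=UTF-8" to the Content-Type of text/plain and text/html responses
AddDefaultCharset UTF-8
# Send "Content-Language: en-US" for files with no language extension
DefaultLanguage en-US
```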

^Disallow Script Execution
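
A sketch for something like an uploads directory. The extension list is only an example:

```apache
# Forbid CGI execution in this directory and below
Options -ExecCGI
# Treat these extensions as CGI scripts; with ExecCGI off, requests for them are refused
AddHandler cgi-script .php .pl .py .jsp .asp .htm .shtml .sh .cgi
```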

^Deny Request Methods
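
A minimal mod_rewrite sketch. The list of allowed methods is an assumption, so adjust it to what your site really needs:

```apache
RewriteEngine On
# Anything other than these methods gets a 403 Forbidden
RewriteCond %{REQUEST_METHOD} !^(GET|HEAD|OPTIONS|POST|PUT)$
RewriteRule ^ - [F]
```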

^Force “File Save As” Prompt
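
One simple approach, sketched with an example extension list, is to serve those files as a generic binary type. Most browsers then offer to save the file instead of displaying it, though the exact behavior is up to the browser:

```apache
# Serve these extensions as generic binary data
AddType application/octet-stream .avi .mpg .mov .pdf .xls .mp4
```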

^Show CGI Source Code

^Serve all .pdf files on your site using .htaccess and mod_rewrite with the php script.

^Rewrite to www
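
A sketch with example.com standing in for your own domain:

```apache
RewriteEngine On
# Any host other than www.example.com (placeholder domain)
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
# Permanently redirect to the same path on the www host
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```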

^Rewrite to www dynamically

^301 Redirect Old File
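
With mod_alias this is a one-liner; both paths and the hostname below are placeholders:

```apache
# Permanently redirect one old URL to its new location
Redirect 301 /old/file.html http://www.example.com/new/file.html
```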

^301 Redirect Entire Directory
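
A sketch using RedirectMatch so that everything under the old directory keeps its relative path (directory names and hostname are placeholders):

```apache
# /old-dir/anything -> /new-dir/anything, permanently
RedirectMatch 301 ^/old-dir/(.*)$ http://www.example.com/new-dir/$1
```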

^Protecting your php.cgi

^Set Cookie based on Request

This code sends the Set-Cookie header to create a cookie on the client with the value of the matching item in the 2nd parentheses.
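
A sketch of such a rule, assuming URLs that end in a two-letter language directory and a placeholder cookie domain:

```apache
RewriteEngine On
RewriteBase /
# $2 is the language code captured by the 2nd set of parentheses.
# CO=name:value:domain:lifetime-in-minutes:path
RewriteRule ^(.*)(de|es|fr|it|ja|ru|en)/$ - [CO=lang:$2:.example.com:7200:/]
```

The “-” substitution leaves the URL untouched, so the rule only sets the cookie.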

^Set Cookie with env variable

^Custom ErrorDocuments
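
For example, assuming you keep your error pages in a hypothetical /errors/ directory:

```apache
ErrorDocument 403 /errors/403.html
ErrorDocument 404 /errors/404.html
ErrorDocument 500 /errors/500.html
```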

^Implementing a Caching Scheme with .htaccess

^Password Protect single file
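
A minimal sketch, with a placeholder file name and .htpasswd path:

```apache
<Files "login.php">
    AuthType Basic
    AuthName "Restricted"
    # Hypothetical path; keep the file outside the web root
    AuthUserFile /home/user/.htpasswd
    Require valid-user
</Files>
```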

^Password Protect multiple files

^Send Custom Headers

^Blocking based on User-Agent Header

^Blocking with RewriteCond

^.htaccess for mod_php

^.htaccess for php as cgi

^Shell wrapper for custom php.ini

^Add values from HTTP Headers

^Stop hotlinking
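
A common mod_rewrite sketch, with example.com standing in for your own domain and an example list of extensions:

```apache
RewriteEngine On
# Let requests with no Referer through (direct visits, some proxies)
RewriteCond %{HTTP_REFERER} !^$
# Let requests referred by your own site through (placeholder domain)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Everything else asking for these file types gets a 403
RewriteRule \.(gif|jpe?g|png|css)$ - [F,NC]
```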

^Turn logging off for IP

^Turn logging on for IP


^Example .htaccess Files

Here are some samples and examples taken from different .htaccess files I’ve used over the years. Specific solutions are farther down on this page and throughout the site.

Here are some default MOD_REWRITE code examples.

Examples of protecting your files and securing with password protection.


^Advanced Mod_Rewrites

Here are some specific htaccess examples taken mostly from my WordPress Password Protection plugin, which does a lot more than password protection, as you will see from the following mod_rewrite examples. These are a few of the mod_rewrite uses that BlogSecurity declared pushed the boundaries of Mod_Rewrite! Some of these snippets are quite exotic and unlike anything you may have seen before. They are also only for those who understand them, as they can kill a website pretty quick.

^Directory Protection

Enable the DirectoryIndex Protection, preventing directory index listings and defaulting. [Disable]

^Password Protect wp-login.php

Requires a valid user/pass to access the login page[401]

^Password Protect wp-admin

Requires a valid user/pass to access any non-static (css, js, images) file in this directory.[401]

^Protect wp-content

Denies any direct request for files ending in .php with a 403 Forbidden. May break plugins/themes. [403]

^Protect wp-includes

Denies any direct request for files ending in .php with a 403 Forbidden. May break plugins/themes. [403]

^Common Exploits

Block common exploit requests with 403 Forbidden. These can help a lot, but may break some plugins. [403]

^Stop Hotlinking

Denies any request for static files (images, css, etc) if referrer is not local site or empty. [403]

^Safe Request Methods

Denies any request not using GET,PROPFIND,POST,OPTIONS,PUT,HEAD[403]

^Forbid Proxies

Denies any POST Request using a Proxy Server. Can still access site, but not comment. See Perishable Press [403]

^Real wp-comments-post.php

Denies any POST attempt made to a non-existing wp-comments-post.php[403]

^HTTP PROTOCOL

Denies any badly formed HTTP PROTOCOL in the request, 0.9, 1.0, and 1.1 only[403]

^SPECIFY CHARACTERS

Denies any request for a url containing characters other than “a-zA-Z0-9.+/-?=&” – REALLY helps but may break your site depending on your links. [403]

^BAD Content Length

Denies any POST request that doesn’t have a Content-Length header. [403]

^BAD Content Type

Denies any POST request with a content type other than application/x-www-form-urlencoded|multipart/form-data[403]

^Missing HTTP_HOST

Denies requests that don’t contain an HTTP Host header. [403]

^Bogus Graphics Exploit

Denies obvious exploit using bogus graphics[403]

^No UserAgent, Not POST

Denies POST requests by blank user-agents. May prevent a small number of visitors from POSTING. [403]

^No Referer, No Comment

Denies any comment attempt with a blank HTTP_REFERER field, highly indicative of spam. May prevent some visitors from POSTING. [403]

^Trackback Spam

Denies obvious trackback spam. See Holy Shmoly! [403]

^Map all URIs except those corresponding to existing files to a handler
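
A sketch of the usual .htaccess form of this, with /script.php as a placeholder handler:

```apache
RewriteEngine On
# Leave requests for real files and directories alone
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Everything else goes to the handler
RewriteRule . /script.php [L]
```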

^Map any request to a handler

In the case where all URIs should be sent to the same place (including potentially requests for static content) the method to use depends on the type of the handler. For php scripts, use: For other handlers such as php scripts, use:

^And for CGI scripts:

^Map URIs corresponding to existing files to a handler instead

If the existing files you wish to have handled by your script have a common set of file extensions distinct from that of the handler, you can bypass mod_rewrite and use mod_actions instead. Let’s say you want all .html and .tpl files to be dealt with by your script:
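
A minimal mod_actions sketch for that case. The handler name “foo-action” and the script path are placeholders:

```apache
# Define a handler name and the script that implements it
Action foo-action /script.php
# Send every .html and .tpl file through that handler
AddHandler foo-action html tpl
```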

^Deny access if var=val contains the string foo.
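
One way to sketch this in .htaccess, assuming the parameter is literally named var:

```apache
RewriteEngine On
# Match a "var" parameter whose value contains "foo"
RewriteCond %{QUERY_STRING} (^|&)var=[^&]*foo
RewriteRule ^ - [F]
```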

^Removing the Query String
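
A sketch for a placeholder URL /url. A trailing “?” on the substitution drops the query string, and the condition keeps the rule from firing again once the query string is gone:

```apache
RewriteEngine On
# Only act when there is a query string to remove
RewriteCond %{QUERY_STRING} .
# The trailing "?" discards the original query string
RewriteRule ^url$ /url? [R=302,L]
```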

^Adding to the Query String

Keep the existing query string using the Query String Append flag, and add var=val to it.
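
A sketch for a placeholder URL /url. Note that with QSA, mod_rewrite puts the new var=val first and appends the original query string after it. The condition is an added guard so the rule doesn’t reapply in .htaccess context once var=val is present:

```apache
RewriteEngine On
# Skip if var=val is already in the query string
RewriteCond %{QUERY_STRING} !(^|&)var=val(&|$)
# QSA keeps the original query string and appends it after var=val
RewriteRule ^url$ /url?var=val [QSA,L]
```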

^Rewriting For Certain Query Strings

Rewrite URLs like http://askapache.com/url1?var=val to http://askapache.com/url2?var=val but don’t rewrite if val isn’t present.
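
A sketch of that rule in .htaccess form. The query string is passed along unchanged because the substitution doesn’t contain one of its own:

```apache
RewriteEngine On
# Only rewrite when var=val is present
RewriteCond %{QUERY_STRING} (^|&)var=val(&|$)
RewriteRule ^url1$ /url2 [L]
```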

^Modifying the Query String

Change any single instance of val in the query string to other_val when accessing /path. Note that %1 and %2 are back-references to the matched part of the regular expression in the previous RewriteCond.
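
A sketch in .htaccess form. The REDIRECT_STATUS check is an added guard, not part of the original recipe: because other_val itself contains val, the rule would otherwise match again after the internal redirect.

```apache
RewriteEngine On
# Only run on the original request, not after the internal redirect
RewriteCond %{ENV:REDIRECT_STATUS} ^$
# %1 = everything before "val", %2 = everything after it
RewriteCond %{QUERY_STRING} ^(.*)val(.*)$
RewriteRule ^path$ /path?%1other_val%2 [L]
```

Because the first group is greedy, it is the last occurrence of val in the query string that gets replaced.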


^Technical Look at .htaccess

Source: Apache API notes

^Per-directory configuration structures

Let’s look at how all of this plays out in mod_mime.c, which defines the file typing handler which emulates the NCSA server’s behavior of determining file types from suffixes. What we’ll be looking at, here, is the code which implements the AddType and AddEncoding commands. These commands can appear in .htaccess files, so they must be handled in the module’s private per-directory data, which in fact, consists of two separate tables for MIME types and encoding information, and is declared as follows:

When the server is reading a configuration file, or <Directory> section, which includes one of the MIME module’s commands, it needs to create a mime_dir_config structure, so those commands have something to act on. It does this by invoking the function it finds in the module’s `create per-dir config slot’, with two arguments: the name of the directory to which this configuration information applies (or NULL for srm.conf), and a pointer to a resource pool in which the allocation should happen.

(If we are reading a .htaccess file, that resource pool is the per-request resource pool for the request; otherwise it is a resource pool which is used for configuration data, and cleared on restarts. Either way, it is important for the structure being created to vanish when the pool is cleared, by registering a cleanup on the pool if necessary).

For the MIME module, the per-dir config creation function just ap_pallocs the structure above, and creates a couple of tables to fill it. That looks like this:

Now, suppose we’ve just read in a .htaccess file. We already have the per-directory configuration structure for the next directory up in the hierarchy. If the .htaccess file we just read in didn’t have any AddType or AddEncoding commands, its per-directory config structure for the MIME module is still valid, and we can just use it. Otherwise, we need to merge the two structures somehow.

To do that, the server invokes the module’s per-directory config merge function, if one is present. That function takes three arguments: the two structures being merged, and a resource pool in which to allocate the result. For the MIME module, all that needs to be done is overlay the tables from the new per-directory config structure with those from the parent:

As a note — if there is no per-directory merge function present, the server will just use the subdirectory’s configuration info, and ignore the parent’s. For some modules, that works just fine (e.g., for the includes module, whose per-directory configuration information consists solely of the state of the XBITHACK), and for those modules, you can just not declare one, and leave the corresponding structure slot in the module itself NULL.

^Command handling

Now that we have these structures, we need to be able to figure out how to fill them. That involves processing the actual AddType and AddEncoding commands. To find commands, the server looks in the module’s command table. That table contains information on how many arguments the commands take, and in what formats, where it is permitted, and so forth. That information is sufficient to allow the server to invoke most command-handling functions with pre-parsed arguments. Without further ado, let’s look at the AddType command handler, which looks like this (the AddEncoding command looks basically the same, and won’t be shown here):

This command handler is unusually simple. As you can see, it takes four arguments, two of which are pre-parsed arguments, the third being the per-directory configuration structure for the module in question, and the fourth being a pointer to a cmd_parms structure. That structure contains a bunch of arguments which are frequently of use to some, but not all, commands, including a resource pool (from which memory can be allocated, and to which cleanups should be tied), and the (virtual) server being configured, from which the module’s per-server configuration data can be obtained if required.

Another way in which this particular command handler is unusually simple is that there are no error conditions which it can encounter. If there were, it could return an error message instead of NULL; this causes an error to be printed out on the server’s stderr, followed by a quick exit, if it is in the main config files; for a .htaccess file, the syntax error is logged in the server error log (along with an indication of where it came from), and the request is bounced with a server error response (HTTP error status, code 500).

The MIME module’s command table has entries for these commands, which look like this:

Here’s a taste of that famous Apache source code that builds the directives allowed in .htaccess file context. The key that tells whether a directive is enabled in .htaccess context is the DIR_CMD_PERMS and then the OR_FILEINFO, which means the directive is enabled depending on the AllowOverride directive, which itself is only allowed in the main config. First Apache 1.3.0, then Apache 2.2.10.

^mod_autoindex

^mod_rewrite

The entries in these tables are:

  • The name of the command
  • The function which handles it
  • A (void *) pointer, which is passed in the cmd_parms structure to the command handler — this is useful in case many similar commands are handled by the same function.
  • A bit mask indicating where the command may appear. There are mask bits corresponding to each AllowOverride option, and an additional mask bit, RSRC_CONF, indicating that the command may appear in the server’s own config files, but not in any .htaccess file.
  • A flag indicating how many arguments the command handler wants pre-parsed, and how they should be passed in. TAKE2 indicates two pre-parsed arguments. Other options are TAKE1, which indicates one pre-parsed argument, FLAG, which indicates that the argument should be On or Off, and is passed in as a boolean flag, RAW_ARGS, which causes the server to give the command the raw, unparsed arguments (everything but the command name itself). There is also ITERATE, which means that the handler looks the same as TAKE1, but that if multiple arguments are present, it should be called multiple times, and finally ITERATE2, which indicates that the command handler looks like a TAKE2, but if more arguments are present, then it should be called multiple times, holding the first argument constant.
  • Finally, we have a string which describes the arguments that should be present. If the arguments in the actual config file are not as required, this string will be used to help give a more specific error message. (You can safely leave this NULL).

Finally, having set this all up, we have to use it. This is ultimately done in the module’s handlers, specifically for its file-typing handler, which looks more or less like this; note that the per-directory configuration structure is extracted from the request_rec’s per-directory configuration vector by using the ap_get_module_config function.

^Side notes — per-server configuration, virtual servers, etc.

The basic ideas behind per-server module configuration are basically the same as those for per-directory configuration; there is a creation function and a merge function, the latter being invoked where a virtual server has partially overridden the base server configuration, and a combined structure must be computed. (As with per-directory configuration, the default if no merge function is specified, and a module is configured in some virtual server, is that the base configuration is simply ignored).

The only substantial difference is that when a command needs to configure the per-server private module data, it needs to go to the cmd_parms data to get at it. Here’s an example, from the alias module, which also indicates how a syntax error can be returned (note that the per-directory configuration argument to the command handler is declared as a dummy, since the module doesn’t actually have per-directory config data):

^Litespeed Htaccess support

Unlike other lightweight web servers, LiteSpeed Web Server fully supports Apache-compatible per-directory configuration overrides. With .htaccess you can change the configuration for any directory under the document root on the fly, which in most cases is a mandatory feature in a shared hosting environment. It is worth noting that enabling .htaccess support in LiteSpeed Web Server will not degrade the server’s performance, compared with Apache’s 40% drop in performance.
