.htaccess for Beginners

A .htaccess (hypertext access) file is a directory-level, plain-text configuration file for web servers. In simple terms, it controls access to a directory on your server. .htaccess files became popular because they could override global server settings related to directory access. Today, a .htaccess file can override many other configuration settings as well.

Several web servers, such as Apache, support .htaccess or related files. Other popular servers like Nginx do not support .htaccess files directly, but .htaccess rules can be converted into equivalent Nginx configuration.

.htaccess rules apply to a directory and all its subdirectories, unless a subdirectory contains its own .htaccess file that overrides them. The file's permissions should let everyone read it but only its owner write to it (mode 644 on Unix-like systems).

Advantages and disadvantages:

.htaccess files are read on every request. Therefore, any change to these files takes effect immediately, unlike changes to the global settings, which require the server to be restarted. Also, on a server with many users, .htaccess files let each user manage the settings for their own directories.

There is a big catch, though. On every request, the web server has to look for and read a .htaccess file in each directory along the requested path, which can lead to performance issues under considerable load. Also, decentralizing the settings to different users can lead to security issues, especially if the .htaccess files are not configured properly.

If you want to make changes to a .htaccess file, keep a backup first: a slight syntax error makes the server return a 500 Internal Server Error for everything under that directory. It is also advisable to add comments explaining each change you make.
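In Apache configuration files, a comment is a line that begins with #, and a comment cannot share a line with a directive. Here is a minimal sketch of a commented change (the date, initials, and purpose are made up for illustration):

    # 2024-03-01 (jd): turn off directory listings for this folder
    Options -Indexes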

Listed here are a few practical things that you can do with .htaccess files.

Prevent Hotlinking:

Hotlinking is when one website links directly to an object hosted on another site. The object might be an image, an audio file, or a video. Although it doesn’t really hurt for your site’s content to appear on another site, every hotlinked request eats into your bandwidth!

Any post on .htaccess would be incomplete without mentioning how we can prevent hotlinking using a simple text file! All you need to do is add the following to your .htaccess files.

    RewriteEngine on
    RewriteCond %{HTTP_REFERER} !^$
    # domains that can link to your content (images here)
    # add as many as you want
    RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?yoursite\.com [NC]
    # show no image when hotlinked
    RewriteRule \.(jpg|png|gif)$ - [NC,F,L]

Let’s try to understand what this code means. We first turn on the RewriteEngine so that the rewrite rules are processed. RewriteCond is a directive that takes two arguments, TestString and CondPattern. %{HTTP_REFERER} is a variable that holds the URL of the page from which the request is coming. The first condition (!^$) skips requests with an empty referer, so direct visits are still allowed. The second condition checks that the referer is not your own site; the (www\.)? makes sure that both www.yoursite.com and yoursite.com are allowed. When both conditions hold, the RewriteRule applies to image requests: NC tells the system to ignore case, F returns a 403 Forbidden error, and L tells the system to stop processing further rules.

You can also add an alternate image on hotlinking.

    RewriteRule \.(jpg|png|gif)$ http://<path_to_your_hotlinked_image> [NC,R,L]

The R flag here redirects the request to the alternate image.
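One thing to watch for: if the alternate image lives on your own site and its extension matches the rule, the redirected request matches the rule again and loops. A minimal sketch that excludes the alternate image from the rule, assuming it is stored at the hypothetical path /images/hotlink-notice.png:

    RewriteEngine On
    # allow requests that send no referer
    RewriteCond %{HTTP_REFERER} !^$
    # allow requests coming from your own pages
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yoursite\.com [NC]
    # leave the alternate image itself alone (hypothetical path)
    RewriteCond %{REQUEST_URI} !^/images/hotlink-notice\.png$ [NC]
    # redirect every other hotlinked image to the alternate image
    RewriteRule \.(jpg|png|gif)$ /images/hotlink-notice.png [NC,R=302,L]

Hosting the alternate image on a different domain avoids the loop as well.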

Blocking/Allowing Users from a Certain IP Address:

The .htaccess file’s basic use is to allow or block access to a certain directory. You can configure it to selectively allow or disallow requests from a certain user.

If you find that a spammer has been bothering you, or someone has been trying to scrape your content, you can block their IP address by adding the following:

    Order Allow,Deny
    Allow from all
    Deny from 192.168.121.45

You can also redirect such users to a certain URL.

    RewriteEngine on
    # matches every address in 192.168.121.*
    RewriteCond %{REMOTE_ADDR} ^192\.168\.121\.
    RewriteRule .* https://google.com [R,L]

For security purposes, you might sometimes need to selectively allow only certain IP addresses to a certain location. For instance, you might want only your IP to access the admin area of your site. In such cases, you block requests from all IPs and selectively add your IPs to the whitelist.

    Order Deny,Allow
    Deny from all
    # whitelist IP address 1
    Allow from xx.xxx.xx.xx
    # whitelist IP address 2
    Allow from xx.xxx.xx.xx

The process of blocking and allowing users can also be done using a firewall (like ufw).
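Note that Order, Allow, and Deny are the older access-control syntax. In Apache 2.4 they are deprecated and only work through mod_access_compat; the Require directive replaces them. A rough equivalent of the blocking example in the newer syntax, assuming Apache 2.4 and that your host permits these directives in .htaccess:

    # block one address, allow everyone else
    <RequireAll>
        Require all granted
        Require not ip 192.168.121.45
    </RequireAll>

The whitelist, as an alternative to the block above rather than an addition to it, could look like this (the addresses are the same placeholders as before):

    # allow only the listed addresses; everyone else is refused
    Require ip xx.xxx.xx.xx
    Require ip xx.xxx.xx.xx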

Hiding Directory Listing:

When a directory has no index file (such as index.html), opening its path in a browser may list the directory's contents. For security reasons, it is considered a good idea to restrict this. To do so, you just need to add a single line to your .htaccess file.

    Options -Indexes

Doing so shows a Forbidden page when you try to view the directory structure.

Display Custom Error Pages:

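Although most popular CMSs and frameworks can do this, the easiest way to show a custom 404 or 500 error page is through a simple change in the .htaccess file. The ErrorDocument directive takes the status code followed by the page to serve (the path below is only an example; use your own).

    ErrorDocument 404 /404.html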
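A slightly fuller sketch covering several status codes follows. The file paths are hypothetical and are resolved relative to the site's document root. A quoted string makes the server send a short inline message instead of a page.

    # page shown for missing URLs (hypothetical path)
    ErrorDocument 404 /errors/not-found.html
    # page shown for server errors (hypothetical path)
    ErrorDocument 500 /errors/server-error.html
    # short inline message instead of a page
    ErrorDocument 403 "Sorry, you are not allowed to view this page."

Prefer local paths over full URLs here: with a full URL, the server redirects the client to the error page, so the client no longer receives the original error status code.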

Force Trailing Slash:

Search engines treat www.yoursite.com/something and www.yoursite.com/something/ as two different URLs even though they may point to the exact same thing. The Google Webmasters Blog advises redirecting one version to the other; here, the non-slash version redirects to the slash version. Search engines then understand that both URLs point to the same content!

    <IfModule mod_rewrite.c>
        RewriteEngine On
        RewriteCond %{REQUEST_URI} /+[^\.]+$
        RewriteRule ^(.+[^/])$ %{REQUEST_URI}/ [R=301,L]
    </IfModule>
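The condition above decides by looking for a dot in the URL, so a file without an extension would still be redirected. An alternative sketch, not from the original post, tests whether the request maps to a real file instead:

    <IfModule mod_rewrite.c>
        RewriteEngine On
        # leave requests for files that exist on disk alone
        RewriteCond %{REQUEST_FILENAME} !-f
        # only touch URLs that do not already end in a slash
        RewriteCond %{REQUEST_URI} !/$
        RewriteRule ^ %{REQUEST_URI}/ [R=301,L]
    </IfModule>

Use one variant or the other, not both.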

Improving Performance of .htaccess:

As discussed above, .htaccess files can hurt performance under heavy load. You can limit this dip by enabling the AllowOverride option only in the directories that need it. When AllowOverride is enabled for the whole site (the default in Apache 2.2 and earlier, and often the case on shared hosts), the server checks every directory on the requested path for a .htaccess file on every request. Instead, set AllowOverride to None in the main server configuration and then allow it for certain directories, as shown below.

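    # main server configuration (not .htaccess): disable overrides everywhere
    <Directory />
        AllowOverride None
    </Directory>
    # enable allowoverride privileges
    <Directory /some/directory/where/you/want/to/enable/AllowOverride>
        AllowOverride Options
    </Directory>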

Perishable Press came up with a huge list of things that you could accomplish with .htaccess, illustrating its power. I would suggest you check that out for further reading.

We hope that this post got you started with .htaccess and serves as a launchpad for you to do far more interesting things with .htaccess. Always be careful with .htaccess because of its power: small mistakes can have huge implications!