.htaccess recipes
Note: at SDF, .htaccess only works on pages served by Apache, which at the time of this writing (2022/02/23) runs only on the cluster. This means that .htaccess will not do anything if placed in folders served by the MetaArray or VHOST.
Introduction
.htaccess is the default file used by the Apache HTTP server (and others) to allow dynamic configuration. It's a plain text file that uses the same syntax as the main configuration files (e.g., httpd.conf). It can contain a subset of Apache directives. The size of this subset depends on whether the directives can be overridden or not (this is set in the server configuration). In the Apache documentation you can see whether a directive can be placed in a .htaccess file by checking that .htaccess appears in its Context: line. For instance, it's allowed for the ForceType directive, but not for the ErrorLog directive.
The configuration directives placed in a .htaccess file take effect immediately whenever a document is accessed in the directory where the .htaccess file is located or in any of its subdirectories. The server will also search for .htaccess files in all the parent directories. If there is a conflicting configuration directive, the server will apply the one in the .htaccess file closest to the requested resource. For instance, suppose that X and Y are two generic options. If you have Options +X -Y in $HOME/html/.htaccess and Options -X in $HOME/html/files/test/.htaccess, then when you access a file in http://YOURUSERNAME.freeshell.org/files/test/ (and all subdirectories, unless you have another .htaccess file that reverts the configuration) options X and Y will both be disabled, but if you access a file in http://YOURUSERNAME.freeshell.org/files/ (and above) option X will be enabled and option Y disabled.
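To make the X/Y example concrete, here is a sketch that substitutes two real options, Indexes for X and MultiViews for Y. It assumes the server configuration allows Options to be overridden in .htaccess files; the two fragments belong in two separate files, as the comments indicate.

```apache
# --- $HOME/html/.htaccess ---
# Turn directory listings on and content negotiation off.
Options +Indexes -MultiViews

# --- $HOME/html/files/test/.htaccess ---
# Closer to the requested resource, so it wins for /files/test/ and below:
# Indexes is switched off here, and MultiViews stays off (inherited).
Options -Indexes
```

With these two files, a request under /files/test/ should see both options disabled, while a request under /files/ sees Indexes enabled and MultiViews disabled.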
Remember that .htaccess files must be readable by the server, so you can chmod 640 .htaccess to give the file the correct permissions. It is, however, good practice to run mkhomepg -p in your SDF shell every time you change files in your html directory.
Additional information about .htaccess files can be found in the Apache HTTP Server documentation.
OK, let's see some recipes. The URL http://YOURUSERNAME.freeshell.org/ will be used in the examples, so modify it to suit your needs, and remember that your .htaccess file will be placed in $HOME/html/ or in directories under it. Each example solves a specific issue, but it can give you an idea of how to deal with something more generic (e.g., an example might refer to .pl files, but by looking up the directives it mentions you could generalize it). If you need some help, jump on com or post your request on bboard.
Recipes
Redirect to a custom error page
Do you want your visitors to see your custom error pages when something goes wrong (e.g., a page not found error)? There's already a tutorial about it: custom error pages for your site.
Deny directory listing
If you type http://YOURUSERNAME.freeshell.org/pics/ and that directory has no index page, you will see a list of the files present in pics. You probably don't want this (if you don't want other people to see your private stuff, don't put it online or, at least, password protect it). Add this to your .htaccess file:
Options -Indexes
Save the current directory as an environment variable
You might already be aware that the options in an htaccess file will affect all the files in subdirectories of the htaccess file location, unless overridden by htaccess files deeper into the path. But did you ever wish you could dynamically infer the location of your base htaccess file, for substitution into later directives? It turns out that the Perl-compatible regular expressions used by Apache's mod_rewrite are flexible enough to do the trick.
If you made your website accessible via tilde-style URLs (running mkhomepg -d at the shell), then requests to anything under http://sdf.org/~YOURUSERNAME will have a different setting for DOCUMENT_ROOT than requests to the same content at http://YOURUSERNAME.sdf.org (assuming that ~/html and ~/public_html are symlinks to the same folder). A single htaccess serving both types of URLs cannot blindly make relative path substitutions in a RewriteRule directive (or even well-intentioned absolute path substitutions based on your knowledge of the SDF filesystem). Instead, you can first pass the requested URL through a do-nothing RewriteRule (the pattern ^.*$ matches everything, and the substitution - leaves it unchanged), extracting the current directory into the environment variable CWD. Then you can use the value of CWD in a subsequent RewriteRule that performs an actual substitution. Here is an example:
RewriteEngine on
RewriteBase "/"
RewriteCond $0#%{REQUEST_URI} ([^#]*)#(.*)\1$
RewriteRule ^.*$ - [E=CWD:%2]
RewriteCond %{HTTP_HOST} ^(www\.)?sdf\.org$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^.*$ %{ENV:CWD}/custom404.html
The # symbol serves as an arbitrary delimiter, ensuring that the second parenthesized group captures the working directory that's known to the server at the time of processing. Whether the working directory is an absolute pathname on the filesystem or an alias that will later be expanded by mod_userdir, the environment variable CWD will have a value compatible with the order in which Apache processes UserDir and RewriteRule directives.
This example can be extended to perform different redirections depending on which virtual host handled the transaction (just create a new stanza with a different condition on %{HTTP_HOST}). In this way you can give your visitors the same experience whether they've bookmarked your site as http://YOURUSERNAME.sdf.org or http://sdf.org/~YOURUSERNAME, without having to duplicate your content in multiple folders.
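As a sketch of such an extension, the stanza below could be appended after the example above to handle requests arriving at the subdomain-style host. The page name custom404-subdomain.html is hypothetical, and the stanza assumes the CWD variable has already been set by the earlier do-nothing rule.

```apache
# Second stanza: same checks, but for the YOURUSERNAME.sdf.org host.
RewriteCond %{HTTP_HOST} ^YOURUSERNAME\.sdf\.org$ [NC]
# Only act when the request matches neither an existing file nor a directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Hypothetical page name; CWD comes from the earlier [E=CWD:%2] rule.
RewriteRule ^.*$ %{ENV:CWD}/custom404-subdomain.html
```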
Because these RewriteRules impose additional processing demands on the server for each request, hacks like these should be a last resort. If the other workarounds mentioned in custom error pages for your site address your use case, they should be implemented instead.
Add (or force) MIME-type
The server cannot be aware of every kind of file out there, so it may have trouble figuring out what to do with an unknown extension. You can tell the server how to handle such file types. Say you have a .cab file. Apache will send your visitor's user agent the correct information about the file with:
AddType application/vnd.ms-cab-compressed .cab
AddType is the directive, application/vnd.ms-cab-compressed is the MIME-type present in the Content-Type: entry in the HTTP headers sent by Apache, and .cab is the extension.
Even if the server already knows the MIME-type for a specific file extension, you might prefer it to use another one. Let's say that you want .html files to be served as application/xhtml+xml (because you are hardcore). Try this:
AddType application/xhtml+xml .html
You can look up common MIME-types on Wikipedia or read the full list on IANA's website.
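The heading also mentions forcing a MIME-type, which is what the ForceType directive mentioned in the introduction does: it overrides whatever type the file name would otherwise imply. A minimal sketch, assuming a hypothetical extensionless file named notes that should be served as plain text:

```apache
# "notes" is a hypothetical file name; it has no extension for AddType to match.
<Files "notes">
    ForceType text/plain
</Files>
```

Wrapping the directive in a Files section keeps it from applying to every other file in the directory.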
Access files without specifying the extension
It could be desirable to avoid specifying extensions for your html pages. Why? Suppose you've always used http://YOURUSERNAME.freeshell.org/contact.php in your .sig and, at some point, you decide that you want to use Perl, so that the new address is http://YOURUSERNAME.freeshell.org/contact.pl. Unless you take other actions (redirection), people who go to the old address will find a 404 page. It would be better to use http://YOURUSERNAME.freeshell.org/contact so that you can go crazy and rewrite your site with all known languages as frequently as you want.
You can use URIs without extensions with:
Options +MultiViews
I want to access files without extension, but my (cgi|pl|php) is not found
Suppose that you have a cgi file called script.cgi and that, once you enable MultiViews (see above), you get a 404 page when you access http://YOURUSERNAME.freeshell.org/script. It's likely that the server has a problem determining the MIME-type. In this case, put this in your .htaccess file:
AddType application/x-httpd-cgi .cgi
If you have perl and/or php files, add (modify the extension as needed):
AddType application/x-httpd-php .php
AddType application/x-httpd-perl .pl
Serve .pl .php .cgi etc. as plain text files
If you don't want the server to execute your files, so that visitors can read the code of some specific files instead, you can remove the handlers. Let's say that the code you want to show is located in $HOME/html/code/. You can put the following in $HOME/html/code/.htaccess:
RemoveHandler .pl .php .cgi
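Depending on how the server maps these extensions, removing the handler may not be enough for browsers to display the source. One possible complement, offered as an assumption rather than as SDF's documented behaviour, is to also declare the files as plain text in the same $HOME/html/code/.htaccess:

```apache
# Stop treating these extensions as scripts...
RemoveHandler .pl .php .cgi
# ...and ask Apache to label them as plain text so browsers show the source.
AddType text/plain .pl .php .cgi
```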
Force a download with a specific filename
Let's say that you have a pdf file with an unintuitive name aaa222.pdf. You might want to force a download when people access the file and, in doing so, specify a default file name for the file that will be saved. This will do the job:
<Files "aaa222.pdf">
Header set Content-Disposition "attachment; filename=Thesis.pdf"
</Files>
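If you only want to force the download and keep each file's own name, a variation is to omit the filename parameter and match by extension. This sketch assumes mod_headers is available on the server (the IfModule wrapper simply skips the block otherwise):

```apache
<IfModule mod_headers.c>
    # Offer every PDF in this directory tree as a download,
    # saved under its existing name.
    <FilesMatch "\.pdf$">
        Header set Content-Disposition "attachment"
    </FilesMatch>
</IfModule>
```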
Specify a default character encoding
If you want all your html documents to be served with UTF-8 as the default encoding (or charset):
AddCharset UTF-8 .html
UTF-8 was used as an example, but you can use whatever encoding is appropriate. Note also that in the example only files with extension html will have a default encoding. If you want to extend that behavior to other file extensions, add them on the same line. For instance, AddCharset UTF-8 .html .htm .txt.
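A related directive, AddDefaultCharset, is not tied to file extensions: it adds a charset parameter to responses whose content type is text/plain or text/html. A minimal sketch, assuming the server allows this directive to be overridden in .htaccess files:

```apache
# Applies to text/plain and text/html responses,
# regardless of the file extension.
AddDefaultCharset UTF-8
```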
This can also be useful if you want only pages written in a specific language to be served with a default encoding, while the others use the encoding normally sent by the server. So, suppose that you're using language negotiation and have resources in two languages, English (with extension en.html) and Chinese (with extension cn.html). With the following line:
AddCharset UTF-8 .cn
only .cn.html files will have UTF-8 as the default encoding. (The order of the language in the extension is not relevant, i.e., the files could have been named html.en and html.cn; also, the leading dot in the extension in the .htaccess file is optional.)
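For context, a language-negotiation setup along these lines might look like the sketch below. The file names are hypothetical, and mapping the .cn extension to the zh language tag is an assumption made for illustration; adjust both to match how your files are actually named.

```apache
# Hypothetical files: page.en.html and page.cn.html, requested as /page
Options +MultiViews
# Map the extensions to language tags (zh for Chinese is an assumption here).
AddLanguage en .en
AddLanguage zh .cn
# Only the Chinese variants get an explicit UTF-8 charset.
AddCharset UTF-8 .cn
```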
Password protect your directories
This is a FAQ: http://sdf.org/index.cgi?faq?WEB?04
Force visitors to use SSL/HTTPS
As SDF expands its support of Let's Encrypt, offering free SSL certificates, you may wish to require all visitors of your site to use HTTPS. (This also may improve your search engine ranking, and many Web browsers will soon flag non-SSL sites as "Not Secure.") Adding this to the .htaccess file in your site's root directory will redirect your non-HTTPS visitors accordingly:
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
If this suddenly renders your whole site inaccessible, be sure the permissions on the .htaccess file itself are suitable (chmod 644, or run mkhomepg -p); the default umask will not allow the Web server itself to read your .htaccess file.
$Id: htaccess.html,v 1.3 2018/07/30 15:30:01 dave Exp $