Configuring Apache2 to redirect an HTTP request based on its Accept header

authored by Frank Lynam at 07/02/2013 11:33:58

In this post I am going to look at how to configure the Apache 2 daemon to redirect incoming HTTP requests to particular resources depending on the contents of their Accept headers. The HTTP client sets the Accept header to tell the server what types of content it is looking for. Here are a few typical Accept header values and the content they correspond to:


text/html		HTML content
*/*			Any content
image/jpeg		JPEG image content

In practice, the Accept header sent by a Web browser looks more like this (taken from Chrome 24.0.1312.57 running on Ubuntu):


text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

This is simply a comma-separated list of all the types that the client will accept, alongside some additional parameters (the q values, which express the client's relative preference for each type) that I am not going to get into here.
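
To make the structure of that header concrete, here is a minimal Python sketch that splits it into media types and weights. The function name parse_accept is my own invention for illustration, and the parsing is deliberately simplified: it only looks at the q parameter and ignores everything else.

```python
def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending preference."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media_type, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # q defaults to 1 when the client omits it
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed weight as "not wanted"
        entries.append((media_type.lower(), q))
    # sorted() is stable, so types with equal weights keep their original order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


chrome = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
for media_type, q in parse_accept(chrome):
    print(media_type, q)
```

For the Chrome header above this lists text/html and application/xhtml+xml first (both with the default weight of 1.0), followed by application/xml at 0.9 and the catch-all */* at 0.8.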

For my purposes, I’m interested in the RDF media types and, as far as I can tell at this point, the following seems to be a generally accepted list:


application/rdf+xml			RDF/XML
application/x-turtle			Turtle
text/plain				N-Triples
text/rdf+n3				N3
application/trix			TriX
application/x-trig			TriG
application/x-binary-rdf		Sesame Binary RDF
application/sparql-results+xml	        SPARQL results in XML format
application/sparql-results+json	        SPARQL results in JSON format
application/x-binary-rdf-results-table  Binary RDF results table format
text/boolean				Plain text boolean result format

There is, however, no guarantee that the HTTP client will use these exactly as they appear here. It is entirely up to the client in question and this obviously can be problematic for the designer of an RDF daemon.
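
One way to cope with that variation is to normalise whatever the client sends before looking it up. The following is only a sketch: the names RDF_FORMATS, ALIASES and rdf_format are hypothetical, the table holds just the first few entries from the list above, and the single alias shown (text/turtle, the registered Turtle type) is my own addition rather than part of that list.

```python
# Media types copied from the list above, keyed in lower case.
RDF_FORMATS = {
    "application/rdf+xml": "RDF/XML",
    "application/x-turtle": "Turtle",
    "text/plain": "N-Triples",
    "text/rdf+n3": "N3",
    "application/trix": "TriX",
    "application/x-trig": "TriG",
}

# Hypothetical alias table: extra spellings a client might send.
# Extend it to suit the clients you actually see.
ALIASES = {"text/turtle": "application/x-turtle"}


def rdf_format(media_type):
    """Return the format name for a media type, or None if unrecognised."""
    # Drop parameters such as ";q=0.9" or ";charset=utf-8", then normalise case.
    bare = media_type.split(";")[0].strip().lower()
    bare = ALIASES.get(bare, bare)
    return RDF_FORMATS.get(bare)


print(rdf_format("Application/RDF+XML"))
print(rdf_format("text/turtle; charset=utf-8"))
print(rdf_format("image/jpeg"))
```

The three calls print RDF/XML, Turtle and None respectively, so differences in case, trailing parameters and known alternative names no longer matter to the rest of the daemon.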

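It’s relatively easy to set up this routing when you are using the Apache2 HTTP server. I have entered the following directives in the configuration file for my virtual site, but I’m pretty sure that they could be entered in the main httpd.conf or in a .htaccess file with much the same effect. One caveat: in a .htaccess file or <Directory> block mod_rewrite matches the pattern against the path relative to that directory, without a leading slash, as ^example$ assumes, whereas directly inside a server or virtual host block the path keeps its leading slash and the pattern would need to be ^/example$. For this example I’ve only put in the routing for HTML and RDF/XML requests. The scenario as outlined below sets up Apache2 to wait for a request to [site_root]/example.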


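# Enable the rewrite engine (skip this if it is already on in this context)
RewriteEngine On

# Note: "+" is a regex quantifier, so it is escaped as "\+" to match literally

# HTML handler: skipped when RDF/XML is listed ahead of the HTML types
RewriteCond %{HTTP_ACCEPT} !application/rdf\+xml.*(text/html|application/xhtml\+xml)
RewriteCond %{HTTP_ACCEPT} text/html [OR]
RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/.*
RewriteRule ^example$ example-content/rdf.html [R=303]

# RDF/XML handler
RewriteCond %{HTTP_ACCEPT} application/rdf\+xml
RewriteRule ^example$ example-content/example.rdf [R=303]

# DEFAULT handler: anything not caught above gets the RDF/XML document
RewriteRule ^example$ example-content/example.rdf [R=303]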

The first block of conditions checks whether the client is asking for HTML: the Accept header must mention text/html or application/xhtml+xml (or the user agent string must start with Mozilla/), and it must not list application/rdf+xml ahead of the HTML types. If the conditions hold then Apache redirects the client to the rdf.html resource, which is located in the example-content folder. The second block checks for RDF/XML requests and the final block catches any requests that weren’t caught by the previous logic.
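
To check my reading of the rule blocks, here is a rough Python sketch that mimics their decision logic. It is not how Apache evaluates them internally; choose_resource is a made-up name, and the sketch assumes the plus signs in the patterns are escaped as \+ so that they match literally (an unescaped + would be treated as a quantifier by the regex engine).

```python
import re

# Patterns written with "\+" so that the plus sign is matched literally.
RDF_BEFORE_HTML = re.compile(r"application/rdf\+xml.*(text/html|application/xhtml\+xml)")
HTML = re.compile(r"text/html")
XHTML = re.compile(r"application/xhtml\+xml")
MOZILLA = re.compile(r"^Mozilla/.*")
RDF_XML = re.compile(r"application/rdf\+xml")


def choose_resource(accept, user_agent=""):
    """Mimic the three rule blocks for a request to [site_root]/example."""
    # HTML handler: the first condition is ANDed with the OR-group that follows it.
    wants_html = (
        HTML.search(accept) or XHTML.search(accept) or MOZILLA.search(user_agent)
    )
    if not RDF_BEFORE_HTML.search(accept) and wants_html:
        return "example-content/rdf.html"
    # RDF/XML handler
    if RDF_XML.search(accept):
        return "example-content/example.rdf"
    # DEFAULT handler
    return "example-content/example.rdf"


chrome = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
print(choose_resource(chrome, "Mozilla/5.0"))
print(choose_resource("application/rdf+xml"))
print(choose_resource("*/*"))
```

The browser request is sent to rdf.html, while the RDF/XML request and the catch-all request both end up at example.rdf. Note that the Mozilla/ check means a client with a browser-like user agent is treated as wanting HTML even when its Accept header does not mention HTML at all, unless RDF/XML appears ahead of an HTML type.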

Note that I’ve included the [R=303] flag at the end of the RewriteRule directives. For my own setup, I’m not actually using it for now as it causes difficulties with my query string logic (in my actual setup I pass any additional text that the client has requested on to a Python script), and the 303 redirect seems to interfere with this. Having said all that, the W3C’s best-practice guidance for publishing RDF recommends the ‘303 redirect’ as the preferred method of dealing with these sorts of content requests, so no doubt I’ll have to return to this at some point.
