Handling requests for instances of CIDOC CRM’s E38_Image class

authored by Frank Lynam at 11/01/2015 17:33:05

I am currently mapping the data of the Priniatikos Pyrgos archaeological project onto the CIDOC CRM ontology (ecrm) or, to be more specific, onto the English Heritage extension (crmeh) of the ecrm. The ecrm is a complex, multi-level beast and, because examples are scarce, it can be difficult to implement, depending on your target level of compliance. This post considers how to create instances of the ecrm’s E38_Image class and how to handle a URI that effectively addresses two different data types at once, an issue that, as far as I can see, the official documentation fails to deal with.

OK, so here’s the problem in a nutshell. I have an instance of the crmeh:EHE0007_Context class that links to an instance of the ecrm:E38_Image class via the ecrm:P67i_is_referred_to_by predicate. I want to store the source URL for an image somewhere around this ecrm:E38_Image instance. If the OWL declaration for ecrm:E38_Image included a predicate for storing a source image URL, there wouldn’t be an issue. Unfortunately it does not, and it seems that the ecrm wants the URI of the ecrm:E38_Image instance to identify both an RDF resource and an image source URL, with the response depending on what the client is looking for.

The British Museum seems to implement this sort of polymorphism as follows. Let us take the Rosetta Stone RDF resource as an example. It links to instances of type ecrm:E38_Image via the ecrm:P138i_has_representation predicate, and the server returns the source image if the client requests the URI of one of these instances (which includes the .jpg file extension). However, in the HTML view of the Rosetta Stone resource, the link addresses the ecrm:E38_Image instance in the form http://collection.britishmuseum.org/resource?uri=URI_OF_RESOURCE, and in this case the server returns not the source image but the RDF resource.

I decided to do something similar, except that I put the onus on the client to explicitly request the image’s source file. I used a RewriteRule from Apache’s mod_rewrite to catch requests for URIs of a certain type, something like http://example.com/imageXXX.jpg. The important parts of the URI are the ‘image’ prefix and the ‘.jpg’ file extension. If the request URI matches this pattern, I return the source image to the client. However, if the client requests http://example.com/imageXXX, I return the RDF resource associated with the image instead. Here’s the rule:

RewriteRule ^/data/image(.*)\.jpg$ images/$1.jpg
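
To make the routing decision easier to follow, here is a minimal Python sketch of what the rule does. It is an illustration only and not part of my Apache setup: the function name route and the "image"/"rdf" labels are made up for this example, and the fallback branch simply assumes that anything not matching the pattern is handed to whatever serves the RDF resource. The sketch escapes the dot before jpg, so only a literal .jpg extension matches.

```python
import re

# Same pattern as the RewriteRule, with the dot before "jpg" escaped
# so that it matches a literal "." rather than any character.
IMAGE_RULE = re.compile(r"^/data/image(.*)\.jpg$")


def route(path):
    """Return a (kind, target) pair for a request path.

    kind is "image" when the path matches the rule, otherwise "rdf".
    Both labels are hypothetical names used only in this sketch.
    """
    match = IMAGE_RULE.match(path)
    if match:
        # Same substitution as the rule: images/$1.jpg
        return ("image", "images/" + match.group(1) + ".jpg")
    # Assumed fallback: the path is left alone and the RDF resource is served.
    return ("rdf", path)
```

Calling route("/data/image123.jpg") selects the image file images/123.jpg, while route("/data/image123") falls through to the RDF branch with the path unchanged.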