Sparql servers.


I’m aiming to get a sparql server up and running, with reasoning, rules etc. In particular: I want to be able to create parent/child relationships for our accepted taxonomy by specifying OWL 2 DL rules over our existing rdf.

The key thing is that I want to use named graphs to isolate sets of triples. The idea will be that the botanical and the zoological data will each be in its own area. The inferred accepted taxonomy will be in a separate area. A sparql query will then be able to run against the data using our accepted taxonomy – or not.
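The isolation idea can be pictured without any Jena at all. The following is a minimal Python sketch, not SDB or SPARQL itself: each statement carries a fourth element naming its graph, and a query simply chooses which graphs to read. The graph names, the `urn:example:` URIs and the helper `triples` are invented for illustration.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Quad(NamedTuple):
    graph: str
    subject: str
    predicate: str
    obj: str


# Hypothetical graph names, one per isolated "area".
BOTANY = "urn:example:graph:botanical"
ZOOLOGY = "urn:example:graph:zoological"
TAXONOMY = "urn:example:graph:accepted-taxonomy"

# Illustrative data only.
STORE = [
    Quad(BOTANY, "urn:example:taxon:Acacia", "urn:example:rank", "genus"),
    Quad(ZOOLOGY, "urn:example:taxon:Macropus", "urn:example:rank", "genus"),
    Quad(TAXONOMY, "urn:example:taxon:Acacia", "urn:example:hasParent",
         "urn:example:taxon:Fabaceae"),
]


def triples(store: Iterable[Quad], graphs: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return the triples held in the chosen graphs, ignoring all others."""
    wanted = set(graphs)
    return [(q.subject, q.predicate, q.obj) for q in store if q.graph in wanted]


# Query the raw data only, or the raw data plus the inferred taxonomy.
raw_only = triples(STORE, [BOTANY, ZOOLOGY])
with_taxonomy = triples(STORE, [BOTANY, ZOOLOGY, TAXONOMY])
```

Here `raw_only` never sees the parent/child statement, while `with_taxonomy` does, which is the "or not" choice described above.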

At the moment, I have a Jena “SDB” data store hosted on a local copy of mysql. The key reasons for using SDB are:

  • It’s backed by a relational database, which means that I can use Oracle for the backing store. This gives me all the benefits of using Oracle – by which I mean our DBA. Backups, storage management – all the tiresome things that developers don’t care about.
  • It supports quadruples. I can put data in separate logical “graphs”. I envisage graphs for APNI, AFD … even for distinct data loads. The vocabulary descriptions will also live in their own spaces. Thus, it might be possible to run a query against APNI, using the “Set XYZ” vocabulary rules.
  • The schema is simple enough that I can probably trick jena into using a materialised view.
  • No vendor lock-in. We are keen not to lock ourselves into Oracle, which has its own semweb gear.

So far, I have successfully loaded two graphs into SDB.

Graph <> contains this fact:

<> <> "My name is A"

Graph <> contains these two facts:

<> <> "My name is B"
<> <> "My name is C"

Great! Having loaded these two graphs into SDB, I want to make them available via sparql. Most particularly, I want each graph gettable-at, and I also want the union of them to be useable as a single item. The goal, ultimately, is that we will be able to serve up graphs named “All our current data” and “Just the vascular plants and bryophytes” and so on – defining the unions at this end so that our data users don’t have to figure out which subgraphs they are interested in.

Getting this happening is a matter of building a jena “assembler” descriptor. In accordance with the usual rule “when you have a hammer, everything looks like a nail”, the authors of Jena have decided that the appropriate way to do configuration for a bit of software that happens to host rdf graphs is with an rdf graph.

The resulting config file that I have at this stage looks like this:

As you can easily see, I have put the union graph in both as the default graph of the dataset, and as a separate named graph in its own right.

At this stage, these queries do what you would expect when the joseki server is started:

select ?s ?p ?o where { ?s ?p ?o }
select ?s ?p ?o where { GRAPH <> { ?s ?p ?o }}
select ?g ?s ?p ?o where { GRAPH ?g { ?s ?p ?o }}
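To make “what you would expect” concrete, here is a small Python model of the dataset described above. It only illustrates the semantics and is not Jena or Joseki code. The graph names `urn:example:graph:one`, `urn:example:graph:two` and `urn:example:graph:union`, and the subject and predicate URIs, are placeholders rather than the real ones.

```python
# Placeholder URIs standing in for the real graph, subject and predicate names.
G1 = "urn:example:graph:one"
G2 = "urn:example:graph:two"
UNION = "urn:example:graph:union"
NAME = "urn:example:name"

graph_one = {("urn:example:a", NAME, "My name is A")}
graph_two = {
    ("urn:example:b", NAME, "My name is B"),
    ("urn:example:c", NAME, "My name is C"),
}
union = graph_one | graph_two

# The union is both the default graph and a named graph in its own right.
dataset = {
    "default": union,
    "named": {G1: graph_one, G2: graph_two, UNION: union},
}


def select_default(ds):
    """Models: select ?s ?p ?o where { ?s ?p ?o }"""
    return sorted(ds["default"])


def select_graph(ds, name):
    """Models: select ?s ?p ?o where { GRAPH <name> { ?s ?p ?o }}"""
    return sorted(ds["named"].get(name, set()))


def select_all_named(ds):
    """Models: select ?g ?s ?p ?o where { GRAPH ?g { ?s ?p ?o }}"""
    return sorted(
        (g, s, p, o) for g, ts in ds["named"].items() for (s, p, o) in ts
    )


print(len(select_default(dataset)))     # 3: every fact, via the union
print(len(select_graph(dataset, G1)))   # 1: only the fact loaded into graph one
print(len(select_all_named(dataset)))   # 6: each fact appears twice
```

In this model the third query returns each fact twice, once under its own graph and once under the union graph, because the union is registered as a named graph as well as the default.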

Next: inferencing!

