A fine distinction, but an important one.


Wrestling with vocabulary. Again.

I am approaching the job bottom-up: we have a set of tables, and we want to publish them. Rather than proposing a vocabulary and attempting a procrustean solution, I am simply exposing the data using d2rq and using the vocabulary to document what is exposed.

More or less.

Of course, nothing is ever that simple.

We have a number of data items in our system, e.g. NAME.
We also have a number of tables that hold enumerations: NAME_GROUP, NAME_TYPE, NAME_STATUS and so on.

Now, every NAME has a NAME_GROUP. Name group is “botanical” or “zoological”. There is some argument about this, as these things are called “Nomenclatural Codes”. However, “CODE” tends to mean something else inside databases.

But NAME_RANK also has a name group (the collapsing of RDF identifiers means that this is many-to-many). As does NAME_STATUS.

My problem is – what class and predicate names do I use? I could use hasGroup for everything, but it seemed wrong to me that the question “what is in this group?” would pull out both names and vocabulary items – ranks and statuses.

I could have a separate predicate name for each place where group appears, but this seems horribly over-engineered. I mean, there’s more than one kind of group, so we are looking at nameNameGroup, nameTypeNameGroup, nameStatusNameGroup and so on. Horrible.

Thinking about this, I decided to take this approach. A name group is primarily a group of names. The sense in which ‘nom. cons.’ belongs to group ‘botanical’ is different to the sense in which ‘Doodia’ belongs to it. ‘nom. cons.’ isn’t in the group. It’s simply that we are declaring that it is meaningful to apply that term to names that are.

And so I am using one predicate for “this name is in this group”: nsl_name:group, and a different one for “this vocabulary item can be used for names in this group”: nsl_name:nameGroup.
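
To make the distinction concrete, here is a minimal sketch in Python, using plain tuples in place of a real triple store. The two predicate names are the ones above; the name ‘Canis’, the sample triples and the helper `subjects` are made up for illustration and are not what d2rq actually emits.

```python
# Toy triple set: (subject, predicate, object). Sample data is illustrative only.
TRIPLES = {
    ("Doodia", "nsl_name:group", "botanical"),           # a name that is in the group
    ("Canis", "nsl_name:group", "zoological"),           # hypothetical second name
    ("nom. cons.", "nsl_name:nameGroup", "botanical"),   # a term usable for botanical names
}


def subjects(predicate, obj):
    """Return every subject s for which (s, predicate, obj) is in TRIPLES."""
    return {s for (s, p, o) in TRIPLES if p == predicate and o == obj}


# "What is in this group?" returns names only.
in_botanical = subjects("nsl_name:group", "botanical")

# "Which vocabulary items apply to names in this group?" is a separate question.
applies_to_botanical = subjects("nsl_name:nameGroup", "botanical")

print(in_botanical)
print(applies_to_botanical)
```

With a single hasGroup predicate, both questions would collapse into one and ‘nom. cons.’ would turn up alongside ‘Doodia’.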

Not 100% sure about what these should be called, of course. I dislike putting ‘has’ on the front of all the predicates – it’s just noise. And maybe nameGroup above should be called something like applicableTo. But then it’s,

“applicable to what?”
“ok then: applicableToGroup”
“groups of what?”
“well all right: applicableToNameGroup”

which is arguably correct but 21 letters long. Any vocabulary term longer than ‘internationalization’ is just not acceptable, if you ask me.

Ok, so let’s just go with ‘group’. The problem now is that we have two uris that are identical except for capitalisation: nsl_name:group for the predicate and nsl_name:Group for the class.

As I understand it: in RDF, case is significant. Strictly speaking, the path part of a URI is case-sensitive too (only the scheme and host are not), but there’s plenty of http servers that ignore case in paths, particularly ones sitting on a case-insensitive file system. Either way, it actually doesn’t matter in this case, because those URIs are uri fragments.

That is, this:

http://example.org/voc/name/group
http://example.org/voc/name/Group

might be a problem in linked data. It might be the case that the http server only ever returns one of the two files and one of those two vocabulary terms becomes inaccessible. But this:

http://example.org/voc/name#group
http://example.org/voc/name#Group

isn’t a problem, because the fragment never reaches the http server: it is serving up the one document http://example.org/voc/name.rdf.
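
A quick way to check this, sketched in Python with the standard library’s `urllib.parse.urldefrag`: a client splits off the fragment before it makes the request, so both terms come out of the same document (whatever file the server maps that URI to). The example.org URIs are the ones above.

```python
from urllib.parse import urldefrag

predicate_uri = "http://example.org/voc/name#group"
class_uri = "http://example.org/voc/name#Group"

# urldefrag splits a URI into the part that is requested and the fragment.
predicate_doc, predicate_frag = urldefrag(predicate_uri)
class_doc, class_frag = urldefrag(class_uri)

# Both terms live in the same document...
print(predicate_doc == class_doc)

# ...and only the fragments, which stay on the client side, differ by case.
print(predicate_frag, class_frag)
```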

Meh – maybe I should put the ‘has’ back on the predicate names.

WHERE { ?n nsl_name:group ?g }

vs

WHERE { ?n nsl_name:hasGroup ?g }

Maybe I should use inGroup rather than hasGroup.
Maybe GROUP should be a class.
Maybe I should generate a named individual botanical, a class Group.botanical, and declare that Group.botanical is owl:equivalentClass to the restriction ( owl:onProperty group ; owl:hasValue botanical ).
Maybe I should swap that – doesn’t “botanical” make sense as a class name on its own?

But I’m not sure d2rq can generate something that complex.
Should they go in the static vocabulary files?
Who keeps them up-to-date?
What about NAME_TYPE, which has 25 values? Do I still want a static file for that?

Choices, choices. It’s all very much still in flux.
