Descending, again, into the maze of documentation that is – well – anything to do with bioinformatics. (As the joke goes: standards are great! So many to choose from!)
We are attempting to build conformant Darwin Core Archive files. The files we have do validate successfully.
But we’d like ’em to be better. So. “Darwin Core Archive format, Reference Guide to the XML Descriptor File”. Page 7. “GBIF recommends a GBIF metadata profile” (footnote 1). Broken link.
The root XML element must have a packageId, scope, and system. packageId is just a unique id – I’ll jam the timestamp in there, job done. Scope is fixed to “system”. And what is system? At a guess – it’s the namespace in which the package ids are unique, so in our case it’s meant to be “darwincore taxonomic trees from biodiversity.org.au”.
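A minimal sketch of the timestamp idea, in Python. The element name "eml", the timestamp format and the fixed date in the usage line are assumptions made for illustration, not something the guide prescribes.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# The guess from the text above about what "system" should hold.
SYSTEM = "darwincore taxonomic trees from biodiversity.org.au"


def make_root(now=None):
    """Build a root element carrying packageId, system and scope."""
    now = now or datetime.now(timezone.utc)
    # A UTC timestamp to the second: unique only if we export at most once a second.
    package_id = now.strftime("%Y%m%dT%H%M%SZ")
    # "eml" is an assumed tag name; use whatever the schema really requires.
    return ET.Element("eml", {
        "packageId": package_id,
        "system": SYSTEM,
        "scope": "system",
    })


# Arbitrary fixed date so the result is repeatable.
root = make_root(datetime(2013, 5, 1, 12, 0, 0, tzinfo=timezone.utc))
print(ET.tostring(root, encoding="unicode"))
```

If two archives could ever be built within the same second, the timestamp would need a suffix or a UUID instead.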
The profile.xsd does not have a namespace, but specifies that the elementFormDefault is “qualified”. Not sure what happens there. Will I need to explicitly define a namespace prefix for the empty namespace? Having trouble running xmllint owing to proxy nonsense. I need to set up a local XML catalog with xsd, xml, dc, dcterms and so on. So I am not 100% positive that my XML is correct against the schema, yet.
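Until xmllint is usable, one cheap check is to look at which namespace each element of the instance document actually ends up in. A rough sketch with Python's standard library follows. The sample document and its urn:example:eml namespace are made up, and nothing here validates against profile.xsd.

```python
import xml.etree.ElementTree as ET


def split_tag(tag):
    """Split ElementTree's '{uri}local' form into (uri, local).

    uri is None when the element is in no namespace.
    """
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag


# Hypothetical sample: a prefixed root with unprefixed children.
SAMPLE = """<eml:eml xmlns:eml="urn:example:eml">
  <dataset><title>Sample</title></dataset>
</eml:eml>"""


def namespace_report(xml_text):
    """Return (namespace, local name) for every element, in document order."""
    return [split_tag(el.tag) for el in ET.fromstring(xml_text).iter()]


for uri, local in namespace_report(SAMPLE):
    print(local, "->", uri or "(no namespace)")
```

Unprefixed elements with no default namespace declaration in scope are in no namespace at all. As far as I know, Namespaces in XML 1.0 does not allow binding a prefix to an empty namespace name, so leaving those elements unprefixed is probably the answer rather than inventing a prefix.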
What else …
My, the schema sure does insist on a bunch of stuff. You must have creator, metadata provider, and contact blocks – all of which are “agent” blocks, although you can get away with each having only one subelement (organisation name).
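The bare minimum described above might be generated like this. The element names (creator, metadataProvider, contact, organizationName) and the enclosing "dataset" element are my assumptions about how the schema spells them, so check them against profile.xsd first.

```python
import xml.etree.ElementTree as ET

# Assumed element names for the three required agent blocks.
AGENT_BLOCKS = ("creator", "metadataProvider", "contact")


def add_minimal_agents(dataset, organisation):
    """Append the three agent blocks, each holding only an organisation name."""
    for block in AGENT_BLOCKS:
        agent = ET.SubElement(dataset, block)
        # Assumed spelling of the single sub-element we can get away with.
        ET.SubElement(agent, "organizationName").text = organisation
    return dataset


# "dataset" is a hypothetical parent element for this sketch.
dataset = add_minimal_agents(ET.Element("dataset"), "biodiversity.org.au")
print(ET.tostring(dataset, encoding="unicode"))
```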
The intellectual rights block is just a chunk of free text – no support for including mixed data. It would be nice to have dcterms:license or even Creative Commons elements in there, but the schema does not support it.
Coverage is nice, but the taxonomic coverage element is odd: an optional generalTaxonomicCoverage, then any number of taxonomicClassification elements. Each taxonomicClassification element is taxonRankName, taxonRankValue, commonName. So it seems that there’s no way to say that your data covers regnum PLANTAE unless you put it in as a common name. Perhaps taxonRankValue was meant to be the taxon name, and the wires got crossed somewhere?
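If the guess above is right and taxonRankValue is meant to carry the taxon name, covering regnum PLANTAE would look something like the sketch below. That reading is an assumption, as are the wrapper element name taxonomicCoverage and treating commonName as optional.

```python
import xml.etree.ElementTree as ET


def taxonomic_coverage(general, classifications):
    """Build a taxonomic coverage element.

    classifications is a list of (rank, taxon_name, common_name) tuples;
    common_name may be None, in which case no commonName element is written.
    """
    coverage = ET.Element("taxonomicCoverage")  # assumed wrapper name
    if general is not None:
        ET.SubElement(coverage, "generalTaxonomicCoverage").text = general
    for rank, taxon_name, common_name in classifications:
        tc = ET.SubElement(coverage, "taxonomicClassification")
        ET.SubElement(tc, "taxonRankName").text = rank
        # Assumption: the "value" at this rank is the taxon name itself.
        ET.SubElement(tc, "taxonRankValue").text = taxon_name
        if common_name is not None:
            ET.SubElement(tc, "commonName").text = common_name
    return coverage


cov = taxonomic_coverage("All plants", [("regnum", "PLANTAE", "plants")])
print(ET.tostring(cov, encoding="unicode"))
```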
Fun times, anyway.