Whenever a name is used, it is used somewhere. Every instance of a name appearing in print is either de novo, or it’s a citation of that name appearing somewhere else. That’s logic.
And so, our “name” table should be treated as an optional one-to-one table on name instances. Those name instances that are nomenclatural events (the act of giving a specimen a name) have additional data about the name they establish, and other instances cite them. In most cases where a name is simply used, it should be treated as a citation of the protonymic instance.
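In other words, the NAME_ID on each instance that is not a protonym is – in principle – a derived field. It is (just making up some notation here) “instance is citation of instance”* -> “instance creates name”, over the real world of publications and the names in them. That is: follow zero or more citation links until you reach the protonym, then follow the link from the protonym to the name it creates.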
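To make the made-up notation concrete, here is a minimal sketch of that derivation in Python. The `Instance` class, its field names (`cites_id`, `creates_name_id`) and the sample data are all hypothetical, invented for illustration. They are not the real schema.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Instance:
    """One appearance of a name in a publication (hypothetical schema)."""
    id: int
    cites_id: Optional[int] = None          # "instance is citation of instance"
    creates_name_id: Optional[int] = None   # "instance creates name" (protonym only)


def derive_name_id(instance_id: int, instances: Dict[int, Instance]) -> Optional[int]:
    """Follow citation links until an instance that creates a name is reached."""
    seen = set()  # guards against accidental citation loops in the data
    current = instances.get(instance_id)
    while current is not None and current.id not in seen:
        if current.creates_name_id is not None:
            return current.creates_name_id
        seen.add(current.id)
        if current.cites_id is None:
            return None  # chain ends without reaching a protonym
        current = instances.get(current.cites_id)
    return None  # cited instance is missing from the data, or the chain loops


# Made-up sample data: instance 1 is the protonym for name 100.
instances = {
    1: Instance(id=1, creates_name_id=100),
    2: Instance(id=2, cites_id=1),
    3: Instance(id=3, cites_id=2),
    4: Instance(id=4),  # cites nothing and creates nothing
}
```

With this data, `derive_name_id(3, instances)` walks 3 -> 2 -> 1 and gives back name 100, while instance 4 has nothing to derive from.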
In a perfect world. We don’t have the whole of the real world in our database. Very few databases do.
Here are a few issues:
Common names. Common names sort of exist in the aether; there is no nomenclatural event that creates them. There are also many names that are real scientific names, validly created under whatever code governs them, for which we don’t necessarily have the creating instance in our data (e.g. stuff that doesn’t occur in Australia, obscure papers, other reasons).
Invalidly published names that are subsequently validly published. Someone names a specimen, but they didn’t dot the i’s and cross the t’s (often, it’s that they didn’t describe the specimen properly. In Latin, dammit, like what God talks). Someone subsequently – sometimes even the same author in the same year – does it right. Now, from one point of view the second work is citing the first. But from the point of view of scientific naming, that first name doesn’t “count” as really being a name. So things that want to cite the protonym ought to be citing the second occurrence, not the first. I think what happens here is that one is the protologue and the other is the protonym.
Chains of citations. If we modelled it in the database the way it “really” is, you would have to walk the chain of citations to get at the actual name.
So what’s the upshot of this?
The upshot is that of course we have a name table, and of course every instance holds a pointer to the name that it is an instance of. What I’m saying is that this should be viewed as a denormalised data structure that – for convenience – doesn’t exactly model what’s really going on, and that’s perfectly ok.
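As a rough illustration of what “denormalised” buys us, here is a small Python sketch in which every instance simply stores its name pointer, and the citation links are used only as a sanity check where they happen to exist. The class, the field names and the data are hypothetical, and the check assumes the simplified picture above, where an instance and the instance it cites carry the same name.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Instance:
    """Hypothetical denormalised row: the name pointer is stored, not derived."""
    id: int
    name_id: int                     # stored directly on every instance
    cites_id: Optional[int] = None   # citation link, where we know it


def inconsistent_citations(instances: Dict[int, Instance]) -> List[int]:
    """Ids of instances whose stored name differs from that of the instance they cite."""
    bad = []
    for inst in instances.values():
        if inst.cites_id is None:
            continue  # nothing to compare against (e.g. a common name)
        cited = instances.get(inst.cites_id)
        if cited is None:
            continue  # the cited instance is not in our data; the stored pointer still works
        if cited.name_id != inst.name_id:
            bad.append(inst.id)
    return sorted(bad)


# Made-up sample data.
instances = {
    1: Instance(id=1, name_id=100),               # protonym of name 100
    2: Instance(id=2, name_id=100, cites_id=1),   # ordinary citation
    3: Instance(id=3, name_id=200),               # common name, no creating instance
    4: Instance(id=4, name_id=100, cites_id=99),  # cites an instance we don't hold
    5: Instance(id=5, name_id=300, cites_id=2),   # stored pointer disagrees with the citation
}
```

Looking up the name is a single field read on any instance, including 3 and 4, which have no walkable chain. Only instance 5 would be flagged by the check.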