One of the decisions that has to be made when creating an ontology is which concepts you encode as classes and which you encode as properties of those classes. One of the difficulties is that there is no overarching ‘right answer’ (although there are wrong ones) to how you should model your domain; it has to be decided on a case-by-case basis according to what works best for the world view that you are trying to encapsulate within your model. This post is a request for feedback to help us decide which model works best for both the project and the wider community.
In the previous post we considered three patterns that we could use to describe relationships. Further discussion has led us to discard the third, event-driven, option, both in a drive towards simplicity and, more importantly, because it has the furthest conceptual distance from the information we want to represent. The source material is diverse in both type and style, but if we consider what is normally captured in prosopographical data, and why, we would expect something like:
Επιγόνη daughter of Επίγονος (from Thasos) http://www.lgpn.ox.ac.uk/id/V1-37074
There are a number of events that we can hypothesise from this type of statement, in this case that Επίγονος fathered a girl, Επιγόνη. This fits in with the logical rules that it is possible to create in structured data: when person A fathered/gave birth to a girl B then B is the daughter of A. While epigraphs, like that in the example, are unlikely to go into further detail, other sources may have specific descriptions of some events, moving them from the realm of the assumed (we assume that Επίγονος fathered Επιγόνη and did not, for example, adopt her, and that the statement is not the result of his being cuckolded) to the evidenced (the trustworthiness of that evidence is an issue for a different day/post). For those people familiar with CIDOC CRM, this is basically the model that they employ, and it is a good one, allowing a rich and detailed encoding of the biographical history of the person (or object). However, much of this information is well beyond the scope of what SNAP sets out to model. If it wasn’t then we could just use CIDOC CRM, a well known and common standard, and all go home early for tea. One of the guiding principles behind SNAP is that we are only encoding the minimum information necessary to name/identify an individual entity. We need to know that Επιγόνη is the daughter of Επίγονος only in so much as that is part of her significant identity. So while we would encourage projects to encode this level of information in their own data, events are beyond the scope of SNAP, which leaves us with two other possibilities.
Defining every possible relationship via properties is arguably the simplest way that we could encode the information we need:
[Επιγόνη] -- daughter-of --> [Επίγονος]
There are two potential downsides to this. Firstly, the number of properties expands pretty fast. Not only do we have the basic property tree with
- parent-of
- father-of
- mother-of
- sibling-of
- brother-of
- sister-of
- child-of
- daughter-of
- son-of
but each of those needs to have versions for ‘acknowledged’, ‘claimed’, ‘foster’, ‘adopted’, ‘step’. And then there is the extended family: even if we only go as far as the grandparent/grandchild relationship along with the basic aunt-of/uncle-of (interestingly, there is no collective gender-neutral word for this relationship), cousin-of (and no gender-specific term for this one) and nephew-of/niece-of, we still have to add in maternal and paternal versions (although we can probably be forgiven for dropping the ‘acknowledged’, ‘claimed’ etc). Added to these we need the important non-“blood” relationships: formalised intimate relationships (i.e. recognised marriage), non-formalised intimate relationships (e.g. mistresses), slave-of, master-of, freedman-of, patron-of, client-of…
All in all that gets to approximately 90 relationships, plus a few more if we start including things like disciple-of and teacher-of.
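To get a feel for how quickly the list grows, here is a rough sketch that multiplies out only the nine basic properties against the five qualifiers. The compound names it generates (e.g. `adopted-daughter-of`) are a naming assumption made purely for illustration, not agreed SNAP property names.

```python
from itertools import product

# The nine basic properties from the tree above.
BASIC = [
    "parent-of", "father-of", "mother-of",
    "sibling-of", "brother-of", "sister-of",
    "child-of", "daughter-of", "son-of",
]

# The five qualifiers named in the text.
QUALIFIERS = ["acknowledged", "claimed", "foster", "adopted", "step"]

# Hypothetical naming scheme: "<qualifier>-<basic property>".
qualified = [f"{q}-{p}" for q, p in product(QUALIFIERS, BASIC)]

total = len(BASIC) + len(qualified)
print(f"{len(BASIC)} basic + {len(qualified)} qualified = {total}")
# prints "9 basic + 45 qualified = 54"
```

That is already 54 properties for the immediate family alone, before any of the extended family or non-“blood” relationships are added.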
This is not necessarily a problem in itself, although it does get a bit messy. It is at least nicely organised into a hierarchy and there are plenty of opportunities for adding disjoint and inverse property restrictions. However, what we gain in the simplicity of the direct link we lose by sacrificing the possibility of attaching additional information to the connection, such as provenance, reference or certainty. If we model the relationship as a concept (i.e. a Class) rather than as a property connecting two entities then we immediately open up more possibilities.
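The difference can be sketched with plain Python tuples standing in for triples. This is only an illustration: the `type`, `source` and `certainty` property names and the `_:rel1` node label are assumptions, not part of any SNAP vocabulary.

```python
# Direct property: one triple, with nowhere to hang extra information.
direct = [
    ("Επιγόνη", "daughter-of", "Επίγονος"),
]

# Relationship as a node: the relationship itself can be described.
rel = "_:rel1"  # hypothetical label for the intermediate node
via_node = [
    ("Επιγόνη", "has-relationship", rel),
    (rel, "type", "Daughter"),
    (rel, "relationship-with", "Επίγονος"),
    # Hypothetical properties for provenance and certainty:
    (rel, "source", "http://www.lgpn.ox.ac.uk/id/V1-37074"),
    (rel, "certainty", "assumed"),
]

def describe(triples, node):
    """Collect every property/value pair stated about `node`."""
    return {p: o for s, p, o in triples if s == node}

about_relationship = describe(via_node, rel)
```

Here `about_relationship` holds the source and certainty alongside the type and the related person, whereas `describe(direct, rel)` is empty because the direct form has no node to describe.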
There are three obvious ways to do this:
1.
[Entity1] --<generic-linking-property>--> [Relationship Class] --<relationship-specification>--> [Entity2]
e.g. [Επιγόνη] --has-relationship--> [AcknowledgedRelationship] --daughter-of--> [Επίγονος]
2.
[Entity1] --<generic-linking-property>--> [Relationship] --<generic-linking-property>--> [Entity2] --<generic-type-linking-property>--> [RelationshipSpecification]
e.g. [Επιγόνη] --has-relationship--> [AcknowledgedRelationship]
    --relationship-with--> [Επίγονος]
    --relationship-type--> [Daughter]
3.
[Entity1] --<generic-linking-property>--> [Relationship Classes] --<generic-linking-property>--> [Entity2]
e.g. [Επιγόνη] --has-relationship--> [AcknowledgedRelationship, Daughter] --relationship-with--> [Επίγονος]
The first two could just as easily be modelled the other way around, depending on where we prefer to put the emphasis:
[Επιγόνη] --has-relationship--> [Daughter] --acknowledged-with--> [Επίγονος]
[Επιγόνη] --has-relationship--> [Daughter] --relationship-with--> [Επίγονος] --relationship-type--> [AcknowledgedRelationship]
This is important because any additional information such as provenance, reference or certainty would be attached to the intermediary class. It comes down to whether we see the hierarchy as being:
- FamilyRelationship
  - AcknowledgedRelationship
  - FosteredRelationship
  - AdoptedRelationship
  - ClaimedRelationship
  - StepRelationship
- RelationshipType
  - Parent
    - Father
    - Mother
  - Sibling
    - Brother
    - Sister
  - Child
    - Daughter
    - Son
or
- FamilyRelationship
  - Parent
    - Father
    - Mother
  - Sibling
    - Brother
    - Sister
  - Child
    - Daughter
    - Son
- RelationshipType
  - AcknowledgedRelationship
  - FosteredRelationship
  - AdoptedRelationship
  - ClaimedRelationship
  - StepRelationship
We can cut out some of this discussion by dropping the additional property and dual-classing the instance as shown in the third example. Expanding on that, our class hierarchy would look like:
- SocialContract
  - ExtendedHousehold
    - Household
      - FamilyRelationship
        - HereditaryFamily (If anyone can think of a better term I am open to suggestions)
          - Parent
            - Father
            - Mother
          - Sibling
            - Brother
            - Sister
          - Child
            - Daughter
            - Son
        - ExtendedFamily
          - Aunt
          - Uncle
          - Nephew
          - Niece
          - Cousin
          - Ancestor
            - Grandparent
              - Grandfather
              - Grandmother
            - GreatGrandparent
              - GreatGrandfather
              - GreatGrandmother
          - Descendent
            - Grandchild
              - Grandson
              - Granddaughter
        - [SeriousIntimateRelationship]
        - [LegallyRecognisedRelationship]
      - [HouseSlave]
  - Slave
    - HouseSlave
    - FreedSlave
      - Freedman
      - Freedwoman
  - IntimateRelationship
    - SeriousIntimateRelationship
      - LegallyRecognisedRelationship
    - CasualIntimateRelationship
- RelationshipQualifier (all disjoint with everything except HereditaryFamily classes)
  - Acknowledged
  - Adopted
  - Fostered
  - Claimed
  - Step
  - Half (disjoint with everything except Sibling classes)
- RelationshipAxis (all disjoint with everything except ExtendedFamily classes)
  - Maternal
  - Paternal
  - Inlaw (disjoint with everything except HereditaryFamily and ExtendedFamily classes)
Disjoints would be defined for the gender-specific classes (Son/Daughter, Mother/Father, Aunt/Uncle etc) and for those that are impossible without the use of time travel (Child/Parent, Ancestor/Descendent etc), but given the period we are dealing with (Romans and Egyptians – I’m looking at you) it would be unwise to add any additional disjoints that we might otherwise consider between related people.
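As a minimal sketch of what those disjoints buy us, the check below flags a relationship instance that has been given two classes declared disjoint. It only covers the pairs named above and deliberately ignores subclass inference (for example, that Daughter is a kind of Child and is therefore also disjoint with Parent); a real reasoner would handle that. The function name is hypothetical.

```python
from itertools import combinations

# Disjoint pairs taken from the examples in the text.
DISJOINT = {
    frozenset({"Son", "Daughter"}),
    frozenset({"Mother", "Father"}),
    frozenset({"Aunt", "Uncle"}),
    frozenset({"Child", "Parent"}),
    frozenset({"Ancestor", "Descendent"}),
}

def violations(classes):
    """Return the declared-disjoint pairs found among an instance's classes."""
    found = []
    for a, b in combinations(sorted(classes), 2):
        if frozenset({a, b}) in DISJOINT:
            found.append((a, b))
    return found

# Dual-classing a relationship as in the third pattern is fine...
ok = violations({"AcknowledgedRelationship", "Daughter"})
# ...but the same relationship cannot be both Son and Daughter.
bad = violations({"Son", "Daughter", "Acknowledged"})
```

With these inputs `ok` is an empty list and `bad` contains the single pair `("Daughter", "Son")`.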
Of the options that use classes instead of, or in addition to, properties, this is the simplest. However, it tends to be bad design when you end up making everything a Class, which is what we have ended up doing here. Equally, we can go too far in the opposite direction in search of “simplicity” and the desire to have as few classes as possible. The intermediary options offer a combination of properties and classes but also raise some questions as to where we want the emphasis of the encoding to lie. These are questions that we feel it would be better to open up to discussion by the wider community rather than just making an executive decision.
To review:
Option 1: All properties
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;daughter-of <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436> .
Option 2a: Combination of Classes and Properties (classes define the relationship, properties the specific relationship)
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;has-relationship [ a &snap;AcknowledgedRelationship; &snap;daughter-of <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436>] .
Option 2b: Combination of Classes and Properties (classes define the specific relationship, properties the relationship type)
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;has-relationship [ a &snap;Daughter; &snap;acknowledged-with <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436>] .
Option 3a: Combination of Classes and Properties (emphasis on classes but with properties explicitly linking rather than dual classing, main class is the relationship)
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;has-relationship [ a &snap;AcknowledgedRelationship; &snap;relationship-with <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436>; &snap;relationship-type &snap;Daughter] .
Option 3b: Combination of Classes and Properties (emphasis on classes but with properties explicitly linking rather than dual classing, main class is the specific relationship)
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;has-relationship [ a &snap;Daughter; &snap;relationship-with <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436>; &snap;relationship-type &snap;Acknowledged] .
Option 4: All classes
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;has-relationship [ a &snap;AcknowledgedRelationship; a &snap;Daughter; &snap;relationship-with <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436>] .
I hope this post has clearly laid out the options as we see them and I’d like to invite your opinions and suggestions as to which way we go.
In the CRM-SIG we have discussed the problem of modelling social (e.g. marriage) and other relations (e.g. parenthood). One suggestion (2006) was to introduce a general typed relationship (shortcut) and a long path via an event describing the establishment of the relation (adoption, marriage etc).
We ended up with a different solution. A married couple was modeled as an instance of the class E74 Group with the type ‘married couple’. We (the SIG) also suggest that adoption and the like can be modeled by instances of E74 Group. In principle any relation between actors/persons which can in some sense be seen as a whole can be modeled in this way. I personally don’t find this way of modeling human relations completely ideal, since it is very close to using groups as a set-theoretical way of describing relations. But ok.
The use of typed instances of E74 Group is basically identical to the use of classes Faith suggests in this posting. The main difference is that in CIDOC-CRM we encourage the use of events whenever possible. In the case of becoming a member of a group or leaving it, the path from a person to the group (relation qua class instance) is via an event. It is of course possible to suggest the introduction of a shortcut directly from the person to the group (relation qua class instance). With this adjustment, CIDOC-CRM models relations in a way very close to what Faith suggests. The hierarchy of the relations (relation qua class instance) can be modeled by a corresponding type hierarchy. So one may use CIDOC-CRM with a few adjustments for the relations in the SNAP DRGN model.
Below I have quoted from the CRM definition, page 73 of version 5.1.2
(http://cidoc-crm.org/docs/cidoc_crm_version_5.1.2.doc)
P144 joined with (gained member by)
Domain: E85 Joining
Range: E74 Group
Subproperty of: E5 Event. P11 had participant (participated in): E39 Actor
Quantification: many to many, necessary (1,n:0,n)
Scope note: This property identifies the instance of E74 Group of which an instance of E39 Actor becomes a member through an instance of E85 Joining.
Although a Joining activity normally concerns only one instance of E74 Group, it is possible to imagine circumstances under which becoming member of one Group implies becoming member of another Group as well.
Joining events allow for describing people becoming members of a group with a more detailed path from E74 Group through P144 joined with (gained member by), E85 Joining, P143 joined (was joined by) to E39 Actor, compared to the shortcut offered by P107 has current or former member (is current or former member of).
The property P144.1 kind of member can be used to specify the type of membership or the role the member has in the group.
Examples:
The election of Sir Isaac Newton as Member of Parliament to the Convention Parliament of 1689 (E85) joined with the Convention Parliament (E40)
The inauguration of Mikhail Sergeyevich Gorbachev as Leader of the Union of Soviet Socialist Republics (USSR) in 1985 (E85) joined with the office of Leader of the Union of Soviet Socialist Republics (USSR) (E40) with P144.1 kind of member President (E55)
The implementation of the membership treaty January 1. 1973 between EU and Denmark (E85) joined with EU (E40)
Properties: P144.1 kind of member: E55 Type
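A minimal sketch of the long path and the shortcut described in the quoted scope note, using plain Python dictionaries and tuples. The event and group identifiers, the dictionary keys and the idea of a family group for Επίγονος are assumptions made for illustration; only the property numbers and labels come from the quoted definition.

```python
# Long path: each Joining (E85) links an actor (P143 joined) to a group
# (P144 joined with), optionally typed by P144.1 kind of member.
joinings = [
    {
        "event": "joining-1",                      # hypothetical identifier
        "P143_joined": "Επιγόνη",                  # E39 Actor
        "P144_joined_with": "family-of-Επίγονος",  # hypothetical E74 Group
        "P144.1_kind_of_member": "Daughter",       # E55 Type
    },
]

def shortcut_memberships(events):
    """Collapse Joining events into (group, P107, actor) shortcut triples."""
    return [
        (e["P144_joined_with"],
         "P107_has_current_or_former_member",
         e["P143_joined"])
        for e in events
    ]

memberships = shortcut_memberships(joinings)
```

In this sketch the shortcut keeps only the group and the actor; the Joining event itself, and with it the kind of member recorded on P144.1, stays on the long path.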