Upcoming workshop on SNAP ontology and data

Next week we shall be hosting a small workshop (already fully booked) at King’s College London to introduce a selection of potential project partners, data providers, advisors and other stakeholders to the SNAP:DRGN project.

The principal aims of the workshop include:

  1. Introduce the goals of the SNAP:DRGN project, the core datasets (LGPN, TM, PIR) and their current formats and contents, and the data models and ontologies that we propose to use for the preliminary data ingest.
  2. Learn about other prosopographical and linked data approaches in use in history and the digital humanities.
  3. Learn about other classical person datasets that might be suitable for exposing in SNAP-recommended RDF format and adding to our triplestore.

On the second day, we shall move away from formal presentations and whole-group discussion, and instead aim to achieve some concrete, manageable tasks in smaller groups of concentrated work. Many of the break-out groups and hack-fest events will be decided during the workshop itself, but among the tasks and discussions we envisage taking place are:

  1. Working to get your digital data into SNAP RDF
  2. Discussing ontologies/approaches for fuller prosopographical/biographical information
  3. Strategies for programmatically finding names and person-references in texts (Named Entity Recognition)
  4. Strategies for finding matches/co-references between persons in the SNAP graph contributed by different datasets
  5. Discussing APIs and services that users of SNAP might find desirable

What other information about SNAP:DRGN would it be useful to discuss, impart, or work on at this event (or after)?

8 thoughts on “Upcoming workshop on SNAP ontology and data”

  1. Looking forward to these discussions. I’m especially keen to start thinking about how simple annotation of different kinds of named entities (people, places, categories, canonical references, etc) might be used together to offer more than the sum of their parts.

  2. I am especially looking forward to discussing the fine-tuning of annotating all those different kinds of entities: names, name variants, declined forms, identification clusters, titles etc., and how to forge them into something new.

  3. I’m looking forward to the discussion on the data models that we need to link the entities together – what relationships we need to be able to describe and how far we can use existing ontologies to do that.

  4. It is going to be very useful at the workshop to start showing other people the sometimes complicated ideas we’ve been batting back and forth in the initial stages of SNAP. Can we explain the vision properly, or will we get blank looks when we start displaying RDF? The Tuesday sessions especially will provide the environment in which we can see whether other data sets than the initial three can be expressed in the SNAP way.

  5. The Epigraphic Database Heidelberg (EDH) will publish all prosopographical data as RDF, so I’m looking forward to discussing details of the SNAP ontology. As a data provider I’m especially interested in methods of aligning URIs of EDH prosopographical data to SNAP URIs and building the necessary annotations for creating links on our website to other data providers. Another interesting aspect will be to find out ways in which EDH could support quality assurance of the NER process.

  6. Among many other things I am interested to learn more about how we approach the problem of data provenance in the future – Hugh has pointed to this in his blog post already.
    Some assertions we’d like to make will be easy, others immensely hard, probably most of them somewhere in between. What’s the praenomen of Tacitus again?
    The tension between simplicity of use and complexity to suit all scholarly needs is quite a challenge…

  7. Our canonical data format is EAC-CPF, but I plan to implement whatever RDF model SNAP needs. The EAC-CPF creation, editing, and publishing software I’m developing, xEAC (https://github.com/ewg118/xEAC), will export into at least three different flavors of RDF. I’m really interested in discussing the SNAP ontology and model because so far, no clear RDF standard has emerged in the archival community. There’s an EAC-to-RDF ontology (http://archivi.ibc.regione.emilia-romagna.it/ontology/reference_document/referencedocument.html), but I don’t know if anyone actually uses it. Whatever ideas emerge from SNAP may also have an impact on the wider LAM community. I spent almost five years working for the University of Virginia library, so I still have one foot firmly planted in the archive/library community. Whatever I learn at the SNAP meeting next week, I’ll report back to the EAD/EAC-CPF list.

  8. I’ll bring along a quick run-down of the Pelagios annotation and gazetteer-alignment principles for my lightning talk. I think there are some parallels, in particular with the gazetteer alignment, although Pelagios is potentially trivial compared to SNAP, since we don’t care about what kind of “equality” exists between two place records in different gazetteers – only that there *is* a similarity. So it will be *very* interesting for me to contrast this with the SNAP ideas!
