As we come to the end of the first year of SNAP:DRGN funding and start planning applications for follow-up funding, it is worth rehearsing the main academic and other benefits of the SNAP:DRGN project and the prosopographical-onomastic graph that we hope it feeds into.
A virtual authority list of ancient persons for scholars and cultural heritage institutions to point to.
The key role of the SNAP:DRGN project from the start was to provide a set of unique prosopographical identifiers for use in disambiguating names and other person (and person-like) references in digital texts and other datasets. Having a single URI to specify Alexander the Great as opposed to Alexander of Aphrodisias, Alexander of Abounoteichos, or Alexander of Troy will be very powerful and useful. Scholarly and heritage databases will be able both to use SNAP identifiers in their own code (an @ref attribute in a TEI edition, for example) and to produce SNAP/Pelagios-style open annotations associating names in their texts with person identifiers.
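As a rough illustration of the two uses just mentioned, the sketch below builds a TEI persName element with an @ref attribute and a minimal Open Annotation-style JSON-LD record linking a name in a text to a person identifier. The URIs, function names and the exact shape of the annotation are assumptions made for the example, not SNAP's published format.

```python
import json
from xml.sax.saxutils import escape, quoteattr

# Hypothetical URIs, for illustration only; these are not real SNAP identifiers.
PERSON_URI = "http://example.org/snap/person/12345"
TARGET_URI = "http://example.org/texts/inscription-1#name-1"


def tei_pers_name(name, person_uri):
    """Return a TEI persName element whose @ref points at the person URI."""
    return f"<persName ref={quoteattr(person_uri)}>{escape(name)}</persName>"


def open_annotation(target_uri, person_uri):
    """Return a minimal annotation: the name in the text (target) is
    associated with the person identifier (body)."""
    return {
        "@context": {"oa": "http://www.w3.org/ns/oa#"},
        "@type": "oa:Annotation",
        "oa:hasTarget": {"@id": target_uri},
        "oa:hasBody": {"@id": person_uri},
    }


if __name__ == "__main__":
    print(tei_pers_name("Alexander", PERSON_URI))
    print(json.dumps(open_annotation(TARGET_URI, PERSON_URI), indent=2))
```

The point of the design is that both forms carry the same person URI, so an edition and a separate annotation file can refer to the same individual without sharing any other infrastructure.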
Ground truth/gold standard dataset for future data mining/NER work.
The long lists of names, people, titles and other references coming from the many corpora that make up the SNAP:DRGN graph will be a valuable dataset in their own right. Lists of names and persons can be seen as a Gold Standard for seeding named-entity recognition tools, especially those involving machine learning. Name lists will also contribute to the back-end of spellcheckers, morphological parsers such as Morpheus/Alpheios, and the toolkits for correcting OCR of ancient texts.
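To make the idea of seeding concrete, here is a deliberately naive sketch of how a name list might be used as a gazetteer to pre-tag candidate names in a text. The function name and the sample data are invented for the example; a real named-entity recognition pipeline would also need to handle inflection, multi-word names and ambiguity.

```python
import re


def tag_names(text, name_list):
    """Return (offset, token) pairs for tokens that appear in the name list.

    Matching is case-insensitive and works on whole word tokens only.
    """
    names = {n.casefold() for n in name_list}
    hits = []
    for match in re.finditer(r"\w+", text):
        if match.group().casefold() in names:
            hits.append((match.start(), match.group()))
    return hits


if __name__ == "__main__":
    sample = "Alexander met Philip at Pella."
    print(tag_names(sample, ["Alexander", "Philip"]))
```

Matches found this way could serve as candidate training examples for a machine-learning tagger, to be checked against the prosopographical data rather than trusted outright.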
Annotation graph, visualization and API for research purposes.
When we have not only a graph of persons, co-references and relationships from our contributing prosopographical datasets, but also a large collection of open annotations, as described above, we expect that the network of people, names, places, references and citations will be a research tool in its own right. It is our intention to test this assumption by building an API and search interface, perhaps in association with visualization and social network analysis tools, and putting them in front of historians to see what they make of them. The vigorous criticism that inevitably results could then be followed by cycles of rapid implementation and further testing.
Public engagement, scholarly contribution and citizen science.
Once we have a large amount of data in the SNAP:DRGN graph, the task of identifying co-references between persons from different databases, recording relationships between persons, and disambiguating other kinds of information will become essential. Some of these tasks may be partly automated; others may only be possible by hand. Scholars, students, and even citizen contributors might be recruited to help with the task of identifying or confirming such annotations, or adding structured commentary of other kinds. The most important outcome of such a “citizen science” exercise, in my opinion, is the engagement of scholars and the public from outside the project in this kind of material. Students could benefit hugely from engaging directly with prosopographies, epigraphy and other primary sources, making decisions and receiving credit for their work. As I argued recently, “If you’re doing your job properly, there’s no distinction between citizen science and pedagogy.”
Contributions to the tools and ontologies ecosystem.
Finally, SNAP:DRGN is participating in several communities of tooling (Pelagios, Recogito, Perseids, Berkeley Prosopography Services), standards (Pelagios, TEI), and ontologies (LAWD, CIDOC-CRM, Open Annotation Collaboration, Ontology for Historical Prosopography). It is our hope that we will not only benefit from building on these existing community projects and resources, but also contribute experience, code and documentation to many of these projects in turn. If collaborative work of this kind does not result in shared code and outcomes, something has gone horribly wrong. SNAP participants have already been invited onto several special interest groups and advisory boards.
Comments on, objections about, and additions to this account are very welcome.