Digital humanities; Knowledge graph creation; Knowledge graph management; CIDOC CRM; Curation; Ireland; Knowledge graphs; Named graphs; Ontologies; Conservation; Information Systems; Computer Science Applications; Computer Graphics and Computer-Aided Design
Abstract :
[en] The Beyond 2022 project aims to create a virtual archive by digitally reconstructing and digitizing historical records lost in a catastrophic fire which consumed items in the Public Record Office of Ireland in 1922. The project is developing a knowledge graph (KG) to facilitate information retrieval and discovery over the reconstructed items. The project decided to adopt Semantic Web technologies to support its distributed KG and reasoning. In this article, we present our approach to KG generation and management. We elaborate on how we help historians contribute to the KG (via a suite of spreadsheets) and its ontology. We furthermore demonstrate how we use named graphs to store different versions of factoids and their provenance information and how these are serviced in two different endpoints. Modeling data in this manner allows us to acknowledge that history is, to some extent, subjective and different perspectives can exist in parallel. The construction of the KG is driven by competency questions elicited from subject matter experts within the consortium. We avail of CIDOC-CRM as our KG's foundation, though we needed to extend this ontology with various qualifiers (types) and relations to support the competency questions. We illustrate how one can explore the KG to gain insights and answer questions. We conclude that CIDOC-CRM provides an adequate, albeit complex, foundation for the KG and that named graphs and Linked Data principles are a suitable mechanism to manage sets of factoids and their provenance.
Disciplines :
Computer science
Arts & humanities: Multidisciplinary, general & others
Author, co-author :
Debruyne, Christophe ; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science ; Adapt Centre, Trinity College Dublin, College Green, Dublin, Ireland
Munnelly, Gary; Adapt Centre, Trinity College Dublin, College Green, Dublin, Ireland
Kilgallon, Lynn; Department of History, Trinity College Dublin, College Green, Dublin, Ireland
O'Sullivan, Declan; Adapt Centre, Trinity College Dublin, College Green, Dublin, Ireland
Crooks, Peter; Department of History, Trinity College Dublin, College Green, Dublin, Ireland
Language :
English
Title :
Creating a Knowledge Graph for Ireland's Lost History: Knowledge Engineering and Curation in the Beyond 2022 Project
Funders :
Government of Ireland
ADAPT - ADAPT Centre for Digital Content Technology
Funding text :
Beyond 2022 is funded by the Government of Ireland, through the Department of Tourism, Culture, Arts, Gaeltacht, Sport and Media, under the Project Ireland 2040 framework. The project is also partially supported by the ADAPT Centre for Digital Content Technology under the SFI Research Centres Programme (Grant 13/RC/2106_P2). Authors’ addresses: C. Debruyne, G. Munnelly, and D. O’Sullivan, ADAPT Centre, Trinity College Dublin, College Green, Dublin 2, Ireland; emails: {debruync, munnellg, declan.osullivan}@tcd.ie; L. Kilgallon and P. Crooks, Department of History, Trinity College Dublin, College Green, Dublin 2, Ireland; emails: {kilgalll, pcrooks}@tcd.ie. Author’s current address: C. Debruyne, Montefiore Institute, University of Liège, Quartier Polytech 1, Allée de la découverte 10, 4000 Liège, Belgium.