Linked Data: From interoperable to interoperating

Posted on July 27, 2015

Today’s batch of videos comes from CAA, the international conference, in Siena, Italy. I was privileged enough to be there and lucky that some session organisers and presenters agreed to let me film them. The first session up is about Linked Data:

Linked Data and Semantic Web based approaches to data management have now become commonplace in the field of heritage. So commonplace, in fact, that despite frequent mention in digital literature, and a growing familiarity with concepts such as URIs and RDF across the domain, the topic is starting to fall off in Computer Science conferences and journals as many of the purely technical issues are seen to be ‘solved’. So is the revolution over? We propose that until the benefits of Linked Data are seen in real interconnections between independent systems it will not properly have begun. This session will discuss the socio-technical challenges that must be met to build a concrete Semantic Web in the heritage sector. We particularly invite papers that offer practical approaches and experience relating to:

Interface development and user support for ingestion, annotation and consumption
Management, publication and sustainability of Linked Data resources
Building cross and inter-domain Linked Data communities
Processes for establishing usage conventions of specific terms, vocabularies and ontologies
Alignment processes for overlapping vocabularies
Engaging non-technical users in adopting semantic technologies
Licensing and acknowledgment in distributed systems (especially those across multiple legal jurisdictions)
Incorporation within other software paradigms: TEI, GIS, plain text, imaging software, VR, etc.
Access implications of integrating open and private content
Mapping the Field – what components are now properly in place? What remains to be done?

Papers should try to provide evidence of proposed approaches in use across multiple systems wherever possible. Purely theoretical papers and those dealing solely with a single data system are explicitly out of scope for this session.

The Syrian Heritage Project in the IT infrastructure of the German Archaeological Institute

Authors: Philipp Gerth, Sebastian Cuy

Abstract: The ongoing armed conflict in Syria has not only led to a humanitarian disaster but also threatened the rich cultural heritage of Syria. As active involvement is not possible at the moment, the German Archaeological Institute (DAI), in cooperation with the Museum for Islamic Art Berlin and the German Foreign Office, started the Syrian Heritage Archive Project in November 2013. Its aim is to create an extensive, systematic documentation system from the various available sources that can be used to support preservation and reconstruction and also help to prevent illegal trade. One major challenge posed in that project was integrating the various types of research data generated by archaeological, art historical and architectural history projects in order to create a comprehensive national registry of archaeological sites. In the past years the DAI was able to implement applications and web services for various domains and types of data relevant for archaeological studies. Using these existing systems in the Syrian Heritage Project not only ensured sustainability: the open application interfaces of the applications also allowed us to aggregate heterogeneous research data such as excavation data, texts, archives, cadastral plans, historical maps, etc., and to present them in a unified user interface. The DAI’s resources for managing geodata were of particular value for this enterprise. On the one hand, the geoserver “iDAI.geo” makes various sets of spatial information accessible, while on the other hand the “iDAI.gazetteer” is used as a hub to connect the different resources within the DAI infrastructure and also acts as a gateway to the Linked Data cloud by establishing links to existing resources like Pleiades, Pelagios and GeoNames. Thus users can not only view all data collected in the project but also use the resulting application as a starting point for further research into other systems.

Using CIDOC CRM for dynamically querying ArSol, a relational database, from the semantic web.

Authors: Olivier Marlet, Stéphane Curet, Xavier Rodier, Béatrice Bouchou-Markhoff

Abstract: The MASA Consortium (Mémoire des archéologues et des sites archéologiques http://masa.hypotheses.org/27), from the Très Grande Infrastructure de Recherche Huma-Num (http://www.huma-num.fr/), aims to provide the French community of archaeologists with several tools to improve their data interoperability. In this context, we propose to open the ArSol database to the semantic web, using the CIDOC-CRM ontology and a tool that implements Ontology-Based Data Access (OBDA) principles. The ArSol (Archives du Sol: Soil Archives) system has been used by the “Archéologie et Territoires” Laboratory (CNRS – Tours University) since 1990 for processing archaeological data. It can be used for all stratigraphic excavations and has the dual purpose of data management and research. It was constructed, with proprietary software, as an open system that is flexible and above all not conditioned by the integration of predefined thesauri. The ArSol client-server system is designed to integrate data from different sites. ArSol is designed both as a recording and data management research tool for use during excavation, and as an exploratory data analysis system for post-excavation work. Firstly, we designed a set of mappings from a selection of ArSol fields to the CIDOC CRM ontology. This manual alignment was reported at CAA 2014. It allows us to transpose the ArSol data into an RDF format fully compatible with the CIDOC-CRM. We present in this article a new step, which is to implement the software architecture to query ArSol from a SPARQL endpoint. We chose to use Ontop, software developed at the University of Bozen-Bolzano that allows a relational database to be queried via an ontology, using SPARQL. In this way we do not need to move our data from our efficient database in order to benefit from the semantic web capabilities (semantic interoperability, RDF/OWL 2 QL inference, etc.).
We thus avoid the extract-transform-load (ETL) process of exporting our data to an RDF store and updating it whenever data change in ArSol. Via the SPARQL endpoint, users or applications can query ArSol using the CIDOC-CRM part that we selected to represent our ArSol data. We used the Ontop Protégé plugin to design the OBDA mappings that are necessary for the SPARQL-to-SQL rewritings. Our final goal is to devise an application that will offer a single interface to query several distributed and independent archaeological databases, with heterogeneous structures, using CIDOC-CRM to relate them to each other. Querying ArSol in SPARQL via the CIDOC CRM is an important step towards this goal.
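The idea of answering a semantic query straight from the relational store, with no export step, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not Ontop and not the ArSol schema. The table `stratigraphic_unit`, its columns, the `MAPPINGS` dictionary and the example.org URI pattern are all hypothetical, objects are returned as plain values for brevity, and only a single triple pattern is rewritten, whereas a real OBDA system handles full SPARQL.

```python
import sqlite3

CRM = "http://www.cidoc-crm.org/cidoc-crm/"

# Hypothetical mappings: property URI -> (table, id column, value column).
# Table and column names are invented for this sketch.
MAPPINGS = {
    CRM + "P2_has_type": ("stratigraphic_unit", "id", "unit_type"),
    CRM + "P3_has_note": ("stratigraphic_unit", "id", "description"),
}


def answer_pattern(conn, prop, value=None):
    """Answer the triple pattern (?s, prop, value) by rewriting it to SQL.

    No triples are stored: they are built from the live table at query
    time, so later changes in the database show up in the next answer.
    """
    table, id_col, val_col = MAPPINGS[prop]
    sql = f"SELECT {id_col}, {val_col} FROM {table}"
    params = ()
    if value is not None:
        sql += f" WHERE {val_col} = ?"
        params = (value,)
    sql += f" ORDER BY {id_col}"
    return [
        # Hypothetical URI template for rows of the invented table.
        (f"http://example.org/unit/{row_id}", prop, val)
        for row_id, val in conn.execute(sql, params)
    ]


def demo_connection():
    """Create an in-memory database with a few invented records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stratigraphic_unit "
        "(id INTEGER PRIMARY KEY, unit_type TEXT, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO stratigraphic_unit VALUES (?, ?, ?)",
        [(1, "layer", "dark silt"), (2, "cut", "pit edge"), (3, "layer", "rubble")],
    )
    return conn
```

Calling `answer_pattern(demo_connection(), CRM + "P2_has_type", "layer")` returns triples for units 1 and 3. Inserting a new row and asking again includes it without any re-export, which is the property the abstract highlights.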

How to move from Relational to Linked Open Data 5 Star – a numismatic example

Authors: Karsten Tolle, David Wigg-Wolf

Abstract: In our database solution Antike Fundmünzen Europa (AFE), where we record finds of ancient coins, we want to preserve as much information as possible. This also includes containments of possible coin types, or marking attributes of a coin as uncertain if the exact value cannot be assured. Like many others, our backend system is based on a traditional relational database (MySQL). In order to reach Linked Open Data 5 Star status, we mapped our data to different ontologies from Nomisma.org, Dublin Core, SKOS and others. Besides providing these data to others, we also benefit from the new ability to view our relational data in a totally different way, by loading the data back into a graph database. We will present how we mapped our data using an existing mapping language, the D2RQ Mapping Language, without the need to change the underlying database. In our case this was less problematic because internally we had already set AFE up based on Nomisma.org thesauri. However, the thesaurus mapping can also be part of the mapping. With this mapping established, one can for example provide a SPARQL endpoint to others in order to allow them to access the data in an ontological way. However, for full interoperability there are still barriers that need to be overcome. Even if the same vocabulary is used, different modelling approaches might hinder full interoperability; this will be the focus of our talk, in which we explain what we mean by this. This problem does of course not occur when the modelling is identical. We are currently planning to combine different database instances that are all built on top of AFE (such as Germany and Poland, as well as Romania which is under construction) based on the same mapping in order to demonstrate the potential. We will further report on the benefits we see from the ability to use graph visualizations of the data.
We will report on our experiences with AllegroGraph as a graph database allowing reasoning for some standard properties, and Gruff as a visualization and query interface on top of it.
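The point that a shared vocabulary does not by itself guarantee interoperability can be made concrete with a toy example in Python. Everything here is hypothetical: the `ex:` terms, the two datasets and the coin identifiers are invented and do not reproduce AFE or Nomisma.org modelling. Both datasets use the same property for a mint, but one attaches it to the coin directly and the other to an intermediate node.

```python
# Triples as (subject, predicate, object) tuples; all terms are invented.
DATASET_A = {
    ("ex:coin1", "ex:hasMint", "ex:rome"),
}
DATASET_B = {
    ("ex:coin2", "ex:hasTypeDescription", "_:d1"),
    ("_:d1", "ex:hasMint", "ex:rome"),
}


def coins_direct(triples, mint):
    """Query written for model A: the mint hangs directly off the coin."""
    return sorted(s for s, p, o in triples if p == "ex:hasMint" and o == mint)


def coins_either(triples, mint):
    """Query that also follows one intermediate node, as in model B."""
    direct = {s for s, p, o in triples if p == "ex:hasMint" and o == mint}
    # Coins whose description node carries the mint.
    via_node = {
        s for s, p, o in triples
        if p == "ex:hasTypeDescription" and o in direct
    }
    # Description nodes are not coins, so drop them from the direct hits.
    intermediates = {o for s, p, o in triples if p == "ex:hasTypeDescription"}
    return sorted((direct - intermediates) | via_node)
```

Run over the union of both datasets, `coins_direct` misses `ex:coin2` and returns the description node `_:d1` as if it were a coin, while `coins_either` finds both coins. The vocabulary is identical in both datasets; only the modelling differs.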

The Labeling System: A bottom-up approach for enriched vocabularies in the humanities

Authors: Florian Thiery, Thomas Engel

Abstract: Shared thesauri of concepts are increasingly used in the process of data modelling and annotating resources in the Semantic Web. This growing family of linked data thesauri [1] follows a top-down principle. Vocabularies and broader concepts (SKOS-) are being created, maintained and provided under the supervision of central authorities to provide general and generic approaches used by scientists in the humanities. But the diversity of research questions in the humanities makes it virtually impossible to create shared controlled vocabularies that cover a wide range of potential applications, and therefore satisfy the needs of diverse stakeholders. Reliable interconnections among independent systems could solve this conceptual bottleneck of controlled vocabularies. The Labeling System (LS), developed by i3mainz and IEG [2], in contrast follows a bottom-up approach, enabling scientists working in the digital humanities to manage, create and publish their own controlled vocabularies as a SKOS concept scheme and concepts provided via a REST API and URIs [3]. One term of the vocabulary can be linked to broader corresponding concepts of domain experts and will become labels. The labels embed those broader concepts persistently into existing structures using a clean and straightforward UI. Technically the LS is defined over a flat ontology and can be queried through its triple store [4]. The created concepts can then be interlinked with well-known LOD resources from, e.g., the Getty Research Institute or the British Museum, but also with authorities maintaining linked data resources from natural science domains. The LS is domain independent, while uniting perspectives of different scientific disciplines on the same label and therefore contributing to interdisciplinary collaboration for building up cross and inter-domain linked data communities.
As the newly created expert resources are available persistently, the concept is quotable, which strengthens the scientific discourse of their semantic shape. The paper addresses principles of the Labeling System in the light of heterogeneous archaeological data from Western Europe and the Middle East. Consequently, “usual” archaeological topics of conceptualizing and interlinking temporal and spatial concepts (meaning) will be discussed. To what extent is it possible to align existing concepts with “inserting” specific concepts of domain experts? How can the LS be used to solve the ambiguity of a place type and its role or function in a specific archaeological meaning? Furthermore, we will show how the non-technical researcher can use the Labeling System to be introduced to the process of linked-data conceptualization. Finally, the paper details the benefits of enriching linked-data concepts through relating to linked-data communities of other domains, e.g. geology [5] or anatomy [6].
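A researcher-defined label linked to a broader authority concept might look roughly like the Turtle produced by the Python sketch below. This is not the Labeling System's API or data model: the function, the example.org URIs and the choice of `skos:broadMatch` as the linking property are assumptions made for illustration. Only the SKOS terms themselves are standard.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def label_to_turtle(scheme_uri, concept_uri, label, lang, broad_match_uri):
    """Serialise one project-specific concept and its link to a broader one.

    All URIs are supplied by the caller; none of them are real
    Labeling System or authority identifiers in this sketch.
    """
    lines = [
        SKOS_PREFIX,
        "",
        f"<{scheme_uri}> a skos:ConceptScheme .",
        "",
        f"<{concept_uri}> a skos:Concept ;",
        f"    skos:inScheme <{scheme_uri}> ;",
        f'    skos:prefLabel "{escape_literal(label)}"@{lang} ;',
        f"    skos:broadMatch <{broad_match_uri}> .",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        label_to_turtle(
            "http://example.org/vocab/sites",            # hypothetical scheme
            "http://example.org/vocab/sites/hill-fort",  # hypothetical concept
            "hill fort",
            "en",
            "http://example.org/authority/settlement",   # placeholder authority URI
        )
    )
```

The bottom-up step is that the researcher mints the concept first and attaches it to an existing authority afterwards, so the authority does not have to anticipate every project-specific term.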

From interoperable to interoperating Geosemantic resources

Authors: Paul J Cripps, Douglas Tudhope

Abstract: The concept of using geospatial information within Semantic Web and Linked Data environments is not new. For example, geospatial information was very much at the heart of the CRMEH archaeological extension to the CIDOC CRM a decade ago (Cripps et al. 2004) although this was not implemented; a review of the situation regarding geosemantics in 2005 commented “the semantic web is not ready to provide the expressiveness in terms of rules and language for geospatial application” (O’Dea et al. 2005 p.73). It is only recently that Linked Geospatial Data has begun to become a reality through works such as GeoSPARQL (Perry & Herring 2012; Battle & Kolas 2012), a W3C/OGC standard, and the emerging CRMgeo standard (Doerr & Hiebel 2013). This paper presents some real world, practical examples of creating and working with archaeological geosemantic resources using currently available standards and Open Source tools. The first example demonstrates a lightweight mapping between the CRMEH, CIDOC CRM and GeoSPARQL ontologies using data available from the Archaeology Data Service (ADS) digital archive and Linked Data repository. The second example demonstrates the use of Ordnance Survey (OS) Open Data within a Linked Data resource published via the ADS Linked Data repository. Both examples feature the use of Open Source tools including the STELLAR toolkit, Open Refine, Parliament, OS OpenSpace API and custom components developed and released under open license. The first example will also be placed in the context of the GSTAR project which is using the approaches described to produce Linked Geospatial Data for research purposes from commonly used platforms for managing archaeological resources within the UK heritage sector. These include the Historic Buildings and Sites and Monuments Record (HBSMR) software from exeGesIS, used by UK Historic Environment Records (HERs), and MODES, used by museums for managing museum collections. 
As such, the outputs from the GSTAR project have wider applicability in moving geosemantic information from interoperable to interoperating.
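To give a flavour of what querying Linked Geospatial Data involves, the Python sketch below builds a GeoSPARQL query that asks for features whose geometry lies within a bounding box. It is an illustration under assumptions, not the GSTAR mappings: the function names and coordinates are invented, and it presumes a store where features carry `geo:hasGeometry` and `geo:asWKT` as defined by GeoSPARQL.

```python
def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return a closed WKT polygon ring for a bounding box (lon lat order)."""
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON(({ring}))"


def within_bbox_query(min_lon, min_lat, max_lon, max_lat):
    """Build a GeoSPARQL query for features inside the bounding box."""
    wkt = bbox_to_wkt(min_lon, min_lat, max_lon, max_lat)
    return f"""PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

SELECT ?feature WHERE {{
  ?feature geo:hasGeometry ?geom .
  ?geom geo:asWKT ?wkt .
  FILTER(geof:sfWithin(?wkt, "{wkt}"^^geo:wktLiteral))
}}"""


if __name__ == "__main__":
    # Invented coordinates, for illustration only.
    print(within_bbox_query(-2, 51, -1, 52))
```

The resulting string could be sent to any GeoSPARQL-aware endpoint. How archaeological records are mapped onto features and geometries in the first place is the part the paper addresses.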

To see more videos like these, please go to the YouTube channel Recording Archaeology: http://www.youtube.com/channel/UC08QKQO1qs6OPQs9l1kMQPg
