Open Access and Open Data in Archaeology: Following the Ariadne Thread

Posted on January 4, 2017


‘Open’, be it ‘Data’, ‘Access’, or ‘Source’, is a favourite subject of mine, so I was very pleased to see there was a session on the topics of Open Data and Open Access at the EAA conference. My colleague Ben Lewis helped film the session, and you can view the videos below as I return to my weekly video posts on Wednesdays:

Session Abstract:

Will the availability of open data change the nature of archaeological research and publication? Will it also impact the ways in which archaeologists engage with wider communities? The European Science Foundation and other leading European research funders have declared their support for the “Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities”: a far-reaching restructuring of scientific publishing in favour of open access that will take place before the end of the current decade. In parallel, the infrastructure necessary for open data is being created and the political pressure to use it will increase. Many areas of the humanities in Europe, including archaeology, still find this a difficult step to take. At present, the majority of highly renowned journals continue to be published in the traditional way, and research data are still generally unpublished. At the same time, the early adopters of open access and open data are still battling with the problems of how to implement it in practice. The EC Infrastructures funded ARIADNE project is working to bring together archaeological research data from across Europe, for use and re-use in new research. There are challenges, such as raising awareness about the available data, integrating datasets produced by very different projects and methodologies and various technologies. There are GIS, databases, 3D data, scientific datasets and more, all produced in a variety of languages, and all requiring differing approaches. This session is sponsored by the ARIADNE project, follows on from Barriers and opportunities: Open Access and Open Data in Archaeology at EAA 2015, and will provide further updates and overviews relating to open access and open data.

Author – Wright, Holly, Archaeology Data Service, York, United Kingdom
Co-author(s) – Richards, Julian, University of York, York, United Kingdom (Presenting author)
Co-author(s) – Siegmund, Frank, Universität Düsseldorf, Düsseldorf, Germany
Co-author(s) – Geser, Guntram, Salzburg Research, Salzburg, Austria
Keywords: Open Access, Open Data, Publication

Antiquarians in the 21st Century: Opening up our data

Author – O’Riordan, Emma Jane, Society of Antiquaries of Scotland, Edinburgh, United Kingdom
Co-author(s) – Osborne-Martin, Erin, Society of Antiquaries of Scotland, Edinburgh, United Kingdom (Presenting author)
Keywords: open access, publishing, research

The Society of Antiquaries of Scotland has been an active publisher of Scotland’s history and archaeology since 1792; the Proceedings of the Society of Antiquaries of Scotland (PSAS) has been the primary journal dealing with Scotland’s past in its British and European context since 1851. Publication in PSAS has often been seen by many archaeologists as the ‘end’ of the research cycle: excavation is followed by publication, and the process is complete. However, there is increasing awareness that the final report alone does not tell the whole story, and many readers would also like to examine raw data. In 2001, the Society created a new, fully peer-reviewed, freely available online journal, Scottish Archaeological Internet Reports (SAIR), making it an early adopter of Open Access in an archaeological context. SAIR was intended to provide a new, lower-cost publication outlet for detailed archaeological reports; over the last fifteen years it has evolved to include the publication of many different types of projects – including large-scale surveys, gazetteers and conference proceedings – which would not be possible or desirable to publish in print for various reasons.
The Society also runs the Scottish Archaeological Research Framework (ScARF). Launched in 2012, this collaborative project brought together experts from a range of disciplines to compile a peer-reviewed summary of our archaeological knowledge up to that point and agree where future research should be directed. The entirety of Scottish archaeology was split into nine panel reports, all of which are available for free download from the project website or can be viewed on the wiki-style website itself. As such, it is the first framework of its kind in archaeology. The Society is contemplating how best to take PSAS, SAIR and ScARF forward in an Open Access world. Our audiences are increasing, both in number and in variety. There have been over 400,000 downloads from the Society’s publication archives held by the Archaeology Data Service (ADS) since 2011, and over the past three years ScARF has seen over 262,000 page views. And yet these final reports are only the tip of the archaeological data mountain. As an archaeological publisher, if we aspire to the true aims of Open Access, we should be making the original data available for re-use, data mining and new interpretations. But how can these aspirations be carried out in practice when the data is so vast and varied? As a small independent organisation, we must look to collaboration. How best to do this? One possibility is drawing from the models created by computer scientists and scientific publishers more used to dealing with raw data than with ‘coffee-table books’. However, making the data available is not only a technological issue – there are already data downloads available in parts of ScARF and SAIR, for example – but a cultural one. Many archaeologists are cautious about openly sharing raw data, and we must consider how best to reconcile the needs of authors with remaining true to our own aims of truly open knowledge.

Beyond the Pale: grey literature as a method of publication

Author – Dr. Evans, Tim, Archaeology Data Service, York, United Kingdom (Presenting author)
Keywords: grey literature, open access, publication

Since the beginnings of Rescue archaeology, the successful publication of archaeological projects has been a contentious issue. The switch from preservation solely by publication, to one of preservation by record, has placed increasing emphasis on archive and a descriptive written output via a journal or monograph. Somewhere between the two lies the corpus of written material sometimes known as grey literature: the ostensibly unpublished outputs often created to inform or satisfy a particular condition required by the curatorial sector. The opinions and perceptions surrounding this corpus are varied, albeit with a long-held belief that it is of poor quality and often inaccessible; a weakness of a discipline which is by its very nature cyclical.
This paper presents the findings of recent research on the nature of publication and archive in England. Based on regional case studies, it presents evidence for the nature of the divide between published and non-published interventions. In many cases, either by accident or design, so-called grey literature is the only written output produced by excavation, including nationally or regionally significant findings. Furthermore, the amount of grey literature often matches or surpasses what may be considered the traditional published record.
Although recent projects have done much to highlight the potential of this corpus, and initiatives such as OASIS and the ADS’ Library of Unpublished Fieldwork Reports have made significant strides in publishing fieldwork reports online, the extent of the significance of our grey literature may still be understated. Although the lack of traditional publication may be decried by some, in contrast to pay-on-access journals and monographs it represents an online and free corpus of information to fieldworkers, researchers and the wider community. It is argued that grey literature is not simply a failure, or a cause for concern, but an opportunity to reverse the traditional crises in publication and to use online systems as part of an evolution in publication strategies of archaeological projects.

Requirements for open sharing of archaeological research data

Author – Dr. Geser, Guntram, Salzburg Research, Salzburg, Austria (Presenting author)
Keywords: e-infrastructures, open data, repositories

There are several good arguments for open research data and over the last few years expectations of open sharing of publicly funded data have increased. For example, re-use of data in further research (e.g. based on combined data) is expected to provide much return on investment.
Considerable progress has been achieved with regard to e-infrastructures and services for data sharing, access and (re-)use, but the institutional requirements are lagging somewhat behind. Such requirements include the extension of open access mandates from papers to research data, available repositories adequate for research data, and making sure that data sharers receive the credit they deserve. Researchers still perceive more obstacles than incentives for opening up their data, including additional effort, lack of academic reward, concerns that data might be misused, and more. Indeed, clear evidence of benefits of data publication, re-use and citation – both on the community and individual levels – is crucial for pushing forward the open data agenda. The paper will give an overview of the current landscape of e-infrastructures and open access resources for archaeological and other cultural heritage research, and highlight institutional and other requirements for further progress and innovation through open data over the next 5 to 10 years.

Integrating data for archaeology

Author – Dr. Gavrilis, Dimitris, Athena Research Center, Maroussi, Greece (Presenting author)
Co-author(s) – Fihn, Johan, Swedish National Data Service, Gothenburg, Sweden
Co-author(s) – Olsson, Olof, Swedish National Data Service, Gothenburg, Sweden
Co-author(s) – Afiontzi, Eleni, Athena Research Center, Maroussi, Greece
Co-author(s) – Felicetti, Achille, University of Florence, Florence, Italy
Co-author(s) – Niccolucci, Franco, University of Florence, Florence, Italy
Co-author(s) – Cuy, Sebastian, German Archaeological Institute, Berlin, Germany
Keywords: Data enrichment, Data integration, Infrastructure

In the past years, infrastructure projects in the Archaeology domain have focused on data aggregation in order to bring to the end users the vast amount of information gathered from various organizations and stakeholders. The typical processes found in a data aggregation infrastructure include ingestion, normalization, transformation and validation, which mainly focus on the homogenization and cleaning of heterogeneous data. A portal is usually employed to present this information to the end users, and is met with limited success due to the sheer volume of information it contains. In order to increase the quality of services that are provided to end users, the European funded project Ariadne aims at integrating this data by modelling the underlying domain and providing the technical framework for automatic integration of heterogeneous resources.
The heart of the infrastructure lies in the underlying domain model: the Ariadne Catalog Data Model (ACDM), a DCAT-derived model which models a large number of entities such as agents, language resources, datasets, collections, reports, services, databases, etc. With the help of a micro-service-oriented architecture and a set of powerful enrichment micro-services, all aggregated data are transformed into XML and RDF, annotated over subject, space and time with the help of AAT, Geonames and thesauri (thus establishing a common reference) and interlinked with each other based on their structural or logical relationships. The data integration services can mine for links among resources, and link them to each other and to language resources such as vocabularies. Complex records can be split into their individual components, represented, enriched and stored separately while maintaining their identity using semantic linking. Each integrated resource is assigned a URI and published to:

a) Virtuoso RDF Store in RDF, which provides a SPARQL interface
b) Elasticsearch in JSON, which provides a powerful indexing mechanism that can help present and associate resources accurately in real time.
This approach can provide developers and creative industries with the means to create innovative applications and mine information from the RDF store. End users ranging from simple visitors to domain researchers can access this data through the infrastructure’s portal, which is capable of hiding the complexity of this plethora of data, filtering the results using a wide range of filters, and presenting connected resources in a way that can help guide users instead of confusing them.
The technical infrastructure has been developed using various programming languages such as Java, PHP and JavaScript. It is distributed across multiple virtual machines and brings together different established technologies and components. Both the technical infrastructure and the portal will be presented and demonstrated.
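To make the stages named in this abstract more concrete, here is a minimal sketch of a normalize, validate, enrich and serialise pass over aggregated records. It is not the ARIADNE implementation: the field names, the `SUBJECT_VOCAB` lookup, and the `example.org` URIs are all assumptions made for illustration, and the real infrastructure uses ACDM, AAT and Geonames rather than a hard-coded table.

```python
import json

# Hypothetical lookup standing in for a shared vocabulary such as AAT.
# The URIs are placeholders, not real identifiers.
SUBJECT_VOCAB = {
    "hillfort": "http://example.org/vocab/hillfort",
    "burial mound": "http://example.org/vocab/burial-mound",
}

# Assumed minimal set of fields a record needs before it is accepted.
REQUIRED_FIELDS = ("id", "title", "subject")


def normalize(record):
    """Lower-case field names and trim whitespace from string values."""
    return {
        key.strip().lower(): value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }


def validate(record):
    """Return True when every required field is present and non-empty."""
    return all(record.get(field) for field in REQUIRED_FIELDS)


def enrich(record):
    """Attach a shared-vocabulary URI for the subject, or None if unknown."""
    enriched = dict(record)
    enriched["subject_uri"] = SUBJECT_VOCAB.get(record["subject"].lower())
    return enriched


def to_ntriples(record, base="http://example.org/resource/"):
    """Serialise an enriched record as simple N-Triples lines."""
    subject = f"<{base}{record['id']}>"
    # json.dumps yields a double-quoted, escaped string literal.
    title = json.dumps(record["title"], ensure_ascii=False)
    lines = [f"{subject} <http://purl.org/dc/terms/title> {title} ."]
    if record["subject_uri"]:
        lines.append(
            f"{subject} <http://purl.org/dc/terms/subject> "
            f"<{record['subject_uri']}> ."
        )
    return lines


def integrate(raw_records):
    """Run each record through the stages; return (accepted, rejected)."""
    accepted, rejected = [], []
    for raw in raw_records:
        record = normalize(raw)
        if validate(record):
            accepted.append(enrich(record))
        else:
            rejected.append(record)
    return accepted, rejected
```

In this sketch the same enriched record could feed both publication targets mentioned above: `json.dumps(record)` for a JSON index and `to_ntriples(record)` for an RDF store.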

Linked Open Data Approaches within the ARIADNE Project

Author – Dr. Wright, Holly, University of York, York, United Kingdom (Presenting author)
Keywords: ARIADNE, Linked Data, Open Data

ARIADNE is a four-year EU FP7 Infrastructures funded project, made up of 24 partners across 16 European countries, which hold archaeological data in at least 13 languages. These data are the accumulated outcome of the research of individuals, teams and institutions, but form a vast and fragmented corpus, and their potential has been constrained by difficult access and non-homogeneous perspectives. ARIADNE aims to bring together and integrate existing archaeological research data infrastructures, so researchers can use these distributed datasets in combination, and in new ways. This paper will give an overview of the progress of the ARIADNE project, focussing on efforts to create a shared infrastructure into which metadata is gathered, and a portal to allow cross-search of this metadata. To this end, mapping work has been carried out to facilitate searching across space, time and subjects, using Linked Open Data (LOD). This work represents LOD best practice by incorporating existing international initiatives such as the Getty Art & Architecture Thesaurus, and contributing to emerging best practice initiatives like PeriodO. As ARIADNE is in its final year, conclusions can begin to be drawn about the challenges faced along the way, and possible directions for the future.
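As a rough illustration of why mapping matters for searching across time, the sketch below resolves period labels in different languages to shared definitions with year ranges, so that one date query can reach records from several partners. The mapping table, the year ranges and the `example.org` URIs are hypothetical placeholders and are not taken from PeriodO or from the ARIADNE portal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    uri: str    # placeholder identifier, not a real PeriodO URI
    start: int  # first year of the period (negative means BCE)
    end: int    # last year of the period


# Hypothetical mapping from (language, local label) to a shared definition.
# Labels and year ranges are illustrative only, not authoritative.
LOCAL_TO_SHARED = {
    ("en", "iron age"): Period("http://example.org/period/iron-age-gb", -800, 43),
    ("de", "eisenzeit"): Period("http://example.org/period/eisenzeit-de", -800, -15),
    ("en", "medieval"): Period("http://example.org/period/medieval-gb", 1066, 1540),
}


def resolve(language, label):
    """Look up the shared period for a local label, or None if unmapped."""
    return LOCAL_TO_SHARED.get((language.lower(), label.strip().lower()))


def overlaps(period, start, end):
    """True when the period's year range intersects [start, end]."""
    return period.start <= end and start <= period.end


def search_by_years(records, start, end):
    """Return ids of records whose mapped period overlaps the query range."""
    hits = []
    for record in records:
        period = resolve(record["language"], record["period"])
        if period is not None and overlaps(period, start, end):
            hits.append(record["id"])
    return hits
```

Records whose labels have no mapping are simply skipped here, which mirrors the practical point that unmapped metadata stays invisible to cross-search.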

Legacy data and archaeological archives in Europe and North Africa

Author – Dr. Fentress, Elizabeth, Rome, Italy (Presenting author)
Keywords: Archives, Legacy Data, North Africa

Perhaps the hardest data to render open access is that of the archaeological archive, even when, as is not necessarily the case, it is lodged in an institution. A survey of practices for the archiving of excavation data in a number of European countries has revealed that centralized archiving is vanishingly rare, while even university archives of excavation data are hardly easy to access. A particular example of legacy data is offered in this paper, the case of the archives of 150 years of excavations in North Africa. Carried out initially by colonial regimes, many of the archives of these excavations were returned to Europe, where they remain in large part inaccessible to the countries where they were created. No functioning archives were left in their wake, so data collected since then has rarely been properly organized. Many of these archives are in the hands of the descendants of the original excavators, some of whom have sold them, while others have simply left them in the attic. A new project, the North African Heritage Archive Network (NAHAN), is attempting to assemble on one platform the catalogues of as many as possible of these archives, which are found in four North African countries, seven European countries and the US. Under the aegis of ICCROM, the project will build on the ARIADNE infrastructure model to provide information about these resources, in the hope of generating new scholarship on this massive collection of data, and of rendering this information available to the archaeological services of the countries where it was created.

