Past Events
Tuesday 08 March 2016

Standing on the Digits of Giants: Research data, preservation and innovation


Chair: Dr William Kilbride FSA, Executive Director, The Digital Preservation Coalition

Venue: Arundel House, 13-15 Arundel Street, Temple Place, London WC2R 3DX


Access is not a one-time event; it’s an ongoing process. This seminar will examine emerging trends in scholarly communication from the perspective of publication of, and long-term access to, the scholarly record. This includes outputs not traditionally included within the primary scientific canon, such as metadata, software and research data.

The academic community has been quick to adopt digital technology. Many of the technologies we now take for granted derive from the desire of researchers to communicate more and better, the need for students to learn more flexibly and the need for teachers to guide more assiduously. Publishers and librarians have responded to meet these changing demands, and the supply chain connecting researchers has been transformed in the last twenty years. This generation has seen a revolution in the scholarly record and scholarly communication. It has mostly been for the good, but challenges remain. Considerable attention has been paid to issues of ‘access’, whether in economic (subscription, green or gold routes), social (developing communities, citizen science and crowd-sourcing) or qualitative terms (research data management, version control and data sharing). But advocates on all sides recognise that access is not an event; it’s a process.

The day will encourage the exchange of ideas and the discussion of possible solutions to, or collaborations on, these issues. Participants will gain clarity on the various moving parts that make up the digital research record, which will enable them to develop relevant policies and support for authors, editors and other clients. They will also gain increased knowledge of, and access to, networks that will support scenario-planning and product development going forward.

This is an ALPSP seminar organised by Fiona Murphy, Senior Associate, Murphy Mitchell Consulting Ltd and produced in association with the Digital Preservation Coalition.

09:15 Registration and coffee  
09:45 Welcome from Chair - Dr William Kilbride FSA, Executive Director, The Digital Preservation Coalition (audio)
10:15 Why It's Critical to Engage - Mark Thorley, Data Management Co-ordinator, NERC
"If I have seen further it is by standing on the shoulders of giants." So said Newton in a letter to Hooke of 1676. Newton himself was using a metaphor attributed to Bernard of Chartres. It is a very apt way of describing the process of scientific research, building incrementally on the findings and discoveries of others. Is Open Data just another incremental step in this process, or does it mark a fundamental change in the way research is carried out?

In this talk I will review the drivers for openness, including the integrity and transparency of research, data-driven discovery, the opportunities for innovative re-use and re-purposing of research data, and the wider expectations of a modern, networked world. I will also consider how the research community, including publishers, needs to respond to these drivers, using the principles outlined in the draft Concordat on Open Research Data. (audio) (presentation)
10:45 Science of the (near) Future: Its power and requirements - Professor Robert Gurney OBE, Professor of Earth Observation Science, University of Reading
Global environmental change is one of the most pervasive concerns of the 21st century. Scientists throughout the world are undertaking research to determine the nature and extent of these changes, and their impacts on humans and the environment. This research increasingly requires integrating large amounts of diverse data across scientific disciplines to deliver the policy-relevant and decision-focused knowledge that societies require to respond and adapt to global environmental change and extreme hazards, to manage natural resources responsibly, to grow our economies, and to limit or even escape the effects of poverty. To carry out this research, data need to be discoverable, accessible, usable, curated and preserved for the long term. This needs to be done within a supporting data-intensive e-infrastructure framework that enables data exploitation, and that evolves in response to research needs and technological innovation. Without such data and the supporting e-infrastructure, policy makers and scientists will be forced to feel their way into the future unfocused and ill-prepared, without the benefit of new scientific understanding. The Belmont Forum, an association of the world’s largest environmental funding agencies, is undertaking to make all its data and information available as openly as possible, and has adopted a set of principles and policies to do this. There is a set of activities to use exemplar projects to design how to exchange data efficiently, to train people to have the necessary skills, and to draft and implement the relevant data policies, including security and legal concerns, along with an office to coordinate these. (audio) (presentation)
11:15 Coffee  
11:30 Transformations in Scholarly Communications - Phill Jones, Head of Publisher Outreach, Digital Science
Scholarly publishing as we know it today is the result of three and a half centuries of evolution. That historical legacy shapes both the way we think about publishing and how it is done in practice. What if we were able to redesign the entire system from scratch? What would it look like and how would it meet the needs of our customers?

By working closely with academics, institutions and funders, it’s possible to see the areas where our current industry serves academics well and what might be improved. By asking deeper questions about what academics really think, we can go beyond the rhetoric and identify true unmet needs. Understanding those needs allows us to predict how technology and the market are evolving and informs product development. (audio) (presentation)
11:50 Ensuring the Integrity (& Continuity) of Our Record of Scholarship - Peter Burnhill, Director, EDINA & Data Library, Information Services Group, University of Edinburgh
Citations in articles are important pointers to what is significant, or provide the evidence for what is stated. Sometimes citations are to other published articles, and that is where the Keepers Registry plays a role, reporting what is done by CLOCKSS, Portico et al. to ensure continuity of access to that content. However, increasingly what is cited as evidence is content available on the world wild web-at-large. The evidential base for web-resident scholarly statements has a wide variety of material, including datasets, algorithms, news media, government documents, etc.

Using several large corpora of full-text works, the Hiberlink project set out to investigate and measure the scale of what is termed ‘reference rot’, the combined effect of link rot (those 404s) and content drift (where what was cited has changed or is no longer available). The results are now out, showing that the threat of reference rot is all too real and undermines the integrity of much that is published, within weeks of publication. The digital shoulders of giants appear to be much more brittle than supposed: a case of 'web today and gone tomorrow', with the only trace being the date the webpage was allegedly viewed.

Fortunately, the international team working in the Hiberlink project made more progress than originally envisaged. This means that there are also remedies to share that can alleviate or even eliminate reference rot. Two references to Hiberlink are instructive and worth a read beforehand. The first is the research paper in PLOS. The second is a discursive Insight paper on the Hiberlink website. The latter contains ‘robust links’ within its citations, to guard against reference rot. The publishing platforms for PLOS and for the article in Insight did not allow this, as the remedy has yet to be implemented! (audio) (presentation)
12:10 Enabling Data Citation and Metrics - Mike Taylor, Senior Product Manager, Infometrics, Elsevier
Elsevier has been working with numerous community groups to develop a culture of research data citation and metrics. FORCE11, RDA, CASRAI and NISO have been keen partners for us. Our implementation plan to support data citation is underway, and we're sharing our experiences through the FORCE11 Data Citation Pilot project. Also on our roadmap is support for NISO's research data metrics. Mike Taylor of Elsevier and NISO's data metrics workgroup presents some of the work that the community and Elsevier are undertaking to reward and recognise research data sharing through citation and metrics. (audio) (presentation)
12:45 Lunch  
13:45 Recognition Across the Entire Lifecycle: Identifying contributions to research - Josh Brown, ORCID Regional Director, Europe
The use of identifiers for authors is well established, and helps to ensure accurate attribution and reward. Recent innovations mean that this approach can be applied across the entire research lifecycle, and hitherto much less visible contributions to research, such as software, data visualisation and peer review, can be brought into the light as first-class research activities, taking their rightful place alongside authorship as valued contributions to the health and progress of research. (audio not available) (presentation)
14:05 Research Data Preservation: The rewards of doing things better - Dr Matthew Addis, CTO, Arkivum
Long-term access to and use of research data is an increasingly important part of scholarly communication. Whilst innovation abounds in the way data is created and shared, less work has been done on how to ensure this output remains available and usable over decade-long timescales. This talk will look at some of the things involved in achieving digital preservation of research data and how HEIs are tackling these issues. Examples presented will include how HEIs are successfully applying ‘parsimonious preservation’ in a way that provides short-term benefits, especially to researchers, as well as helping secure the long-term value that the data holds. (audio not available) (robust link to presentation: Research Data Preservation. The rewards of doing things better.)
14:35 Coffee  
15:00 Practical Adaptations:

Global Challenges, Enabling Cultures: Enriching collaborations for the digital age - Wendy White, Associate Director (Research Engagement), Hartley Library, Southampton University
Researchers are currently exploring significant global challenges and advancing knowledge in an era where technological and social developments have facilitated an unprecedented increase in the volume of data production and exchange. This talk will explore some of the key issues for researchers and how support for research can be developed within an enabling culture. It will highlight examples of services, both within and outwith HE institutions, working in partnership with researchers on innovations that enhance research quality, impact and long-term reproducibility. These will include engagement with technical, ethical and methodological issues. (audio) (presentation) (*We apologise that the first few minutes of the audio for this session are missing.)
15:20 Openness, trust, transparency: access during the data deluge - Dr Sarah Callaghan, Senior Research Scientist at STFC and Editor-in-Chief of Data Science Journal
Giving people access to data is easy. Making the data available in such a way that it is understandable and usable, far into the future, is a lot harder. It takes time, effort and energy to make a dataset suitable for use by others, time that some may feel is better spent preparing the next grant proposal. Thankfully, this issue has been acknowledged by funders, and effort continues to be put into developing tools and services to assist researchers in managing and publishing their data, thereby preserving the scientific record and providing well-deserved attribution and credit for the data producers.

This talk will discuss data publication, in the formal and informal senses, and will also describe the access issues that data faces that are not experienced by traditional scientific outputs such as journal papers. (audio) (presentation)
15:40 Data Publishing, Data Journals and Long-Term Preservation of the Scholarly Record: Some experiences in the Netherlands - Peter Doorn, Director, Data Archiving and Networked Services (DANS), The Netherlands 
Providing long-term access to research data already has a considerable tradition in The Netherlands. The first data archive for the social sciences in that country was created in 1964. DANS, Data Archiving and Networked Services, is a descendant of that ancestor. Data archives usually make their holdings available through some catalogue of metadata. Although that can be considered a form of data publishing, it is a fairly unexciting form. Moreover, although most data archives actively encourage data citation, this is only slowly being taken up by the scholarly community. And although there is some evidence that researchers who share data are cited more often than others, a data citation hardly has an impact on your citation score.

We have been promoting the linkage of data and publications in a variety of ways, with mixed results. Data journals can help to improve the situation in several ways. A data paper in a reviewed journal is closer to the traditional way of scholarly communication, especially in the humanities and social sciences. Data journals may close the gap between data publishing and traditional publishing. In my presentation I will explain the ideas and principles behind the Research Data Journal for the Humanities and Social Sciences (RDJ), which DANS recently started in collaboration with Brill publishers.

At the same time, we see journal and book publishers diversifying into the data domain. For example, Figshare (a portfolio company of Digital Science, operated by Macmillan Publishers) offers an online digital repository where researchers can preserve and share their research outputs, including datasets. Mendeley Data is a new service from the Elsevier Publishing Company, with which DANS has teamed up to guarantee long-term preservation and open accessibility in a certified archive and publicly funded environment. (audio) (presentation)
16:15 Q&A session  
16:30 Close followed by drinks and networking  

 Who should attend

Publishers, research managers, specialists in digital preservation, repository managers, research funders and policy makers, learned society representatives and research librarians.

