Last week IDCC arrived in San Francisco, land of the start-up, for the annual gathering of the research data management and curation community. I had assumed that the idea of everyone in this part of the world being involved in some sort of tech start-up was merely a stereotype, but that assumption quickly turned out to be misplaced. The first plenary speaker had started up six biotech companies, and seven of his students had one each. Feeling like the kind of risk-averse loser who was unworthy of admission into the state, let alone the conference, I slunk into a chair near the back of the hall and started listening for clues as to the Next Big Thing in RDM.
This was my fifth IDCC conference, so it is interesting to note how the topics under discussion have been evolving over a period of quite rapid change. Training staff to assist with data management has been an issue since I first attended, but whereas a few years ago the conversations centred mostly around how to get research data management into librarian training, delegates at IDCC2014 were beginning to think more broadly in terms of what sorts of skills were actually required.
Much attention has been given recently to the role of ‘data scientist’, notoriously described in the Harvard Business Review as the ‘sexiest job of the 21st century’. With an endorsement like that, who wouldn’t want to style themselves a data scientist? But this year’s conference saw a more serious attempt to actually delineate the ways in which research data needs support and who has the skills to provide it. With a large dollop of realism, conference delegates were disabused of the notion that a data curator was the same thing as a data scientist. There seemed to be general resignation to this fact amongst those self-identifying as data curators in the hall. To make matters worse, data curation was described as a ‘janitorial’ role, to quiet groans and slumping shoulders. Not only do we not all have our own start-ups, but it would appear that many of us are essentially janitors. This led into a quite enlightening panel discussion about the various roles, and indeed jobs, that were coming into existence in order to deal with research data, with terms such as ‘data steward’, ‘data governance’, ‘data analyst’, and ‘data curator’ all appearing with greatly increasing frequency in job advertisements, alongside a simultaneous decline in advertisements for more traditional jobs such as ‘librarian’ or ‘database administrator’ [Ronald Larson]. Former students at American library schools were reporting that they wished more emphasis had been given during their training to programming skills, change management, and knowledge of ‘domain science processes’, and that they had had more opportunity for engagement with scientific research communities [Carole Palmer]. The notion of ‘domain disconnect’ [Liz Lyon], particularly with regard to the ‘hard’ sciences, technology, and engineering disciplines, was one that was returned to repeatedly, with the envisaged solution being a greater degree of faculty involvement in information management training.
Other themes included the emergence (or increasing presence) of groups looking to provide international coordination of research data standards and policies. The Research Data Alliance (RDA), which did not yet exist when the last IDCC conference was held, and CODATA, which was formed as long ago as 1966, were both the subjects of keynote presentations. Whereas there had been plenty of examples of national coordination, and some at the European level, at previous conferences, largely community-organized international coordination struck me as a sign of a maturing field. I was fortunate enough to see an example of how such coordination was actually making a difference (rather than merely providing an excuse for some international travel) at the post-conference FORCE11 workshop on data citation. This involved a number of participants from very different disciplinary and organizational backgrounds working through the unique challenges of consistent data citation in their domains to produce a core set of universally-applicable principles, with specific implementations to follow. The group is currently seeking endorsements, and more about its efforts can be found at its website.
I myself managed to justify my international travel costs on the basis of giving a demonstration of the ever-improving Online Research Database Service (ORDS). It was one of a number of very well-attended software demonstrations that included DataUp, DMPonline, Labtrove, and Archivematica (unfortunately I was unable to witness the last two due to the exigencies of parallel sessions). Gradually, tools for easing the burden of data sharing are coming of age and moving from being pilots to mature services that real researchers are really using. It’s almost as exciting as having one’s own start-up, albeit with the JISC and other such agencies standing in for the venture capitalists with dollar bills spilling out of their pockets (or not).
A final, if somewhat frustrating, observation from the conference is that there is still work to do in some universities to convince the support departments of the benefits of collaboration. The desire of individual departments to ‘own’ research data management for an institution seems to be as prevalent now as before, whereas I think our experiences at Oxford have indicated that no one department, whether the research office, IT Services, or the library, really has all the expertise required to provide the best support for its researchers, and that it helps to work together. This is a theory that we will be putting to the test over the next few months as we launch our single-point-of-contact advisory service, which I’ll write about in another blog post.
It was announced that next year’s conference will be in London, so less glamorous maybe, but at least it won’t be such a drain on the expenses account, and I suspect there won’t be so many start-ups, or data janitors flattering themselves that they’re sexy.