Make your data count

A Report on the 6th Research Data Alliance Plenary Meeting

Last week I had the opportunity to attend the 6th Research Data Alliance Plenary meeting in Paris. With over 600 delegates, the RDA plenary events have turned into the largest regular gatherings of the international research data management crowd. Consisting of various Interest Groups and Working Groups dedicated to particular aspects of research data, the RDA has become the de facto body for establishing common policies, approaches, and standards.

This autumn, the focus of the conference was on research data for climate change, and a special effort was made to engage with the business sector and small start-ups. This included an interesting ‘minute-madness’ session in which thirty or so young entrepreneurs gave a pitch for their enterprises, explaining how they were gathering and/or using big data as a central plank of their business models. If the chosen enterprises were representative, then it seems that the world will soon be awash with electrical appliance metering apps, data visualization tools, and computer-heated water. The winner was an app called ‘Plume’, which measures and reports air pollution levels in various cities around the world – perfect if you want to avoid going for a jog at the worst times of the day, although personally I try to play it safe by not going jogging at all. I was rather taken by a ‘hyperlocal’ weather app called Wezzoo Oombrella (the Web is clearly running out of unclaimed words), which would have been useful to know about when dodging the rain last Tuesday.

Elsewhere, the assembled special interest groups were continuing to pursue more academic concerns. Being a member of the Interest Group for the ‘Long Tail of Research Data’, I was particularly pleased to hear the announcement of the results of our recent survey into the different research data management tools and software that researchers (and others) are using. Whilst the expected tools were all frequently cited (Excel, R, SPSS, MATLAB, etc.), the survey also exposed some less well-known analysis software and subject-specific tools. We haven’t officially published the results yet, but I’ll link to them once they’re generally available. The downside to surveys like this is that I now have several days’ work ahead of me trying to learn more about what all these tools really do and considering whether they offer any scope to improve practices across academic disciplines.

Of particular interest amongst the sessions I attended were those relating to my obligatory new interest in sensitive data (see my post about the new Participant Data Project). Having attended the UK Data Service’s very informative ‘5 Safes of Secure Access to Confidential Data’ just the previous week, I feel as though I’m getting a pretty intensive crash-course in the finer points of data anonymization and secure sharing. Both the session on International Access to Sensitive Data and that on the Life Sciences on Sensitive Data were very informative, and both groups are engaged in good work. The Big Health Data IG is still taking shape, but looks as though it will play an important role in the future. Finally, the IG on Reproducibility, whilst somewhat neglected by those RDA members who had frequented it in earlier plenaries, offered a good forum for sharing ideas regarding software and good practice. I suspect this field will become much more important over the next few years, as researchers get the hang of depositing data and attention moves towards what people can actually do with it.