A month in the life of a Panton Fellow: May 2012

And so another month draws to a close, and it’s time for the Panton Fellows to update you on what they’ve been up to recently. Before I start talking about my work though, I should draw your attention to Ross Mounce’s Panton summary for May, addressing topics ranging from Michael Nielsen’s excellent read “Reinventing Discovery” to Ross’ recent attendance at the Progressive Palaeontology conference in Cambridge. May has indeed been a busy month, but also an enjoyable one. Quite a few different things have happened: I’ll provide edited highlights only here to avoid things getting too long, but I’ll try and blog at greater length about a couple of items over the coming week.

My month started out with a trip to Tavistock Square, London, meeting with John Wood and Ben Prasadam-Halls of the Association of Commonwealth Universities. I was also joined by Peter Murray-Rust and Laura Newman, which made for plenty of interesting discussion. As well as hearing from Laura about the newly-launched OKFN School of Data, we talked about the implications of the open data movement for the development of distance learning initiatives worldwide, and what’s being done at present to help achieve this. John is a great proponent of graduate training in open science and recognises the need to develop appropriate initiatives to train data experts who can support the evolution of scientific practice in this age of “Big Data”. Those of you unfamiliar with his existing work with the European open science agenda may be interested in reading the excellent 2010 report, “Riding the Wave: How Europe can gain from the rising tide of scientific data”, or watching a video of John’s keynote speech from APE 2011 in Berlin.

The following week saw me travel to Helsingør, Denmark, where I attended Integrative Network Biology 2012: Network Medicine, an interdisciplinary symposium attracting scientists from a plethora of disciplines including biologists, biochemists, statisticians and mathematicians. Although the conference’s primary focus was network science in relation to disease treatment, it also provided many welcome opportunities for me to discuss open science with fellow delegates and to informally promote both the Panton Principles and the work of the OKFN. For now I’ll highlight two main items of interest. Data mining aficionados amongst you may be intrigued by the advances being made by Søren Brunak of the DTU and the University of Copenhagen. Søren’s group performs text mining of Danish medical records, using this information to identify hitherto unexplored links between medical conditions, paving the way for novel studies on protein interaction networks – valuable work which promises to have a real impact on healthcare provision in the near future. It was most encouraging to see the progress his group has made through data mining, and I hope it will inspire other communities to adopt similar approaches. You can view a short video of the symposium here – the complexity of the problems discussed during INB2012, and the benefit some researchers have gained from having access to large reserves of data, really underlines how vital it is that we as a community work to foster a climate of data sharing, appropriate licensing, and open research.

While at INB I also had the opportunity to speak at length with Peter Fraser Curle of IBM Zurich, who was promoting “IMPROVER”, a new crowdsourcing initiative which aims to foster greater verification and reproducibility in systems biology research. It’s good to see that some scientists are attempting to address these issues, especially given Begley and Ellis’ commentary in Nature earlier this year, which critiqued the lack of reproducibility of many experimental findings in oncology research. The project involves collaboration between academia and industry, with participants being supplied with training data to develop their methods before receiving fresh challenge data for the competitive stage. By challenging many groups to work on the same problem, they’re hoping to provide a means of evaluating the performance of different methods on a common data set and to “[identify] complementary methods to solve a problem”. Peter and I discussed the data licensing issues of the project and I also introduced him to the Panton Principles. He provided me with some extra literature, including a 2011 paper discussing the potential of crowdsourcing for driving greater scrutiny of scientific results. Although I’m a computational rather than an experimental scientist, I like to keep tabs on reproducibility studies like this – it would be great if I could adapt the approaches of my open science training scheme for use in experimental disciplines as well (experimentalists, feel free to share your thoughts on this!). The IMPROVER team are keen to encourage scientists of all backgrounds to register, follow the projects and contribute ideas and expertise. Significant cash prizes are available to fund further research – so take a look at their website if you’re interested. Bear in mind that they intend to present several new challenges over time, so even if you’re too late for the first one, fresh challenges will be announced later.

Last week I also met up with Bushra Connors, a Senior Lecturer at the University of Hertfordshire, who asked to interview me as part of her research into the changing nature of education in the 21st century. Already familiar with the work of the Open Knowledge Foundation, Bushra was very interested in the graduate training initiative I’m developing during my Panton Fellowship. We spoke at length about the open data movement and the reproducibility issues that affect a vast proportion of scientific research. She also provided a few literature references as a starting point for the question I asked in my blog last month about research group influence on the evolving style of a young researcher. I’m looking forward to following her work in the coming months to see what findings emerge from her interviews and analysis.

While all these things have been happening, I’ve also been further developing plans for my graduate training pilot scheme. At the moment I’m finalising my next meeting date with David Gavaghan (Director of the Oxford Doctoral Training Centre) and James Osborne (Associate Director at the DTC) to discuss my plans for this Michaelmas and to work out how I can integrate these into the existing DTC programme most effectively. Much of the last month has been spent exploring the wide variety of teaching and training exercises available out there, particularly focusing on those in data management, coding practice and collaborative working. These include various exercises from the Peer-to-Peer University, the 20 Questions from David Shotton I mentioned last month, and the MRC’s new online course in Research Data and Confidentiality. I’m hoping to present you with a more comprehensive discussion of all this, along with a provisional course outline, next month, so watch this space!

And finally, a little insight into some of the open science reading I’ll be doing over the coming month. I’m lucky to have friends who also work in networked science and promote the open science agenda: one such person is Lucy Power, who’s just completed her doctorate at the Oxford Internet Institute. Entitled “e-Research in the Life Sciences: from Invisible to Virtual Colleges”, her thesis addresses the evolution of scientific working methods in the life sciences in relation to the rise of the internet age. Interviews with a variety of academics engaged with open science practices form an integral part of the study, including, amongst others, the OKFN’s own Peter Murray-Rust and Cameron Neylon. Lucy’s work addresses many issues I’m hoping to weave into my open science training programme, so I’m really looking forward to working my way through this over the course of June.

And on that note, I shall leave you to enjoy your respective weekends. Just a little sneak preview for June though: a provisional schedule for my open science training scheme should be available for your perusal and comments; I’ll be talking to DSpace’s Anna Collins about our shared interests in open science graduate training, and also meeting with Kevin Page and David Ratcliffe in Oxford to discuss data mining and machine learning. And those are just a few highlights! Looking forward to sharing the outcomes of these events and others in a few weeks’ time…see you then!