Open Science Training: Lego, Languages & Lo-Fi, No-Fi

Well, it’s been quite a while since I last had a chance to blog about progress with the Open Science Training Initiative, so it’s about time I provided you with a bit of an update. Not that things have been quiet on the open science front – admittedly I have been providing some soundbites over at the News feed of the main OSTI website – but juggling the final months of thesis writing with everything else is making things pretty busy!

So: this month’s update gives you a bit of Lego, a bit of Berlin, some opportunities to get involved with translation and/or education activities and a little glimpse at some upcoming changes to the OSTI website. Read on…

Calling All Linguists!

Currently, the bulk of OSTI teaching materials are only available in English, over at the Open Science Training GitHub repository. However, OSTI was designed for in-person teaching and for adapting local, subject-specific courses to deliver integrated open science training too. English-language versions alone cannot provide for this. Last year, some of the slides made it into Finnish as part of the Finland Open Knowledge Roadshow, care of Joona Lehtomaki and colleagues. I’d love to see a broader range of translations to take things further – some of you may already know about this from our recent discussion on the OKFN Open Science community call the other week.

Do you have language skills to offer? Image by Tobias Mikkelsen (Flickr), CC-BY-NC-SA 2.0

Realistically, I’m going to need YOUR help in translating OSTI materials into other languages. I’ve already heard from individuals from a variety of countries who would like to translate the resources we have into French, Spanish, Portuguese and Russian – I’d like to add Arabic and German to that list too. If you have language experience and an interest in open science, then I would love to hear from you – feel free to email me via the OSTI “Contact Us” details, or drop me a message in the reply box below. And if your language isn’t listed above but you’d like to be the person to add it to the list and recruit a community of fellow translators, then let me know!

So, a few things you might want to know:

  • Before we can start the process of translating OSTI, I’m looking to revise the materials and get them into Markdown or similar;
  • Transifex has been suggested to me as one tool to assist with translations. If you know of any others which might be useful, or have any experience (good or bad) of working with Transifex, then leave a message below…
  • I’ll also be adding a Translations page to the OSTI website, as a central place for information, and establishing some mailing lists for our volunteer translator team to share their thoughts and ideas and to discuss any obstacles they meet during the translation process;
  • Obviously the above will take me a little time, so keep an eye on this blog and the OSTI site for further announcements – if I know you’re interested in being one of our translators, then I can email you once plans are taking shape.

So get in touch now and help to lead OSTI to pastures new!

Learning With Lego

Some of you may recall the “Consequences of (Bad) Communication” workshop which I ran at last year’s SpotOn conference in London, which addressed the issue of science communication through the fabulous medium of Lego. I’ve been absolutely delighted with the response to this one – but then, who doesn’t love Lego (treading on bricks in bare feet notwithstanding)? I have a suspicion that part of the appeal of Lego-based teaching sessions lies in the happy childhood memories it evokes in so many of us…

Happiness and Lego at SpotOn 2013 🙂 Photo by Sophie Kay (@StilettoFiend), licensed under a Creative Commons Attribution Licence, CC-BY-4.0

Microscope base in progress at SpotOn 2013. Photo by Sophie Kay (@StilettoFiend), licensed under a Creative Commons Attribution Licence, CC-BY 4.0.

Since I ran the session at SpotOn in November 2013, the session instructions have been downloaded from the OSTI website a fantastic 193 times. Although I can’t be sure which of these were read out of interest and which involved practical use, hopefully this means the ideas surrounding the session are spreading further. Furthermore, the mathematics department at Royal Holloway, University of London, will be adopting our Learning With Lego workshop as of this September. It’ll form a compulsory course for the first-year undergraduates and will take place on a weekly basis. It’s designed to get the students to identify what makes for good communication in a general (for which read, “Lego”) setting and, it is hoped, to pave the way for translating these experiences into improved communication of mathematical concepts during their day-to-day work.

Lego on Mozilla’s “Lo-Fi, No-Fi” Kit

And if you’ve been keeping an eye on that OSTI News page, you’ll also be aware that the Learning With Lego workshop is soon to appear as part of Mozilla’s “Lo-Fi, No-Fi” teaching kit. Established by Kat Braybrooke and colleagues at Mozilla and drawing on input from a variety of educators, the kit provides templates and ideas for teaching the web – and associated skills for using the web – in situations where connectivity might be low or even non-existent. I’m currently revising my original, informal instructions and packaging them for the kit, so I’ll be letting you know when our Lego lesson has officially appeared.

Homepage of Mozilla’s Lo-Fi, No-Fi Teaching Kit, offering educational sessions ranging from “Code Thief Cards to Teach Javascript Offline” to “Use Puzzles to Teach HTML”.

Open Knowledge Festival 2014: Berlin, July 15th-17th

Well, I did promise a little of Berlin at the start of this post, although it’s a visit to come rather than one that’s already taken place. Thanks to the generosity of the Wikimedia Foundation, I’ll be attending OKFest next month on a Wikimedia Scholarship. While I’m in Berlin, I’ll be looking to find ways of extending and adapting OSTI, as well as starting to build a strong community of educators willing to teach OSTI programmes in their home institutions – if that sounds like your kind of thing, then please come and talk to me at OKFest! I’ll be around for all three days of the festival and will also be hosting a session – I’m co-presenting Skills and Tools for Web Native Open Science with Karthik Ram on the final day of the programme, so I hope to see a mixture of new and familiar faces in the audience… And if you haven’t bought a ticket yet, then sign up here.

Well, it seems as though my “short” update is more than long enough for now. There’ll be more news later this week though, so watch out for a second post before we hit the weekend. You wait ages for a bus eh, and then… 🙂

Promotion, Preparation and Productivity: Open Science Sabbatical, December 2012

This month’s posting comes to you from a train somewhere between Manchester and Oxford – I’m making the most of the work time as I journey home from the seventh wedding I’ve been to in the past eight months. At time of writing, the start of the OSTI pilot is only 5 days away, so as you can imagine it’s been a bit of a nonstop month! The run-up to Christmas brought a combination of a website launch, promotional work, design and brand development for the OSTI, masses of lecture planning and preparation of course materials.

Perhaps the most significant development of December was the supervisors giving the thumbs-up to a “mini-sabbatical” of sorts, allowing me to focus solely on my open science fellowship. It’s really helped shape the course materials into an almost-finished state. I’ll save the finer details for the OSTI blogging phase later in the week, but the rough schedule of lightning lectures looks something like this:

  • Thursday 10th – (2 lectures) Reproducibility and Open Science; Open Source Coding & Version Control Using GitHub
  • Friday 11th – Licensing Your Data
  • Monday 14th – Data Management Plans & Scientific Workflows (incl. guest speaker Jun Zhao)
  • Tuesday 15th – The Changing Face of Publication
  • Thursday 17th – OKFN Session
  • Friday 18th – Presentation Day (assessment requirement for all participants)

Bear in mind that by the start of the course, the students will have already received 2 weeks’ training in Matlab and its applications, including GUI development and parallel implementation. The OSTI phase will span the assessment period for the course, themed around mathematical modelling of cancer and infectious disease.

The NERC Town Meeting (as I mentioned in my post from August 2012) provided considerable motivation for development of a website and other promotional materials for the OSTI, and took place in London on December 11th. Trialling the OSTI in an EPSRC DTC provides an excellent basis for transferring the course to similar DTP teaching models in other disciplines, and so I joined the preliminary meeting to promote the OSTI to prospective contract bidders. Drawing academics from across the UK, the meeting proved to be a reasonably productive day for open science discussion and I enjoyed some really good conversations with representatives and educationalists from, amongst others, Warwick, Oxford, Royal Holloway and the Natural History Museum.

So, what of the new aesthetic for the OSTI brand? In the interests of developing a cohesive identity for the initiative, the design needed to be consistent across all physical handouts and the website. I opted for a green, black and gold colour scheme in the end, and you can see the results in the images below (front and reverse sides of the leaflet are shown). And in keeping with the spirit of OSTI, the striking images in the design are all Creative Commons licensed content – it’s a pleasure to see such high-quality images available for use under CC license and certainly made the design process much easier for me. A CMYK version for printing will be made available via the OSTI website once the content is expanded.

OSTI Promotional Leaflet (Reverse)

So, what of that website? I should warn you now that the site is live in its basic form, but hasn’t had its official public launch yet (announcement on that will follow when the time comes). You can find it at – at present there’s just a mission statement on the opening page and a couple of other tabs with contact details. I’ll be adding content over the next month, starting with a description of the course structure and lectures, and extending to downloadable slides and materials once the course is underway. Feel free to drop me a message if you’d like to be emailed once full content and materials downloads start to appear…

Another exciting development in December was a meeting with Will Hutton, author of the bestselling work “The State We’re In” and current Principal of Hertford College, Oxford. Organised by Jenny Molloy, the gathering included a variety of faces from the Open community in Oxford, including Chas Bountra of the Structural Genomics Consortium, Simon Benjamin of Quantalk and Sally Rumsey of the Bodleian Library. Will discussed his plans to establish a series of studentships in Open Science at Hertford College, potentially in association with the Big Innovation Centre, and provided us all with a fantastic opportunity to debate the state of open science too. If this project gains the necessary funding and support to come to fruition then it could lead to a considerable hub of open research activity being established in Oxford, with the power to unify the diverse threads of open activity already taking place within the University’s departments, and to inspire novel working practices in young academics. I should stress that it’s early days yet, so keep an eye out for further news as the project develops.

So, what for January 2013? This year involves something of a running start, given the imminent beginning of the OSTI pilot on the 10th. I’m aiming to blog my progress with the course as it happens, or at least every other day if things end up being pretty hectic. Once we hit the 18th (and, moreover, once marking of the assessed work is out of the way) it’ll be onto the evaluation phase and the post-pilot report. I’ll also be following up with a few people from the NERC Town Meeting and meeting with MPLS (the physical and life sciences division) in Oxford to discuss how the OSTI might be applied to other departments outside the DTC. And there may even be a trip to the States in the pipeline…but more on that in a few weeks’ time…

A month in the life of a Panton Fellow: June 2012

Well, June has been another productive month of fellowship work! To start on a positive note, Ross Mounce and I received the good news that our proposal for OKFest has been accepted, so we’ll be in Helsinki this September to tell you about the work we’re doing for our Panton Fellowships, as part of the “Open Research and Education” topic stream on Wednesday 19th September. Looking forward to it! June has also seen several different online meetings with various working groups, in addition to my first official quarterly report for the Fellowship, so there’s been plenty to keep me occupied.

Many of you reading this will already be aware of my focus on developing graduate training schemes for open science, data management and reproducible computation. I’m really conscious of how much our early research years are influenced by the ethos of the first group we join: this emphasises a pressing need to adequately train our graduates while they’re still at a pre-doctoral stage. So you can imagine how interested I was to read the newly released JISC-funded report, entitled, “Researchers of Tomorrow: the research behaviour of Generation Y doctoral students.” The report outlines the findings of a three-year study on our youngest research generation, the children of the so-called “baby boomers”. Amongst other things, the findings identify the need for enhanced training in digital technologies, data management and collaborative working – so encouraging to hear this while I’m in the process of developing my graduate training initiative. You can download a PDF of the report here – definitely worth a look!

June has also seen further discussion with Greg Wilson and the rest of the team involved in the development of the Software Carpentry initiative. I first mentioned SWC back in my April blog posting – they provide fantastic courses in coding and software development for scientists with a limited experience of programming, combining intense in-person workshops with online learning materials. I initially heard of them as a result of my contact with the Software Sustainability Institute, and was keen to hear more about their work and how they’ve scaled the initiative up to work in many different countries and locations. After a great Skype call with Greg earlier this month, I remotely joined their conference on 20th June, which gave me the chance to meet (from across the Atlantic, at least!) many other people involved in the project (including OKFN’s own Cameron Neylon). I’m keen on the idea of integrating some of their courses – all available under a Creative Commons Attribution license – into my own training scheme later this year, so I really appreciated getting a chance to hear about how their work is progressing. One further note: the guys at SWC are really keen to get more female scientists into programming too (something which I completely support!), so if your department/organisation might be interested in holding a female-targeted session, then please do get in touch with them ASAP.

On 28th June, Jenny Molloy and I met up with various representatives from Oxford’s Bodleian Library. Alena Ptak-Danchak, Sally Rumsey, Juliet Ralph and Oliver Bridle all took time out of their busy schedules to talk to us, providing a picture of the existing state of data training provision across Oxford and discussing where my course might fit into that framework. Our librarians (and I mean this in a country-wide sense) represent a massive source of expertise in information management that we’re lucky to have. All the Bodleian representatives provided us with valuable insights into what kinds of training the students are most receptive to, and how I might adjust my own approach to course delivery in order to account for this. And I now have plenty of resources to explore and contacts to pursue. All in all, a successful meeting – and many thanks to Jenny for helping to bring this about!

I’ve also started to organise the Oxford Open Science meeting for August 22nd, provisionally entitled, “How best can we train graduates for research in the age of ‘Big Data’?” I’m hoping to:

  • generate debate on the evolution of training schemes for open science, data management and/or digital technologies;
  • discuss how we as a community can maximise the uptake of training initiatives in these areas;
  • think about how we might begin to use such training as a platform to engage those outside the open science community.

The group wiki can be found here and includes details of other upcoming meetings too: we’re a friendly bunch of people, so please do come and join, whether you want to listen to the discussion or to actively add to the debate. I’m in the process of recruiting speakers at the moment – if you, or someone you know, might be interested in speaking at our meeting, then I would love to hear from you. I’d better hold back on full details until names are fully confirmed, so watch this space…

July looks to be an exciting month, with several big meetings planned already. On 5th July I’m heading over to Cambridge for the day to meet with Anna Collins of DSpace, the digital repository for the University of Cambridge, to chat about our shared interests in data management and graduate training. The trip will also provide me with a chance to meet up with OKFN’s Laura Newman, Peter Murray-Rust and Tom Oinn over lunch – we should have plenty to talk about, and I’m really looking forward to hearing about the progress of the newly-launched School of Data. I’ll also be meeting with David Gavaghan and James Osborne of the Oxford DTC this Friday in order to develop plans for the open science training initiative I’ll be piloting this Michaelmas. Despite juggling work with a house move in a couple of days’ time, I’m hoping to join the OKFN hackday over Skype for a couple of hours this Saturday (unpacking chaos permitting!). Furthermore, I should also be meeting with David De Roure, Jenny Molloy and Peter Murray-Rust to discuss the potential for an open science workshop at Digital Research 2012, due to take place in Oxford this September. This month’s going to be a busy one…so if you wait a couple of weeks for my next Panton blog entry, I’ll let you know how it all turns out!

A month in the life of a Panton Fellow: May 2012

And so another month draws to a close, and it’s time for the Panton Fellows to update you on what they’ve been up to recently. Before I start talking about my work though, I should draw your attention to Ross Mounce’s Panton summary for May, addressing topics ranging from Michael Nielsen’s excellent read “Reinventing Discovery” to Ross’ recent attendance at the Progressive Palaeontology conference in Cambridge. May has indeed been a busy month, but also an enjoyable one. Quite a few different things have happened: I’ll provide edited highlights only here to avoid things getting too long, but I’ll try and blog at greater length about a couple of items over the coming week.

My month started out with a trip to Tavistock Square, London, meeting with John Wood and Ben Prasadam-Halls of the Association of Commonwealth Universities. I was also joined by Peter Murray-Rust and Laura Newman, which made for plenty of interesting discussion. As well as hearing from Laura about the newly-launched OKFN School of Data, we talked about the implications of the open data movement for the development of distance learning initiatives worldwide, and what’s being done at present to help achieve this. John is a great proponent of graduate training in open science and recognises the need to develop appropriate initiatives to train data experts who can support the evolution of scientific practice in this age of “Big Data”. Those of you unfamiliar with his existing work with the European open science agenda may be interested in reading the excellent 2010 report, “Riding the Wave: How Europe can gain from the rising tide of scientific data”, or watching a video of John’s keynote speech from APE 2011 in Berlin.

The following week saw me travel to Helsingør, Denmark, where I attended Integrative Network Biology 2012: Network Medicine, an interdisciplinary symposium attracting scientists from a plethora of disciplines, including biology, biochemistry, statistics and mathematics. Although the conference’s primary focus was network science in relation to disease treatment, it also provided many welcome opportunities for me to discuss open science with fellow delegates and to informally promote both the Panton Principles and the work of the OKFN. For now I’ll highlight two main items of interest. Data mining aficionados amongst you may be intrigued by the advances being made by Søren Brunak of the DTU and the University of Copenhagen. Søren’s group performs text mining of Danish medical records, using this information to identify hitherto unexplored links between medical conditions, paving the way for novel studies on protein interaction networks – valuable work which promises to have a real impact on healthcare provision in the near future. It was most encouraging to see the progress his group has made through data mining, and I hope it will inspire other communities to adopt similar approaches. You can view a short video of the symposium here – the complexity of the problems discussed during INB2012, and the benefit some researchers have gained from having access to large reserves of data, really underlines how vital it is that we as a community work to foster a climate of data sharing, appropriate licensing, and open research.

While at INB I also had the opportunity to speak at length with Peter Fraser Curle of IBM Zurich, who was promoting “IMPROVER”, a new crowdsourcing initiative which aims to foster greater verification and reproducibility in systems biology research. It’s good to see that some scientists are attempting to address these issues, especially given Begley and Ellis’ commentary in Nature earlier this year, which critiqued the lack of reproducibility of many experimental findings in oncology research. The project involves collaboration between academia and industry, participants being supplied with training data to develop their methods, before receiving fresh challenge data for the competitive stage. By challenging many groups to work on the same problem, they’re hoping to provide a means of evaluating the performance of different methods on a common data set and to “[identify] complementary methods to solve a problem”. Peter and I discussed the data licensing issues of the project and I also introduced him to the Panton Principles. He provided me with some extra literature, including a 2011 paper discussing the potential of crowdsourcing for driving greater scrutiny of scientific results. Although I’m a computational rather than an experimental scientist, I like to keep tabs on reproducibility studies like this – it would be great if I could adapt the approaches of my open science training scheme for use in experimental disciplines as well (experimentalists, feel free to share your thoughts on this!). The IMPROVER team are keen to encourage scientists of all backgrounds to register, follow the projects and contribute ideas and expertise. Significant cash prizes are available to fund further research – so take a look at their website if you’re interested. Bear in mind that they intend to present several new challenges over time, so even if you’re too late for the first one, fresh challenges will be announced later.

Last week I also met up with Bushra Connors, a Senior Lecturer at the University of Hertfordshire, who asked to interview me as part of her research into the changing nature of education in the 21st century. Already familiar with the work of the Open Knowledge Foundation, Bushra was very interested in the graduate training initiative I’m developing during my Panton Fellowship. We spoke at length about the open data movement and the reproducibility issues that affect a vast proportion of scientific research. She also provided a few literature references as a starting point for the question I asked in my blog last month about research group influence on the evolving style of a young researcher. I’m looking forward to following her work in the coming months to see what findings emerge from her interviews and analysis.

While all these things have been happening, I’ve also been further developing plans for my graduate training pilot scheme. At the moment I’m finalising my next meeting date with David Gavaghan (Director of the Oxford Doctoral Training Centre) and James Osborne (Associate Director at the DTC) to discuss my plans for this Michaelmas and to work out how I integrate these into the existing DTC programme most effectively. Much of the last month has been spent exploring the wide variety of teaching and training exercises available out there, particularly focusing on those in data management, coding practice and collaborative working. These include various exercises from the Peer-to-Peer University, the 20 Questions from David Shotton I mentioned last month, and the MRC’s new online course in Research Data and Confidentiality. I’m hoping to present you with a more comprehensive discussion of all this, along with a provisional course outline, next month, so watch this space!

And finally, a little insight into some of the open science reading I’ll be doing over the coming month. I’m lucky to have friends who also work in networked science and promote the open science agenda: one such person is Lucy Power, who’s just completed her doctorate at the Oxford Internet Institute. Entitled, “e-Research in the Life Sciences: from Invisible to Virtual Colleges”, her thesis addresses the evolution of scientific working methods in the life sciences in relation to the rise of the internet age. Interviews with a variety of academics engaged with open science practices form an integral part of the study, including, amongst others, the OKFN’s own Peter Murray-Rust and Cameron Neylon. Lucy’s work addresses many issues I’m hoping to weave into my open science training programme, so I’m really looking forward to working my way through this over the course of June.

And on that note, I shall leave you to enjoy your respective weekends. Just a little sneak preview for June though: a provisional schedule for my open science training scheme should be available for your perusal and comments; I’ll be talking to DSpace’s Anna Collins about our shared interests in open science graduate training, and also meeting with Kevin Page and David Ratcliffe in Oxford to discuss data mining and machine learning. And those are just a few highlights! Looking forward to sharing the outcomes of these events and others in a few weeks’ time…see you then!

Panton Fellowship Application: Proposal of Work 2012-13

Just in case any of you are interested, here’s my proposal for the Panton Fellowship for 2012-13 – you can also view the corresponding video here. Much of the work will focus on establishing an open science training scheme for pre-doctoral graduates, as you’ll see from the following…

The open science movement is rapidly gaining momentum, as is evident from the level of interest in the recent ‘Evolution of Science’ debate in Oxford in February 2012, the well-publicised boycott of the publisher Elsevier, and the variety of blogs and discussion groups currently being established. Our data-rich scientific world requires an appropriate infrastructure to disseminate data, code and associated writing, as well as a means of training people in the access and use of this information. Adoption of open data practices throughout the research community therefore demands changes in the working habits of existing academics and appropriate training of upcoming new researchers.

My proposal for the Panton Fellowship focuses primarily on the latter aspect. I intend to establish an Open Science Training Initiative (OSTI) for graduate students, prior to the onset of their doctoral research. Participants in the scheme will learn about the constraints and legal frameworks governing open data and discuss the ethics of scientific research. The programme will educate in the use of open data, will provide first-hand experience of implementing this approach and will equip students with the skills and knowledge to sustain this outlook when they finally enter the research environment. Successful conclusion of the pilot scheme will provide the foundations for expansion of the training programme across other universities and educational centres.

The Doctoral Training Centre (DTC) at the University of Oxford has an intake of around 40 graduate students per annum, all of whom are required to complete a pre-doctoral taught year. Many exercises in the existing DTC programme involve trying to reproduce results from published work. Often (> 50% of the time) it is not possible to fully achieve this, most commonly due to omission of detail (parameter values, etc.) or, occasionally, errors in the paper of interest. The DTC therefore provides a natural opening for our OSTI pilot scheme. The further particulars of the scheme, as detailed below, have already secured the support of Prof. David Gavaghan (DTC Director) and Ms. Sam Miles (Centre Administrator). Our OSTI approach is motivated by the need to include in a publication everything required to reproduce its results, and links directly to the development of standards such as MIASE, SED-ML, CellML and SBML.

The OSTI pilot will take the form of group-based research over several days and will be incorporated into a two-week course in computational/mathematical biology. The participants will be split into two groups, and each group presented with a different research problem. The chosen problems will provide a broad starting point (e.g. a general model in mathematical biology, or an existing paper in the field), rather than an explicit list of tasks to complete. Each group will devise their own research question to work on, supported throughout by experienced academic demonstrators. Members of each group will work collaboratively, but discussion between the separate groups will be prohibited, for reasons which we now outline.

After a specified period (which could range from 1-2 days up to a full week, according to the scope and difficulty of the exercise), the problems will rotate; successors will be expected to develop their new research question solely on the basis of the data, code and documentation supplied by their predecessors. Verbal queries are not permitted between the groups, and successors will need to verify the existing findings they are presented with before developing the research further. This heightens the need for teams to provide coherent documentation of their work and adequate means of verifying the data they release. Academic demonstrators will provide particular support during the group rotation phase to ensure a smooth transition, especially where a group has to deal with omissions in the research documentation. Successors will also be expected to complete a brief questionnaire critiquing the success of their predecessors in providing a coherent, accessible research story.

Ultimately the aim is for the entire cohort to maximise their collective research progress throughout the rotation phase, thus inducing groups to aid one another through good written communication, rather than compete to produce results. The rotation phase will culminate in oral presentation and discussion of the work (potentially streamed online), along with release of the documentation, code and findings.

2.1 – Phase 1: Planning (April 2012 – October 2012)
Success of the pilot programme will require rigorous planning and preparation. Ongoing work throughout this phase will include:

  • Course Preparation: The course will be developed in discussion with the DTC; in particular, this phase will require preparation of appropriate seminars and selection of the computational research problems involved. I will also need to establish and test the online infrastructure for the sharing of data and code during the project. Care must also be taken to ensure adequate division of subject expertise when assigning students to groups.
  • Legal Frameworks: Liaise with the legal department of the University of Oxford in advance to obtain an appropriate licence for the data.
  • Long-Term Sustainability: We are already in regular contact with Neil Chue Hong of the Software Sustainability Institute and intend to maintain this dialogue throughout Phase 1 to lay the foundations for the report and promotion tasks of Phase 3. This phase would also benefit from coordination with the Open Knowledge Foundation as regards the development of data-sharing approaches and open science seminar content.

2.2 – Phase 2: Delivery (November 2012)
The two main aims of the scheme are to educate in open science definitions and to provide first-hand experience of open science practice. The OSTI pilot will require students to question the suitability of their own working practices in relation to the implementation of open science and will encourage them to improve their research methods over the course of the problem rotations.

  • Open Data Aspects: A computational teaching module will be used for the pilot scheme, providing data-intense problems to work on and requiring students to produce open-source code. Students will be required to develop good coding skills (e.g. sensible structuring, appropriate commenting) that facilitate code reuse by others, as well as providing accessible documentation. For example, coding could be performed using the open-source Chaste framework, which is already covered by an OSI-conformant licence.
  • Public Process: There is added potential to advertise the pilot amongst the open science community and to establish a public webpage for the project, helping to generate advance interest in OSTI which could aid its adoption and extension afterwards.
  • External Involvement: It may also be possible to admit a small number of talented undergraduates if space allows; at the very least, the potential for undergraduate involvement in future will be explored in the Report (see Section 3).

The finer details of the OSTI pilot (e.g. group numbers, number of research problems) are contingent on the capacity of the DTC in 2012-13 and will become clearer in the coming months. Dates of the Delivery phase will be known upon confirmation of the module schedule for the next academic year.

2.3 – Phase 3: Evaluation and Promotion (December 2012 – April 2013)
Successful completion of the Delivery phase will be followed by an evaluation period. I will solicit feedback from all participants and compile a detailed report on the outcomes of the OSTI pilot. This report will form the basis of my submission to the Software Sustainability Institute, although the SSI submission will also include details of proposed target locations and suitable OSTI leaders, along with the necessary support and documentation to expand the initiative further afield. We have already established a dialogue with Neil Chue Hong of the SSI to discuss the potential for extending the OSTI scheme and the SSI believe there will be a good market for our initiative. Indeed the OSTI format could be realised in many other forms and could lend itself to internship-style undergraduate training just as easily, or even to a two-week workshop for established academics. Its content and duration can be tailored accordingly, providing a flexible training approach that can be adapted to the needs of the institution or department wishing to adopt it. Furthermore, I have already generated interest in setting up a postdoctoral OSTI for the academics involved in the 2020 Science project and a young researchers’ OSTI for new students joining the Oxford Computational Biology Group. These additional implementations would also provide useful comparative studies for my report to the SSI and would help to showcase the adaptability of the OSTI format.

Planning, delivery and promotion of the OSTI scheme(s) would be expected to form the most significant component of my open science activities over the course of the coming year. Nonetheless, I would also be engaged in the following endeavours, as discussed in my covering letter:

  • Research Tasks: Dissemination of my own research online as regards combined release of data and code, involving bolt-on projects in the Chaste framework, and promotion of these approaches when speaking at conferences and other events.
  • Communication and Access: My involvement with the Ashlawn Pathways Conference as Director of Science would enable me to introduce data ethics to young scientists at the top end of pre-18 education while also promoting scientific careers to the brightest students.
  • Additional Involvement: Potential to institute a series of Open Science seminars across Oxford, possibly through furthering my existing involvement with the recently established Oxford Open Science working group.