Panton Fellowship Application: Proposal of Work 2012-13

Just in case any of you are interested, here’s my proposal for the Panton Fellowship for 2012-13 – you can also view the corresponding video here. Much of the work will focus on establishing an open science training scheme for pre-doctoral graduates, as you’ll see from the following…

1 BACKGROUND AND MOTIVATION
The open science movement is rapidly gaining momentum, as is evident from the level of interest in the recent ‘Evolution of Science’ debate in Oxford in February 2012, the well-publicised boycott of the publisher Elsevier, and the variety of blogs and discussion groups currently being established. Our data-rich scientific world requires an appropriate infrastructure to disseminate data, code and associated writing, as well as a means of training people in the access and use of this information. Adoption of open data practices throughout the research community therefore demands changes in the working habits of existing academics and appropriate training of upcoming new researchers.

My proposal for the Panton Fellowship focuses primarily on the latter aspect. I intend to establish an Open Science Training Initiative (OSTI) for graduate students, prior to the onset of their doctoral research. Participants in the scheme will learn about the constraints and legal frameworks governing open data and discuss the ethics of scientific research. The programme will educate students in the use of open data, will provide first-hand experience of implementing this approach and will equip students with the skills and knowledge to sustain this outlook when they finally enter the research environment. Successful conclusion of the pilot scheme will provide the foundations for expansion of the training programme across other universities and educational centres.

2 PROJECT DESCRIPTION
The Doctoral Training Centre (DTC) at the University of Oxford has an intake of around 40 graduate students per annum, all of whom are required to complete a pre-doctoral taught year. Many exercises in the existing DTC programme involve trying to reproduce results from published work. More than 50% of the time it is not possible to fully achieve this, most commonly due to omission of detail (parameter values, etc.) or, occasionally, errors in the paper of interest. The DTC therefore provides a natural opening for our OSTI pilot scheme. The further particulars of the scheme, as detailed below, have already secured the support of Prof. David Gavaghan (DTC Director) and Ms. Sam Miles (Centre Administrator). Our OSTI approach is motivated by the need for a publication to include everything required to reproduce its results, and links directly to the development of standards such as MIASE, SED-ML, CellML and SBML.

The OSTI pilot will take the form of group-based research over several days and will be incorporated into a two-week course in computational/mathematical biology. The participants will be split into two groups, and each group presented with a different research problem. The chosen problems will provide a broad starting point (e.g. a general model in mathematical biology, or an existing paper in the field), rather than an explicit list of tasks to complete. Each group will devise their own research question to work on, supported throughout by experienced academic demonstrators. Members of each group will work collaboratively, but discussion between the separate groups will be prohibited, for reasons which we now outline.

After a specified period (which could range from 1-2 days up to a full week, according to the scope and difficulty of the exercise), the problems will rotate; successors will be expected to develop their new research question solely on the basis of the data, code and documentation supplied by their predecessors. Verbal queries are not permitted between the groups, and successors will need to verify the existing findings they are presented with before developing the research further. This heightens the need for teams to provide coherent documentation of their work and adequate means of verifying the data they release. Academic demonstrators will provide particular support during the group rotation phase to ensure a smooth transition, especially where a group has to deal with omissions in the research documentation. Successors will also be expected to complete a brief questionnaire critiquing the success of their predecessors in providing a coherent, accessible research story.

Ultimately the aim is for the entire cohort to maximise their collective research progress throughout the rotation phase, thus inducing groups to aid one another through good written communication, rather than compete to produce results. The rotation phase will culminate in oral presentation and discussion of the work (potentially streamed online), along with release of the documentation, code and findings.

2.1 – Phase 1: Planning (April 2012 – October 2012)
Success of the pilot programme will require rigorous planning and preparation. Ongoing work throughout this phase will include:

  • Course Preparation: The course will be developed in discussion with the DTC; in particular, this phase will require preparation of appropriate seminars and selection of the computational research problems involved. I will also need to establish and test the online infrastructure for the sharing of data and code during the project. Care must also be taken to ensure adequate division of subject expertise when assigning students to groups.
  • Legal Frameworks: Liaise with the legal department of the University of Oxford in advance to obtain an appropriate licence for the data.
  • Long-Term Sustainability: We are already in regular contact with Neil Chue Hong of the Software Sustainability Institute and intend to maintain this dialogue throughout Phase 1 to lay the foundations for the report and promotion tasks of Phase 3. This phase would benefit from coordination with the Open Knowledge Foundation, as regards the development of data-sharing approaches and open science seminar content.

2.2 – Phase 2: Delivery (November 2012)
The two main aims of the scheme are to educate students in open science definitions and to provide first-hand experience of open science practice. The OSTI pilot will require students to question the suitability of their own working practices in relation to the implementation of open science and will encourage them to improve their research methods over the course of the problem rotations.

  • Open Data Aspects: A computational teaching module will be used for the pilot scheme, providing data-intensive problems to work on and requiring students to produce open-source code. Students will be required to develop good coding skills (e.g. sensible structuring, appropriate commenting) that facilitate code reuse by others, as well as providing accessible documentation. For example, coding could be performed using the open-source Chaste framework, which is already covered by an OSI-conformant licence.
  • Public Process: There is added potential to advertise the pilot amongst the open science community and to establish a public webpage for the project, helping to generate advance interest in OSTI which could aid its adoption and extension afterwards.
  • External Involvement: It may also be possible to admit a small number of talented undergraduates if space allows; at the very least, the potential for undergraduate involvement in future will be explored in the Report (see Section 2.3).

The finer details of the OSTI pilot (e.g. group numbers, number of research problems) are contingent on the capacity of the DTC in 2012-13 and will become clearer in the coming months. Dates of the Delivery phase will be known upon confirmation of the module schedule for the next academic year.

2.3 – Phase 3: Evaluation and Promotion (December 2012 – April 2013)
Successful completion of the Delivery phase will be followed by an evaluation period. I will solicit feedback from all participants and compile a detailed report on the outcomes of the OSTI pilot. This report will form the basis of my submission to the Software Sustainability Institute, although the SSI submission will also include details of proposed target locations and suitable OSTI leaders, along with the necessary support and documentation to expand the initiative further afield. We have already established a dialogue with Neil Chue Hong of the SSI to discuss the potential for extending the OSTI scheme and the SSI believe there will be a good market for our initiative. Indeed the OSTI format could be realised in many other forms and could lend itself to internship-style undergraduate training just as easily, or even to a two-week workshop for established academics. Its content and duration can be tailored accordingly, providing a flexible training approach that can be adapted to the needs of the institution or department wishing to adopt it. Furthermore, I have already generated interest in setting up a postdoctoral OSTI for the academics involved in the 2020 Science project and a young researchers’ OSTI for new students joining the Oxford Computational Biology Group. These additional implementations would also provide useful comparative studies for my report to the SSI and would help to showcase the adaptability of the OSTI format.

3 ADDITIONAL TASKS
Planning, delivery and promotion of the OSTI scheme(s) would be expected to form the most significant component of my open science activities over the course of the coming year. Nonetheless, I would also be engaged in the following endeavours, as discussed in my covering letter:

  • Research Tasks: Dissemination of my own research online as regards combined release of data and code, involving bolt-on projects in the Chaste framework, and promotion of these approaches when speaking at conferences and other events.
  • Communication and Access: My involvement with the Ashlawn Pathways Conference as Director of Science would enable me to introduce data ethics to young scientists at the top end of pre-18 education while also promoting scientific careers to the brightest students.
  • Additional involvement: Potential to institute a series of Open Science seminars across Oxford, possibly through furthering my existing involvement with the recently established Oxford Open Science working group.

Preparing for University Science

Back in March 2012 I presented my work at one of the Open Days at the University of Oxford’s Department of Computer Science. It’s one of those tasks that I really enjoy – it’s always really rewarding to speak with bright young students who are aiming for a career in science and who have a genuine interest in the research you’re presenting.

Some of the attendees have asked me for advice about applying to university: how best to prepare? Some of the following points might sound like stating the obvious, but if followed should lighten the burden of the decision and application process. And don’t forget: these are my own opinions from my experience of academia on both the application and selection sides of the university process. Speak to other people too, as they’ll no doubt have other pieces of advice (and different perspectives) which you might find useful.

Feel free to comment below if there are other things you’d like to see covered. This list is far from exhaustive: there were so many points crowding into my mind that I haven’t been able to set them all down in one go. I’ll try to post an extension to this at some point in the future, and maybe some advice for female science applicants and a discussion of Oxbridge interviews if people would find that useful. Let me know what you’d like to see discussed here!

1. Investigate course content thoroughly. Gaining a degree is becoming an expensive business with the rise of university fees. As a prospective applicant, you want to make sure that you choose the right course for you – remember, you’ll be studying this subject for at least 3, if not 4 years. It’s really important to check course content before you apply. For example, just because you’ve drawn up a list of 12 courses all with the title of ‘Computer Science’ doesn’t mean that they all take the same approach to the subject. Some CS courses will be more heavily mathematical in content and focus on linear algebra and logic alongside the programming. Others may avoid this and stick very much to the practical coding side of things. One approach or the other may appeal to you more, so make sure that you’re aware of these differences and choose a course that reflects your own interests.

2. University and school perspectives on a subject may differ. Really important one, this, and a remark which relates to my first point pretty closely. Of course, in many cases your GCSE and/or A-level courses might provide a reasonable taster for the kind of subject matter you’ll come across at university. In some cases though, the university version of the subject might turn out to be quite different. For example, as an undergraduate I read Mathematics. Now, I loved it (and still do!) but I’d definitely say that my university studies were far more abstract than anything I’d been presented with at school. I thought this was magical: the first few weeks of my undergraduate education took a subject I thought I’d known and understood for the best part of 18 years and totally uprooted my perspective on it. One of my first problem sheet questions asked me to prove a particular proposition, then commented that, “you are not allowed to use square roots to prove this, as their existence has yet to be proven in the course.” Somehow, I’d reached the age of 18 without being encouraged to truly question the validity of even the most basic mathematical constructs. My degree changed that and taught me new ways of thinking through problems. I loved this, but it might not be your cup of tea (and similarly, not all mathematics degrees would adopt this approach). So don’t assume that a subject stays the same at university: science is a magically vast world and there are all sorts of perspectives a course can adopt and material it can cover. So research your choices thoroughly – it’s time well invested!

3. Choose something that interests you and which you genuinely love to study. No matter what you choose, there will always be days when you feel tired or dejected: if you know, underneath it all, that you actually love what you do, that can help bring you through any rough patches. If you end up being interviewed for university places, that enthusiasm will shine through even if you’re nervous, and it’s always an encouraging sign for your interviewer.

4. Be involved. There are so many ways in which you can indulge your passion for science and learn more about it, way beyond your school examination syllabus. Do your local science museums need volunteers? Or do they have any projects you could get involved in? Some cities have local science groups or organisations: for example, Science Oxford runs many interesting science projects for young people and indeed the wider community, and it’s not the only one of its kind. So find out what’s going on locally where you are!

5. Keep up with science communication. I think I was one of the first year groups to submit a UCAS form electronically: back then, eBay was just getting off the ground, most of us had a dial-up Internet connection if we had an Internet connection at all, and Facebook didn’t exist. If you’re applying to university nowadays, you’ve got access to all sorts of wonderful science resources on the Internet, along with opportunities to correspond directly with top scientists and science journalists. Make the most of this! Find out which are the most interesting science blogs and follow them. Join Twitter and keep up with the musings of prominent researchers in your field of interest. And don’t be scared to ask them questions if you want to – even if some may be too busy to regularly correspond, most will be delighted that you’re interested in pursuing a career in science and will be glad to point you in the right direction. The national press often prints reviews of popular science books too – you may find some of these appealing, so don’t be afraid to give them a try. And the New Scientist is always a good bet if you want accessible articles about recent science developments (you can also buy individual copies in the shops or receive a cut-price student subscription for a specified period). Internet-based “community science” projects are also becoming more common and are a great way to engage in up-to-the-minute science. Try reading Michael Nielsen’s book, “Reinventing Discovery: The New Era of Networked Science” to find out about projects such as Galaxy Zoo and similar collaborative endeavours.

6. Attend open days. All the pointers I’ve mentioned should prove useful, but in many cases people are your best source of information. University open days are great for this, as current students will usually be around to chat to. They’ll be able to share their own first-hand experiences of their subject, their course and the city they live in. Don’t be scared to ask them questions – that’s why they’re helping out on the open day!

Right – I think I’ll stop there for now for fear of this becoming an overly long piece of advice. I’ll be adding more once I get the chance, so watch this space. But one final comment for now:

7. Never stop asking why. ‘Why’ is one of those wonderful questions that pushes scientific research on. If you never tire of wanting to know about the world around you, or how to think your way through a problem, then there’s a very good chance that science is the path for you. Good luck!