2015: A retrospective

It’s been some time since I last posted, but that’s not because I’ve been twiddling my thumbs. The year 2015 saw some big changes, and I wanted to take some time to reflect on where we’ve been (and where we’re going) and share it here.

It was about a year ago that I landed in Davis, CA, to attend Titus Brown’s three-day Train-the-Trainers Software Carpentry instructor training. I met a lot of folks who work in bioinformatics (not my field, but not far off), as well as others involved in open science and open software. It was a transformative experience, with the practical benefit that I was officially able to teach Software Carpentry workshops after finishing up some final requirements. I went on to teach four in 2015, and a fifth just a week ago.

February saw my first trip to the Biophysical Society Annual Meeting with research to present, having first gone in 2012 when I was just starting out. I met many of Oliver’s (my PI) colleagues from Oxford and elsewhere, but spent a good deal of the meeting finishing up some exciting results that had just come in. The trouble with giving a talk is that you can work on it up until you get to the stage…so you will.

The meeting was great, and shaping new results into a talk on such short turnaround was only possible thanks to a software package I’d been building called MDSynthesis, which that winter was finally in a state to be (exceptionally) useful. It sped up my ability to slice, dice, and aggregate datasets from the myriad simulations I had performed, taking advantage of pandas DataFrames and HDF5 persistence to make fast exploration possible. I was on fire, and the Baltimore cold couldn’t put me out.

I attended my first SciPy conference in July, where I presented MDSynthesis as a poster. The conversations I had there influenced big changes in the package that are still shaking out, one of which is the splitting out of the core persistence functionality into a package called datreant. A few other devs from MDAnalysis and I are now figuring out the future form of this package, and the results are looking to be really quite awesome.

Speaking of MDAnalysis, Google Code finally breathed its last, and this forced the project to move elsewhere. Its new home became GitHub, and, perhaps unexpectedly, this turned out to be the best thing to happen to the project in a long time. The great development tooling that GitHub provides has catalyzed a flurry of new work, and has attracted new contributors, including some who are now coredevs (yours truly included).

The value GitHub added to MDAnalysis development spurred us to (finally) make a Becksteinlab org, where the open source projects incubated in our lab are now hosted. I’m proud to be in one of those “with it” labs, though a challenge remains in getting busy lab members to actively work on projects that have a public face and to get them into a useful state. I think this is a culture our lab intends to encourage, but it’s no secret that academia doesn’t directly value this kind of work today. But we’re building for the world of tomorrow.

In October I traveled to Germany (my first time in Europe) for the CECAM biomolecular simulation workshop-of-workshops in Jülich. Despite wifi issues in both the venue and the hotel, Philip Fowler and I managed to start things off with a Software Carpentry workshop that covered the Unix shell, Python, and Git. Oliver and I followed this with a (too short) MDAnalysis tutorial, doing things SWC-style and live coding in the Jupyter notebook (using pre-made notebooks as cheat-sheets rendered with nbviewer on my phone, of course; we’re not magicians). Both workshops were very well received, with people most appreciating the interactivity and the ability to play directly with the library during the breakout components of the live sessions.

In November, Richard Gowers visited our lab here in Arizona for a few weeks. After I treated him to some authentic New Mexican food (red or green?), he and I got to work gutting the existing topology system of MDAnalysis. The result is more performant and more flexible (a rarity!), and should finally get merged ahead of the long-sought 1.0 release. We’re still not finished adapting everything in the old topology scheme to the new one, but we’re in the last 10% (where 90% of the time is spent).

So where does that leave us in 2016? Well, this is my last year as a graduate student, and I intend to defend my dissertation in the fall. I am in production mode at the moment, preparing new simulation systems to run over the coming months. Once these are baking I’ll be working on MDAnalysis and datreant/MDSynthesis core components, while also preparing my analysis codebases for the data that will be coming down the pipe. Writing (papers, dissertation) is also set to begin in the next month or so. In February I’ll be presenting soon-to-be-published work at the 2016 Biophysical Society Annual Meeting. There’s no shortage of things to do. :)

I also hope to attend SciPy again this summer, and will be applying for jobs throughout the year in preparation for 2017. If you’re an employer, or know of anything I might be interested in, by all means, shoot me an email.

Cheers to a new year!

— david
