It’s been some time since I last posted, but that’s not because I’ve been
twiddling my thumbs. The year 2015 saw some big changes, and I wanted to take some
time to reflect on and share where we’ve been (and where we’re going).
It was about a year ago when I landed in Davis, CA, to attend Titus Brown’s
three-day Train-the-Trainers Software Carpentry instructor training. I met a
lot of folks that work in bioinformatics (not my field, but not far off), as
well as others involved in open science and open software. It was a
transformative experience, with the practical benefit that I was officially able
to teach Software Carpentry workshops after finishing up some final requirements.
I went on to teach four in 2015, and a fifth just a week ago.
February saw my first trip to the Biophysical Society Annual Meeting with
research to present, having first gone in 2012 when I was just starting out.
I met many of Oliver’s (my PI) colleagues from Oxford and elsewhere, but spent
a good deal of the meeting finishing up some exciting results that had just
come in. The trouble with giving a talk is that you can work on it up until
you get to the stage…so you will.
The meeting was great, and shaping new results into a talk on short
turnaround was made possible by a software package I’d been building called
MDSynthesis, which that winter was finally in a state to be (exceptionally)
useful. It sped up my ability to slice, dice, and aggregate datasets from
myriad simulations I had performed, taking advantage of pandas DataFrames and
HDF5 persistence to make fast exploration possible. I was on fire, and the
Baltimore cold couldn’t put me out.
I attended my first SciPy conference in July, where I presented MDSynthesis
as a poster. The conversations I had there influenced big changes in the package
that are still shaking out, one of which is the splitting out of the core
persistence functionality into a package called datreant. A few other devs
from MDAnalysis and I are now engaged with figuring out the future form of
this package, and the results are looking to be really quite awesome.
Speaking of MDAnalysis, Google Code finally gasped its last breath, and this
forced the project to move elsewhere. Its new home became GitHub, and,
perhaps unexpectedly, this turned out to be the best thing to happen to the
project in a long time. The great development tooling that GitHub provides
has catalyzed a flurry of new work, and has attracted new contributors,
including some who are now core devs (yours truly included).
The value GitHub added to MDAnalysis development spurred us to (finally) make a
Becksteinlab org, at which the open source projects incubated in our lab are
now hosted. I’m proud to be in one of those “with it” labs, though a challenge
remains in getting busy lab members to actively work on projects that have a public
face and to get those projects into a useful state. I think this is a culture our lab
intends to encourage, but it’s no secret that academia doesn’t directly value
this kind of work today. But we’re building for the world of tomorrow.
In October I traveled to Germany (my first time in Europe) for the CECAM
biomolecular simulation workshop-of-workshops in Jülich. Despite wifi issues
in both the venue and the hotel, Philip Fowler and I managed to start things
off with a Software Carpentry workshop that covered the Unix shell, Python,
and Git. Oliver and I followed this with a (too short) MDAnalysis tutorial,
doing things SWC-style and live coding in the Jupyter notebook (using
pre-made notebooks as cheat-sheets rendered with nbviewer on my phone, of
course; we’re not magicians). Both workshops were very well-received, with
people appreciating most the interactivity and the ability to play directly
with the library during the breakout components of the live sessions.
In November, Richard Gowers visited our lab here in Arizona for a few weeks.
After I treated him to some authentic New Mexican food (red or green?), he and I
got to work gutting the existing topology system of MDAnalysis. The result is
more performant and more flexible (a rarity!), and should finally get merged
ahead of the long-sought 1.0 release. We’re still not finished adapting everything
in the old topology scheme into the new one, but we’re in the last 10% (where
90% of the time is spent).
So where does that leave us in 2016? Well, this is my last year as a graduate
student, and I intend to defend my dissertation in the fall. I am in production
mode at the moment preparing new simulation systems to run over the coming months.
Once these are baking I’ll be working on MDAnalysis and datreant/MDSynthesis
core components, while also preparing my analysis codebases for the data that
will be coming down the pipes. Writing (papers, dissertation) is also set to
begin in the next month or so. In February I’ll be presenting work that should be
published soon at the upcoming 2016 Biophysical Society Annual Meeting.
There’s no shortage of things to do. :)
I also hope to attend SciPy again this summer, and will be applying to job
opportunities throughout the year in preparation for 2017. If you’re an employer,
or know of anything I might be interested in, by all means, shoot me an email.
Cheers to a new year!
— david