The problem with software in evolutionary biology
A nice feature article came out a couple of days ago in Nature, about how to better fix bugs in scientific software. Unfortunately, I believe the article to be very useful for some and useless for others (perhaps for most). Within the field of evolutionary biology, I believe that such an article is more useless than useful. The suggestions it presents are completely valid, but only for software that is already of a good standard to start with. That is, software written by scientists who already have some grasp of the principles of good programming.
The harsh truth is that software engineering and programming are very often self-taught disciplines in biology (barring bioinformatics, perhaps). The very few programming courses that are offered in biology concern statistical programming and are limited to, for example, R. That is at least the case in the institutes where I have been, which are admittedly few, so I can only speak from experience. However, many evolutionary biologists need simulations for their work, and thus need to be introduced to object-oriented programming. This means being able to code in languages such as Python and/or C++, and being capable of more than just defining and calling functions.
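To make concrete what "more than just defining and calling functions" can look like, here is a minimal sketch in Python of an object-oriented simulation of genetic drift. It is only an illustration under simple assumptions (haploid individuals, one locus, constant population size, no selection), not code from any real project. The names `Individual`, `Population` and `simulate` are hypothetical and chosen for this example.

```python
import random


class Individual:
    """One haploid individual carrying a single allele, 'A' or 'a'."""

    def __init__(self, allele):
        self.allele = allele


class Population:
    """A constant-size population evolving under pure genetic drift."""

    def __init__(self, size, freq_a, rng):
        self.rng = rng  # a seeded generator keeps runs reproducible
        n_a = round(size * freq_a)
        self.individuals = (
            [Individual("A") for _ in range(n_a)]
            + [Individual("a") for _ in range(size - n_a)]
        )

    def frequency(self, allele="A"):
        """Return the frequency of the given allele in the population."""
        carriers = sum(ind.allele == allele for ind in self.individuals)
        return carriers / len(self.individuals)

    def next_generation(self):
        """Replace the population by offspring of randomly drawn parents."""
        parents = self.rng.choices(self.individuals, k=len(self.individuals))
        self.individuals = [Individual(p.allele) for p in parents]


def simulate(size=100, freq_a=0.5, generations=50, seed=1):
    """Return the frequency of allele 'A' at every generation."""
    rng = random.Random(seed)
    pop = Population(size, freq_a, rng)
    trajectory = [pop.frequency()]
    for _ in range(generations):
        pop.next_generation()
        trajectory.append(pop.frequency())
    return trajectory
```

Keeping the state in `Population` and passing in a seeded random generator makes the simulation easy to check: a population fixed for one allele must stay fixed, and two runs with the same seed must give the same trajectory. Checks of this kind are cheap to write and can catch a bug long before a reviewer does.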
I believe my experience to be quite similar to others’. I learned Python during my master’s thesis (before I moved on to evolutionary biology) and I had to learn it by myself. Like many people, I chose a tutorial online and started from there. It wasn’t until much later, during my PhD, that I realized that I was a terrible coder and that the code I had written up to then was terribly inefficient, difficult to understand even for myself, and prone to bugs. In the following years, I learned about best practices, attended lectures, and now I consider myself a marginally better coder. Even so, one of the bugs I introduced at the beginning of my PhD came back to bite me many years later, when I finally submitted the paper resulting from that simulation code. A comment by a reviewer put me on the right course, and I discovered a bug that ultimately changed the results of my work. Luckily it was before publication! Sometimes I wonder if I’m an isolated case, or if there are plenty of results out there that are flawed because of a bug in the code that no one ever found.
In recent years I have been outspoken about this issue: programming practices are severely lacking in the field of evolutionary biology. Not only that, but software is often treated as an unimportant part of the research process in evolution. There are several reasons for this, and I think I have identified a few (what follows is from a recent Tweet that I posted on the matter):
It is obvious that we need to change a lot of things if we are to produce good software that can be reviewed and reused. Most of these changes concern the system behind academia. In a world where we fight for money and only get it for innovation, nobody will take the time to make existing methods available as software. Nobody wants to pay the price and put resources into maintaining code. And nobody sees it as necessary to hire research software engineers (a very rare position indeed), because the current clumsy level of coding still leads to publications, which are the main return for any scientist. But we are wasting resources, which is inadmissible in a field where code is so incredibly important and is becoming even more so as time passes. We need to change.