Standing on Code: Advancing Scientific Discovery Through Sustainable Software Practices
Scientific discovery has long been a cumulative endeavor, each new finding built on a foundation of prior knowledge. Newton’s well-known phrase, "If I have seen further, it is by standing on the shoulders of giants," perfectly captures this tradition of incremental progress. In the modern scientific world, however, this incremental approach has been better maintained in the written literature than in the computational code that underpins scientific analysis. As a climate scientist working on land-atmosphere exchange, I began my career as an empiricist, focused on field measurements and observational analysis, before moving into the domain of modeling. This experience has shown me firsthand how the current paradigm in research coding fails to facilitate long-term, incremental improvement.
Research projects typically last only a few years and are conducted by small, dedicated teams of fewer than ten people. During this period, these teams develop custom code to analyze data and model interactions, and the work culminates in a publication. However, without a framework for maintenance, extensibility, or community engagement, this code is often effectively obsolete once the paper is published. Future researchers cannot easily build upon it, which limits scientific advancement and leads to frequent reinvention of existing tools.
In this talk, I will outline the need for a paradigm shift in scientific software development, grounded in the principles of good software engineering. This approach calls for code to be open source and treated as a long-term asset, enabling future scientists to build upon prior work as readily as they cite existing literature. Key components of this framework include modularity, documentation, community engagement, and long-term maintenance practices. Taken together, these practices can turn scientific code into an evolving infrastructure that supports cumulative research in ways that were previously unattainable.
Such a shift would help ensure that valuable insights in fields like climate science, which rely heavily on complex models of interactions such as land-atmosphere exchange, continue to support future discoveries. With this vision, we can better align scientific coding with the incremental spirit that underpins all scientific endeavor, fostering a more sustainable and collaborative approach to scientific innovation.