We are pleased to announce the release of Dask version 1.0.0!

Usually in release blogposts we outline important features and changes since the last major version. Because of the 1.0 version number, this post will be a bit different. Instead we’ll talk about what this version number means to us, and discuss the broader context of Dask projects more generally.

What 1.0 means to us

Version 1.0 software means different things to different groups. In some communities it might mean …

  • The first version of a package
  • When a package is first ready for production use
  • When a package has reached API stability

It is common in the PyData ecosystem to wait a long time before releasing a version 1.0. For example neither Pandas nor Scikit-Learn, arguably two of the most well used PyData packages in production, have yet declared a 1.0 version number (today they are at versions 0.23 and 0.20 respectively). And yet each package is widely used in production by organizations that demand high degrees of stability.

Dask is not as API-stable as Pandas or Scikit-Learn, but it’s pretty close. The project rarely invents new APIs, instead preferring to implement pre-existing APIs (like the NumPy/Pandas/Scikit-Learn APIs) or standard language protocols (like async-await, concurrent.futures, Queues, Locks, and so on). Additionally, Dask is well used in production today across sectors ranging from risk-tolerant industries like startups and quantitative finance shops, to risk-averse institutions like banks, large enterprises, and governments.

When we say that Dask has reached 1.0 we mean that it is ready to be used in production. We are late in saying this. This happened a long time ago.

Development will continue as before

Dask is living software that exists in a rapidly evolving space. Nothing is changing about our internal stability practices. We will continue to add new features, deprecate old ones, and fix bugs with the same policies. We always try to minimize negative effects on users when making these internal changes while maximizing the speed at which we can deliver new bugfixes and features. This is hard and requires care, but we believe that we’ve done this decently in the past so hopefully you haven’t noticed much. We will continue to operate the same way into the future.

The 1.0 version change does not affect our development cycle. There are no LTS versions beyond what we already provide.

Different Dask packages move at different speeds

Dask is able to evolve and experiment rapidly while maintaining a stable core because it is split into sub-packages, each of which evolves independently, has its own maintainers, its own versions, and its own release cycle. Some Dask subprojects have had versions above 1.0 for a long time, while others are still unstable.

Dask’s version number is hard to define today because it is composed of so many independent efforts by different groups. This is similar to situation in Jupyter, or in the Numeric Python ecosystem itself.

Thanks

Finally, we’re grateful to everyone who has contributed to the project over the years, either by contributing code, reviews, documentation, discussion, bug reports, well written questions and answers, visual designs, and well wishes. This means a lot to us.

Today there are dozens of dask-* packages on PyPI that support thousands of users and several more that incorporate Dask for parallelism. We’re thankful to play a role in such a vibrant community.


blog comments powered by Disqus