The 2023 revision of the array API standard marks another major milestone in the collective effort to achieve array interoperability across the Python data ecosystem.
A recent post by Quansight Labs’ Athan Reines on the Data APIs blog shares updates on the consortium’s progress and plans for the future.
The Consortium for Python Data API Standards is a coalition of stakeholders from across the Scientific Python Ecosystem working together to create common rules and tools for how different Python libraries handle data. These libraries are used for tasks like data analysis, machine learning, and scientific computing.
By agreeing on standard ways to work with data, the Consortium aims to make it easier for developers to use multiple libraries together without running into compatibility issues. This helps improve collaboration and efficiency in the Python data ecosystem.
"The 2023 release of the array API specification standardizes several key APIs necessary for facilitating adoption among array-consuming libraries and should help accelerate array interoperability within the Scientific Python Ecosystem."
Athan Reines, Senior Engineering Manager, Data APIs & Quansight Labs
The Array API Standard aims to standardize the fundamental building blocks of scientific computing: multi-dimensional arrays (tensors). Historically, working across different array libraries, such as NumPy, CuPy, PyTorch, JAX, and others, has been challenging due to divergent APIs and behavioral inconsistencies. The Consortium was established to facilitate coordination and provide a transparent process for standardizing array API design.
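To make the idea concrete, here is a minimal, dependency-free sketch of the namespace-dispatch pattern the standard enables. The `__array_namespace__` method name and its optional `api_version` argument come from the standard. `ToyArray`, `ToyNamespace`, and `center` are hypothetical names invented for this illustration and do not belong to any real library.

```python
class ToyNamespace:
    """Hypothetical stand-in for a library's array API namespace."""

    @staticmethod
    def mean(x):
        # A conforming library returns a zero-dimensional array here;
        # a plain float keeps this toy small.
        return sum(x.values) / len(x.values)


class ToyArray:
    """Hypothetical 1-D array type used only for illustration."""

    def __init__(self, values):
        self.values = [float(v) for v in values]

    def __array_namespace__(self, api_version=None):
        # The standard specifies this method as the way an array
        # reports which namespace of functions it belongs to.
        return ToyNamespace

    def __sub__(self, other):
        # Only scalar subtraction is supported in this toy.
        return ToyArray(v - other for v in self.values)


def center(x):
    """Subtract the mean from x without naming a specific array library."""
    xp = x.__array_namespace__()  # ask the array for its namespace
    return x - xp.mean(x)


centered = center(ToyArray([1, 2, 3, 6]))
print(centered.values)  # [-2.0, -1.0, 0.0, 3.0]
```

Because `center` never imports a specific array library, the same function should in principle accept arrays from any library that implements the standard, which is the kind of interoperability the Consortium is working toward.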
The revision introduces several additions and improvements, including:
On the adoption front, the past year has seen incredible progress:
To further adoption, the Consortium has continued developing a comprehensive test suite for compliance testing and a compatibility layer to smooth over behavioral differences among libraries as they work toward full conformance.
Moving forward, the focus remains on driving widespread adoption and addressing gaps identified by downstream consumers. Key priorities include more robust tooling for compliance monitoring and greater transparency around supported APIs and edge cases.
The journey toward array interoperability has been a long one, but the 2023 revision shows how far we’ve come thanks to the relentless effort and coordination across the Python data community. We’re proud to be part of this collaborative effort and can’t wait to see what the future holds.
Consortium members from Quansight & Quansight Labs: Athan Reines, Ralf Gommers, Aaron Meurer, Matthew Barber, Marco Gorelli