This is a live snapshot of tasks and features we will be investing resources in. It may be used to encourage and inspire developers and to search for funding.
Interoperability protocols & duck typing¶
- Easier custom dtypes - Simplify and/or wrap the current C-API - More consistent support for dtype metadata - Support for writing a dtype in Python
- New string dtype(s): - Encoded strings with fixed-width storage (utf8, latin1, …) and/or - Variable length strings (could share implementation with dtype=object, but are explicitly type-checked) - One of these should probably be the default for text data. The current behavior on Python 3 is neither efficient nor user friendly.
- np.int should not be platform dependent
- better coercion for string + number
NumPy is much more than just the code base itself, we also maintain docs, CI, benchmarks, etc.
- Rewrite numpy.org
- Benchmarking: improve the extent of the existing suite, and run & render
the results as part of the docs or website.
- Hardware: find a machine that can reliably run serial benchmarks
- ASV produces graphs, could we set up a site? Currently at https://pv.github.io/numpy-bench/, should that become a community resource?
Functionality outside core¶
Some things inside NumPy do not actually match the Scope of NumPy.
- A backend system for numpy.fft (so that e.g. fft-mkl doesn’t need to monkeypatch numpy)
- Rewrite masked arrays to not be a ndarray subclass – maybe in a separate project?
- MaskedArray as a duck-array type, and/or
- dtypes that support missing values
- Write a strategy on how to deal with overlap between numpy and scipy for linalg and fft (and implement it).
- Deprecate np.matrix
We depend on CI to discover problems as we continue to develop NumPy before the code reaches downstream users.
- CI for more exotic platforms (e.g. ARM is now available from http://www.shippable.com/, but it is not free).
- Multi-package testing
- Add an official channel for numpy dev builds for CI usage by other projects so they may confirm new builds do not break their package.
Python type annotation syntax should support ndarrays and dtypes.
- Type annotations for NumPy: github.com/numpy/numpy-stubs
- Support for typing shape and dtype in multi-dimensional arrays in Python more generally
Numpy has both scalars and zero-dimensional arrays.
- The current implementation adds a large maintenance burden – can we remove scalars and/or simplify it internally?
- Zero dimensional arrays get converted into scalars by most NumPy functions (i.e., output of np.sin(x) depends on whether x is zero-dimensional or not). This inconsistency should be addressed, so that one could, e.g., write sane type annotations.