Numpydantic - array typing and validation for pydantic and beyond

Hello everyone!

@stefanv recommended I share this here, hopefully y'all find it interesting!

I just released a package, numpydantic, that provides generic typing for array shape and dtype validation using pydantic (and more generally); announcement post here: Dr. jonny phd: "Here's an ~ official ~ release announcement for #…" - Neuromatch Social

The idea is that we not only want to be able to specify arrays in pydantic and declare shape and dtype constraints for those arrays, but also to treat that specification as generic across array implementations. I see this topic has come up a few times before in different ways:

This package isn't intended to be a unified interface to all arrays, but a passthrough to various array backends after validation - it does provide some proxy classes for, e.g., treating videos and HDF5 arrays much like lazy-loaded numpy-like arrays. This is part of a broader data formats effort: specifying array data in formats like Neurodata Without Borders as abstract schema, then generating many flexible representations of that schema, decoupling schema from implementation.
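To make the "validate, then pass through" idea concrete, here's a minimal stdlib-only sketch (not numpydantic's actual internals - `ArraySpec` and `FakeArray` are hypothetical names for illustration): a spec validates any object that duck-types `.shape` and `.dtype`, regardless of which array backend produced it, and the object itself passes through unchanged.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

@dataclass
class ArraySpec:
    """Hypothetical backend-agnostic array spec (illustrative only)."""
    shape: Tuple[Optional[int], ...]  # None acts as a wildcard dimension
    dtype: str

    def validate(self, arr: Any) -> bool:
        # Duck-typed: works for anything exposing `.shape` and `.dtype`
        # attributes (numpy, dask, zarr, h5py datasets, ...).
        if len(arr.shape) != len(self.shape):
            return False
        for expected, actual in zip(self.shape, arr.shape):
            if expected is not None and expected != actual:
                return False
        return str(arr.dtype) == self.dtype

# A minimal stand-in "array" so the sketch runs without numpy installed
@dataclass
class FakeArray:
    shape: Tuple[int, ...]
    dtype: str

spec = ArraySpec(shape=(None, 3), dtype="float64")
print(spec.validate(FakeArray((10, 3), "float64")))  # any row count, 3 cols
print(spec.validate(FakeArray((10, 4), "float64")))  # wrong column count
```

numpydantic's annotations hook the equivalent check into pydantic's validation machinery, so the model field ends up holding the original backend object rather than a copy.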

Would love to hear thoughts - it's a v1 release (well, v1.1), so there are of course tons of improvements left to make, and I'd love to hear what could be better or how this might be useful!


Thanks for sharing @sneakers-the-rat.

You may be interested in TYP: Make array _ShapeType covariant by Jacob-Stevens-Haas · Pull Request #26081 · numpy/numpy · GitHub, which is very close to the finish line and adds a new np.typing.Array object that is shape- and dtype-aware.

You're linking to discussions about runtime support that is generic across array libraries. That work is fairly far along; see for example:

Generic static typing on top of that has a longer way to go - in particular if it also has to be shape-aware. I think we'll only see that once all of the above has stabilized.