Generated by DeepSeek V3.2| NumPy | |
|---|---|
| Name | NumPy |
| Developer | Travis Oliphant, NumPy community |
| Released | 2006 |
| Programming language | Python (programming language), C (programming language) |
| Operating system | Cross-platform |
| Genre | Library (computing) |
| License | BSD licenses |
NumPy. It is a fundamental package for scientific computing in the Python (programming language) ecosystem, providing support for large, multi-dimensional arrays and matrices, along with a vast collection of high-level mathematical functions to operate on these data structures. The library is a cornerstone for the entire PyData stack, enabling efficient numerical computations that are essential in fields ranging from machine learning to astrophysics. Its design and performance have made it an indispensable tool for researchers, engineers, and data scientists worldwide.
NumPy introduces the powerful `ndarray` object, an efficient multidimensional container for homogeneous data, which forms the basis for most numerical work in Python. This array programming capability allows for concise and readable code that can perform complex mathematical operations without the need for slow, explicit loops. The library is deeply integrated with other key scientific packages like SciPy and matplotlib, forming the core of a rich ecosystem for technical computing. Its widespread adoption is evident in major organizations such as NASA, CERN, and Google, which rely on it for data analysis and simulation.
The project originated from its predecessor, Numeric, created by Jim Hugunin in the mid-1990s, and was later merged with another package called Numarray. In 2005, Travis Oliphant spearheaded the effort to unify these libraries into a single, more powerful system, leading to the first public release in 2006. This consolidation was supported by the broader open-source software community and received early funding from entities like the National Institutes of Health. Subsequent development has been guided by a large community of contributors, with significant stewardship from organizations like the Python Software Foundation and NumFOCUS.
At its heart, NumPy provides the `ndarray` object, which supports vectorization for fast operations using pre-compiled C (programming language) code. It includes comprehensive mathematical functions for linear algebra, Fourier transform, and random number generation, often implemented via interfaces to established FORTRAN libraries like LAPACK. The broadcasting mechanism allows arrays of different shapes to be used in arithmetic operations, while sophisticated indexing capabilities, including boolean indexing and fancy indexing, enable powerful data manipulation. These features are critical for algorithms used in domains like computational fluid dynamics and quantum mechanics.
Arrays can be created from scratch using functions like `numpy.zeros`, `numpy.ones`, and `numpy.arange`, or from existing Python (programming language) sequences. Data can also be loaded from files using utilities that interface with formats common in scientific work, such as those from HDF Group. Manipulation routines include `numpy.reshape` to alter array dimensions, `numpy.concatenate` for joining arrays, and `numpy.split` for dividing them. Operations like `numpy.transpose` and `numpy.rollaxis` are essential for rearranging data, which is a common requirement in image processing libraries like OpenCV and geospatial analysis with GDAL.
NumPy arrays serve as the common data exchange format for a vast ecosystem of Python libraries. SciPy builds directly on it to provide more advanced modules for optimization (mathematics), signal processing, and statistics. pandas uses it as the foundation for its DataFrame object, enabling sophisticated data analysis. For machine learning, frameworks like scikit-learn, TensorFlow, and PyTorch seamlessly accept and return NumPy arrays. Visualization libraries, most notably matplotlib, rely on it for plotting data, and it interoperates with symbolic mathematics systems like SymPy.
Performance is achieved by implementing core routines in C (programming language) and FORTRAN, and by utilizing CPU SIMD instructions through compilers like GNU Compiler Collection. The memory layout of `ndarray` objects is designed for cache locality, and operations can leverage multithreading via libraries such as OpenBLAS. For very large datasets that exceed system memory, integrations with libraries like Dask (software) enable out-of-core computations. The Universal Function API (ufunc) allows for the creation of compiled functions that operate element-wise on arrays, a feature exploited by projects like Numba for just-in-time compilation.
Category:Free science software Category:Python (programming language) libraries Category:Numerical programming languages