NumPy Data Types
NumPy’s power stems from its efficient ndarray (N-dimensional array) object. Unlike Python lists, NumPy arrays are homogeneous: all elements share the same data type. This homogeneity allows for optimized vectorized operations, significantly boosting performance. NumPy offers a wide variety of data types, many mirroring those in C and Fortran, each with a shorthand code:
Data Type | Description | NumPy Type Code |
---|---|---|
Integer | Signed integer | int8, int16, int32, int64 |
Unsigned Integer | Unsigned integer | uint8, uint16, uint32, uint64 |
Floating-Point | Half, single, and double precision floating-point | float16, float32, float64 (commonly float) |
Complex Floating-Point | Single and double precision complex numbers | complex64, complex128 (commonly complex) |
Boolean | True/False values | bool_ (commonly bool) |
Byte String | Fixed-length byte strings | bytes_ (length given in the type code, e.g., S10) |
Unicode | Fixed-length Unicode strings | str_ (length given in the type code, e.g., U10) |
Object | Arbitrary Python objects | object |
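To make the table more concrete, the following sketch queries a few of these types for their size and range. It is only an illustration; the variable names `sizes` and `int8_info` are chosen for this example.

```python
import numpy as np

# Bytes used per element for a few dtypes from the table
sizes = {name: np.dtype(name).itemsize
         for name in ("int8", "int32", "float16", "float64", "complex128")}
print(sizes)

# Representable range of an integer type
int8_info = np.iinfo(np.int8)
print(int8_info.min, int8_info.max)    # -128 127

# Approximate number of reliable decimal digits for floating-point types
print(np.finfo(np.float32).precision)  # 6
print(np.finfo(np.float64).precision)  # 15
```

np.iinfo and np.finfo are handy when deciding whether a smaller type can safely hold your data.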
Specifying Data Types:
When creating an array, you can explicitly set the data type using the dtype argument:
import numpy as np
arr_int = np.array([1, 2, 3, 4], dtype=np.int32)
print(arr_int.dtype) # Output: int32
arr_float = np.array([1.1, 2.2, 3.3], dtype=np.float64)
print(arr_float.dtype) # Output: float64
arr_mixed = np.array([1, 2.5, 3]) # dtype will be upcast to float64
print(arr_mixed.dtype) # Output: float64
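The dtype argument also accepts string names and short type codes. The sketch below shows three spellings that should resolve to the same type, plus a fixed-length Unicode type; the array names are illustrative only.

```python
import numpy as np

a = np.zeros(3, dtype="int32")    # type name as a string
b = np.zeros(3, dtype="i4")       # kind code plus byte count: 4-byte signed integer
c = np.zeros(3, dtype=np.int32)   # NumPy scalar type

same = a.dtype == b.dtype == c.dtype
print(same)  # True

# Fixed-length Unicode strings of up to 10 characters (4 bytes per character)
names = np.array(["ada", "grace"], dtype="U10")
print(names.dtype.itemsize)  # 40
```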
Careful data type selection is crucial for memory efficiency and performance. Smaller types (e.g., int8 or int16) save memory, but values outside their range overflow: integer array arithmetic wraps around silently instead of raising an error.
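A small sketch of this trade-off, using int8 so the overflow is easy to trigger. The array sizes and values are arbitrary choices for the example.

```python
import numpy as np

# Memory: 1000 elements at 1 byte each vs. 8 bytes each
small = np.zeros(1000, dtype=np.int8)
large = np.zeros(1000, dtype=np.int64)
print(small.nbytes, large.nbytes)  # 1000 8000

# Overflow: int8 holds -128..127, so sums beyond 127 wrap around
a = np.array([100, 120], dtype=np.int8)
b = np.array([100, 10], dtype=np.int8)
wrapped = a + b                    # true sums would be 200 and 130
print(wrapped.tolist())            # [-56, -126]
```

No warning is raised for the wrapped values, so it is worth checking the expected range of your data (for instance with np.iinfo) before picking a narrow type.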
Data Type Conversion
Converting between data types is common. NumPy provides several methods:
1. Using astype():
astype() returns a copy of the array with its values converted to the new data type:
arr_int = np.array([1, 2, 3], dtype=np.int32)
arr_float = arr_int.astype(np.float64)
print(arr_float) # Output: [1. 2. 3.]
print(arr_float.dtype) # Output: float64
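Converting in the other direction, from float to integer, discards information. The following sketch shows that astype() drops the fractional part and that the result is independent of the original array; np.rint is used here as one possible way to round first.

```python
import numpy as np

values = np.array([1.7, -1.7, 2.5])

as_int = values.astype(np.int32)   # fractional part is dropped (toward zero)
print(as_int.tolist())             # [1, -1, 2]

as_int[0] = 99                     # astype() returned a copy...
print(values[0])                   # ...so the original still holds 1.7

# Round to the nearest integer before converting, if that is what you want
rounded = np.rint(values).astype(np.int32)
print(rounded.tolist())            # [2, -2, 2]
```

Note that np.rint rounds halfway cases to the nearest even integer, which is why 2.5 becomes 2.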
2. Implicit Type Conversion:
When an operation combines arrays of different types, NumPy implicitly promotes the result to a type that can represent both operands:
arr_int = np.array([1, 2, 3])
arr_float = np.array([1.1, 2.2, 3.3])
result = arr_int + arr_float
print(result) # Output: [2.1 4.2 6.3]
print(result.dtype) # Output: float64
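You can ask NumPy which type a combination will produce without running the operation. The sketch below uses np.result_type on a few pairs; some results are wider than either input.

```python
import numpy as np

print(np.result_type(np.int32, np.float32))  # float64
print(np.result_type(np.int8, np.uint8))     # int16
print(np.result_type(np.uint64, np.int64))   # float64

# The same rule applies to actual array arithmetic
x = np.ones(2, dtype=np.int32) + np.ones(2, dtype=np.float32)
print(x.dtype)                               # float64
```

The uint64 and int64 case is a common surprise: no integer type covers both ranges, so the result falls back to float64 and large values can lose precision.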
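3. Using view() (Use with Caution!):
view() creates a new array object that shares the same memory and reinterprets the raw bytes as a different type. No values are converted, so the numbers you see can change completely, and a write through one view affects the other. This avoids a copy but is risky:
arr_int = np.array([1, 2, 3], dtype=np.int32)
arr_view = arr_int.view(np.float32) # Same bytes reinterpreted as float32, no conversion
print(arr_view) # Tiny denormal values (around 1e-45), not [1. 2. 3.]
arr_view[0] = 10.5 # Writes the float32 bit pattern of 10.5 into the shared buffer
print(arr_int[0]) # Output: 1093140480 (0x41280000, the bits of 10.5), not 10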
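To contrast the two approaches, this sketch applies view() and astype() to the same float32 array. Inspecting bit patterns is one case where view() is the intended tool; the names `bits` and `converted` are illustrative.

```python
import numpy as np

f = np.array([1.0, 10.5], dtype=np.float32)

# view(): same bytes, read as unsigned 32-bit integers
bits = f.view(np.uint32)
print([hex(b) for b in bits.tolist()])   # ['0x3f800000', '0x41280000']
print(np.shares_memory(f, bits))         # True

# astype(): values are converted and stored in a new buffer
converted = f.astype(np.uint32)
print(converted.tolist())                # [1, 10]
print(np.shares_memory(f, converted))    # False
```

When the goal is "the same numbers in a different type", astype() is usually the right choice; view() is for reinterpreting memory.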
Understanding NumPy’s data types and conversions is vital for writing efficient and reliable code. Choose appropriate types, prefer astype() when you need the values converted, and use view() cautiously, since it reinterprets raw bytes rather than converting them.