NumPy: Complete Educational Guide
Digital E-Filing Coach - Amanuddin Education | All branches explained with examples, flowchart, mindmap & roadmap
⚠️ Educational Disclaimer: This resource is for educational purposes only and does not constitute legal or professional advice. All code examples are for learning use only.
1. Introduction to NumPy
🔹 What is NumPy?
- NumPy stands for Numerical Python. It is the core library for numerical and data work in Python.
- It provides a special array type called ndarray (N-dimensional array), which is much faster than a normal Python list for numerical operations.
- Without NumPy, data science and machine learning in Python would be far slower, because every calculation would run through Python loops.
- Its core routines are written in C, which is why it is so fast.
Simple Example: Think of a Python list as a handwritten list on paper. A NumPy array is like a spreadsheet on a computer: faster, and able to do maths on every cell at once.
🔹 Installation & Import
# Step 1: Install NumPy using pip (run this in a terminal, not inside Python)
pip install numpy
# Step 2: Import in a Python file
import numpy as np     # "np" is just a short nickname
# Step 3: Check version
print(np.__version__)  # Output: 1.26.x or similar
🔹 NumPy vs Python List: Comparison Table
| Feature | Python List | NumPy Array |
|---|---|---|
| Speed | Slow for numerical work | ⚡ Very fast (C-based) |
| Data Types | Mixed (int, str together) | Same type only (homogeneous) |
| Memory | Uses more memory | Uses less memory |
| Math Operations | Need loops | Works on whole array at once |
| Dimensions | Nested lists needed beyond 1D | 1D, 2D, 3D, ND easily |
| Syntax | [1,2,3] | np.array([1,2,3]) |
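The differences in the table can be seen in a few lines. This is only a small sketch; the `prices` values are made-up sample data, not part of the original guide.

```python
import numpy as np

prices = [100, 250, 75]            # hypothetical sample data

# Plain list: arithmetic needs an explicit loop or comprehension
doubled_list = [p * 2 for p in prices]

# NumPy array: the multiplication applies to every element at once
doubled_arr = np.array(prices) * 2

print(doubled_list)        # [200, 500, 150]
print(doubled_arr)         # [200 500 150]

# Multiplying a list repeats it instead of doing maths
print(prices * 2)          # [100, 250, 75, 100, 250, 75]

# Lists may mix types; an array converts everything to one common dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)         # float64
```

Notice that `prices * 2` on a list gives a longer list, while the same expression on an array doubles each value.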
2. Array Creation
🔹 Creating Arrays from Data
- np.array(): Convert a Python list into a NumPy array.
- Can create 1D (single row), 2D (rows × columns), or 3D arrays.
import numpy as np
# 1D Array (like one row of numbers)
a = np.array([1, 2, 3, 4, 5])
print(a)          # [1 2 3 4 5]
# 2D Array (like a table with 2 rows, 3 columns)
b = np.array([[1,2,3], [4,5,6]])
print(b.shape)    # (2, 3)
# Specify data type
c = np.array([1.5, 2.5], dtype='float32')
🔹 Special Array Creators
# All zeros, like blank cells in Excel
np.zeros((3, 3))          # 3×3 array of 0s
# All ones
np.ones((2, 4))           # 2×4 array of 1s
# Range of numbers (like Python range; the stop value is excluded)
np.arange(0, 10, 2)       # [0 2 4 6 8]
# Evenly spaced numbers (the end value is included)
np.linspace(0, 1, 5)      # [0.   0.25 0.5  0.75 1.  ]
# Identity matrix (diagonal = 1, rest = 0)
np.eye(3)
# Random floats from 0 up to (but not including) 1
np.random.rand(3, 3)
# Random integers from 1 to 99 (the upper bound 100 is excluded)
np.random.randint(1, 100, size=(3,3))
🔹 Array Creation Methods: Table
| Function | What it does | Example Output |
|---|---|---|
| np.array([1,2,3]) | Create from list | [1 2 3] |
| np.zeros((2,3)) | All zeros | [[0,0,0],[0,0,0]] |
| np.ones((2,2)) | All ones | [[1,1],[1,1]] |
| np.arange(0,10,2) | Even numbers 0-8 | [0 2 4 6 8] |
| np.linspace(0,1,5) | 5 equally spaced values | [0.0, 0.25, 0.5, 0.75, 1.0] |
| np.eye(3) | Identity matrix 3×3 | Diagonal = 1 |
| np.random.rand(3) | 3 random floats 0-1 | [0.43, 0.76, 0.12] |
| np.full((2,2), 7) | Fill with value 7 | [[7,7],[7,7]] |
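The table lists np.full(), which the code above does not show. The sketch below runs it, and also shows NumPy's newer random generator (np.random.default_rng) as an alternative to np.random.rand and np.random.randint. The seed 42 is an arbitrary choice made here so the numbers repeat on every run.

```python
import numpy as np

# A constant-filled array
sevens = np.full((2, 2), 7)
print(sevens.tolist())             # [[7, 7], [7, 7]]

# Generator API; the seed value 42 is an arbitrary assumption for repeatability
rng = np.random.default_rng(42)
floats = rng.random((3, 3))                 # floats in [0, 1)
ints = rng.integers(1, 100, size=(3, 3))    # integers from 1 to 99 (100 excluded)

print(floats.shape, ints.shape)    # (3, 3) (3, 3)
```

The actual random values depend on the seed and the NumPy version, so they are not shown.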
3. Array Properties & Manipulation
🔹 Key Array Properties
- shape: The size of each dimension. For example, (3, 4) means 3 rows, 4 columns.
- ndim: Number of dimensions. 1D = 1, 2D = 2, 3D = 3.
- dtype: The type of data stored (int32, float64, etc.).
- size: Total number of elements (rows × columns for a 2D array).
- itemsize: Memory used by one element, in bytes.
a = np.array([[1,2,3],[4,5,6]])
print(a.shape)     # (2, 3): 2 rows, 3 columns
print(a.ndim)      # 2: it's a 2D array
print(a.dtype)     # int64 on most platforms (int32 on Windows with NumPy 1.x)
print(a.size)      # 6: 2×3 = 6 elements
print(a.itemsize)  # 8: 8 bytes per int64
print(a.nbytes)    # 48: 8×6 = 48 bytes total
🔹 Reshape & Flatten
- reshape(): Change the shape without changing data, e.g. (6,) → (2,3).
- flatten(): Always returns a copy as a 1D array.
- ravel(): Returns a 1D array (a view when possible, otherwise a copy).
- transpose() / .T: Swap rows and columns.
a = np.arange(1, 7)      # [1 2 3 4 5 6]
b = a.reshape(2, 3)      # [[1,2,3],[4,5,6]]
c = b.flatten()          # [1 2 3 4 5 6], back to 1D
d = b.T                  # [[1,4],[2,5],[3,6]], transposed
print(b.reshape(3, 2))   # [[1,2],[3,4],[5,6]]
🔹 Array Properties: Summary Table
| Attribute/Method | Meaning | Example |
|---|---|---|
| .shape | Dimensions tuple | (2,3) = 2 rows, 3 cols |
| .ndim | Number of dimensions | 2 for a 2D array |
| .dtype | Data type of elements | float64, int32 |
| .size | Total elements | 6 for (2,3) |
| .reshape(r,c) | Change shape | (6,) → (2,3) |
| .flatten() | 1D copy | (2,3) → (6,) |
| .T | Transpose (flip) | rows ↔ columns |
| .astype() | Change data type | int → float |
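The copy-versus-view difference between flatten() and ravel() is easiest to see by changing a value. The following is a small sketch of that idea; it also shows reshape with -1 and astype(), using only standard NumPy calls.

```python
import numpy as np

b = np.arange(1, 7).reshape(2, 3)   # [[1 2 3], [4 5 6]]

flat_copy = b.flatten()   # always a copy
flat_view = b.ravel()     # a view here, because b is stored contiguously

flat_copy[0] = 99         # does not touch b
print(b[0, 0])            # 1
flat_view[0] = 99         # writes through to b
print(b[0, 0])            # 99

# -1 asks NumPy to work out that dimension from the array's size
print(b.reshape(3, -1).shape)         # (3, 2)

# astype returns a new array with the requested dtype
print(b.astype('float64').dtype)      # float64
```

If the data must stay untouched, flatten() is the safer choice; ravel() avoids the extra memory when a view is enough.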
4. Mathematical Operations
🔹 Element-wise Arithmetic
- In NumPy, +, −, ×, ÷ (written +, -, *, / in code) work on every element at the same time, with no loop needed.
- This is called a vectorized operation, and it is much faster than a Python loop.
a = np.array([10, 20, 30])
b = np.array([1, 2, 3])
print(a + b) # [11 22 33]
print(a - b) # [ 9 18 27]
print(a * b) # [10 40 90]
print(a / b) # [10. 10. 10.]
print(a ** 2)  # [100 400 900], each element squared
print(a % 3)   # [1 2 0], modulo (remainder)
🔹 Universal Functions (ufuncs)
- ufuncs = universal functions: they work on every element automatically.
- Much faster than writing a Python for loop.
a = np.array([4, 9, 16, 25])
np.sqrt(a)            # [2. 3. 4. 5.], square root
np.square(a)          # [16 81 256 625], square
np.abs([-3, -1, 2])   # [3 1 2], absolute value
np.exp([1,2])         # [2.718 7.389] (rounded), e^x
np.log([1,np.e])      # [0. 1.], natural log
np.sin(np.pi/2)       # 1.0, trigonometry
🔹 Broadcasting: Magic of NumPy!
- Broadcasting lets NumPy combine arrays of different but compatible shapes by automatically stretching the smaller one to match the larger one.
- Example: Add 10 to every element without a loop.
a = np.array([[1,2,3],
              [4,5,6]])
print(a + 10)   # [[11,12,13],[14,15,16]], 10 added to all
print(a * 2)    # [[2,4,6],[8,10,12]], doubled
| Operation | NumPy Function | Example Input → Output |
|---|---|---|
| Square root | np.sqrt() | [4,9] → [2,3] |
| Power / Exponent | np.power(a,2) | [3,4] → [9,16] |
| Absolute value | np.abs() | [-3,2] → [3,2] |
| Natural log | np.log() | [1,e] → [0,1] |
| Base-10 log | np.log10() | [10,100] → [1,2] |
| Exponential e^x | np.exp() | [0,1] → [1, 2.718] |
| Sin / Cos / Tan | np.sin(), np.cos(), np.tan() | np.sin(π/2) → 1.0 |
| Sum of all | np.sum() | [1,2,3] → 6 |
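Broadcasting also works between two arrays, not only between an array and a single number. The sketch below assumes a small made-up `marks` table and two hypothetical bonus arrays to show how a row and a column are each stretched to fit.

```python
import numpy as np

marks = np.array([[50, 60, 70],
                  [80, 90, 100]])      # shape (2, 3), hypothetical marks

bonus_per_subject = np.array([1, 2, 3])        # shape (3,)
bonus_per_student = np.array([[10], [20]])     # shape (2, 1)

# (2, 3) + (3,): the row is repeated for each of the 2 rows
print(marks + bonus_per_subject)
# [[ 51  62  73]
#  [ 81  92 103]]

# (2, 3) + (2, 1): the column is repeated across the 3 columns
print(marks + bonus_per_student)
# [[ 60  70  80]
#  [100 110 120]]
```

Shapes are compared from the last dimension backwards; each pair must be equal or one of them must be 1. Otherwise NumPy raises a ValueError.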
5. Indexing & Slicing
🔹 Basic Indexing
- Indexing means picking one or more elements from an array by position number.
- Counting starts from 0 (first element = index 0).
- Negative index: -1 = last element, -2 = second last, etc.
a = np.array([10, 20, 30, 40, 50])
print(a[0])     # 10, first element
print(a[2])     # 30, third element
print(a[-1])    # 50, last element
# 2D Array indexing [row, column]
b = np.array([[1,2,3],[4,5,6]])
print(b[0, 1])  # 2, row 0, column 1
print(b[1, 2])  # 6, row 1, column 2
🔹 Slicing (Getting a Range)
- Syntax: arr[start : stop : step] picks elements from start up to stop-1.
a = np.array([10, 20, 30, 40, 50])
print(a[1:4])    # [20 30 40], index 1,2,3
print(a[:3])     # [10 20 30], first 3
print(a[2:])     # [30 40 50], from index 2
print(a[::2])    # [10 30 50], every 2nd
print(a[::-1])   # [50 40 30 20 10], reversed!
🔹 Fancy Indexing & Boolean Masking
a = np.array([10, 20, 30, 40, 50])
# Fancy indexing: pick specific positions
print(a[[0, 2, 4]])   # [10 30 50]
# Boolean mask: filter values > 25
mask = a > 25
print(mask)           # [False False  True  True  True]
print(a[mask])        # [30 40 50]
# One-liner version
print(a[a > 25])      # [30 40 50], same result
| Type | Syntax | What It Picks |
|---|---|---|
| Single index | a[2] | 3rd element |
| Negative index | a[-1] | Last element |
| Slice | a[1:4] | Index 1,2,3 |
| Step slice | a[::2] | Every 2nd element |
| Fancy index | a[[0,2,4]] | Elements at 0,2,4 |
| Boolean mask | a[a>25] | All values > 25 |
| 2D index | b[1,2] | Row 1, Column 2 (of a 2D array b) |
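The same ideas extend to 2D slices and to masks with more than one condition. The sketch below also shows one behaviour worth knowing: a basic slice is a view of the original array, while fancy indexing gives a copy. The 3×3 array `b` is made up for this illustration.

```python
import numpy as np

b = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(b[:2, 1:].tolist())   # rows 0-1, columns 1-2: [[2, 3], [5, 6]]
print(b[:, 0])              # first column: [1 4 7]

# Combine conditions with & (and) or | (or); each side needs parentheses
a = np.array([10, 20, 30, 40, 50])
print(a[(a > 15) & (a < 45)])    # [20 30 40]

# A basic slice is a view: writing to it changes the original array
window = a[1:3]
window[0] = 0
print(a)                          # [10  0 30 40 50]

# Fancy indexing returns a copy, so the original is left alone
picked = a[[0, 4]]
picked[0] = -1
print(a[0])                       # 10
```

When an independent piece is needed from a slice, calling .copy() on it avoids changing the original by accident.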
6. Statistical Functions
🔹 What are Statistical Functions?
- NumPy has built-in functions to analyze data, for example to find how much students scored on average.
- These are used in data science, finance, research, and machine learning.
- For a 2D array, use axis=0 for column-wise and axis=1 for row-wise calculations.
data = np.array([45, 60, 72, 88, 55, 91, 67])
print("Mean:", np.mean(data))         # 68.29 (rounded)
print("Median:", np.median(data))     # 67.0
print("Std Dev:", np.std(data))       # 15.62 (rounded; population formula)
print("Variance:", np.var(data))      # 243.92 (rounded)
print("Min:", np.min(data))           # 45
print("Max:", np.max(data))           # 91
print("Sum:", np.sum(data))           # 478
print("50th %ile:", np.percentile(data, 50))  # 67.0
Real Example: A teacher has marks: [45, 60, 72, 88, 55, 91, 67].
The class average (mean) is about 68.29. The middle mark is 67 (median): three students scored below it and three above.
Four of the seven marks lie within ±15.62 of the average (std dev ≈ 15.62).
| Function | What It Calculates | Simple Meaning |
|---|---|---|
| np.mean() | Average | Sum ÷ Count |
| np.median() | Middle value | Sort then pick middle |
| np.std() | Standard deviation | How spread out values are |
| np.var() | Variance | std² (square of the standard deviation) |
| np.min() | Smallest value | Lowest score |
| np.max() | Largest value | Highest score |
| np.sum() | Total sum | Add all values |
| np.percentile(a,p) | p% of data below | 50th percentile = median |
| np.cumsum() | Running total | [1,2,3] → [1,3,6] |
| np.corrcoef() | Correlation | How related 2 arrays are |
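The axis argument and a few functions from the table are easier to follow on a small 2D example. In this sketch the `marks` table (3 students, 2 subjects) is made up; the `data` array is the teacher's marks from the example above. The ddof=1 option is a standard parameter of np.std for the sample formula.

```python
import numpy as np

# Hypothetical marks: 3 students (rows) x 2 subjects (columns)
marks = np.array([[40, 60],
                  [50, 70],
                  [60, 80]])

print(np.mean(marks, axis=0))   # per subject (down the columns): [50. 70.]
print(np.mean(marks, axis=1))   # per student (along each row):   [50. 60. 70.]

data = np.array([45, 60, 72, 88, 55, 91, 67])
print(round(np.std(data), 2))           # 15.62, population std (divides by n)
print(round(np.std(data, ddof=1), 2))   # 16.87, sample std (divides by n - 1)

print(np.cumsum([1, 2, 3]))             # [1 3 6]

# corrcoef returns a 2x2 matrix; entry [0, 1] is the correlation of x with y
x = np.array([1, 2, 3, 4])
y = 2 * x + 1                           # a perfect straight-line relationship
print(np.corrcoef(x, y)[0, 1])          # very close to 1.0
```

Which std formula to use depends on whether the data is the whole group (population) or only a sample of it.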
7. Sorting & Searching
🔹 Sorting Arrays
- np.sort(): Returns a sorted copy of the array (original unchanged).
- np.argsort(): Returns the index positions that would sort the array.
- np.sort(a, axis=0): Sorts each column; axis=1 sorts each row.
scores = np.array([72, 45, 88, 33, 67])
print(np.sort(scores))        # [33 45 67 72 88], sorted
print(np.argsort(scores))     # [3 1 4 0 2], positions
# Reverse sort (descending)
print(np.sort(scores)[::-1])  # [88 72 67 45 33]
# Get rank of each element (0 = lowest score)
ranks = np.argsort(np.argsort(scores))
print(ranks)                  # [3 1 4 0 2], rank of each score
🔹 Searching in Arrays
a = np.array([5, 10, 15, 20, 25])
# Find WHERE condition is True
result = np.where(a > 12)
print(result)                   # (array([2, 3, 4]),), the matching indices
# Replace with condition: if >12 keep, else replace with 0
print(np.where(a > 12, a, 0))   # [ 0  0 15 20 25]
# Find unique values
b = np.array([1,2,2,3,1,4])
print(np.unique(b))             # [1 2 3 4]
# searchsorted: where to insert a value to keep sorted order
print(np.searchsorted(a, 13))   # 2, insert at index 2
| Function | Purpose | Example |
|---|---|---|
| np.sort() | Sorted copy (ascending) | [3,1,2] → [1,2,3] |
| np.argsort() | Sorted index positions | [3,1,2] → [1,2,0] |
| np.where(cond) | Indices where True | a>5 → positions |
| np.where(c,x,y) | Replace based on condition | if >5: keep, else 0 |
| np.unique() | Remove duplicates | [1,1,2,3] → [1,2,3] |
| np.argmin() | Index of minimum | [5,2,8] → 1 |
| np.argmax() | Index of maximum | [5,2,8] → 2 |
| np.searchsorted() | Where to insert value | sorted array insert pos |
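A common use of argsort, argmax and argmin is to reorder or look up a second array. The sketch below pairs the scores from above with a made-up `names` array (the names are hypothetical) and also shows the return_counts option of np.unique.

```python
import numpy as np

names = np.array(['Asha', 'Bilal', 'Chen', 'Dev', 'Esha'])   # hypothetical names
scores = np.array([72, 45, 88, 33, 67])

order = np.argsort(scores)[::-1]    # indices from highest to lowest score
print(names[order])                 # ['Chen' 'Asha' 'Esha' 'Bilal' 'Dev']

print(names[np.argmax(scores)])     # Chen (top scorer)
print(names[np.argmin(scores)])     # Dev (lowest scorer)

# unique can also report how many times each value occurs
values, counts = np.unique(np.array([1, 2, 2, 3, 1, 4]), return_counts=True)
print(values)    # [1 2 3 4]
print(counts)    # [2 2 1 1]
```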
8. Linear Algebra (np.linalg)
🔹 What is Linear Algebra in NumPy?
- Linear algebra deals with vectors and matrices, and is used heavily in Machine Learning and Physics.
- np.dot(): Dot product; for 2D arrays it performs matrix multiplication.
- np.linalg: A sub-module for more advanced linear algebra.
A = np.array([[1,2],[3,4]])
B = np.array([[5,6],[7,8]])
# Matrix Multiplication
print(np.dot(A, B))
# [[19 22]   (1×5+2×7), (1×6+2×8)
#  [43 50]]  (3×5+4×7), (3×6+4×8)
# Determinant
print(np.linalg.det(A))   # about -2.0 (floating-point rounding may show -2.0000000000000004)
# Inverse matrix
print(np.linalg.inv(A))   # about [[-2, 1],[1.5, -0.5]]
# Eigenvalues & Eigenvectors
vals, vecs = np.linalg.eig(A)
# Solve linear equations Ax = b
b = np.array([5, 11])
x = np.linalg.solve(A, b)
print(x)                  # [1. 2.], so x = 1 and y = 2
| Function | What it does | Used in |
|---|---|---|
| np.dot(A,B) | Matrix multiplication | ML, Physics |
| np.matmul(A,B) | Same as dot for 2D | Deep Learning |
| np.linalg.det() | Determinant of matrix | Solving equations |
| np.linalg.inv() | Inverse of matrix | Regression |
| np.linalg.eig() | Eigenvalues & vectors | PCA (ML) |
| np.linalg.solve() | Solve Ax=b | Linear equations |
| np.cross() | Cross product | 3D geometry, Physics |
| np.linalg.norm() | Length (magnitude) | Distance, Vectors |
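Results from np.linalg are floating-point numbers, so it is usually safer to check them with np.allclose than with ==. The following sketch checks the solve and inverse results from the example above, and runs the norm and cross functions from the table. The vector `v` is made up for this illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
b = np.array([5, 11])

x = np.linalg.solve(A, b)            # solves x + 2y = 5 and 3x + 4y = 11
print(np.allclose(A @ x, b))         # True; @ is the matrix-multiplication operator
print(np.allclose(x, [1, 2]))        # True

# A times its inverse should give the identity matrix (up to rounding)
print(np.allclose(A @ np.linalg.inv(A), np.eye(2)))   # True

v = np.array([3, 4])
print(np.linalg.norm(v))             # 5.0, length of the vector (3, 4)

print(np.cross([1, 0, 0], [0, 1, 0]))   # [0 0 1]
```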
9. File Input / Output (I/O)
🔹 Saving & Loading Arrays
- np.save(): Saves an array as a binary .npy file (very fast).
- np.load(): Loads a saved .npy or .npz file.
- np.savez(): Saves multiple arrays in one .npz archive.
- np.savetxt(): Saves as a human-readable text/CSV file.
- np.loadtxt(): Loads from a text/CSV file.
a = np.array([1, 2, 3, 4, 5])
# Save as binary (.npy), the fastest method
np.save('mydata.npy', a)
loaded = np.load('mydata.npy')
# Save multiple arrays in one file
np.savez('archive.npz', arr1=a, arr2=a*2)
data = np.load('archive.npz')
print(data['arr1'])
# Save as readable text (CSV style)
np.savetxt('data.csv', a, delimiter=',', fmt='%d')
# Load from text file (values come back as float64 by default)
loaded_txt = np.loadtxt('data.csv', delimiter=',')
| Function | Format | When to Use |
|---|---|---|
| np.save() | .npy (binary) | Fast save for single array |
| np.load() | .npy or .npz | Load saved NumPy files |
| np.savez() | .npz (multiple) | Save multiple arrays together |
| np.savetxt() | .txt / .csv | Human-readable export |
| np.loadtxt() | .txt / .csv | Import CSV data |
| np.genfromtxt() | .csv with gaps | CSV with missing values |
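The table mentions np.genfromtxt() for CSV data with gaps. Here is a minimal sketch of that case. To keep it self-contained, an in-memory string (io.StringIO) stands in for a real file, and the CSV content is made up.

```python
import io
import numpy as np

# In-memory text standing in for a CSV file with one missing value
csv_text = io.StringIO("1,2,3\n4,,6\n")

data = np.genfromtxt(csv_text, delimiter=',')
print(data)
# [[ 1.  2.  3.]
#  [ 4. nan  6.]]

print(np.isnan(data).sum())     # 1 missing entry
print(np.nanmean(data))         # 3.2, the mean of the 5 values present
```

The missing cell becomes nan (not a number). Functions such as np.nanmean skip nan values, while np.mean would return nan.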
10. Advanced Concepts
🔹 Array Manipulation Functions
a = np.array([1,2,3])
b = np.array([4,5,6])
# Stack vertically (add rows)
np.vstack([a, b]) # [[1,2,3],[4,5,6]]
# Stack horizontally (add columns)
np.hstack([a, b]) # [1,2,3,4,5,6]
# Concatenate along axis
np.concatenate([a, b]) # [1,2,3,4,5,6]
# Split into parts
np.split(a, 3) # [array([1]), array([2]), array([3])]
# Tile: repeat the whole array like tiles
np.tile(a, 2) # [1,2,3,1,2,3]
# Repeat each element
np.repeat(a, 2) # [1,1,2,2,3,3]
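🔹 Vectorization (Replacing Python Loops)
- Vectorization means applying an operation to all elements at once instead of writing an explicit loop.
- np.vectorize(): Wraps any Python function so it accepts arrays. It is a convenience, not a speed-up: internally it still calls the function once per element.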
def tax(income):
    # 30% above 500000, otherwise 10% (simplified rates for learning only)
    return income * 0.3 if income > 500000 else income * 0.1
v_tax = np.vectorize(tax)
incomes = np.array([300000, 600000, 200000, 800000])
print(v_tax(incomes))
# [ 30000. 180000.  20000. 240000.]
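For a simple if/else rule like this one, np.where can do the same job with real array operations. This is one possible alternative, not the only way; the threshold and rates are copied from the tax() example above.

```python
import numpy as np

incomes = np.array([300000, 600000, 200000, 800000])

# Where the condition is True take 30% of income, otherwise take 10%
taxes = np.where(incomes > 500000, incomes * 0.3, incomes * 0.1)
print(taxes)    # [ 30000. 180000.  20000. 240000.]
```

Both branches are computed for the whole array first and np.where then picks between them, so this style suits cheap calculations on large arrays.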
🔹 Structured Arrays & Memory Layout
# Structured array: like a table with named columns
dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('salary', 'f8')])
people = np.array([('Amanuddin', 35, 75000.0),
                   ('Rahul', 28, 50000.0)], dtype=dt)
print(people['name'])     # ['Amanuddin' 'Rahul']
print(people['salary'])   # [75000. 50000.]
| Concept | What It Means | Example |
|---|---|---|
| vstack | Stack rows on top of each other | [1,2] + [3,4] → [[1,2],[3,4]] |
| hstack | Stack columns side by side | [1,2] + [3,4] → [1,2,3,4] |
| concatenate | Join arrays along axis | Flexible joining |
| split | Divide array into parts | [1-6] → [1-2],[3-4],[5-6] |
| tile | Repeat whole array n times | [1,2] × 3 → [1,2,1,2,1,2] |
| repeat | Repeat each element n times | [1,2] × 2 → [1,1,2,2] |
| vectorize | Apply any function to array | Custom tax function on all |
| Structured array | Named columns like a table | name, age, salary columns |
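The stacking examples above use 1D arrays. With 2D arrays the axis argument decides whether rows or columns are added. The sketch below uses two made-up 2×2 arrays, and also shows np.array_split (for parts that do not divide evenly) and a boolean mask on one field of the structured array from above.

```python
import numpy as np

m = np.array([[1, 2],
              [3, 4]])
n = np.array([[5, 6],
              [7, 8]])

print(np.concatenate([m, n], axis=0).shape)   # (4, 2): rows added
print(np.concatenate([m, n], axis=1).shape)   # (2, 4): columns added
print(np.hstack([m, n]))
# [[1 2 5 6]
#  [3 4 7 8]]

# split needs equal parts; array_split allows a remainder
parts = np.array_split(np.arange(1, 8), 3)
print([p.tolist() for p in parts])            # [[1, 2, 3], [4, 5], [6, 7]]

# Structured arrays can be filtered with a mask on one field
dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('salary', 'f8')])
people = np.array([('Amanuddin', 35, 75000.0),
                   ('Rahul', 28, 50000.0)], dtype=dt)
print(people[people['age'] > 30]['name'])     # ['Amanuddin']
```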
[Figure: NumPy Learning Flowchart]
[Figure: NumPy Mind Map]
[Figure: NumPy Learning Roadmap]
