# NumPy matrix multiplication (From Theory to Real-World Examples)

Matrix multiplication is a fundamental operation in linear algebra that takes a pair of matrices and produces another matrix.

In mathematical terms, given two matrices $A$ and $B$, each entry of the product $AB$ is the dot product of a row of $A$ with a column of $B$.
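Written out entry by entry, the definition above takes the following standard form. The index names $i$, $j$, $k$ and the dimensions $m$, $n$, $p$ are notation introduced here for illustration only:

$$
(AB)_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}, \qquad A \in \mathbb{R}^{m \times n},\quad B \in \mathbb{R}^{n \times p},\quad AB \in \mathbb{R}^{m \times p}
$$

The shared dimension $n$ is why the number of columns of $A$ has to match the number of rows of $B$.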

This tutorial aims to provide an in-depth understanding of matrix multiplication using NumPy.

**Table of Contents**

- 1 Element-wise vs. Matrix Multiplication
- 2 Multiplication using np.dot
- 3 The np.matmul Function
- 4 The @ Operator
- 5 Comparing np.dot, np.matmul, and @
- 6 Multiplication with Scalars
- 7 In-place Multiplication with a Scalar
- 8 Time Complexity of np.dot, np.matmul, and @
- 9 Parallel Processing and GPU Acceleration
- 10 Type Errors
- 11 Real-World Example Using Matrix Multiplication
- 12 Resources

## Element-wise vs. Matrix Multiplication

Element-wise multiplication, also known as the Hadamard product, multiplies the corresponding elements of two matrices of the same shape.

Example:

```
import numpy as np
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = A * B
print(result)
```

Output:

```
[[ 5 12]
 [21 32]]
```

Each element in the resulting matrix is the product of the corresponding elements in matrices $A$ and $B$:

- $1 \times 5 = 5$
- $2 \times 6 = 12$
- $3 \times 7 = 21$
- $4 \times 8 = 32$

Matrix multiplication, on the other hand, requires that the number of columns in the first matrix is equal to the number of rows in the second matrix.

It’s a more complex operation that involves the dot product of rows and columns.

Using the same matrices $A$ and $B$, we get:

```
result = np.dot(A, B)
print(result)
```

Output:

```
[[19 22]
 [43 50]]
```
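To make the row-by-column rule concrete, here is a minimal sketch that computes the same product with plain loops and then shows what happens when the shapes are incompatible. The helper `matmul_loops` and the matrix `C` are hypothetical names introduced for this illustration; they are not part of NumPy.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])


def matmul_loops(X, Y):
    """Hypothetical helper: textbook row-by-column product using plain loops."""
    rows, inner = X.shape
    inner_y, cols = Y.shape
    if inner != inner_y:
        raise ValueError(f"shapes {X.shape} and {Y.shape} are not aligned")
    out = np.zeros((rows, cols), dtype=np.result_type(X, Y))
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                # Accumulate the dot product of row i of X and column j of Y
                out[i, j] += X[i, k] * Y[k, j]
    return out


manual = matmul_loops(A, B)
print(manual)
print(np.array_equal(manual, np.dot(A, B)))  # True

# A has 2 columns but C has 3 rows, so the product is undefined
C = np.ones((3, 2))
try:
    np.dot(A, C)
except ValueError as err:
    print("Shape mismatch:", err)
```

The loop version is only meant to expose the arithmetic; in practice the NumPy functions are much faster.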

## Multiplication using np.dot

The `np.dot` function in NumPy is one of the primary ways to perform matrix multiplication. It handles both two-dimensional arrays (matrices) and one-dimensional arrays (vectors).

**Multiplying Matrices**

Here’s an example using the `np.dot` function to multiply two matrices:

```
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = np.dot(A, B)
print(result)
```

Output:

```
[[19 22]
 [43 50]]
```

The output follows the standard row-by-column rule: the entry in row $i$ and column $j$ of the result is the dot product of row $i$ of the first matrix with column $j$ of the second. For example, the top-left entry is $1 \times 5 + 2 \times 7 = 19$.

**Multiplying Vectors**

`np.dot` can also be used to compute the dot product of two vectors:

```
v1 = np.array([3, 4])
v2 = np.array([5, 6])
result = np.dot(v1, v2)
print(result)
```

Output:

39

The output is the dot product of the vectors `v1` and `v2`, computed as $3 \times 5 + 4 \times 6 = 39$.
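Since `np.dot` accepts both matrices and vectors, the two can also be mixed. The sketch below, using an illustrative vector `v` that is not in the original examples, shows that a one-dimensional array is treated as a column vector when it appears on the right and as a row vector when it appears on the left; in both cases the result is one-dimensional.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
v = np.array([3, 4])

Av = np.dot(A, v)  # matrix times vector: [1*3 + 2*4, 3*3 + 4*4]
vA = np.dot(v, A)  # vector times matrix: [3*1 + 4*3, 3*2 + 4*4]

print(Av, Av.shape)  # [11 25] (2,)
print(vA, vA.shape)  # [15 22] (2,)
```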

## The np.matmul Function

Unlike `np.dot`, `np.matmul` is specifically designed for matrix multiplication and provides a more explicit way to perform this operation.

Here’s how to use `np.matmul` to multiply two matrices:

```
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = np.matmul(A, B)
print(result)
```

Output:

```
[[19 22]
 [43 50]]
```

The output is identical to the result obtained with `np.dot`, as `np.matmul` also performs standard matrix multiplication.

### Multi-dimensional arrays multiplication

`np.matmul` also handles multi-dimensional arrays and provides specific behavior for array broadcasting.

Example:

```
A = np.random.rand(2, 3, 4)
B = np.random.rand(2, 4, 3)
result = np.matmul(A, B)
print(result.shape)
```

Output:

(2, 3, 3)

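In this case, `np.matmul` performs matrix multiplication over the last two dimensions of the input arrays and broadcasts over the remaining dimensions. Here the leading dimension acts as a batch: each of the two `(3, 4)` matrices in `A` is multiplied by the matching `(4, 3)` matrix in `B`, so the resulting shape is `(2, 3, 3)`.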
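The batch behavior can be checked directly. The following sketch compares `np.matmul` on stacked arrays with a per-matrix loop, and then multiplies the whole stack by a single matrix to show broadcasting over the leading dimension. The names `stacked`, `W`, and `shared` are illustrative, and a seeded generator is used only to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2, 3, 4))
B = rng.random((2, 4, 3))

# Batched product: one (3, 4) @ (4, 3) multiplication per leading index
batched = np.matmul(A, B)

# The same result built one matrix at a time
stacked = np.stack([A[i] @ B[i] for i in range(A.shape[0])])
print(np.allclose(batched, stacked))  # True

# A single (4, 3) matrix is broadcast across both matrices in the stack
W = rng.random((4, 3))
shared = np.matmul(A, W)
print(shared.shape)  # (2, 3, 3)
```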

## The @ Operator

In Python 3.5 and later versions, you can use the `@` operator as a shorthand for matrix multiplication. This operator provides a clean and concise way to multiply matrices.

Here’s an example of how to use the `@` operator to multiply two matrices:

```
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = A @ B
print(result)
```

Output:

```
[[19 22]
 [43 50]]
```

The result is identical to that obtained with `np.dot` or `np.matmul`. The `@` operator simplifies the code and makes the matrix multiplication operation visually clear.

The `@` operator can also be used with higher-dimensional arrays, just like `np.matmul`, following the same broadcasting rules.

## Comparing np.dot, np.matmul, and @

Understanding the differences and similarities between `np.dot`, `np.matmul`, and the `@` operator can help you choose the right tool for your specific needs.

**np.dot**

- Can be used for both matrix multiplication and vector dot products.
- Handles two-dimensional arrays as general matrix multiplication.
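- For higher-dimensional arrays, its behavior differs from `np.matmul`: it sums over the last axis of the first array and the second-to-last axis of the second, without broadcasting the leading axes.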

**np.matmul**

- Specifically designed for matrix multiplication.
- Provides consistent behavior across different dimensions.
- Treats the last two dimensions as matrices and broadcasts over the remaining leading dimensions of multi-dimensional arrays.

**@ Operator**

- For Python 3.5 and above.
- Provides a clean and concise notation for matrix multiplication.
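Two of these differences are easy to see in a short experiment. The sketch below uses arrays of ones purely to compare result shapes, and a small matrix `M` (an illustrative name) to compare how the two functions treat a scalar operand.

```python
import numpy as np

A = np.ones((2, 3, 4))
B = np.ones((2, 4, 3))

# np.matmul pairs up the leading (batch) dimension
print(np.matmul(A, B).shape)  # (2, 3, 3)

# np.dot combines every matrix in A with every matrix in B
print(np.dot(A, B).shape)     # (2, 3, 2, 3)

M = np.array([[1, 2], [3, 4]])

# np.dot accepts a scalar and behaves like element-wise scaling
print(np.dot(M, 2))

# np.matmul (and therefore @) rejects scalar operands
try:
    np.matmul(M, 2)
except ValueError as err:
    print("np.matmul rejects scalars:", err)
```

For ordinary two-dimensional matrices the three options give the same result, so the choice mostly matters for stacked arrays and scalar operands.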

## Multiplication with Scalars

In addition to multiplying matrices, you can multiply a matrix by a scalar (a single numerical value). This operation multiplies every element of the matrix by the scalar.

```
import numpy as np

A = np.array([[1, 2], [3, 4]])
scalar = 2
result = A * scalar
print(result)
```

Output:

```
[[2 4]
 [6 8]]
```

The output is obtained by multiplying every element in matrix $A$ by the scalar value 2:

- $1 \times 2 = 2$
- $2 \times 2 = 4$
- $3 \times 2 = 6$
- $4 \times 2 = 8$

## In-place Multiplication with a Scalar

In-place multiplication with a scalar means modifying the original matrix by multiplying it with a scalar, without creating a new copy of the matrix.

This can be useful for memory efficiency, especially when dealing with large matrices.

You can use the `*=` operator to multiply a matrix with a scalar in place.

```
import numpy as np

A = np.array([[1, 2], [3, 4]])
scalar = 2
A *= scalar
print(A)
```

Output:

```
[[2 4]
 [6 8]]
```

The matrix `A` is multiplied by 2 in place, meaning that the original matrix `A` is modified without creating a new matrix.

You can also use the `numpy.multiply` function with the `out` argument to achieve in-place multiplication.

```
A = np.array([[1, 2], [3, 4]])
scalar = 2
np.multiply(A, scalar, out=A)  # In-place multiplication
print(A)
```

Output:

```
[[2 4]
 [6 8]]
```

The `np.multiply` function with the `out` argument pointing to the original matrix `A` ensures that the multiplication is performed in place.
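One way to convince yourself that an operation really is in place is to keep a second reference to the array and see whether it observes the change. This is a small sketch; `alias` is just an illustrative second name for the same array object.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
alias = A          # second name bound to the same array object

A *= 2             # in place: the existing buffer is overwritten
print(alias is A)  # True
print(alias)       # the alias sees the doubled values

A = A * 2          # not in place: a new array is created and bound to A
print(alias is A)  # False
print(alias)       # the alias still holds the values from the first step
```

The distinction matters when other parts of a program hold references to the same array: an in-place update changes what they see, while `A = A * 2` does not.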

### Space complexity comparison

We can use the `memory_profiler` package to profile the memory usage of the scalar multiplication.

```
from memory_profiler import memory_usage


def inplace_multiplication(numbers, scalar):
    # Overwrite each element of the existing list
    for i in range(len(numbers)):
        numbers[i] *= scalar


def multiplication_with_scalar(numbers, scalar):
    # Build a second list holding the scaled values
    result = [num * scalar for num in numbers]


def main():
    numbers = [1, 2, 3, 4, 5] * 1000000
    scalar = 2
    mem_usage_inplace = memory_usage((inplace_multiplication, (numbers, scalar)))
    print(f'In-place multiplication memory used: {max(mem_usage_inplace) - min(mem_usage_inplace)} MiB')
    mem_usage_with_scalar = memory_usage((multiplication_with_scalar, (numbers, scalar)))
    print(f'Multiplication with scalar memory used: {max(mem_usage_with_scalar) - min(mem_usage_with_scalar)} MiB')


if __name__ == '__main__':
    main()
```

Output:

```
In-place multiplication memory used: 0.2265625 MiB
Multiplication with scalar memory used: 37.125 MiB
```

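In-place multiplication uses far less additional memory (about 0.23 MiB versus 37 MiB in this run) because no second list is allocated. Note that this benchmark operates on plain Python lists; the same principle applies to NumPy arrays, where `A * scalar` allocates a new array and `A *= scalar` does not.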
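The same comparison can be sketched for NumPy arrays with the standard-library `tracemalloc` module instead of `memory_profiler`. This is an illustration under the assumption that your NumPy build reports its buffer allocations to `tracemalloc` (recent releases do); `peak_bytes` is a hypothetical helper written for this example.

```python
import tracemalloc

import numpy as np


def peak_bytes(func):
    """Hypothetical helper: peak traced allocation in bytes while func runs."""
    tracemalloc.start()
    func()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


A = np.ones(1_000_000)  # float64 array, about 8 MB of data

# Allocates a second 8 MB array for the result
copy_peak = peak_bytes(lambda: np.multiply(A, 2))

# Writes the result back into A's existing buffer
inplace_peak = peak_bytes(lambda: np.multiply(A, 2, out=A))

print(f"With a new array: {copy_peak / 1e6:.2f} MB")
print(f"In place:         {inplace_peak / 1e6:.2f} MB")
```

Exact figures depend on the platform and NumPy version, so no sample output is shown; the point is that the first number is at least the size of the array while the second stays small.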

## Time Complexity of np.dot, np.matmul, and @

Let’s measure the execution time for each method using the `time` module.

```
import numpy as np
import time

A = np.random.rand(5000, 5000)
B = np.random.rand(5000, 5000)

# Measuring time for np.dot (perf_counter is the time module's high-resolution timer)
start_time = time.perf_counter()
result_dot = np.dot(A, B)
end_time = time.perf_counter()
print(f"np.dot execution time: {(end_time - start_time) :.2f} seconds")

# Measuring time for np.matmul
start_time = time.perf_counter()
result_matmul = np.matmul(A, B)
end_time = time.perf_counter()
print(f"np.matmul execution time: {(end_time - start_time) :.2f} seconds")

# Measuring time for @ operator
start_time = time.perf_counter()
result_at_operator = A @ B
end_time = time.perf_counter()
print(f"@ operator execution time: {(end_time - start_time) :.2f} seconds")
```

Output (example times; actual times may vary):

```
np.dot execution time: 1.13 seconds
np.matmul execution time: 1.18 seconds
@ operator execution time: 1.17 seconds
```

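The execution times for `np.dot`, `np.matmul`, and the `@` operator are almost the same. This is expected: for two-dimensional floating-point arrays, all three typically hand the work to the same underlying BLAS routine, and multiplying two $n \times n$ matrices with the standard algorithm takes on the order of $n^3$ operations whichever entry point is used.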
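A single timed run can be noisy, so a common alternative is to repeat the measurement with the standard-library `timeit` module and keep the best time. The sketch below does this on smaller matrices so that it finishes quickly; the dictionary `candidates` and the matrix size of 200 are arbitrary choices made for this illustration.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
A = rng.random((200, 200))
B = rng.random((200, 200))

candidates = {
    "np.dot": lambda: np.dot(A, B),
    "np.matmul": lambda: np.matmul(A, B),
    "@ operator": lambda: A @ B,
}

for name, func in candidates.items():
    # Five rounds of ten calls each; report the fastest round per call
    best = min(timeit.repeat(func, number=10, repeat=5)) / 10
    print(f"{name}: {best * 1e3:.3f} ms per call")

# All three produce the same product (up to floating-point rounding)
same = np.allclose(np.dot(A, B), np.matmul(A, B)) and np.allclose(np.dot(A, B), A @ B)
print(same)  # True
```

Timings will vary by machine and BLAS library, so none are shown here.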

## Parallel Processing and GPU Acceleration

Parallel processing and GPU (Graphics Processing Unit) acceleration can significantly speed up matrix multiplication, especially with large matrices.

NumPy itself does not offer direct GPU acceleration but can leverage libraries that do.

Libraries like CuPy provide a NumPy-compatible interface for GPU-accelerated computations.

First, install the CuPy build that matches your CUDA version. Then, you can perform GPU-accelerated matrix multiplication as follows:

```
import cupy as cp

A_gpu = cp.random.rand(1000, 1000)
B_gpu = cp.random.rand(1000, 1000)

# Matrix multiplication on the GPU
result_gpu = A_gpu @ B_gpu
```

By simply replacing NumPy with CuPy, you can execute the same code on a GPU.

This can lead to a substantial performance increase for large-scale matrix multiplication.

**Considerations**

- GPU acceleration is most effective for large matrices where the parallelism of the GPU can be fully utilized.
- Data transfer between CPU and GPU can introduce overhead, especially for small matrices.
- Not all NumPy functions are supported in GPU-accelerated libraries, so compatibility should be checked.
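Because CuPy mirrors the NumPy interface, one way to keep code portable is to pass the array module in as a parameter. The sketch below is a hypothetical helper, `matmul_on`, that works with either module; it is exercised here with NumPy only, so it runs without a GPU. Passing `cupy` instead would copy the inputs to the device with `cupy.asarray` and bring the result back with `cupy.asnumpy`, which is exactly the transfer overhead mentioned above.

```python
import numpy as np


def matmul_on(xp, a, b):
    """Hypothetical helper: multiply a and b using array module xp (numpy or cupy)."""
    a_dev = xp.asarray(a)  # with CuPy, this copies host data to the GPU
    b_dev = xp.asarray(b)
    result = a_dev @ b_dev
    # CuPy exposes asnumpy() to copy back to the host; NumPy needs no copy
    to_host = getattr(xp, "asnumpy", np.asarray)
    return to_host(result)


rng = np.random.default_rng(0)
A = rng.random((100, 100))
B = rng.random((100, 100))

result = matmul_on(np, A, B)  # swap np for cupy on a machine with a CUDA GPU
print(result.shape)           # (100, 100)
```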

## Type Errors

In matrix multiplication, understanding the data types of the matrices involved is essential.

Mixing integer and float matrices can lead to unexpected results or type errors. Here’s how to handle them:

**Implicit Type Conversion**

When multiplying an integer matrix with a float matrix, NumPy performs implicit type conversion, converting the integer matrix to float.

```
import numpy as np

A = np.array([[1, 2], [3, 4]], dtype=int)
B = np.array([[1.5, 2.5], [3.5, 4.5]], dtype=float)
result = A @ B
print(result.dtype)
```

Output:

float64

Here, the integer matrix `A` is implicitly converted to float before multiplication, and the result is a float matrix.

**Explicit Type Conversion**

Explicitly converting matrices to a common data type can prevent unexpected behavior.

Example:

```
A = A.astype(float)  # Explicitly converting A to float
result = A @ B
```

By explicitly converting `A` to float, you ensure that the multiplication proceeds as expected, with no surprises related to data types.

## Real-World Example Using Matrix Multiplication

### Scaling a 2D Object in Computer Graphics

Suppose you have a 2D triangle defined by its vertices, and you want to scale it.

**1. Define the Object**

Start by defining the coordinates of the triangle’s vertices as a matrix.

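```
import numpy as np

# Vertices of the triangle, one (x, y) pair per row
triangle_vertices = np.array([
    [0, 0],
    [4, 0],
    [2, 3]
])
```

Each row holds the $(x, y)$ coordinates of one vertex. Because scaling is a linear transformation, plain 2D coordinates are sufficient here; homogeneous coordinates (an extra third component) are only needed once translation is involved.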

**2. Define the Scaling Matrix**

Create a scaling matrix that scales by 2 in the x-direction and 3 in the y-direction.

```
scale_x = 2
scale_y = 3
scaling_matrix = np.array([
    [scale_x, 0],
    [0, scale_y]
])
```

**3. Perform the Scaling**

Multiply the vertices by the scaling matrix to scale the triangle.

```
scaled_triangle_vertices = triangle_vertices.dot(scaling_matrix)
print(scaled_triangle_vertices)
```

Output:

```
[[0 0]
 [8 0]
 [4 9]]
```

The triangle has been scaled by a factor of 2 in the x-direction and 3 in the y-direction.
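If the triangle also needs to be moved, the scaling and the shift can be folded into one matrix by switching to homogeneous coordinates. The sketch below is an illustration rather than part of the original example: the shift of `(1, -1)` and the names `homogeneous`, `scaling`, `translation`, and `combined` are assumptions chosen for the demonstration.

```python
import numpy as np

triangle_vertices = np.array([[0, 0], [4, 0], [2, 3]])

# Append a column of ones: (x, y) becomes (x, y, 1)
homogeneous = np.hstack([triangle_vertices, np.ones((3, 1), dtype=int)])

# Scale by 2 in x and 3 in y, as in the example above
scaling = np.array([[2, 0, 0],
                    [0, 3, 0],
                    [0, 0, 1]])

# Hypothetical shift: +1 in x and -1 in y
translation = np.array([[1, 0, 1],
                        [0, 1, -1],
                        [0, 0, 1]])

# Scale first, then translate; matrices apply right to left
combined = translation @ scaling

# Vertices are stored as rows, so multiply by the transpose
transformed = (homogeneous @ combined.T)[:, :2]
print(transformed)
# [[ 1 -1]
#  [ 9 -1]
#  [ 5  8]]
```

Each output vertex is the scaled vertex from the previous step shifted by `(1, -1)`, obtained with a single matrix product.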

Let’s plot the original and scaled triangles with Matplotlib:

```
import matplotlib.pyplot as plt


def plot_triangle(vertices, color, label):
    # Repeat the first vertex at the end so that all three edges are drawn
    closed = np.vstack([vertices, vertices[:1]])
    plt.plot(*closed.T, color=color, label=label)  # Plot the edges
    plt.scatter(*vertices.T, color=color)  # Plot the vertices
    plt.fill(vertices[:, 0], vertices[:, 1], alpha=0.1, color=color)  # Fill the triangle


# Plot the original triangle
plot_triangle(triangle_vertices, color='blue', label='Original Triangle')

# Plot the scaled triangle
plot_triangle(scaled_triangle_vertices, color='red', label='Scaled Triangle')

# Add labels and legend
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.grid(True)
plt.axis('equal')  # Equal scaling ensures that the plot is not distorted
plt.title('Scaling Transformation of a 2D Triangle')
plt.show()
```

Output: a plot titled "Scaling Transformation of a 2D Triangle" showing the original triangle in blue and the scaled triangle in red.

### Image Rotation

First, the image is loaded in color, and the color scheme is converted from BGR to RGB since OpenCV reads images in BGR by default. The necessary libraries are also imported.

```
import numpy as np
from scipy.ndimage import affine_transform
import cv2
import matplotlib.pyplot as plt

image = cv2.imread("image.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert from BGR to RGB
```

**Combine Transformation using matrix multiplication**

Then we define the rotation angle and create the corresponding rotation matrix.

Additionally, we compute the transformation matrix to shift the center of rotation to the center of the image.

Matrix multiplication is then used to combine these transformations into the `full_transform` matrix.

```
angle_degrees = 60
angle_radians = np.radians(angle_degrees)
rotation_matrix = np.array([[np.cos(angle_radians), -np.sin(angle_radians), 0],
                            [np.sin(angle_radians), np.cos(angle_radians), 0],
                            [0, 0, 1]])

height, width, _ = image.shape
center_x, center_y = width // 2, height // 2

# scipy.ndimage addresses pixels as (row, column), i.e. (y, x),
# so the shift to the image center is written in that order
transform_matrix = np.array([[1, 0, -center_y],
                             [0, 1, -center_x],
                             [0, 0, 1]])

# Combine the shift and rotation using matrix multiplication
full_transform = np.dot(np.linalg.inv(transform_matrix), rotation_matrix)
full_transform = np.dot(full_transform, transform_matrix)
```
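A quick way to check a "shift, rotate, shift back" composition is to confirm that the chosen center does not move. The sketch below rebuilds the same kind of matrix without loading an image; the center `(320, 240)` is a hypothetical value, and `to_origin` plays the role of the shift matrix.

```python
import numpy as np

angle = np.radians(60)
rotation = np.array([[np.cos(angle), -np.sin(angle), 0],
                     [np.sin(angle), np.cos(angle), 0],
                     [0, 0, 1]])

# Hypothetical rotation center, given in the same axis order as the matrices
center = np.array([320, 240])
to_origin = np.array([[1, 0, -center[0]],
                      [0, 1, -center[1]],
                      [0, 0, 1]])

# Shift the center to the origin, rotate, then shift back
full_transform = np.linalg.inv(to_origin) @ rotation @ to_origin

# The center, in homogeneous form, is a fixed point of the combined transform
center_h = np.append(center, 1)
print(np.allclose(full_transform @ center_h, center_h))  # True

# The upper-left 2x2 block is still the pure rotation
print(np.allclose(full_transform[:2, :2], rotation[:2, :2]))  # True
```

Only the last column of `full_transform` differs from the plain rotation matrix; it holds the offset that keeps the center in place.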

In the final step, the rotation is applied to the image by using the `affine_transform` function from the SciPy library.

This is done separately for each color channel to handle the 3D nature of the image. Finally, the rotated image is displayed using Matplotlib.

```
rotated_image = np.zeros_like(image)
for channel in range(image.shape[2]):
    rotated_image[..., channel] = affine_transform(
        image[..., channel],
        full_transform[:2, :2],
        offset=(full_transform[0, 2], full_transform[1, 2]))

plt.imshow(rotated_image)
plt.title("Rotated Image")
plt.axis("off")
plt.show()
```

Output: the input image rotated by 60 degrees about its center, displayed under the title "Rotated Image".

## Resources

https://numpy.org/doc/stable/reference/generated/numpy.dot.html

https://numpy.org/doc/stable/reference/generated/numpy.matmul.html

Mokhtar is the founder of LikeGeeks.com.