NumPy matrix multiplication (From Theory to Real-World Examples)

Matrix multiplication is a fundamental operation in linear algebra that takes a pair of matrices and produces another matrix.

In mathematical terms, given two matrices ( A ) and ( B ), the product ( AB ) is computed by taking the dot product of the rows of ( A ) with the columns of ( B ).

This tutorial aims to provide an in-depth understanding of matrix multiplication using NumPy,



Element-wise vs. Matrix Multiplication

Element-wise multiplication, also known as the Hadamard product, multiplies the corresponding elements of two matrices of the same shape.


import numpy as np
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = A * B


[[ 5, 12],
 [21, 32]]

Each element in the resulting matrix is the product of the corresponding elements in matrices ( A ) and ( B ):

  • ( 1 X 5 = 5 )
  • ( 2X 6 = 12 )
  • ( 3 X 7 = 21 )
  • ( 4 X 8 = 32 )

Matrix multiplication, on the other hand, requires that the number of columns in the first matrix is equal to the number of rows in the second matrix.

It’s a more complex operation that involves the dot product of rows and columns.

Using the same matrices ( A ) and ( B ), we get:

result =, B)


[[19, 22],
 [43, 50]]


Multiplication using

The function in NumPy is one of the primary ways to perform matrix multiplication. It handles both two-dimensional arrays (matrices) and one-dimensional arrays (vectors).

Multiplying Matrices

Here’s an example using the function to multiply two matrices:

import numpy as np
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result =, B)


[[19, 22],
 [43, 50]]

The output is obtained by performing standard matrix multiplication, where the rows of the first matrix are multiplied with the corresponding columns of the second matrix.

Multiplying Vectors can also be used to compute the dot product of two vectors:

v1 = np.array([3, 4])
v2 = np.array([5, 6])
result =, v2)



The output is the dot product of the vectors ( v1 ) and ( v2 ), computed as ( 3X 5 + 4X 6 = 39 ).


The np.matmul Function

Unlike np.dotnp.matmul is specifically designed for matrix multiplication and provides a more explicit way to perform this operation.

Here’s how to use np.matmul to multiply two matrices:

import numpy as np
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = np.matmul(A, B)


[[19, 22],
 [43, 50]]

The output is identical to the result obtained with, as np.matmul also performs standard matrix multiplication.

Multi-dimensional arrays multiplication

np.matmul also handles multi-dimensional arrays and provides specific behavior for array broadcasting.


A = np.random.rand(2, 3, 4)
B = np.random.rand(2, 4, 3)
result = np.matmul(A, B)


(2, 3, 3)

In this case, np.matmul performs matrix multiplication over the last two dimensions of the input arrays and broadcasts over the remaining dimensions. The resulting shape is (2, 3, 3).


The @ Operator

In Python 3.5 and later versions, you can use the @ operator as a shorthand for matrix multiplication. This operator provides a clean and concise way to multiply matrices.

Here’s an example of how to use the @ operator to multiply two matrices:

import numpy as np
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = A @ B


[[19, 22],
 [43, 50]]

The result is identical to that obtained with or np.matmul. The @ operator simplifies the code and makes the matrix multiplication operation visually clear.

The @ operator can also be used with higher-dimensional arrays, just like np.matmul, following the same broadcasting rules.


Comparing, np.matmul, and @

Understanding the differences and similarities between np.dotnp.matmul, and the @ operator can help you choose the right tool for your specific needs.

  • Can be used for both matrix multiplication and vector dot products.
  • Handles two-dimensional arrays as general matrix multiplication.
  • For higher-dimensional arrays, its behavior may differ from np.matmul.


  • Specifically designed for matrix multiplication.
  • Provides consistent behavior across different dimensions.
  • Handles broadcasting over the last two dimensions for multi-dimensional arrays.

@ Operator

  • For Python 3.5 and above.
  • Provides a clean and concise notation for matrix multiplication.


Multiplication with Scalars

In addition to multiplying matrices, you can multiply a matrix by a scalar (a single numerical value). This operation multiplies every element of the matrix by the scalar.

import numpy as np
A = np.array([[1, 2], [3, 4]])
scalar = 2
result = A * scalar


[[2, 4],
 [6, 8]]

The output is obtained by multiplying every element in matrix ( A ) by the scalar value 2:

  • ( 1X2 = 2 )
  • ( 2X 2 = 4 )
  • ( 3 X 2 = 6 )
  • ( 4 X 2 = 8 )


In-place Multiplication with a Scalar

In-place multiplication with a scalar means modifying the original matrix by multiplying it with a scalar, without creating a new copy of the matrix.

This can be useful for memory efficiency, especially when dealing with large matrices.

You can use the *= operator to multiply a matrix with a scalar in place.

import numpy as np
A = np.array([[1, 2], [3, 4]])
scalar = 2
A *= scalar


[[2 4]
[6 8]]

The matrix A is multiplied by 2 in-place, meaning that the original matrix A is modified without creating a new matrix.

You can also use  numpy.multiply function with the out argument to achieve in-place multiplication.

A = np.array([[1, 2], [3, 4]])
scalar = 2
np.multiply(A, scalar, out=A) # In-place multiplication
[[2 4]
[6 8]]

The np.multiply function with the out argument pointing to the original matrix A ensures that the multiplication is performed in-place.

Space complexity comparison

We can use the memory_profiler package to profile the memory usage of the scalar multiplication.

from memory_profiler import memory_usage

def inplace_multiplication(numbers, scalar):
    for i in range(len(numbers)):
        numbers[i] *= scalar

def multiplication_with_scalar(numbers, scalar):
    result = [num * scalar for num in numbers]

def main():
    numbers = [1, 2, 3, 4, 5] * 1000000
    scalar = 2
    mem_usage_inplace = memory_usage((inplace_multiplication, (numbers, scalar)))
    print(f'In-place multiplication memory used: {max(mem_usage_inplace) - min(mem_usage_inplace)} MiB')
    mem_usage_with_scalar = memory_usage((multiplication_with_scalar, (numbers, scalar)))
    print(f'Multiplication with scalar memory used: {max(mem_usage_with_scalar) - min(mem_usage_with_scalar)} MiB')
if __name__ == '__main__':


In-place multiplication memory used: 0.2265625 MiB
Multiplication with scalar memory used: 37.125 MiB

In-place multiplication is significantly lower in memory usage.


Time Complexity of, np.matmul, and @

Let’s measure the execution time for each method using the time module.

import numpy as np
import time
A = np.random.rand(5000, 5000)
B = np.random.rand(5000, 5000)

# Measuring time for
start_time = time.time()
result_dot =, B)
end_time = time.time()
print(f" execution time: {(end_time - start_time) :.2f} seconds")

# Measuring time for np.matmul
start_time = time.time()
result_matmul = np.matmul(A, B)
end_time = time.time()
print(f"np.matmul execution time: {(end_time - start_time) :.2f} seconds")

# Measuring time for @ operator
start_time = time.time()
result_at_operator = A @ B
end_time = time.time()
print(f"@ operator execution time: {(end_time - start_time) :.2f} seconds")

Output (example times; actual times may vary): execution time: 1.13 seconds
np.matmul execution time: 1.18 seconds
@ operator execution time: 1.17 seconds

The execution times for np.dotnp.matmul, and the @ operator are reflecting the actual performance for these methods which is almost the same.


Parallel Processing and GPU Acceleration

Parallel processing and GPU (Graphics Processing Unit) acceleration can significantly speed up matrix multiplication, especially with large matrices.

NumPy itself does not offer direct GPU acceleration but can leverage libraries that do.

Libraries like CuPy provide a NumPy-compatible interface for GPU-accelerated computations.

First, install CuPy which corresponds to your CUDA version. Then, you can perform GPU-accelerated matrix multiplication as follows:

import cupy as cp
A_gpu = cp.random.rand(1000, 1000)
B_gpu = cp.random.rand(1000, 1000)

# Matrix multiplication on the GPU
result_gpu = A_gpu @ B_gpu

By simply replacing NumPy with CuPy, you can execute the same code on a GPU.

This can lead to a substantial performance increase for large-scale matrix multiplication.


  • GPU acceleration is most effective for large matrices where the parallelism of the GPU can be fully utilized.
  • Data transfer between CPU and GPU can introduce overhead, especially for small matrices.
  • Not all NumPy functions are supported in GPU-accelerated libraries, so compatibility should be checked.


Type Errors

In matrix multiplication, understanding the data types of the matrices involved is essential.

Mixing integer and float matrices can lead to unexpected results or type errors. Here’s how to handle them:

Implicit Type Conversion

When multiplying an integer matrix with a float matrix, NumPy performs implicit type conversion, converting the integer matrix to float.

import numpy as np
A = np.array([[1, 2], [3, 4]], dtype=int)
B = np.array([[1.5, 2.5], [3.5, 4.5]], dtype=float)
result = A @ B



Here, the integer matrix A is implicitly converted to float before multiplication, and the result is a float matrix.

Explicit Type Conversion

Explicitly converting matrices to a common data type can prevent unexpected behavior.


A = A.astype(float)  # Explicitly converting A to float
result = A @ B

By explicitly converting A to float, you ensure that the multiplication proceeds as expected, with no surprises related to data types.


Real-World Example Using Matrix Multiplication

Scaling a 2D Object in Computer Graphics

Suppose you have a 2D triangle defined by its vertices, and you want to scale it.

1. Define the Object

Start by defining the coordinates of the triangle’s vertices as a matrix.

import numpy as np

# Vertices of the triangle
triangle_vertices = np.array([
    [0, 0],
    [4, 0],
    [2, 3]

Here, the vertices are augmented with an additional dimension to handle the translation using homogeneous coordinates.

2. Define the Scaling Matrix

Create a scaling matrix that scales by 2 in the x-direction and 3 in the y-direction.

scale_x = 2
scale_y = 3
scaling_matrix = np.array([
    [scale_x, 0],
    [0, scale_y]

3. Perform the Scaling

Multiply the vertices by the scaling matrix to scale the triangle.

scaled_triangle_vertices =


array([[0, 0],
       [8, 0],
       [4, 9]]))

The triangle has been scaled by a factor of 2 in the x-direction and 3 in the y-direction.

Let’s draw the scaled coordinates to draw the scaled object in matplotlib:

import matplotlib.pyplot as plt
def plot_triangle(vertices, color, label):
    plt.plot(*vertices.T, color=color, label=label) # Plot the edges
    plt.scatter(*vertices.T, color=color) # Plot the vertices
    plt.fill(vertices[:, 0], vertices[:, 1], alpha=0.1, color=color) # Fill the triangle

# Plot the original triangle
plot_triangle(triangle_vertices, color='blue', label='Original Triangle')

# Plot the scaled triangle
plot_triangle(scaled_triangle_vertices, color='red', label='Scaled Triangle')

# Add labels and legend
plt.axis('equal') # Equal scaling ensures that the plot is not distorted
plt.title('Scaling Transformation of a 2D Triangle')


Scaling using numpy matrix multiplication

Image Rotation

First, the image is loaded in color, and the color scheme is converted from BGR to RGB since OpenCV reads images in BGR by default. The necessary libraries are also imported.

import numpy as np
from scipy.ndimage import affine_transform
import cv2
import matplotlib.pyplot as plt
image = cv2.imread("image.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # Convert from BGR to RGB

Combine Transformation using matrix multiplication

Then we define the rotation angle and create the corresponding rotation matrix.

Additionally, we compute the transformation matrix to shift the center of rotation to the center of the image.

Matrix multiplication is then used to combine these transformations into the full_transform matrix.

angle_degrees = 60
angle_radians = np.radians(angle_degrees)
rotation_matrix = np.array([[np.cos(angle_radians), -np.sin(angle_radians), 0],
                            [np.sin(angle_radians), np.cos(angle_radians), 0],
                            [0, 0, 1]])
height, width, _ = image.shape
center_x, center_y = width // 2, height // 2
transform_matrix = np.array([[1, 0, -center_x],
                             [0, 1, -center_y],
                             [0, 0, 1]])

# Combine the shift and rotation using matrix multiplication
full_transform =, rotation_matrix)
full_transform =, transform_matrix)

In the final step, the rotation is applied to the image by using the affine_transform function from the SciPy library.

This is done separately for each color channel to handle the 3D nature of the image. Finally, the rotated image is displayed using Matplotlib.

rotated_image = np.zeros_like(image)
for channel in range(image.shape[2]):
    rotated_image[..., channel] = affine_transform(image[..., channel], full_transform[:2, :2], offset=(full_transform[0, 2], full_transform[1, 2]))
plt.title("Rotated Image")


Rotating image using numpy matrix multiplication



Leave a Reply

Your email address will not be published. Required fields are marked *