Python string interpolation (Make Dynamic Strings)

String interpolation is the process of substituting the values of variables into placeholders in a string. It lets you build dynamic strings in Python by embedding values into a string at runtime.

Python supports multiple ways to format strings and perform string interpolation, making format strings easier to maintain and easier to modify dynamically.

 

 

Python string interpolation using % operator

The % operator allows you to use format specifiers such as %s for strings, %d for integers, and %f for floating-point numbers.

name = "Alice"
age = 25
height = 1.68

print("Hello, my name is %s, I'm %d years old and I'm %.2f meters tall." % (name, age, height))

Output:

Hello, my name is Alice, I'm 25 years old and I'm 1.68 meters tall.

In this case, we have used the %s, %d, and %.2f format specifiers for string, integer, and floating-point number with two digits after the decimal point respectively.
However, there are some limitations with using the % operator:

  1. It can be more error-prone and harder to maintain as the number of variables or complexity increases.
  2. There is a risk of tuple size mismatch with the number of placeholders in a string.
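The tuple-size mismatch mentioned above can be sketched as follows. The helper name try_format is hypothetical and exists only for this illustration; the %(name)s mapping form is a standard feature of the % operator that avoids depending on argument order.

```python
def try_format(template, values):
    """Apply the % operator and report a TypeError instead of raising it."""
    try:
        return template % values
    except TypeError as exc:
        return f"TypeError: {exc}"


# Two placeholders, one value: the tuple is too short.
too_few = try_format("%s is %d years old", ("Alice",))

# One placeholder, two values: the tuple is too long.
too_many = try_format("%s is here", ("Alice", 25))

# Named placeholders with a dict do not depend on argument order.
by_name = "%(name)s is %(age)d years old" % {"name": "Alice", "age": 25}

print(too_few)
print(too_many)
print(by_name)
```

Both mismatches surface only at runtime, which is one reason the % operator becomes harder to maintain as the number of placeholders grows.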

 

str.format()

The str.format() method, introduced in Python 2.6 and Python 3.0, offers another way of formatting strings.

It uses replacement fields, denoted by curly braces {}, which are filled with the arguments passed to the method.

name = "Bob"
age = 45
print("Hello, my name is {} and I'm {} years old.".format(name, age))

Output:

Hello, my name is Bob and I'm 45 years old.

Here, the arguments to the format method are substituted into the curly braces {} in the order they are passed. Many developers find this more readable than the % operator.

Positional arguments

With str.format(), you can use positional arguments, which allows for re-arranging the order of display without changing the arguments passed.

name = "Carol"
age = 50
print("Hello, I'm {1} years old and my name is {0}.".format(name, age))

Output:

Hello, I'm 50 years old and my name is Carol.

The numbers inside the curly braces refer to the positions of the arguments, counting from 0, so the same arguments can be displayed in any order.
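As a small sketch of what positional indexes allow, an index can be repeated, and a sequence can be unpacked into positional arguments. The variable names here are illustrative only.

```python
# An index may appear more than once; the argument is passed only once.
greeting = "{0}, {0}! My name is {1}.".format("Hello", "Carol")

# A sequence can be unpacked into positional arguments with *.
person = ("Carol", 50)
summary = "{0} is {1} years old.".format(*person)

print(greeting)
print(summary)
```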

Named arguments

You can also refer to variable substitutions by name and use these named arguments within the format method.

print("Hello, I'm {age} years old and my name is {name}.".format(name="David", age=55))

Output:

Hello, I'm 55 years old and my name is David.

By using names inside the curly braces, it becomes easier to see which value is substituted where, especially when the string and the data to substitute are far apart in the code. This keeps the format string clear and maintainable.
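When the values already live in a dictionary, named fields can be filled from it directly. The following is a minimal sketch; the person dictionary is an assumed example, not data from the article.

```python
person = {"name": "David", "age": 55}
template = "Hello, I'm {age} years old and my name is {name}."

# Unpack the dictionary into keyword arguments.
unpacked = template.format(**person)

# format_map() uses the mapping directly, without unpacking it.
mapped = template.format_map(person)

# A field can also index into an argument; keys are written without quotes.
indexed = "My name is {p[name]}.".format(p=person)

print(unpacked)
print(mapped)
print(indexed)
```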

 

Format specification mini-language

Python also provides a mini-language for specifying the format of the replacement fields. This is especially handy when you need to control the width of fields, the alignment of the data, the number of decimal places, and so forth.

import math
print("The value of pi is approximately {0:.3f}.".format(math.pi))

Output:

The value of pi is approximately 3.142.

Here, {0:.3f} denotes that we want to format the first argument as a floating-point number with 3 digits after the decimal point.
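The width, alignment, and number options mentioned above can be sketched with a few more specifiers. The sample values are arbitrary and chosen only to make the effect visible.

```python
# Alignment within a field of width 10: left, centered, right.
row = "{:<10}|{:^10}|{:>10}".format("left", "mid", "right")

# Zero-padded to a total width of 8, with 3 digits after the decimal point.
padded = "{:08.3f}".format(3.14159)

# Thousands separator, percentage, and hexadecimal presentation.
grouped = "{:,}".format(1234567)
percent = "{:.1%}".format(0.256)
hex_value = "{:x}".format(255)

print(row)
print(padded, grouped, percent, hex_value)
```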

 

Template strings

The string module in Python provides another method to perform string interpolation by using the Template class.

from string import Template
t = Template('Hello, my name is $name and I am $age years old.')
s = t.substitute(name='Emma', age=35)
print(s)

Output:

Hello, my name is Emma and I am 35 years old.

Here, we first import the Template class from the string module. We then create a template using a string that contains placeholder variables prefixed with a $ symbol.

Finally, we call the substitute method on the template object and pass a mapping of variables to replace the placeholders.

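Template strings are less powerful than the other methods, since they do not support format specifiers or expressions, but they are simpler and a safer choice for simple substitutions or when the format string is user-supplied.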
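The difference between substitute and safe_substitute shows up when a placeholder has no value. The sketch below reuses the template from the example above; the price line is an added illustration of how $$ produces a literal dollar sign.

```python
from string import Template

t = Template("Hello, my name is $name and I am $age years old.")

# substitute() raises KeyError when a placeholder has no value.
try:
    t.substitute(name="Emma")
    missing = None
except KeyError as exc:
    missing = exc.args[0]

# safe_substitute() leaves unknown placeholders untouched instead.
partial = t.safe_substitute(name="Emma")

# "$$" is an escape for a literal dollar sign.
price = Template("Total: $$${amount}").substitute(amount=10)

print(missing)
print(partial)
print(price)
```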

 

f-strings

Python 3.6 introduced a new string formatting mechanism known as f-strings (formatted string literals).

F-strings make Python string interpolation more readable and concise.

name = "Frank"
age = 60
print(f"Hello, my name is {name} and I'm {age} years old.")

Output:

Hello, my name is Frank and I'm 60 years old.

In this code, an f-string is a literal string, prefixed with ‘f’, which contains expressions inside curly braces.

The expressions are evaluated at runtime and their values are inserted into the string.

Embedded expressions

One of the advantages of f-strings is the ability to embed arbitrary Python expressions inside string literals.

name = "Gary"
age = 65
print(f"{name} will be {age + 5} years old in five years.")

Output:

Gary will be 70 years old in five years.

Here, we are performing an arithmetic operation inside the f-string. The expression age + 5 is evaluated at runtime and its result is inserted into the string.
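Arithmetic is only one kind of expression. As a rough sketch, method calls, conditional expressions, and function calls work the same way; the user and items values are made up for this example. Using single quotes inside a double-quoted f-string keeps the code valid on every Python version from 3.6 onward.

```python
user = {"name": "Gary", "age": 65}
items = ["pen", "notebook", "ruler"]

# Dictionary lookup, method call, and conditional expression in one string.
status = f"{user['name'].upper()} is {'retired' if user['age'] >= 65 else 'working'}."

# Built-in function calls inside replacement fields.
inventory = f"{len(items)} items: {', '.join(sorted(items))}"

print(status)
print(inventory)
```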

Formatting specifiers

F-strings support the same format specifiers as the str.format() method.

import math
print(f"The value of pi is approximately {math.pi:.3f}.")

Output:

The value of pi is approximately 3.142.

In this example, the Python expression inside the curly braces is math.pi, and :.3f is the format specifier which denotes that we want to format the result as a floating-point number with 3 digits after the decimal point.
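Format specifiers in f-strings are not limited to floats. The sketch below shows the !r conversion, a date format, and a thousands separator; the date and the number are arbitrary sample values.

```python
from datetime import date

name = "Gary"
release = date(2024, 1, 15)

# !r applies repr() to the value before formatting.
quoted = f"{name!r}"

# Objects such as dates define their own format codes.
stamp = f"{release:%Y-%m-%d}"

# Thousands separator combined with two decimal places.
amount = f"{1234567.891:,.2f}"

print(quoted, stamp, amount)
```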

Inline arithmetic and function calls

You can even do inline arithmetic operations, function calls and much more with f-strings.

import math

x = 10
y = 20
print(f"The sum of {x} and {y} is {x + y}.")
print(f"The square root of {x} is {math.sqrt(x):.2f}.")

Output:

The sum of 10 and 20 is 30.
The square root of 10 is 3.16.

Here, we are performing arithmetic operations and function calls directly inside the string. The results of these operations and calls are evaluated and inserted at runtime.

Lambda and f-strings

You can also use lambda functions inside f-strings.

x = 10
y = 20
print(f"The maximum of {x} and {y} is {(lambda a, b: a if a > b else b)(x, y)}.")

Output:

The maximum of 10 and 20 is 20.

In this example, we defined a lambda function inside the curly braces of the f-string that calculates the maximum of two numbers. We immediately call this function with x and y as arguments.

Multiline f-strings

F-strings can also be split across multiple lines. This is useful when you want to build a long string with many substitutions.

x = 10
y = 20
z = 30
multi_line_f_string = (
    f"The value of x is {x}."
    f"The value of y is {y}."
    f"The sum of x and y is {x + y}."
    f"The value of z is {z}."
    f"The sum of x, y, and z is {x + y + z}."
)
print(multi_line_f_string)

Output:

The value of x is 10.The value of y is 20.The sum of x and y is 30.The value of z is 30.The sum of x, y, and z is 60.

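Here, we split the string across several lines by enclosing adjacent f-strings in parentheses.

Each line is a separate f-string, and Python concatenates adjacent string literals into one long string. No separator is added between them, which is why the sentences in the output run together.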
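If the output should actually appear on separate lines, the line breaks have to be part of the strings. Two possible approaches, sketched with a shortened version of the example above, are explicit \n characters and a triple-quoted f-string.

```python
x, y = 10, 20

# Adjacent f-strings with explicit newline characters.
lines = (
    f"The value of x is {x}.\n"
    f"The value of y is {y}.\n"
    f"The sum of x and y is {x + y}."
)

# A triple-quoted f-string keeps the line breaks written in the source.
block = f"""x = {x}
y = {y}
sum = {x + y}"""

print(lines)
print(block)
```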

Dynamic formatting

Dynamic formatting allows you to determine the format of your output string at runtime.

You can accomplish this using f-strings:

for align in ['<', '^', '>']:
    print(f"{'hello':{align}10}")

Output:

hello     
  hello   
     hello

In this example, we are using an f-string with a dynamic format specifier.

Inside the curly braces {}, align is a variable that is part of the format specifier and its value changes with each iteration of the loop.

The 10 following {align} is the field width for the formatted string. Depending on the value of align, the string ‘hello’ is aligned to the left, center, or right of the field.
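The same nesting works for other parts of the format specifier. As a sketch, width and precision can both come from variables, and str.format() accepts the equivalent nested fields; the names value, width, and precision are illustrative.

```python
value = 3.14159
width = 10
precision = 2

# Nested replacement fields supply the width and precision at runtime.
f_result = f"{value:{width}.{precision}f}"

# The equivalent with str.format() and named fields.
format_result = "{v:{w}.{p}f}".format(v=value, w=width, p=precision)

print(repr(f_result))
print(repr(format_result))
```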

 

F-string Injection Attacks and How to Avoid Them

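True f-string injection is rare, because an f-string is a literal in your source code: values placed in its replacement fields are inserted as data and are not evaluated again. The risk appears when untrusted input reaches eval() or exec(), for example inside a replacement field, or when a user-supplied string is used as the format string for str.format().

In the first case, an attacker can execute arbitrary code.

In the second case, the format string can reach attributes and items of the objects passed to format(), which may disclose data you did not intend to expose.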

import sys
user_input = 'sys.exit()'
dangerous_f_string = f'Hello, {eval(user_input)}!'

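Here, eval(user_input) executes sys.exit(), which raises SystemExit and terminates the program. The vulnerability is the call to eval() on untrusted input; the f-string only displays the result.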

To avoid this:

Never construct a format string from untrusted input: This is the most straightforward way to prevent format string attacks. Always ensure that any data used to construct a format string is not derived from an untrusted or external source.

Use Template strings for user-supplied format strings: The string.Template class in Python provides a safer way to handle user-defined format strings, because it performs only simple name substitution.

from string import Template
t = Template('Hello, ${name}!')
user_input = 'sys.exit()'
# The input is inserted as plain text and is never evaluated
print(t.safe_substitute(name=user_input))  # Hello, sys.exit()!
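To make the first point concrete, here is a minimal sketch of what can go wrong when an untrusted string is used as the format string for str.format(). The Account class and its api_key attribute are hypothetical and exist only for this illustration.

```python
from string import Template


class Account:
    """Hypothetical object that carries a value which should stay private."""

    def __init__(self, owner, api_key):
        self.owner = owner
        self.api_key = api_key


account = Account("Bob", "s3cr3t")

# Assume this string came from a user rather than from the program.
untrusted = "Hello {acct.owner}, key: {acct.api_key}"

# str.format() follows attribute access inside the replacement field.
leaked = untrusted.format(acct=account)

# Template only substitutes $names, so the braces stay literal text.
safe = Template("Hello $owner, key: {acct.api_key}").safe_substitute(
    owner=account.owner
)

print(leaked)
print(safe)
```

With str.format() the private value ends up in the output, while Template treats the same field as ordinary text.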

 

Performance Evaluation (f-strings are faster)

Let’s conduct a simple performance evaluation to compare the speed of the four different string interpolation methods in Python.

We’ll use the timeit module to measure the time each method takes.

import timeit
name = "Henry"
age = 70
profession = "doctor"
hobby = "gardening"
location = "New York"

# Using % operator
time_percent_op = timeit.timeit("'Hello, my name is %s. I am %s years old, a %s from %s, and I love %s.' % (name, age, profession, location, hobby)", globals=globals())
print("% Operator: ", time_percent_op)

# Using str.format()
time_str_format = timeit.timeit("'Hello, my name is {}. I am {} years old, a {} from {}, and I love {}.'.format(name, age, profession, location, hobby)", globals=globals())
print("str.format(): ", time_str_format)

# Using Template strings
time_template_str = timeit.timeit("Template('Hello, my name is $name. I am $age years old, a $profession from $location, and I love $hobby.').substitute(name=name, age=age, profession=profession, location=location, hobby=hobby)", setup="from string import Template", globals=globals())
print("Template strings: ", time_template_str)

# Using f-strings
time_f_strings = timeit.timeit("f'Hello, my name is {name}. I am {age} years old, a {profession} from {location}, and I love {hobby}.'", globals=globals())
print("f-strings: ", time_f_strings)

Output:

% Operator: 1.0545442999864463
str.format(): 1.1527161999838427
Template strings: 9.305830199999036
f-strings: 0.5057788999984041

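f-strings tend to perform better than the other methods because the literal is parsed at compile time into bytecode that builds the string directly. This avoids the method call and the runtime parsing of the format string that str.format() requires. The embedded expressions themselves are still evaluated at runtime. Note that the Template figure also includes the cost of constructing the Template object on every iteration.

However, timeit runs each statement one million times by default, so the difference amounts to a fraction of a microsecond per call. Exact timings vary by machine and Python version, and other factors such as readability and security should also be taken into account.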
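A slightly fairer comparison, sketched below under the assumption that a template would normally be created once and reused, builds the Template object outside the timed statement and takes the best of several runs. The shortened sentence, the namespace dictionary, and the repetition counts are choices made for this sketch, not part of the measurement above.

```python
import timeit
from string import Template

name, age = "Henry", 70
template = Template("My name is $name and I am $age years old.")

# Names that the timed statements are allowed to see.
namespace = {"name": name, "age": age, "template": template}

statements = {
    "% operator": "'My name is %s and I am %s years old.' % (name, age)",
    "str.format()": "'My name is {} and I am {} years old.'.format(name, age)",
    "Template (prebuilt)": "template.substitute(name=name, age=age)",
    "f-string": "f'My name is {name} and I am {age} years old.'",
}

results = {}
for label, stmt in statements.items():
    # repeat() returns one total per run; the minimum is the least noisy.
    runs = timeit.repeat(stmt, globals=namespace, number=20_000, repeat=3)
    results[label] = min(runs)
    print(f"{label:<20} {results[label]:.4f} s")
```

The absolute numbers depend on the machine and interpreter, so they are best read as a relative ranking.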

 

Using string interpolation with regular expressions

String interpolation is very useful when working with regular expressions. Let’s demonstrate this with an example:

import re
pattern = "fox"
text = "The quick brown fox jumps over the lazy dog"
regex = re.compile(fr"\b{pattern}\b")  # Using f-string
matches = regex.findall(text)
print(matches)

Output:

['fox']

In this code, we are creating a regular expression that matches the word ‘fox’. The f-string fr"\b{pattern}\b" allows us to dynamically insert the value of the variable pattern into the regular expression.
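Two details are worth a sketch when interpolating into a pattern. If the inserted value may contain characters that are special in regular expressions, re.escape() makes them match literally, and literal braces in an f-string pattern must be doubled. The sample text and variable names below are assumptions made for this example.

```python
import re

text = "Versions a.b and axb cost 12 or 345 dollars"
pattern = "a.b"

# Without escaping, "." matches any character, so "axb" matches too.
unescaped = re.findall(fr"\b{pattern}\b", text)

# re.escape() makes the inserted value match literally.
escaped = re.findall(fr"\b{re.escape(pattern)}\b", text)

# Doubled braces produce literal braces, giving the quantifier {3}.
n = 3
three_digits = re.findall(fr"\b\d{{{n}}}\b", text)

print(unescaped)
print(escaped)
print(three_digits)
```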

 

Summary

To summarize, here is a simple table that includes the strengths and weaknesses of each method for string interpolation in Python:

| Method | Strengths | Weaknesses |
| --- | --- | --- |
| `%` operator | Simple and straightforward for basic usage, familiar to C programmers | Less readable with multiple substitutions, every placeholder needs a conversion type, argument count mismatches fail at runtime |
| `str.format()` | Improved readability, positional and keyword substitutions, versatile formatting options | More verbose than f-strings, unsafe when the format string itself is user-supplied |
| Template strings | User-friendly syntax, ideal for simple substitutions, safer for user-supplied format strings | Limited functionality, no support for expressions or format specifiers, slower execution time |
| f-strings | Concise and readable, supports inline expressions and complex formatting, faster execution time | Only available in Python 3.6 and above, evaluated immediately so they cannot serve as reusable or user-supplied templates, dangerous if combined with eval() on untrusted input |

Remember that the most suitable method often depends on your specific use case. Each of these methods can be a better fit depending on the context, whether it’s the complexity of the formatting, performance considerations, or the version of Python you’re using.
