Common Python faux pas

Many code examples for beginners show only simplified Python code and have been derived from well-known patterns (e.g. those familiar from C). If you use this kind of code frequently, even as an advanced programmer you can develop habits that make your code error-prone or keep it from doing what it is supposed to do at all.

I have learned a lot from my own mistakes. Drawing on that experience, in this post I list bad habits that I believe are common when programming in Python, and that are easy to avoid.

The post is currently being translated. A (possibly incorrect) machine translation is provided below.

Formatting a string

To insert the values of variables into a string, examples often show that you can join strings with the + operator before outputting the result:


def string_formatting_bad(name: str, subscriber: int) -> None:
    if subscriber < 100000:
        print(name + " does not have 100k subscribers yet.")
    else:
        print("Amazing! " + name + " has " + str(subscriber) + " subscribers!")

However, it is better to format the string with the built-in str.format() method, or to mark the string itself as a formatted string (an f-string, i.e. with f directly in front of the opening quote), so that the variables can be embedded directly in curly braces:


def string_formatting_good(name: str, subscriber: int) -> None:
    if subscriber < 100000:
        print(f"{name} does not have 100k subscribers yet.")
    else:
        print(f"Amazing! {name} has {subscriber} subscribers!")
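
The paragraph above also mentions format(), which the examples do not show. The following is a minimal sketch, assuming a hypothetical helper describe_channel() that is not part of the original examples. It builds the same message with str.format() and with an f-string, and uses the format specifier :, to add a thousands separator:

```python
def describe_channel(name: str, subscriber: int) -> str:
    """Build the same message with str.format() and with an f-string."""
    # Positional placeholders are filled in order; ":," adds a thousands separator.
    with_format = "{} has {:,} subscribers".format(name, subscriber)
    # The equivalent f-string embeds the expressions directly.
    with_fstring = f"{name} has {subscriber:,} subscribers"
    assert with_format == with_fstring
    return with_fstring


print(describe_channel("Ada", 1234567))  # Ada has 1,234,567 subscribers
```

Both forms avoid the explicit str() conversion that the + operator requires.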

Closing files correctly

Following low-level programming languages like C, examples often show that files must be closed explicitly after they have been opened.


def using_a_file_bad(file_name: str) -> None:
    f = open(file_name, "w")
    f.write("Hello, world!\n")
    f.close()

However, if the code aborts or fails before calling close() (e.g. because an error occurs while writing to the file), the file will never be closed. Therefore, it is recommended to open the file as part of a context manager. This is done through the keyword with, where the context (here the opened file) is assigned to a variable by as, which can then be used within the subsequent code section. If the code section is exited (regularly or due to an error) the context is automatically terminated (i.e. the file is closed).


def using_a_file_good(file_name: str) -> None:
    with open(file_name, "w") as f:
        f.write("Hello, world!\n")
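
To see that the file really is closed when an error occurs, here is a small sketch. The helper file_closed_after_error() and the simulated RuntimeError are assumptions made for illustration only:

```python
import os
import tempfile


def file_closed_after_error(path: str) -> bool:
    """Return True if the file was closed although the with block raised."""
    handle = None
    try:
        with open(path, "w") as f:
            handle = f  # keep a reference so it can be inspected afterwards
            f.write("Hello, world!\n")
            raise RuntimeError("simulated failure while writing")
    except RuntimeError:
        pass
    return handle.closed


with tempfile.TemporaryDirectory() as tmp:
    print(file_closed_after_error(os.path.join(tmp, "demo.txt")))  # True
```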

Context manager

You can also achieve context management with the combination of try and finally:


import socket

def using_context_manager_bad(host: str, port: int) -> None:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.connect((host, port))
        s.sendall(b"Hello, world!")
    finally:
        s.close()

This, however, still leaves room to forget the call to close() on the context (here a socket). A context manager using with and as takes care of this by itself and leads to more readable code.


import socket

def using_context_manager_good(host: str, port: int) -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        s.sendall(b"Hello, world!")
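
If your own resource does not support with yet, the standard library's contextlib.contextmanager decorator is one way to wrap the try and finally pattern once and reuse it. This is only a sketch; tracked_resource() is a hypothetical stand-in for a real resource and merely records what happens to it:

```python
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def tracked_resource(events: list) -> Iterator[list]:
    """Hypothetical resource that records when it is opened and closed."""
    events.append("open")
    try:
        yield events
    finally:
        # Runs on normal exit and when the with block raises.
        events.append("close")


events: list = []
try:
    with tracked_resource(events) as resource:
        resource.append("use")
        raise ValueError("simulated failure")
except ValueError:
    pass

print(events)  # ['open', 'use', 'close']
```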

Correct handling of exceptions

If you want to keep the code from halting because of a raised exception, you can use the combination of the keywords try and except to catch the exception:


def exception_bad() -> None:
    while True:
        try:
            s = input("Please enter a number: ")
            print(f"You entered the number: {int(s)}")
            break
        except:
            print("You did not enter a number")

However, if you do not explicitly define which exceptions should be caught, the bare except catches everything, including KeyboardInterrupt. In the above example, the user would therefore not be able to exit the program via Ctrl+C. You should catch Exception at most, the base class of ordinary errors in code:


def exception_good() -> None:
    while True:
        try:
            s = input("Please enter a number: ")
            print(f"You entered the number: {int(s)}")
            break
        except Exception:
            print("You did not enter a number")

Better still, you should explicitly name the specific exception type you want to catch:


def exception_better() -> None:
    while True:
        try:
            s = input("Please enter a number: ")
            print(f"You entered the number: {int(s)}")
            break
        except ValueError:
            print("You did not enter a number")
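
The reason why except Exception lets Ctrl+C through lies in the exception hierarchy: KeyboardInterrupt and SystemExit derive from BaseException, not from Exception. The helper below is a hypothetical illustration of that check:

```python
def caught_by_except_exception(exc_type: type) -> bool:
    """Return True if `except Exception:` would catch exc_type."""
    return issubclass(exc_type, Exception)


print(caught_by_except_exception(ValueError))         # True
print(caught_by_except_exception(KeyboardInterrupt))  # False
print(caught_by_except_exception(SystemExit))         # False
```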

Wrong operator for exponentiation

Other programming languages often use the operator ^ for exponentiation. In Python, however, this is the bitwise XOR operator:


def square_bad(a: int) -> int:
    return a ^ 2

Instead of ^, two asterisks ** denote exponentiation in Python:


def square_good(a: int) -> int:
    return a ** 2
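
The mistake is easy to miss because a ^ 2 runs without an error and simply returns a different number. A short sketch (compare_operators() is a hypothetical helper) makes the difference visible:

```python
def compare_operators(a: int) -> tuple[int, int]:
    """Return (a XOR 2, a squared) to contrast the two operators."""
    return a ^ 2, a ** 2


print(compare_operators(5))  # (7, 25)  since 0b101 XOR 0b010 == 0b111
print(compare_operators(3))  # (1, 9)   since 0b11 XOR 0b10 == 0b01
```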

Mutable arguments as default value

If you want to define a default argument so that the user does not have to specify the default value again each time the function is called, you would probably write the following code:


def append_bad(n: int, l: list = []) -> list:
    l.append(n)
    return l

l_1 = append_bad(0)
l_2 = append_bad(1)

However, the default argument defined here can be changed, i.e. mutated. This means that l_2 will contain the value [0, 1] at the end, and so will l_1, because both names refer to the same list. This is because default values for arguments to a function are evaluated once, when the function itself is defined, and not each time the function is executed. For the above example, this means that each call to the function shares the same list l, and by the second call to the function, it already holds the value [0]. To avoid this behavior, mutable default arguments should be set to None and defined inside the function (at runtime):


def append_good(n: int, l: list = None) -> list:
    if l is None:
        l = []
    l.append(n)
    return l
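
The shared default can be observed directly. In this sketch the names append_shared() and append_fresh() are hypothetical and mirror the two variants above:

```python
def append_shared(n: int, items: list = []) -> list:
    # The default list is created once, when the def statement runs.
    items.append(n)
    return items


def append_fresh(n: int, items: list = None) -> list:
    # A new list is created on every call that omits the argument.
    if items is None:
        items = []
    items.append(n)
    return items


first = append_shared(0)
second = append_shared(1)
print(first is second)  # True: both names point to the same default list
print(first)            # [0, 1]
print(append_fresh(0), append_fresh(1))  # [0] [1]
```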

List comprehensions

If you want to fill a collection with values, you could come up with the idea of adding each value individually in the case of a list, or assigning the values individually in the case of a dictionary (dict). This would probably result in the following code:


def list_comprehension_bad() -> None:
    numbers = []
    for i in range(10):
        numbers.append(i)

    squares = {}
    for i in numbers:
        squares[i] = i * i

However, this simple code is quite long and harder to read than necessary. Therefore, Python provides comprehensions (here a list comprehension and a dict comprehension), which can reduce the above code to two lines (not including the function definition):


def list_comprehension_good() -> None:
    numbers = [i for i in range(10)]
    squares = {i: i * i for i in numbers}
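
A comprehension can also filter while it builds the collection. As a small sketch, with even_squares() being a hypothetical example function:

```python
def even_squares(limit: int) -> dict[int, int]:
    """Map every even number below limit to its square."""
    return {i: i * i for i in range(limit) if i % 2 == 0}


print(even_squares(10))  # {0: 0, 2: 4, 4: 16, 6: 36, 8: 64}
```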

List comprehension with scoped variables

While the list comprehension is intended to provide a simplified notation for defining data lists, it can also be abused:


def list_comprehension_bad(a: list[int], b: list[int], n: int) -> list[int]:
    return [
        sum(a[n * i + k] * b[n * k + j] for k in range(n))
        for i in range(n)
        for j in range(n)
    ]

This code merely calculates the product of two matrices a and b, each of size n x n and stored as a flat list in row-major order. However, this fact is not immediately obvious. Therefore, in this case, it is preferable to avoid the excessive and nested use of list comprehensions and use the classic for loop instead:


def list_comprehension_good(a: list[int], b: list[int], n: int) -> list[int]:
    c = []
    for i in range(n):
        for j in range(n):
            ij = sum(a[n * i + k] * b[n * k + j] for k in range(n))
            c.append(ij)

    return c
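
A quick check with a 2 x 2 example helps to confirm the indexing. The sketch assumes flat lists in row-major order, so that element (i, k) of a sits at index n * i + k; matmul_flat() is a hypothetical name for the loop version:

```python
def matmul_flat(a: list[int], b: list[int], n: int) -> list[int]:
    """Multiply two n x n matrices stored as flat row-major lists."""
    c = []
    for i in range(n):
        for j in range(n):
            # Row i of a times column j of b.
            c.append(sum(a[n * i + k] * b[n * k + j] for k in range(n)))
    return c


# [[1, 2], [3, 4]] times [[5, 6], [7, 8]]
print(matmul_flat([1, 2, 3, 4], [5, 6, 7, 8], 2))  # [19, 22, 43, 50]
```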

Matching the data type

Especially in an object-oriented programming language like Python, code occasionally needs to be executed depending on the type of a variable. The function type() returns the type object (the class) of a value, which can be compared directly:


from collections import namedtuple

def type_comparison_bad() -> None:
    Point = namedtuple("Point", ["x", "y"])
    p = Point(3.5, 7.4)

    if type(p) == tuple:
        print("This is a tuple")
    else:
        print("It is not a tuple")

However, this comparison only succeeds for an exact type match. It fails for subclasses, such as the namedtuple above, whose type is Point rather than tuple. In most cases, however, what matters is not the equality of types as such, but whether the object behind the variable behaves like the desired data type. This is not checked with equality, but with the function isinstance().


from collections import namedtuple

def type_comparison_good() -> None:
    Point = namedtuple("Point", ["x", "y"])
    p = Point(3.5, 7.4)

    if isinstance(p, tuple):
        print("This is a tuple")
    else:
        print("It is not a tuple")
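
Run side by side, the two checks give different answers for the same object. The sketch reuses the Point namedtuple from the examples above:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(3.5, 7.4)

print(type(p) == tuple)              # False: the exact type is Point
print(isinstance(p, tuple))          # True: Point is a subclass of tuple
print(isinstance(p, (list, tuple)))  # True: a tuple of types is also accepted
```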

Comparison with None, True and False

If you want to check whether a variable is set (i.e. not None) or has the value True or False, you could do this with the comparison operator ==:


def singleton_comparison_bad(x) -> None:
    if x == None:
        pass
    if x == True:
        pass
    if x == False:
        pass

However, None, True and False are singletons, so it is not equality that should be checked with ==, but identity with is.


def singleton_comparison_good(x) -> None:
    if x is None:
        pass
    if x is True:
        pass
    if x is False:
        pass
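
Two cases show where == and is disagree. This is only an illustration; the class AlwaysEqual is hypothetical and deliberately implements a misleading __eq__:

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other: object) -> bool:
        return True


obj = AlwaysEqual()
print(obj == None)  # True: the custom __eq__ decides
print(obj is None)  # False: identity cannot be overridden

one = 1
print(one == True)  # True: bool is a subclass of int and True equals 1
print(one is True)  # False: 1 is not the object True
```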

Unnecessary length or boolean comparisons

To check whether a variable is set or a list is filled, one could resort to the functions bool() and len():


def check_bad(x) -> None:
    if bool(x):
        pass

    if len(x) != 0:
        pass

However, this query can also be formulated much more simply, namely by applying the if condition directly to the variable:


def check_good(x) -> None:
    if x:
        pass
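
Keep in mind that if x is also false for 0, an empty string and an empty container, not only for None. If "not set" and "empty" have to be told apart, an explicit None check is still needed. A sketch, with describe() as a hypothetical helper:

```python
def describe(x) -> str:
    """Distinguish a missing value from an empty or zero one."""
    if x is None:
        return "not set"
    if not x:
        return "set but empty or zero"
    return "set and non-empty"


print(describe(None))  # not set
print(describe([]))    # set but empty or zero
print(describe(0))     # set but empty or zero
print(describe([1]))   # set and non-empty
```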

Loops over a list

From programming languages like C, you may already be familiar with arrays and the syntax used to read values from an array. The same syntax, i.e. applying the index in square brackets to the list, also works in Python:


def loop_over_list_bad(x: list) -> None:
    for i in range(len(x)):
        print(x[i])

However, if you only want to extract the values, it makes more sense to apply the for loop directly to the list:


def loop_over_list_good(x: list) -> None:
    for v in x:
        print(v)

If you need the indexes as well as the values, you can use the function enumerate():


def loop_over_list_better(x: list) -> None:
    for i, v in enumerate(x):
        print(f"@{i} = {v}")

Loop over multiple lists

When looping over multiple lists, you might get the idea of using a common counter for indexing:


def loop_over_lists_bad(x: list, y: list) -> None:
    for i in range(len(x)):
        print(x[i])
        print(y[i])

However, you can use the function zip() to combine two lists of the same length and pass their values (at the same index level) to the for loop:


def loop_over_lists_good(x: list, y: list) -> None:
    for vx, vy in zip(x, y):
        print(vx)
        print(vy)

Just as described above, the function enumerate() can provide the index for multiple lists in addition to their values.


def loop_over_lists_better(x: list, y: list) -> None:
    for i, (vx, vy) in enumerate(zip(x, y)):
        print(f"@{i} = {vx}")
        print(f"@{i} = {vy}")
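
One difference to the index-based loop is worth knowing: zip() stops silently at the end of the shortest list instead of raising an IndexError. If the longer list must be processed completely, itertools.zip_longest() from the standard library is one option. The values below are made up for illustration:

```python
from itertools import zip_longest

x = [1, 2, 3]
y = ["a", "b"]

# zip() truncates to the shorter input.
print(list(zip(x, y)))  # [(1, 'a'), (2, 'b')]

# zip_longest() pads the shorter input with fillvalue.
print(list(zip_longest(x, y, fillvalue=None)))  # [(1, 'a'), (2, 'b'), (3, None)]
```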

Loop over a dictionary

If you want to loop over a dictionary, a running index does not work; instead, you have to use the keys of the dictionary:


def loop_oper_dict_bad(d: dict) -> None:
    for k in d.keys():
        print(f"Key: {k}")

However, the for loop iterates over the dictionary's keys by default, so the explicit call to keys() can be omitted:


def loop_oper_dict_good(d: dict) -> None:
    for k in d:
        print(f"Key: {k}")

If you want to modify the dictionary inside the loop, you should iterate over a copy of the keys, because the set of keys changes when entries are added or deleted, and Python raises a RuntimeError if the dictionary changes size during iteration. Again, you can do this without explicitly retrieving the keys using keys().


def loop_oper_dict_better(d: dict) -> None:
    for k in list(d):  # not list(d.keys())
        print(f"Key: {k}")
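
What goes wrong without the copy can be shown with a small sketch. The two helper functions and the sample data are hypothetical:

```python
def drop_negative_unsafe(d: dict) -> None:
    for k in d:  # iterates over the live dictionary
        if d[k] < 0:
            del d[k]


def drop_negative_safe(d: dict) -> None:
    for k in list(d):  # iterates over a snapshot of the keys
        if d[k] < 0:
            del d[k]


data = {"a": -1, "b": 2, "c": -3}

try:
    drop_negative_unsafe(dict(data))
except RuntimeError as error:
    print(f"RuntimeError: {error}")

cleaned = dict(data)
drop_negative_safe(cleaned)
print(cleaned)  # {'b': 2}
```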

Loop to use the values of a dictionary

If you also want to access the values of the dictionary in the loop, you might assume that you have to look up each value explicitly with the current key:


def loop_oper_dict_bad(d: dict) -> None:
    for k in d:
        print(f"{k} = {d[k]}")

However, Python provides the method items() for this purpose, which passes both the key and the corresponding value to the loop:


def loop_oper_dict_good(d: dict) -> None:
    for k, v in d.items():
        print(f"{k} = {v}")

Unpacking a tuple

Just like lists, tuples can be read using indexes.


def tuple_unpacking_bad() -> None:
    t = 4.33, 3.44
    x = t[0]
    y = t[1]

However, if you know how many fields the tuple contains, you can specify in one line all the variables to which the values of the fields should be assigned:


def tuple_unpacking_good() -> None:
    t = 4.33, 3.44
    x, y = t

The same procedure works for functions returning a tuple.
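
As a sketch of unpacking a returned tuple, min_max() below is a hypothetical function. The second part shows starred unpacking, which is useful when the number of fields is not fixed:

```python
def min_max(values: list[float]) -> tuple[float, float]:
    """Return the smallest and the largest value as a tuple."""
    return min(values), max(values)


low, high = min_max([3.44, 8.14, 5.0])
print(low, high)  # 3.44 8.14

# A starred name collects all remaining fields in a list.
first, *rest = (1, 2, 3, 4)
print(first, rest)  # 1 [2, 3, 4]
```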

Loop with counter

From other programming languages you may have carried over the habit of defining an external counter (e.g. i), which is then incremented within the loop.


def index_in_loop_bad() -> None:
    l = [3.44, 8.14, 5.0]
    i = 0
    for x in l:
        print(f"@{i} = {x}")
        i += 1

But if the counter is only incremented by one at a time (i.e. by i += 1), you can use the built-in function enumerate() instead:


def index_in_loop_good() -> None:
    l = [3.44, 8.14, 5.0]
    for i, x in enumerate(l):
        print(f"@{i} = {x}")
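
If the counter should not start at zero, enumerate() accepts a start argument. A short sketch with a hypothetical helper numbered():

```python
def numbered(items: list[str]) -> list[str]:
    """Prefix each item with a 1-based position."""
    return [f"{i}. {item}" for i, item in enumerate(items, start=1)]


print(numbered(["tea", "milk"]))  # ['1. tea', '2. milk']
```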

Execution speed determination

Sometimes you want to measure the speed of your code, and for this purpose compare the time before and after the code runs:


import time

def timimg_bad() -> None:
    start = time.time()
    time.sleep(1) # Simulating code execution ...
    end = time.time()
    print(end - start)

However, the function time() returns wall-clock time, which has limited resolution on some systems and can jump when the system clock is adjusted. For a more precise measurement of execution time, perf_counter() should be used. This function returns a floating point number of seconds from a clock with the highest available resolution; only the difference between two calls is meaningful:


import time

def timimg_good() -> None:
    start = time.perf_counter()
    time.sleep(1) # Simulating code execution ...
    end = time.perf_counter()
    print(end - start)
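
For very short snippets a single measurement is noisy. The standard library's timeit module repeats the call many times and returns the total time. The following is only a sketch; work() is a hypothetical stand-in for the code under test and the repetition count is chosen arbitrarily:

```python
import timeit


def work() -> int:
    """Hypothetical workload standing in for the code under test."""
    return sum(range(1000))


repetitions = 1000
seconds = timeit.timeit(work, number=repetitions)  # total time for all runs
print(f"{seconds / repetitions:.9f} s per call")
```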

Use a log instead of print statements

To keep track of whether code is executing correctly and to check where the code's execution is at, you can print strings to the console with print().


def execution_progress_bad() -> None:
    print("Start")
    print("Running")
    print("Something went wrong!")

However, if you want to suppress the output (e.g. in the finished code), you have to delete or comment out every place where the print() function is called. In addition, this kind of progress monitoring offers no way to restrict the output to errors, status messages or debug messages only. Therefore it is better to use the logging module.


import logging

def execution_progress_good() -> None:
    logging.debug("Start")
    logging.info("Running")
    logging.error("Something went wrong!")

def main() -> None:
    level = logging.DEBUG
    fmt = "[%(levelname)s] %(asctime)s - %(message)s"  # avoids shadowing the built-in format()
    logging.basicConfig(level=level, format=fmt)
    execution_progress_good()

The call to basicConfig() specifies both the minimum level and the format of the output.
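
How the level filters messages can be sketched with a named logger that writes into a string buffer. The logger name "demo" and the helper captured_output() are assumptions for illustration only:

```python
import io
import logging


def captured_output(level: int) -> str:
    """Log three messages at the given threshold and return what was emitted."""
    stream = io.StringIO()
    logger = logging.getLogger("demo")  # hypothetical logger name
    logger.setLevel(level)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
    logger.addHandler(handler)
    try:
        logger.debug("Start")
        logger.info("Running")
        logger.error("Something went wrong!")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


print(captured_output(logging.ERROR), end="")  # [ERROR] Something went wrong!
```

With logging.DEBUG as the threshold, all three messages are emitted; no call has to be deleted or commented out to silence the others.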

Shell commands

To interact with programs of the operating system, you may sometimes call shell commands.


import subprocess

def subprocess_call_bad() -> None:
    subprocess.run(
        ["ls -l"],
        capture_output = True,
        shell = True
    )

However, you should refrain from setting shell to True, as this can lead to security vulnerabilities such as shell injection when any part of the command comes from user input. In addition, the command and each of its arguments should be passed as separate list elements. Often shell was only set to True in the first place because the list had not been split properly.


import subprocess

def subprocess_call_good() -> None:
    subprocess.run(
        ["ls", "-l"],
        capture_output = True
    )
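
The following sketch shows why the list form is safer, without actually running a command. The helper list_directory_args() and the hostile input are hypothetical; shlex.split() from the standard library is one way to turn an existing command string into such a list:

```python
import shlex


def list_directory_args(path: str) -> list[str]:
    """Build the argument list for `ls -l <path>` without involving a shell."""
    # Each argument is its own list element; the path is never parsed by a shell.
    return ["ls", "-l", path]


hostile = "docs; rm -rf ~"  # hypothetical malicious input
print(list_directory_args(hostile))  # ['ls', '-l', 'docs; rm -rf ~']

# Splitting a command string the way a POSIX shell would:
print(shlex.split("ls -l 'my folder'"))  # ['ls', '-l', 'my folder']
```

Passed to subprocess.run() without shell = True, the hostile string remains a single, harmless file name argument.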

Mathematics with Python

To do more complex mathematical calculations (especially matrix or vector calculations), you can of course use lists:


def add_vectors_bad() -> list:
    a = list(range(1000))
    b = list(range(1000))
    return [i + j for i, j in zip(a, b)]

However, this makes the code less readable, and it becomes harder to spot errors made when translating the mathematical formula into code, on top of other programming errors. Therefore it is recommended to use the NumPy library:


import numpy as np
def add_vectors_good() -> np.ndarray:
    a = np.arange(1000)
    b = np.arange(1000)
    return a + b

For more general data analysis, the pandas library is also worth using.

Import

It is possible to import all functions, submodules and classes of a module using *:


from itertools import *

However, this clutters the namespace and can silently shadow names that are already defined. Instead, it is better to import only the functions etc. that are really used:


from itertools import count

Importing your own files

In custom projects, modules are often imported directly by name, since they are stored in a flat folder structure right next to the main code.


from my_module import my_func

However, it is recommended to group modules into packages and import them more selectively. This does not change the code itself, but it makes the project structure more organized and clear.


from my_package.my_module import my_func

PEP 8

Finally, a word about coding style. PEP 8 is only a style guide, and following it does not change what the code does.


from typing import AnyStr

def pep8_bad() -> None:
    x = (1,2)
    y=6.44
    l = [1,6,4]
    
    def my_func(i: AnyStr=None, limit = 100) -> None:
        pass

However, and especially for working in larger development teams, you should adopt a uniform coding style. And if there is a standardized coding style, why not use it?


from typing import AnyStr


def pep8_good() -> None:
    x = (1, 2)
    y = 6.44
    values = [1, 6, 4]  # PEP 8 advises against the single-letter name l

    def my_func(i: AnyStr = None, limit=100) -> None:
        pass