File I/O and the with Statement

Reading and writing files is one of the most common tasks in any program. Python opens files with open(), and every opened file must be closed; otherwise it keeps holding OS resources, and buffered writes may never reach disk. The with statement guarantees that the file is closed when the block ends, whether the block finishes normally or raises.


open() basics

open() returns a file object. When you’re done with it, you must call close() to release the resource:

# The simplest read (shown the old way; we'll switch to with next)
f = open("data.txt", "r", encoding="utf-8")
content = f.read()
f.close()
print(content)

try/finally → with: the motivation

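The snippet above hides a bug: if f.read() itself raises, or if any processing later added between open() and f.close() raises, close() never runs. The v1 example simulates that with an explicit raise: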

# v1: close() is skipped when an error occurs (bug!)
f = open("data.txt", "r", encoding="utf-8")
content = f.read()
raise ValueError("something went wrong")  # simulate a mid-processing error
f.close()                                  # ← never reached; the file stays open

# Output:
# Traceback (most recent call last):
#   File "demo.py", line 4, in <module>
#     raise ValueError("something went wrong")
# ValueError: something went wrong
#
# ↑ Execution stops at raise; f.close() never runs.

Wrapping the work in try/finally fixes this:

# v2: try/finally guarantees the close
f = open("data.txt", "r", encoding="utf-8")
try:
    content = f.read()
    raise ValueError("something went wrong")  # simulate a mid-processing error
finally:
    f.close()      # runs whether or not the try block raised
# The exception still propagates, but f is properly closed first

That works, but you’d repeat the same boilerplate every time you handle a file, a database connection, or a lock. The with statement bakes the “acquire → use → guarantee release” pattern into the language:

# v3: with handles close automatically
with open("data.txt", "r", encoding="utf-8") as f:
    content = f.read()
    raise ValueError("something went wrong")  # simulate a mid-processing error
# When the block exits — normally or via exception — f.close() is called
# The exception still propagates, but f is closed first — same effect as v2, less code

This works because the file object returned by open() implements the context manager protocol: __enter__() runs when the block starts, __exit__() runs when it ends. Any object satisfying this protocol can go inside with—not just files, but also threading.Lock, database connections, temporary directory changes, and more.
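
As a quick check of the protocol, the sketch below confirms two things: a file is closed after its with block even when the block raises, and threading.Lock is released the same way. The scratch file and the variable names (closed_after_error, held_inside, held_after) are illustrative and not part of the original examples.

```python
import os
import tempfile
import threading

# Use a scratch directory so the example does not depend on data.txt existing.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello\n")

    try:
        with open(path, "r", encoding="utf-8") as f:
            content = f.read()
            raise ValueError("something went wrong")  # simulate an error
    except ValueError:
        pass

    # f is still bound to the file object, but __exit__() has closed it.
    closed_after_error = f.closed

# A lock follows the same protocol: acquired on entry, released on exit.
lock = threading.Lock()
with lock:
    held_inside = lock.locked()
held_after = lock.locked()

print(closed_after_error, held_inside, held_after)  # True True False
```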

From here on, every file example in this chapter uses with.

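Managing multiple resources at once

A single with statement can manage several context managers, separated by commas. They are released in reverse order when the block exits:

# One with, multiple files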
with open("input.txt", "r", encoding="utf-8") as src, \
     open("output.txt", "w", encoding="utf-8") as dst:
    for line in src:
        dst.write(line.upper())
# Both src and dst are closed automatically
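
When the number of files is only known at runtime, the comma form no longer fits. One option is contextlib.ExitStack from the standard library, which collects context managers as they are opened and closes all of them when its own block exits. The following is a minimal sketch; the concatenate() helper and the part*.txt file names are hypothetical.

```python
import os
import tempfile
from contextlib import ExitStack


def concatenate(paths, dst_path):
    """Copy the lines of every file in `paths`, in order, into `dst_path`."""
    with ExitStack() as stack:
        dst = stack.enter_context(open(dst_path, "w", encoding="utf-8"))
        sources = [
            stack.enter_context(open(p, "r", encoding="utf-8")) for p in paths
        ]
        for src in sources:
            for line in src:
                dst.write(line)
    # Leaving the block closed every registered file, in reverse order.
    return sources


with tempfile.TemporaryDirectory() as tmp_dir:
    parts = []
    for i in range(3):
        part = os.path.join(tmp_dir, f"part{i}.txt")
        with open(part, "w", encoding="utf-8") as f:
            f.write(f"line from part {i}\n")
        parts.append(part)

    merged_path = os.path.join(tmp_dir, "merged.txt")
    handles = concatenate(parts, merged_path)
    with open(merged_path, "r", encoding="utf-8") as f:
        merged = f.read()

print(merged)
```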

Open modes

| Mode | Meaning | If file doesn’t exist |
| --- | --- | --- |
| "r" | Read (default) | raises FileNotFoundError |
| "w" | Write (truncates existing content) | created |
| "a" | Append (write to end) | created |
| "x" | Exclusive create | created (raises FileExistsError if it already exists) |
| "b" | Binary mode (combine with above, e.g. "rb", "wb") | same as the base mode |
| "+" | Read+write (combine with above, e.g. "r+", "w+") | same as the base mode |

# Write: clears existing content
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("first line\n")
    f.write("second line\n")

# Append: keeps existing content, adds at the end
with open("output.txt", "a", encoding="utf-8") as f:
    f.write("third line (appended)\n")

# Binary mode: images, video, any non-text data
with open("photo.jpg", "rb") as f:
    data = f.read()
    print(f"size: {len(data)} bytes")
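
The error cases in the table can be exercised directly. The sketch below runs in a temporary directory so that nothing on disk is assumed to exist; the file name report.txt and the results dictionary are illustrative.

```python
import os
import tempfile

results = {}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "report.txt")

    # "r" on a missing file raises FileNotFoundError.
    try:
        open(path, "r", encoding="utf-8")
    except FileNotFoundError:
        results["r on missing file"] = "FileNotFoundError"

    # "x" creates the file, but only if it does not exist yet.
    with open(path, "x", encoding="utf-8") as f:
        f.write("v1\n")
    try:
        open(path, "x", encoding="utf-8")
    except FileExistsError:
        results["x on existing file"] = "FileExistsError"

    # "a" keeps the existing content and adds to the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write("v2\n")
    with open(path, "r", encoding="utf-8") as f:
        results["after append"] = f.read()

    # "w" discards whatever was there before.
    with open(path, "w", encoding="utf-8") as f:
        f.write("v3\n")
    with open(path, "r", encoding="utf-8") as f:
        results["after write"] = f.read()

for label, value in results.items():
    print(f"{label}: {value!r}")
```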

encoding

When reading or writing text files, always specify encoding explicitly. Without it, Python uses the platform default—often cp1252 on Windows, utf-8 on Linux/macOS—so the same code on a different machine can produce mojibake or raise UnicodeDecodeError:

# Always do this: be explicit about utf-8
with open("greeting.txt", "w", encoding="utf-8") as f:
    f.write("Hello, world\n")

with open("greeting.txt", "r", encoding="utf-8") as f:
    print(f.read())
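
To see what an encoding mismatch looks like in practice, the sketch below writes a non-ASCII string as UTF-8 and then reads it back with two wrong encodings. The sample text and variable names are illustrative; cp1252 and ascii are chosen only because they make the two failure modes easy to reproduce.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "greeting.txt")

    with open(path, "w", encoding="utf-8") as f:
        f.write("café\n")  # "é" is stored as two bytes in UTF-8

    # Wrong encoding that happens to accept the bytes: silent mojibake.
    with open(path, "r", encoding="cp1252") as f:
        garbled = f.read()

    # Wrong encoding that cannot represent the bytes: a loud failure.
    decode_failed = False
    try:
        with open(path, "r", encoding="ascii") as f:
            f.read()
    except UnicodeDecodeError:
        decode_failed = True

    # Matching encoding: the text round-trips unchanged.
    with open(path, "r", encoding="utf-8") as f:
        correct = f.read()

print(repr(garbled), decode_failed, repr(correct))
```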

Binary modes "rb" / "wb" neither need nor accept an encoding argument—you’re working with raw bytes.
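
A short sketch of the binary-mode behaviour described above: bytes go in and come out, decoding is an explicit step, and passing encoding together with a binary mode is rejected with a ValueError. The file name and variable names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "greeting.bin")

    # Binary write: the caller encodes the text to bytes.
    with open(path, "wb") as f:
        f.write("Hello, world\n".encode("utf-8"))

    # Binary read: the result is bytes, not str.
    with open(path, "rb") as f:
        raw = f.read()
    text = raw.decode("utf-8")  # decode explicitly when a str is needed

    # Combining a binary mode with encoding= raises ValueError.
    rejected = False
    try:
        open(path, "rb", encoding="utf-8")
    except ValueError:
        rejected = True

print(type(raw).__name__, repr(text), rejected)
```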


Three ways to read

# Way 1: read the whole file at once (fine for small files)
with open("data.txt", "r", encoding="utf-8") as f:
    content = f.read()
    print(content)

# Way 2: read into a list, one line per element (newlines included)
with open("data.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()
    print(lines)  # ['first\n', 'second\n', ...]

# Way 3: iterate line by line (memory-friendly; prefer this for large files)
with open("data.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line.rstrip())  # rstrip() removes the trailing \n
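
The three approaches return the same data in different shapes. The sketch below, using a small scratch file with illustrative contents, checks that readlines() and plain iteration yield identical lines, and that joining those lines reproduces what read() returns.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("first\nsecond\nthird\n")

    with open(path, "r", encoding="utf-8") as f:
        content = f.read()              # one str holding the whole file

    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()           # list of str, newlines kept

    with open(path, "r", encoding="utf-8") as f:
        iterated = [line for line in f]  # same lines, produced one at a time

print(lines == iterated, "".join(lines) == content)
```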

Writing your own context manager

To make your own object usable in a with statement, you have two options.

Option 1: class-based (__enter__ / __exit__)

Implement these two methods and the object becomes a context manager. The Timer below records a start time on entry and prints the elapsed time on exit:

import time

class Timer:
    def __init__(self, label):
        self.label = label

    def __enter__(self):
        self.start = time.perf_counter()
        return self          # `with ... as t` binds t to whatever __enter__ returns

    def __exit__(self, exc_type, exc_value, traceback):
        elapsed = time.perf_counter() - self.start
        print(f"[{self.label}] took {elapsed:.4f}s")
        # Return False (or None): exceptions propagate
        # Return True: exception is swallowed
        return False


with Timer("sum of squares"):
    total = sum(i * i for i in range(1_000_000))
print(f"total: {total}")

The three arguments to __exit__ are all None on a normal exit and carry exception info if the block raised. The next example mimics a database connection that always closes, success or failure:

class FakeDBConnection:
    def __init__(self, dsn):
        self.dsn = dsn

    def __enter__(self):
        print(f"connecting to {self.dsn}")
        return self

    def query(self, sql):
        print(f"executing: {sql}")
        if "DROP" in sql:
            raise RuntimeError("DROP statements are not allowed")

    def __exit__(self, exc_type, exc_value, traceback):
        print("closing connection")
        if exc_type is not None:
            print(f"  block raised: {exc_type.__name__}: {exc_value}")
        return False  # let the exception propagate


with FakeDBConnection("localhost:5432") as db:
    db.query("SELECT * FROM users")
    db.query("DROP TABLE users")  # raises, but the connection still closes
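
Both examples so far return False from __exit__, so exceptions always propagate. For contrast, here is a minimal sketch of the other branch: a hypothetical IgnoreMissingKey manager (not part of the original examples) that returns True for KeyError only, which makes the with statement swallow that exception and continue after the block.

```python
class IgnoreMissingKey:
    """Hypothetical manager: swallows KeyError, lets everything else through."""

    def __enter__(self):
        self.swallowed = None
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None and issubclass(exc_type, KeyError):
            self.swallowed = exc_value
            return True   # truthy: the exception is suppressed
        return False      # falsy: any other exception propagates


settings = {"host": "localhost"}

with IgnoreMissingKey() as guard:
    port = settings["port"]   # raises KeyError; the rest of the block is skipped
    print("never printed")

# Execution resumes here because the KeyError was swallowed.
reached_after_block = True

try:
    with IgnoreMissingKey():
        int("not a number")   # ValueError is not swallowed
except ValueError:
    propagated = True
else:
    propagated = False

print(type(guard.swallowed).__name__, reached_after_block, propagated)
```

Swallowing exceptions hides failures from the caller, so it is usually worth restricting it to a narrow, named exception type as done here.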

Option 2: the @contextlib.contextmanager decorator

Writing a whole class is overkill for simple cases. The standard library’s contextlib provides a decorator that turns a generator function into a context manager—everything before yield is the __enter__ part, everything after is __exit__:

from contextlib import contextmanager
import time

@contextmanager
def timer(label):
    start = time.perf_counter()
    try:
        yield                 # control passes to the with block
    finally:
        elapsed = time.perf_counter() - start
        print(f"[{label}] took {elapsed:.4f}s")


with timer("sum of squares"):
    total = sum(i * i for i in range(1_000_000))

Important: yield must be inside a try / finally. When the with block raises, @contextmanager re-raises that exception inside the generator at the yield, so any cleanup code that is not in a finally clause is skipped.
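
The difference is easy to observe by recording events from two otherwise identical managers. In this sketch the names without_finally, with_finally, and events are illustrative.

```python
from contextlib import contextmanager

events = []


@contextmanager
def without_finally():
    events.append("setup (unsafe)")
    yield
    events.append("cleanup (unsafe)")  # skipped if the with block raises


@contextmanager
def with_finally():
    events.append("setup (safe)")
    try:
        yield
    finally:
        events.append("cleanup (safe)")  # runs on success and on error


for manager in (without_finally, with_finally):
    try:
        with manager():
            raise RuntimeError("boom")
    except RuntimeError:
        pass  # the exception still propagates out of both managers

print(events)
# ['setup (unsafe)', 'setup (safe)', 'cleanup (safe)']
```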

You can yield a value out for the as clause to bind:

import os
from contextlib import contextmanager

@contextmanager
def change_dir(path):
    """Switch to `path` for the duration of the block, then restore."""
    original = os.getcwd()
    os.chdir(path)
    try:
        yield path
    finally:
        os.chdir(original)


with change_dir("/tmp") as cwd:
    print(f"now in: {cwd}")
    # ... operate on files under /tmp
print(f"restored to: {os.getcwd()}")

Which one should I use?

| Situation | Recommended |
| --- | --- |
| Simple “enter → exit” pairing | @contextmanager |
| Need to keep state, expose multiple methods (like db.query()) | class-based |
| Need inheritance or composition | class-based |
| One-off, short and readable | @contextmanager |

Practical examples

Safe write: temp file then rename

Write to a temporary file first, then rename it over the target, so an interrupted process never leaves a half-written file at the destination:

import os

def safe_write(path, content):
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        f.write(content)
    os.replace(tmp, path)  # atomic rename

safe_write("config.json", '{"version": 2}')
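
The version above leaves the .tmp file behind if the write fails and does not force the data to disk before renaming. A somewhat more defensive variant is sketched below; safe_write_strict is a hypothetical name, and the sketch assumes the target directory is writable. Note that tempfile.mkstemp creates the file with restrictive permissions (0o600 on POSIX), so the resulting file mode may differ from what a plain open() would produce.

```python
import os
import tempfile
from contextlib import suppress


def safe_write_strict(path, content):
    """Write `content` to `path` through a temp file in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(content)
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # ask the OS to push its buffer to disk
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        with suppress(FileNotFoundError):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "config.json")
    safe_write_strict(target, '{"version": 1}')
    safe_write_strict(target, '{"version": 2}')  # replaces the first version

    with open(target, "r", encoding="utf-8") as f:
        final_content = f.read()
    leftovers = [name for name in os.listdir(tmp_dir) if name.endswith(".tmp")]

print(final_content, leftovers)
```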

Suppress a specific exception: contextlib.suppress

from contextlib import suppress
import os

# Don't crash if the file is already gone
with suppress(FileNotFoundError):
    os.remove("maybe_not_there.tmp")
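
When suppress catches an exception, the rest of the with block is skipped and execution resumes at the first statement after it. The sketch below uses that to report whether anything was actually deleted; remove_quietly is a hypothetical helper name.

```python
import os
import tempfile
from contextlib import suppress


def remove_quietly(path):
    """Delete `path`; return True if it was removed, False if it was missing."""
    with suppress(FileNotFoundError):
        os.remove(path)
        return True   # reached only when os.remove() succeeded
    return False      # reached when FileNotFoundError was suppressed


with tempfile.TemporaryDirectory() as tmp_dir:
    victim = os.path.join(tmp_dir, "maybe_not_there.tmp")
    with open(victim, "w", encoding="utf-8") as f:
        f.write("scratch\n")

    first_attempt = remove_quietly(victim)    # file exists
    second_attempt = remove_quietly(victim)   # file is already gone

print(first_attempt, second_attempt)  # True False
```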

Counting lines in a huge file (memory-friendly)

def count_lines(path):
    with open(path, "r", encoding="utf-8") as f:
        return sum(1 for _ in f)  # iterates line by line; never loads the whole file

print(count_lines("huge.log"))

Copying a file (binary mode)

def copy_file(src, dst, chunk_size=64 * 1024):
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # The walrus operator needs Python 3.8+; read() returns b"" at end of file, which ends the loop
        while chunk := fin.read(chunk_size):
            fout.write(chunk)

copy_file("photo.jpg", "photo_backup.jpg")