Python read file as bytes

Reading binary file in Python and looping over each byte

The pathlib module (new in Python 3.4) gained a convenience method in Python 3.5, read_bytes, which reads a whole file in as bytes, allowing us to iterate over the bytes. I consider this a decent (if quick and dirty) answer:

import pathlib

for byte in pathlib.Path(path).read_bytes():
    print(byte)

Interesting that this is the only answer to mention pathlib.

In Python 2, you probably would do this (as Vinay Sajip also suggests):

with open(path, 'rb') as file:
    for byte in file.read():
        print(byte)

If the file may be too large to hold in memory at once, you would chunk it, idiomatically, using the iter function with the callable, sentinel signature. The Python 2 version:

with open(path, 'rb') as file:
    read_chunk = lambda: file.read(1024)  # the callable
    sentinel = bytes() # or b''
    for chunk in iter(read_chunk, sentinel):
        for byte in chunk:
            print(byte)

(Several other answers mention this, but few offer a sensible read size.)

Best practice for large files or buffered/interactive reading

Let's create a function to do this, including idiomatic uses of the standard library for Python 3.5+:

from pathlib import Path
from functools import partial
from io import DEFAULT_BUFFER_SIZE

def file_byte_iterator(path):
    """given a path, return an iterator over the file
    that lazily loads the file
    """
    path = Path(path)
    with path.open('rb') as file:
        reader = partial(file.read1, DEFAULT_BUFFER_SIZE)
        file_iterator = iter(reader, bytes())
        for chunk in file_iterator:
            yield from chunk

Note that we use file.read1. file.read blocks until it gets all the bytes requested of it or reaches EOF. file.read1 makes at most one read call on the underlying raw stream, so it can return as soon as any data is available instead of waiting for a full chunk. No other answers mention this either.
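
The difference is easiest to see on a stream that delivers data in small pieces. The sketch below is only an illustration: TrickleRaw is a hypothetical raw stream written for this example (it is not part of the standard library), standing in for a pipe or socket that hands over 3 bytes per read.

```python
import io

class TrickleRaw(io.RawIOBase):
    """Hypothetical raw stream that returns at most `step` bytes per read,
    imitating a pipe or socket where data arrives in small pieces."""

    def __init__(self, data, step):
        super().__init__()
        self._data = data
        self._pos = 0
        self._step = step

    def readable(self):
        return True

    def readinto(self, buffer):
        # Copy at most `step` bytes into the caller's buffer.
        size = min(self._step, len(buffer))
        chunk = self._data[self._pos:self._pos + size]
        buffer[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)

payload = b"abcdefghijkl"

with io.BufferedReader(TrickleRaw(payload, 3)) as reader:
    full = reader.read(10)      # keeps reading the raw stream until 10 bytes arrive

with io.BufferedReader(TrickleRaw(payload, 3)) as reader:
    partial = reader.read1(10)  # at most one raw read, so only what was available

print(full)     # b'abcdefghij'
print(partial)  # b'abc'
```

With an ordinary file on disk both calls usually return a full chunk; the distinction matters for interactive or slow sources.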

Demonstration of best practice usage:

Let's make a file with a megabyte (actually mebibyte) of pseudorandom data:

import random
import pathlib
path = 'pseudorandom_bytes'
pathobj = pathlib.Path(path)

pathobj.write_bytes(
  bytes(random.randint(0, 255) for _ in range(2**20)))

Now let's iterate over it and materialize it in memory:

>>> l = list(file_byte_iterator(path))
>>> len(l)
1048576

We can inspect any part of the data, for example, the last 100 and first 100 bytes:

>>> l[-100:]
[208, 5, 156, 186, 58, 107, 24, 12, 75, 15, 1, 252, 216, 183, 235, 6, 136, 50, 222, 218, 7, 65, 234, 129, 240, 195, 165, 215, 245, 201, 222, 95, 87, 71, 232, 235, 36, 224, 190, 185, 12, 40, 131, 54, 79, 93, 210, 6, 154, 184, 82, 222, 80, 141, 117, 110, 254, 82, 29, 166, 91, 42, 232, 72, 231, 235, 33, 180, 238, 29, 61, 250, 38, 86, 120, 38, 49, 141, 17, 190, 191, 107, 95, 223, 222, 162, 116, 153, 232, 85, 100, 97, 41, 61, 219, 233, 237, 55, 246, 181]
>>> l[:100]
[28, 172, 79, 126, 36, 99, 103, 191, 146, 225, 24, 48, 113, 187, 48, 185, 31, 142, 216, 187, 27, 146, 215, 61, 111, 218, 171, 4, 160, 250, 110, 51, 128, 106, 3, 10, 116, 123, 128, 31, 73, 152, 58, 49, 184, 223, 17, 176, 166, 195, 6, 35, 206, 206, 39, 231, 89, 249, 21, 112, 168, 4, 88, 169, 215, 132, 255, 168, 129, 127, 60, 252, 244, 160, 80, 155, 246, 147, 234, 227, 157, 137, 101, 84, 115, 103, 77, 44, 84, 134, 140, 77, 224, 176, 242, 254, 171, 115, 193, 29]

Don't iterate by lines for binary files

Don't do the following. It pulls a chunk of arbitrary size, up to the next newline byte, which is slow when the chunks are small and can use too much memory when they are large:

    with open(path, 'rb') as file:
        for chunk in file: # text newline iteration - not for bytes
            yield from chunk

The above is only good for files that are semantically human-readable text (plain text, code, markup, Markdown, and so on; essentially anything encoded as ASCII, UTF-8, Latin-1, etc.), and those you should open without the 'b' flag.
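
For contrast, here is a minimal sketch of line iteration on a file that really is text. The file name notes.txt and the UTF-8 encoding are assumptions made for the illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical text file, created in a temporary directory for the demo.
with tempfile.TemporaryDirectory() as tmp:
    text_path = Path(tmp) / "notes.txt"
    text_path.write_text("first line\nsecond line\n", encoding="utf-8")

    # No 'b' in the mode: Python decodes the bytes and yields str lines.
    with open(text_path, "r", encoding="utf-8") as file:
        lines = [line.rstrip("\n") for line in file]

print(lines)  # ['first line', 'second line']
```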

In this Python tutorial, we will learn how to read a binary file in Python. We will cover these topics:

  • How to read a binary file to an array in Python
  • How to read a binary file into a byte array in Python
  • How to read a binary file line by line in Python
  • How to read a binary file to ASCII in Python
  • How to read a binary file into a NumPy array in Python
  • How to read a binary file into CSV in Python

Here, we will see how to read a binary file in Python.

  • Before reading a file, we have to write one. In this example, I open a file with file = open("document.bin", "wb"); the "wb" mode opens it for writing in binary.
  • document.bin is the name of the file.
  • I assign the text "This is good" to a variable named sentence. To encode it as bytes, I use sentence = bytearray("This is good".encode("ascii")).
  • To write the sentence to the file, I use the file.write() method.
  • write() writes the given bytes to the file. Finally, file.close() closes the file.

Example to write the file:

file = open("document.bin","wb")
sentence = bytearray("This is good".encode("ascii"))
file.write(sentence)
file.close()

  • To read the file, I open the already created document.bin in "rb" mode, which reads the file in binary.
  • I then call the read() method. read(n) returns at most n bytes from the file; with no argument, it returns everything up to the end of the file.

Example to read the file:

file = open("document.bin","rb")
print(file.read(4))
file.close()

The example calls print(file.read(4)), so only the first four bytes of the sentence are read. The output is b'This'.
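
The same write and read steps can also be sketched with the with statement, which closes the file automatically even if an error occurs. The variable names first_four and rest are introduced here for illustration only.

```python
# Write the sample file, then read back only the first four bytes.
with open("document.bin", "wb") as file:
    file.write("This is good".encode("ascii"))

with open("document.bin", "rb") as file:
    first_four = file.read(4)   # bytes object of length 4
    rest = file.read()          # everything after the first four bytes

print(first_four)  # b'This'
print(rest)        # b' is good'
```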

Python read a binary file to an array

Here, we can see how to read a binary file to an array in Python.

  • In this example, I open a file named array.bin in "wb" mode to write the binary file.
  • I assign a list, num = [2, 4, 6, 8, 10], and convert it to bytes with bytearray(). bytearray() returns a bytearray object; each integer must be in the range 0 to 255.
  • To write the array to the file, I use file.write(), and file.close() to close the file.

Example to write an array to the file:

file=open("array.bin","wb")
num=[2,4,6,8,10]
array=bytearray(num)
file.write(array)
file.close()

  • To read the written array back, I open the same file: file = open("array.bin", "rb").
  • The "rb" mode reads the file in binary.
  • file.read(3) reads only the first three bytes from the file, and list() turns them into a list of integers: number = list(file.read(3)).
  • file.close() closes the file.

Example to read an array from the file:

file=open("array.bin","rb")
number=list(file.read(3))
print (number)
file.close()

print(number) displays the output, [2, 4, 6].
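
If a typed array is wanted rather than a plain list, the standard library's array module is one option. This is a sketch under the assumption that every value in the file is a single unsigned byte, as in the example above; the variable name numbers is introduced for illustration.

```python
from array import array
from pathlib import Path

# Recreate the sample file so the example is self-contained.
Path("array.bin").write_bytes(bytearray([2, 4, 6, 8, 10]))

numbers = array("B")                               # 'B' = unsigned 1-byte integers
numbers.frombytes(Path("array.bin").read_bytes())  # append the whole file

print(numbers.tolist())  # [2, 4, 6, 8, 10]
```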

Python read a binary file into a byte array

Now, we can see how to read a binary file into a byte array in Python.

  • In this example, I open a file called sonu.bin in "rb" mode to read it as binary. I have already stored some data in sonu.bin.
  • byte = file.read(3) reads up to 3 bytes from the file at a time.
  • The while loop keeps reading until file.read(3) returns an empty bytes object, which signals the end of the file.

Example:

file = open("sonu.bin", "rb")
byte = file.read(3)
while byte:
    print(byte)
    byte = file.read(3)
file.close()

Each print(byte) call displays the next chunk of up to 3 bytes as a bytes object.
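
The loop above prints bytes objects chunk by chunk. To end up with an actual bytearray holding the file contents, one possible approach is readinto(), sketched below. The file name demo_bytes.bin and its contents are hypothetical sample data, since the tutorial does not show what sonu.bin contains.

```python
import os

# Hypothetical sample data written only for this demonstration.
with open("demo_bytes.bin", "wb") as file:
    file.write(b"abcdefgh")

size = os.path.getsize("demo_bytes.bin")
buffer = bytearray(size)                 # pre-allocated, mutable byte array

with open("demo_bytes.bin", "rb") as file:
    bytes_read = file.readinto(buffer)   # fills the buffer in place

buffer[0] = ord("A")   # unlike bytes, a bytearray can be modified
print(bytes_read)      # 8
print(buffer)          # bytearray(b'Abcdefgh')
```

For small files, bytearray(file.read()) gives the same result with less code; readinto() avoids the intermediate copy.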

Python read a binary file line by line

Here, we can see how to read a binary file line by line in Python.

  • In this example, I define the lines to write as lines = [b"Welcome to python guides\n"] and open a file with file = open("document1.txt", "wb"); document1.txt is the filename.
  • "wb" is the mode used to write binary files, so each line must be a bytes object (note the b prefix); passing str objects raises a TypeError. file.writelines(lines) writes the lines to the file.
  • writelines() writes a sequence of bytes objects to the file. The file.close() method closes the file.

Example to write the file:

lines=[b"Welcome to python guides\n"]
file=open("document1.txt","wb")
file.writelines(lines)
file.close()

  • To read the written file, I open the same file with file = open("document1.txt", "rb"); the "rb" mode reads the file in binary. To read a line from the file, I use line = file.readline().
  • readline() returns one line from the file, up to and including the newline byte.

Example to read the file:

file=open("document1.txt","rb")
line=file.readline()
print(line)
file.close()

print(line) displays the line as a bytes object, including its trailing newline byte, and file.close() closes the file.
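
readline() returns a single line. To process every line, a binary file object can be iterated directly, as in this sketch; the second line of sample data and the ASCII decoding step are assumptions added for illustration.

```python
# Write two lines of bytes so there is more than one line to read back.
with open("document1.txt", "wb") as file:
    file.writelines([b"Welcome to python guides\n", b"Second line\n"])

# Iterating a binary file yields one bytes object per line.
with open("document1.txt", "rb") as file:
    decoded = [raw_line.decode("ascii").rstrip("\n") for raw_line in file]

print(decoded)  # ['Welcome to python guides', 'Second line']
```

This only makes sense when the binary data is known to contain newline-delimited records.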

Python read a binary file to Ascii

Now, we can see how to read a binary file to Ascii in Python.

  • In this example, I open a file named test.bin using file = open('test.bin', 'wb+'). The 'wb+' mode opens the binary file for both writing and reading; plain 'wb' would not allow the read that follows. I assign sentence = 'Hello Python'.
  • To encode the sentence, I use file_encode = sentence.encode('ASCII'). To write the encoded sentence to the file, I use file.write(file_encode).
  • file.seek(0) moves the file position back to the start of the file and returns the new position. To read the written data, I use file.read(), which returns the file contents as bytes.
  • To convert the bytes to an ASCII string, I use new_sentence = bdata.decode('ASCII').

Example:

file = open('test.bin', 'wb+')
sentence = 'Hello Python'
file_encode = sentence.encode('ASCII')
file.write(file_encode)
file.seek(0)
bdata = file.read()
print('Binary sentence', bdata)
new_sentence = bdata.decode('ASCII')
print('ASCII sentence', new_sentence)
file.close()

print('ASCII sentence', new_sentence) displays the decoded sentence. The output is:

Binary sentence b'Hello Python'
ASCII sentence Hello Python
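
Real binary files often contain bytes outside the ASCII range, and a strict decode then raises UnicodeDecodeError. The sketch below shows the errors parameter of bytes.decode() as one way to handle that; the sample bytes are hypothetical.

```python
data = b"Hello Python\xff"   # hypothetical bytes; 0xFF is not valid ASCII

try:
    data.decode("ascii")     # strict decoding is the default
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

replaced = data.decode("ascii", errors="replace")  # bad byte becomes U+FFFD
ignored = data.decode("ascii", errors="ignore")    # bad byte is dropped

print(strict_ok)  # False
print(ignored)    # Hello Python
```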

Python read a binary file into a NumPy array

Here, we can see how to read a binary file into a numpy array in Python.

  • In this example, I import the NumPy module. array = np.array([2, 8, 7], dtype=np.int8) creates an array of 8-bit integers, and array.tofile("array.bin") writes the whole array to the binary file array.bin.
  • np.fromfile constructs an array from the data in the file. dtype=np.int8 tells NumPy how to interpret the raw bytes, and it must match the dtype the data was written with; the output changes if we read with np.int32 or np.int64 instead.

Example:

import numpy as np
array = np.array([2,8,7], dtype=np.int8)
array.tofile("array.bin")
print(np.fromfile("array.bin",  dtype=np.int8))

print(np.fromfile("array.bin", dtype=np.int8)) displays the output, [2 8 7].
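
Why the dtype changes the result can be seen without NumPy. The sketch below uses the standard library's struct module (a swapped-in stand-in for np.fromfile, chosen so the example has no dependencies) to interpret the same raw bytes first as 32-bit and then as 8-bit integers. Little-endian byte order is assumed.

```python
import struct

# Three 32-bit little-endian integers, laid out the way a 32-bit integer
# array would be stored on a little-endian machine.
raw = struct.pack("<3i", 2, 8, 7)

as_int32 = list(struct.unpack("<3i", raw))    # one value per 4 bytes
as_int8 = list(struct.unpack("<12b", raw))    # one value per byte

print(as_int32)  # [2, 8, 7]
print(as_int8)   # [2, 0, 0, 0, 8, 0, 0, 0, 7, 0, 0, 0]
```

Reading with an element size smaller than the one used for writing exposes the padding zero bytes of each integer as separate values.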

Python read a binary file into CSV

Here, we can see how to read binary file into csv in Python.

  • In this example, I import the csv module, which reads and writes tabular data in comma-separated values (CSV) format.
  • I open a file called lock.bin in "w" mode with newline="", which the csv module recommends so that it can control line endings itself. Despite the .bin name, the file holds CSV text, so it is opened in text mode. writer = csv.writer(f) creates the writer object.
  • csv.writer() returns a writer object that converts the data into delimited strings.
  • writer.writerows writes all the rows to the file. To close the file, f.close() is used.

Example to write the csv file:

import csv
f = open("lock.bin", "w", newline="")
writer = csv.writer(f)
writer.writerows([["a", 1], ["b", 2], ["c", 3], ["d",4]])
f.close()

To read the CSV file, I open lock.bin, in which the data is already written, in 'r' mode. reader = csv.reader(file) returns an iterator that yields each row of the file as a list of strings.

Example to read the csv file:

import csv
with open('lock.bin', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

print(row) displays each row. Note that the numbers come back as strings:

['a', '1']
['b', '2']
['c', '3']
['d', '4']
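
The example above stores CSV text in a file with a .bin extension. When the source really is binary, each record has to be unpacked before it can be written as a CSV row. The following is a minimal sketch under an assumed record layout (a 1-byte label followed by a little-endian 32-bit integer); the layout and the file names records.bin and records.csv are hypothetical.

```python
import csv
import struct

# Hypothetical record layout: 1-byte label + little-endian int32 (5 bytes total).
RECORD = struct.Struct("<1si")

# Create a sample binary file with four fixed-size records.
with open("records.bin", "wb") as f:
    for label, value in [(b"a", 1), (b"b", 2), (b"c", 3), (b"d", 4)]:
        f.write(RECORD.pack(label, value))

# Read one record at a time and write each as a CSV row.
with open("records.bin", "rb") as src, open("records.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    for chunk in iter(lambda: src.read(RECORD.size), b""):
        label, value = RECORD.unpack(chunk)
        writer.writerow([label.decode("ascii"), value])

with open("records.csv", newline="") as f:
    rows = list(csv.reader(f))

print(rows)  # [['a', '1'], ['b', '2'], ['c', '3'], ['d', '4']]
```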

In this tutorial, we learned how to read a binary file in Python and covered these topics:

  • Python read a binary file to an array
  • Python read a binary file into a byte array
  • Python read a binary file line by line
  • Python read a binary file to Ascii
  • Python read a binary file into a NumPy array
  • Python read a binary file into CSV

How do I read a binary file in Python?

The open() function opens a file in text format by default. To open a file in binary format, add 'b' to the mode parameter. Hence the "rb" mode opens the file in binary format for reading, while the "wb" mode opens the file in binary format for writing. Unlike text files, binary files are not human-readable.
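
A short sketch of the difference between the two modes follows. The file name mode_demo.bin and the sample bytes (the UTF-8 encoding of "café") are assumptions for illustration.

```python
# Write five raw bytes: 'c', 'a', 'f' and the two-byte UTF-8 encoding of 'é'.
with open("mode_demo.bin", "wb") as f:
    f.write(b"caf\xc3\xa9")

with open("mode_demo.bin", "rb") as f:
    as_bytes = f.read()   # binary mode returns bytes, untouched

with open("mode_demo.bin", "r", encoding="utf-8") as f:
    as_text = f.read()    # text mode decodes the bytes into a str

print(as_bytes, len(as_bytes))  # b'caf\xc3\xa9' 5
print(as_text, len(as_text))    # café 4
```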

How do you convert a text file to binary in Python?

Open the destination file in "wb" mode and write bytes to it:

file = open("sample.bin", "wb")
file.write(b"This binary string will be written to sample.bin")
file.close()
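
To convert an existing text file rather than a literal, one possible approach is to read the text, encode it, and write the resulting bytes. The file name sample.txt, its contents, and the UTF-8 encoding are assumptions for this sketch.

```python
from pathlib import Path

# Hypothetical source text file.
Path("sample.txt").write_text("This text will be converted\n", encoding="utf-8")

text = Path("sample.txt").read_text(encoding="utf-8")   # str
Path("sample.bin").write_bytes(text.encode("utf-8"))    # str -> bytes on disk

binary = Path("sample.bin").read_bytes()
# Show the first four bytes as 8-bit binary digits.
bits = " ".join(f"{byte:08b}" for byte in binary[:4])

print(binary)  # b'This text will be converted\n'
print(bits)    # 01010100 01101000 01101001 01110011
```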

What is RB mode in Python?

rb : Opens the file as read-only in binary format and starts reading from the beginning of the file. While binary format can be used for different purposes, it is usually used when dealing with things like images, videos, etc. r+ : Opens a file for reading and writing, placing the pointer at the beginning of the file.

What is binary file Python?

"Binary" files are any files where the format isn't made up of readable characters. Binary files can range from image files like JPEGs or GIFs, audio files like MP3s or binary document formats like Word or PDF. In Python, files are opened in text mode by default.