Python bytes to int array

Python doesn't traditionally have much use for "numbers in big-endian C layout" that are too big for C. (If you're dealing with 2-byte, 4-byte, or 8-byte numbers, then struct.unpack is the answer.)

But enough people got sick of there not being one obvious way to do this that Python 3.2 added a method int.from_bytes that does exactly what you want:

int.from_bytes(b, byteorder='big', signed=False)
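As a quick illustration (the sample buffer below is made up for this sketch), the call is not limited to the widths a C integer type can hold:

```python
# Illustrative input: 16 bytes, wider than a 64-bit C integer.
data = bytearray(range(16))

# Interpret the whole buffer as one unsigned big-endian integer.
value = int.from_bytes(data, byteorder='big', signed=False)

# The same bytes read least-significant-byte first.
value_le = int.from_bytes(data, byteorder='little', signed=False)

print(hex(value))
print(hex(value_le))
```

The two results differ only in which end of the buffer is treated as the most significant byte.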

Unfortunately, if you're using an older version of Python, you don't have this. So, what options do you have? (Besides the obvious one: update to 3.2, or, better, 3.4…)


First, there's your code. I think binascii.hexlify is a better way to spell it than .encode('hex'), because "encode" has always seemed a little weird for a method on byte strings (as opposed to Unicode strings), and it's in fact been banished in Python 3. But otherwise, it seems pretty readable and obvious to me. And it should be pretty fast—yes, it has to create an intermediate string, but it's doing all the looping and arithmetic in C (at least in CPython), which is generally an order of magnitude or two faster than in Python. Unless your bytearray is so big that allocating the string will itself be costly, I wouldn't worry about performance here.

Alternatively, you could do it in a loop. But that's going to be more verbose and, at least in CPython, a lot slower.

You could try to eliminate the explicit loop for an implicit one, but the obvious function to do that is reduce, which is considered un-Pythonic by part of the community—and of course it's going to require calling a function for each byte.

You could unroll the loop or reduce by breaking it into chunks of 8 bytes and looping over struct.unpack_from, or by just doing a big struct.unpack('>' + 'Q' * (len(b) // 8) + 'B' * (len(b) % 8), b) and looping over that, but that makes it a lot less readable and probably not that much faster.
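For reference, a minimal sketch of the chunked struct.unpack idea might look like this. The function name chunked_int is hypothetical, and the sketch assumes big-endian input, matching the rest of this answer:

```python
import struct

def chunked_int(b):
    """Big-endian bytes -> int, consuming 8 bytes per step where possible."""
    b = bytes(b)
    n_words, n_tail = divmod(len(b), 8)
    # '>' selects big-endian with standard sizes: Q = 8 bytes, B = 1 byte.
    fmt = '>' + 'Q' * n_words + 'B' * n_tail
    fields = struct.unpack(fmt, b)
    x = 0
    # Full 8-byte words come first, each worth 64 bits.
    for word in fields[:n_words]:
        x = (x << 64) | word
    # Leftover single bytes come last, each worth 8 bits.
    for byte in fields[n_words:]:
        x = (x << 8) | byte
    return x

print(chunked_int(bytearray(range(256))) == int(bytes(range(256)).hex(), 16))
```

It still loops in Python, just over one eighth as many items, which is why it is unlikely to beat the hexlify approach.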

You could use NumPy… but if you're going bigger than either 64 or maybe 128 bits, it's going to end up converting everything to Python objects anyway.

So, I think your answer is the best option.


Here are some timings comparing it to the most obvious manual conversion:

import binascii
import functools
import numpy as np

def hexint(b):
    return int(binascii.hexlify(b), 16)

def loop1(b):
    def f(x, y): return (x<<8)|y
    return functools.reduce(f, b, 0)

def loop2(b):
    x = 0
    for c in b:
        x <<= 8
        x |= c
    return x

def numpily(b):
    n = np.array(list(b))
    # Byte i (counting from the right) has weight 2**(8*i), so shift by 8*i bits.
    p = 1 << (8 * np.arange(len(b)-1, -1, -1, dtype=object))
    return np.sum(n * p)

In [226]: b = bytearray(range(256))

In [227]: %timeit hexint(b)
1000000 loops, best of 3: 1.8 µs per loop

In [228]: %timeit loop1(b)
10000 loops, best of 3: 57.7 µs per loop

In [229]: %timeit loop2(b)
10000 loops, best of 3: 46.4 µs per loop

In [283]: %timeit numpily(b)
10000 loops, best of 3: 88.5 µs per loop

For comparison in Python 3.4:

In [17]: %timeit hexint(b)
1000000 loops, best of 3: 1.69 µs per loop

In [17]: %timeit int.from_bytes(b, byteorder='big', signed=False)
1000000 loops, best of 3: 1.42 µs per loop

So, your method is still pretty fast…

    A bytes object can easily be converted to an integer in Python. Python provides various built-in methods, such as int.from_bytes(), as well as classes to carry out this conversion.

    int.from_bytes() method

    A bytes value can be converted to an int with the int.from_bytes() method. This method requires at least Python 3.2 and has the following syntax:

    Syntax: int.from_bytes(bytes, byteorder, *, signed=False)

    Parameters:

    • bytes – A bytes-like object or an iterable producing bytes
    • byteorder – Determines the byte order used to represent the integer. It is either "little", where the most significant byte is stored at the end and the least significant byte at the beginning, or "big", where the most significant byte is stored at the start and the least significant byte at the end. In either order, the bytes are treated as the digits of a base-256 number.
    • signed – Default value: False. Indicates whether two's complement is used to represent the integer, which allows negative values.

    Returns – the int represented by the given bytes
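To make the base-256 reading of byteorder concrete, here is a minimal sketch that writes the unsigned conversion out as a positional sum and compares it with int.from_bytes(). The helper name from_bytes_manual and the sample bytes are made up for this illustration:

```python
def from_bytes_manual(data, byteorder):
    """Unsigned conversion written out as a base-256 positional sum."""
    digits = list(data)
    if byteorder == "big":
        digits.reverse()  # make index i correspond to weight 256**i
    return sum(d * 256 ** i for i, d in enumerate(digits))

sample = b'\x12\x34\x56'
print(from_bytes_manual(sample, "big") == int.from_bytes(sample, "big"))
print(from_bytes_manual(sample, "little") == int.from_bytes(sample, "little"))
```

The only difference between the two byte orders is which end of the sequence gets the weight 256**0.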

    The following snippets show the conversion of bytes objects to int.

    Example 1:

    Python3

    byte_val = b'\x00\x01'

    int_val = int.from_bytes(byte_val, "big")

    print(int_val)

    Output:

    1

    Example 2:

    Python3

    byte_val = b'\x00\x10'

    int_val = int.from_bytes(byte_val, "little")

    print(int_val)

    Output:

    4096

    Example 3:

    Python3

    byte_val = b'\xfc\x00'

    int_val = int.from_bytes(byte_val, "big", signed=True)

    print(int_val)

    Output:

    -1024
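The signed result can be checked by hand. The sketch below (variable names are chosen for this illustration) reads the same two bytes as unsigned and then applies the two's complement rule: if the top bit is set, subtract 2 raised to the number of bits.

```python
byte_val = b'\xfc\x00'

unsigned = int.from_bytes(byte_val, "big")   # the bytes read as 0xfc00
bits = 8 * len(byte_val)

# Two's complement: if the top bit is set, subtract 2**bits.
if unsigned & (1 << (bits - 1)):
    manual = unsigned - (1 << bits)
else:
    manual = unsigned

print(unsigned, manual)
```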

    How do you create a byte array in Python?

    string = "Python is interesting."
    # string with encoding 'utf-8'
    arr = bytearray(string, 'utf-8')
    print(arr)

    size = 5
    arr = bytearray(size)
    print(arr)

    rList = [1, 2, 3, 4, 5]
    arr = bytearray(rList)
    print(arr)

    What is Bytearray Python?

    The Python bytearray() function converts strings or collections of integers into a mutable sequence of bytes. It provides developers the usual methods Python affords to both mutable and byte data types. Python's bytearray() built-in allows for high-efficiency manipulation of data in several common situations.
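A small sketch of what "mutable" means in practice (the sample values are arbitrary): a bytearray can be changed in place and then passed to int.from_bytes() like any other bytes-like object.

```python
arr = bytearray(b'abc')

arr[0] = ord('x')      # item assignment works because bytearray is mutable
arr.append(0x21)       # append a single byte value (0-255)
arr.extend(b'??')      # extend with any bytes-like object

print(arr)
print(int.from_bytes(arr, "big"))
```

The same item assignment on a bytes object would raise a TypeError, since bytes is immutable.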

    Can we assign byte to int?

    In Java (this answer does not apply to Python), a byte can be assigned directly to an int. Alternatively, the Byte wrapper class provides the intValue() method, which returns the value of the byte as an int after a widening primitive conversion, since a smaller data type is being stored in a larger one. If we take the byte as unsigned, then we have the Byte.