Description
The built-in function cmp() compares the elements of two tuples. It is available in Python 2 only; it was removed in Python 3.
Syntax
Following is the syntax for cmp() −
cmp(tuple1, tuple2)
Parameters
tuple1 − This is the first tuple to be compared
tuple2 − This is the second tuple to be compared
Return Value
cmp() returns a negative number if tuple1 sorts before tuple2, 0 if the two are equal, and a positive number if tuple1 sorts after tuple2. The tuples are compared element by element:
If the elements are of the same type, compare them and return the result. If the elements are of different types, check whether they are numbers.
If both are numbers, perform numeric coercion if necessary and compare.
If only one element is a number, the other element is "larger" (numbers are "smallest").
Otherwise, the types are ordered alphabetically by type name.
If the end of one tuple is reached first, the longer tuple is "larger". If both tuples are exhausted and hold the same data, the result is a tie and 0 is returned.
Example
The following example shows the usage of cmp().
#!/usr/bin/python
# Python 2 only: cmp() and the print statement do not exist in Python 3

tuple1, tuple2 = (123, 'xyz'), (456, 'abc')
print cmp(tuple1, tuple2)
print cmp(tuple2, tuple1)
tuple3 = tuple2 + (786,)
print cmp(tuple2, tuple3)
When we run the above program, it produces the following result −
-1
1
-1
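Because cmp() is not available in Python 3, the same three-way result has to be built from the ordinary comparison operators there. The following is a minimal sketch, assuming the elements of both tuples are mutually comparable (Python 3 raises TypeError for mixed types such as int and str instead of ordering them by type name). The helper name cmp_tuples is hypothetical.

```python
def cmp_tuples(a, b):
    """Return -1, 0 or 1, like Python 2's cmp(), using Python 3 tuple ordering."""
    # Booleans subtract as integers: True - False == 1, False - True == -1.
    return (a > b) - (a < b)


tuple1, tuple2 = (123, 'xyz'), (456, 'abc')
print(cmp_tuples(tuple1, tuple2))   # -1
print(cmp_tuples(tuple2, tuple1))   # 1
tuple3 = tuple2 + (786,)
print(cmp_tuples(tuple2, tuple3))   # -1
```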
Is there a way to reversibly compress a tuple of integers in Python? I have a large number of 5-element tuples (values range from 0 to 100000) that I want to store in a more memory-efficient way, but I also need to use their original values at a later point.
If I had a tuple like this:
test_tuple = (520, 203, 9721, 12, 4839)
I'd like to be able to compress it to a single integer value, similar to Python's default hash function, except I need to be able to recreate the tuple from the integer value, which is not possible with the hash function.
So something like:
compressed = compress(test_tuple)
og_tuple = decompress(compressed)
Where compressed is an integer (or other small-memory) representation of test_tuple and og_tuple is the original tuple extracted from the compressed representation. Ideally the solution should also be fast.
martineau
asked Oct 23, 2021 at 20:09
This probably isn't the fastest or smallest form of compression, but it's probably the easiest to follow. Get the lengths of all the integers and prepend them, as digits, to the digits of the integers themselves:
test_tuple = (520, 203, 9721, 12, 4839)
strs = list(map(str, test_tuple))
compressed = int(''.join(map(str, map(len, strs))) + ''.join(strs))
Output:
334245202039721124839
Then to decompress: since all tuples contain 5 integers and no integer has more than 9 digits, you know the first 5 characters are the lengths, so just use them to unpack the integers again. I used itertools.islice for this:
from itertools import islice
s = str(compressed)
idxs, nums = map(int, s[:5]), [iter(s[5:])]*5
decompressed = tuple(int(''.join(islice(n, i))) for i, n in zip(idxs, nums))
Output:
(520, 203, 9721, 12, 4839)
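The two steps above can be wrapped into the compress/decompress pair the question asks for. This is a sketch of the same length-prefix idea, assuming non-negative integers of at most 9 digits each and a fixed tuple size (the `size` parameter and the ValueError check are additions for illustration, not part of the answer).

```python
from itertools import islice


def compress(tup):
    """Pack a tuple of non-negative ints into one int: length digits, then value digits."""
    strs = [str(n) for n in tup]
    # Each length must fit in a single digit for the header to stay unambiguous.
    if any(n < 0 or len(s) > 9 for n, s in zip(tup, strs)):
        raise ValueError("values must be non-negative with at most 9 digits")
    return int(''.join(str(len(s)) for s in strs) + ''.join(strs))


def decompress(value, size=5):
    """Recover the original tuple; `size` is the fixed number of elements."""
    s = str(value)
    lengths = [int(c) for c in s[:size]]
    digits = iter(s[size:])  # one shared iterator, consumed chunk by chunk
    return tuple(int(''.join(islice(digits, n))) for n in lengths)


test_tuple = (520, 203, 9721, 12, 4839)
compressed = compress(test_tuple)
print(compressed)               # 334245202039721124839
print(decompress(compressed))   # (520, 203, 9721, 12, 4839)
```

Python integers have arbitrary precision, so the 21-digit result does not overflow.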
answered Oct 23, 2021 at 20:42
Jab
array stores fixed-length integers efficiently. You could use that to build a single bytes object holding all of your data and then use a compression algorithm like LZMA to slim it down further still.
import array
import lzma
test_list = [tuple(range(i, i+5)) for i in range(100_000)]
arr_bytes = b"".join(array.array("I", tup).tobytes() for tup in test_list)
compressed = lzma.compress(arr_bytes)
A longer version that includes size checks and unwinding back to the original list of tuples:
import array
import lzma
import sys
# generate a test list and get its size
test_list = [tuple(range(i, i+5)) for i in range(100_000)]
sz = sum(sum(sys.getsizeof(i) for i in tup) for tup in test_list)
sz += sum(sys.getsizeof(tup) for tup in test_list)
sz += sys.getsizeof(test_list)
print("orig size", sz)
# convert tuples to fixed size array of 4 byte ints, then concat to bytes
arr_format = "I"
arr_item_size = 4
assert array.array("I").itemsize == arr_item_size, "pack to 4 bytes"
arr_len = arr_item_size * 5
arr_bytes = b"".join(array.array(arr_format, tup).tobytes() for tup in test_list)
print("as bytes", len(arr_bytes), f"{len(arr_bytes)/sz*100:.2f}%")
# compress
compressed = lzma.compress(arr_bytes)
print("compressed", len(compressed), f"{len(compressed)/sz*100:.2f}%")
# sanity check that we got the same stuff back
arr_bytes_decompressed = lzma.decompress(compressed)
assert arr_bytes_decompressed == arr_bytes, "decompressed right"
test_list_decompressed = [tuple(array.array(arr_format, arr_bytes_decompressed[i:i+arr_len]))
    for i in range(0, len(arr_bytes_decompressed), arr_len)]
assert test_list_decompressed == test_list, "test list"
Running it, I get:
orig size 22800980
as bytes 2000000 8.77%
compressed 55972 0.25%
The compressed size depends on how random your integers are, but still, that ain't bad!
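To see how much the randomness of the data matters, the sketch below packs two lists the same way and compares their compressed sizes. The helper name compressed_size, the list size of 10,000 tuples, and the fixed random seed are assumptions chosen for illustration; exact byte counts will vary by platform and library version, so none are quoted here.

```python
import array
import lzma
import random


def compressed_size(tuples):
    """Pack tuples as unsigned ints (4 bytes on most platforms), return the LZMA size."""
    raw = b"".join(array.array("I", tup).tobytes() for tup in tuples)
    return len(lzma.compress(raw))


random.seed(0)  # fixed seed so repeated runs use the same data
n = 10_000
sequential = [tuple(range(i, i + 5)) for i in range(n)]
randomized = [tuple(random.randint(0, 100_000) for _ in range(5)) for _ in range(n)]

print("sequential", compressed_size(sequential))
print("random    ", compressed_size(randomized))
```

The sequential data is highly repetitive and compresses far better than the random data, which is closer to what uniformly distributed real values would look like.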
answered Oct 23, 2021 at 21:38
tdelaney
Sometimes the obvious solution pops up after going through an answer. Since this is the best solution (IMHO), I'm posting it separately.
Just pickle it.
import pickle
import lzma
test_list = [tuple(range(i, i+5)) for i in range(100_000)]
compressed = lzma.compress(pickle.dumps(test_list))
A more complicated version with size measurements:
import lzma
import pickle
import sys
# generate a test list and get its size
test_list = [tuple(range(i, i+5)) for i in range(100_000)]
sz = sum(sum(sys.getsizeof(i) for i in tup) for tup in test_list)
sz += sum(sys.getsizeof(tup) for tup in test_list)
sz += sys.getsizeof(test_list)
print("orig size", sz)
pickled = pickle.dumps(test_list)
print("pickled", len(pickled), f"{len(pickled)/sz*100:.2f}%")
compressed = lzma.compress(pickled)
print("compressed", len(compressed), f"{len(compressed)/sz*100:.2f}%")
Output
orig size 22800980
pickled 2143892 9.40%
compressed 152068 0.67%
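Getting the original data back is a matter of reversing the two steps. A minimal round-trip sketch, reusing the same test data as above (the variable name restored is only illustrative):

```python
import lzma
import pickle

test_list = [tuple(range(i, i + 5)) for i in range(100_000)]
compressed = lzma.compress(pickle.dumps(test_list))

# Reverse the two steps: decompress first, then unpickle.
restored = pickle.loads(lzma.decompress(compressed))
assert restored == test_list
```

As with any use of pickle, only unpickle data that comes from a source you trust.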
answered Oct 23, 2021 at 22:08
tdelaney