Best way to check if a list is empty
For example, if passed the following:
a = []
How do I check to see if a is empty?
Short Answer:
Place the list in a boolean context (for example, with an if or while statement). It will test False if it is empty, and True otherwise. For example:
if not a:  # do this!
    print('a is an empty list')
PEP 8
PEP 8, the official Python style guide for Python code in Python's standard library, asserts:
For sequences, (strings, lists, tuples), use the fact that empty sequences are false.
Yes: if not seq:
     if seq:
No:  if len(seq):
     if not len(seq):
We should expect that standard library code should be as performant and correct as possible. But why is that the case, and why do we need this guidance?
Explanation
I frequently see code like this from experienced programmers new to Python:
if len(a) == 0:  # Don't do this!
    print('a is an empty list')
And users of lazy languages may be tempted to do this:
if a == []:  # Don't do this!
    print('a is an empty list')
These are correct in their respective other languages. And this is even semantically correct in Python.
But we consider it un-Pythonic because Python supports these semantics directly in the list object's interface via boolean coercion.
From the docs (and note specifically the inclusion of the empty list, []):
By default, an object is considered true unless its class defines either a __bool__() method that returns False or a __len__() method that returns zero, when called with the object. Here are most of the built-in objects considered false:
- constants defined to be false: None and False.
- zero of any numeric type: 0, 0.0, 0j, Decimal(0), Fraction(0, 1)
- empty sequences and collections: '', (), [], {}, set(), range(0)
And the datamodel documentation:
object.__bool__(self)
Called to implement truth value testing and the built-in operation bool(); should return False or True. When this method is not defined, __len__() is called, if it is defined, and the object is considered true if its result is nonzero. If a class defines neither __len__() nor __bool__(), all its instances are considered true.
and
object.__len__(self)
Called to implement the built-in function len(). Should return the length of the object, an integer >= 0. Also, an object that doesn't define a __bool__() method and whose __len__() method returns zero is considered to be false in a Boolean context.
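The two rules quoted above can be illustrated with a minimal sketch. The class names Basket and AlwaysTrue are hypothetical, invented only for this illustration; they do not come from the documentation.

```python
class Basket:
    """Hypothetical container that defines __len__ but not __bool__."""

    def __init__(self, items=()):
        self._items = list(items)

    def __len__(self):
        return len(self._items)


class AlwaysTrue(Basket):
    """Hypothetical subclass: a defined __bool__ takes precedence over __len__."""

    def __bool__(self):
        return True


empty_basket = Basket()
full_basket = Basket(["apple"])
stubborn = AlwaysTrue()

# With no __bool__, truth testing falls back to __len__.
basket_results = (bool(empty_basket), bool(full_basket))

# __bool__ wins even though len(stubborn) is zero.
stubborn_result = bool(stubborn)

print(basket_results, stubborn_result)
```

A list behaves like Basket here: an empty one is false in a boolean context and any non-empty one is true.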
So instead of this:
if len(a) == 0:  # Don't do this!
    print('a is an empty list')
or this:
if a == []:  # Don't do this!
    print('a is an empty list')
Do this:
if not a:
    print('a is an empty list')
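As a quick sanity check, the sketch below confirms that the three spellings give the same answer for lists. The helper function names are made up for this illustration. It also shows one way they differ on non-list input, which is part of why the literal comparison is the narrowest of the three.

```python
def empty_by_truthiness(a):
    return not a


def empty_by_len(a):
    return len(a) == 0


def empty_by_literal(a):
    return a == []


samples = [[], [0], [None], [[]]]
verdicts = [
    (empty_by_truthiness(s), empty_by_len(s), empty_by_literal(s))
    for s in samples
]

# For lists, the three checks always agree with each other.
all_agree = all(len(set(v)) == 1 for v in verdicts)

# They diverge for other empty sequences: an empty tuple is falsy and has
# length zero, but it does not compare equal to an empty list.
tuple_verdict = (
    empty_by_truthiness(()),
    empty_by_len(()),
    empty_by_literal(()),
)

print(all_agree, tuple_verdict)
```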
Doing what's Pythonic usually pays off in performance:
Does it pay off? (Note that less time to perform an equivalent operation is better:)
>>> import timeit
>>> min(timeit.repeat(lambda: len([]) == 0, repeat=100))
0.13775854044661884
>>> min(timeit.repeat(lambda: [] == [], repeat=100))
0.0984637276455409
>>> min(timeit.repeat(lambda: not [], repeat=100))
0.07878462291455435
For scale, here's the cost of calling the function and constructing and returning an empty list, which you might subtract from the costs of the emptiness checks used above:
>>> min(timeit.repeat(lambda: [], repeat=100))
0.07074015751817342
We see that either checking for length with the builtin function len compared to 0 or checking against an empty list is much less performant than using the builtin syntax of the language as documented.
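To reproduce the comparison on your own machine, something like the following sketch should work. The helper best_time and the smaller repeat and number values are assumptions chosen so the script finishes quickly; they are not the settings used for the figures above, and absolute times will differ by interpreter version and hardware.

```python
import timeit


def best_time(func, repeat=5, number=100_000):
    """Return the fastest of `repeat` runs, each calling func `number` times."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


timings = {
    "len([]) == 0": best_time(lambda: len([]) == 0),
    "[] == []": best_time(lambda: [] == []),
    "not []": best_time(lambda: not []),
    # Baseline: just calling the lambda and building an empty list.
    "[] (baseline)": best_time(lambda: []),
}

# Print fastest first; lower is better.
for label, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{label:>15}: {seconds:.4f} s")
```

Subtracting the baseline from each of the other entries gives a rough estimate of the cost of the check itself.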
Why?
For the len(a) == 0 check:
First Python has to check the globals to see if len is shadowed.
Then it must call the function, load 0, and do the equality comparison in Python (instead of with C):
>>> import dis
>>> dis.dis(lambda: len([]) == 0)
  1           0 LOAD_GLOBAL              0 (len)
              2 BUILD_LIST               0
              4 CALL_FUNCTION            1
              6 LOAD_CONST               1 (0)
              8 COMPARE_OP               2 (==)
             10 RETURN_VALUE
And for the [] == [] it has to build an unnecessary list and then, again, do the comparison operation in Python's virtual machine (as opposed to C):
>>> dis.dis(lambda: [] == [])
  1           0 BUILD_LIST               0
              2 BUILD_LIST               0
              4 COMPARE_OP               2 (==)
              6 RETURN_VALUE
The "Pythonic" way is a much simpler and faster check since the length of the list is cached in the object instance header:
>>> dis.dis(lambda: not [])
  1           0 BUILD_LIST               0
              2 UNARY_NOT
              4 RETURN_VALUE
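Bytecode differs between CPython releases, so the listings above may not match your interpreter exactly. The sketch below is one way to compare the three checks on whatever version you are running; the helper name opnames is made up for this illustration. It prints the opcode names without assuming which ones will appear.

```python
import dis


def opnames(func):
    """Return the opcode names of func's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


checks = {
    "len([]) == 0": lambda: len([]) == 0,
    "[] == []": lambda: [] == [],
    "not []": lambda: not [],
}

listings = {label: opnames(func) for label, func in checks.items()}

for label, names in listings.items():
    print(f"{label:>13}: {len(names)} instructions: {', '.join(names)}")
```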
Evidence from the C source and documentation
PyVarObject
This is an extension of PyObject that adds the ob_size field. This is only used for objects that have some notion of length. This type does not often appear in the Python/C API. It corresponds to the fields defined by the expansion of the PyObject_VAR_HEAD macro.
From the C source in Include/listobject.h:
typedef struct {
PyObject_VAR_HEAD
/* Vector of pointers to list elements. list[0] is ob_item[0], etc. */
PyObject **ob_item;
/* ob_item contains space for 'allocated' elements. The number
* currently in use is ob_size.
* Invariants:
* 0