It is advisable to use the PyPI regex module if you plan to match specific Unicode property classes. This library has also proven to be more stable, especially when handling large texts, and it yields consistent results across Python versions. All you need to do is keep it up to date.
If you install it (using pip install regex or pip3 install regex), you may use
import regex
print(regex.sub(r'\P{L}+', '', 'ABCŁąć1-2!Абв3§4“5def”'))
# => ABCŁąćАбвdef
to remove all chunks of one or more characters other than Unicode letters from text. See an online Python demo. You may also use "".join(regex.findall(r'\p{L}+', 'ABCŁąć1-2!Абв3§4“5def”')) to get the same result.
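If the third-party module might not be installed everywhere, one possible fallback is str.isalpha(), which the Python documentation defines in terms of the Unicode Letter categories. The sketch below is only an illustration under that assumption and is not part of the original answer; strip_non_letters is a hypothetical helper name.

```python
# Optional dependency: use PyPI regex when available, else fall back to str.isalpha().
try:
    import regex
except ImportError:  # regex is a third-party package and may be missing
    regex = None


def strip_non_letters(text):
    """Return text with everything except Unicode letters removed."""
    if regex is not None:
        # \P{L} matches any character that is NOT a Unicode letter.
        return regex.sub(r'\P{L}+', '', text)
    # Fallback: keep characters whose Unicode general category is a Letter.
    return ''.join(ch for ch in text if ch.isalpha())


print(strip_non_letters('ABCŁąć1-2!Абв3§4“5def”'))  # ABCŁąćАбвdef
```

Both branches give the same result on this sample. They may still differ on newly assigned characters, since each relies on its own copy of the Unicode database.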
With Python's built-in re module, you can match any Unicode letter with the [^\W\d_] construct (see "Match any unicode letter?").
So, to remove all non-letter characters, you may either match all letters and join the results:
result = "".join(re.findall(r'[^\W\d_]', text))
Or, remove all chars matching the [\W\d_] pattern (the opposite of [^\W\d_]):
result = re.sub(r'[\W\d_]+', '', text)
See the regex demo online. However, you may get inconsistent results across Python versions because the Unicode standard is evolving, and the set of chars matched with \w depends on the Python version. Using the PyPI regex library is highly recommended to get consistent results.
Created: May-28, 2021
Alphanumeric characters comprise the 26 letters of the alphabet and the digits 0 to 9. Non-alphanumeric characters are characters that are neither letters nor digits, like + and @.
In this tutorial, we will discuss how to remove non-alphanumeric characters from a string in Python.
Use the isalnum() Method to Remove All Non-Alphanumeric Characters in Python String
We can use the isalnum() method to check whether a given character or string is alphanumeric or not. We can check each character of a string individually and, if it is alphanumeric, keep it and combine the kept characters using the join() function.
For example,
string_value = "alphanumeric@123__"
s = ''.join(ch for ch in string_value if ch.isalnum())
print(s)
Output:
alphanumeric123
Use the filter() Function to Remove All Non-Alphanumeric Characters in Python String
The filter() function constructs an iterator from the elements of an iterable for which a given function returns true.
For our problem, the string is the iterable, and we use the isalnum() method as the filtering function, so each character is checked individually and only alphanumeric characters are kept. The join() function combines the remaining characters into a string.
For example,
string_value = "alphanumeric@123__"
s = ''.join(filter(str.isalnum, string_value))
print(s)
Output:
alphanumeric123
In Python 3, filter() returns an iterator rather than a string, which is why the result is passed to join().
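The sketch below, assuming Python 3, shows that filter() yields a lazy iterator and that join() consumes it to build the final string. The helper name keep_alnum is hypothetical and used only for illustration.

```python
def keep_alnum(text):
    """Return text with every non-alphanumeric character dropped."""
    # str.isalnum is called once per character; join() consumes the iterator.
    return ''.join(filter(str.isalnum, text))


filtered = filter(str.isalnum, "alphanumeric@123__")
print(type(filtered).__name__)  # filter
print(''.join(filtered))        # alphanumeric123
print(keep_alnum("a-b_c 1.2"))  # abc12
```

Because the iterator is exhausted after one pass, wrapping the call in a small function like this avoids accidentally reusing an empty filter object.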
Use Regular Expressions to Remove All Non-Alphanumeric Characters in Python String
A regular expression is a special sequence of characters that describes a pattern, using a specific syntax, for matching strings or sets of strings. To use regular expressions, we import the re module.
We can use the sub() function from this module to replace every substring that matches non-alphanumeric characters with an empty string.
For example,
import re
string_value = "alphanumeric@123__"
s = re.sub(r'[\W_]+', '', string_value)
print(s)
Output:
alphanumeric123
Alternatively, we can use the following pattern, which keeps only ASCII letters and digits.
import re
string_value = "alphanumeric@123__"
s = re.sub(r'[^a-zA-Z0-9]', '', string_value)
print(s)
Output:
alphanumeric123
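The two patterns agree on ASCII input but can differ on other text, because \W is Unicode-aware for str patterns in Python 3 unless the re.ASCII flag is set. A minimal sketch of the difference, using a made-up sample string:

```python
import re

sample = "Łąć_123!abc"  # illustrative input, not from the tutorial

# \W is Unicode-aware by default, so non-ASCII letters are kept.
unicode_aware = re.sub(r'[\W_]+', '', sample)
# An explicit ASCII class removes every non-ASCII character as well.
ascii_only = re.sub(r'[^a-zA-Z0-9]', '', sample)
# re.ASCII restricts \W to the ASCII definition of a word character.
ascii_flag = re.sub(r'[\W_]+', '', sample, flags=re.ASCII)

print(unicode_aware)  # Łąć123abc
print(ascii_only)     # 123abc
print(ascii_flag)     # 123abc
```

Which behavior is preferable depends on whether accented and non-Latin letters should count as alphanumeric for the task at hand.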