Remove non-numeric characters in Python

I would not use RegEx for this. It is a lot slower!

Instead let's just use a simple for loop.

TL;DR

This function will get the job done fast...

def filter_non_digits(string: str) -> str:
    result = ''
    for char in string:
        if char in '1234567890':
            result += char
    return result
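
As a quick sanity check, the function can be called as in the sketch below. The function is repeated so the snippet runs on its own, and the sample strings are illustrative assumptions rather than part of the original answer.

```python
def filter_non_digits(string: str) -> str:
    result = ''
    for char in string:
        # Keep only the ten ASCII digit characters.
        if char in '1234567890':
            result += char
    return result


print(filter_non_digits('afes098u98sfe'))      # 09898
print(filter_non_digits('+1 (555) 010-9999'))  # 15550109999
print(repr(filter_non_digits('no digits')))    # ''
```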

The Explanation

Let's create a very basic benchmark to test a few different methods that have been proposed. I will test three methods...

For loop method (my idea).
List Comprehension method from Jon Clements' answer.
RegEx method from Moradnejad's answer.

# filters.py

import re

# For loop method
def filter_non_digits_for(string: str) -> str:
    result = ''
    for char in string:
        if char in '1234567890':
            result += char
    return result 


# Comprehension method
def filter_non_digits_comp(s: str) -> str:
    return ''.join(ch for ch in s if ch.isdigit())


# RegEx method
def filter_non_digits_re(string: str) -> str:
    # Raw string so that \d is not treated as a string escape.
    return re.sub(r'[^\d]', '', string)
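
One caveat before comparing speeds: the three functions agree on plain ASCII input, but they do not define "digit" the same way for other Unicode text. The sketch below repeats the three definitions so it runs on its own. The non-ASCII sample string (a superscript two and an Arabic-Indic digit three) is an illustrative assumption, not something from the original answers.

```python
import re


def filter_non_digits_for(string: str) -> str:
    result = ''
    for char in string:
        if char in '1234567890':
            result += char
    return result


def filter_non_digits_comp(s: str) -> str:
    return ''.join(ch for ch in s if ch.isdigit())


def filter_non_digits_re(string: str) -> str:
    return re.sub(r'[^\d]', '', string)


ascii_sample = 'afes098u98sfe'
# 'x', '1', SUPERSCRIPT TWO (U+00B2), ARABIC-INDIC DIGIT THREE (U+0663)
unicode_sample = 'x1\u00b2\u0663'

# On ASCII-only input all three return the same string.
for func in (filter_non_digits_for, filter_non_digits_comp, filter_non_digits_re):
    assert func(ascii_sample) == '09898'

# On other Unicode input they differ; ascii() makes the result unambiguous.
print(ascii(filter_non_digits_for(unicode_sample)))   # '1'
print(ascii(filter_non_digits_comp(unicode_sample)))  # '1\xb2\u0663'
print(ascii(filter_non_digits_re(unicode_sample)))    # '1\u0663'
```

If the input can contain non-ASCII text and only 0-9 should survive, the for loop (or an explicit [^0-9] pattern) is the stricter choice.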

Now that we have an implementation of each way of removing non-digits, let's benchmark each one.

Here is some very basic and rudimentary benchmark code. However, it will do the trick and give us a good comparison of how each method performs.

# tests.py

import time, platform
# Parentheses are required for an import list that spans several lines.
from filters import (
    filter_non_digits_re,
    filter_non_digits_comp,
    filter_non_digits_for,
)


def benchmark_func(func):
    start = time.time()
    # the "_" in the number just makes it more readable
    for i in range(100_000):
        func('afes098u98sfe')
    end = time.time()
    return (end-start)/100_000


def bench_all():
    print(f'# System ({platform.system()} {platform.machine()})')
    print(f'# Python {platform.python_version()}\n')

    tests = [
        filter_non_digits_re,
        filter_non_digits_comp,
        filter_non_digits_for,
    ]

    for t in tests:
        duration = benchmark_func(t)
        ns = round(duration * 1_000_000_000)
        print(f'{t.__name__.ljust(30)} {str(ns).rjust(6)} ns/op')


if __name__ == "__main__":
    bench_all()

Here is the output from the benchmark code.

# System (Windows AMD64)
# Python 3.9.8

filter_non_digits_re             2920 ns/op
filter_non_digits_comp           1280 ns/op
filter_non_digits_for             660 ns/op

As you can see, the filter_non_digits_for() function is more than four times as fast as the RegEx version and about twice as fast as the comprehension method. Sometimes simple is best.
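
For anyone who wants to re-run the comparison, the same measurement can also be sketched with the standard library's timeit module, which repeats the run and lets you take the best time. This is only an alternative sketch, not the code that produced the numbers above. The helper name ns_per_op and its default arguments are assumptions made for illustration, and the lambda wrapper adds the same small call overhead to every function.

```python
import re
import timeit


def filter_non_digits_for(string: str) -> str:
    result = ''
    for char in string:
        if char in '1234567890':
            result += char
    return result


def filter_non_digits_comp(s: str) -> str:
    return ''.join(ch for ch in s if ch.isdigit())


def filter_non_digits_re(string: str) -> str:
    return re.sub(r'[^\d]', '', string)


def ns_per_op(func, sample='afes098u98sfe', number=20_000, repeat=5):
    """Hypothetical helper: best-of-`repeat` time per call, in nanoseconds."""
    timings = timeit.repeat(lambda: func(sample), number=number, repeat=repeat)
    # min() is less sensitive to background noise than the mean.
    return min(timings) / number * 1_000_000_000


for func in (filter_non_digits_re, filter_non_digits_comp, filter_non_digits_for):
    print(f'{func.__name__:<30} {ns_per_op(func):>8.0f} ns/op')
```

The absolute numbers will vary with the machine, the Python version, and the length of the input string, so it is worth measuring with strings that resemble your real data.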

Remove all non-numeric characters from a String in Python

Use the re.sub() method to remove all non-numeric characters from a string, e.g. result = re.sub(r'[^0-9]', '', my_str). The re.sub() method will remove all non-numeric characters from the string by replacing them with empty strings.

import re

my_str = 'a1s2d3f4g5'

result = re.sub(r'[^0-9]', '', my_str)

print(result)  # 👉️ '12345'

If you're looking to avoid using regular expressions, scroll down to the next subheading.

We used the re.sub() method to remove all non-numeric characters from a string.

The re.sub method returns a new string that is obtained by replacing the occurrences of the pattern with the provided replacement.

import re

my_str = '1apple, 2apple, 3banana'

result = re.sub(r'[^0-9]', '', my_str)

print(result)  # 👉️ '123'

If the pattern isn't found, the string is returned as is.
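
A small sketch of that case, using made-up sample strings: a string that already consists only of digits comes back unchanged, while a string with no digits at all becomes empty.

```python
import re

only_digits = '12345'
no_digits = 'abc'

# Nothing matches [^0-9], so the string is returned unchanged.
unchanged = re.sub(r'[^0-9]', '', only_digits)
print(unchanged)  # 12345

# Every character matches [^0-9], so nothing is left.
emptied = re.sub(r'[^0-9]', '', no_digits)
print(repr(emptied))  # ''
```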

The first argument we passed to the re.sub() method is a regular expression.

The square brackets [] are used to indicate a set of characters.

If the first character of the set is a caret ^, all characters that are not in the set will be matched.

In other words, our set matches any character that is not a digit in the range 0-9.
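
To see which characters the negated set matches, here is a minimal sketch that uses re.findall() instead of re.sub(). The variable names are illustrative.

```python
import re

my_str = 'a1s2d3f4g5'

# [^0-9] matches every character that is NOT in the range 0-9 ...
non_digits = re.findall(r'[^0-9]', my_str)
print(non_digits)  # ['a', 's', 'd', 'f', 'g']

# ... while [0-9] without the caret matches the digits themselves.
digits = re.findall(r'[0-9]', my_str)
print(digits)  # ['1', '2', '3', '4', '5']
```

The characters in the first list are exactly the ones that re.sub() replaces.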

The second argument we passed to the re.sub() method is the replacement for each match.

import re

my_str = 'a1s2d3f4g5'

result = re.sub(r'[^0-9]', '', my_str)

print(result)  # 👉️ '12345'

We want to remove all non-numeric characters, so we replace each with an empty string.

There is also a shorthand for the [^0-9] character set.

import re

my_str = 'a1s2d3f4g5'

result = re.sub(r'\D', '', my_str)

print(result)  # 👉️ '12345'

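The \D special character matches any character that is not a digit. For ordinary (str) patterns it is very similar to the [^0-9] character set, but \d also counts Unicode decimal digits from other scripts (for example Arabic-Indic digits) as digits, so \D leaves those characters in place unless you pass the re.ASCII flag.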
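
The difference only shows up with non-ASCII input. The sketch below assumes a sample string that contains an Arabic-Indic digit three (U+0663) and prints results with ascii() so they are unambiguous.

```python
import re

# 'a', '1', ARABIC-INDIC DIGIT THREE (U+0663), 'b'
my_str = 'a1\u0663b'

# \D keeps any Unicode decimal digit.
with_shorthand = re.sub(r'\D', '', my_str)
print(ascii(with_shorthand))  # '1\u0663'

# [^0-9] keeps only the ten ASCII digits.
with_set = re.sub(r'[^0-9]', '', my_str)
print(ascii(with_set))  # '1'

# With re.ASCII, \D behaves like [^0-9].
with_ascii_flag = re.sub(r'\D', '', my_str, flags=re.ASCII)
print(ascii(with_ascii_flag))  # '1'
```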

Remove all non-numeric characters from a String using join()

To remove all non-numeric characters from a string:

Use a generator expression to iterate over the string.
Use the str.isdigit() method to check if each character is a digit.
Use the str.join() method to join the digits into a string.

my_str = 'a1s2d3f4g5'

result = ''.join(char for char in my_str if char.isdigit())

print(result)  # 👉️ '12345'

We used a generator expression to iterate over the string.

Generator expressions are used to perform some operation for every element or select a subset of elements that meet a condition.

On each iteration, we use the str.isdigit() method to check if the current character is a digit and return the result.

The generator object only contains the digits from the string.

my_str = 'a1s2d3f4g5'

# 👇️ ['1', '2', '3', '4', '5']
print(list(char for char in my_str if char.isdigit()))

The last step is to join the digits into a string.

my_str = 'a1s2d3f4g5'

result = ''.join(char for char in my_str if char.isdigit())

print(result)  # 👉️ '12345'

The str.join method takes an iterable as an argument and returns a string which is the concatenation of the strings in the iterable.

The string the method is called on is used as the separator between the elements.

For our purposes, we called the join() method on an empty string to join the digits without a separator.
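
To make the role of the separator concrete, here is a short sketch that joins the same digits with a few different separators. The separators are chosen purely for illustration.

```python
my_str = 'a1s2d3f4g5'

# Collect the digits once so they can be joined several ways.
digits = [char for char in my_str if char.isdigit()]

no_separator = ''.join(digits)
print(no_separator)  # 12345

dashed = '-'.join(digits)
print(dashed)  # 1-2-3-4-5

comma_separated = ', '.join(digits)
print(comma_separated)  # 1, 2, 3, 4, 5
```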

How do you remove non numeric characters in Python?

Use the re.sub() method to remove all non-numeric characters from a string, e.g. result = re.sub(r'[^0-9]', '', my_str). The re.sub() method will remove all non-numeric characters from the string by replacing them with empty strings.

How do you exclude non numeric values in Python?

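A commonly cited snippet for "python remove all non numeric characters" is shown below. Note that its pattern keeps letters as well as digits; use [^0-9] instead if only digits should remain.

import re

def nospecial(text):
    text = re.sub("[^a-zA-Z0-9]+", "", text)
    return text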

How do I remove non numeric characters from a string?

In some languages, such as JavaScript, the replace() function is used for this: it searches a string for a specific value, or a RegExp, and returns a new string where the replacement is done. Python's str.replace() only matches literal substrings, so use re.sub() when you need a pattern.

How do I remove the alphabets from an alphanumeric string in Python?

Let's discuss some Pythonic ways to remove all the characters except numbers and alphabets.

Method #1: Using re.sub().

Method #2: Using isalpha() and isnumeric().

Method #3: Using isalnum().
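
The three methods are only named above, so here is a minimal sketch of what each could look like. The sample string and variable names are assumptions made for illustration.

```python
import re

text = 'he!!o, w0rld 42'

# Method #1: re.sub() with a negated set of ASCII letters and digits.
by_regex = re.sub(r'[^a-zA-Z0-9]', '', text)

# Method #2: keep characters that are alphabetic or numeric.
by_isalpha_isnumeric = ''.join(
    ch for ch in text if ch.isalpha() or ch.isnumeric()
)

# Method #3: str.isalnum() combines both checks in one call.
by_isalnum = ''.join(ch for ch in text if ch.isalnum())

print(by_regex)              # heow0rld42
print(by_isalpha_isnumeric)  # heow0rld42
print(by_isalnum)            # heow0rld42
```

The three agree on this ASCII sample. On other input the string methods also keep non-ASCII letters and digits, whereas the [^a-zA-Z0-9] pattern removes them.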
