Regex remove duplicate characters python

>>> import re
>>> re.sub(r'([a-z])\1+', r'\1', 'ffffffbbbbbbbqqq')
'fbq'

The () around the [a-z] specify a capture group, and the \1 (a backreference) in both the pattern and the replacement refers to the contents of the first capture group.

Thus, the regex reads "find a letter, followed by one or more occurrences of that same letter", and the entire matched portion is then replaced with a single occurrence of that letter.
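As a small sketch of the same idea beyond lowercase letters, the capture group can be widened to (.) so that a run of any repeated character is collapsed. The function name collapse_runs is made up for illustration, and re.DOTALL is added on the assumption that runs of newlines should be collapsed as well.

```python
import re

# (.) captures any single character; \1+ matches one or more repeats of it.
# re.DOTALL lets "." match newlines too (an assumption made for this sketch).
_RUN = re.compile(r'(.)\1+', re.DOTALL)

def collapse_runs(text):
    """Replace each run of a repeated character with a single occurrence."""
    return _RUN.sub(r'\1', text)

print(collapse_runs('ffffffbbbbbbbqqq'))  # fbq
print(collapse_runs('aa  bb!!11'))        # a b!1
```

Note that this only collapses adjacent repeats; a character that reappears later in the string is kept.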

On a side note...

Your example code for just 'a' is actually buggy:

>>> re.sub('a*', 'a', 'aaabbbccc')
'abababacacaca'

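You really want to use 'a+' for your regex instead of 'a*': the * operator matches "0 or more" occurrences, so it also matches the empty string between two non-a characters, whereas the + operator matches "1 or more". (The output shown is from an older Python. Since Python 3.7, re.sub also replaces an empty match that is adjacent to a previous non-empty match, so the same call returns 'aabababacacaca'.)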
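A quick way to see the difference is to run both patterns on the same input. This is only an illustrative sketch; the variable names plus and starred are arbitrary.

```python
import re

# 'a+' needs at least one 'a', so it never matches the empty string.
plus = re.sub('a+', 'a', 'aaabbbccc')
print(plus)  # abbbccc

# 'a*' also matches the empty string at positions that hold no 'a',
# so extra 'a' characters are inserted between the other letters.
starred = re.sub('a*', 'a', 'aaabbbccc')
print(starred)
```

The exact string produced by the 'a*' call depends on the Python version, which is why no expected output is shown for it.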

We are given a string and need to remove all duplicate characters from it. What will the output be if the order of the characters matters? Example:

Input : geeksforgeeks
Output : efgkors

An existing solution to this problem is described in Remove all duplicates from a given string.

Method 1:

Python3

from collections import OrderedDict

def removeDupWithoutOrder(text):
    # A set keeps one copy of each character, but in arbitrary order.
    return "".join(set(text))

def removeDupWithOrder(text):
    # OrderedDict.fromkeys keeps the first occurrence of each character, in order.
    return "".join(OrderedDict.fromkeys(text))

if __name__ == "__main__":
    text = "geeksforgeeks"
    print("Without Order = ", removeDupWithoutOrder(text))
    print("With Order = ", removeDupWithOrder(text))

Output

Without Order =  foskerg
With Order =  geksfor

Time complexity: O(n)
Auxiliary Space: O(n)
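Since Python 3.7 the built-in dict also preserves insertion order, so a plain dict can stand in for OrderedDict in the ordered variant. A minimal sketch, with the function name remove_dup_keep_order chosen only for illustration:

```python
def remove_dup_keep_order(text):
    # dict.fromkeys keeps the first occurrence of each key, in insertion
    # order (guaranteed for the built-in dict from Python 3.7 onward).
    return "".join(dict.fromkeys(text))

print(remove_dup_keep_order("geeksforgeeks"))  # geksfor
```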

Method 2:

Python3

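def removeDuplicate(text):
    # Unordered: a set drops duplicates but does not keep the original order.
    s = "".join(set(text))
    print("Without Order:", s)
    # Ordered: append each character the first time it is seen.
    t = ""
    for i in text:
        if i not in t:
            t = t + i
        # Printed on every iteration, so the output shows t growing step by step.
        print("With Order:", t)

text = "geeksforgeeks"
removeDuplicate(text)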

Output

Without Order: kogerfs
With Order: g
With Order: ge
With Order: ge
With Order: gek
With Order: geks
With Order: geksf
With Order: geksfo
With Order: geksfor
With Order: geksfor
With Order: geksfor
With Order: geksfor
With Order: geksfor
With Order: geksfor

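Time complexity: O(n * k), where k is the number of distinct characters, because each membership check scans t; this is O(n) for a fixed-size alphabet
Auxiliary Space: O(n)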
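The ordered part of Method 2 can also be written with a set for the membership test and a list for the result, which avoids rescanning and rebuilding the string on every step. This is a sketch of one possible variant, not the article's code; the name remove_duplicate_ordered is hypothetical.

```python
def remove_duplicate_ordered(text):
    seen = set()  # characters already kept; average O(1) membership test
    kept = []     # collect characters in a list and join once at the end
    for ch in text:
        if ch not in seen:
            seen.add(ch)
            kept.append(ch)
    return "".join(kept)

print("With Order:", remove_duplicate_ordered("geeksforgeeks"))  # With Order: geksfor
```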

What do OrderedDict and fromkeys() do?

An OrderedDict is a dictionary that remembers the order in which its keys were first inserted. If a new entry overwrites an existing entry, the original insertion position is left unchanged.

For example, see the code snippet below:

Python3

from collections import OrderedDict

ordinary_dictionary = {}
ordinary_dictionary['a'] = 1
ordinary_dictionary['b'] = 2
ordinary_dictionary['c'] = 3
ordinary_dictionary['d'] = 4
ordinary_dictionary['e'] = 5
print(ordinary_dictionary)

ordered_dictionary = OrderedDict()
ordered_dictionary['a'] = 1
ordered_dictionary['b'] = 2
ordered_dictionary['c'] = 3
ordered_dictionary['d'] = 4
ordered_dictionary['e'] = 5
print(ordered_dictionary)

Output

{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)])

Time complexity: O(n)
Auxiliary Space: O(1)
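The point about overwriting an existing entry can be checked directly. The short sketch below (variable names are arbitrary) overwrites a key and then, for contrast, deletes and re-inserts it:

```python
from collections import OrderedDict

od = OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3

od['a'] = 100                    # overwrite an existing key
after_overwrite = list(od.keys())
print(after_overwrite)           # ['a', 'b', 'c']  ('a' keeps its position)

del od['a']                      # delete and re-insert: the key moves to the end
od['a'] = 100
after_reinsert = list(od.keys())
print(after_reinsert)            # ['b', 'c', 'a']
```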

fromkeys() creates and returns a new dictionary whose keys come from seq and whose values are all set to value. The syntax is fromkeys(seq[, value]). Parameters:

seq: The iterable whose items are used as the dictionary keys.
value: Optional. If provided, every key is set to this value; otherwise the values default to None.

For example, see the code snippet below:

Python3

from collections import OrderedDict

seq = ['name', 'age', 'gender']

# Values default to None when no value is given.
d = OrderedDict.fromkeys(seq)
print(str(d))

# Every key is set to the same value, 10.
d = OrderedDict.fromkeys(seq, 10)
print(str(d))

Output

OrderedDict([('name', None), ('age', None), ('gender', None)])
OrderedDict([('name', 10), ('age', 10), ('gender', 10)])

Time complexity: O(n)
Auxiliary Space: O(1)
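Two details of fromkeys() are easy to miss, and the sketch below tries to illustrate both: a string works as seq because it is a sequence of characters (which is what Method 1 relies on), and the single value object is shared by every key. The variable names are illustrative only.

```python
from collections import OrderedDict

# A string is a sequence of characters, so each distinct character becomes a key.
keys = OrderedDict.fromkeys("geeksforgeeks")
print(list(keys))     # ['g', 'e', 'k', 's', 'f', 'o', 'r']
print("".join(keys))  # geksfor

# The same value object is assigned to every key, so a mutable value is shared.
shared = OrderedDict.fromkeys(['name', 'age'], [])
shared['name'].append('x')
print(shared['age'])  # ['x']
```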

This article is contributed by Shashank Mishra (Gullu).

How do you remove duplicate characters in Python?

Program to remove duplicate characters from a given string in Python:

d := a dictionary whose keys are kept in insertion order
for each character c in s, do
    if c is not present in d, then
        d[c] := 0
    d[c] := d[c] + 1
join the keys one after another, in order, to make the output string and return it
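The pseudocode above could be translated into Python roughly as follows. The function name remove_duplicate_chars is an assumption, and the sketch relies on the built-in dict keeping insertion order (Python 3.7+).

```python
def remove_duplicate_chars(s):
    d = {}  # built-in dict keeps keys in insertion order (Python 3.7+)
    for c in s:
        if c not in d:
            d[c] = 0
        d[c] += 1  # counts are kept as in the steps, though only the keys are used
    # Joining a dict iterates over its keys, in insertion order.
    return "".join(d)

print(remove_duplicate_chars("bbabcaac"))  # bac
```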

How do you remove all duplicates from a string in Python?

Given a string S, the task is to remove all the duplicates in the given string.

Sort the elements.
In a loop, remove duplicates by comparing the current character with the previous character.
Remove extra characters at the end of the resultant string.
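A sort-based version of these steps might look like the sketch below. It builds a new list instead of editing the string in place, so the final "remove extra characters" step is not needed here; that is a deliberate simplification, and the function name is hypothetical.

```python
def remove_duplicates_sorted(s):
    chars = sorted(s)  # step 1: sort, so equal characters become adjacent
    result = []
    for ch in chars:
        # step 2: keep a character only if it differs from the previous kept one
        if not result or result[-1] != ch:
            result.append(ch)
    return "".join(result)

print(remove_duplicates_sorted("geeksforgeeks"))  # efgkors
```

Because of the sort, the original order of the characters is lost, which matches the "efgkors" example output earlier in the text.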

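How do you find duplicate characters in a string in Python?

Python

string = "Great responsibility"
print("Duplicate characters in a given string: ")
# Count how often each character occurs in the rest of the string.
reported = set()  # assumed addition: the source snippet stops after the counting loop
for i in range(0, len(string)):
    count = 1
    for j in range(i + 1, len(string)):
        if string[i] == string[j] and string[i] != ' ':
            count = count + 1
    # Assumed completion: print each duplicated character once.
    if count > 1 and string[i] not in reported:
        reported.add(string[i])
        print(string[i])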
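A more compact alternative to the nested loops is collections.Counter from the standard library, which counts every character in one pass. This is a sketch of a different technique from the snippet above, and the function name find_duplicates is made up for illustration.

```python
from collections import Counter

def find_duplicates(text):
    # Count every character except spaces, as in the loop-based version.
    counts = Counter(ch for ch in text if ch != ' ')
    # Counter keeps first-seen order, so duplicates come out in that order.
    return [ch for ch, n in counts.items() if n > 1]

print(find_duplicates("Great responsibility"))  # ['r', 'e', 't', 's', 'i']
```

The comparison is case-sensitive, so "G" and "g" would be counted as different characters.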

How do I get rid of consecutive duplicates in Python?

To remove consecutive duplicates from a string:

Iterate over each character in the string.
For each character, check if it's the same as the previous character (stored in a variable). If it is, skip to the next iteration; otherwise add the character to the result string.
Return the result string.
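These steps can be sketched in a few lines. The function name remove_consecutive_duplicates is an assumption, and the result is collected in a list and joined at the end rather than concatenated character by character.

```python
def remove_consecutive_duplicates(s):
    result = []
    prev = None  # the previous character, stored in a variable
    for ch in s:
        if ch == prev:
            continue  # same as the previous character: skip it
        result.append(ch)
        prev = ch
    return "".join(result)

print(remove_consecutive_duplicates("aabbbcaa"))  # abca
```

Unlike the earlier methods, this keeps a character that reappears later in the string, since only adjacent repeats are dropped.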
