>>> import re
>>> re.sub(r'([a-z])\1+', r'\1', 'ffffffbbbbbbbqqq')
'fbq'
The () around the [a-z] specify a capture group, and the \1 (a backreference) in both the pattern and the replacement refers to the contents of the first capture group.
Thus, the regex reads "find a letter, followed by one or more occurrences of that same letter", and the entire matched portion is then replaced with a single occurrence of that letter.
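If it helps, here is a minimal sketch that wraps the same idea in a helper. The name collapse_runs is made up for illustration, and it uses (.) rather than ([a-z]) so that any repeated character other than a newline is collapsed, which is an assumption about what you want:

```python
import re

# Hypothetical helper: collapse every run of a repeated character to one copy.
# (.) captures one character; \1+ matches one or more further copies of it.
_RUN = re.compile(r'(.)\1+')

def collapse_runs(text):
    return _RUN.sub(r'\1', text)

print(collapse_runs('ffffffbbbbbbbqqq'))  # fbq
print(collapse_runs('Mississippi'))       # Misisipi
```

Compiling the pattern once is only a convenience here; calling re.sub directly behaves the same way.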
On a side note, your example code for just a is actually buggy:
>>> re.sub('a*', 'a', 'aaabbbccc')
'abababacacaca'
You really want to use 'a+' for your regex instead of 'a*', since the * operator matches "0 or more" occurrences, and thus will match the empty strings in between two non-a characters, whereas the + operator matches "1 or more".
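To see the difference for yourself, here is a small sketch (the helper name collapse_a is just for illustration). It only prints results that do not depend on how a particular Python version treats empty matches next to a non-empty one:

```python
import re

# Hypothetical helper: collapse runs of 'a' using the + quantifier.
def collapse_a(text):
    return re.sub('a+', 'a', text)

print(collapse_a('aaabbbccc'))    # abbbccc

# 'a*' also matches the empty string at every position where there is no 'a',
# which is where the stray replacements come from.
print(re.findall('a*', 'bbb'))    # ['', '', '', '']
print(re.findall('a+', 'bbb'))    # []
```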
Given a string, remove all duplicate characters from it. What will the output be if the order of characters matters? Examples:
Input : geeksforgeeks
Output : efgkors
This problem has an existing solution; please refer to Remove all duplicates from a given string.
Method 1:
Python3
from collections import OrderedDict

# Order is not preserved: a set has no defined iteration order.
def removeDupWithoutOrder(text):
    return "".join(set(text))

# Order of first occurrence is preserved: OrderedDict keeps insertion order.
def removeDupWithOrder(text):
    return "".join(OrderedDict.fromkeys(text))

if __name__ == "__main__":
    text = "geeksforgeeks"
    print("Without Order = ", removeDupWithoutOrder(text))
    print("With Order = ", removeDupWithOrder(text))
Output
Without Order = foskerg
With Order = geksfor
Time complexity: O(n)
Auxiliary Space: O(n)
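As a side sketch, on Python 3.7 and later a plain dict also preserves insertion order, so the same idea works without importing OrderedDict. The function names below are illustrative and not part of the original article; sorting the set is one way to get a repeatable result when order does not matter, since the iteration order of a set of strings can differ between runs:

```python
def remove_dup_keep_order(text):
    # dict.fromkeys keeps the first occurrence of each character, in order
    # (guaranteed for plain dicts from Python 3.7 onward).
    return "".join(dict.fromkeys(text))

def remove_dup_sorted(text):
    # A set has no defined order, so sort it to make the output deterministic.
    return "".join(sorted(set(text)))

print(remove_dup_keep_order("geeksforgeeks"))  # geksfor
print(remove_dup_sorted("geeksforgeeks"))      # efgkors
```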
Method 2:
Python3
def removeDuplicate(text):
    # Without order: a set drops duplicates but has no defined order.
    s = set(text)
    s = "".join(s)
    print("Without Order:", s)

    # With order: append each character the first time it is seen.
    t = ""
    for i in text:
        if i not in t:
            t = t + i
    # Print once, after the loop, so only the final result is shown.
    print("With Order:", t)

text = "geeksforgeeks"
removeDuplicate(text)
Output
Without Order: kogerfs
With Order: geksfor
Time complexity: O(n)
Auxiliary Space: O(n)
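In Method 2, the membership test scans the result string on every iteration, and each concatenation builds a new string. One possible variant, sketched below, tracks the characters already seen in a set and joins the pieces once at the end. The name remove_duplicate_ordered is hypothetical and not from the original article:

```python
def remove_duplicate_ordered(text):
    seen = set()      # characters already emitted
    result = []       # first occurrences, in original order
    for ch in text:
        if ch not in seen:
            seen.add(ch)
            result.append(ch)
    return "".join(result)

print(remove_duplicate_ordered("geeksforgeeks"))  # geksfor
```

Returning the string instead of printing it also makes the function easier to reuse and test.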
What do OrderedDict and fromkeys() do?
An OrderedDict is a dictionary that remembers the order in which its keys were first inserted. If a new entry overwrites an existing entry, the original insertion position is left unchanged.
For example, see the code snippet below:
Python3
from collections import OrderedDict

ordinary_dictionary = {}
ordinary_dictionary['a'] = 1
ordinary_dictionary['b'] = 2
ordinary_dictionary['c'] = 3
ordinary_dictionary['d'] = 4
ordinary_dictionary['e'] = 5
print(ordinary_dictionary)

ordered_dictionary = OrderedDict()
ordered_dictionary['a'] = 1
ordered_dictionary['b'] = 2
ordered_dictionary['c'] = 3
ordered_dictionary['d'] = 4
ordered_dictionary['e'] = 5
print(ordered_dictionary)
Output
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)])
Time complexity: O(n)
Auxiliary Space: O(1)
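The overwrite rule described above can be checked with a short sketch. It also shows two things OrderedDict offers beyond a plain dict: move_to_end() and order-sensitive equality. Variable names are illustrative:

```python
from collections import OrderedDict

od = OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3

# Overwriting an existing key keeps its original position.
od['a'] = 99
print(list(od.items()))   # [('a', 99), ('b', 2), ('c', 3)]

# move_to_end() explicitly moves a key to the end.
od.move_to_end('a')
print(list(od.keys()))    # ['b', 'c', 'a']

# Two OrderedDicts with the same items in a different order are not equal,
# whereas plain dicts ignore order when compared.
x = OrderedDict([('a', 1), ('b', 2)])
y = OrderedDict([('b', 2), ('a', 1)])
print(x == y)               # False
print(dict(x) == dict(y))   # True
```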
fromkeys() creates a new dictionary with keys taken from seq and every value set to value, and returns that new dictionary (not a list of keys). The syntax is fromkeys(seq[, value]). Parameters:
- seq: the sequence of values to use as the dictionary keys.
- value: optional; if provided, every key is mapped to this value, otherwise to None.
For example, see the code snippet below:
Python3
from collections import OrderedDict

seq = ['name', 'age', 'gender']

# No value given: every key maps to None.
d = OrderedDict.fromkeys(seq)
print(str(d))

# Value given: every key maps to 10.
d = OrderedDict.fromkeys(seq, 10)
print(str(d))
Output
OrderedDict([('name', None), ('age', None), ('gender', None)])
OrderedDict([('name', 10), ('age', 10), ('gender', 10)])
Time complexity: O(n)
Auxiliary Space: O(1)
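One point about the value parameter is worth a quick sketch: fromkeys() stores the same object under every key, so a mutable value such as a list ends up shared between keys. The snippet below shows this, along with one common workaround, a dict comprehension. Variable names are illustrative:

```python
from collections import OrderedDict

seq = ['name', 'age', 'gender']

# The single list passed as value is shared by all keys.
shared = OrderedDict.fromkeys(seq, [])
shared['name'].append('x')
print(shared['age'])        # ['x']

# A comprehension creates a separate list for each key instead.
separate = {key: [] for key in seq}
separate['name'].append('x')
print(separate['age'])      # []
```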
This article is contributed by Shashank Mishra (Gullu).