I want to replace whitespace with underscore in a string to create nice URLs. So that for example:
"This should be connected"
Should become
"This_should_be_connected"
I am using Python with Django. Can this be solved using regular expressions?
Tomerikoo
16.6k15 gold badges38 silver badges54 bronze badges
asked Jun 17, 2009 at 14:41
1
You don't need regular expressions. Python has a built-in string method that does what you need:
mystring.replace[" ", "_"]
Luke Exton
3,2362 gold badges17 silver badges32 bronze badges
answered Jun 17, 2009 at 14:44
rogeriopvlrogeriopvl
48.6k8 gold badges52 silver badges58 bronze badges
8
Replacing spaces is fine, but I might suggest going a little further to handle other URL-hostile characters like question marks, apostrophes, exclamation points, etc.
Also note that the general consensus among SEO experts is that dashes are preferred to underscores in URLs.
import re
def urlify[s]:
# Remove all non-word characters [everything except numbers and letters]
s = re.sub[r"[^\w\s]", '', s]
# Replace all runs of whitespace with a single dash
s = re.sub[r"\s+", '-', s]
return s
# Prints: I-cant-get-no-satisfaction"
print[urlify["I can't get no satisfaction!"]]
vvvvv
18.9k16 gold badges43 silver badges62 bronze badges
answered Jun 17, 2009 at 15:03
Kenan BanksKenan Banks
201k34 gold badges151 silver badges171 bronze badges
7
This takes into account blank characters other than space and I think it's faster than using re
module:
url = "_".join[ title.split[] ]
answered Apr 29, 2012 at 23:18
xOnecaxOneca
8039 silver badges20 bronze badges
5
Django has a 'slugify' function which does this, as well as other URL-friendly optimisations. It's hidden away in the defaultfilters module.
>>> from django.template.defaultfilters import slugify
>>> slugify["This should be connected"]
this-should-be-connected
This isn't exactly the output you asked for, but IMO it's better for use in URLs.
answered Jun 17, 2009 at 15:15
Daniel RosemanDaniel Roseman
574k61 gold badges839 silver badges852 bronze badges
6
Using the re
module:
import re
re.sub['\s+', '_', "This should be connected"] # This_should_be_connected
re.sub['\s+', '_', 'And so\tshould this'] # And_so_should_this
Unless you have multiple spaces or other whitespace possibilities as above, you may just wish to use string.replace
as others have suggested.
answered Jun 17, 2009 at 14:45
Jarret HardieJarret Hardie
91.8k10 gold badges130 silver badges125 bronze badges
2
use string's replace method:
"this should be connected".replace[" ", "_"]
"this_should_be_disconnected".replace["_", " "]
answered Jun 17, 2009 at 14:45
mdirolfmdirolf
7,3912 gold badges22 silver badges15 bronze badges
Surprisingly this library not mentioned yet
python package named python-slugify, which does a pretty good job of slugifying:
pip install python-slugify
Works like this:
from slugify import slugify
txt = "This is a test ---"
r = slugify[txt]
self.assertEquals[r, "this-is-a-test"]
txt = "This -- is a ## test ---"
r = slugify[txt]
self.assertEquals[r, "this-is-a-test"]
txt = 'C\'est déjà l\'été.'
r = slugify[txt]
self.assertEquals[r, "cest-deja-lete"]
txt = 'Nín hǎo. Wǒ shì zhōng guó rén'
r = slugify[txt]
self.assertEquals[r, "nin-hao-wo-shi-zhong-guo-ren"]
txt = 'Компьютер'
r = slugify[txt]
self.assertEquals[r, "kompiuter"]
txt = 'jaja---lol-méméméoo--a'
r = slugify[txt]
self.assertEquals[r, "jaja-lol-mememeoo-a"]
answered Sep 28, 2015 at 16:01
YashYash
6,0644 gold badges35 silver badges26 bronze badges
Python has a built in method on strings called replace which is used as so:
string.replace[old, new]
So you would use:
string.replace[" ", "_"]
I had this problem a while ago and I wrote code to replace characters in a string. I have to start remembering to check the python documentation because they've got built in functions for everything.
answered Jun 18, 2009 at 12:34
I'm using the following piece of code for my friendly urls:
from unicodedata import normalize
from re import sub
def slugify[title]:
name = normalize['NFKD', title].encode['ascii', 'ignore'].replace[' ', '-'].lower[]
#remove `other` characters
name = sub['[^a-zA-Z0-9_-]', '', name]
#nomalize dashes
name = sub['-+', '-', name]
return name
It works fine with unicode characters as well.
answered Jun 17, 2009 at 15:36
ArmandasArmandas
1,7601 gold badge17 silver badges24 bronze badges
1
You can try this instead:
mystring.replace[r' ','-']
Jules Dupont
6,7417 gold badges38 silver badges37 bronze badges
answered May 6, 2018 at 15:28
1
mystring.replace [" ", "_"]
if you assign this value to any variable, it will work
s = mystring.replace [" ", "_"]
by default mystring wont have this
answered Jul 31, 2016 at 16:51
OP is using python, but in javascript [something to be careful of since the syntaxes are similar.
// only replaces the first instance of ' ' with '_'
"one two three".replace[' ', '_'];
=> "one_two three"
// replaces all instances of ' ' with '_'
"one two three".replace[/\s/g, '_'];
=> "one_two_three"
answered Jul 4, 2010 at 23:34
skilleoskilleo
2,4221 gold badge27 silver badges34 bronze badges
x = re.sub["\s", "_", txt]
eyllanesc
226k18 gold badges131 silver badges199 bronze badges
answered Nov 28, 2021 at 6:25
4
perl -e 'map { $on=$_; s/ /_/; rename[$on, $_] or warn $!; } ;'
Match et replace space > underscore of all files in current directory
answered Jun 19, 2009 at 7:30