For Python 3, the way to do this that doesn't add double backslashes and simply preserves \n, \t, etc. is:
a = 'hello\nbobby\nsally\n'
# Escape control characters, then collapse doubled backslashes back to single ones
a = a.encode('unicode-escape').decode().replace('\\\\', '\\')
print(a)
Which gives a value that can be written as CSV:
hello\nbobby\nsally\n
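This doesn't cover every special character, however. Non-ASCII characters are escaped as well (for example as \xe9), and a literal backslash in the input can no longer be told apart from the start of an escape sequence. It's a bummer. Solving that in general would be complex.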
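The limits can be seen in a small sketch. It uses only the standard library; the helper name `escape_for_line` is hypothetical and simply wraps the one-liner from above:

```python
def escape_for_line(text):
    """Escape control characters so the text fits on one line (hypothetical helper)."""
    return text.encode('unicode_escape').decode('ascii').replace('\\\\', '\\')

# Newlines and tabs come out as the two-character sequences \n and \t.
newline_case = escape_for_line('a\nb\tc')

# Non-ASCII characters are escaped too ('caf\u00e9' is "cafe" with an accented e).
accent_case = escape_for_line('caf\u00e9')

# A real newline and a literal backslash followed by "n" produce the same output,
# so the original text cannot always be recovered.
ambiguous = escape_for_line('x\ny') == escape_for_line('x\\ny')

print(newline_case)
print(accent_case)
print(ambiguous)
```

If the escaped text has to be decoded again later, leaving out the `.replace(...)` step keeps the output unambiguous, at the cost of the doubled backslashes.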
For example, to serialize a pandas.Series containing lists of strings with special characters to a text file in the format BERT expects, with a newline between sentences and a blank line between documents:
# `sentences` is a pandas.Series whose values are lists of strings, one list per document
with open('sentences.csv', 'w') as f:
    current_idx = 0
    for idx, doc in sentences.items():
        # Insert a blank line to separate documents
        if idx != current_idx:
            f.write('\n')
            current_idx = idx
        # Write each sentence exactly as it appeared, one per line
        for sentence in doc:
            f.write(sentence.encode('unicode-escape').decode().replace('\\\\', '\\') + '\n')
This outputs (for the GitHub CodeSearchNet docstrings for all languages tokenized into sentences):
Makes sure the fast-path emits in order.
@param value the value to emit or queue up\n@param delayError if true, errors are delayed until the source has terminated\n@param disposable the resource to dispose if the drain terminates
Mirrors the one ObservableSource in an Iterable of several ObservableSources that first either emits an item or sends\na termination notification.
Scheduler:\n{@code amb} does not operate by default on a particular {@link Scheduler}.
@param the common element type\n@param sources\nan Iterable of ObservableSource sources competing to react first.
A subscription to each source will\noccur in the same order as in the Iterable.
@return an Observable that emits the same sequence as whichever of the source ObservableSources first\nemitted an item or sent a termination notification\n@see ReactiveX operators documentation: Amb
...
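To read a file in this layout back into documents, one option is to split on blank lines. This is a minimal sketch, not part of the original answer: `write_docs` and `read_docs` are hypothetical helpers, a plain list of lists stands in for the pandas.Series, an in-memory buffer stands in for the file, and it assumes no sentence is empty. Sentences are returned in their escaped form.

```python
import io

def write_docs(docs, f):
    """Write one escaped sentence per line, with a blank line between documents."""
    for i, doc in enumerate(docs):
        if i > 0:
            f.write('\n')
        for sentence in doc:
            escaped = sentence.encode('unicode_escape').decode('ascii')
            f.write(escaped.replace('\\\\', '\\') + '\n')

def read_docs(f):
    """Group lines into documents, treating each blank line as a separator."""
    docs, current = [], []
    for line in f:
        line = line.rstrip('\n')
        if line:
            current.append(line)
        elif current:
            docs.append(current)
            current = []
    if current:
        docs.append(current)
    return docs

docs = [['First line.\nSecond line.', 'Another sentence.'], ['Next document.']]
buffer = io.StringIO()
write_docs(docs, buffer)
buffer.seek(0)
restored = read_docs(buffer)
print(restored)
```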
A string literal is where you specify the contents of a string in a program.
In the statement a = 'A string', the text 'A string' is a string literal. The variable a is a string variable or, better put in Python, a variable that points to a string.
String literals can use single or double quote delimiters.
>>> a = 'A string'  # string literal with single quotes
>>> b = "A string"  # string literal with double quotes
>>> b == a  # there is no difference between these strings
True
Literal strings with single quote delimiters can use double quotes inside them without any extra work.
>>> print('Single quoted string with " is no problem')
Single quoted string with " is no problem
If you need an actual single quote character inside a literal string delimited by single quotes, you can use the backslash character before the single quote, to tell Python not to terminate the string:
>>> print('Single quoted string containing \' is OK with backslash')
Single quoted string containing ' is OK with backslash
Likewise for double quotes:
>>> print("Double quoted string with ' is no problem")
Double quoted string with ' is no problem
>>> print("Double quoted string containing \" is OK with backslash")
Double quoted string containing " is OK with backslash
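As a small sketch of the point above, the choice of delimiter and the backslash escapes do not change the resulting string. The variable names are only for illustration:

```python
single = 'It\'s a "quoted" word'
double = "It's a \"quoted\" word"

# Both literals describe exactly the same characters.
same = single == double

# The backslashes are part of the literal's syntax, not of the string itself.
has_backslash = '\\' in single

print(same, has_backslash)
```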
Some characters preceded by a backslash have special meaning. For example:
>>> print('Backslash before "n", as in \n, inserts a new line character')  # doctest: +NORMALIZE_WHITESPACE
Backslash before "n", as in
, inserts a new line character
If you do not want the backslash to have this special meaning, prefix your string literal with ‘r’, meaning “raw”:
>>> print(r'Prefixed by "r" the \n no longer inserts a new line')
Prefixed by "r" the \n no longer inserts a new line
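The difference is easy to check by counting characters. A minimal sketch, with variable names chosen for illustration:

```python
escaped = 'a\nb'   # three characters: a, newline, b
raw = r'a\nb'      # four characters: a, backslash, n, b

lengths = (len(escaped), len(raw))

# A raw literal here is equivalent to doubling the backslash in a normal literal.
equivalent = raw == 'a\\nb'

print(lengths, equivalent)
```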
You can use triple quotes to enclose strings with more than one line:
>>> print('''This string literal
... has more than one
... line''')
This string literal
has more than one
line
Triple quotes can use single or double quote marks:
>>> print("""This string literal
... also has more than one
... line""")
This string literal
also has more than one
line
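A triple-quoted literal is still an ordinary string: each line break inside it is stored as a newline character. A short sketch, with illustrative variable names:

```python
multi = '''first
second'''

# The line break inside the literal is the same as writing \n in a normal literal.
same = multi == 'first\nsecond'

# splitlines() recovers the individual lines.
lines = multi.splitlines()

print(same, lines)
```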