Guide: htmlspecialchars_decode in JavaScript

Apparently, this is harder to find than I thought it would be. And it even is so simple...

Is there a function equivalent to PHP's htmlspecialchars built into JavaScript? I know it's fairly easy to implement that yourself, but using a built-in function, if available, is just nicer.

For those unfamiliar with PHP, htmlspecialchars translates stuff like <htmltag/> into &lt;htmltag/&gt;

I know that escape() and encodeURI() do not work this way.

asked Nov 24, 2009 at 1:59

Bart van Heukelom

There is a problem with your solution code--it will only escape the first occurrence of each special character. For example:

escapeHtml('Kip\'s <b>evil</b> "test" code\'s here');
Actual:   Kip&#039;s &lt;b&gt;evil</b> &quot;test" code's here
Expected: Kip&#039;s &lt;b&gt;evil&lt;/b&gt; &quot;test&quot; code&#039;s here

Here is code that works properly:

function escapeHtml(text) {
  return text
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;")
      .replace(/"/g, "&quot;")
      .replace(/'/g, "&#039;");
}
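
To see why the g flag matters, here is a minimal sketch comparing a first-occurrence-only version with a global one. The names escapeFirstOnly and escapeAll are hypothetical and exist only for this illustration.

```javascript
// Hypothetical helper: a string pattern replaces only the first match.
function escapeFirstOnly(text) {
  return text
    .replace("&", "&amp;")
    .replace("<", "&lt;")
    .replace(">", "&gt;")
    .replace('"', "&quot;")
    .replace("'", "&#039;");
}

// Hypothetical helper: a regex with the g flag replaces every match.
function escapeAll(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}

var input = '<b>evil</b>';
console.log(escapeFirstOnly(input)); // &lt;b&gt;evil</b>
console.log(escapeAll(input));       // &lt;b&gt;evil&lt;/b&gt;
```

The closing tag survives unescaped in the first version, which is exactly the kind of gap an attacker would look for.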

Update

The following code will produce identical results to the above, but it performs better, particularly on large blocks of text (thanks jbo5112).

function escapeHtml(text) {
  var map = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#039;'
  };
  
  return text.replace(/[&<>"']/g, function(m) { return map[m]; });
}

answered Jan 29, 2011 at 5:48

That's HTML Encoding. There's no native javascript function to do that, but you can google and get some nicely done up ones.

E.g. http://sanzon.wordpress.com/2008/05/01/neat-little-html-encoding-trick-in-javascript/

EDIT:
This is what I've tested:

var div = document.createElement('div');
var text = document.createTextNode('<htmltag/>');
div.appendChild(text);
console.log(div.innerHTML);

Output: &lt;htmltag/&gt;

answered Nov 24, 2009 at 2:04

o.k.w

Worth a read: http://bigdingus.com/2007/12/29/html-escaping-in-javascript/

escapeHTML: (function() {
 var MAP = {
   '&': '&amp;',
   '<': '&lt;',
   '>': '&gt;',
   '"': '&quot;',
   "'": '&#039;'
 };
  var repl = function(c) { return MAP[c]; };
  return function(s) {
    return s.replace(/[&<>'"]/g, repl);
  };
})()

Note: Only run this once. And don't run it on already encoded strings, e.g. &amp; becomes &amp;amp;
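
A short sketch of the double-encoding problem the note warns about. The function is restated so the snippet runs on its own; the &#039; entity for the single quote is an assumption, since any numeric form of the apostrophe would behave the same way.

```javascript
// Standalone copy of a map-based escaper (entity choices are assumptions).
function escapeHtml(s) {
  var MAP = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;' };
  return s.replace(/[&<>'"]/g, function (c) { return MAP[c]; });
}

var once = escapeHtml('Fish & Chips');
var twice = escapeHtml(once);

console.log(once);  // Fish &amp; Chips
console.log(twice); // Fish &amp;amp; Chips
```

A browser would render the second string as the literal text "Fish &amp; Chips", which is usually not what was intended.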

answered Mar 13, 2012 at 2:09

Chris Jacob

Here's a function to escape HTML:

function escapeHtml(str)
{
    var map =
    {
        '&': '&amp;',
        '<': '&lt;',
        '>': '&gt;',
        '"': '&quot;',
        "'": '&#039;'
    };
    return str.replace(/[&<>"']/g, function(m) {return map[m];});
}

And to decode:

function decodeHtml(str)
{
    var map =
    {
        '&amp;': '&',
        '&lt;': '<',
        '&gt;': '>',
        '&quot;': '"',
        '&#039;': "'"
    };
    return str.replace(/&amp;|&lt;|&gt;|&quot;|&#039;/g, function(m) {return map[m];});
}
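
One way to sanity-check an encode/decode pair like this is a round trip: decoding the encoded text should give back the input. The sketch below restates both functions so it is self-contained; the sample string and the &#039; entity are assumptions made for the illustration.

```javascript
function escapeHtml(str) {
  var map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;' };
  return str.replace(/[&<>"']/g, function (m) { return map[m]; });
}

function decodeHtml(str) {
  var map = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"', '&#039;': "'" };
  return str.replace(/&amp;|&lt;|&gt;|&quot;|&#039;/g, function (m) { return map[m]; });
}

var original = '<a href="x">Tom & Jerry\'s</a>';
var encoded = escapeHtml(original);

console.log(encoded);
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#039;s&lt;/a&gt;
console.log(decodeHtml(encoded) === original); // true
```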

answered Jan 17, 2017 at 14:01

Dan Bray

With jQuery it can be like this:

var escapedValue = $('<div/>').text(value).html();

From related question Escaping HTML strings with jQuery

As mentioned in the comments, this implementation leaves double and single quotes as-is. That means it should not be used when you build an element attribute as a raw HTML string.
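
To illustrate the attribute problem without needing jQuery or a DOM, the sketch below uses a hypothetical stand-in, escapeTextOnly, that escapes only &, < and >, which is assumed to approximate what the text-then-html trick produces. escapeWithQuotes is likewise a hypothetical name.

```javascript
// Hypothetical stand-in: escapes &, < and > but leaves quotes alone.
function escapeTextOnly(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical variant that also escapes both kinds of quotes.
function escapeWithQuotes(value) {
  return escapeTextOnly(value)
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

var value = '" onmouseover="alert(1)';
var unsafe = '<input value="' + escapeTextOnly(value) + '">';
var safe = '<input value="' + escapeWithQuotes(value) + '">';

console.log(unsafe); // <input value="" onmouseover="alert(1)">
console.log(safe);   // <input value="&quot; onmouseover=&quot;alert(1)">
```

In the first result the untrusted value closes the attribute and adds an event handler of its own; in the second it stays inside the attribute value.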

answered Sep 2, 2010 at 11:51

Underscore.js provides a function for this:

_.escape(string)

Escapes a string for insertion into HTML, replacing &, <, >, ", and ' characters.

http://underscorejs.org/#escape

It's not a built-in JavaScript function, but if you are already using Underscore.js, it is a better alternative than writing your own function if your strings to convert are not too large.

answered Jun 2, 2014 at 12:14

mer10z_tech

Yet another take at this is to forgo all the character mapping altogether and to instead convert all unwanted characters into their respective numeric character references, e.g.:

function escapeHtml(raw) {
    return raw.replace(/[&<>"']/g, function onReplace(match) {
        return '&#' + match.charCodeAt(0) + ';';
    });
}

Note that the specified RegEx only handles the specific characters that the OP wanted to escape but, depending on the context that the escaped HTML is going to be used, these characters may not be sufficient. Ryan Grove’s article There's more to HTML escaping than &, <, >, and " is a good read on the topic. And depending on your context, the following RegEx may very well be needed in order to avoid XSS injection:

var regex = /[&<>"'` !@$%()=+{}[\]]/g
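
A minimal sketch combining the wider character class with the numeric-reference replacement from above. The name escapeHtmlStrict is hypothetical, and whether this set of characters is sufficient still depends on the context the output is placed in.

```javascript
// Wider character class, as quoted above.
var regex = /[&<>"'` !@$%()=+{}[\]]/g;

// Hypothetical helper: replaces each matched character with &#NN;.
function escapeHtmlStrict(raw) {
  return raw.replace(regex, function (match) {
    return '&#' + match.charCodeAt(0) + ';';
  });
}

console.log(escapeHtmlStrict('a=b')); // a&#61;b
console.log(escapeHtmlStrict('x y')); // x&#32;y
```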

answered Sep 8, 2014 at 16:48

Fredric

Chances are you don't need such a function. Since your code is already in the browser*, you can access the DOM directly instead of generating and encoding HTML that will have to be decoded backwards by the browser to be actually used.

Use the textContent property to insert plain text into the DOM safely and much faster than using any of the presented escape functions. It is even faster than assigning a static pre-encoded string to innerHTML.

Use classList to edit classes, dataset to set data- attributes and setAttribute for others.

All of these will handle escaping for you. More precisely, no escaping is needed and no encoding will be performed underneath**, since you are working around HTML, the textual representation of DOM.

// use existing element
var author = 'John "Superman" Doe <>';
var el = document.getElementById('first');
el.dataset.author = author;
el.textContent = 'Author: '+author;

// or create a new element
var a = document.createElement('a');
a.classList.add('important');
a.href = '/search?q=term+"exact"&n=50';
a.textContent = 'Search for "exact" term';
document.body.appendChild(a);

// actual HTML code
console.log(el.outerHTML);
console.log(a.outerHTML);

/* CSS for the snippet above */
.important { color: red; }

* This answer is not intended for server-side JavaScript users (Node.js, etc.)

** Unless you explicitly convert it to actual HTML afterwards, e.g. by accessing innerHTML. This is what happens when you run $('<div/>').text(value).html(), as suggested in other answers. So if your final goal is to insert some data into the document, doing it this way means doing the work twice. You can also see that not everything is encoded in the resulting HTML, only the minimum needed for it to be valid. The encoding is context-dependent, which is why this jQuery method doesn't encode quotes and therefore should not be used as a general-purpose escaper. Quote escaping is needed when you construct HTML as a string and place untrusted or quote-containing data in an attribute's value. If you use the DOM API, you don't have to care about escaping at all.

answered Nov 29, 2017 at 16:22

user

Use:

String.prototype.escapeHTML = function() {
        return this.replace(/&/g, "&amp;")
                   .replace(/</g, "&lt;")
                   .replace(/>/g, "&gt;")
                   .replace(/"/g, "&quot;")
                   .replace(/'/g, "&#039;");
    }

Sample:

var toto = "test<br>";
alert(toto.escapeHTML());

answered Mar 20, 2014 at 8:31

patrick

function htmlEscape(str){
    return str.replace(/[&<>'"]/g,x=>'&#'+x.charCodeAt(0)+';')
}

This solution uses the numerical code of the characters, for example < is replaced by &#60;.

Although its performance is slightly worse than the solution using a map, it has the advantages:

  • Not dependent on a library or DOM
  • Pretty easy to remember (you don't need to memorize the 5 HTML escape characters)
  • Little code
  • Reasonably fast (it's still faster than 5 chained replace)

answered Nov 2, 2018 at 14:33

user202729

I am elaborating a bit on o.k.w.'s answer.

You can use the browser's DOM functions for that.

var utils = {
    dummy: document.createElement('div'),
    escapeHTML: function(s) {
        this.dummy.textContent = s
        return this.dummy.innerHTML
    }
}

utils.escapeHTML('<escapeThis>&')

This returns &lt;escapeThis&gt;&amp;

It uses the standard function createElement to create an invisible element, then sets the textContent property to any string and reads innerHTML to get the content in its HTML representation.

answered Feb 27, 2019 at 23:02

Jonas Eberle

By the books

OWASP recommends that "[e]xcept for alphanumeric characters, [you should] escape all characters with ASCII values less than 256 with the &#xHH; format (or a named entity if available) to prevent switching out of [an] attribute."

So here's a function that does that, with a usage example:

function escapeHTML(unsafe) {
  return unsafe.replace(
    /[\u0000-\u002F\u003A-\u0040\u005B-\u0060\u007B-\u00FF]/g,
    c => '&#' + ('000' + c.charCodeAt(0)).slice(-4) + ';'
  )
}

document.querySelector('div').innerHTML =
  '' +
  escapeHTML('

For Node.js users (or users using the Jade runtime in the browser), you can use Jade's escape function.

require('jade').runtime.escape(...);

There isn't any sense in writing it yourself if someone else is maintaining it. :)

answered Oct 28, 2011 at 20:37

BMiner

// Encode the characters: &, <, >, ", '
function encodeHtml(str) {

  var map = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#039;'
  };

  return str.replace(/[&<>"']/g, function(m) {return map[m];});
}

// Decode the entities: &amp; &lt; &gt; &quot; &#039;
function decodeHtml(str) {

  var map = {
    '&amp;': '&',
    '&lt;': '<',
    '&gt;': '>',
    '&quot;': '"',
    '&#039;': "'"
  };

  return str.replace(/&amp;|&lt;|&gt;|&quot;|&#039;/g, function(m) {return map[m];});
}

var str = `atttt ++ ' ' " " " " " + {}-´ñ+.'aAAAaaaa"`;

var str2 = `atttt ++ &#039; &#039; &quot; &quot; &quot; &quot; &quot; + {}-´ñ+.&#039;aAAAaaaa&quot;`;


console.log(encodeHtml(str));
console.log(decodeHtml(str2));

- Input string: atttt ++ ' ' " " " " " + {}-´ñ+.'aAAAaaaa"
- see the console 👇

answered Mar 16 at 23:07

function htmlspecialchars(str) {
 if (typeof(str) == "string") {
  str = str.replace(/&/g, "&amp;"); /* must do & first */
  str = str.replace(/"/g, "&quot;");
  str = str.replace(/'/g, "&#039;");
  str = str.replace(/</g, "&lt;");
  str = str.replace(/>/g, "&gt;");
  }
 return str;
 }
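
The "must do & first" comment is worth a quick demonstration. In this sketch (both function names are hypothetical), handling & last re-escapes the ampersands that the earlier replacements just produced.

```javascript
// Hypothetical wrong order: < and > first, & last.
function escapeWrongOrder(str) {
  return str
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/&/g, '&amp;');
}

// Hypothetical right order: & first, then the rest.
function escapeRightOrder(str) {
  return str
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

console.log(escapeWrongOrder('<p>')); // &amp;lt;p&amp;gt;
console.log(escapeRightOrder('<p>')); // &lt;p&gt;
```

Single-pass approaches that use one regex and a lookup map avoid the ordering question entirely, because each character is visited once.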

answered Mar 4, 2013 at 12:35

This isn't directly related to this question, but the reverse could be accomplished in JS through:

> String.fromCharCode(8212);
> "—"

That also works with TypeScript.
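
Building on that idea, here is a hedged sketch of decoding decimal numeric character references inside a longer string. The helper name decodeNumericRefs is hypothetical; it handles only the decimal &#NNN; form, not hexadecimal or named entities, and String.fromCodePoint throws a RangeError for numbers that are not valid code points.

```javascript
// Hypothetical helper: turns decimal references such as &#8212; back into characters.
function decodeNumericRefs(str) {
  return str.replace(/&#(\d+);/g, function (match, code) {
    return String.fromCodePoint(parseInt(code, 10));
  });
}

console.log(decodeNumericRefs('&#60;b&#62;')); // <b>
console.log(decodeNumericRefs('a &#8212; b')); // "a", U+2014, "b"
```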

answered Dec 14, 2020 at 17:52

Philippe Fanaro

I hope this wins the race thanks to its performance and, most importantly, because it avoids chained logic such as .replace('&','&amp;').replace('<','&lt;')...

var mapObj = {
   '&':  "&amp;",
   '<':  "&lt;",
   '>':  "&gt;",
   '"':  "&quot;",
   '\'': "&#039;"
};
var re = new RegExp(Object.keys(mapObj).join("|"), "gi");

function escapeHtml(str)
{
    return str.replace(re, function(matched)
    {
        return mapObj[matched.toLowerCase()];
    });
}

console.log('');
console.log(escapeHtml(''));

answered Feb 26, 2014 at 16:45

Airy

Reversed one:

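function decodeHtml(text) {
    return text
        .replace(/&lt;/g, '<')   // g flag: replace every occurrence, not just the first
        .replace(/&gt;/g, '>')
        .replace(/&quot;/g, '"')
        .replace(/&#039;/g, "'")
        .replace(/&amp;/g, '&'); // decode & last so "&amp;lt;" becomes "&lt;", not "<"
}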
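
One subtlety when reversing the escape is the position of the &amp; step. The sketch below (function names are hypothetical) contrasts a chained decoder that handles &amp; first with a single-pass decoder that looks at every entity exactly once.

```javascript
// Hypothetical chained decoder that handles &amp; first.
function decodeAmpFirst(text) {
  return text
    .replace(/&amp;/g, '&')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>');
}

// Hypothetical single-pass decoder: each entity is matched once.
function decodeSinglePass(text) {
  var map = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"', '&#039;': "'" };
  return text.replace(/&(?:amp|lt|gt|quot|#039);/g, function (m) { return map[m]; });
}

// '&amp;lt;' is the escaped form of the literal text '&lt;'.
console.log(decodeAmpFirst('&amp;lt;'));   // <     (decoded twice)
console.log(decodeSinglePass('&amp;lt;')); // &lt;  (decoded once)
```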

rgmt

answered Dec 1, 2016 at 8:35
