Regex replace html tag content

I am trying to replace HTML content with a regular expression.

from

test test ZZZZZZ test test

to

test test AAAAAA test test

Note that only words outside HTML tags are replaced, from ZZZ to AAA.

Any ideas? Thanks a lot in advance.

asked May 18, 2011 at 7:26

iwan


You could walk all the nodes and replace the text in the text nodes (nodeType === 3):

Something like:

element.find('*:contains(ZZZ)').contents().each(function () {
    if (this.nodeType === 3)
        this.nodeValue = this.nodeValue.replace(/ZZZ/g, 'AAA');
});

Or the same without jQuery:

function replaceText(element, from, to) {
    for (var child = element.firstChild; child !== null; child = child.nextSibling) {
        if (child.nodeType === 3)
            // Text node: replace in its value
            child.nodeValue = child.nodeValue.replace(from, to);
        else if (child.nodeType === 1)
            // Element node: recurse into its children
            replaceText(child, from, to);
    }
}

replaceText(element, /ZZZ/g, 'AAA');

answered May 18, 2011 at 7:58

Suor


The best idea in this case is most certainly not to use regular expressions for this. At least not on their own. JavaScript surely has an HTML parser somewhere?

If you really must use regular expressions, you could try to look for every instance of ZZZ that is followed by a "<" before any ">". That would look like

ZZZ(?=[^>]*<)

Another option is to match the text between the ">" that closes one tag and the "<" that opens the next, where [^<]* stands for any number of characters that are not "<":

  • replace:

    >([^<]*)(ZZZ)([^<]*)<

  • with:

    >$1AAA$3<
    

    but beware all the savvy suggestions in the post linked in the first comment to your question!
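For what it's worth, here is a minimal sketch of both regex ideas in plain JavaScript. It assumes the patterns are the lookahead `ZZZ(?=[^>]*<)` and the three-group form `>([^<]*)(ZZZ)([^<]*)<`. The sample markup and the function names `replaceWithLookahead` and `replaceWithGroups` are made up for illustration, since the markup in the question did not survive. Treat it as a demonstration on simple, well-formed input, not a robust HTML rewriter.

```javascript
// Hypothetical sample input (not from the question).
const html = '<p>ZZZ <a href="ZZZ.html" title="ZZZ">ZZZ link</a> ZZZ</p>';

// Approach 1: lookahead. ZZZ matches only if the next angle bracket
// after it is "<", i.e. we are in text and not inside a tag.
function replaceWithLookahead(input) {
    return input.replace(/ZZZ(?=[^>]*<)/g, 'AAA');
}

// Approach 2: capture the text between ">" and "<" in three groups and
// rebuild it. Because [^<]* is greedy, one pass replaces only the last
// ZZZ in each text run, so repeat until nothing changes.
function replaceWithGroups(input) {
    const pattern = />([^<]*)(ZZZ)([^<]*)</g;
    let output = input;
    let previous;
    do {
        previous = output;
        output = output.replace(pattern, '>$1AAA$3<');
    } while (output !== previous);
    return output;
}

console.log(replaceWithLookahead(html));
// <p>AAA <a href="ZZZ.html" title="ZZZ">AAA link</a> AAA</p>
console.log(replaceWithGroups(html));
// <p>AAA <a href="ZZZ.html" title="ZZZ">AAA link</a> AAA</p>
```

Both versions leave the ZZZ in the attribute values alone, but both rely on the text being surrounded by tags. They will misbehave on a ">" inside an attribute value, on comments, and on script blocks, which is why walking the DOM is usually the safer route.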

    answered May 18, 2011 at 7:46

    sergio
