Answer a question

I have a string where special characters like ' or " or & (...) can appear. In the string:

string = """ Hello "XYZ" this 'is' a test & so on """

how can I automatically escape every special character, so that I get this:

string = " Hello &quot;XYZ&quot; this &#39;is&#39; a test &amp; so on "

Answers

In Python 3.2 and later, you can use the html.escape function, e.g.

>>> string = """ Hello "XYZ" this 'is' a test & so on """
>>> import html
>>> html.escape(string)
' Hello &quot;XYZ&quot; this &#x27;is&#x27; a test &amp; so on '
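
As a rough illustration of the call above, the sketch below contrasts the default behaviour with quote=False and checks the round trip through html.unescape (available since Python 3.4). The variable names are only illustrative.

```python
import html

text = """ Hello "XYZ" this 'is' a test & so on """

# quote=True is the default: &, <, >, " and ' are all escaped.
escaped = html.escape(text)

# quote=False leaves both kinds of quotes alone and escapes only &, < and >.
escaped_no_quotes = html.escape(text, quote=False)

print(escaped)
print(escaped_no_quotes)

# html.unescape reverses the escaping and restores the original string.
assert html.unescape(escaped) == text
```

Escaping quotes matters mostly when the string ends up inside an HTML attribute value; for text between tags, quote=False is usually enough.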

For earlier versions of Python, check http://wiki.python.org/moin/EscapingHtml:

The cgi module that comes with Python has an escape() function:

import cgi

s = cgi.escape( """& < >""" )   # s = "&amp; &lt; &gt;"

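However, it doesn't escape characters beyond &, <, and >. If it is used as cgi.escape(string_to_escape, quote=True), it also escapes ". Note that cgi.escape was deprecated in Python 3.2 and removed in Python 3.8, so this only applies to older versions.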


Here's a small snippet that will let you escape quotes and apostrophes as well:

html_escape_table = {
    "&": "&amp;",
    '"': "&quot;",
    "'": "&apos;",
    ">": "&gt;",
    "<": "&lt;",
}

def html_escape(text):
    """Produce entities within text."""
    return "".join(html_escape_table.get(c, c) for c in text)
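
One possible variant of the snippet above, assuming Python 3, is to build the mapping into a translation table once and let str.translate do the per-character replacement. The helper name html_escape_translate is hypothetical, not something from the wiki page.

```python
# Same mapping as in the wiki snippet; &apos; is defined in XML and HTML5,
# but not in HTML 4, so &#39; may be a safer choice for old user agents.
html_escape_table = {
    "&": "&amp;",
    '"': "&quot;",
    "'": "&apos;",
    ">": "&gt;",
    "<": "&lt;",
}

# str.maketrans accepts a dict of single characters mapped to strings.
_translation = str.maketrans(html_escape_table)


def html_escape_translate(text):
    """Escape text in one pass with str.translate (hypothetical helper)."""
    return text.translate(_translation)


sample = """ Hello "XYZ" this 'is' a test & so on """
print(html_escape_translate(sample))
```

Because translation works character by character, the ampersands introduced by the replacements are not escaped a second time.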

You can also use escape() from xml.sax.saxutils to escape HTML. This function should execute faster. The unescape() function of the same module decodes a string; it takes the inverse mapping (entity to character) as its second argument.

from xml.sax.saxutils import escape, unescape
# escape() and unescape() take care of &, < and > on their own.
html_escape_table = {
    '"': "&quot;",
    "'": "&apos;"
}
html_unescape_table = {v:k for k, v in html_escape_table.items()}

def html_escape(text):
    return escape(text, html_escape_table)

def html_unescape(text):
    return unescape(text, html_unescape_table)
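
A short usage sketch for the two helpers above, repeated here so the block runs on its own. It escapes the string from the question and checks that unescaping gives the original back.

```python
from xml.sax.saxutils import escape, unescape

# Extra entities on top of the &, < and > that saxutils handles itself.
html_escape_table = {'"': "&quot;", "'": "&apos;"}
# Inverse mapping (entity -> character) for decoding.
html_unescape_table = {v: k for k, v in html_escape_table.items()}

original = """ Hello "XYZ" this 'is' a test & so on """

escaped = escape(original, html_escape_table)
restored = unescape(escaped, html_unescape_table)

print(escaped)

# The round trip should give back exactly the input string.
assert restored == original
```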