I have a string where special characters like ' or " or & (...) can appear. In the string:
string = """ Hello "XYZ" this 'is' a test & so on """
how can I automatically escape every special character, so that I get this:
string = " Hello &quot;XYZ&quot; this &#39;is&#39; a test &amp; so on "
In Python 3.2 and later, you can use the html.escape function, e.g.
>>> string = """ Hello "XYZ" this 'is' a test & so on """
>>> import html
>>> html.escape(string)
' Hello &quot;XYZ&quot; this &#x27;is&#x27; a test &amp; so on '
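As a small illustration of the call above, the sketch below also shows the quote parameter and the inverse function html.unescape (available from Python 3.4). The variable names are only for illustration.

```python
import html

text = """ Hello "XYZ" this 'is' a test & so on """

# quote=True is the default, so " and ' are escaped along with &, < and >.
escaped = html.escape(text)

# With quote=False only &, < and > are replaced; quotes are left alone.
no_quotes = html.escape(text, quote=False)

# html.unescape() turns the entities back into characters.
restored = html.unescape(escaped)

print(escaped)
print(no_quotes)
print(restored == text)
```

Passing quote=False is usually only appropriate when the result goes into element content rather than into an attribute value.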
For earlier versions of Python, check http://wiki.python.org/moin/EscapingHtml:
The cgi module that comes with Python has an escape() function (deprecated since Python 3.2 and removed in Python 3.8):
import cgi
s = cgi.escape("""& < >""")  # s = "&amp; &lt; &gt;"
However, it doesn't escape characters beyond &, <, and >. If it is used as cgi.escape(string_to_escape, quote=True), it also escapes ".
Here's a small snippet that will let you escape quotes and apostrophes as well:
html_escape_table = {
    "&": "&amp;",
    '"': "&quot;",
    "'": "&apos;",
    ">": "&gt;",
    "<": "&lt;",
}

def html_escape(text):
    """Produce entities within text."""
    return "".join(html_escape_table.get(c, c) for c in text)
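The same table can also be applied with str.translate instead of a character-by-character join. This is a variation on the snippet above, not the wiki's own code, and the names html_translation and html_escape_translate are made up for this sketch.

```python
# Build a translation table mapping each special character to its entity.
# str.maketrans accepts a dict whose keys are single characters.
html_translation = str.maketrans({
    "&": "&amp;",
    '"': "&quot;",
    "'": "&apos;",
    ">": "&gt;",
    "<": "&lt;",
})

def html_escape_translate(text):
    """Replace HTML special characters in text with entities."""
    return text.translate(html_translation)

sample = """ Hello "XYZ" this 'is' a test & so on """
result = html_escape_translate(sample)
print(result)
```

Because each character is looked up exactly once, the & inside an already produced entity is never escaped a second time.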
You can also use escape() from xml.sax.saxutils to escape HTML. This function should execute faster. The unescape() function of the same module decodes a string when it is passed the inverse mapping (entity to character).
from xml.sax.saxutils import escape, unescape

# escape() and unescape() take care of &, < and >.
html_escape_table = {
    '"': "&quot;",
    "'": "&apos;",
}
html_unescape_table = {v: k for k, v in html_escape_table.items()}

def html_escape(text):
    return escape(text, html_escape_table)

def html_unescape(text):
    return unescape(text, html_unescape_table)
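A quick round trip with the saxutils approach, using the string from the question, might look like the sketch below. The names quote_entities and inverse_entities are illustrative only.

```python
from xml.sax.saxutils import escape, unescape

# Extra entities for quotes; escape() already handles &, < and >.
quote_entities = {'"': "&quot;", "'": "&apos;"}
# unescape() needs the reverse mapping: entity -> character.
inverse_entities = {v: k for k, v in quote_entities.items()}

original = """ Hello "XYZ" this 'is' a test & so on """

encoded = escape(original, quote_entities)
decoded = unescape(encoded, inverse_entities)

print(encoded)
print(decoded == original)
```

Note that this variant emits &apos; for the apostrophe, whereas html.escape emits &#x27;.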