Using a variable within a regular expression in Pandas str.contains()
Answer a question
I'm attempting to select rows from a dataframe using the pandas str.contains() function with a regular expression that contains a variable as shown below.
import pandas as pd
df = pd.DataFrame(["A test Case","Another Testing Case"], columns=list("A"))
variable = "test"
df[df["A"].str.contains(r'\b' + variable + '\b', regex=True, case=False)] #Returns nothing
While the above returns nothing, the following returns the appropriate row as expected:
df[df["A"].str.contains(r'\btest\b', regex=True, case=False)] #Returns values as expected
Any help would be appreciated.
Answers
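Both word-boundary escapes must be inside raw strings. In your code the trailing '\b' is an ordinary string literal, so Python converts it to a backspace character (\x08) before the regex engine ever sees it, and the pattern never matches. Why not use some sort of string formatting instead? It keeps the whole pattern in a single raw string, whereas concatenation makes it easy to miss an r prefix.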
df[df["A"].str.contains(fr'\b{variable}\b', regex=True, case=False)]
# Or,
# df[df["A"].str.contains(r'\b{}\b'.format(variable), regex=True, case=False)]
A
0 A test Case
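To see the difference for yourself, here is a minimal sketch that compares the two concatenated patterns and then runs the corrected filter. The names broken, fixed, pattern and matches are just illustrative, and the re.escape() call is an optional extra that is not part of the original question: it is only there on the assumption that variable might someday contain regex metacharacters such as "." or "+".

```python
import re

import pandas as pd

df = pd.DataFrame({"A": ["A test Case", "Another Testing Case"]})
variable = "test"

# Without the r prefix, Python turns '\b' into a backspace character (\x08).
broken = r'\b' + variable + '\b'
# With the r prefix on both pieces, the regex engine receives two word boundaries.
fixed = r'\b' + variable + r'\b'

print(repr(broken))
print(repr(fixed))

# Optional safeguard (an assumption, not in the question): escape the variable
# so that any regex metacharacters in it are matched literally.
pattern = rf'\b{re.escape(variable)}\b'
matches = df[df["A"].str.contains(pattern, regex=True, case=False)]
print(matches)
```

Printing the repr() of each pattern makes the stray backspace visible, which is usually the quickest way to debug this kind of problem.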