python - Search patern to include square brackets -


i trying search exact words in file. read file lines , loop through lines find exact words. in keyword not suitable finding exact words, using regex pattern.

def findword(w):     return re.compile(r'\b({0})\b'.format(w), flags=re.ignorecase).search 

the problem function is doesn't recognizes square brackets [xyz].

for example

findword('data_var_cod[0]')('cod_byte1 = data_var_cod[0]')  

returns none whereas

findword('data_var_cod')('cod_byte1 = data_var_cod')  

returns <_sre.sre_match object @ 0x0000000015622288>

can please me tweak regex pattern?

it's because of regex engine assume square brackets character class regex characters ride of problem need escape regex characters. can use re.escape function :

def findword(w):     return re.compile(r'\b({0})\b'.format(re.escape(w)), flags=re.ignorecase).search 

also more pythonic way fro matches can use re.fildall() returns list of matches or re.finditer returns iterator contains matchobjects.

but still way not complete , efficient because when using word boundary inner word must contains 1 type characters.

>>> ss = 'hello string [processing] in python.'   >>>re.compile(r'\b({0})\b'.format(re.escape('[processing]')),flags=re.ignorecase).search(ss) >>>  >>>re.compile(r'({})'.format(re.escape('[processing]')),flags=re.ignorecase).search(ss).group(0) '[processing]' 

so suggest remove word boundaries if words contains none word characters.

but more general way can use following regex use positive around match words surround space or come @ end of string or leading:

r'(?: |^)({})(?=[. ]|$) ' 

Comments