regex - How can I get my Regular Expression to take the first match, and ignore any following matches?


I am writing regular expressions to extract dosage instructions from a pharmaceutical catalog. I am getting the information from many different brands, and the formatting is not consistent even within a brand, so the expression has to be fairly lenient. The regular expressions are being implemented in Ruby (but not by me).

My regex is as follows:

/(take|chew\s|usage:|use:|intake:|dosage:?|dose:|directions:|recommendations:|adults:)\s*(.*take\s+|.*chew\s+|.*mix\s+|.*supplement,\s+)?(?<dosage_amount>\S+(\sto\s\S+)?(\sor\s\S+)?(\s\(\d+\)\s)?\b)[\s,](?<dosage_format>\S+\b(\s\([\w\-\.]+\))?)?[\s,]*?(?<dosage_frequency>[\s\S]*(daily|per day|a day|needed|morning|evening))?[\s,]?\s?(daily\s)?(?<dosage_permutation>(with|on|at|in|before|after|taken)[,\w\s\-]*)?(?=or as|\.)?/

An example of the code working correctly is the following description:

"suggested use: dietary supplement, take 1-3 capsules daily,in divided doses, before meal."

This gives dosage_amount = 1-3, dosage_format = capsule, dosage_frequency = once per day, and dosage_permutation = "in divided doses, before meal".

However, I am getting problems with descriptions like:

"directions: adults, take 1 (1) tablet daily, preferably meal or follow advice of health care professional. let tablets dissolve on tongue before swallowing. reminder, discuss supplements , medications take health care providers. "

The problem is that the word "take" is used more than once in the description. I get dosage_amount = with and dosage_format = your. (It is matching the second 'take', not the first.)

Is there a way to force the regex to match the first 'take' in the description? I have tried experimenting with making it greedy vs. non-greedy as outlined here, but I cannot make it work.

Thank you.

Try replacing the greedy part here:

.*take 

with the non-greedy version:

.*?take 

The first variant consumes as many characters as possible, the second as few as possible.
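To see the difference in isolation, here is a minimal Ruby sketch. It does not use the full expression from the question: the two patterns are simplified, hypothetical stand-ins that keep only the "directions:" prefix, the optional `.*take\s+` part, and a bare `dosage_amount` group. The sample text is a shortened variant of the problem description, and it assumes the words "with your" follow the second "take", as the reported captures suggest.

```ruby
# Shortened sample text (an assumption, not the exact catalog entry).
text = "directions: adults, take 1 (1) tablet daily, preferably with a meal. " \
       "as a reminder, discuss the medications you take with your health care providers."

# Greedy: .* runs to the end of the line and backtracks to the LAST "take".
greedy = /directions:\s*(.*take\s+)?(?<dosage_amount>\S+)/

# Non-greedy: .*? grows one character at a time and stops at the FIRST "take".
lazy = /directions:\s*(.*?take\s+)?(?<dosage_amount>\S+)/

greedy_amount = greedy.match(text)[:dosage_amount]
lazy_amount   = lazy.match(text)[:dosage_amount]

puts greedy_amount # prints "with"
puts lazy_amount   # prints "1"
```

In the full expression the same change would presumably be needed on each `.*` alternative in that group (`.*chew\s+`, `.*mix\s+`, `.*supplement,\s+`), since any one of them that stays greedy can still skip ahead to a later occurrence.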

