Some regular expression engines support `(*FAIL)` as a pattern which fails to match anything. `(?!)` is an idiomatic way to write this in engines which do not support `(*FAIL)`.
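As a minimal sketch of that idiom in Python's `re` module (the name `FAIL` is only an illustrative constant, not something the module provides): the empty pattern always matches, so an empty negative lookahead can never succeed.

```python
import re

# Illustrative constant: an empty negative lookahead.
# The empty pattern always matches, so its negation never does.
FAIL = r'(?!)'

# On its own, the pattern matches nothing, not even the empty string.
assert re.search(FAIL, 'anything') is None
assert re.search(FAIL, '') is None

# In the alternation used in this issue, the first branch can never
# succeed, so only the second branch ('3') can produce a match.
pattern = re.compile(r'12' + FAIL + r'|3')
assert pattern.fullmatch('12') is None
assert pattern.fullmatch('3') is not None

m = pattern.search('123')
assert m is not None and m.group() == '3'
```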
It works pretty well, but it can be optimized. Instead of compiling it as an ASSERT_NOT opcode:
>>> re.compile(r'12(?!)|3', re.DEBUG)
BRANCH
  LITERAL 49
  LITERAL 50
  ASSERT_NOT 1
OR
  LITERAL 51

 0. INFO 9 0b100 1 2 (to 10)
      in
 5.     LITERAL 0x31 ('1')
 7.     LITERAL 0x33 ('3')
 9.     FAILURE
10: BRANCH 11 (to 22)
12.   LITERAL 0x31 ('1')
14.   LITERAL 0x32 ('2')
16.   ASSERT_NOT 3 0 (to 20)
19.     SUCCESS
20:   JUMP 7 (to 28)
22: branch 5 (to 27)
23.   LITERAL 0x33 ('3')
25.   JUMP 2 (to 28)
27: FAILURE
28: SUCCESS
re.compile('12(?!)|3', re.DEBUG)
it can be compiled as a FAILURE opcode:
>>> re.compile(r'12(?!)|3', re.DEBUG)
BRANCH
  LITERAL 49
  LITERAL 50
  FAILURE
OR
  LITERAL 51

 0. INFO 9 0b100 1 2 (to 10)
      in
 5.     LITERAL 0x31 ('1')
 7.     LITERAL 0x33 ('3')
 9.     FAILURE
10: BRANCH 8 (to 19)
12.   LITERAL 0x31 ('1')
14.   LITERAL 0x32 ('2')
16.   FAILURE
17.   JUMP 7 (to 25)
19: branch 5 (to 24)
20.   LITERAL 0x33 ('3')
22.   JUMP 2 (to 25)
24: FAILURE
25: SUCCESS
re.compile('12(?!)|3', re.DEBUG)
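To check which of the two forms a given interpreter produces, the debug dump can be captured and searched. This is only a rough sketch: `debug_dump` is a hypothetical helper written for this illustration, and it assumes that `re.DEBUG` prints to standard output, as in the sessions above.

```python
import contextlib
import io
import re


def debug_dump(pattern: str) -> str:
    """Return the text that re.DEBUG prints while compiling `pattern`.

    Hypothetical helper for illustration; not part of the re module.
    """
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        re.compile(pattern, re.DEBUG)
    return buf.getvalue()


dump = debug_dump(r'12(?!)|3')

# FAILURE appears in both dumps (it also terminates the character set and
# the branch list), so the presence of ASSERT_NOT is what tells them apart.
uses_assert_not = 'ASSERT_NOT' in dump
print('compiled as ASSERT_NOT' if uses_assert_not else 'compiled as FAILURE')
```

The result depends on the interpreter version, so no expected output is shown here.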
Unfortunately, I do not know of good examples of using `(*FAIL)` in regular expressions (without also using `(*SKIP)`) that could be included in the documentation. Perhaps other patterns of using `(*FAIL)` could be optimized in the future, but I do not know what to optimize.