Compile html5lib with Cython #524

Draft
wants to merge 22 commits into
base: master
Commits
c52e731
Get rid of getPhases
gsnedders Jun 23, 2020
8cff6aa
fixup! Get rid of getPhases
gsnedders Jun 23, 2020
6eb4d2d
Move tests
gsnedders Oct 18, 2020
d2474af
Make InputStream.readChunk default an int
gsnedders Jan 4, 2021
0904df3
Remove last trace of Tokenizer.lastFourChars
gsnedders Jan 4, 2021
8ebff2e
Move Tokenizer.state to Tokenizer._state
gsnedders Jan 4, 2021
4a8e28a
Instead of comparing with a set of ints, use maths
gsnedders Jan 4, 2021
2ae13cc
Remove unused Tokenizer.escape/escapeFlag
gsnedders Jan 4, 2021
c22d069
Avoid needless setter write, mutate value directly
gsnedders Jan 5, 2021
81b3aaf
Reduce list/tuple access
gsnedders Jan 5, 2021
47df02b
Move lowercasing to _ascii module
gsnedders Jan 5, 2021
7d7a079
Always initialize Parser.tokenizer
gsnedders Jan 5, 2021
1acb5dd
Remove long unused Parser.lastPhase/Parser.beforeRCDataPhase
gsnedders Jan 5, 2021
b6a6484
Speed-up Parser.mainLoop a bit
gsnedders Jan 5, 2021
4822712
Get rid of more frozenset calls around constants
gsnedders Jan 5, 2021
f06451e
Add assert for leavingThisState
gsnedders Oct 30, 2020
9e9ff5f
Avoid recursion in etree.testSerializer
gsnedders Oct 27, 2020
2036738
Get rid of remaining non-decorator property()
gsnedders Oct 27, 2020
2c8e0ec
Call super().f() rather than Base.f(self)
gsnedders Jan 5, 2021
84cbc20
Move _getEtreeTag out of the class
gsnedders Oct 29, 2020
8b89668
Change attributes to be created as dicts from day one
gsnedders Oct 29, 2020
e65c433
Start of Cythonizing the tokenizer
gsnedders Jan 4, 2021
Avoid needless setter write, mutate value directly
gsnedders committed Jan 5, 2021
commit c22d06971a85b3ae38a2e1cfcefc316aade6feda
2 changes: 1 addition & 1 deletion html5lib/_inputstream.py
@@ -660,7 +660,7 @@ def matchBytes(self, bytes):
         match. Otherwise return False and leave the position alone"""
         rv = self.startswith(bytes, self.position)
         if rv:
-            self.position += len(bytes)
+            self._position += len(bytes)
         return rv
 
     def jumpTo(self, bytes):
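
For readers unfamiliar with why this one-line change matters, here is a minimal sketch of the general pattern the commit title describes. It assumes `position` is a property backed by `_position`; the `Cursor` class, its call counters, and the bounds check are hypothetical illustrations, not html5lib code. An augmented assignment through a property runs the getter and then the setter, while mutating the backing attribute skips both.

```python
class Cursor:
    """Hypothetical stand-in for a buffer with a checked position property."""

    def __init__(self, data):
        self.data = data
        self._position = 0
        # Counters exist only to make the property traffic visible.
        self.getter_calls = 0
        self.setter_calls = 0

    @property
    def position(self):
        self.getter_calls += 1
        return self._position

    @position.setter
    def position(self, value):
        self.setter_calls += 1
        # Assumed validation; the real setter may check something else.
        if value > len(self.data):
            raise ValueError("position out of range")
        self._position = value

    def advance_via_property(self, n):
        # Expands to: self.position = self.position + n
        # so it costs one getter call plus one setter call.
        self.position += n

    def advance_directly(self, n):
        # Touches the backing attribute only; no property code runs.
        self._position += n


cursor = Cursor(b"abcdef")
cursor.advance_via_property(2)
calls_after_property = (cursor.getter_calls, cursor.setter_calls)
cursor.advance_directly(2)
calls_after_direct = (cursor.getter_calls, cursor.setter_calls)
```

The trade-off is that the direct write also skips whatever validation the setter performs, so it is only safe where the new value is already known to be valid, as it plausibly is right after a successful `startswith` match.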