Compile html5lib with Cython #524

Draft
wants to merge 22 commits into
base: master
c52e731
Get rid of getPhases
gsnedders Jun 23, 2020
8cff6aa
fixup! Get rid of getPhases
gsnedders Jun 23, 2020
6eb4d2d
Move tests
gsnedders Oct 18, 2020
d2474af
Make InputStream.readChunk default an int
gsnedders Jan 4, 2021
0904df3
Remove last trace of Tokenizer.lastFourChars
gsnedders Jan 4, 2021
8ebff2e
Move Tokenizer.state to Tokenizer._state
gsnedders Jan 4, 2021
4a8e28a
Instead of comparing with a set of ints, use maths
gsnedders Jan 4, 2021
2ae13cc
Remove unused Tokenizer.escape/escapeFlag
gsnedders Jan 4, 2021
c22d069
Avoid needless setter write, mutate value directly
gsnedders Jan 5, 2021
81b3aaf
Reduce list/tuple access
gsnedders Jan 5, 2021
47df02b
Move lowercasing to _ascii module
gsnedders Jan 5, 2021
7d7a079
Always initialize Parser.tokenizer
gsnedders Jan 5, 2021
1acb5dd
Remove long unused Parser.lastPhase/Parser.beforeRCDataPhase
gsnedders Jan 5, 2021
b6a6484
Speed-up Parser.mainLoop a bit
gsnedders Jan 5, 2021
4822712
Get rid of more frozenset calls around constants
gsnedders Jan 5, 2021
f06451e
Add assert for leavingThisState
gsnedders Oct 30, 2020
9e9ff5f
Avoid recursion in etree.testSerializer
gsnedders Oct 27, 2020
2036738
Get rid of remaining non-decorator property()
gsnedders Oct 27, 2020
2c8e0ec
Call super().f() rather than Base.f(self)
gsnedders Jan 5, 2021
84cbc20
Move _getEtreeTag out of the class
gsnedders Oct 29, 2020
8b89668
Change attributes to be created as dicts from day one
gsnedders Oct 29, 2020
e65c433
Start of Cythonizing the tokenizer
gsnedders Jan 4, 2021
Get rid of getPhases
This added a fair bit of complexity, and notably made the Phase classes
dynamically generated.

However, by doing this, we no longer include "process the
token using the rules for" phases in the debug log.
gsnedders committed Sep 27, 2020
commit c52e731c5f1de115ff0d9fcb9ecef15c32463b81
12 changes: 0 additions & 12 deletions html5lib/_utils.py
@@ -145,15 +145,3 @@ def moduleFactory(baseModule, *args, **kwargs):
             return mod
 
     return moduleFactory
-
-
-def memoize(func):
-    cache = {}
-
-    def wrapped(*args, **kwargs):
-        key = (tuple(args), tuple(kwargs.items()))
-        if key not in cache:
-            cache[key] = func(*args, **kwargs)
-        return cache[key]
-
-    return wrapped
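
For readers skimming the diff, here is a minimal sketch of what the removed `memoize` helper did and why it went hand in hand with dynamically generated classes. The `memoize` body matches the deleted lines; `get_phases` and its `Phase` class are hypothetical stand-ins for illustration, not the real html5lib code.

```python
def memoize(func):
    """Cache results of func, keyed on positional and keyword arguments."""
    cache = {}

    def wrapped(*args, **kwargs):
        # Arguments must be hashable, since they form the dictionary key.
        key = (tuple(args), tuple(kwargs.items()))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapped


@memoize
def get_phases(debug):
    # Hypothetical factory: the class is created when the function runs,
    # so without caching every call would build a brand-new class.
    class Phase:
        log_enabled = debug

    return {"Phase": Phase}


phases_a = get_phases(False)
phases_b = get_phases(False)
phases_debug = get_phases(True)

# The same argument returns the cached dict; a different one builds new classes.
same_object = phases_a is phases_b
distinct_for_debug = phases_a["Phase"] is not phases_debug["Phase"]
```

With the classes defined statically instead of inside a factory, there is nothing left to cache, which is presumably why the helper could be deleted along with its only caller.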