
Use pospell #75

Open
m-aciek opened this issue Apr 30, 2025 · 5 comments
Labels
enhancement New feature or request workflows Anything to do with workflows

Comments

@m-aciek
Collaborator

m-aciek commented Apr 30, 2025

https://pypi.org/project/pospell/

Polish dictionaries are available for hunspell, which pospell uses, so we could leverage pospell to improve the quality of the translation. It would require some configuration (an extra custom dictionary and skipping code blocks). We could look at the other languages' setups.

% pospell --language pl tutorial/*.po
…
tutorial/stdlib2.po:701:heappop
tutorial/stdlib2.po:778:wywnioskowując
tutorial/stdlib2.po:778:Decimal
tutorial/stdlib2.po:791:modulo
tutorial/venv.po:35:Pythonowe
tutorial/venv.po:146:bash
tutorial/venv.po:187:deaktywować
tutorial/venv.po:199:pragramu
tutorial/venv.po:210:podkomend
tutorial/venv.po:210:install
tutorial/venv.po:210:uninstall
tutorial/venv.po:210:freeze
tutorial/venv.po:239:podajac
tutorial/whatnow.po:43:tutorial
tutorial/whatnow.po:77:Szegółowe
tutorial/whatnow.po:100:Cheese
tutorial/whatnow.po:111:Cookbook
tutorial/whatnow.po:111:Wydawnicto
tutorial/whatnow.po:111:Reilly
tutorial/whatnow.po:130:Scientific
tutorial/whatnow.po:172:Cheese
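
The output above mixes real typos (pragramu, Wydawnicto) with legitimate terms (heappop, bash), so the candidates for a custom dictionary need a manual review. As a minimal sketch, assuming the `path:line:word` format shown above, the flagged words could be tallied so that the most frequent ones are reviewed first. The function names here are hypothetical, not part of pospell:

```python
from collections import Counter


def parse_pospell_output(text):
    """Yield (path, line_number, word) for each 'path:line:word' report line."""
    for raw in text.splitlines():
        parts = raw.strip().rsplit(":", 2)
        if len(parts) != 3 or not parts[1].isdigit():
            continue  # skip elided output and anything that is not a report line
        path, line_number, word = parts
        yield path, int(line_number), word


def count_flagged_words(text):
    """Count how often each word was flagged across all files."""
    return Counter(word for _, _, word in parse_pospell_output(text))


# A few lines copied from the output above.
sample = """\
tutorial/venv.po:210:install
tutorial/whatnow.po:100:Cheese
tutorial/whatnow.po:172:Cheese
"""
counts = count_flagged_words(sample)
```

Words that survive the review could then go into the custom dictionary; typos should be fixed in the translation instead.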
@m-aciek m-aciek added the enhancement New feature or request label Apr 30, 2025
@StanFromIreland
Collaborator

It looks good, though it may be annoying with Polonized words like Pythonowe and identifiers like heappop. I will look into the other repos.

@rffontenelle

python-docs-es has a nice solution: a Python script that merges (at runtime) a base dictionary (with words common to all docs) and a per-doc dictionary. This reduces duplication if you want one dictionary file per doc, and avoids a huge single-file dictionary.
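
A rough sketch of that idea follows. This is not the python-docs-es script; the file names and the `merge_dictionaries` helper are hypothetical, and it assumes plain word lists with one word per line:

```python
from pathlib import Path
import tempfile


def read_words(path):
    """Return the non-empty, stripped lines of a word list (empty if the file is missing)."""
    path = Path(path)
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def merge_dictionaries(base_path, doc_path, out_path):
    """Write the deduplicated union of the base and per-doc word lists to out_path."""
    words = sorted(set(read_words(base_path)) | set(read_words(doc_path)))
    Path(out_path).write_text("\n".join(words) + "\n", encoding="utf-8")
    return words


# Demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "base.txt").write_text("heappop\nbash\n", encoding="utf-8")
    (tmp / "venv.txt").write_text("bash\nPythonowe\n", encoding="utf-8")
    merged = merge_dictionaries(tmp / "base.txt", tmp / "venv.txt", tmp / "merged.txt")
```

The merged file could then be passed to pospell as its personal dictionary for that one document.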

@StanFromIreland StanFromIreland added the workflows Anything to do with workflows label May 20, 2025
@StanFromIreland
Collaborator

Bigger issue: pospell crashes on code blocks. How do we exclude them?

<rst-doc>:7: (ERROR/3) Unexpected indentation. while parsing: class Parrot:
    def __init__(self):
        self._voltage = 100000
    @property
    def voltage(self):
        """Uzyska aktualne napięcie.""
        return self._voltage
Traceback (most recent call last):
<rst-doc>:3: (ERROR/3) Unexpected indentation. while parsing: # punkt to dwukrotka (x, y)
match point:
    case (0, 0):
        print("Początek")
    case (0, y):
        print(f"Y={y}")
    case (x, 0):
        print(f"X={x}")
    case (x, y):
        print(f"X={x}, Y={y}")
    case _:
        raise ValueError("Nie punkt")
  File "/opt/hostedtoolcache/Python/3.13.3/x64/bin/pospell", line 8, in <module>
    sys.exit(main())
             ~~~~^^
  File "/opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/pospell.py", line 480, in main
    errors = spell_check(
        args.po_file,
    ...<4 lines>...
        args.jobs,
    )
  File "/opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/pospell.py", line 384, in spell_check
    errors = flatten(
        pool.map(
    ...<2 lines>...
        )
    )
  File "/opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/pospell.py", line 342, in flatten
    return [element for a_list in list_of_lists for element in a_list]
                                                               ^^^^^^
TypeError: 'int' object is not iterable
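
For what it's worth, the final TypeError can be reproduced in isolation. The sketch below reuses the comprehension quoted in the traceback and assumes that one of the per-file results was an int rather than a list of errors. The log does not show what produced that value, so the input here is made up:

```python
def flatten(list_of_lists):
    # Same comprehension as the line quoted in the traceback.
    return [element for a_list in list_of_lists for element in a_list]


# Works when every per-file result is a list.
ok = flatten([["heappop"], ["bash", "modulo"]])

# Fails when one result is an int (hypothetical input).
try:
    flatten([["heappop"], 1])
except TypeError as exc:
    message = str(exc)
```

So the crash is a secondary failure: the reStructuredText parse errors on the code blocks come first, and the traceback only shows that the result for such a file was not a list.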

@m-aciek
Collaborator Author

m-aciek commented May 29, 2025

I believe Sphinx should add a code-block flag to msgids made from code blocks in the gettext builder. pospell could then let us exclude those msgids from checking.

@rffontenelle

> Bigger issue: pospell crashes on code blocks. How do we exclude them?

In python-docs-pt-br, when I was getting tons of sphinx-lint errors because literal blocks were being extracted, my workaround was the following: 1) run 'make gettext' with literal blocks disabled, to generate POT files without them; 2) run 'sphinx-intl update' to update the PO files from the newly generated POT files; 3) run pospell; 4) discard the changes to the PO files (or simply don't commit them).
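
The four steps above could be scripted roughly as follows. This is only a sketch and not the python-docs-pt-br setup: the paths are hypothetical, and overriding `gettext_additional_targets` with `-D` is an assumption about how literal blocks get disabled in a given Sphinx configuration.

```python
import subprocess


def build_workaround_commands(doc_dir, pot_dir, po_files, language="pl"):
    """Return the four steps as argv lists, without running anything."""
    return [
        # 1) Generate POT files without literal blocks (assumed config override).
        ["sphinx-build", "-b", "gettext",
         "-D", "gettext_additional_targets=index", doc_dir, pot_dir],
        # 2) Update the PO files from the newly generated POT files.
        ["sphinx-intl", "update", "-p", pot_dir, "-l", language],
        # 3) Spell-check the updated PO files.
        ["pospell", "--language", language, *po_files],
        # 4) Discard the changes to the tracked PO files.
        ["git", "checkout", "--", *po_files],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


commands = build_workaround_commands("Doc", "build/gettext", ["tutorial/venv.po"])
```

Note that with `check=True` a failing pospell run stops before step 4, so a real workflow would probably want the cleanup to run regardless.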
