28 commits
d145747
LLVM 20 bump
savannahostrowski Sep 18, 2025
d86b66e
Merge main
savannahostrowski Sep 19, 2025
9324b14
Add flags for testing
savannahostrowski Sep 19, 2025
f3bb6b9
Merge branch 'main' into llvm-20
savannahostrowski Sep 20, 2025
b6e7981
Merge main
savannahostrowski Oct 6, 2025
e867644
Fix windows
savannahostrowski Oct 6, 2025
13e9f5b
Merge main
savannahostrowski Oct 7, 2025
84781b4
Merge branch 'main' into llvm-20
savannahostrowski Oct 8, 2025
0034f14
Download binaries from GitHub releases
emmatyping Aug 6, 2025
cc98d30
Add hash checking
emmatyping Aug 7, 2025
76842eb
Apply Emma's commits for grabbing binaries from release artifacts
savannahostrowski Oct 8, 2025
a732cec
Merge branch 'main' into llvm-20
savannahostrowski Oct 11, 2025
e8395ce
Fix up LLVM via release artifacts
savannahostrowski Oct 11, 2025
d1e4363
Merge branch 'main' into llvm-20
savannahostrowski Oct 12, 2025
01aed67
Remove model flags for x86_64 darwin causing GOT relocation issues
savannahostrowski Oct 12, 2025
94f1a89
Clean up
savannahostrowski Oct 12, 2025
e6450de
Only patch x86_64 GOT relocations when relaxation succeeds
savannahostrowski Oct 13, 2025
1adf827
Revert "Only patch x86_64 GOT relocations when relaxation succeeds"
savannahostrowski Oct 13, 2025
b9bfacf
mcmodel=large
savannahostrowski Oct 14, 2025
57c44ee
Add macro to handle debug
savannahostrowski Oct 16, 2025
081ee86
fno-pic
savannahostrowski Oct 16, 2025
0b773f9
remove fno-pic
savannahostrowski Oct 16, 2025
c78af6f
remove hack
savannahostrowski Oct 18, 2025
d715cf2
Trampoline attempt
savannahostrowski Oct 18, 2025
38e11b9
Touch up and add better comments
savannahostrowski Oct 19, 2025
74eb9a4
Merge main
savannahostrowski Oct 19, 2025
ca95652
More clean up
savannahostrowski Oct 19, 2025
47153a5
📜🤖 Added by blurb_it.
blurb-it[bot] Oct 19, 2025
6 changes: 3 additions & 3 deletions .github/workflows/jit.yml
@@ -68,7 +68,7 @@ jobs:
- true
- false
llvm:
- 19
- 20
Contributor:

Not a major point, but can we put the LLVM version in a variable?

include:
- target: i686-pc-windows-msvc/msvc
architecture: Win32
@@ -138,7 +138,7 @@ jobs:
fail-fast: false
matrix:
llvm:
- 19
- 20
steps:
- uses: actions/checkout@v4
with:
@@ -166,7 +166,7 @@ jobs:
fail-fast: false
matrix:
llvm:
- 19
- 20
steps:
- uses: actions/checkout@v4
with:
@@ -0,0 +1 @@
Update JIT compilation to use LLVM 20 at build time.
132 changes: 112 additions & 20 deletions PCbuild/get_external.py
@@ -1,36 +1,62 @@
#!/usr/bin/env python3

import argparse
import contextlib
import hashlib
import io
import json
import os
import pathlib
import shutil
import sys
import time
import urllib.error
import urllib.request
import zipfile


def retrieve_with_retries(download_location, output_path, reporthook,
max_retries=7):
"""Download a file with exponential backoff retry and save to disk."""
# Mapping of binary dependency tag to GitHub release asset ID
TAG_TO_ASSET_ID = {
'llvm-20.1.8.0': 301710576,
}
Member:

Non-blocking comment, because what's here should work. However, this hard-coded mapping seems like something that we don't want to have to maintain, and should be relatively easy to avoid by naming things consistently. When I created the 'release' for this I just used the original filename, but it's easy enough to rename it to something like llvm-20.1.8.0.tar.xz, or even re-package it as a ZIP. From there urlopen(f'https://github.com/{org}/cpython-bin-deps/releases/download/{external_name}/{external_name}.{file_extension}') ought to work just fine without having to mess around with the GH API at all.

Member:

Yeah, I think I originally set it up this way so that the archives could come directly from upstream distributors without any changes, but renaming is probably easier and better.
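
A minimal sketch of the renaming approach discussed in this thread, assuming the release tag and the asset file share the same base name (for example, tag llvm-20.1.8.0 with asset llvm-20.1.8.0.tar.xz). The helper name `release_asset_url` and its parameters are hypothetical and not part of this PR; only the URL pattern comes from the comment above.

```python
def release_asset_url(org, external_name, file_extension='tar.xz'):
    """Build a direct download URL for a consistently named release asset.

    Hypothetical helper: assumes the release tag and the asset file name
    are both ``external_name`` (plus the extension), as proposed in review.
    """
    return (
        f'https://github.com/{org}/cpython-bin-deps/releases/download/'
        f'{external_name}/{external_name}.{file_extension}'
    )


print(release_asset_url('python', 'llvm-20.1.8.0'))
```

With a URL of this shape, the TAG_TO_ASSET_ID mapping would no longer be needed for locating the asset. Note that the digest check in this PR reads the `digest` field from the GitHub API metadata, so dropping the API call would require another source for the expected hash.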



def request_with_retry(request_func, *args, max_retries=7,
err_msg='Request failed.', **kwargs):
"""Make a request using request_func with exponential backoff"""
for attempt in range(max_retries + 1):
try:
resp = urllib.request.urlretrieve(
download_location,
output_path,
reporthook=reporthook,
)
resp = request_func(*args, **kwargs)
except (urllib.error.URLError, ConnectionError) as ex:
if attempt == max_retries:
msg = f"Download from {download_location} failed."
raise OSError(msg) from ex
raise OSError(err_msg) from ex
time.sleep(2.25**attempt)
else:
return resp


def fetch_zip(commit_hash, zip_dir, *, org='python', binary=False, verbose):
repo = f'cpython-{"bin" if binary else "source"}-deps'
def retrieve_with_retries(download_location, output_path, reporthook):
"""Download a file with retries."""
return request_with_retry(
urllib.request.urlretrieve,
download_location,
output_path,
reporthook,
err_msg=f'Download from {download_location} failed.',
)


def get_with_retries(url, headers):
req = urllib.request.Request(url=url, headers=headers, method='GET')
return request_with_retry(
urllib.request.urlopen,
req,
err_msg=f'Request to {url} failed.'
)
Comment on lines +38 to +55
Contributor:

Why do we have two similar functions? I see that they differ only in the request function they use. Do we need both, or can we reduce this to just one of them?

Member:

This is code I wrote, so I can try and answer this :)

The idea is that we need to do two types of requests:

  1. download a file (retrieve_with_retries)
  2. get some JSON from GitHub's API (get_with_retries)

For 1, urlretrieve makes things a lot simpler. If we just had one generic function here we'd end up re-implementing urlretrieve (to handle things like writing downloaded content to the output file, etc).

For 2, we need to pass Accept headers to a GET request against GitHub's API to get information from the API about a release asset, so urlretrieve isn't quite right, and we need to use a urllib.request.Request to customize headers.
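
The split described above can be illustrated with a self-contained sketch of the shared retry wrapper, exercised with a fake request function instead of a real network call. The `sleep` parameter and the `flaky` function are hypothetical additions for the demo; the PR's version calls `time.sleep` directly.

```python
import time
import urllib.error


def request_with_retry(request_func, *args, max_retries=7,
                       err_msg='Request failed.', sleep=time.sleep, **kwargs):
    """Call request_func, retrying with exponential backoff on network errors.

    ``sleep`` is a hypothetical parameter so the backoff can be stubbed out.
    """
    for attempt in range(max_retries + 1):
        try:
            return request_func(*args, **kwargs)
        except (urllib.error.URLError, ConnectionError) as ex:
            if attempt == max_retries:
                raise OSError(err_msg) from ex
            sleep(2.25 ** attempt)


# Demo: a fake request function that fails twice, then succeeds.
calls = []


def flaky(url):
    calls.append(url)
    if len(calls) < 3:
        raise ConnectionError('simulated network failure')
    return f'response from {url}'


delays = []  # record the requested backoff delays instead of sleeping
result = request_with_retry(flaky, 'https://example.invalid',
                            sleep=delays.append)
print(result, delays)
```

Because the wrapper only takes a callable plus its arguments, both `urllib.request.urlretrieve` (file download) and `urllib.request.urlopen` (API request with custom headers) can reuse it without one re-implementing the other.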



def fetch_zip(commit_hash, zip_dir, *, org='python', binary=False, verbose=False):
repo = 'cpython-bin-deps' if binary else 'cpython-source-deps'
url = f'https://github.com/{org}/{repo}/archive/{commit_hash}.zip'
reporthook = None
if verbose:
@@ -44,6 +70,51 @@ def fetch_zip(commit_hash, zip_dir, *, org='python', binary=False, verbose):
return filename


def fetch_release_asset(asset_id, output_path, org):
"""Download a GitHub release asset.
Release assets need the Accept header set to
application/octet-stream to download the binary, so we can't use
urlretrieve. Code here is based on urlretrieve.
"""
url = f'https://api.github.com/repos/{org}/cpython-bin-deps/releases/assets/{asset_id}'
metadata_resp = get_with_retries(url,
headers={'Accept': 'application/vnd.github+json'})
json_data = json.loads(metadata_resp.read())
hash_info = json_data.get('digest')
if not hash_info:
raise RuntimeError(f'Release asset {asset_id} missing digest field in metadata')
algorithm, hashsum = hash_info.split(':')
if algorithm != 'sha256':
raise RuntimeError(f'Unknown hash algorithm {algorithm} for asset {asset_id}')
with contextlib.closing(
get_with_retries(url, headers={'Accept': 'application/octet-stream'})
) as resp:
hasher = hashlib.sha256()
with open(output_path, 'wb') as fp:
while block := resp.read(io.DEFAULT_BUFFER_SIZE):
hasher.update(block)
fp.write(block)
if hasher.hexdigest() != hashsum:
raise RuntimeError('Downloaded content hash did not match!')
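
The streaming hash check in `fetch_release_asset` can be sketched in isolation, using in-memory streams in place of the HTTP response and the output file. The helper name `copy_and_verify` is hypothetical; the `sha256:<hex>` digest format is assumed from how the PR splits the `digest` field.

```python
import hashlib
import io


def copy_and_verify(src, dst, digest, block_size=io.DEFAULT_BUFFER_SIZE):
    """Copy src to dst block by block and check it against 'sha256:<hex>'.

    Hypothetical helper mirroring the loop in fetch_release_asset.
    """
    algorithm, _, expected = digest.partition(':')
    if algorithm != 'sha256':
        raise RuntimeError(f'Unknown hash algorithm {algorithm}')
    hasher = hashlib.sha256()
    # Hash each block as it is written, so the payload is read only once.
    while block := src.read(block_size):
        hasher.update(block)
        dst.write(block)
    if hasher.hexdigest() != expected:
        raise RuntimeError('Downloaded content hash did not match!')


payload = b'example release asset bytes'
digest = 'sha256:' + hashlib.sha256(payload).hexdigest()
out = io.BytesIO()
copy_and_verify(io.BytesIO(payload), out, digest)
```

As in the PR, the data has already been written to the destination by the time a mismatch is detected, so a caller may want to delete the output file when the check fails.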


def fetch_release(tag, tarball_dir, *, org='python', verbose=False):
tarball_dir.mkdir(parents=True, exist_ok=True)
asset_id = TAG_TO_ASSET_ID.get(tag)
if asset_id is None:
raise ValueError(f'Unknown tag for binary dependencies {tag}')
output_path = tarball_dir / f'{tag}.tar.xz'
fetch_release_asset(asset_id, output_path, org)
return output_path


def extract_tarball(externals_dir, tarball_path, tag):
output_path = externals_dir / tag
shutil.unpack_archive(os.fspath(tarball_path), os.fspath(output_path))
return output_path


def extract_zip(externals_dir, zip_path):
with zipfile.ZipFile(os.fspath(zip_path)) as zf:
zf.extractall(os.fspath(externals_dir))
@@ -67,15 +138,36 @@ def parse_args():

def main():
args = parse_args()
zip_path = fetch_zip(
args.tag,
args.externals_dir / 'zips',
org=args.organization,
binary=args.binary,
verbose=args.verbose,
)
final_name = args.externals_dir / args.tag
extracted = extract_zip(args.externals_dir, zip_path)

# Check if the dependency already exists in externals/ directory
# (either already downloaded/extracted, or checked into the git tree)
if final_name.exists():
if args.verbose:
print(f'{args.tag} already exists at {final_name}, skipping download.')
return

# Determine download method: release artifacts for large deps (like LLVM),
# otherwise zip download from GitHub branches
if args.tag in TAG_TO_ASSET_ID:
tarball_path = fetch_release(
args.tag,
args.externals_dir / 'tarballs',
org=args.organization,
verbose=args.verbose,
)
extracted = extract_tarball(args.externals_dir, tarball_path, args.tag)
else:
# Use zip download from GitHub branches
# (cpython-bin-deps if --binary, cpython-source-deps otherwise)
zip_path = fetch_zip(
args.tag,
args.externals_dir / 'zips',
org=args.organization,
binary=args.binary,
verbose=args.verbose,
)
extracted = extract_zip(args.externals_dir, zip_path)
for wait in [1, 2, 3, 5, 8, 0]:
try:
extracted.replace(final_name)
2 changes: 1 addition & 1 deletion PCbuild/get_externals.bat
@@ -82,7 +82,7 @@ if NOT "%IncludeLibffi%"=="false" set binaries=%binaries% libffi-3.4.4
if NOT "%IncludeSSL%"=="false" set binaries=%binaries% openssl-bin-3.0.18
if NOT "%IncludeTkinter%"=="false" set binaries=%binaries% tcltk-8.6.15.0
if NOT "%IncludeSSLSrc%"=="false" set binaries=%binaries% nasm-2.11.06
if NOT "%IncludeLLVM%"=="false" set binaries=%binaries% llvm-19.1.7.0
if NOT "%IncludeLLVM%"=="false" set binaries=%binaries% llvm-20.1.8.0

for %%b in (%binaries%) do (
if exist "%EXTERNALS_DIR%\%%b" (
48 changes: 48 additions & 0 deletions Python/jit.c
@@ -444,12 +444,18 @@ patch_x86_64_32rx(unsigned char *location, uint64_t value)
}

void patch_aarch64_trampoline(unsigned char *location, int ordinal, jit_state *state);
void patch_x86_64_trampoline(unsigned char *location, int ordinal, jit_state *state);

#include "jit_stencils.h"

#if defined(__aarch64__) || defined(_M_ARM64)
#define TRAMPOLINE_SIZE 16
#define DATA_ALIGN 8
#elif defined(__x86_64__) && defined(__APPLE__)
// LLVM 20 on macOS x86_64 debug builds: GOT entries may exceed ±2GB PC-relative
// range.
#define TRAMPOLINE_SIZE 16 // 14 bytes + 2 bytes padding for alignment
#define DATA_ALIGN 16
Contributor:

Why does the data need to be aligned to 16?

#else
#define TRAMPOLINE_SIZE 0
#define DATA_ALIGN 1
@@ -501,6 +507,48 @@ patch_aarch64_trampoline(unsigned char *location, int ordinal, jit_state *state)
patch_aarch64_26r(location, (uintptr_t)p);
}

// Generate and patch x86_64 trampolines.
void
patch_x86_64_trampoline(unsigned char *location, int ordinal, jit_state *state)
{
uint64_t value = (uintptr_t)symbols_map[ordinal];
int64_t range = (int64_t)value - 4 - (int64_t)location;

// If we are in range of 32 signed bits, we can patch directly
if (range >= -(1LL << 31) && range < (1LL << 31)) {
patch_32r(location, value - 4);
return;
}

// Out of range - need a trampoline
const uint32_t symbol_mask = 1 << (ordinal % 32);
const uint32_t trampoline_mask = state->trampolines.mask[ordinal / 32];
assert(symbol_mask & trampoline_mask);

// Count the number of set bits in the trampoline mask lower than ordinal
int index = _Py_popcount32(trampoline_mask & (symbol_mask - 1));
for (int i = 0; i < ordinal / 32; i++) {
index += _Py_popcount32(state->trampolines.mask[i]);
}

unsigned char *trampoline = state->trampolines.mem + index * TRAMPOLINE_SIZE;
assert((size_t)(index + 1) * TRAMPOLINE_SIZE <= state->trampolines.size);

/* Generate the trampoline (14 bytes, padded to 16):
0: ff 25 00 00 00 00 jmp *(%rip)
6: XX XX XX XX XX XX XX XX (64-bit target address)

Reference: https://wiki.osdev.org/X86-64_Instruction_Encoding#FF (JMP r/m64)
*/
trampoline[0] = 0xFF;
trampoline[1] = 0x25;
*(uint32_t *)(trampoline + 2) = 0;
*(uint64_t *)(trampoline + 6) = value;

// Patch the call site to call the trampoline instead
patch_32r(location, (uintptr_t)trampoline - 4);
}
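
The range check and the trampoline layout in `patch_x86_64_trampoline` can be modeled in a few lines of Python. This is an illustrative sketch, not the JIT's code: the function names are hypothetical, and the two padding bytes are zero-filled here as an assumption (the C code does not write them).

```python
import struct

TRAMPOLINE_SIZE = 16  # 14 bytes of instruction and address, 2 bytes padding


def in_rel32_range(location, value):
    """Return True if value is reachable with a signed 32-bit displacement.

    Mirrors the C check: the displacement is taken relative to the end of
    the 4-byte field at ``location``, hence the "- 4".
    """
    displacement = value - 4 - location
    return -(1 << 31) <= displacement < (1 << 31)


def encode_trampoline(target):
    """Encode `jmp *(%rip)` followed by the 64-bit absolute target."""
    code = b'\xff\x25'                 # FF /4 with RIP-relative ModRM
    code += struct.pack('<I', 0)       # disp32 = 0: address follows directly
    code += struct.pack('<Q', target)  # little-endian 64-bit target address
    return code.ljust(TRAMPOLINE_SIZE, b'\x00')


print(encode_trampoline(0x1122334455667788).hex(' '))
```

A call site only needs a trampoline when `in_rel32_range` is false for the real target; the call is then patched to reach the nearby trampoline, which performs the 64-bit indirect jump.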

static void
combine_symbol_mask(const symbol_mask src, symbol_mask dest)
{
18 changes: 9 additions & 9 deletions Tools/jit/README.md
@@ -9,32 +9,32 @@ Python 3.11 or newer is required to build the JIT.

The JIT compiler does not require end users to install any third-party dependencies, but part of it must be *built* using LLVM[^why-llvm]. You are *not* required to build the rest of CPython using LLVM, or even the same version of LLVM (in fact, this is uncommon).

LLVM version 19 is the officially supported version. You can modify if needed using the `LLVM_VERSION` env var during configure. Both `clang` and `llvm-readobj` need to be installed and discoverable (version suffixes, like `clang-19`, are okay). It's highly recommended that you also have `llvm-objdump` available, since this allows the build script to dump human-readable assembly for the generated code.
LLVM version 20 is the officially supported version. You can modify if needed using the `LLVM_VERSION` env var during configure. Both `clang` and `llvm-readobj` need to be installed and discoverable (version suffixes, like `clang-20`, are okay). It's highly recommended that you also have `llvm-objdump` available, since this allows the build script to dump human-readable assembly for the generated code.

It's easy to install all of the required tools:

### Linux

Install LLVM 19 on Ubuntu/Debian:
Install LLVM 20 on Ubuntu/Debian:

```sh
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 19
sudo ./llvm.sh 20
```

Install LLVM 19 on Fedora Linux 40 or newer:
Install LLVM 20 on Fedora Linux 40 or newer:

```sh
sudo dnf install 'clang(major) = 19' 'llvm(major) = 19'
sudo dnf install 'clang(major) = 20' 'llvm(major) = 20'
```

### macOS

Install LLVM 19 with [Homebrew](https://brew.sh):
Install LLVM 20 with [Homebrew](https://brew.sh):

```sh
brew install llvm@19
brew install llvm@20
```

Homebrew won't add any of the tools to your `$PATH`. That's okay; the build script knows how to find them.
@@ -43,12 +43,12 @@ Homebrew won't add any of the tools to your `$PATH`. That's okay; the build scri

LLVM is downloaded automatically (along with other external binary dependencies) by `PCbuild\build.bat`.

Otherwise, you can install LLVM 19 [by searching for it on LLVM's GitHub releases page](https://github.com/llvm/llvm-project/releases?q=19), clicking on "Assets", downloading the appropriate Windows installer for your platform (likely the file ending with `-win64.exe`), and running it. **When installing, be sure to select the option labeled "Add LLVM to the system PATH".**
Otherwise, you can install LLVM 20 [by searching for it on LLVM's GitHub releases page](https://github.com/llvm/llvm-project/releases?q=20), clicking on "Assets", downloading the appropriate Windows installer for your platform (likely the file ending with `-win64.exe`), and running it. **When installing, be sure to select the option labeled "Add LLVM to the system PATH".**

Alternatively, you can use [chocolatey](https://chocolatey.org):

```sh
choco install llvm --version=19.1.0
choco install llvm --version=20.1.8
```

### Dev Containers
4 changes: 2 additions & 2 deletions Tools/jit/_llvm.py
@@ -11,8 +11,8 @@
import _targets


_LLVM_VERSION = "19"
_EXTERNALS_LLVM_TAG = "llvm-19.1.7.0"
_LLVM_VERSION = "20"
_EXTERNALS_LLVM_TAG = "llvm-20.1.8.0"

_P = typing.ParamSpec("_P")
_R = typing.TypeVar("_R")
17 changes: 17 additions & 0 deletions Tools/jit/_stencils.py
@@ -241,6 +241,23 @@ def process_relocations(self, known_symbols: dict[str, int]) -> None:
self._trampolines.add(ordinal)
hole.addend = ordinal
hole.symbol = None
# x86_64 Darwin trampolines for external symbols
elif (
hole.kind == "X86_64_RELOC_BRANCH"
and hole.value is HoleValue.ZERO
and hole.symbol not in self.symbols
):
hole.func = "patch_x86_64_trampoline"
hole.need_state = True
assert hole.symbol is not None
if hole.symbol in known_symbols:
ordinal = known_symbols[hole.symbol]
else:
ordinal = len(known_symbols)
known_symbols[hole.symbol] = ordinal
self._trampolines.add(ordinal)
hole.addend = ordinal
hole.symbol = None
self.data.pad(8)
for stencil in [self.code, self.data]:
for hole in stencil.holes:
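
The ordinals assigned in this hunk are what `patch_x86_64_trampoline` later converts into a trampoline slot index by counting set bits in the trampoline mask. A rough Python model of both steps follows; the function names are hypothetical, and the mask is represented as a plain list of 32-bit words for illustration.

```python
def assign_ordinal(known_symbols, symbol):
    """Return the symbol's ordinal, allocating the next free one if new."""
    return known_symbols.setdefault(symbol, len(known_symbols))


def trampoline_index(masks, ordinal):
    """Count the trampolines allocated for ordinals lower than `ordinal`.

    Bit (ordinal % 32) of word (ordinal // 32) in `masks` is set when that
    symbol needs a trampoline slot.
    """
    word, bit = divmod(ordinal, 32)
    assert masks[word] & (1 << bit), 'symbol has no trampoline slot'
    # Set bits below `bit` in the same word...
    index = bin(masks[word] & ((1 << bit) - 1)).count('1')
    # ...plus every set bit in the preceding words.
    index += sum(bin(m).count('1') for m in masks[:word])
    return index


known = {}
ordinals = [assign_ordinal(known, name) for name in ('foo', 'bar', 'foo')]

# Ordinals 0, 5 and 33 need trampolines.
masks = [(1 << 0) | (1 << 5), 1 << 1]
slots = [trampoline_index(masks, o) for o in (0, 5, 33)]
print(ordinals, slots)
```

Each slot index is then multiplied by `TRAMPOLINE_SIZE` to find the trampoline's offset in the trampoline region, so symbols that never need a trampoline take up no space there.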