Pull requests: UKGovernmentBEIS/inspect_evals

Pull requests list

add flaky mark with maximum number of runs
#742 opened Dec 4, 2025 by anthonyduong9
Add remaining PLR linting
#735 opened Dec 3, 2025 by Jay-Bailey
SAD implementation
#726 opened Dec 2, 2025 by HugoSave Draft
2 of 5 tasks
Add use_prebuilt_image option for DS1000
#724 opened Dec 2, 2025 by ItsTania
NoveltyBench benchmark implementation (label: implementation)
#717 opened Dec 1, 2025 by iphan
set Task.version
#714 opened Nov 30, 2025 by anthonyduong9 Draft
Fix/issue 685 abstention types (label: enhancement)
#695 opened Nov 21, 2025 by mjbroerman
Add OS-Harm (label: implementation)
#692 opened Nov 21, 2025 by bndxn Draft
Add emergent misalignment evals (label: implementation)
#682 opened Nov 17, 2025 by dtch1997
14 of 17 tasks
Remove as_posix() calls & add custom POSIX-check pre-commit (label: enhancement)
#666 opened Nov 11, 2025 by AnselmC
GDPval Implementation (label: implementation)
#598 opened Oct 11, 2025 by jeqcho
Swe Lancer implementation (label: implementation)
#352 opened May 26, 2025 by NelsonG-C
2 of 6 tasks