
gh-128515: Add BOLT build to CI #128845

Open · wants to merge 6 commits into main from zb/bolt
Conversation

zanieb
Contributor

@zanieb zanieb commented Jan 14, 2025

Adds BOLT test coverage to CI, which will allow us to prevent regressions and move towards stabilization of this feature.

Of note:


@zanieb zanieb force-pushed the zb/bolt branch 5 times, most recently from 29351fc to 1d7ab1e Compare January 14, 2025 20:35
Copied from the JIT workflow
@zanieb
Contributor Author

zanieb commented Jan 14, 2025

Interesting: test_pickle is failing on the instrumented binaries. I'll need to investigate that, as I haven't seen it before.

edit: This occurs because test_unpickle_module_race fails on a read-only file system. See c3a3800
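
For illustration, a test can probe filesystem writability before relying on it. This is a minimal sketch with a hypothetical helper name, not the actual fix in c3a3800:

```python
import tempfile
import unittest

def filesystem_is_writable(path="."):
    """Probe whether `path` is writable by actually attempting a write.

    os.access() can be misleading on some filesystems, so create and
    discard a real temporary file instead.
    """
    try:
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        return False

class UnpickleModuleRaceTest(unittest.TestCase):
    # Hypothetical guard: skip when the source tree is mounted read-only,
    # as happens in the instrumented CI build described above.
    @unittest.skipUnless(filesystem_is_writable(), "requires a writable filesystem")
    def test_unpickle_module_race(self):
        ...  # the real test writes temporary module files while unpickling
```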

@zanieb
Contributor Author

zanieb commented Jan 14, 2025

I encountered a couple of blockers for aarch64: a failed assertion in the instrumented binary

./python -m test --pgo --rerun --verbose3 --timeout=
python: ../cpython-ro-srcdir/Python/generated_cases.c.h:1074: _PyEval_EvalFrameDefault: Assertion `tp->tp_alloc == PyType_GenericAlloc' failed.
Aborted (core dumped)

and (after hacking past that) a segfault in BOLT

# Run bolt against the merged data to produce an optimized binary.
for bin in python; do \
  /usr/lib/llvm-19/bin/llvm-bolt "${bin}.prebolt" -o "${bin}.bolt" -data="${bin}.fdata" -update-debug-sections -skip-funcs=_PyEval_EvalFrameDefault,sre_ucs1_match/1,sre_ucs2_match/1,sre_ucs4_match/1  -reorder-blocks=ext-tsp -reorder-functions=cdsort -split-functions -icf=1 -inline-all -split-eh -reorder-functions-use-hot-size -peepholes=none -jump-tables=aggressive -inline-ap -indirect-call-promotion=all -dyno-stats -use-gnu-stack -frame-opt=hot ; \
  mv "${bin}.bolt" "${bin}"; \
done
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: <unknown>
BOLT-INFO: first alloc address is 0x400000
BOLT-INFO: enabling relocation mode
BOLT-INFO: pre-processing profile using branch profile reader
BOLT-INFO: number of removed linker-inserted veneers: 0
BOLT-INFO: 8500 out of 12058 functions in the binary (70.5%) have non-empty execution profile
BOLT-INFO: 41 functions with profile could not be optimized
BOLT-INFO: profile for 1 objects was ignored
BOLT-INFO: removed 1 empty block
BOLT-INFO: ICF folded 678 out of 12439 functions in 5 passes. 0 functions had jump tables.
BOLT-INFO: Removing all identical functions will save 46.23 KB of code space. Folded functions were called 3909549484 times based on profile.
BOLT-INFO: ICP Total indirect calls = 1808544446, 153 callsites cover 99% of all indirect calls
 #0 0x0000aacc1be768cc (/usr/lib/llvm-19/bin/llvm-bolt+0x1ae68cc)
 #1 0x0000aacc1be74b80 (/usr/lib/llvm-19/bin/llvm-bolt+0x1ae4b80)
 #2 0x0000aacc1be77174 (/usr/lib/llvm-19/bin/llvm-bolt+0x1ae7174)
 #3 0x0000ff03feee37e0 (linux-vdso.so.1+0x7e0)
 #4 0x0000aacc1c397200 (/usr/lib/llvm-19/bin/llvm-bolt+0x2007200)
 #5 0x0000aacc1c39aa1c (/usr/lib/llvm-19/bin/llvm-bolt+0x200aa1c)
 #6 0x0000aacc1c39a9e4 (/usr/lib/llvm-19/bin/llvm-bolt+0x200a9e4)
 #7 0x0000aacc1c39a9e4 (/usr/lib/llvm-19/bin/llvm-bolt+0x200a9e4)
 #8 0x0000aacc1bf1ebc4 (/usr/lib/llvm-19/bin/llvm-bolt+0x1b8ebc4)
 #9 0x0000aacc1bf21328 (/usr/lib/llvm-19/bin/llvm-bolt+0x1b91328)
#10 0x0000aacc1becfe3c (/usr/lib/llvm-19/bin/llvm-bolt+0x1b3fe3c)
#11 0x0000aacc1aadf2f0 (/usr/lib/llvm-19/bin/llvm-bolt+0x74f2f0)
#12 0x0000ff03fe8684c4 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
#13 0x0000ff03fe868598 call_init ./csu/../csu/libc-start.c:128:20
#14 0x0000ff03fe868598 __libc_start_main ./csu/../csu/libc-start.c:347:5
#15 0x0000aacc1aadd4f0 (/usr/lib/llvm-19/bin/llvm-bolt+0x74d4f0)
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /usr/lib/llvm-19/bin/llvm-bolt python.prebolt -o python.bolt -data=python.fdata -update-debug-sections -skip-funcs=_PyEval_EvalFrameDefault,sre_ucs1_match/1,sre_ucs2_match/1,sre_ucs4_match/1 -reorder-blocks=ext-tsp -reorder-functions=cdsort -split-functions -icf=1 -inline-all -split-eh -reorder-functions-use-hot-size -peepholes=none -jump-tables=aggressive -inline-ap -indirect-call-promotion=all -dyno-stats -use-gnu-stack -frame-opt=hot
Segmentation fault (core dumped)

I dropped aarch64 in 684ece4 — we can add it later.

@corona10 corona10 self-assigned this Jan 14, 2025
@zanieb
Contributor Author

zanieb commented Jan 14, 2025

A few tests are failing after BOLT optimization. I'd appreciate some guidance on that.

test_sys_api (test.test_perf_profiler.TestPerfTrampoline.test_sys_api) ... FAIL
test_trampoline_works (test.test_perf_profiler.TestPerfTrampoline.test_trampoline_works) ... FAIL
test_trampoline_works_with_forks (test.test_perf_profiler.TestPerfTrampoline.test_trampoline_works_with_forks) ... FAIL

======================================================================
FAIL: test_sys_api (test.test_perf_profiler.TestPerfTrampoline.test_sys_api)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/cpython/cpython-ro-srcdir/Lib/test/test_perf_profiler.py", line 203, in test_sys_api
    self.assertIn(f"py::spam:{script}", perf_file_contents)
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'py::spam:/tmp/test_python_qxe1_ajb/tmpqablk9qp/perftest.py' not found in '7f2d97946000 80600b py::baz:/tmp/test_python_qxe1_ajb/tmpqablk9qp/perftest.py\n'

======================================================================
FAIL: test_trampoline_works (test.test_perf_profiler.TestPerfTrampoline.test_trampoline_works)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/cpython/cpython-ro-srcdir/Lib/test/test_perf_profiler.py", line 91, in test_trampoline_works
    self.assertIsNotNone(
    ~~~~~~~~~~~~~~~~~~~~^
        perf_line, f"Could not find {expected_symbol} in perf file"
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
AssertionError: unexpectedly None : Could not find py::foo:/tmp/test_python_qxe1_ajb/tmpdd3d4w9f/perftest.py in perf file

======================================================================
FAIL: test_trampoline_works_with_forks (test.test_perf_profiler.TestPerfTrampoline.test_trampoline_works_with_forks)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/cpython/cpython-ro-srcdir/Lib/test/test_perf_profiler.py", line 145, in test_trampoline_works_with_forks
    self.assertEqual(process.returncode, 0)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: -11 != 0

----------------------------------------------------------------------
Ran 3 tests in 0.463s

FAILED (failures=3)
test test_perf_profiler failed
1 test failed again:
    test_perf_profiler

@zanieb
Contributor Author

zanieb commented Jan 14, 2025

The timing on this actually seems pretty reasonable at 13 minutes.

We could expand this to perform other build optimizations, e.g., PGO, to verify they're working as intended? Right now it's just BOLT though.

@corona10
Member

Two things:

  • We should use this action for BOLT only, since its test coverage differs from the PGO + LTO build.
  • Let's skip the 3 failing tests using @unittest.skipIf(support.check_bolt_optimized, ...).
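
The suggested skip might look like the following sketch. How `support.check_bolt_optimized` is actually implemented in CPython's test support is an assumption here (modeled as a flag derived from a configure variable):

```python
import sysconfig
import unittest

# Assumption: a BOLT build can be detected via the BOLT_APPLY_FLAGS
# configure variable; the real support.check_bolt_optimized helper
# may work differently.
check_bolt_optimized = bool(sysconfig.get_config_var("BOLT_APPLY_FLAGS"))

class TestPerfTrampoline(unittest.TestCase):
    @unittest.skipIf(check_bolt_optimized,
                     "perf trampoline tests fail on BOLT-optimized binaries")
    def test_trampoline_works(self):
        ...  # unchanged test body
```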

@zanieb
Contributor Author

zanieb commented Jan 15, 2025

We should use this action for BOLT only since the test coverage is different from PGO+LTO build.

Can you expand on this comment?

Let's skip the 3 failing tests using @unittest.skipIf(support.check_bolt_optimized, ...).

Sounds good to me — should I open an issue to investigate why they fail too? Like is the profiler actually broken?

@corona10
Member

Can you expand on this comment?

Because we skip several tests with the BOLTed binary, a combined job could not check those skipped tests for regressions under the standard PGO + LTO build. PGO + LTO is currently the standard optimization policy of the CPython project, which is why I suggested handling BOLT separately from the PGO + LTO build.

Sounds good to me — should I open an issue to investigate why they fail too? Like is the profiler actually broken?

Yeah, we should; maybe @pablogsal is interested in this issue.

@zanieb
Contributor Author

zanieb commented Jan 15, 2025

Created a tracking issue at #128883; skipped the tests in 01cb8d8

@zanieb zanieb marked this pull request as ready for review January 15, 2025 15:05
@zanieb zanieb added the infra CI, GitHub Actions, buildbots, Dependabot, etc. label Jan 15, 2025
Comment on lines +253 to +258
# Do not test BOLT with free-threading, to conserve resources
- bolt: true
free-threading: true
# BOLT currently crashes during instrumentation on aarch64
- os: ubuntu-24.04-aarch64
bolt: true
Contributor Author

I don't have strong feelings about this pattern (using exclude instead of include), but I liked that I could document why we're not running the additional cases.

@@ -246,10 +250,17 @@ jobs:
exclude:
Member

Not a strong opinion, but I would prefer to have just 1, 2, or 3 jobs with BOLT, unless more are absolutely needed.

We can move some very specific builds to buildbots, while maintaining the bare minimum in CI.

Contributor Author

This is just one job with BOLT — I think in the future we'd want a second job for aarch64 once that's unblocked. Are you suggesting I should frame this as an include instead? ref #128845 (comment)

Member

Yes! Sorry for not being clear :)
