Avoid (some) incref/decref immortality checks on 3.12+ #1044
I think decref adds up to about 20% of mypy runtime on Python 3.11 when doing [...]

Do you know off the top of your head if there's other low-hanging fruit in mypyc regarding refcounting? Skimming the C code, I think I do see some redundant incref+decref pairs...
I think much of the decref cost is from freeing objects. Having per-type freelists should help with this (e.g. #1018, but many classes could benefit). Mypy allocates a lot of temporary objects, and if we had small per-type freelists for selected types, we could avoid a lot of expensive allocation/free operations.

Another idea would be to identify the top N most expensive compiled functions in a CPU profile, and manually look for redundant incref/decref operations by inspecting the generated source or IR. Any redundant operations you find are likely to be worth fixing, or at least worth investigating.

We could also avoid some incref/decref pairs if we supported borrowing of final attributes. Example where this would help:

```python
from typing import Final

class C:
    def __init__(self, s: str) -> None:
        self.s: Final = s

def foo(s: str) -> None: ...

def bar(c: C) -> None:
    foo(c.s)  # Redundant incref/decref, since c.s can't be freed during the call
```

We'd also need to make various attributes (e.g. in [...]
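As a rough illustration of the borrowing idea above, here is a hedged sketch of what the redundant pair around `foo(c.s)` could look like in simplified, hand-written C. This is not actual mypyc output; the `CPyObject_C` struct, field names, and `foo` signature are invented for the example:

```c
/* Simplified, hypothetical C for bar(c) above; not actual mypyc-generated code. */
typedef struct {
    PyObject_HEAD
    PyObject *s;   /* the Final attribute C.s */
} CPyObject_C;

void foo(PyObject *s);  /* assumed native entry point for foo() */

/* Today: the attribute read takes a new reference for the temporary. */
void bar_today(CPyObject_C *c) {
    PyObject *tmp = c->s;
    Py_INCREF(tmp);        /* incref just to keep tmp alive across the call */
    foo(tmp);
    Py_DECREF(tmp);        /* dropped again immediately after the call */
}

/* With borrowing of final attributes: c.s can't be rebound or freed while
 * c is alive during the call, so the incref/decref pair can be omitted. */
void bar_borrowed(CPyObject_C *c) {
    foo(c->s);             /* borrowed reference, no refcount traffic */
}
```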
I've been working on upgrading the mypyc benchmarks runner to use Python 3.13 (previously it was using 3.8), and I noticed that some benchmarks are clearly slower on 3.13, and much of the impact seems to be from immortality checks. For example, the richards benchmark went from 0.0022s on 3.11 to 0.0038s on 3.12 (around a 70% increase in execution time). This now seems high priority, even if the impact on self check is not high. Also, we should avoid the overhead by default, rather than putting the optimization behind a compiler flag.
I have a basic draft implementation that skips immortality checks for native classes and some mutable built-in types that can't be safely shared between subinterpreters. This speeds up richards by about 30%, and self check by about [...]
Python 3.12 added support for immortal objects (python/cpython#19474). This adds immortality checks to incref/decref operations, which introduce some overhead. Some programs don't get any benefit from immortal objects, and these could gain from skipping the checks (enabled via a mypyc optimization flag). Skipping the checks seems safe, based on a comment in `Include/object.h` in Python 3.12.0.

Immortality checks in the Python runtime, stdlib and C extensions will still be present, but if most time is spent in code compiled with mypyc, skipping the checks could improve performance somewhat. In mypy self check I saw a 1.9% performance improvement from skipping them. I wouldn't be surprised if some use cases saw a 5-10% improvement. The self check improvement is small enough that it doesn't seem essential to use this in mypy wheels.
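For context, here is a rough sketch of the difference being discussed. This is simplified, illustrative C, not the exact CPython 3.12 `Py_INCREF`/`Py_DECREF` implementation (which uses a branchless saturated add on 64-bit builds) and not the code mypyc actually emits:

```c
/* Illustrative only: 3.12+ refcounting checks for immortality on every
 * incref/decref, which costs an extra load and branch per operation. */
static inline void incref_checked(PyObject *op) {
    if (_Py_IsImmortal(op)) {
        return;                      /* immortal objects are never counted */
    }
    op->ob_refcnt++;
}

static inline void decref_checked(PyObject *op) {
    if (_Py_IsImmortal(op)) {
        return;
    }
    if (--op->ob_refcnt == 0) {
        _Py_Dealloc(op);
    }
}

/* What mypyc could emit when an object is statically known to never be
 * immortal (e.g. an instance of a native class): plain 3.11-style refcounting. */
static inline void incref_unchecked(PyObject *op) {
    op->ob_refcnt++;
}

static inline void decref_unchecked(PyObject *op) {
    if (--op->ob_refcnt == 0) {
        _Py_Dealloc(op);
    }
}
```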