Intermittent CI failures in test_local.py on PyPy #379
Comments
Hopefully this will help us diagnose weird freezes like #379.
Here's another one, on PyPy 5.8 in I'll paste the key bits below in case something happens to the Travis log:
So we see two threads both blocked in The weird thing though is that there are only two threads there: the main thread and one other. We ought to have 3 threads: the main thread and two others. We can't tell which of the two workers is missing, but if it's the one the test calls One possibility is that
Here's another fun example – pypy nightly this time, and faulthandler segfaults instead of printing a backtrace: https://travis-ci.org/python-trio/trio/jobs/341720284 I managed to reproduce it once locally, by running
I wasn't able to get much debugging info out, but when I hit control-C, it printed a backtrace claiming that it was actually stuck at:
which I guess doesn't say much useful – that's what would happen e.g. if
We used to use a single queue to send messages to and from the threads, which of course is unreliable because the main thread could end up reading back its own message. In particular, on PyPy this happened regularly, and occasionally it meant that the test deadlocked. So this fixes #379. This patch also updates the test harness to actually pull errors back from the child threads, so that if the test does fail then we can detect it.
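To illustrate the failure mode described above, here is a minimal sketch. It is not the actual test code from Trio; the function and queue names (`single_queue_hazard`, `two_queue_handshake`, `to_worker`, `from_worker`) are hypothetical and only the standard library is used. The first function shows how a sender can dequeue its own message from a shared queue, and the second shows one queue per direction, with worker errors collected so the main thread can re-raise them.

```python
import queue
import threading


def single_queue_hazard():
    # One shared queue carries traffic in both directions, so whoever
    # calls get() first receives the next item, regardless of sender.
    shared = queue.Queue()
    shared.put("main -> worker: go")
    # The main thread wants a reply here, but nothing stops it from
    # dequeuing the message it just sent.
    return shared.get()


def two_queue_handshake():
    to_worker = queue.Queue()    # main thread -> worker only
    from_worker = queue.Queue()  # worker -> main thread only
    errors = []                  # exceptions raised inside the worker

    def worker():
        try:
            msg = to_worker.get(timeout=5)
            from_worker.put("worker got: " + msg)
        except BaseException as exc:
            # Hand the failure back instead of losing it in the thread.
            errors.append(exc)

    thread = threading.Thread(target=worker)
    thread.start()
    to_worker.put("go")
    # Timeouts turn a would-be hang into a queue.Empty exception.
    reply = from_worker.get(timeout=5)
    thread.join(timeout=5)
    if errors:
        raise errors[0]
    return reply
```

With a single queue, whether the hazard bites depends on thread scheduling, which would be consistent with the freeze being intermittent and more frequent on one interpreter than another. With one queue per direction, a thread can never receive its own message.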
As usual, I feel very silly after having found it. Trick to tracking it down: for whatever reason, this time I was able to reproduce it by running In the process I noticed that the test was written in a very fragile way, because we don't have a nursery abstraction for threads. Maybe it would be worth rewriting that test using a parent trio + two
I can't comment on the trio/run_sync_in_worker_thread rewrite, but I wanted to know: are we going to keep this test after #420 is done?
Yeah. The details will change because we'll probably replace
As originally tracked in #200, we've been seeing an intermittent freeze in test_local.py on PyPy:

- 2017-06-16: happened on PyPy nightly but didn't save logs; failed to reproduce locally
- 2017-09-07 on PyPy 5.8 in test_run_local_simultaneous_runs: https://travis-ci.org/python-trio/trio/jobs/272763511
- 2017-12-20 on PyPy nightly in test_run_local_simultaneous_runs: https://travis-ci.org/python-trio/trio/jobs/319497598

The test uses threads + blocking queue operations to control sequencing, so it's the sort of thing where you're not shocked to see some weird little freeze, but after staring at it for a while I can't see any bug in the test itself.