mark analysis.rdf.InterRDF and analysis.rdf.InterRDF_s as not parallelizable #4884
base: develop
Conversation
Hello @tanishy7777! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2025-01-13 20:07:19 UTC
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff            @@
##           develop    #4884     +/-   ##
===========================================
- Coverage    93.65%   93.63%   -0.03%
===========================================
  Files          177      189      +12
  Lines        21795    22870    +1075
  Branches      3067     3067
===========================================
+ Hits         20413    21415    +1002
- Misses         931     1004      +73
  Partials       451      451

☔ View full report in Codecov by Sentry.
@tanishy7777 thanks for the prompt PR! I have some comments though:
Can we just sum it up separately though? I mean, make it a part of …
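A minimal sketch of one way to read this suggestion, assuming the standard AnalysisBase attributes (`self.results`, `self._ts`, `self.n_frames`); the method bodies are illustrative, not the actual MDAnalysis source:

```python
# Sketch: keep the running volume inside self.results instead of as a plain
# attribute, so the parallel machinery can aggregate it like any other result.

def _prepare(self):
    self.results.volume_cum = 0.0   # per-worker accumulator

def _single_frame(self):
    # each worker sums the box volume over the frames assigned to it
    self.results.volume_cum += self._ts.volume

def _conclude(self):
    # after the per-worker values are summed together, this is the
    # mean box volume over the whole trajectory
    self.volume = self.results.volume_cum / self.n_frames
```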
Got it! I have made …
When trying to make the aggregation work, I tried to find out why this is happening by looking at the result itself, which is being aggregated along 'count'. I tried finding the dimensions of the arrays manually, because the result wasn't converting to a numpy array: the conversion fails because the dimensions are inconsistent. I am not sure how to resolve this here.
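A tiny standalone reproduction of that failure, with made-up shapes (the real InterRDF_s count arrays have one entry per atom-group pair, each with its own shape):

```python
import numpy as np

# One count array per atom-group pair; the shapes differ between pairs,
# so the list is "ragged" (shapes here are invented for illustration).
counts = [np.zeros((2, 3, 75)), np.zeros((1, 4, 75))]

try:
    np.array(counts)
except ValueError as err:
    # NumPy >= 1.24 raises instead of silently creating an object array:
    # "setting an array element with a sequence. The requested array has
    #  an inhomogeneous shape ..."
    print(err)
```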
Based on the above comment (#4884 (comment)), I think we can mark these classes as non-parallelizable, because it's not possible to convert an array of inhomogeneous dimensions to a numpy array. And since the counts then stay as plain lists, `self.results.count[i] / norm` won't be possible, as division is not supported between a list and a number.
@tanishy7777 the class should be marked as non-parallelizable only if the algorithm to run it is not actually parallelizable, which I'm not yet convinced is the case for all the mentioned classes. But I think you're on the right path here; you just need to implement a custom aggregation function, instead of using the ones already provided. Can you describe what kind of arrays you're trying to aggregate and cannot find an appropriate function for? I didn't quite get it from your screenshots.
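For instance, a custom aggregator could zip the per-worker lists and sum the arrays pairwise, since arrays for the *same* pair do share a shape. This is a hedged sketch assuming the `ResultsGroup` API from `MDAnalysis.analysis.results`; `list_of_arrays_sum` is a hypothetical helper, not an existing MDAnalysis function:

```python
from MDAnalysis.analysis.results import ResultsGroup

def list_of_arrays_sum(list_of_lists):
    """Sum per-pair count arrays elementwise across workers.

    Each worker contributes [arr_pair0, arr_pair1, ...]; arrays for the
    same pair have identical shapes, so zip + sum works even though the
    shapes differ *between* pairs (which is what breaks np.array above).
    """
    return [sum(per_pair) for per_pair in zip(*list_of_lists)]

def _get_aggregator(self):
    return ResultsGroup(lookup={
        "count": list_of_arrays_sum,
        "volume_cum": ResultsGroup.ndarray_sum,  # plain sum across workers
    })
```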
The aggregated result should be the same as if you'd run it without parallelization. And I assume you want to stack/sum/whatever along the dimension that corresponds to the timestep -- you can probably guess which one it is if you run it on some example with a known number of frames. Example trajectories can be found in MDAnalysisTests.
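For example, if each worker returns a per-frame array, the timestep axis is the one whose sizes add up to the total number of frames (shapes below are hypothetical):

```python
import numpy as np

# A 10-frame trajectory split 6/4 between two workers; 75 histogram bins.
part1 = np.zeros((6, 75))
part2 = np.zeros((4, 75))

# Axis 0 is the frame axis here: 6 + 4 == 10 == n_frames,
# so concatenating along it reproduces the serial result.
combined = np.concatenate([part1, part2], axis=0)
assert combined.shape == (10, 75)
```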
Got it. Will work on that! |
Fixes #4675

Changes made in this Pull Request:

- Marked rdf.InterRDF and rdf.InterRDF_s as non-parallelizable.
- In the _single_frame method of analysis.rdf.InterRDF we can see that self.volume_cum is accumulated across the frames, so we can't parallelize simply using the split-apply-combine technique (see the sketch after this list). Similarly, in the _single_frame method of analysis.rdf.InterRDF_s, self.volume_cum is accumulated across the frames.
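For reference, the accumulation pattern described above looks roughly like this (paraphrased, not the verbatim MDAnalysis source):

```python
def _single_frame(self):
    # ... histogram the pair distances for this frame into the counts ...
    # running total of the box volume over every frame this instance sees;
    # under split-apply-combine each worker only sees its own slice of the
    # trajectory, so the partial totals must be summed when combining.
    self.volume_cum += self._ts.volume

def _conclude(self):
    # the normalization needs the total over *all* frames
    self.volume = self.volume_cum / self.n_frames
```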
PR Checklist
Developer Certificate of Origin
📚 Documentation preview 📚: https://mdanalysis--4884.org.readthedocs.build/en/4884/