Is your feature request related to a problem?
In some analyses, e.g. DistanceMatrix in diffusionmap, the analysis of a single frame needs to run over the full trajectory, that is, the trajectory as sliced and defined by start, stop, and step in run(). However, in the current parallel analysis implementation, information about the full trajectory is lost (only per-process slices are visible).
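To make the problem concrete, here is a minimal pure-Python sketch that does not use MDAnalysis at all. The 100-frame trajectory, the three workers, and the split() helper are assumptions made up for illustration. It only shows that once the frames selected by start, stop, and step are divided among workers, each worker sees its own chunk length and local indices, not the global ones.

```python
# Hypothetical trajectory of 100 frames, sliced the way run(start, stop, step) would.
n_trajectory_frames = 100
start, stop, step = 10, 90, 5
global_frames = list(range(n_trajectory_frames))[start:stop:step]


def split(frames, n_parts):
    """Split frames into n_parts contiguous chunks of nearly equal size."""
    quotient, remainder = divmod(len(frames), n_parts)
    chunks, begin = [], 0
    for i in range(n_parts):
        end = begin + quotient + (1 if i < remainder else 0)
        chunks.append(frames[begin:end])
        begin = end
    return chunks


chunks = split(global_frames, 3)

for worker, chunk in enumerate(chunks):
    # Inside a worker, only len(chunk) and the local index are known;
    # len(global_frames) and the position in the full slice are not.
    for local_index, frame in enumerate(chunk):
        pass
```

With these numbers the global slice holds 16 frames, while the three workers see 6, 5, and 5 frames respectively, each counting from local index 0.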
Describe the solution you’d like
Firstly, in _setup_frames (which only runs in the main process), store the sliced trajectory information in self._global_slicer. This makes it possible to retrieve global details such as the total number of frames.
Secondly, in self._compute, also store self._global_frame_index so the analysis can track the global frame index for each frame, rather than just per-process slices.
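A rough sketch of how the two pieces could fit together is below. This is a toy class, not MDAnalysis's AnalysisBase: ToyAnalysis, the _global_frames helper attribute, and the offset argument of _compute are hypothetical names chosen for illustration, and the real signatures in the library may differ. Only _global_slicer and _global_frame_index are taken from the proposal above.

```python
class ToyAnalysis:
    """Toy stand-in for an analysis class (assumption, not the MDAnalysis API)."""

    def __init__(self, n_trajectory_frames):
        self.n_trajectory_frames = n_trajectory_frames

    def _setup_frames(self, start=None, stop=None, step=None):
        # Runs once in the main process: remember how the full trajectory is sliced.
        self._global_slicer = slice(start, stop, step)
        # Hypothetical helper: the frame numbers selected by the global slice.
        self._global_frames = range(self.n_trajectory_frames)[self._global_slicer]

    def _compute(self, worker_frames, offset):
        # offset (hypothetical): position of this worker's first frame
        # within the global slice.
        seen = []
        for local_index, frame in enumerate(worker_frames):
            self._frame_index = local_index
            self._global_frame_index = offset + local_index
            seen.append((self._frame_index, self._global_frame_index, frame))
        return seen


analysis = ToyAnalysis(100)
analysis._setup_frames(10, 90, 5)
n_global = len(analysis._global_frames)

# Pretend this is the second of three workers, handling global positions 6..10.
second_worker = analysis._compute(analysis._global_frames[6:11], offset=6)
```

In this sketch the per-process index restarts at 0 for every worker, while the global index keeps counting across workers, which is what an analysis such as a distance matrix would need to place its result in the right row.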
@yuxuanzhuang just wanted to say that I saw the PR and plan to review it tomorrow or on Saturday.
As for the _global part, I can think of two solutions:
Have them implemented as properties with some documentation (instead of plain attributes). As I understand it, the current solution adds attributes and doesn't really allow reading the documentation on them directly, e.g. with Jupyter's run.attribute? syntax; instead you have to go read the actual docs, which is one step further away.
Add an attribute, called constants or something similar, that keeps track of all read-only properties that the reader assigns during frame reading and lists all of them there.
The second one is a bigger change, but it also seems that there are already a lot of different properties that might be worth unifying under a common class.
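For the first option, something along these lines might work. It is only a sketch: DocumentedAnalysis and the public name global_frame_index are hypothetical, and the docstring wording is made up. The point is that a property carries its own docstring and is read-only unless a setter is defined.

```python
class DocumentedAnalysis:
    """Toy class showing a documented, read-only property (hypothetical names)."""

    def __init__(self):
        # Backing attribute; in the proposal this would be set in _compute.
        self._global_frame_index = None

    @property
    def global_frame_index(self):
        """Index of the current frame within the full sliced trajectory.

        Unlike the per-process frame index, this value does not restart
        at zero in every worker.
        """
        return self._global_frame_index


# The documentation lives on the property object itself.
doc = DocumentedAnalysis.global_frame_index.__doc__

analysis = DocumentedAnalysis()
try:
    analysis.global_frame_index = 3  # no setter defined, so this fails
    assignment_failed = False
except AttributeError:
    assignment_failed = True
```

The docstring is then reachable through help(DocumentedAnalysis.global_frame_index) without opening the separate documentation pages.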
n_frames and frame_index are not the same for serial and parallel analysis.
Related PR
#4745