perf: run map_variations in reconsensus concurrently #107

ivan-aksamentov · 2025-01-15T05:16:55Z

This uses rayon's parallel iterator to run `map_variations()` (basically Nextclade) for individual sequences concurrently in the `reconsensus()` step.

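We already run it concurrently in the `solve_promise()` step, [for individual blocks](https://github.com/neherlab/pangraph/blob/2d2e7b046cbbbc24d1c364f5a0afe2176b662a99/packages/pangraph/src/pangraph/graph_merging.rs#L145-L150) and then concurrently [for alignment within each block](https://github.com/neherlab/pangraph/blob/2d2e7b046cbbbc24d1c364f5a0afe2176b662a99/packages/pangraph/src/pangraph/reweave.rs#L38-L49), so I thought we could repeat this success in the `reconsensus()` step as well.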

This change results in a 64.4% speedup in my measurements on the ecoli dataset.

Command:

/usr/bin/time -qf 'Cmd : %C\nTime: %E\nMem : %M KB' pangraph build -b 20 -l 500 --circular data/ecoli.fa.gz -o tmp/ecoli.fa.gz.json -v

| Branch                                | Time   |
|---------------------------------------|--------|
| rust (commit `2d2e7b0`)               | 5m 51s |
| perf/parallel-nextclade-reconsensus   | 3m 33s |
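
For reference, the relative change can be recomputed from the two rounded wall-clock times in the table. The sketch below is only a sanity check; the helper names (`parse_duration_secs`, `speedup_percent`, `time_saved_percent`) are made up for illustration and are not part of pangraph. From the rounded values the time ratio comes out at roughly 1.65x, so a small difference from the quoted figure is presumably due to rounding of the timings.

```rust
/// Parse a duration written like "5m 51s" into whole seconds.
/// Hypothetical helper, only for this check.
fn parse_duration_secs(text: &str) -> Option<u64> {
    let mut total = 0u64;
    for part in text.split_whitespace() {
        if let Some(mins) = part.strip_suffix('m') {
            total += mins.parse::<u64>().ok()? * 60;
        } else if let Some(secs) = part.strip_suffix('s') {
            total += secs.parse::<u64>().ok()?;
        } else {
            return None;
        }
    }
    Some(total)
}

/// Speedup as a percentage: (before / after - 1) * 100.
fn speedup_percent(before_secs: u64, after_secs: u64) -> f64 {
    (before_secs as f64 / after_secs as f64 - 1.0) * 100.0
}

/// Share of wall-clock time saved: (1 - after / before) * 100.
fn time_saved_percent(before_secs: u64, after_secs: u64) -> f64 {
    (1.0 - after_secs as f64 / before_secs as f64) * 100.0
}

fn main() {
    let before = parse_duration_secs("5m 51s").expect("valid duration");
    let after = parse_duration_secs("3m 33s").expect("valid duration");
    println!("before = {before}s, after = {after}s");
    println!("speedup    = {:.1}%", speedup_percent(before, after));
    println!("time saved = {:.1}%", time_saved_percent(before, after));
}
```

Note that "speedup" (time ratio) and "time saved" (fraction of wall time removed) are different numbers for the same pair of measurements, so it helps to say which one is meant.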

Things to watch out for:

* increased memory usage due to increased concurrency
* potential reordering of the results (see the "Ordering question" issue, rayon-rs/rayon#551)

Both are also true for all our existing parallel loops.
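
To make the ordering concern concrete, here is a minimal sketch of a per-sequence parallel map that still returns results in input order. It is not the code from this PR: `Seq`, the toy `map_variations()` body, and `map_all()` are hypothetical stand-ins, and it uses `std::thread::scope` rather than rayon so that it compiles with the standard library alone. With rayon, the corresponding pattern would be `seqs.par_iter().map(...).collect::<Vec<_>>()`, which, as far as I understand, also keeps the collected `Vec` in input order, whereas side effects inside the closure can run in any order.

```rust
use std::thread;

/// Hypothetical stand-in for one input sequence.
struct Seq {
    name: String,
    bases: String,
}

/// Hypothetical stand-in for `map_variations()`: counts mismatches
/// against the consensus. The real function performs an alignment.
fn map_variations(consensus: &str, seq: &Seq) -> usize {
    consensus
        .chars()
        .zip(seq.bases.chars())
        .filter(|(a, b)| a != b)
        .count()
}

/// Run `map_variations` for every sequence on its own scoped thread.
/// Threads may finish in any order, but handles are joined in input
/// order, so the returned Vec matches the order of `seqs`.
/// One thread per item is for illustration only; a thread pool (as in
/// rayon) bounds the concurrency and hence the extra memory.
fn map_all(consensus: &str, seqs: &[Seq]) -> Vec<(String, usize)> {
    thread::scope(|scope| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = seqs
            .iter()
            .map(|seq| scope.spawn(move || (seq.name.clone(), map_variations(consensus, seq))))
            .collect();
        // Join in input order to keep results deterministic.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let consensus = "ACGTACGT";
    let seqs = vec![
        Seq { name: "s1".into(), bases: "ACGTACGT".into() },
        Seq { name: "s2".into(), bases: "ACGAACGA".into() },
        Seq { name: "s3".into(), bases: "TCGTACGT".into() },
    ];
    for (name, n) in map_all(consensus, &seqs) {
        println!("{name}: {n} mismatches");
    }
}
```

If downstream code relies on order, collecting into an ordered container like this (rather than pushing into a shared structure from inside the parallel closure) is the safer choice.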

Related: #108


Similar to #107 but less successful. Here I tried to introduce more parallelism in random places where I found `.iter()` or a plain loop and where rayon's `.par_iter()` could be used (technically; scientific correctness is yet to be verified). In my measurements this brings no speedup at all compared to the base branch. I'll just leave this as an idea for future optimization. Perhaps there's a smarter way to find places where we can squeeze out some more parallelism, or we could restructure the algorithm such that new parallelization opportunities appear. For now, this is low priority I think, so let's focus on other things.

ivan-aksamentov requested a review from mmolari January 15, 2025 05:17

ivan-aksamentov mentioned this pull request Jan 15, 2025

perf: more par_iter #108

Draft
