Time consuming process #7

Open
karimi81 opened this issue Sep 29, 2021 · 4 comments

Comments

@karimi81

Hi there,
I am trying to use TRF to detect tandem repeats in a large genome assembly (2.68 Gb). The program ran for about 12 days on a node with 32 CPUs and 125 GB of memory but did not finish. This is the command I used:
trf new_id.fasta 2 5 7 80 10 50 2000
Is there any way to make the computation more efficient, e.g. by parallel processing or otherwise reducing the run time? I would appreciate any help with this.
Thank you

@Aannaw

Aannaw commented Oct 5, 2021

Hi
Have you solved this problem? I split my 3.1 Gb genome and then ran TRF, but it still has not finished after about 14 days of running with 1 CPU and 3 GB of memory. Does the program have a threading option to speed it up?
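For readers wondering what "split and run in parallel" could look like in practice, here is a minimal Python sketch. It is not an official TRF feature: it writes each FASTA record to its own file and starts one independent `trf` process per file. The function names, the `seq_00000.fa` file naming, and the worker count are hypothetical choices for illustration. The numeric parameters are the ones from the command in the original post. It assumes `trf` is on `PATH` and writes its output into the directory it is started from.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Parameters copied from the command in the original post.
TRF_PARAMS = ["2", "5", "7", "80", "10", "50", "2000"]


def split_fasta(fasta_path, out_dir):
    """Write each FASTA record to its own file and return the file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    handle = None
    with open(fasta_path) as src:
        for line in src:
            if line.startswith(">"):
                if handle is not None:
                    handle.close()
                # Number the files so unusual sequence names cannot break paths.
                path = out_dir / f"seq_{len(paths):05d}.fa"
                paths.append(path)
                handle = open(path, "w")
            if handle is not None:
                handle.write(line)
    if handle is not None:
        handle.close()
    return paths


def trf_command(fasta_file, trf_bin="trf"):
    """Build the TRF argument list for one file (nothing is executed here)."""
    return [trf_bin, Path(fasta_file).name, *TRF_PARAMS]


def run_all(paths, workers=4, trf_bin="trf"):
    """Run one TRF process per file, at most `workers` at a time."""
    def run_one(path):
        # Start TRF inside the file's directory (assumed to be where it
        # writes output). Exit codes are returned as-is, not interpreted.
        result = subprocess.run(trf_command(path, trf_bin), cwd=Path(path).parent)
        return result.returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, paths))
```

Usage would be along the lines of `paths = split_fasta("new_id.fasta", "trf_chunks")` followed by `run_all(paths, workers=32)`. Each process needs its own memory, so the worker count should be chosen with total RAM in mind. Splitting by sequence only spreads the work across records; a single very long sequence is still handled by one process.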

@xiekunwhy

I ran into the same problem. I think I need to give up on TRF and try other tools such as Look4TRs and Dot2dot.

@Wenfei-Xian

Hi all, TRF can get stuck in long centromeric regions. If you want to identify tandem repeats, especially in a T2T assembly, please set a higher value for -l :)

@hdashnow

Increasing the value of -l to >100 for chm13-T2T helped in my case. I tested a few different values to check their memory usage, since it can get pretty high.
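To make the `-l` suggestion concrete, here is a small Python sketch that builds the command from the original post with `-l` appended, for a few candidate values. The helper name and the specific values (100, 150, 200) are hypothetical; the thread only says "higher" and ">100". As far as I know, TRF's usage text describes `-l` as the maximum expected tandem repeat length in millions of bases, but please confirm that against the usage output of your own TRF version. The sketch only prints the commands and does not run TRF or measure memory.

```python
def trf_command_with_l(fasta, max_tr_len=None, trf_bin="trf"):
    """Return the TRF argument list from the original post, optionally with -l."""
    cmd = [trf_bin, fasta, "2", "5", "7", "80", "10", "50", "2000"]
    if max_tr_len is not None:
        # -l is the option named in the comments above; the values tried
        # below are illustrative guesses, not recommendations.
        cmd += ["-l", str(max_tr_len)]
    return cmd


# Print one candidate command per -l value so they can be compared by hand.
for value in (100, 150, 200):
    print(" ".join(trf_command_with_l("new_id.fasta", value)))
```

Running each printed command on a small test sequence first, while watching memory with your usual monitoring tool, is one way to pick a value before committing to a full-genome run.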

5 participants