Hello, I am attempting to run the DeePMD-Kit Quick Start Tutorial on an HPC. My HPC uses SLURM, so I need to create a SLURM script. I cannot just run "dp" in the command line as shown in the tutorial. I am seeking help duplicating the results of the tutorial to ensure I have the proper setup on my HPC. I am currently running into issues where the tutorial case runs, but significantly slower than expected.

In particular, I am seeking clarification on the following aspects of my SLURM script so I can properly replicate the tutorial case (I cannot find this information in the documentation or on the discussion board):
- How many processors should I use?
- How many processors per CPU should I use?
- Do I need to request memory? If so, how much memory should I request?
- Is there anything else I need to know to properly format my SLURM script?

Here is what I have done so far:
I am currently using 32 processors on a single node. When I submit my job, it runs, but it takes significantly longer than the example case. As you can see, this is significantly slower than the approximately 20-second wall time shown in the tutorial for the same part.
Here is the beginning of the output of my case for reference:
Here is my SLURM script for reference:
Thank you in advance for any help that can be provided, it is much appreciated :)
You shouldn't use MPI when running `dp train` on the CPUs of a single machine.
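To make that concrete, here is a rough sketch of a single-node SLURM script that launches the training directly, with no `mpirun` or multi-task `srun` wrapper. Several things in it are assumptions rather than anything confirmed in this thread: the job name, time limit, and `input.json` file name are placeholders, and the `DP_INTRA_OP_PARALLELISM_THREADS` / `DP_INTER_OP_PARALLELISM_THREADS` variable names assume a recent DeePMD-kit release (older releases may use a `TF_` prefix instead, so check the documentation for your version). The thread counts are starting points to tune, not recommended values.

```shell
#!/bin/bash
#SBATCH --job-name=dp-train        # placeholder name (assumption)
#SBATCH --nodes=1                  # single machine
#SBATCH --ntasks=1                 # one process: no MPI ranks
#SBATCH --cpus-per-task=32         # cores given to that one process
#SBATCH --time=01:00:00            # placeholder time limit (assumption)

# Use the cores SLURM granted; fall back to 1 when run outside SLURM.
NTHREADS="${SLURM_CPUS_PER_TASK:-1}"

# Thread settings (assumed variable names, see the note above).
export OMP_NUM_THREADS="$NTHREADS"
export DP_INTRA_OP_PARALLELISM_THREADS="$NTHREADS"
export DP_INTER_OP_PARALLELISM_THREADS=1

# Launch dp directly instead of through mpirun.
# input.json is assumed to be the tutorial's training input file.
if command -v dp >/dev/null 2>&1; then
    dp train input.json
else
    echo "dp not found on PATH; load your DeePMD-kit environment first" >&2
fi
```

The key point is `--ntasks=1` together with `--cpus-per-task`: the job is one process that uses threads, rather than 32 MPI ranks each running its own copy of the training.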