Replies: 2 comments
-
Yes, Shao-Chun, dp_train only requires TensorFlow's Python interface. Most
likely the TensorFlow Python interface you have installed was built with GPU
support, which is why it tries to load libcuda.so.1. Can you double-check?
If GPU machines are available, you can change the PBS submission script to
request a single GPU and try again. If not, you can install TensorFlow's
Python interface with CPU support only.
Best,
Linfeng
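One way to double-check this on a compute node is sketched below. It is a minimal diagnostic, not part of DeePMD-kit: the helper names `locate_package` and `can_load_library` are made up for illustration. It reports which TensorFlow installation Python would pick up, without importing it (so it does not hit the ImportError), and whether the CUDA driver library is visible on that node.

```python
import ctypes
import importlib.util


def locate_package(name):
    """Return the file a top-level package would be imported from, or None.

    find_spec() only locates a top-level package; it does not execute it,
    so this is safe even when `import tensorflow` itself would fail.
    """
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


def can_load_library(lib_name):
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Which TensorFlow does this Python environment see?
    print("tensorflow found at:", locate_package("tensorflow"))
    # Is the CUDA driver library available on this node?
    print("libcuda.so.1 loadable:", can_load_library("libcuda.so.1"))
```

Running this both on the home node and inside a PBS job should show whether the two environments differ. If TensorFlow is found but libcuda.so.1 cannot be loaded, the installed Python package is probably a GPU build running on a node without the CUDA driver. For TensorFlow 1.8, the pip package `tensorflow` is the CPU-only build and `tensorflow-gpu` is the GPU build.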
On Mon, Oct 14, 2019 at 1:07 PM k50112113 ***@***.***> wrote:
Hi:
I have installed TensorFlow r1.8 and DeePMD r0.11 on a cluster and I am
currently trying to train a model.
It works well when I test on the home node.
But the error below occurs when I submit jobs using PBS scripts:
  File "/usr/local/anaconda/5.2.0/3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/anaconda/5.2.0/3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/anaconda/5.2.0/3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/local/anaconda/5.2.0/3/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/local/anaconda/5.2.0/3/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
But I did not build my TensorFlow with CUDA support, so it seems that dp_train
is not using the TensorFlow I built (I installed TensorFlow's C++ interface).
Does dp_train only work with TensorFlow's Python interface?
Thanks for your time,
Lee, Shao-Chun
Answer selected by njzjz
-
Thanks, I'll reinstall TensorFlow with the Python interface and see if it works.