Releases: deepmodeling/deepmd-kit
v2.0.1
New features:
- correct heat flux calculation: interface for deepmd with the centroids atoms, full 3x3 "atomic-virial" (#1093).
- Enable init-frz-model support for the original model (#1102 #1107 )
- support init-frz-model for hybrid descriptor (#1112)
Enhancements:
- use `np.testing.assert_almost_equal` for array comparing (#1059)
- set `allow_growth` in `default_tf_session_config` (#1067)
- Enable parallel training UT in GitHub CI. (#1075)
- create cross-references in docstring (#1083)
- add ABC for descriptors (#1081)
- merge duplicated NeighborStat.get_stat (#1103)
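The move to `np.testing.assert_almost_equal` (#1059) matters because exact `==` comparison of floating-point arrays fails on rounding noise. The sketch below is illustrative only and is not taken from the project's unit tests; the helper `arrays_match`, the sample values, and the `decimal` setting are assumptions.

```python
import numpy as np

def arrays_match(expected, actual, decimal=10):
    """Return True when the two arrays agree to `decimal` decimal places."""
    try:
        np.testing.assert_almost_equal(actual, expected, decimal=decimal)
    except AssertionError:
        return False
    return True

reference = np.array([0.1 + 0.2, 1.0 / 3.0])
computed = np.array([0.3, 0.3333333333333333])

# Exact comparison is brittle: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
exact_equal = bool(np.all(reference == computed))
# Tolerance-based comparison accepts the rounding noise.
close_enough = arrays_match(reference, computed)
```

In a unit test the `try`/`except` wrapper is unnecessary; calling `np.testing.assert_almost_equal` directly lets the test runner report the mismatch.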
Bug fixes:
- fix hybrid descriptor training error (#1052)
- bugs and memory issues in UTs (#1056 #1066 )
- copy `all_virial` for float precision (#1069)
- fix building problem on macos (#1071)
- use @loader_path on macos instead of $ORIGIN (#1078)
- Revert "get library extension suffix from built-in method" #1072
- undo reset lcurve.out during the model compression process (#1080)
- fix typo: `lcueve.out` -> `lcurve.out` (#1077)
- create model compression checkpoint, avoid overwriting original checkpoint (#1076)
- Fix shape mismatch when type_embedding is enabled and type_one_side is disabled (#1074 )
- reduce `rcut` and `sel` in the example of `se_e3` (#1082)
- Fix a potential slice bug in se_t descriptor (#1087)
- make compress work for hybrid descriptor composed of se_e2_a (#1094)
- Fix gradient not averaged when parallel training. (#1104)
- fix bug of single precision model compression (#1110)
- fix bug of single precision transfer (#1111)
- fix LAMMPS_VERSION_NUMBER condition (#1116)
- Fix missing `std::numeric_limits` (#1113)
- fix data_modifier OOM problem when set size is too large (#1117)
- fix bugs of dipole charge modifier: binary str and missing frozen node (#1124 )
- fix "Call to method DeepTensor.init with too many arguments" (#1125)
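The gradient-averaging fix (#1104) concerns data-parallel training: if per-worker gradients are summed instead of averaged, the effective step size grows with the number of workers. The NumPy sketch below illustrates the difference on made-up gradients. It is not the project's code, and `combine_gradients` is a hypothetical helper.

```python
import numpy as np

def combine_gradients(worker_grads, average=True):
    """Combine per-worker gradients by mean (average=True) or by plain sum."""
    total = np.stack(worker_grads).sum(axis=0)
    return total / len(worker_grads) if average else total

# Hypothetical gradients reported by four workers for a two-parameter model.
grads = [
    np.array([1.0, -2.0]),
    np.array([3.0, 0.0]),
    np.array([2.0, 2.0]),
    np.array([2.0, 4.0]),
]

summed = combine_gradients(grads, average=False)    # scales with worker count
averaged = combine_gradients(grads, average=True)   # independent of worker count
```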
v2.0.0
Breaking changes to v1.3
- Training parameters: Several training parameters have been updated. The original training data is now split into training data and validation data. Please read the documentation to apply the changes. Old styles still work but are not recommended.
- Model inference: Old models trained by v1 will not work in v2. Run `dp convert-from` to convert old models to v2.
- Python interface: `deepmd.DeepPot` has been moved to `deepmd.infer.DeepPot`.
- C++ interface: `NNPInter` has been renamed to `deepmd::DeepPot` and `NNPInter.h` has been renamed to `DeepPot.h`. Use `-ldeepmd_cc` to link instead.
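For scripts that must run against both v1 and v2 installations, the Python interface move can be handled by probing both module paths named above. This is a minimal sketch: `find_deeppot_class` is a hypothetical helper, and it returns `None` when deepmd-kit is not installed at all.

```python
import importlib

def find_deeppot_class():
    """Return the DeepPot class from whichever location exists, or None."""
    # Try the v2 location first, then the v1 location.
    for module_name in ("deepmd.infer", "deepmd"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        cls = getattr(module, "DeepPot", None)
        if cls is not None:
            return cls
    return None

DeepPot = find_deeppot_class()
```

New code that only targets v2 can import from `deepmd.infer` directly and skip the probing.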
New features
- Model compression (#350 #586 #610 #921 #948 #956 #1000 #1008 #1020 #1043)
- Parallel training (#892 #905 #913 #1030 #1032) (Bytedance)
- ROCm device support (#656 )
- New descriptor: three body embedding (`se_e3`)
- Hybridization of descriptors (`hybrid`)
- Type embedding
- Training and inference of the dipole (vector) and polarizability (matrix). (#495 #538 #927)
- Support derivatives of the tensor properties. (#805)
- Split of training and validation dataset.
- Model deviation for virial
- Add subcommand and python interface to calculate model-deviation (#715)
- Automatically determine the sel from the training data. (#831)
- Building with lammps with plugin mode (#930 #945)
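The model-deviation items above refer to the spread of predictions across an ensemble of models. The sketch below assumes one common definition for forces: the per-atom standard deviation of the force vector over the models, maximized over atoms. The function name and the exact normalization are assumptions and should be checked against the documentation of the model-deviation subcommand.

```python
import numpy as np

def max_force_deviation(forces):
    """Maximum per-atom force deviation for one frame.

    forces: array of shape (n_models, n_atoms, 3), one prediction per model.
    """
    forces = np.asarray(forces, dtype=float)
    mean_force = forces.mean(axis=0)                      # (n_atoms, 3)
    sq_dist = ((forces - mean_force) ** 2).sum(axis=2)    # (n_models, n_atoms)
    per_atom_std = np.sqrt(sq_dist.mean(axis=0))          # (n_atoms,)
    return float(per_atom_std.max())

# Two hypothetical models that disagree on the force on a single atom.
two_models = [[[1.0, 0.0, 0.0]], [[3.0, 0.0, 0.0]]]
deviation = max_force_deviation(two_models)
```

A large value flags a configuration on which the models disagree, which is why the quantity is often used to select frames for further labeling.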
Performance improvement:
- More efficient training: all customized OPs are implemented with GPU.
- MPI support for atomic model deviation #628
- speedup ROCm kernels which use atomicAdd (#809 #815 ) (from ByteDance)
- speedup CUDA kernels (use atomicAdd inside) by reducing the global memory write (#811)
- speedup tabulate cuda kernel by reducing shared memory usage (#830) (Bytedance)
- speedup `format_nlist_b` (#832 #845)
- speedup `scan_nlist` kernel (#1028)
Enhancements
- Strict argument check in the input script.
- Auto conversion of input file to v2.0 compatibility
- Append out_file when lammps restarts #640
- Document and examples for the C++ interface #652 #663
- Instructions for the i-pi #660
- Document for the network size and sel #657
- Use fmod to wrap the coord of atoms (solve slow PBC) (#741)
- bit operations to encode neighbor information
- add CUDA/ROCM building documents (#739)
- add type-embedding developer doc (#762 #967)
- add model compression support for models with exclude_types feature (#754)
- improve the doc and user interface of model compression (#772)
- support converting models generated in v1.3 to 2.0 compatibility (#725)
- give a default value to T and convert models from v1.2 to 2.0 compatibility (#789)
- improved documents for conda (#798 #925)
- throw a message if tf runtime is incompatible (#797)
- capture OOM and print debug message (#801)
- add message for DecodeError raised when using model compression (#839)
- Passing error to TF instead of exit (#918)
- refactor docs (#952)
- add an example of `nopbc` and related docs (#994)
- add `dp --version` (#995)
- add the argument `tensorboard_freq` to control sampling ratio during training. (#996)
- add sphinx plugins `viewcode` and `intersphinx` (#997)
- generate Python API document automatically (#998)
- give a clear message if `model.get_ntypes()<data.get_ntypes()` (#1016)
- add docstring for `descrpt/se_e2_a` (#1017)
- add docstring for `fit/ener` (#1024)
- add `InputNlist` into API doc (#1009)
- save checkpoint files with step and keep recent files (#1031)
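The `fmod` item (#741) can be illustrated as follows. Wrapping a coordinate into the periodic box by repeatedly adding or subtracting the box length takes many iterations when an atom has drifted far outside the box, whereas a floating-point remainder does it in one step. This sketch assumes an orthorhombic box starting at the origin and is not the project's implementation; `wrap_coords` is a hypothetical name.

```python
import numpy as np

def wrap_coords(coords, box_length):
    """Map coordinates into [0, box_length) along each axis."""
    coords = np.asarray(coords, dtype=float)
    # np.fmod keeps the sign of the dividend, so negatives need one shift.
    wrapped = np.fmod(coords, box_length)
    return np.where(wrapped < 0.0, wrapped + box_length, wrapped)

wrapped = wrap_coords([-0.5, 10.5, 3.0, 25.0], 10.0)
```

The cost is constant per coordinate regardless of how many box lengths away the atom sits.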
Improvement of the code for developers
- Support version of the model. Easily check model compatibility
- Clear and pythonic python interface
- C++ lib that can be tested independently
- C++ API that can be tested independently
- OP supports multi-device.
- Added `deepmd` namespace for the C++ API
- UT for Cuda/ROCm code (#569)
- UT for model compression (#586)
- UT for prod_force/virial ops (#703 #741)
- CI test Lammps build (#600)
- allow c++ tests to run without internet (#785)
- build low and high precision at the same time (#879)
- support to specify CUDA/ROCm root in python pkg building (#834) (Bytedance)
- use cached Session to speed up py tests (#833)
- remove cub include for CUDA>=11 (#866)
- Add Errcheck after every kernel function runs and merge redundant code (#855)
- adapt changes to auditwheel directory in manylinux (#889)
- enhance the cli to generate doc json file (#891)
- raise warning before training if `sel` is not enough (#914)
- make native MD compatible with v2.0 (#950)
- fix type hints and add doc for `exclude_types` (#1005)
- use TF's built-in method to get numpy dtype (#1035)
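The `sel` warning (#914) and the automatic `sel` determination (#831) both depend on counting how many neighbors an atom has within the cutoff. A minimal sketch of that check, assuming open boundaries, a single atom type, and a brute-force O(N^2) search; the function names are hypothetical and the real code handles per-type `sel` and periodic images.

```python
import numpy as np

def max_neighbors(coords, rcut):
    """Largest number of neighbors any atom has within rcut."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    # Exclude each atom's zero distance to itself.
    within = (dist < rcut) & ~np.eye(len(coords), dtype=bool)
    return int(within.sum(axis=1).max())

def sel_is_enough(coords, rcut, sel):
    """True when sel can hold every neighbor found within rcut."""
    return max_neighbors(coords, rcut) <= sel

# Hypothetical chain of atoms along x, plus one isolated atom.
chain = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
needed = max_neighbors(chain, rcut=1.5)
```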
Bug fixes:
- Remove `using namespace std`. Solve compiling compatibility problem.
- CUDA memory access error #566
- Relative force model deviation is not copied back at single precision #599
- Correct way of allocating memory in float precision #612
- Fix TB logdir remove bug #617
- Illegal nlist #680
- Bug in `prod_virial_grad` that causes wrong results when training with virials #685
- Uniform random seed #691
- fix bug of adding int to a None random seed (#705)
- reuse the zero layer rather than building a new one (#714)
- fix bug in CI (#739)
- fix bug 824 and synchronize updates to CUDA code (#828)
- Fix the empty neighbor distance array in neighbor_stat.py (#882)
- fix InvalidArgumentError caused by zero sel and optimize zero matrix (#900)
- fix 'NoneType' has no len() in auto_sel (#911)
- set input `DeepmdData.type_map` to input `type_map` (#924)
- Fix member declaration of `deepmd` and `deepmd.entrypoints`. (#922)
- add aliases to Arguments (#933)
- fix bug of gelu activation function (#939)
- convert `decay_rate` to `stop_lr` from old inputs (#949)
- only enable link what you use on GNU compilers (#962)
- Do not find protobuf for python (#963)
- fix an error in stress by ase interface (#964)
- remove bare `except` and limit the `try` clause (#977)
- fix python cmake error (#976)
- Instantiate RunOptions first when training. (#1019)
- Fix compiler type in cmake: `CMAKE_COMPILER_IS_GNUCXX` (#1038)
- other cleanups of the code (#968 #970 #975 #999 #1004 #1002 #1001 #1010 #1014 #1012 #1011 #1021 #1036 #1037)
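The `decay_rate` to `stop_lr` conversion (#949) can be understood through an exponential learning-rate schedule. The sketch below assumes a continuous schedule lr(t) = start_lr * decay_rate^(t / decay_steps); the function names are hypothetical, and the project's converter may differ in detail (for example, if the schedule is applied as a staircase).

```python
def stop_lr_from_decay_rate(start_lr, decay_rate, decay_steps, stop_batch):
    """Learning rate reached at stop_batch under the assumed schedule."""
    return start_lr * decay_rate ** (stop_batch / decay_steps)

def decay_rate_from_stop_lr(start_lr, stop_lr, decay_steps, stop_batch):
    """Inverse mapping: the decay rate that ends at stop_lr."""
    return (stop_lr / start_lr) ** (decay_steps / stop_batch)

# Hypothetical old-style input: halve the rate every 1000 steps for 3000 steps.
stop_lr = stop_lr_from_decay_rate(1e-3, 0.5, 1000, 3000)
recovered_rate = decay_rate_from_stop_lr(1e-3, stop_lr, 1000, 3000)
```

Specifying the final learning rate directly makes an input independent of the total number of training steps, which is one plausible reason for preferring `stop_lr`.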
Contributors
- Han Bao
- Roberto Car
- Junhan Chang
- Yixiao Chen
- Ye Ding
- Weinan E
- Jiequn Han
- Li'ang Huang
- Weile Jia
- Zeyu Li
- Ziyao Li
- Yinnian Lin
- Yihao Liu
- Xinzijian Liu
- Denghui Lu
- Marián Rynik
- Shaochen Shi
- Ping Tuo
- Bo Wang
- Haidi Wang
- Han Wang
- Yingze Wang
- Yu Xia
- Fengbo Yuan
- Jiabin Yang
- Haotian Ye
- Jinzhe Zeng
- Duo Zhang
- Linfeng Zhang
- Yuzhi Zhang
v2.0.0-beta.4
New features:
- parallel training (#892 #905 #913) (Bytedance)
- automatically determine the `sel` from the training data. (#831)
- build low and high precision at the same time (#879)
Performance improvement:
- speedup tabulate cuda kernel by reducing shared memory usage (#830) (Bytedance)
- speedup format_nlist_b (#832 #845)
Enhancements:
- support to specify CUDA/ROCm root in python pkg building (#834) (Bytedance)
- use cached Session to speed up py tests (#833)
- add message for DecodeError raised when using model compression (#839)
- remove cub include for CUDA>=11 (#866)
- Add Errcheck after every kernel function runs and merge redundant code (#855)
- adapt changes to auditwheel directory in manylinux (#889)
- enhance the cli to generate doc json file (#891)
- raise warning before training if sel is not enough (#914)
Bug fixes:
v2.0.0-beta.3
New feature:
- derivatives for deep tensor (#805)
Performance improvement:
- speedup ROCm kernels which use atomicAdd (#809 #815 ) (from ByteDance)
- speedup CUDA kernels (use atomicAdd inside) by reducing the global memory write (#811)
Enhancement:
- add type-embedding developer doc (#762)
- add model compression support for models with exclude_types feature (#754)
- improve the doc and user interface of model compression (#772)
- allow c++ tests to run without internet (#785)
- support converting models generated in v1.3 to 2.0 compatibility (#725)
- give a default value to T and convert models from v1.2 to 2.0 compatibility (#789)
- improved documents for conda (#798)
- throw a message if tf runtime is incompatible (#797)
- capture OOM and print debug message (#801)
Bug fixes
v2.0.0-beta.2
New features:
- Add subcommand and python interface to calculate model-deviation (#715)
Enhancements
- Use fmod to wrap the coord of atoms. UT for force/virial ops (#741)
- UT for model devi C++ interface (#731)
- add CUDA/ROCM building documents (#739)
- add op unittests for prod_force, prod_virial, prod_force_grad and prod_virial_grad (#703)
Bug fixes:
v2.0.0-beta.1
v2.0.0-beta.0
Increment to v2.0.0-alpha:
New features:
- Atom type embedding
- Model deviation for virial
Enhancement:
- Improved documentation
- Better support for dipole and polarizability learning
- bit operations to encode neighbor information
- MPI support for atomic model deviation #628
- UT for GPU code #569
- UT for model compression #586
- Test Lammps build #600
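The item on bit operations for neighbor information does not spell out the encoding. The general idea of packing several small fields into one integer, so that neighbors can be sorted and compared with a single key, can be sketched as below. The field layout (type in the high bits, index in the low bits) and the 20-bit index width are purely illustrative assumptions, not the project's actual format.

```python
INDEX_BITS = 20                      # assumed width of the index field
INDEX_MASK = (1 << INDEX_BITS) - 1

def encode_neighbor(atom_type, index):
    """Pack (atom_type, index) into one integer key."""
    if not 0 <= index <= INDEX_MASK:
        raise ValueError("index does not fit in INDEX_BITS")
    return (atom_type << INDEX_BITS) | index

def decode_neighbor(key):
    """Recover (atom_type, index) from a packed key."""
    return key >> INDEX_BITS, key & INDEX_MASK

# Sorting the packed keys orders neighbors by type first, then by index.
keys = sorted(encode_neighbor(t, i) for t, i in [(1, 7), (0, 9), (1, 3)])
ordered = [decode_neighbor(k) for k in keys]
```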
Bug fixes
v2.0.0-alpha.1
What's new to v2.0.0-alpha.0
- Training and inference of the dipole (vector).
- Split of training and validation dataset.
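With training and validation data split, the training section of the input script describes the two datasets separately. The sketch below builds such a section as a Python dictionary and serializes it to JSON. The key names (`training_data`, `validation_data`, `systems`, `batch_size`, `numb_btch`) and the directory paths are assumptions for illustration and should be checked against the documentation for the installed version.

```python
import json

# Assumed layout: one block per dataset, each listing its own systems.
training_section = {
    "training_data": {
        "systems": ["data/train_system"],   # hypothetical path
        "batch_size": "auto",
    },
    "validation_data": {
        "systems": ["data/valid_system"],   # hypothetical path
        "batch_size": 1,
        "numb_btch": 3,
    },
}

input_text = json.dumps({"training": training_section}, indent=2)
parsed = json.loads(input_text)
```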
Enhancement:
- Strict argument check in the input script.
- Update readme for v2.0
- Auto conversion of input file to v2.0 compatibility
Bug fixes:
- Fix bugs of broken examples.
v2.0.0-alpha.0
The very first alpha release of deepmd-kit version 2.0.0. It includes the following new features
- Model compression
- New descriptor: three body embedding
- Hybridization of descriptors
- Long-range modification
- Type embedding (under development)
- Training and inference of the dipole (vector) and polarizability (matrix). (under development)
- Split of training and validation dataset. (under development)
- ROCm device support (under development)
Enhancements
- More efficient training: all customized OPs are implemented with GPU.
- Parallel training with multiple GPU support (under development)
Improvement of the code for developers
- Supports version of the model. Easily check model compatibility
- Clear and pythonic python interface
- C++ API that can be tested independently
- OP supports multi-device.
Bug fixes:
- remove `using namespace std`. Solves compiling compatibility problem.
- added `deepmd` namespace for the C++ API