飞桨分子动力学模拟-论文复现第六期：复现TorchMD

Paddle for MD 飞桨分子动力学模拟科学计算

复现论文-TorchMD: A deep learning framework for molecular simulations
本项目可在AIStudio一键运行：飞桨分子动力学模拟PaddleMD-复现TorchMD
在这里插入图片描述

【论文复现第六期】相关信息

飞桨第六期论文复现赛128 https://aistudio.baidu.com/aistudio/competition/detail/205/0/task-definition

issue报名地址：https://github.com/PaddlePaddle/Paddle/issues/41482

验收标准：Di-alanine 688 8 min 44 s， Trypsin 3,248 13 min 2 s。（后改为去掉速度要求）

成功后会合入 https://github.com/X4Science

飞桨顶会论文挑战赛第9号 https://www.educoder.net/competitions/index/paddlepaddle-02

torchMD代码学习：https://github.com/torchmd/torchmd

论文地址
论文中文版地址

复现工作

复现工作第一步PaddleMD
第一步完成后运行Tutorial指导手册复现第二步2PaddleMD提速复现第三步集成测试

复现相关几个手写API速度测试经测试，代码中有for循环的，一般速度比没有for循环都要慢30倍左右。

当前复现代码github地址PaddleMd https://github.com/skywalk163/INFINITY/tree/main/examples/PaddleMD

准备论文复现工作

下载论文原作源码

首先git clone下载论文原作的源码，使用命令git clone https://github.com/torchmd/torchmd

跑通原论文torchmd

先跑通原作代码，对整个流程有个基本了解

开始复现

第一阶段手工开始单个文件转换

将项目所有核心.py文件，使用%%writefile xx.py的格式，放到notebook cell中，这样可以通过查找替换，快速修改所有的代码。

优点是：代码修改效率高。发现一个问题，解决问题，并可以全部查找、替换，将同类问题全部解决。
缺点是：调试效率较低。需要另开一个notebook文件进行测试，且修改代码后，需要重新执行，甚至要重启测试项目的内核。

代码复现第二阶段

像常规notebook下的调试流程

1、对疑点文件拆分，将函数放到Cell进行测试

测试中可以加入测试代码，验证函数是否正确。最终保证所有函数测试通过

2、测试通过后，将修改写回文件

3、在tutorial.ipynb文件中总测试

优点是，基本不修改tutorial.ipynb文件代码。

代码复现第三阶段

调试精度和速度
速度提高工具
https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/guides/performance_improving/profiling_model.html#profiler
当然最终这个项目里没有使用。

1、速度提高

有些手工写的算子使用了for循环，需要找到并提速。

提升paddleindex_add

5.12日确认了paddleindex_add对速度的影响非常大，这个算子要处理！

其次，在屏蔽掉paddleindex_add后，程序运行速度较快提高，非常利于调试！
目前的思路，是找到黑客马拉松项目里面的index_add代码，然后自己动手写成单独的算子。

6.1日，官方提供put_along_axis算子，可以实现index_add功能！

6.2日，好像put_along算子只支持1D index，以后不能用，后来发现可以使用。

6.4日使用put_alone_axis函数重写了paddleindex_add，速度提升非常多！基本上跟以前测试的时候使用透传的的速度相当！

质疑numpy()函数

使用了太多的numpy()转换函数，这会大大降低速度。要看看这些转换的原因是什么，能否去掉。

提高paddleindexjia

修改paddleindexjia函数，将其修改成由numpy实现：

#             pos[:, self.nongrouped] -= offset.unsqueeze(1)
#             pos = paddleindexjia(pos, -offset, self.nongrouped)pos = pos.numpy()offset = offset.unsqueeze(1).numpy()pos[:, group] -= offset # 尝试使用numpy来处理 相关语句共4句pos = paddle.to_tensor(pos)

最终这一大段应该还有提升空间

        if len(self.nongrouped):offset = paddle.floor(pos[:, self.nongrouped] / box) * box
#             pos[:, self.nongrouped] -= offset.unsqueeze(1)
#             pos = paddleindexjia(pos, -offset, self.nongrouped)pos = pos.numpy()offset = offset.unsqueeze(1).numpy()pos[:, group] -= offset # 尝试使用numpy来处理 相关语句共4句pos = paddle.to_tensor(pos)

性能分析cProfile

import cProfile
import re
cProfile.run(‘re.compile(“foo|bar”)’)

def my_on_trace_ready(prof): # 定义回调函数，性能分析器结束采集数据时会被调用callback = profiler.export_chrome_tracing('./profiler_demo') # 创建导出性能数据到profiler_demo文件夹的回调函数callback(prof)  # 执行该导出函数prof.summary(sorted_by=profiler.SortedKeys.GPUTotal) # 打印表单，按GPUTotal排序表单项p = profiler.Profiler(scheduler = [3,14], on_trace_ready=my_on_trace_ready, timer_only=True) # 初始化Profiler对象p.start() # 性能分析器进入第0个step

测试paddle.profiler

paddle.profiler是飞桨自带的性能分析套件。
但是我在启智平台使用的时候，内核直接重启。
后来发现是最后使用了exit()指令的缘故。

但是分析套件需要数据集部分，没有就报错。曾设想加上一个空的数据集指令来骗过去，但最终没有动手做。

2、精度测试

每个文件分别写调试命令，验证精度。

这个并没有做

对tutorial.ipynb文件中的变量输出，跟torchmd的比对。

print(Epot)
print(system.forces)

在使用同一随机种子的情况，paddlemd和torchmd精度对齐。

集成测试

跟openmm的力场能量进行比较，差别不大，在可接受范围

Running test: test-data/benzamidine/angle Energy diff: 1.599e-06 Force diff: 6.954e-05bond Energy diff: -2.731e-06 Force diff: 1.177e-04dihedral Energy diff: -1.608e-06 Force diff: 7.054e-06lennardjones Energy diff: 6.798e-06 Force diff: 1.794e-05electrostatic Energy diff: -6.766e-07 Force diff: 1.272e-07
All forces. Total energy: 20.914 Energy diff: 3.382e-06 Force diff 1.458e-04

成功复现

复现成功后，即可进行飞桨分子动力学模拟

使用PaddleMD进行分子动力学模拟

参考PaddleMD指引手册，以前丙氨酸二肽为例，进行分子动力学模拟。1000step大约需要4分钟。

首先编译安装PaddleMD

若有报错缺库，缺啥装啥。

编译安装后，需要重启一下环境。

!cd ~/INFINITY/examples/PaddleMD/ && python setup.py develop

参考指引手册，进行前丙氨酸二肽分子模拟

在INFINITY/examples/PaddleMD/test-data/目录中有各种分子结构文件，比如二面角、水分子、苯甲脒、凝血酶配体等。这里以前丙氨酸二肽为例。

读取前丙氨酸二肽分子结构文件

使用“moleculekit”库读取输入拓扑和起始坐标

from moleculekit.molecule import Molecule
import os
import paddletestdir = "INFINITY/examples/PaddleMD/test-data/prod_alanine_dipeptide_amber/"
mol = Molecule(os.path.join(testdir, "structure.prmtop"))  # Reading the system topology
mol.read(os.path.join(testdir, "input.coor"))  # Reading the initial simulation coordinates
mol.read(os.path.join(testdir, "input.xsc"))  # Reading the box dimensionsfrom paddlemd.forcefields.forcefield import ForceField
from paddlemd.parameters import Parametersprecision = paddle.float32
# 加载forcefield力场文件，并使用上述拓扑提取将用于模拟的相关参数
ff = ForceField.create(mol, os.path.join(testdir, "structure.prmtop"))
parameters = Parameters(ff, mol, precision=precision)

创建系统

我们可以创建一个“System”系统对象，该对象将包含模拟期间系统的状态，包括：
1、当前原子坐标
1.当前的盒子大小
1、当前原子速度
1、当前原子力场

from paddlemd.integrator import maxwell_boltzmann
from paddlemd.systems import Systemsystem = System(mol.numAtoms, nreplicas=1, precision=precision)
system.set_positions(mol.coords)
system.set_box(mol.box)
system.set_velocities(maxwell_boltzmann(parameters.masses, T=300, replicas=1))

from paddlemd.forces import Forces
bonded = ["bonds", "angles", "dihedrals", "impropers", "1-4"]
# bonded = ["dihedrals"]
# forces = Forces(parameters, cutoff=9, rfa=True, switch_dist=7.5)
forces = Forces(parameters, cutoff=9, rfa=True, switch_dist=7.5, terms=bonded)
# Evaluate current energy and forces. Forces are modified in-place
Epot = forces.compute(system.pos, system.box, system.forces, returnDetails=True)print(Epot)
print(system.forces)

动力学积分器

为了执行动力学，我们将创建一个“积分器”对象，用于集成模拟的时间步，以及一个“包装器”对象，用于包装周期单元内的系统坐标

from paddlemd.integrator import Integrator
from paddlemd.wrapper import Wrapperlangevin_temperature = 300  # K
langevin_gamma = 0.1
timestep = 1  # fsintegrator = Integrator(system, forces, timestep, gamma=langevin_gamma, T=langevin_temperature)
wrapper = Wrapper(mol.numAtoms, mol.bonds if len(mol.bonds) else None)


from paddlemd.minimizers import minimize_bfgsminimize_bfgs(system, forces, steps=500)  # Minimize the system steps=500

from paddlemd.utils import LogWriterlogger = LogWriter(path="logs/", keys=('iter','ns','epot','ekin','etot','T'), name='monitor.csv')

开始集成模拟

from tqdm import tqdm 
import numpy as npFS2NS = 1E-6 # Femtosecond to nanosecond conversionsteps = 1000 # 1000 
output_period = 10
save_period = 100
traj = []trajectoryout = "mytrajectory.npy"iterator = tqdm(range(1, int(steps / output_period) + 1))
# print(f"iterator={iterator}")
Epot = forces.compute(system.pos, system.box, system.forces)
for i in iterator:Ekin, Epot, T = integrator.step(niter=output_period)wrapper.wrap(system.pos, system.box)
#     currpos = system.pos.detach().cpu().numpy().copy()
#     currpos = system.pos.detach()currpos = system.pos
#     print(currpos.shape)traj.append(currpos)
#     print(len(traj) )
#     print(f"iterator={iterator}")if (i*output_period) % save_period  == 0:np.save(trajectoryout, np.stack(traj, axis=2))logger.write_row({'iter':i*output_period,'ns':FS2NS*i*output_period*timestep,'epot':Epot,'ekin':Ekin,'etot':Epot+Ekin,'T':T})

集成测试

INFINITY/examples/PaddleMD/3集成测试.ipynb/集成测试.ipynb 为集成测试文件,在AIStudio中，到相应位置点开该文件即可。也可以在终端使用命令执行。

集成测试的步骤：

首先使用`python setup.py develop`安装paddlemd开发模式。

在实际使用中，可以使用普通模式，命令为：python setup.py install

执行测试命令

可以使用python tests/test_paddlemd.py进行集成测试，使用./bin/paddlemd --conf tests/water/water_conf.yaml测试水分子，使用./bin/paddlemd --conf tests/prod_alanine_dipeptide_amber/conf.yaml测试prod alanine dipeptide前丙氨酸二肽，使用./bin/paddlemd --conf tests/trypsin/conf.yaml测试Trypsin胰蛋白酶。

在3集成测试.ipynb文件中，上述命令已经放入，直接运行即可。也可以到相应目录，在控制台直接使用相应命令执行。

目前这三个单项测试都可以正常运行不报错。速度大约是torchmd的十六分之一。还有较大提升空间。

集成测试，可以测试一部分，可看到势能和力场等数值跟openmm的较接近。但是后面还是会报错，大约是训练求导那块还有问题。

复现过程中的一些工作留档

手动适配的飞桨API

有些API飞桨没有适配，先手写顶上。

def paddlerandn_like(x) : # 添加飞桨的randn_like函数return paddle.randn(x.shape)# 后来发现飞桨2.3版本已经有了atan2函数，直接用即可
def paddleatan2(input, other): # 飞桨的atan2函数atan = paddle.atan(input/other)atan[1] = atan[1] + piatan[2] = atan[2] + pireturn atandef paddlescatter(x, dim, index, src): # scatter支持1D版本updates = srcif len(index.shape) == 1 :
#         for i in index:
#             x[i] += updates[i]for i in range(index.shape[0]):x[index[i]] += updates[i]return xi, j = index.shapegrid_x , grid_y = paddle.meshgrid(paddle.arange(i), paddle.arange(j))if dim == 0 :index = paddle.stack([index.flatten(), grid_y.flatten()], axis=1)elif dim == 1:index = paddle.stack([grid_x.flatten(), index.flatten()], axis=1)# PaddlePaddle updates 的 shape 大小必须与 index 对应updates_index = paddle.stack([grid_x.flatten(), grid_y.flatten()], axis=1)updates = paddle.gather_nd(updates, index=updates_index)return paddle.scatter_nd_add(x, index, updates)# 这个要使用飞桨黑客松的代码，对速度影响非常大
def paddleindex_add(x, dim, index, source): # 飞桨的index_addfor i in range(len(index)):x[index[i]] += source[i]return x# 最终没有等到黑客松的index_add算子，自己用put_alone_axis写了一个
# 飞桨的put_alone_axis不支持shape不一致的情况，即indices和value比arr长或者短的情况。
# 需要做的，就是要把短的补齐，长的分段传入。
def paddleput_alone_axis(arr, indices, value, axis, reduce="add"):#     print(f"==arr.shape:{arr.shape} indices.shape:{indices.shape} value.shape:{value.shape}")lenarr = arr.shape[0]lenindices = indices.shape[0]while lenarr < lenindices:arr = paddle.put_along_axis(arr, indices[:lenarr].reshape([-1, 1]), value[:lenarr], axis, reduce=reduce)indices = indices[lenarr:]value = value[lenarr:]lenarr = arr.shape[0]lenindices = indices.shape[0]xs = lenarr - lenindicesif xs >= 1:newindices = paddle.concat([indices, paddle.zeros([xs], dtype=paddle.int64)]).reshape([-1, 1])newvalue = paddle.concat([value, paddle.zeros([xs, value.shape[-1]])])else:newindices = indices.reshape([-1, 1])newvalue = valueout = paddle.put_along_axis(arr, newindices, newvalue, axis, reduce=reduce)return out# 为了跟程序里的参数序列对齐，尽量不修改代码，写paddleindex_add
def paddleindex_add(x, dim, index, source):return paddleput_alone_axis(x, index, source, dim)# paddleeye这个不用了，全部用变量乘以paddle.eye(3)实现，解决shape首位>1的情况。
#         box = paddleeye(box, 3)box = box*paddle.eye(3) # 可以很好的处理box[2, 3, 3]类型数据box = box.sum(1)def paddleeye(x, n): # 针对[1, 3, 3]输入的特供eye函数tmp =x[0][paddle.eye(n).astype(paddle.bool)]return tmp.unsqueeze_(0)# paddleindexjia，使用专为numpy计算的方式，速度比paddleindexjia提高30%以上
#             pos[:, self.nongrouped] -= offset.unsqueeze(1)
#             pos = paddleindexjia(pos, -offset, self.nongrouped)pos = pos.numpy()offset = offset.unsqueeze(1).numpy()pos[:, group] -= offset # 尝试使用numpy来处理 相关语句共4句pos = paddle.to_tensor(pos)def paddleindexjia (x, y, xindex): # 索引/切片/赋值特供版本'''切片+索引，使用循环来解决切片问题，然后使用中间变量，来实现按照索引赋值支持类似的语句pos[:, group] -= offset.unsqueeze(1)'''xlen = len(x)assert len(x.shape) == 3 , "维度不一致,必须为3D数据"
#     if len(y.shape) == 3 and y.shape[0] ==1 :
#         y = paddle.squeeze(y)assert len(y.shape) ==2 , "维度不一致，必须为2D数据"for i in range(xlen):tmp = x[i]tmp[xindex] += yx[i] = tmpreturn x# 写飞桨版本的笛卡尔直积函数cartesian_prod
from itertools import product
def paddlecartesian_prod(x,y): # 飞桨版本的笛卡尔直积函数z = list(product(x,y))z = paddle.to_tensor(z)return z.squeeze(axis=-1)# 最终版本笛卡尔直积函数支持变长变量输入
from itertools import product
def paddlecartesian_prod(*x): # 飞桨版本的笛卡尔直积函数z = list(product(*x))z = paddle.to_tensor(z)return z.squeeze(axis=-1)

学习paddlemd里面的几句话

# 这句什么意思        self.require_distances = any(f in self.nonbonded for f in self.energies)
# 也就是力场里面有任何一个nobonded值，则self.require_distances为True，也就是有”长距力“ ？
bonded = ["bonds", "angles", "dihedrals", "impropers", "1-4"]
nonbonded = ["electrostatics", "lj", "repulsion", "repulsioncg"]
terms = bonded + nonbondedclass TestClass():bonded = ["bonds", "angles", "dihedrals", "impropers", "1-4"]nonbonded = ["electrostatics", "lj", "repulsion", "repulsioncg"]terms = bonded + nonbondeddef __init__(self, terms=None):self.energies = [ene.lower() for ene in terms]self.require_distances = any(f in self.nonbonded for f in self.energies)print(f"self.require_distances:{self.require_distances}")
tmp = TestClass(terms=terms)

提速思考

像类似这样，应该都可以用张量直接操作

for i in range(nsystems):if explicit_forces:forces[i] = paddleindex_add(forces[i], 0, self.par.angles[:, 0], angle_forces[0])forces[i] = paddleindex_add(forces[i], 0, self.par.angles[:, 1], angle_forces[1])forces[i] = paddleindex_add(forces[i], 0, self.par.angles[:, 2], angle_forces[2])

调试与总结

复现心得

刚开始选这个复现题目的时候，我承认我大意了，分子动力学模拟这个跟平时接触的机器学习和神经网络有较大的不同，而且有几个算子不支持，导致复现工作几度陷入困境。

在低谷期，只好自己鼓励自己：如果我碰到坎，别人也会碰到；如果我想放弃，那别人也会想放弃。坚持，才能胜利！最终跌跌撞撞，在自己写了几个算子（只求功能，不求速度）之后，程序终于能跑通了！

除了自我鼓励之外，问题拆分也是一个非常有效的解决问题的法宝。几乎所有问题，都可以拆分为小问题，进而拆分为更小更易于解决的问题，最终完成开始看起来几乎不可能完成的任务！

一些技术问题存档

大部分碰到的技术问题，都存档了。比如一些分子库pip无法安装的问题，一些算子的问题等。可以在复现三步走对应的文档下部，看到问题存档。

当前遗留的一些问题

1 集成测试无法完全通过。

2 AI自动求导优化那块可能还有问题。

3 运行速度大约是torchmd的十六分之一，还有很大提升空间。

其中有几个算子需要飞桨官方开发出来。另外整个项目的计算（代码表达式）可能还有较大的改进空间。

引用Citation

Please cite:

@misc{doerr2020torchmd,title={TorchMD: A deep learning framework for molecular simulations}, author={Stefan Doerr and Maciej Majewsk and Adrià Pérez and Andreas Krämer and Cecilia Clementi and Frank Noe and Toni Giorgino and Gianni De Fabritiis},year={2020},eprint={2012.12106},archivePrefix={arXiv},primaryClass={physics.chem-ph}
}

To reproduce the paper go to the tutorial notebook https://github.com/torchmd/torchmd-cg/blob/master/tutorial/Chignolin_Coarse-Grained_Tutorial.ipynb