
Generative models for physicists (写给物理学家的生成模型)

WANG Lei (王磊), ZHANG Pan (张潘)

Citation: WANG Lei, ZHANG Pan. Generative models for physicists[J]. PHYSICS, 2024, 53(6): 368-378. DOI: 10.7693/wl20240602

Funding: Supported by the National Natural Science Foundation of China (Grant Nos. T2225018, 92270107, 12325501)

Corresponding author: WANG Lei, email: wanglei@iphy.ac.cn


  • Abstract: The essence of scientific research is creation. Generative artificial intelligence (AI) opens up a vast space of imagination for more creative scientific exploration. As the core of generative AI, generative models learn the probability distribution underlying data samples and then draw random samples from it to generate new ones. Generative models and statistical physics are essentially two sides of the same coin. This article introduces modern generative models, including diffusion models, autoregressive models, flow models, and variational autoencoders, from a physics perspective. Generative models show great potential for generating and designing material structures at the atomic scale. Moreover, because of their inherent connection with statistical physics, they have a unique advantage in optimizing "Nature's cost function", the variational free energy, which offers new possibilities for solving difficult problems in statistical physics and quantum many-body physics. At the same time, insights from physics are driving the development and innovation of generative models. By drawing on physical principles and methods, more efficient and more unified generative models can be designed to address challenges in AI.
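The connection between generative models and statistical physics stated in the abstract can be illustrated with a toy calculation. The sketch below is not taken from the article. It assumes a four-spin open Ising chain with coupling J = 1 and temperature T = 1, and a hypothetical product-form trial distribution `product_q` with a single parameter `m`; all names and values are illustrative choices. It enumerates all 16 spin configurations and compares the variational free energy F_q = E_q[E] + T E_q[ln q] with the exact free energy F = -T ln Z. The gap F_q - F should equal T times the relative entropy KL(q||p), so it is never negative.

```python
import itertools
import math

N, J, T = 4, 1.0, 1.0  # illustrative values, not from the article

# All 2^N spin configurations of the chain.
STATES = list(itertools.product([-1, 1], repeat=N))


def energy(spins):
    """Energy of an open Ising chain: E = -J * sum_i s_i * s_{i+1}."""
    return -J * sum(a * b for a, b in zip(spins, spins[1:]))


def exact_free_energy():
    """F = -T ln Z, by brute-force enumeration of all states."""
    z = sum(math.exp(-energy(s) / T) for s in STATES)
    return -T * math.log(z)


def boltzmann(s):
    """Exact Boltzmann probability p(s) = exp(-E(s)/T) / Z = exp(-(E(s) - F)/T)."""
    return math.exp(-(energy(s) - exact_free_energy()) / T)


def product_q(s, m):
    """Hypothetical trial distribution: independent spins, each with mean m."""
    return math.prod((1 + m * si) / 2 for si in s)


def variational_free_energy(q):
    """F_q = sum_s q(s) * (E(s) + T ln q(s))."""
    return sum(q(s) * (energy(s) + T * math.log(q(s))) for s in STATES if q(s) > 0)


def kl_divergence(q, p):
    """Relative entropy KL(q||p) = sum_s q(s) ln(q(s) / p(s))."""
    return sum(q(s) * math.log(q(s) / p(s)) for s in STATES if q(s) > 0)


F = exact_free_energy()
for m in (0.0, 0.5, 0.9):
    q = lambda s, m=m: product_q(s, m)
    F_q = variational_free_energy(q)
    t_kl = T * kl_divergence(q, boltzmann)
    print(f"m={m:.1f}  F_q={F_q:.4f}  F={F:.4f}  F_q-F={F_q - F:.4f}  T*KL={t_kl:.4f}")
```

The gap vanishes only when q is the Boltzmann distribution itself. This is why minimizing F_q over a flexible family of generative models can approximate both the free energy and the equilibrium distribution.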

Publication history
  • Received: 2024-05-15
  • Published online: 2024-06-14
