Conclusions and Further Reading

Part 11 of How To Scale Your Model, Chinese edition (Part 10: JAX | Part 12: GPUs)

Thank you for reading! Here we provide a few references for further study.

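Thank you for reading this series of essays, and congratulations on making it all the way to the end. Before we conclude, a few acknowledgments: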

Acknowledgments

This document represents a significant collective investment from many people at Google DeepMind, whom we would like to briefly acknowledge!

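James Bradbury, Reiner Pope, and Blake Hechtman originally derived many of the ideas in this manuscript and were early to understand the systems view of the Transformer.
Sholto Douglas wrote the first version of this document and is responsible for kicking off the project. He is, more than anyone, responsible for the document's overall narrative.
Jacob Austin led the work of turning the first version's rough notes into a more polished and comprehensive artifact. He did most of the editing, formatting, and publishing of this document, and coordinated contributions from the other authors.
Most of the figures and animations were made by Anselm Levskaya and Charlie Chen.
Charlie Chen wrote the inference section and drew many of the inference figures.
Roy Frostig helped with publication, editing, and many other steps along the way.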

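We would also like to thank the many others who gave critical feedback throughout the process, in particular Zak Stone, Nikhil Sethi, Caitlin Stanton, Alex Dimitriev, Sridhar Lakshmanamurthy, Albert Magyar, Diwakar Gupta, Jeff Dean, Corry Wang, Matt Johnson, Peter Hawkins, and many more. Thanks to Ruiqi Gao for help with the HTML formatting.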

Thank you all!

Before you go, you might also enjoy reading the new Section 12 on NVIDIA GPUs!

Feedback

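Please leave comments or questions so that we can improve this further. You can reach our corresponding author, Jacob Austin, at jaaustin [at] google [dot] com, or suggest edits by posting an issue, pull request, or discussion on GitHub.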

Miscellaneous

* Work done at Google DeepMind, now at MatX.

Citation

For attribution in academic contexts, please cite this work as:

    Austin et al., "How to Scale Your Model", Google DeepMind, online, 2025.

or as a BibTeX entry:

    @article{scaling-book,
      title = {How to Scale Your Model},
      author = {Austin, Jacob and Douglas, Sholto and Frostig, Roy and Levskaya, Anselm and Chen, Charlie and Vikram, Sharad
      and Lebron, Federico and Choy, Peter and Ramasesh, Vinay and Webson, Albert and Pope, Reiner},
      publisher = {Google DeepMind},
      howpublished = {Online},
      note = {Retrieved from https://jax-ml.github.io/scaling-book/},
      year = {2025}
    }