Bibliography

[AZWG21]

Cem Anil, Guodong Zhang, Yuhuai Wu, and Roger Grosse. Learning to give checkable answers with prover-verifier games. 2021. arXiv:2108.12099.

[ATB17]

Thomas Anthony, Zheng Tian, and David Barber. Thinking fast and slow with deep learning and tree search. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS), 5366–5376. Red Hook, NY, USA, 2017. Curran Associates Inc. arXiv:1705.08439.

[FCR20]

Tanner Fiez, Benjamin Chasnov, and Lillian Ratliff. Implicit learning dynamics in Stackelberg games: equilibria characterization, convergence analysis, and empirical study. In Proceedings of the 37th International Conference on Machine Learning (ICML), 3133–3144. 2020.

[FCAS+18]

Jakob Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, and Igor Mordatch. Learning with opponent-learning awareness. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), 122–130. Richland, SC, 2018. International Foundation for Autonomous Agents and Multiagent Systems. arXiv:1709.04326.

[HAD25]

Lewis Hammond and Sam Adam-Day. Neural interactive proofs. In Proceedings of the 13th International Conference on Learning Representations (ICLR). 2025. arXiv:2412.08897.

[HY23]

Ryuichiro Hataya and Makoto Yamada. Nyström method for accurate and scalable implicit differentiation. In Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS), volume 206 of Proceedings of Machine Learning Research. 2023. arXiv:2302.09726.

[HBK+21]

Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, and Jacob Steinhardt. Measuring coding challenge competence with APPS. In Proceedings of the 35th International Conference on Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks Track (Round 2). 2021. arXiv:2105.09938.

[ICA18]

Geoffrey Irving, Paul Christiano, and Dario Amodei. AI safety via debate. 2018. arXiv:1805.00899.

[LFB+19]

Alistair Letcher, Jakob Foerster, David Balduzzi, Tim Rocktäschel, and Shimon Whiteson. Stable opponent shaping in differentiable games. In Proceedings of the 7th International Conference on Learning Representations (ICLR). 2019. arXiv:1811.08469.

[MSD+24]

Sumeet Ramesh Motwani, Chandler Smith, Rocktim Jyoti Das, Markian Rybchuk, Philip H. S. Torr, Ivan Laptev, Fabio Pizzati, Ronald Clark, and Christian Schroeder de Witt. MALT: improving reasoning with multi-agent LLM training. 2024. arXiv:2412.01928.

[RSM+23]

Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, and Chelsea Finn. Direct preference optimization: your language model is secretly a reward model. In Proceedings of the 37th International Conference on Neural Information Processing Systems (NeurIPS). Red Hook, NY, USA, 2023. Curran Associates Inc. arXiv:2305.18290.

[WSZP24]

Stephan Wäldchen, Kartikey Sharma, Max Zimmer, and Sebastian Pokutta. Interpretability guarantees with Merlin-Arthur classifiers. In Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS), volume 238 of Proceedings of Machine Learning Research. 2024. arXiv:2206.00759.

[XHLJ18]

Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? In Proceedings of the 7th International Conference on Learning Representations (ICLR). 2019. arXiv:1810.00826.

[ZL10]

Chongjie Zhang and Victor Lesser. Multi-agent learning with policy prediction. In Proceedings of the 24th AAAI Conference on Artificial Intelligence, 927–934. 2010.