References
Agarwal, Alekh, Nan Jiang, Sham M Kakade, and Wen Sun. 2022.
Reinforcement Learning: Theory and
Algorithms. https://rltheorybook.github.io/rltheorybook_AJKS.pdf.
Allaire, J. J., Charles Teague, Carlos Scheidegger, Yihui Xie, and
Christophe Dervieux. 2024. “Quarto.” https://doi.org/10.5281/zenodo.5960048.
Baydin, Atilim Gunes, Barak A. Pearlmutter, Alexey Andreyevich Radul,
and Jeffrey Mark Siskind. 2018. “Automatic Differentiation in
Machine Learning: A Survey.” February 5, 2018. https://doi.org/10.48550/arXiv.1502.05767.
Berkelaar, Frans. 2009. Container Ship MSC Davos -
Westerschelde - Zeeland. https://www.flickr.com/photos/28169156@N03/52957948820/.
Boyd, Stephen, and Lieven Vandenberghe. 2004. Convex
Optimization. Cambridge University Press. https://web.stanford.edu/~boyd/cvxbook/.
Deng, Li. 2012. “The MNIST Database of
Handwritten Digit Images for Machine Learning
Research [Best of the Web].”
IEEE Signal Processing Magazine 29 (6): 141–42. https://doi.org/10.1109/MSP.2012.2211477.
GPA Photo Archive. 2017. Robotic Arm. https://www.flickr.com/photos/iip-photo-archive/36123310136/.
Guy, Romain. 2006. Chess. https://www.flickr.com/photos/romainguy/230416692/.
Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2013. The
Elements of Statistical Learning: Data
Mining, Inference, and Prediction.
Springer Science & Business Media. https://books.google.com?id=yPfZBwAAQBAJ.
James, Gareth, Daniela Witten, Trevor Hastie, Robert Tibshirani, and
Jonathan Taylor. 2023. An Introduction to
Statistical Learning: With Applications in
Python. Springer Texts in
Statistics. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-031-38747-0.
Lai, T. L., and Herbert Robbins. 1985. “Asymptotically Efficient
Adaptive Allocation Rules.” Advances in Applied
Mathematics 6 (1): 4–22. https://doi.org/10.1016/0196-8858(85)90002-8.
Nielsen, Michael A. 2015. Neural Networks and
Deep Learning. Determination Press. http://neuralnetworksanddeeplearning.com/.
Ross, Stéphane, Geoffrey J. Gordon, and J. Bagnell. 2010. “A
Reduction of Imitation Learning and
Structured Prediction to No-Regret Online
Learning.” https://www.semanticscholar.org/paper/A-Reduction-of-Imitation-Learning-and-Structured-to-Ross-Gordon/79ab3c49903ec8cb339437ccf5cf998607fc313e.
Russell, Stuart J., and Peter Norvig. 2021. Artificial Intelligence:
A Modern Approach. Fourth edition. Pearson Series in Artificial
Intelligence. Hoboken: Pearson.
Schrittwieser, Julian, Ioannis Antonoglou, Thomas Hubert, Karen
Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, et al. 2020.
“Mastering Atari, Go, Chess and Shogi by
Planning with a Learned Model.” Nature 588 (7839):
604–9. https://doi.org/10.1038/s41586-020-03051-4.
Schulman, John, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg
Klimov. 2017. “Proximal Policy Optimization
Algorithms.” August 28, 2017. https://doi.org/10.48550/arXiv.1707.06347.
Silver, David, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre,
George van den Driessche, Julian Schrittwieser, et al. 2016.
“Mastering the Game of Go with Deep Neural Networks
and Tree Search.” Nature 529 (7587): 484–89. https://doi.org/10.1038/nature16961.
Silver, David, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou,
Matthew Lai, Arthur Guez, Marc Lanctot, et al. 2018. “A General
Reinforcement Learning Algorithm That Masters Chess, Shogi, and
Go Through Self-Play.” Science 362 (6419):
1140–44. https://doi.org/10.1126/science.aar6404.
Silver, David, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou,
Aja Huang, Arthur Guez, Thomas Hubert, et al. 2017. “Mastering the
Game of Go Without Human Knowledge.” Nature
550 (7676): 354–59. https://doi.org/10.1038/nature24270.
Sussman, Gerald Jay, Jack Wisdom, and Will Farr. 2013. Functional
Differential Geometry. Cambridge, MA: The MIT Press.
Sutton, Richard S., and Andrew G. Barto. 2018. Reinforcement
Learning: An Introduction. Second edition. Adaptive Computation and
Machine Learning Series. Cambridge, Massachusetts: The MIT Press. http://incompleteideas.net/book/RLbook2020trimmed.pdf.
Vershynin, Roman. 2018. High-Dimensional Probability:
An Introduction with Applications in
Data Science. Cambridge University Press. https://books.google.com?id=NDdqDwAAQBAJ.
Williams, Ronald J. 1992. “Simple Statistical Gradient-Following
Algorithms for Connectionist Reinforcement Learning.” Machine
Learning 8 (3): 229–56. https://doi.org/10.1007/BF00992696.