Machine learning (ML) and particularly deep learning (DL) are used in various physics studies. In this book, the authors show how. They begin by discussing the attributes and theory of ML and its apparent similarities in form to theories in physics, particularly thermodynamics. Subsequently, in the second part, they show how ML in general and DL in particular can be used to resolve issues related to the inverse problem in theoretical and experimental physics.

In the first chapter, the authors present theoretical arguments and constructs to support their hypothesis. The book is then divided into two parts: “Physical View of Deep Learning” (five chapters) and “Applications to Physics” (seven chapters).

Part 1 starts with chapter 2’s introduction to ML, which shows that the entropy concepts of information theory and thermodynamics have a similar mathematical form, particularly when seen from higher theoretical aspects such as Markov chains and Hamiltonian function definitions. The next chapter discusses the basics of (artificial) neural networks from the following viewpoint: artificial neural networks (ANNs) and procedures such as backpropagation can be explained in terms of a Hamiltonian function. It uses bra-ket notation to draw a parallel with statistical quantum mechanics. The authors underline the basic premise of ANNs (calling it the universal approximation theorem of ANNs): a function can be approximated by a set of functions with a known maximum error, which can subsequently be reduced through further learning about the nature of the function. This gives rise to the concept of deep learning.

Chapter 4 discusses “two representative examples of a recent deep learning architecture”: convolutional neural networks (CNNs) and recurrent neural networks (RNNs). A CNN’s structure “emphasizes the spatial proximity in input data.” An RNN’s structure is suited to learning from time-series data. The chapter elaborates on some useful techniques, including long short-term memory (LSTM) and critical state and Turing completeness cellular automata (CSTCCA), to provide “a structure that respects the characteristics of data.” In the next chapter, “Sampling,” the authors discuss techniques for the sampling of expected outcomes, which is required for training ANNs. These techniques include concepts from statistical mechanics and probability such as the central limit theorem (CLT), the Monte Carlo method, the principle of detailed balance, the Metropolis method, and the heat bath method. The authors feel that a good understanding of these techniques will lead to a better understanding of the investigative processes in physics. The last chapter of Part 1 elaborates further on chapter 3. Here, the authors look at Boltzmann machines and generative adversarial networks (GANs) to show how the inputs can reveal the probability distribution required for a focused sampling to train an ANN. Boltzmann machines use the Hamiltonian statistical method for a multi-particle spin system. The authors emphasize the importance of showing the link between physics and learning machines.
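
For readers unfamiliar with the Metropolis method and detailed balance, the following is a minimal sketch, not taken from the book. It assumes a one-dimensional periodic Ising chain with coupling J = 1 and no external field; the function name, parameters, and defaults are hypothetical choices made for illustration only.

```python
import math
import random


def metropolis_ising_chain(n_spins=20, beta=0.5, n_sweeps=2000, seed=0):
    """Sample a 1D periodic Ising chain, H = -sum_i s_i * s_{i+1} (J = 1 assumed).

    Returns the final spin configuration and the energy per spin after each sweep.
    """
    rng = random.Random(seed)
    spins = [rng.choice((-1, 1)) for _ in range(n_spins)]
    energies = []
    for _ in range(n_sweeps):
        for _ in range(n_spins):
            i = rng.randrange(n_spins)
            left = spins[(i - 1) % n_spins]
            right = spins[(i + 1) % n_spins]
            # Energy change if spin i is flipped.
            delta_e = 2 * spins[i] * (left + right)
            # Metropolis rule: accept with probability min(1, exp(-beta * delta_e)).
            # This acceptance rule satisfies detailed balance with respect to
            # the Boltzmann distribution exp(-beta * H).
            if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                spins[i] = -spins[i]
        energy = -sum(spins[j] * spins[(j + 1) % n_spins] for j in range(n_spins))
        energies.append(energy / n_spins)
    return spins, energies
```

After discarding an initial burn-in, the average energy per spin should settle near the known large-chain value of -tanh(beta), which gives a simple check that the sampler is drawing from the intended distribution.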

Part 2 looks at machine learning applications in physics. Chapter 7, “Inverse Problems in Physics,” attempts to define an (inverse) problem for a physicist, that is, to define a physical problem from a machine learning point of view. This is difficult due to the one-to-many relationships between learning and the problem, though it raises some important questions: Is it possible to “learn the discovery of physics”? “Can phase transitions be found by deep learning?” Chapter 8 discusses these two topics and introduces the Ising model in relation to machine learning. Using some experimental data, the authors show that it is a plausible technique. Chapter 9, “Dynamical Systems and Neural Networks,” notes that the dynamics of physical bodies/systems are often described by a set of (often nonlinear) differential equations; it is therefore important to explore the link between ANNs and differential equations, which is not quite clear. Solving a dynamical model by discretizing a differential equation does not align with the ANN approach. However, if the Hamiltonian approach to a dynamical system is adopted, an ANN-based solution becomes possible. This is discussed here and in chapter 10, “Spinglass and Neural Networks,” which introduces the Hopfield model for dynamical systems. The model uses an analogy with the human brain, which relies on a network and memory (“spins”).
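
To illustrate the idea of memory stored in a network of spins, here is a minimal Hopfield-style sketch, not taken from the book. It assumes the standard Hebbian storage rule and sequential sign updates; the function names and the example pattern are hypothetical.

```python
def train_hopfield(patterns):
    """Build a weight matrix from +/-1 patterns using the Hebbian rule (assumed)."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-coupling
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def recall(weights, state, max_sweeps=10):
    """Update spins one at a time until the state stops changing."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            # Local field acting on spin i from all other spins.
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Store one pattern, corrupt a single spin, and let the network restore it.
stored = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored])
noisy = [-1] + stored[1:]
restored = recall(weights, noisy)
```

In this small case the stored pattern is a stable configuration of the spin system, so the corrupted input relaxes back to it, which is the sense in which the network acts as a memory.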

Building on the discussion of dynamical systems, the next chapter covers “Quantum Manybody Systems, Tensor Networks and Neural Networks.” It notes an important issue in quantum physics: finding the wave function of a many-body system. In recent works, a wave function has been approximated by a tensor network, that is, a network whose nodes may take on many quantum states. In this chapter, the authors assert that “the Boltzmann machine is closely related to tensor networks,” and hence to ANNs through a Hilbert space. The authors demonstrate their approach in the final technical chapter, “Application to Superstring Theory.” Superstring theory unifies gravity and other forces into a combined force field (holographic principle). The authors solve an inverse problem of finding a gravitational field from a combined force field using ANNs.

The last chapter, “Epilogue,” summarizes the presented approach and analytics and underlines the results in a philosophical manner.

I can see the possibilities in the authors’ approach. The book has the feel of a graduate thesis. It could be quite useful to a researcher investigating the relationship between ANNs and dynamical physical systems.