## Chairs: Ben Peherstorfer (NYU) and Nick Boffi (NYU)

### Time: Aug 19th, 12:20pm-1:10pm ET, 19:10-20:00 CET, 01:10-02:00 GMT+8

**Some observations on partial differential equations in Barron and multi-layer spaces**, Weinan E (Princeton University); Stephan Wojtowytsch (Princeton University)

*Paper Highlight, by Juncai He*

Barron and tree-like spaces are regarded as the appropriate function spaces for studying the mathematical aspects of neural networks with one or multiple hidden layers. This paper presents observations on the Barron or tree-like regularity of solutions to three prototypical PDEs (screened Poisson, heat, and viscous HJB). From a mathematical perspective, the proof techniques are not difficult, but some interesting results show that Barron regularity may differ from the classical regularity theory of PDEs in certain respects. Given the growing number of attempts to solve PDEs with neural networks, a regularity theory for PDEs in these function spaces plays a fundamental role.
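As a rough illustration of what a Barron-type bound looks like in practice, a finite two-layer ReLU network f(x) = Σ a_i ReLU(w_i · x + b_i) has Barron norm bounded above by its path norm Σ |a_i| (‖w_i‖₁ + |b_i|). The sketch below evaluates such a network and its path norm. The weights are made up for illustration, and the ℓ¹ norm on the inner weights is one common convention assumed here, not something taken from the paper.

```python
def relu(z):
    return z if z > 0.0 else 0.0

def two_layer(x, a, W, b):
    """Evaluate f(x) = sum_i a[i] * relu(W[i] . x + b[i])."""
    return sum(
        a_i * relu(sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) + b_i)
        for a_i, w_i, b_i in zip(a, W, b)
    )

def path_norm(a, W, b):
    """sum_i |a[i]| * (||W[i]||_1 + |b[i]|), an upper bound on the Barron norm."""
    return sum(
        abs(a_i) * (sum(abs(w) for w in w_i) + abs(b_i))
        for a_i, w_i, b_i in zip(a, W, b)
    )

# Hypothetical weights for a width-3 network on R^2 (illustration only).
a = [1.0, -0.5, 2.0]
W = [[1.0, 0.0], [0.0, -1.0], [0.5, 0.5]]
b = [0.0, 1.0, -0.25]

value = two_layer([1.0, 2.0], a, W, b)
norm = path_norm(a, W, b)
```

The path norm depends only on the weights, not on the input dimension, which is part of why such norms are a natural measure of regularity for network-based PDE solvers.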

**A deep learning method for solving Fokker-Planck equations**, Yao Li (University of Massachusetts Amherst), Matthew Dobson (University of Massachusetts Amherst); Jiayu Zhai (University of Massachusetts Amherst)

*Paper Highlight, by Raffaele Marino*

The theory of Brownian motion, developed in different formulations by Einstein, Smoluchowski, and Langevin around 1905 and 1906, describes the dynamics of a particle suspended in a fluid. A prototypical example is a small colloidal object, e.g., a polystyrene bead about a micrometer in size, floating in water at room temperature. Even without externally applied forces, the particle is in an animated and erratic state of motion. This motion is generated at microscopic scales by collisions with the water molecules and is visible at mesoscopic scales as an irregular diffusive movement. Brownian motion has been successfully modeled by stochastic differential equations, which are used more generally to model the dynamics of many real-world problems in the presence of uncertainty.

The instantaneous and cumulative effects of the noise on the dynamics can be visualized through the probability distribution of the solution process. The Fokker-Planck equation (also known as the Kolmogorov forward equation) describes this probability measure analytically. In general, this partial differential equation (PDE) cannot be solved analytically when the number of dimensions is high, so numerical methods must be used. Traditional PDE solvers do not work well for the Fokker-Planck equation because of the curse of dimensionality and the many other difficulties that high-dimensional spaces bring. Recently, however, deep learning has shown many interesting results in solving PDEs in high-dimensional spaces. Deep learning is a class of machine learning algorithms that uses multiple layers to extract higher-level features from the raw input. It is based on artificial neural networks, a series of functional transformations obtained by fixing a set of basis functions in advance and allowing them to adapt during training.

In this manuscript, the authors propose a mesh-free Fokker-Planck solver in which a deep neural network represents the stationary solution to the Fokker-Planck equation, and only a small data set is needed as a reference to locate the solution near the empirical probability distribution. By introducing the differential operator of the Fokker-Planck equation into the loss function, the authors improve the accuracy of the neural network representation while reducing the amount of training data required, and therefore the computational cost of solving PDEs of this kind. Their simulations show that the neural network can tolerate very high noise in the training data as long as it is spatially uncorrelated. This method can therefore help in dealing with systems composed of many Brownian particles, and it can be applied in many fields to understand the collective behavior of the agents that make up a complex system.
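To make the idea of putting the differential operator into the loss concrete, here is a minimal sketch under strong simplifying assumptions. It uses a one-dimensional Ornstein-Uhlenbeck process dX = -θX dt + σ dW (our choice of example, not the paper's), whose stationary Fokker-Planck equation is d/dx(θ x p) + (σ²/2) p'' = 0. Candidate densities are plain Gaussians rather than a neural network, and derivatives are taken by finite differences rather than automatic differentiation. All names and parameter values are hypothetical.

```python
import math

THETA, SIGMA = 1.0, 0.8  # hypothetical drift and noise strengths

def gaussian(x, var):
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fp_residual(p, x, h=1e-3):
    """Finite-difference value of d/dx(theta*x*p) + (sigma^2/2) * p'' at x."""
    def flux(y):
        return THETA * y * p(y)
    d_flux = (flux(x + h) - flux(x - h)) / (2.0 * h)
    d2_p = (p(x + h) - 2.0 * p(x) + p(x - h)) / (h * h)
    return d_flux + 0.5 * SIGMA**2 * d2_p

def operator_loss(p, points):
    """Mean squared operator residual over scattered collocation points."""
    return sum(fp_residual(p, x) ** 2 for x in points) / len(points)

points = [-2.0 + 0.1 * i for i in range(41)]
true_var = SIGMA**2 / (2.0 * THETA)  # stationary variance of the OU process

loss_true = operator_loss(lambda x: gaussian(x, true_var), points)
loss_wrong = operator_loss(lambda x: gaussian(x, 2.0 * true_var), points)
```

The residual is close to zero for the true stationary density and clearly positive for a Gaussian with the wrong variance, so the operator term alone already discriminates between candidates. It cannot fix the normalization, however, since p = 0 also has zero residual, which is one reason a reference data set is still useful.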

**Parameter Estimation with Dense and Convolutional Neural Networks Applied to the FitzHugh-Nagumo ODE**, Johann Rudi (Argonne National Laboratory), Julie Bessac (Argonne National Laboratory); Amanda Lenzi (Argonne National Laboratory)

*Paper Highlight, by Zhizhen Zhao*

This paper discusses a parameter estimation problem for a specific system of ODEs, i.e., the FitzHugh-Nagumo equations that describe spiking neurons. The parameter estimation problem is highly nonlinear. It is challenging and expensive to solve this inverse problem with the classical Bayesian framework. As such, the authors propose to use deep neural networks to learn a direct nonlinear mapping from the measured time-series to the underlying ODE parameters. The paper has a very strong emphasis on the application and the authors provided an extensive analysis of results for simulated clean and noisy observations. The authors compared CNNs with fully connected (dense) networks for parameter estimation. CNN architectures mostly show the lowest errors when recovering parameters, which can be attributed to their locally acting kernels being advantageous for time-evolving data. In addition, CNNs extract crucial properties or dynamics of the ODE output when predicting parameters from arbitrarily chosen partial observations; dense NNs, in contrast, perform significantly worse.
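Learning a map from time series to parameters requires training pairs generated by a forward model. The sketch below simulates the FitzHugh-Nagumo system in a common textbook form, dv/dt = v - v³/3 - w + I and dw/dt = ε(v + a - b w), using a classical fourth-order Runge-Kutta step. The parameter names `a`, `b`, `eps`, and `current`, as well as the default values, are assumptions for illustration; the paper's own parameterization may differ.

```python
def fhn_rhs(state, a, b, eps, current):
    """Right-hand side of the textbook FitzHugh-Nagumo equations."""
    v, w = state
    dv = v - v**3 / 3.0 - w + current
    dw = eps * (v + a - b * w)
    return (dv, dw)

def simulate(a, b, eps=0.08, current=0.5, dt=0.05, steps=4000, state=(-1.0, 1.0)):
    """Integrate with classical RK4 and return the voltage time series."""
    def shift(s, k, h):
        return (s[0] + h * k[0], s[1] + h * k[1])

    series = []
    for _ in range(steps):
        k1 = fhn_rhs(state, a, b, eps, current)
        k2 = fhn_rhs(shift(state, k1, dt / 2), a, b, eps, current)
        k3 = fhn_rhs(shift(state, k2, dt / 2), a, b, eps, current)
        k4 = fhn_rhs(shift(state, k3, dt), a, b, eps, current)
        state = (
            state[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            state[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
        )
        series.append(state[0])
    return series

# One (parameters, time series) pair of the kind a network could be trained on.
voltage = simulate(a=0.7, b=0.8)
```

Repeating the call over many sampled values of `a` and `b`, optionally with added observation noise, would give a training set for either a dense or a convolutional regressor.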

**Active learning with importance sampling: Optimizing objectives dominated by rare events to improve generalization**, Grant M Rotskoff (Stanford University), Eric Vanden-Eijnden (New York University)

*Paper Highlight, by Lin Lin*

The authors introduce an approach that combines rare-event sampling techniques with neural network optimization to optimize objective functions that are dominated by rare events. This is a variance reduction technique, which evaluates the gradient of the loss function using a partition-of-unity construction. The online construction of these windowing functions enables adaptive sampling of the regions of interest. Numerical experiments indicate that the method can be used to successfully solve high-dimensional partial differential equations.
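The variance problem that motivates this work can be seen in a much simpler setting. The sketch below is plain importance sampling with a fixed, shifted proposal, not the windowed partition-of-unity scheme of the paper. It estimates the tail probability P(Z > 4) for a standard normal Z, an event that naive sampling with the same budget would hit less than once on average. The function name and sample size are hypothetical.

```python
import math
import random

def rare_event_probability(threshold, n_samples, seed=0):
    """Estimate P(Z > threshold), Z ~ N(0, 1), by sampling from N(threshold, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Likelihood ratio of N(0, 1) to N(threshold, 1) evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold**2)
    return total / n_samples

estimate = rare_event_probability(4.0, 20000)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # closed-form Gaussian tail
```

Shifting the proposal so that about half of the samples land in the rare region, and reweighting them by the likelihood ratio, keeps the estimator unbiased while sharply reducing its variance.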

**A semigroup method for high dimensional committor functions based on neural network**, Haoya Li (Stanford University), Yuehaw Khoo (U Chicago); Yinuo Ren (Peking University); Lexing Ying (Stanford University)

*Paper Highlight, by Jiequn Han*

This paper proposes a new method based on neural networks to compute high-dimensional committor functions. Understanding transition dynamics from the committor function is a fundamental problem in statistical mechanics with decades of work behind it. Traditional numerical methods have an intrinsic limitation in solving general high-dimensional committor functions. Algorithms based on neural networks have received much interest in the community, all of them based on the variational form of the Fokker-Planck equation. This paper’s main innovation lies in proposing a new variational formulation (loss function) based on the semigroup of the differential operator. The new formulation does not contain any differential operator, and the authors explicitly derive the gradients of the loss used for training. The gradients only involve the first-order derivatives of the neural networks, in contrast to the second-order derivatives required in the previous methods. This feature is conceptually beneficial to the efficient training of neural networks. Numerical results on the standard test examples and the Ginzburg-Landau model demonstrate the superiority of the proposed method. In addition, the authors show that in the lazy training regime, the corresponding gradient flow converges at a geometric rate to a local minimum under certain assumptions.
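The semigroup viewpoint rests on the fact that, away from the two metastable sets A and B, the committor is left unchanged by the transition operator: q(x) = E[q(X_τ) | X_0 = x]. The sketch below is a toy discrete analogue of that fixed-point property, not the authors' loss function. It uses a symmetric random walk on {0, ..., n} with A = {0} and B = {n}, for which the exact committor is q(i) = i/n. The function name and the number of sweeps are hypothetical.

```python
def committor_random_walk(n, sweeps=5000):
    """Committor of a symmetric random walk on {0, ..., n} with A = {0}, B = {n}."""
    q = [0.0] * (n + 1)
    q[n] = 1.0  # boundary values: q = 0 on A, q = 1 on B
    for _ in range(sweeps):
        # Apply one step of the transition operator at every interior state:
        # q(i) <- E[q(X_1) | X_0 = i] = (q(i - 1) + q(i + 1)) / 2.
        interior = [0.5 * (q[i - 1] + q[i + 1]) for i in range(1, n)]
        q[1:n] = interior
    return q

q = committor_random_walk(10)
```

No derivative of q appears anywhere in the iteration; only expectations over one step of the dynamics are needed, which is the feature the semigroup formulation carries over to the continuous, high-dimensional setting.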