Session: 16-01-01: Poster Session: NSF-Funded Research (Grad & Undergrad)
Paper Number: 99357
99357 - Does Network Architecture Matter for Loop-Unrolled Elastic Localization?
Iterative learning models, such as loop unrolling or learned optimization, are often more effective than traditional feedforward learning models for a wide variety of physical systems. These procedures are based on the idea that one can "unroll" a classical iterative algorithm into a series of known operations or update steps, and then implement or approximate those operations as layers in a neural network. Because each step is assumed to be relatively simple, the iterative procedure generally requires networks with significantly fewer tunable parameters than standard one-shot approaches. The result is a kind of "learned relaxation" method, akin to the splitting methods commonly used to solve linear systems. As in splitting methods, the internal structure of the update rule can strongly affect the method's performance, and certain approaches perform better on certain problems.
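To make the unrolling idea concrete, the following is a minimal sketch rather than the model presented in this poster. It unrolls a weighted Jacobi iteration (a classical splitting method) into a fixed number of update steps. The per-step weights `omegas` are a hypothetical stand-in for the tunable parameters a learned model would fit; the matrix, right-hand side, and function names are illustrative assumptions.

```python
def jacobi_step(A, b, x, omega):
    """One weighted-Jacobi update: x <- x + omega * D^{-1} (b - A x)."""
    n = len(b)
    new_x = []
    for i in range(n):
        residual_i = b[i] - sum(A[i][j] * x[j] for j in range(n))
        new_x.append(x[i] + omega * residual_i / A[i][i])
    return new_x


def unrolled_solver(A, b, omegas):
    """Apply one update per entry of `omegas`, starting from x = 0.

    Each omega acts like the (single) tunable parameter of one "layer".
    """
    x = [0.0] * len(b)
    for omega in omegas:
        x = jacobi_step(A, b, x, omega)
    return x


def residual_norm(A, b, x):
    """Max-norm of the residual b - A x."""
    n = len(b)
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n))) for i in range(n))


# Small diagonally dominant system (illustrative values only).
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [1.0, 2.0, 3.0]

x_shallow = unrolled_solver(A, b, omegas=[1.0] * 2)   # 2 unrolled steps
x_deep = unrolled_solver(A, b, omegas=[1.0] * 10)     # 10 unrolled steps
res_shallow = residual_norm(A, b, x_shallow)
res_deep = residual_norm(A, b, x_deep)
```

Here every step shares the same simple structure, so the whole "network" has only as many parameters as there are unrolled steps. In a learned variant, the fixed update would be replaced or augmented by a small trainable map.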
This poster explores and refines the application of these methods (especially the choice of network architecture) to the problem of elastic localization. In solid mechanics, elastic localization refers to the process of solving for the equilibrium stress or strain fields in a heterogeneous material microstructure subject to periodic bulk loading conditions. For this problem, we have previously demonstrated that a hybrid iterative ML model (called a Recurrent Localization Network, or RLN) outperforms single-shot learning-based methods. The RLN is based on a loop-unrolling of classical FFT (Fast Fourier Transform)-based methods, which are popular in the mechanics literature. By basing the model architecture on the underlying physics, this approach allows for much more stable and robust training, and produces more expressive models with fairly few tunable parameters.
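As a rough illustration of the kind of classical FFT-based iteration that can be unrolled, the sketch below applies a fixed-point ("basic") scheme to a 1D periodic toy problem. It is an assumption-laden analogue, not the RLN or the authors' solver: the problem is scalar and one-dimensional, a naive DFT stands in for an FFT library call, and the names `basic_scheme_1d`, `c_ref`, and the stiffness values are hypothetical. In 1D the periodic Green's operator in Fourier space reduces to `1 / c_ref` at nonzero frequencies and zero at the mean.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT call)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                    for m, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def basic_scheme_1d(stiffness, mean_strain, c_ref, num_iters):
    """Fixed-point iteration: strain <- strain - Gamma0 * stress(strain)."""
    strain = [mean_strain] * len(stiffness)
    for _ in range(num_iters):
        stress = [c * e for c, e in zip(stiffness, strain)]
        stress_hat = dft(stress)
        # 1D Green's operator: 1/c_ref for k != 0, zero at k = 0 so the
        # prescribed mean strain is left unchanged.
        corr_hat = [0.0 if k == 0 else s / c_ref
                    for k, s in enumerate(stress_hat)]
        corr = dft(corr_hat, inverse=True)
        strain = [e - c.real for e, c in zip(strain, corr)]
    return strain


# Two-phase periodic "microstructure" (illustrative stiffness values).
stiffness = [1.0, 2.0, 1.0, 2.0, 2.0, 1.0, 1.0, 2.0]
mean_strain = 0.01
strain = basic_scheme_1d(stiffness, mean_strain, c_ref=1.5, num_iters=40)
stress = [c * e for c, e in zip(stiffness, strain)]

# In 1D, equilibrium means uniform stress, set by the harmonic-mean stiffness.
harmonic_mean = len(stiffness) / sum(1.0 / c for c in stiffness)
stress_spread = max(stress) - min(stress)
```

At convergence the stress is uniform and equals the harmonic-mean stiffness times the applied mean strain, which gives a simple check on the iteration. A loop-unrolled model would truncate this loop to a few steps and replace parts of the update with learned components.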
Our original choice of internal network architecture for the RLN was derived from a U-Net, a popular image processing architecture. However, this structure does not preserve the underlying translation-invariance of the physical system, and its performance deteriorates when different spatial resolutions are used. Meanwhile, many other network architectures have been employed for similar problems (e.g., Invertible Neural Networks, Fourier Neural Operators, or even simpler CNN structures). Furthermore, the original RLN formulation contains a large number of hyperparameters that could ostensibly be tuned to further improve model performance. This poster explores several alternative choices of network structure and motivates their use by comparing them to existing classical numerical methods for elastic localization. We present a case study examining the relative advantages of these candidate architectures and appropriate choices of hyperparameters for each, and compare their accuracy, computational efficiency, and robustness.
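The translation property mentioned above can be checked numerically. The sketch below is a 1D toy under stated assumptions, not an analysis of the actual RLN or U-Net. It tests whether an operation commutes with periodic shifts (often called translation equivariance). A circular convolution does commute. A stride-2 subsample followed by nearest-neighbour upsampling does not; it is used here only as a crude, hypothetical stand-in for a pooling/unpooling pair.

```python
def circular_shift(signal, k):
    """Shift a periodic signal to the right by k samples."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


def circular_conv(signal, kernel):
    """Periodic (circular) cross-correlation with a centred kernel."""
    n = len(signal)
    half = len(kernel) // 2
    return [sum(kernel[j] * signal[(i + j - half) % n]
                for j in range(len(kernel)))
            for i in range(n)]


def down_up(signal):
    """Stride-2 subsampling followed by nearest-neighbour upsampling."""
    coarse = signal[::2]
    return [coarse[i // 2] for i in range(len(signal))]


def commutes_with_shift(op, signal, k, tol=1e-12):
    """True if op(shift(signal)) equals shift(op(signal)) within tol."""
    lhs = op(circular_shift(signal, k))
    rhs = circular_shift(op(signal), k)
    return all(abs(a - b) <= tol for a, b in zip(lhs, rhs))


signal = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 1.0, 0.0]
kernel = [0.25, 0.5, 0.25]

conv_ok = commutes_with_shift(lambda s: circular_conv(s, kernel), signal, k=1)
pool_ok = commutes_with_shift(down_up, signal, k=1)
```

With a one-sample shift, `conv_ok` is true and `pool_ok` is false. This suggests one way resampling layers can break a symmetry that a periodic physical system possesses.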
Presenting Author: Conlain Kelly, Georgia Institute of Technology
Presenting Author Biography: Conlain is a PhD student at Georgia Institute of Technology studying Computational Science and Engineering. His interests lie in the intersections between statistical mechanics, numerical methods, and machine learning.
Authors:
Conlain Kelly, Georgia Institute of Technology
Surya Kalidindi, Georgia Institute of Technology
Paper Type
NSF Poster Presentation