Breakout Summary Report

 

ARM/ASR User and PI Meeting

Novel Applications of ML/AI to ARM Data and ASR Science
8 August 2023
4:15 PM - 6:15 PM
30
Rao Kotamarthi

Breakout Description

Machine learning and artificial intelligence (ML/AI) are becoming valuable methodologies for developing data-based models, building surrogates of existing models, and analyzing field observations to gain process insights. AI/ML-based science is becoming a path for doing science in its own right, with increasing attention to developing explainable neural networks, physics-informed neural networks (PINNs), and differentiable neural networks. This breakout will discuss this new way of doing science and the state of the art in developing ML/AI-based models and their use in observational data analysis, development of process-scale models, and model evaluation. Contributions are encouraged on model development ranging from process-scale to whole models based entirely on data-based approaches and large-data learning, as well as on targeted process-model emulators for building hybrid models. The ARM data provide a unique window into a variety of phenomena that are sub-grid-scale for climate-scale models, and the use of AI/ML techniques to represent these fast and heterogeneous processes is becoming prevalent. The ARM data can serve as a curated data set for use with AI techniques to find data clusters and teleconnections, and as model training and test data sets. Papers targeting these areas will be encouraged for the breakout.

Main Discussion

Seven presentations were made during this session. Each talk was allotted 15 minutes: 12 minutes for the talk and 3 minutes for questions. We solicited presentations on the development of emulators for process-scale models and on the use of ML with ARM observational and ASR laboratory data sets to find patterns and accelerate the characterization of the data sets. Three of the presentations focused on the development of emulators and four on observational data sets. Abstracts of the presentations and their titles are enclosed. The ML methods used ranged from a very shallow ML model with three layers (Hagos et al.) to a domain-aware convolutional neural network with custom loss functions (Garg et al.). The ARM observational data analysis ranged from developing community-usable ML toolkits (Jackson et al.) for classification of meteorological regimes to specialized methods for clustering mass spectra (Zawadowicz et al.).

Key Findings

There is increasing maturity in the program in using AI/ML methods for ARM/ASR applications. Most of the methods discussed are off-the-shelf algorithms that have been adapted for a specific application. As more data from ARM and ASR become available, we can expect increasing use of observational data-based approaches for process model development and development of hybrid models. The use of ML for ARM data classification seems to be on a secure path, with many approaches being investigated, and will probably mature quickly in the next few years.

Issues

The methods used are basic, off-the-shelf methods. The applications discussed seem to focus on using ML methods as an extension of the statistical toolkit. Thinking of ML/AI as a new approach to doing scientific research is lacking.

Needs

It may be useful to have a tutorial session in future meetings to introduce the community to advanced AI/ML techniques and to the preparation of large data sets for AI/ML use.

Decisions

We will need to advertise this breakout much more aggressively in the following years to increase awareness in the community of the uses of ML/AI.

Future Plans

We will continue to host this breakout at future ARM/ASR PI meetings. We will also consider hosting a tutorial/hackathon at future meetings to introduce larger sections of the ARM/ASR community to advanced AI/ML approaches that are made feasible by the large HPC resources becoming available.

Action Items

Initiate discussion on developing a tutorial/hackathon at the next ASR/ARM PI meeting.