Leveraging Machine Learning for Weather Radar Quality Control and Microphysical Retrievals

August 05, 2025

Isaac Schluesche

Committee: Michael Bell (Advisor); Russ Schumacher; Haonan Chen (Electrical and Computer Engineering)

Abstract

Data from meteorological radars must undergo an extensive quality control process in order to become useful. A primary part of the process is the removal of measurements returned by non-meteorological features such as the Earth’s surface and biological targets (birds, insects, etc.). Existing methods struggle to achieve acceptable levels of non-meteorological data (NMD) removal in a timely fashion. This gap motivated the development of RONIN.jl (Random forest Optimized Nonmeteorological IdentificatioN), an open-source Julia package providing an API for the end-to-end tuning, training, and testing of random forest (RF) models to remove NMD from radar sweeps. It is shown that RONIN.jl achieves performance that meets or exceeds current operational products while running multiple orders of magnitude faster than other experimental machine-learning-based methods.

Hydrometeor size distributions (HSDs) are quantities of great interest in a range of meteorological disciplines including cloud microphysics and numerical modeling. An expanding body of literature has shown that these distributions can be succinctly represented as functions of two or three integral moments of the distribution itself, with effective normalizations drastically reducing variability toward a distribution that does not vary across climatic regimes or precipitation habits. In this study, a novel three-moment normalization is employed to generate an extensive library of simulated HSDs, broadly conditioned on observations, in service of training a retrieval algorithm for the full distribution. Simulated radar variables are also computed for the distributions contained in the synthetic dataset. Subsequently, several artificial neural networks (ANNs) are trained to use the radar variables as input to retrieve the full HSD in several different ways. Finally, the different retrieval techniques are evaluated on a real-world dataset. It is shown that all methods examined are at least somewhat effective at retrieving moments, even of lower order, of observed distributions, and produce distributions that match well with observations.
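
As a concrete illustration of what an "integral moment" of an HSD is, the sketch below computes moments M_n = ∫ D^n N(D) dD for a gamma-shaped distribution, N(D) = N0 D^mu exp(-Lambda D), both numerically and from the closed form N0 Γ(mu+n+1) / Lambda^(mu+n+1). The gamma form, the parameter values, and all function names are assumptions chosen for illustration; they are not taken from the thesis, and this is not the three-moment normalization it describes.

```python
import math


def gamma_hsd(d, n0, mu, lam):
    """Assumed gamma-shaped size distribution N(D) = n0 * D**mu * exp(-lam * D)."""
    return n0 * d**mu * math.exp(-lam * d)


def moment_numeric(order, n0, mu, lam, d_max=30.0, steps=30000):
    """Trapezoidal estimate of the integral of D**order * N(D) over [0, d_max]."""
    h = d_max / steps
    total = 0.0
    for i in range(steps + 1):
        d = i * h
        weight = 0.5 if i in (0, steps) else 1.0  # half weight at both endpoints
        total += weight * d**order * gamma_hsd(d, n0, mu, lam)
    return total * h


def moment_analytic(order, n0, mu, lam):
    """Closed-form moment of the gamma distribution on [0, infinity)."""
    return n0 * math.gamma(mu + order + 1) / lam ** (mu + order + 1)


if __name__ == "__main__":
    # Illustrative parameters only (diameters in mm, lam in mm^-1).
    n0, mu, lam = 8000.0, 2.0, 3.0
    for order in (0, 3, 6):
        num = moment_numeric(order, n0, mu, lam)
        ana = moment_analytic(order, n0, mu, lam)
        print(f"M{order}: numeric={num:.6g} analytic={ana:.6g}")
    # A ratio of adjacent moments gives a characteristic size,
    # e.g. the mass-weighted mean diameter M4 / M3 = (mu + 4) / lam.
    dm = moment_analytic(4, n0, mu, lam) / moment_analytic(3, n0, mu, lam)
    print(f"M4/M3 = {dm:.3f} mm")
```

Under this assumed form, a few moments pin down the whole curve: any three independent moments determine the three parameters N0, mu, and Lambda. That is one simple way to see why a moment-based representation can be compact.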