GISS Lunch Seminar
Speaker: Sing-Chun Wang (DOE/PNNL)
Title: Modeling fire emissions and identifying their key drivers in the contiguous US using interpretable machine learning
Abstract: Wildfires are becoming more frequent and intense in the United States, causing property damage and poor air quality. Wildfire is a complex process shaped by multiple interacting factors, including ignition, fuel, weather, topography, and climate. Developing wildfire prediction models that capture the complex, non-linear relationships between fires and their drivers is therefore increasingly important and demanding. In this study, we built a machine learning (ML) model incorporating predictors of local meteorology, land-surface characteristics, and socioeconomic variables to predict monthly burned area at grid cells of 0.25° x 0.25° resolution across the contiguous US (CONUS) during 2000-2017. In addition, we designed and included predictors representing the large-scale circulation patterns conducive to wildfires, which improved the predictions in several regions. We used a game-theory-based method, SHapley Additive exPlanations (SHAP), to interpret the ML model (hereafter referred to as the explainable ML model) and to examine the relative importance of the predictor variables for wildfire prediction. Results show that longitude and latitude play a key role in delineating fire regimes with different temporal patterns of burned area. For the western US, the energy release component (ERC) is the major contributor to large burned areas. In contrast, in the southeastern US, the identified large-scale circulation patterns, featuring a less active upper-level ridge-trough pattern and low relative humidity (RH) two months earlier, in winter, contribute more to large burned areas in spring.
In a second study, we constructed an explainable ML model using a similar set of predictors to predict monthly fire PM2.5 emissions from the Global Fire Emissions Database (GFED) and used this model to diagnose the fire emissions simulated by process-based models from the Fire Modeling Intercomparison Project (FireMIP). For instance, SHAP importance shows that the SVD predictors representing the large-scale circulation patterns favorable for fires are the dominant factors for peak fire emissions in 2007, which may explain why the process-based models underestimate those emissions. The results demonstrate how ML can improve both the prediction and the interpretation of wildfires and help elucidate the complex processes that contribute to them.
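
For readers unfamiliar with SHAP, the sketch below illustrates the underlying Shapley attribution idea on a toy example. It is not the model or code from the talk: the predictor names (erc, rh, circulation), the coefficients, and the baseline values are all hypothetical, and the Shapley values are computed by brute-force enumeration of feature coalitions rather than with the SHAP library, which is only practical for a handful of predictors.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy "burned area" model. Feature names and coefficients are
# invented for illustration only and are not taken from the study.
FEATURES = ["erc", "rh", "circulation"]


def toy_model(x):
    # Deliberately non-linear: low relative humidity amplifies the ERC effect.
    return 2.0 * x["erc"] + 1.5 * x["circulation"] + 0.5 * x["erc"] * (1.0 - x["rh"])


def coalition_value(subset, x, baseline):
    # Features outside the coalition are replaced by their baseline values.
    mixed = {f: (x[f] if f in subset else baseline[f]) for f in FEATURES}
    return toy_model(mixed)


def shapley_values(x, baseline):
    # Exact Shapley values: weighted average of each feature's marginal
    # contribution over all coalitions of the remaining features.
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = coalition_value(set(subset) | {f}, x, baseline)
                without_f = coalition_value(set(subset), x, baseline)
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi


# Assumed reference state (no fire-conducive conditions) and one sample.
baseline = {"erc": 0.0, "rh": 1.0, "circulation": 0.0}
sample = {"erc": 1.0, "rh": 0.2, "circulation": 0.5}

phi = shapley_values(sample, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additive property that lets each predictor's contribution be ranked and compared. In this toy case the interaction between erc and rh is split equally between the two features.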