Parametric Drivers of High-End Sea-Level Hazards Evolve Over Time

By Alana Hough and Tony E. Wong, Rochester Institute of Technology

Climate models are critical tools for developing strategies to manage the risks posed by sea-level rise. While these models improve our understanding of climate risks, each model parameter carries inherent uncertainty. This parametric uncertainty leads to uncertainty in projections of future risk. Our goal is to understand how those parametric uncertainties affect our assessment of future climate risk, particularly high-end risks.

Our new study, available as a preprint on arXiv (Hough and Wong, 2021), demonstrates an approach to this challenge using random forests. Random forests are a machine learning algorithm for classification and regression problems. They allow us to examine the relationship between the uncertain parameters in our models and future climate risks, and how the relative importances of those drivers change over time.

In our work, we use random forests to classify states-of-the-world (SOWs) as either high-end or non-high-end scenarios of global mean sea-level (GMSL) rise. Each SOW is defined by a set of sampled model parameter values and the corresponding GMSL projection from the Building blocks for Relevant Ice and Climate Knowledge sea-level model (BRICK; Wong et al., 2017b). For our training data, we label each SOW whose GMSL projection is at or above the 90th percentile of the ensemble as a “high-end” sea-level scenario; SOWs below the 90th percentile are labeled “non-high-end.” We do this for each year from 2020 to 2150 to examine how the breakdown of coastal hazard attributable to different model parametric uncertainties evolves over time. We also use both the Representative Concentration Pathway (RCP; Moss et al., 2010) 2.6 and 8.5 scenarios to get a sense of how low or high radiative forcing changes this characterization of the drivers of coastal hazard.
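The percentile-based labeling step can be sketched in a few lines of Python. This is a minimal illustration with synthetic numbers, not BRICK output; the variable names and distribution are assumptions for the example.

```python
import numpy as np

# Hypothetical ensemble: sampled GMSL projections (in metres) for one
# year. The values here are synthetic stand-ins, not BRICK output.
rng = np.random.default_rng(42)
gmsl = rng.normal(loc=1.0, scale=0.3, size=1000)

# Label SOWs at or above the ensemble's 90th percentile as "high-end".
threshold = np.percentile(gmsl, 90)
labels = np.where(gmsl >= threshold, "high-end", "non-high-end")

# By construction, roughly 10% of the ensemble is labelled high-end.
print((labels == "high-end").mean())
```

In the study this labeling is repeated for each projection year, so the same SOW can be high-end in one year and non-high-end in another.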

Random forests for parametric uncertainties

Random forests are collections of many decision trees. A decision tree is a supervised machine learning technique that successively splits the training data into different outcome regions. In our case, the decision trees classify the outcomes (high-end or non-high-end GMSL) using the parameters of the BRICK model as the features for classification.

The figure below shows a graphical representation of a hypothetical decision tree. The tree splits on ECS, P0, and bSIMPLE, which are all BRICK model parameters. (These parameters are chosen for illustrative purposes; they represent the equilibrium climate sensitivity, Antarctic precipitation, and the Greenland ice sheet mass balance, respectively.) In this example, the initial split is on the ECS parameter at a value of 3.25 °C. This feature-value pair is selected by considering all possible feature-value pairs and finding the one that yields the “cleanest,” or most pure, split of the data between non-high-end (left branch) and high-end (right branch) SOWs. Out of the original 1,000 samples, 600 are associated with the left branch (non-high-end GMSL) and 400 with the right branch at the initial split. The right branch is initially associated with high-end GMSL, but is further refined by subdividing into SOWs with P0 greater than or less than 0.5 m y⁻¹.

Figure adapted from Hough and Wong (2021) under CC BY-NC-SA 4.0 license.

Construction of a decision tree stops, and the outcome at each leaf node (the blue boxes at the bottom of each branch) is determined, once the splits in a tree reach a specified depth or another stopping criterion is met. In the example decision tree, the maximum depth is 3, so the tree splits on three levels before creating and classifying leaf nodes. Each leaf node is classified as “high-end” or “non-high-end” by a majority vote among the training data points at that node.

Since decision trees are a supervised learning technique, we train the tree with BRICK data that includes the outcome classifications. After training, the tree can be used to predict outcomes from input feature data. For example, consider a feature data point x with ECS equal to 5 °C, P0 equal to 0.2 m y⁻¹, and bSIMPLE equal to 9 m. Starting at the top of the tree, we consider the ECS value. Since x has an ECS greater than 3.25 °C, we move down the right branch of the tree to the P0 node. x’s P0 value, 0.2 m y⁻¹, is less than 0.5 m y⁻¹, so we continue to the left child of the P0 node, which is the bSIMPLE node. Because x has a bSIMPLE value greater than 7.9 m, we go to the right child of the bSIMPLE node. This node is a “high-end” leaf node, so we classify x as “high-end.”
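The traversal described above can be written out as a small hand-coded function. This is purely illustrative: the thresholds come from the hypothetical example tree, and the further splits of the left (non-high-end) subtree are omitted for brevity.

```python
def classify_sow(ecs, p0, b_simple):
    """Walk the depth-3 example tree and return the leaf label.

    ecs in degrees C, p0 in m/yr, b_simple in m. The thresholds are
    the illustrative ones from the example, not a fitted model.
    """
    if ecs <= 3.25:
        # Left branch of the root split; its deeper splits are
        # omitted here and collapsed to the branch's majority label.
        return "non-high-end"
    if p0 >= 0.5:
        return "high-end"
    # p0 < 0.5: final split on bSIMPLE
    return "high-end" if b_simple > 7.9 else "non-high-end"

# The worked example from the text: ECS = 5, P0 = 0.2, bSIMPLE = 9
print(classify_sow(5.0, 0.2, 9.0))  # -> high-end
```

A fitted tree is just a learned version of this nested if/else structure, with the split features and thresholds chosen to maximize the purity of each branch.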

Like many other machine learning algorithms, decision trees can overfit the training data used to build them. Random forests, an ensemble method, are one way to reduce this overfitting. Each decision tree in a random forest is trained on a random subset of the training data, and only a random subset of the BRICK parameters (features) is available at each split. For example, if a particular model parameter happens to be strongly related to future GMSL rise, a random forest prevents the fitted decision trees from relying only on that parameter to classify a SOW’s future GMSL.
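In scikit-learn terms, the two randomization mechanisms correspond to bootstrap sampling of rows and the `max_features` limit on columns considered at each split. The sketch below uses synthetic stand-in data, not the BRICK ensemble, and the hyperparameter values are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the ensemble: 5 "model parameters" per SOW.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
# Fake outcome: "high-end" iff a linear score lands in the top decile.
score = X[:, 0] + 0.5 * X[:, 1]
y = (score >= np.percentile(score, 90)).astype(int)

forest = RandomForestClassifier(
    n_estimators=200,      # number of trees in the ensemble
    max_features="sqrt",   # random feature subset considered per split
    bootstrap=True,        # each tree sees a random bootstrap sample
    random_state=0,
).fit(X, y)
```

Because each tree sees different rows and chooses among different features at each split, no single parameter can dominate every tree, which is exactly the overfitting safeguard described above.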

Random forest hyperparameters govern choices such as how large a random subset of features to consider at each split and how deep a tree to build before classifying the leaf nodes. To determine the best hyperparameters for the random forest, we performed a 5-fold cross-validated grid search. With a 5-fold cross-validation setup (see figure below), we split the training dataset into five subsets, or folds. For a given set of hyperparameters, we fit five random forests, each time holding out a different fold as the testing dataset and training on the other four. We then take the mean accuracy across the five held-out folds as the cross-validation score and choose the hyperparameter set with the best score when building the forests. This cross-validated grid search also helps reduce overfitting to the training data.
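A 5-fold cross-validated grid search might look as follows with scikit-learn. The hyperparameter grids and the toy data here are illustrative assumptions, not the grids used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy stand-in data: 5 features, binary "high-end" label.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = (X[:, 0] > 1.0).astype(int)

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={                       # illustrative grids only
        "max_depth": [3, 6],           # tree depth before leaves form
        "max_features": [2, "sqrt"],   # feature-subset size per split
    },
    cv=5,                  # 5-fold cross-validation
    scoring="accuracy",    # mean held-out accuracy per combination
).fit(X, y)

print(search.best_params_, search.best_score_)
```

`GridSearchCV` fits every hyperparameter combination on each of the five train/test fold assignments and retains the combination with the highest mean held-out accuracy.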

Figure adapted from Hough and Wong (2021) under CC BY-NC-SA 4.0 license.

Using the classification data, we construct a random forest for each 5-year increment from 2020 to 2150 under each of the two radiative forcing scenarios, RCP2.6 and RCP8.5. For each forest, we compute the relative feature importances, namely the Gini importances of each BRICK parameter. The Gini importance is the normalized reduction in node impurity achieved by a parameter, averaged over all splits on that parameter across the trees of the forest. Hence, the more effectively a BRICK parameter splits the data, the higher its relative feature importance.
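Gini importances are available directly from a fitted scikit-learn forest. In this sketch the data are synthetic and the feature names are stand-ins for BRICK parameters; the construction deliberately makes the first feature the dominant driver.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data in which feature 0 carries most of the signal.
rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + 0.3 * X[:, 2] > 1.2).astype(int)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# feature_importances_ holds the normalized mean decrease in Gini
# impurity attributable to each feature; the values sum to 1.
names = ["param_A", "param_B", "param_C", "param_D"]  # stand-in names
for name, imp in zip(names, forest.feature_importances_):
    print(f"{name}: {imp:.3f}")
```

Repeating this for the forest fitted at each 5-year increment gives the time series of relative importances shown in the results below.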

Characterizing drivers of sea-level hazards over time

The figure below shows the relative feature importances of the BRICK model parameters, calculated from the fitted random forests. Equilibrium climate sensitivity (ECS, dark blue boxes in the figure below) and the aerosol scaling factor (𝛼DOECLIM, dark blue stippled boxes) are consistently associated with the greatest high-end sea-level risk under both RCP2.6 and RCP8.5. Both ECS and 𝛼DOECLIM belong to the climate component of the BRICK model. ECS in particular is a parameter well known to play an important role in developing policies for characterizing and managing climate risks – it is defined as the equilibrium increase in global mean surface temperature that results from doubling the atmospheric CO2 concentration relative to pre-industrial conditions.

Figure adapted from Hough and Wong (2021) under CC BY-NC-SA 4.0 license.

The near-term sea-level hazard in both RCP scenarios is driven primarily by thermal expansion (𝛼TE, black boxes in the figure above). This parameter’s importance for classifying (non-)high-end GMSL starts in 2020 and diminishes by the middle of the 21st century. This holds for both high- and low-radiative-forcing scenarios, which highlights the hazard from committed sea-level rise due to warming that has already occurred.

As for the long-term risk, ice loss from the Antarctic ice sheet is a driver in both radiative forcing scenarios. Tcrit is the temperature associated with the onset of fast dynamical disintegration of the Antarctic ice sheet. In RCP8.5, Tcrit is important from 2040 to 2060, which is consistent with predictions that major Antarctic ice sheet disintegration will occur between 2040 and 2070 in high radiative forcing scenarios (Kopp et al., 2017; Nauels et al., 2017; Wong et al., 2017a; DeConto et al., 2021). In addition to the Antarctic ice sheet, ice loss from the Greenland ice sheet also poses a long-term risk under the RCP8.5 scenario. Interestingly, in the low-forcing RCP2.6 scenario, the Antarctic ice sheet critical temperature, Tcrit, also plays an important role in classifying high-end GMSL. This demonstrates that even with strict reductions in radiative forcing (stemming from greenhouse gas emissions reductions), fast dynamical ice loss from Antarctica is possible and can play a key role in producing the highest-end GMSL scenarios.

Our results demonstrate that climate risk management strategies must address near-term actions to mitigate near-term risks.  At the same time, risk management strategies must also guard against the long-term risks driven by mass loss from the major ice sheets. Indeed, our results for the parameter importance of Tcrit demonstrate that even with sharp reductions in greenhouse gas emissions, (i) we may well see fast ice loss from the major ice sheets and (ii) these ice sheets drive the worst outcomes in terms of sea-level projections.

While our work centered on the impact of climate model parametric uncertainties on sea-level risk, the same machine learning approaches can be generalized to include the socioeconomic uncertainties that relate future climate hazards to financial and human risks. This would provide a more holistic view of the uncertainties affecting future climate risk and an avenue to connect uncertainties across coupled model components.


DeConto, R. M., Pollard, D., Alley, R. B., Velicogna, I., Gasson, E., Gomez, N., Sadai, S., Condron, A., Gilford, D. M., Ashe, E. L., Kopp, R. E., Li, D., and Dutton, A.: The Paris Climate Agreement and future sea-level rise from Antarctica, Nature, 593, 83–89, 2021.

Hough, A., and Wong, T. E.: Analysis of the Evolution of Parametric Drivers of High-End Sea-Level Hazards, arXiv:2106.12041 [physics], 2021.

Kopp, R. E., DeConto, R. M., Bader, D. A., Hay, C. C., Horton, R. M., Kulp, S., Oppenheimer, M., Pollard, D., and Strauss, B. H.: Evolving Understanding of Antarctic Ice-Sheet Physics and Ambiguity in Probabilistic Sea-Level Projections, Earth’s Future, 5, 1217–1233, 2017.

Moss, R. H., Edmonds, J. A., Hibbard, K. A., Manning, M. R., Rose, S. K., van Vuuren, D. P., Carter, T. R., Emori, S., Kainuma, M., Kram, T., Meehl, G. A., Mitchell, J. F. B., Nakicenovic, N., Riahi, K., Smith, S. J., Stouffer, R. J., Thomson, A. M., Weyant, J. P., and Wilbanks, T. J.: The next generation of scenarios for climate change research and assessment, Nature, 463, 747–756, 2010.

Nauels, A., Rogelj, J., Schleussner, C.-F., Meinshausen, M., and Mengel, M.: Linking sea level rise and socioeconomic indicators under the Shared Socioeconomic Pathways, Environmental Research Letters, 12, 114002, 2017.

Wong, T. E., Bakker, A. M. R., and Keller, K.: Impacts of Antarctic fast dynamics on sea-level projections and coastal flood defense, Climatic Change, 144, 347–364, 2017a.

Wong, T. E., Bakker, A. M. R., Ruckert, K., Applegate, P., Slangen, A. B. A., and Keller, K.: BRICK v0.2, a simple, accessible, and transparent model framework for climate and regional sea-level projections, Geoscientific Model Development, 10, 2741–2760, 2017b.
