1 Introduction

Almost a decade after the introduction of complex network methods into climate science (Tsonis and Roebber 2004), network-based model validation techniques are still few and far between. Climate networks associate geographic locations with the nodes (also called vertices) of a mathematical object called a network or graph (Newman 2009). The connections between nodes (called links or edges) represent similarities between climatological time series at those locations, derived mostly from reanalyses or remote sensing data. The mathematical field of complex network theory has thrived over the past decades (Strogatz 2001) and now offers a variety of methods to uncover different aspects of the topological structure of networks (Newman 2003; Cohen and Havlin 2008).
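To make the construction concrete, the following minimal sketch builds an unweighted network from a set of time series. It assumes absolute Pearson correlation as the similarity measure and a fixed link density as the thresholding rule; both choices, as well as the names `climate_network` and `link_density`, are illustrative assumptions and not necessarily the ones used in this study.

```python
import numpy as np

def climate_network(series, link_density=0.05):
    """Build a boolean adjacency matrix from time series.

    series: array of shape (n_nodes, n_time), one time series per location.
    link_density: fraction of node pairs to connect (illustrative parameter).
    """
    # Similarity between every pair of locations (assumed: |Pearson correlation|).
    sim = np.abs(np.corrcoef(series))
    np.fill_diagonal(sim, 0.0)

    # Keep only the strongest similarities so that roughly the requested
    # fraction of all node pairs is linked.
    upper = np.triu_indices_from(sim, k=1)
    threshold = np.quantile(sim[upper], 1.0 - link_density)

    adjacency = sim >= threshold
    np.fill_diagonal(adjacency, False)  # no self-loops
    return adjacency

# Toy usage with random data standing in for climatological time series.
rng = np.random.default_rng(0)
toy_series = rng.standard_normal((20, 200))
toy_adjacency = climate_network(toy_series, link_density=0.1)
```

Fixing the link density rather than the similarity threshold keeps the number of links equal across data sets, which is one common way to make networks from different sources comparable.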

Applied to the climate system, these methods have already led to several substantial new insights: from the identification of dynamical transitions (Tsonis et al. 2007) and teleconnections (Donges et al. 2009b), via the study of El Niño (Yamasaki et al. 2008; Gozolchiani et al. 2011) and monsoon systems (Malik et al. 2011; Boers et al. 2013), to actual predictive power (Ludescher et al. 2013). Data-based climate networks have thus proven remarkably versatile. The model-based branch of the framework is far less developed, although recent attempts by Steinhaeuser and Tsonis (2013) and Fountalis et al. (2013) show the growing interest in the use of network methods for climate model intercomparison and climate model analysis (van der Mheen et al. 2013).

From a theoretical point of view, evaluating the network structure of modeled climate data constitutes a promising extension to conventional evaluation techniques like the comparison of annual cycles or seasonal means. While the latter approaches investigate properties of time series at each geographical location individually, climate networks describe their covariability and thus represent an essentially different aspect of spatio-temporal climate variability.

The intention of this paper is to propose a new method to evaluate climate models by means of complex networks. For this purpose, we compare the structure of climate networks obtained from models with that of networks obtained from reanalysis data (the reference networks). The differences between them are considered proxies for the quality of the underlying modeling, here the simulation of the South American climate as performed by a dynamical (CCLM) and a statistical (STARS) regional climate model (RCM). This work should be seen both as an introduction to the methodology and as a case study on the feasibility of applying CCLM and STARS to South American climate.

With our methodology, we aim at a direct comparison of the network structure, which has the advantage of including all available information about the complex networks under study. Any kind of preprocessing of the networks, e.g. a clustering of nodes, bears the inherent danger of adding spurious information or diminishing the complexity of the network structure, possibly stripping it of relevant features.

One of the applied network difference measures is a modification of the Hamming distance, which is rooted in information theory (Hamming 1950) and has found plenty of applications, often in combination with complex network analysis (Donges et al. 2009a; Zhou et al. 2006; Ciliberti et al. 2007). We also compare the clustering structures (Watts and Strogatz 1998) of the observed and modeled networks by computing the root-mean-square distance of their respective fields of local clustering coefficients in order to evaluate the recreation of nonlinear dependencies by the models.
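The two difference measures can be illustrated with a minimal sketch for unweighted, undirected networks given as adjacency matrices. It uses the plain Hamming distance (the fraction of node pairs whose link status differs) and unweighted local clustering coefficients. The modification of the Hamming distance used in this study is not reproduced here, and the function names are illustrative assumptions.

```python
import numpy as np

def hamming_distance(a, b):
    """Fraction of node pairs that are linked in one network but not the other."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    upper = np.triu_indices(a.shape[0], k=1)  # each undirected pair once
    return np.mean(a[upper] != b[upper])

def local_clustering(a):
    """Local clustering coefficient of every node (0 for degree < 2)."""
    a = np.array(a, dtype=float)  # copy, so the input is left untouched
    np.fill_diagonal(a, 0.0)
    degree = a.sum(axis=1)
    # diag(A^3) counts closed walks of length 3, i.e. twice the triangles.
    triangles = np.diag(a @ a @ a) / 2.0
    possible = degree * (degree - 1) / 2.0
    coeff = np.zeros_like(degree)
    valid = possible > 0
    coeff[valid] = triangles[valid] / possible[valid]
    return coeff

def rms_clustering_distance(a, b):
    """Root-mean-square distance between two fields of local clustering."""
    diff = local_clustering(a) - local_clustering(b)
    return np.sqrt(np.mean(diff ** 2))

# Toy example: the "model" network misses the link 0-2 of the "reference".
reference = np.array([[0, 1, 1, 0],
                      [1, 0, 1, 0],
                      [1, 1, 0, 1],
                      [0, 0, 1, 0]])
model = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])
d_hamming = hamming_distance(reference, model)
d_clustering = rms_clustering_distance(reference, model)
```

In this toy case a single missing link changes one of six node pairs, but it removes the only triangle and therefore alters the clustering coefficients of three nodes, which shows why the two measures capture different aspects of the network structure.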

It should be noted that, comparing the spatial statistical interdependency structure within climatological fields, our method is related to approaches based on empirical orthogonal functions and teleconnection patterns (Handorf and Dethloff 2012; Stoner et al. 2009), yet distinct due to the inclusion of information about nonlinear interrelations (Donges et al.

Fig. 1

Basic principle of STARS: at first (top panel), entire years from the observation period are resampled for their yearly means (red dots) to approximate a prescribed trend line (blue). Then, by iteratively replacing 12-day blocks (bottom panel), the resulting time series is further tuned to improve the matching of the actual (red dot) and prescribed (blue dot) yearly mean values

Fig. 2

CCLM’s domain of computation including the sponge frame (colored), the CORDEX-South-America domain (dotted), and the common domain of evaluation (dashed). Colors indicate surface height

2.2 The dynamical approach: CCLM

The COSMO-CLM (CCLM, Rockel et al. 2008) is the climate version of the COSMO-Model (Baldauf et al. 2011), which is the operational numerical weather prediction model of the German Weather Service and other members of the COSMO consortium. The development of CCLM is steered by the CLM Community, which has more than 50 member institutions from Europe, Asia, Africa and America. The model has been extensively applied to European domains (e.g. Jaeger et al. 2008; Zahn and Storch 2008; Hohenegger et al. 2009; Davin and Seneviratne 2011) but also to the Indian subcontinent (Dobler and Ahrens 2010), to CORDEX-East-Asia (Fischer et al. 2013), and to CORDEX-Africa (Nikulin et al. 2012). One of the very first applications was to South America (Böhm et al. 2003), but the model has rarely been run there since (Rockel and Geyer 2008; Wagner et al. 2011). CCLM is dynamical in the sense that it solves thermohydrodynamical equations describing the atmospheric circulation. The equations are discretized on a three-dimensional grid based on a rotated geographical coordinate system.

In this study, CCLM version 4.25.3 was used. Deviating from its default configuration, the model was run with 40 vertical levels reaching up to 30 km above sea level and with a Rayleigh damping height of 18 km, as has been suggested for tropical regions (Panitz et al. 2013). We set the bottom of the deepest hydrologically active soil layer to 8 m, since rain forest roots go down to such depths (Baker et al. 2008). The numerical integration was performed with a total variation diminishing Runge–Kutta scheme (Liu et al. 1994) and a Bott advection scheme (Bott 1989), since both are considered more accurate than their default alternatives. We employed an implementation of the ECMWF IFS Cy33r1 convection scheme (Bechtold et al. 2008) and diagnosed subgrid-scale clouds by a normalized saturation deficit criterion (Sommeria and Deardorff 1977). Additionally, a few tuning parameters were adjusted during preceding sensitivity experiments. In particular, changing the convective parametrization and the subgrid-scale cloud scheme led to major improvements of the model performance over South America. These findings are presented in detail in a separate paper (Lange et al. 2014).

We run the model on the CORDEX-South-America domain (Giorgi et al. 2009) as displayed in Fig. 2. This implies a horizontal resolution of 0.44° and 166 × 187 grid points, including a sponge frame that is 10 grid points wide. The simulation covers the years 1979–2011, of which the first 17 years (1979–1995) serve as spinup time, since the STARS output is only available from 1996.

2.3 Common domain of evaluation

In order to construct climate networks of the same spatial embedding, both model outputs had to match in resolution and geographical boundaries. We chose a section of the native ERA-Interim grid, encompassing the South American mainland (Fig. 2): 82.3°W–33.8°W and 13.7°N–55.8°S. The resolution is approximately 0.7° in both latitude \(\phi\) and longitude \(\lambda\). This makes for a bounding box of \(N = N_\phi \times N_\lambda = 100 \times 70 = 7{,}000\) grid cells, which will be represented by nodes in the subsequently constructed climate networks. Since this is a regular Gaussian grid of considerable latitudinal extent, grid cells at different latitudes represent differently sized areas (about \(78\,\text {km} \times 78\,\text {km} = 6{,}084\,\text {km}^2\) at the equator and \(44\,\text {km} \times 78\,\text {km} = 3{,}432\,\text {km}^2\) at the southern boundary), an effect we take into account by introducing area-proportional node weights (cf. Sect. 3.3).
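The latitude dependence of the cell areas can be reproduced with a short sketch. It assumes evenly spaced cell centres across the stated bounding box (a true Gaussian grid has slightly uneven latitude spacing), a cell edge of 78 km at the equator as quoted above, and a zonal width that shrinks with the cosine of latitude. The normalised weights are only one plausible form of area-proportional node weights; the definition actually used is the one given in Sect. 3.3.

```python
import numpy as np

# Assumption: evenly spaced cell centres spanning the stated bounding box.
n_lat, n_lon = 100, 70
lats = np.linspace(13.7, -55.8, n_lat)   # degrees north, ordered north to south
lons = np.linspace(-82.3, -33.8, n_lon)  # degrees east

edge_km = 78.0  # approximate cell edge at the equator, as quoted in the text

# The zonal width shrinks with cos(latitude); the meridional height is
# treated as constant.
width_km = edge_km * np.cos(np.radians(lats))
area_km2 = width_km * edge_km  # one value per latitude row

# Area-proportional node weights, one per grid cell (row-major order,
# latitude varying slowest), normalised to sum to 1.
weights = np.repeat(area_km2, n_lon)
weights = weights / weights.sum()
```

With these assumptions the zonal width at the southern boundary comes out close to the 44 km quoted in the text, so a node there carries a little more than half the weight of a node near the equator.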

Since STARS only resamples the input data, its output is already on the native ERA-Interim grid. CCLM output was remapped to this grid, conservatively (Jones 1999) in the case of precipitation and bilinearly otherwise.