R-netCDF

Analysis and Visualization of netCDF files with R

The following R code inputs Daymet netCDF climate grids and computes raster surfaces of a bioclimatic variable called Post-Snow Growing Degree-Days (PSGDD5). PSGDD5 is a bioclimatic variable that quantifies the length and intensity of the growing season, where the growing season is defined by days of the year when daily average temperature exceeds 5°C and no snowpack is present.

Daymet is a 1km resolution gridded climate model of daily values covering much of North America (Thornton et al. 1997, Thornton et al. 2012). Tiles are available in netCDF format here: http://daymet.ornl.gov/dataaccess

The following example uses a mosaic of 4 Daymet tiles to cover the Olympic Peninsula, Washington. NetCDF files containing daily minimum temperature (tmin), daily maximum temperature (tmax), and daily snow-water equivalent (swe) are used to calculate PSGDD5.

http://daymet.ornl.gov/Tools provides many useful tools for processing Daymet files. Two were used in this example prior to executing the following R code.

Downloading netCDF Files from THREDDS Server

Joining Daymet Tiles with NCL

Setup and define global variables

First, load the required libraries.

library(ncdf4)
library(RColorBrewer)
library(lattice)
library(raster)
library(rasterVis)
library(proj4)

Set a working directory where tmax, tmin, and swe netCDF files are located.

setwd("~/Git/GDD/OPtiles")

Open a typical daymet netCDF named swe.OP.1980.nc in order to grab some global dimensions.

dir <- "~/Git/GDD/OPtiles/"
year <- "1980"
ncname <- "swe"
ncfname <- paste(dir, ncname,".OP.", year, ".nc", sep="")
ncin <- nc_open(ncfname, write=F)

View header information such as variable names and dimensions and coordinate system.

print(ncin)

Get x and y coordinates and day-of-year dimension and store them as array objects. Then close the netCDF file.

x <- ncvar_get(ncin,"x")
y <- ncvar_get(ncin,"y")
yearday <- ncvar_get(ncin,"yearday")
nc_close(ncin)

Processing one year

The dayment netCDF files have a timespan of one calendar year. Processing is done on year 1980 as an example, and then for each year of the 1980’s decade later on, using the same functions.

Here is a function that will be used to open daymet netCDF files and return the climate variable as a vector object.

getNCDFClimate <- function(var, year){ # Two arguments define the climate variable and year of the netCDF
ncfname <- paste(dir, var,".OP.", year, ".nc", sep="")
ncin <- nc_open(ncfname, write=T)
var.array <- ncvar_get(ncin,var) # get the climate variable values
nc_close(ncin)
var.vec.long <- as.vector(var.array) # convert from array to vector
return(var.vec.long)
}

Call the getNCDFClimate() function to return vectors of tmin, tmax, and swe variables.

vec.tmax <- getNCDFClimate("tmax", "1980")
vec.tmin <- getNCDFClimate("tmin", "1980")
vec.swe <- getNCDFClimate("swe", "1980")

Calculate a vector of average daily tempeartures from tmax and tmin.

vec.tmean <- (vec.tmax + vec.tmin)/2

Create matrices from vec.tmean and vec.swe where each row is a unique grid cell and each columns is a day of the year.

ncells <- dim(x)*dim(y)
nyearday <- dim(yearday)
tmean.mat <- matrix(vec.tmean, nrow=ncells, ncol=nyearday, byrow=F)
swe.mat <- matrix(vec.swe, nrow=ncells, ncol=nyearday, byrow=F)

For each cell in the snow-water equivalent matrix (swe.mat), get an index for the days when there is no snow. This function is designed to be applied over each row of the matrix.

getNoSnow <- function(x){
nosnowdays <- which(x<10) ## choose a value for no snow (kg/m2 = mm)
return(nosnowdays)
}

The getNoSnow() function is called within an apply() function and operates on every row (MARGIN=1) of the input matrix.

ns.index 

The apply() function returns a list with an element for each row of the matrix. Values in this list element refer to the days of the year when there is no snow in that cell. For example, here is the 5000th element of the list.

ns.index[[5000]]

 

  [1] 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219
 [18] 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236
 [35] 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253
 [52] 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270
 [69] 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287
 [86] 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304
[103] 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321
[120] 322 323 324 325 326 327 328

To complement this list of no-snow indices, split the mean daily temperature matrix (tmean.mat) into a list, with each row as an element. Come back in 10 minutes.

tmean.list 

Now, a function that calculates Post-snow growing degree-days above a 5°C base.

PSGDD5 <- function(x, indices){
ns.tmeans <- x[indices] # subset only the no-snow days from a vector of tmean values
ns.gdd5 <- ns.tmeans[ns.tmeans>5] - 5 # for days with no-snow and tmean > 5°C, calculate degrees above 5°C
ann.ns.gdd5 <- sum(ns.gdd5) # sum these "degree-days" over the year
return(ann.ns.gdd5)
}

The PSGDD5() function is called with the mapply() function so that it can simultaneously receive each element, in order, from the tmean.list and ns.index lists.

psgdd5 <- mapply(PSGDD5, tmean.list, ns.index)

Plotting Results

X and Y coordinates for our tile were retrieved at the start. They are paired together in a dataframe with expand.grid() and then combined with the vector to be plotted.

grid <- expand.grid(xlon=x, ylat=y)
xyz.psgdd5 <- cbind(grid, psgdd5)
head(xyz.psgdd5)

 

      xlon    ylat psgdd5
1 -1787500 1072500      0
2 -1786500 1072500      0
3 -1785500 1072500      0
4 -1784500 1072500      0
5 -1783500 1072500      0
6 -1782500 1072500      0

 

xyz.tmean <- cbind(grid, tmean.mat[,92]) # xyz for plotting tmean on April 1st
xyz.swe <- cbind(grid, swe.mat[,92]) # and for plotting swe

This constructed an xyz object which can easily be converted to a raster object with the raster package. First a projected coordinate system is defined in the PROJ4 syntax, based on info from the netCDF header.

pj <- "+proj=lcc +lat_1=25.00 +lat_2=60.00 +lat_0=42.5 +lon_0=-100.00 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs"

rast.psgdd5 <- rasterFromXYZ(xyz.psgdd5, crs=pj) # create a raster object
rast.tmean <- rasterFromXYZ(xyz.tmean, crs=pj)
rast.swe <- rasterFromXYZ(xyz.swe, crs=pj) 

Plot maps of the annual PSGDD5 variable and its component parts, tmean and swe. Three plots are created with the levelplot() from the rasterVis package, then arranged on a page with grid.arrange() from the gridExtra package

cuttmp <- c(-30,-20,-10,-5,-2,0,2,5,10,20,30) # define break points for color scale
colkey <- list(at=cuttmp, labels=list(at=cuttmp)) # colorkey helps put legend labels at correct break points
p1 <- levelplot(rast.tmean, margin=FALSE, colorkey=colkey, at=cuttmp, cuts=12, col.regions=rev(brewer.pal(11,"RdBu")), main=list(label="Mean Temperature (April 1, 1980)", cex=.8), scales=list(draw=FALSE))

cutswe <- c(0,10,50,100,250,500,1000)
colkey <- list(at=cutswe, labels=list(at=cutswe))
p2 <- levelplot(rast.swe, margin=FALSE, colorkey=colkey, at=cutgdd, cuts=8, col.regions=rev(brewer.pal(7,"Blues")), main=list(label="SWE (mm) April 1, 1980", cex=.8), scales=list(draw=FALSE))

cutgdd <- c(0,50,200,400,600,800,1000,1500,2000,3000)
colkey <- list(at=cutgdd, labels=list(at=cutgdd))
p3 <- levelplot(rast.psgdd5, margin=FALSE, colorkey=colkey, at=cutgdd, cuts=10, col.regions=(brewer.pal(9,"PuBuGn")), main=list(label="Annual PSGDD5 (1980)", cex=.8), scales=list(draw=FALSE))

grid.arrange(p1, p2, p3, ncol=3)

unnamed-chunk-17

Batch Processing to calculate a long-term mean

Now that I have stepped through the process of calculating PSGDD5 for one year, I will calculate it for each year of a decade, and then generate a decadal average PSGDD5 grid.

Create a list to store each year’s PSGDD5 calculation. The vector decade contains names of years and will be used to read files.

psgdd5.8012 <- list()
decade <- seq(1980, 2012, by=1)

Loop through each year in the decade calling the functions defined earlier. The final result for each store is stored in the [[y]] position of the psgdd5.80s list.

for (y in 1:length(decade)){
vec.tmax <- getNCDFClimate("tmax", decade[y])
vec.tmin <- getNCDFClimate("tmin", decade[y])
vec.swe <- getNCDFClimate("swe", decade[y])

vec.tmean <- (vec.tmax + vec.tmin)/2

tmean.mat <- matrix(vec.tmean, nrow=ncells, ncol=nyearday, byrow=F)
swe.mat <- matrix(vec.swe, nrow=ncells, ncol=nyearday, byrow=F)

ns.index <- apply(swe.mat, MARGIN=1, FUN=getNoSnow)
tmean.list <- split(tmean.mat, row(tmean.mat))
psgdd5.80s[[y]] <- mapply(PSGDD5, tmean.list, ns.index)
}

Convert the list of PSGDD5 years to a dataframe and calculate the decadal average as the mean of each row in the dataframe.

psgdd5.80s.df <- do.call(cbind.data.frame, psgdd5.80s) ## each column is a year
names(psgdd5.80s.df) <- as.character(decade)
psgdd5.80s.df$mean <- rowMeans(psgdd5.80s.df)

Create a raster object of the decadal average PSGDD5 values.

xyz.mean <- cbind(grid, psgdd5_8089$mean)
rast.mean <- rasterFromXYZ(xyz.mean, crs=pj)

And a stack of PSGDD5 rasters for each individual year.

stk <- stack()
for (i in 1:10{
xyz <- cbind(grid, psgdd5_8089[,i])
rast <- rasterFromXYZ(xyz, crs=pj)
stk <- stack(stk, rast)
}
names(stk) <- decade

Define some X and Y values for a plotting extent.

ext.x <- c(-1750000, -1590000)
ext.y <- c(670000, 910000)

Plot the decadal average raster.

levelplot(rast.mean, margin=FALSE, colorkey=colkey, at=cutgdd, xlim=ext.x, ylim=ext.y, cuts=10, col.regions=(brewer.pal(9,"PuBuGn")), main=list(label="1980's Decadal Average PSGDD5", cex=1))

unnamed-chunk-20

Plot a stack of rasters to examine the variable at each year.

levelplot(stk, scales=list(draw=FALSE), at=cutgdd, xlim=ext.x, ylim=ext.y, cuts=10, col.regions=(brewer.pal(9,"PuBuGn")), main=list(label="1980-1989 yearly PSGDD5", cex=1))

unnamed-chunk-21

References:

Thornton, P.E., S.W. Running, and M.A. White. 1997. Generating surfaces of daily meteorology variables over large regions of complex terrain. Journal of Hydrology 190:214-251.

Thornton, PE, MM Thornton, BW Mayer, N Wilhelmi, Y Wei, RB Cook 2012. Daymet:Daily surface weather on a 1 km grid for North America,1980-2011. Acquired online (http://daymet.ornl.gov/) on 05/01/2013 from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. doi:10.3334/ORNLDAAC/Daymet_V2.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s