Analysis and Visualization of netCDF files with R
The following R code inputs Daymet netCDF climate grids and computes raster surfaces of a bioclimatic variable called Post-Snow Growing Degree-Days (PSGDD5). PSGDD5 is a bioclimatic variable that quantifies the length and intensity of the growing season, where the growing season is defined by days of the year when daily average temperature exceeds 5°C and no snowpack is present.
Daymet is a 1km resolution gridded climate model of daily values covering much of North America (Thornton et al. 1997, Thornton et al. 2012). Tiles are available in netCDF format here: http://daymet.ornl.gov/dataaccess
The following example uses a mosaic of 4 Daymet tiles to cover the Olympic Peninsula, Washington. NetCDF files containing daily minimum temperature (tmin), daily maximum temperature (tmax), and daily snow-water equivalent (swe) are used to calculate PSGDD5.
http://daymet.ornl.gov/Tools provides many useful tools for processing Daymet files. Two were used in this example prior to executing the following R code.
Downloading netCDF Files from THREDDS Server
Setup and define global variables
First, load the required libraries.
library(ncdf4) library(RColorBrewer) library(lattice) library(raster) library(rasterVis) library(proj4)
Set a working directory where tmax, tmin, and swe netCDF files are located.
setwd("~/Git/GDD/OPtiles")
Open a typical daymet netCDF named swe.OP.1980.nc in order to grab some global dimensions.
dir <- "~/Git/GDD/OPtiles/" year <- "1980" ncname <- "swe" ncfname <- paste(dir, ncname,".OP.", year, ".nc", sep="") ncin <- nc_open(ncfname, write=F)
View header information such as variable names and dimensions and coordinate system.
print(ncin)
Get x and y coordinates and day-of-year dimension and store them as array objects. Then close the netCDF file.
x <- ncvar_get(ncin,"x") y <- ncvar_get(ncin,"y") yearday <- ncvar_get(ncin,"yearday") nc_close(ncin)
Processing one year
The dayment netCDF files have a timespan of one calendar year. Processing is done on year 1980 as an example, and then for each year of the 1980’s decade later on, using the same functions.
Here is a function that will be used to open daymet netCDF files and return the climate variable as a vector object.
getNCDFClimate <- function(var, year){ # Two arguments define the climate variable and year of the netCDF ncfname <- paste(dir, var,".OP.", year, ".nc", sep="") ncin <- nc_open(ncfname, write=T) var.array <- ncvar_get(ncin,var) # get the climate variable values nc_close(ncin) var.vec.long <- as.vector(var.array) # convert from array to vector return(var.vec.long) }
Call the getNCDFClimate()
function to return vectors of tmin, tmax, and swe variables.
vec.tmax <- getNCDFClimate("tmax", "1980") vec.tmin <- getNCDFClimate("tmin", "1980") vec.swe <- getNCDFClimate("swe", "1980")
Calculate a vector of average daily tempeartures from tmax and tmin.
vec.tmean <- (vec.tmax + vec.tmin)/2
Create matrices from vec.tmean
and vec.swe
where each row is a unique grid cell and each columns is a day of the year.
ncells <- dim(x)*dim(y) nyearday <- dim(yearday) tmean.mat <- matrix(vec.tmean, nrow=ncells, ncol=nyearday, byrow=F) swe.mat <- matrix(vec.swe, nrow=ncells, ncol=nyearday, byrow=F)
For each cell in the snow-water equivalent matrix (swe.mat
), get an index for the days when there is no snow. This function is designed to be applied over each row of the matrix.
getNoSnow <- function(x){ nosnowdays <- which(x<10) ## choose a value for no snow (kg/m2 = mm) return(nosnowdays) }
The getNoSnow()
function is called within an apply()
function and operates on every row (MARGIN=1
) of the input matrix.
ns.index
The apply()
function returns a list with an element for each row of the matrix. Values in this list element refer to the days of the year when there is no snow in that cell. For example, here is the 5000th element of the list.
ns.index[[5000]]
[1] 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 [18] 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 [35] 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 [52] 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 [69] 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 [86] 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 [103] 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 [120] 322 323 324 325 326 327 328
To complement this list of no-snow indices, split the mean daily temperature matrix (tmean.mat
) into a list, with each row as an element. Come back in 10 minutes.
tmean.list
Now, a function that calculates Post-snow growing degree-days above a 5°C base.
PSGDD5 <- function(x, indices){ ns.tmeans <- x[indices] # subset only the no-snow days from a vector of tmean values ns.gdd5 <- ns.tmeans[ns.tmeans>5] - 5 # for days with no-snow and tmean > 5°C, calculate degrees above 5°C ann.ns.gdd5 <- sum(ns.gdd5) # sum these "degree-days" over the year return(ann.ns.gdd5) }
The PSGDD5() function is called with the mapply() function so that it can simultaneously receive each element, in order, from the tmean.list
and ns.index
lists.
psgdd5 <- mapply(PSGDD5, tmean.list, ns.index)
Plotting Results
X and Y coordinates for our tile were retrieved at the start. They are paired together in a dataframe with expand.grid()
and then combined with the vector to be plotted.
grid <- expand.grid(xlon=x, ylat=y) xyz.psgdd5 <- cbind(grid, psgdd5) head(xyz.psgdd5)
xlon ylat psgdd5 1 -1787500 1072500 0 2 -1786500 1072500 0 3 -1785500 1072500 0 4 -1784500 1072500 0 5 -1783500 1072500 0 6 -1782500 1072500 0
xyz.tmean <- cbind(grid, tmean.mat[,92]) # xyz for plotting tmean on April 1st xyz.swe <- cbind(grid, swe.mat[,92]) # and for plotting swe
This constructed an xyz object which can easily be converted to a raster object with the raster
package. First a projected coordinate system is defined in the PROJ4 syntax, based on info from the netCDF header.
pj <- "+proj=lcc +lat_1=25.00 +lat_2=60.00 +lat_0=42.5 +lon_0=-100.00 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs" rast.psgdd5 <- rasterFromXYZ(xyz.psgdd5, crs=pj) # create a raster object rast.tmean <- rasterFromXYZ(xyz.tmean, crs=pj) rast.swe <- rasterFromXYZ(xyz.swe, crs=pj)
Plot maps of the annual PSGDD5 variable and its component parts, tmean and swe. Three plots are created with the levelplot()
from the rasterVis
package, then arranged on a page with grid.arrange()
from the gridExtra
package
cuttmp <- c(-30,-20,-10,-5,-2,0,2,5,10,20,30) # define break points for color scale colkey <- list(at=cuttmp, labels=list(at=cuttmp)) # colorkey helps put legend labels at correct break points p1 <- levelplot(rast.tmean, margin=FALSE, colorkey=colkey, at=cuttmp, cuts=12, col.regions=rev(brewer.pal(11,"RdBu")), main=list(label="Mean Temperature (April 1, 1980)", cex=.8), scales=list(draw=FALSE)) cutswe <- c(0,10,50,100,250,500,1000) colkey <- list(at=cutswe, labels=list(at=cutswe)) p2 <- levelplot(rast.swe, margin=FALSE, colorkey=colkey, at=cutgdd, cuts=8, col.regions=rev(brewer.pal(7,"Blues")), main=list(label="SWE (mm) April 1, 1980", cex=.8), scales=list(draw=FALSE)) cutgdd <- c(0,50,200,400,600,800,1000,1500,2000,3000) colkey <- list(at=cutgdd, labels=list(at=cutgdd)) p3 <- levelplot(rast.psgdd5, margin=FALSE, colorkey=colkey, at=cutgdd, cuts=10, col.regions=(brewer.pal(9,"PuBuGn")), main=list(label="Annual PSGDD5 (1980)", cex=.8), scales=list(draw=FALSE)) grid.arrange(p1, p2, p3, ncol=3)
Batch Processing to calculate a long-term mean
Now that I have stepped through the process of calculating PSGDD5 for one year, I will calculate it for each year of a decade, and then generate a decadal average PSGDD5 grid.
Create a list to store each year’s PSGDD5 calculation. The vector decade
contains names of years and will be used to read files.
psgdd5.8012 <- list() decade <- seq(1980, 2012, by=1)
Loop through each year in the decade calling the functions defined earlier. The final result for each store is stored in the [[y]]
position of the psgdd5.80s
list.
for (y in 1:length(decade)){ vec.tmax <- getNCDFClimate("tmax", decade[y]) vec.tmin <- getNCDFClimate("tmin", decade[y]) vec.swe <- getNCDFClimate("swe", decade[y]) vec.tmean <- (vec.tmax + vec.tmin)/2 tmean.mat <- matrix(vec.tmean, nrow=ncells, ncol=nyearday, byrow=F) swe.mat <- matrix(vec.swe, nrow=ncells, ncol=nyearday, byrow=F) ns.index <- apply(swe.mat, MARGIN=1, FUN=getNoSnow) tmean.list <- split(tmean.mat, row(tmean.mat)) psgdd5.80s[[y]] <- mapply(PSGDD5, tmean.list, ns.index) }
Convert the list of PSGDD5 years to a dataframe and calculate the decadal average as the mean of each row in the dataframe.
psgdd5.80s.df <- do.call(cbind.data.frame, psgdd5.80s) ## each column is a year names(psgdd5.80s.df) <- as.character(decade) psgdd5.80s.df$mean <- rowMeans(psgdd5.80s.df)
Create a raster object of the decadal average PSGDD5 values.
xyz.mean <- cbind(grid, psgdd5_8089$mean) rast.mean <- rasterFromXYZ(xyz.mean, crs=pj)
And a stack of PSGDD5 rasters for each individual year.
stk <- stack() for (i in 1:10{ xyz <- cbind(grid, psgdd5_8089[,i]) rast <- rasterFromXYZ(xyz, crs=pj) stk <- stack(stk, rast) } names(stk) <- decade
Define some X and Y values for a plotting extent.
ext.x <- c(-1750000, -1590000) ext.y <- c(670000, 910000)
Plot the decadal average raster.
levelplot(rast.mean, margin=FALSE, colorkey=colkey, at=cutgdd, xlim=ext.x, ylim=ext.y, cuts=10, col.regions=(brewer.pal(9,"PuBuGn")), main=list(label="1980's Decadal Average PSGDD5", cex=1))
Plot a stack of rasters to examine the variable at each year.
levelplot(stk, scales=list(draw=FALSE), at=cutgdd, xlim=ext.x, ylim=ext.y, cuts=10, col.regions=(brewer.pal(9,"PuBuGn")), main=list(label="1980-1989 yearly PSGDD5", cex=1))
References:
Thornton, P.E., S.W. Running, and M.A. White. 1997. Generating surfaces of daily meteorology variables over large regions of complex terrain. Journal of Hydrology 190:214-251.
Thornton, PE, MM Thornton, BW Mayer, N Wilhelmi, Y Wei, RB Cook 2012. Daymet:Daily surface weather on a 1 km grid for North America,1980-2011. Acquired online (http://daymet.ornl.gov/) on 05/01/2013 from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. doi:10.3334/ORNLDAAC/Daymet_V2.