Overview
Several factors contributed to extreme flooding that occured in Boulder, Colorado in 2013. In this lesson, we will explore several key variables including:
- precipipitation (rainfall) data collected by ???NDCD??
- measures of drought prior to the flooding that explain …
- increased stream discharge (or flow) that occured due to combination of the drought conditions and extreme rainfall.
###Goals / Objectives After completing this activity, you will:
###Things You’ll Need To Complete This Lesson
Please be sure you have the most current version of R and, preferably,
R studio to write your code.
####R Libraries to Install:
- ggplot2:
install.packages("ggplot2") - lubridate:
install.packages("lubridate")
####Data to Download
The LiDAR and imagery data used to create the rasters in this dataset were collected over the Harvard and San Joaquin field sites and processed at NEON headquarters. The entire dataset can be accessed by request from the NEON website.
####Recommended Pre-Lesson Reading
Lesson Series
#R Libraries
We will be working with time-series data in this lesson so we will load the lubridate
library. We will use ggplot2 to efficiently plot our data.
library(lubridate) #work with time series data
library(ggplot2) #create efficient, professional plots
library(plotly) #create interactive plots
#Soil Moisture
** we could get this from NDCD as well Boulder station w Soil Moisture: https://www.drought.gov/drought/content/tools/crn-soil-data#
CRN Station ID: 1045 LAT:40.0354 LON:-105.54090000000001 Name: BOULDER 14 W View Data
#Drought Data - the Palmer Drought Index The Palmer Drought Severity Index is an overall index of drought that is useful to understand drought associated with agriculture. It uses temperature and precipitation data and soil moisture to calculate water supply and demand.
#About downloading the data… Get the data here: http://www7.ncdc.noaa.gov/CDO/CDODivisionalSelect.jsp Metrics:
| Abbreviation | Metric |
|---|---|
| PCP | Precipitation Index |
| TAVG | Temperature Index |
| TMIN | Minimum Temperature Index |
| TMAX | Maximum Temperature Index |
| PDSI | Palmer Drought Severity Index |
| PHDI | Palmer Hydrological Drought Index |
| ZNDX | Palmer Z-Index |
| PMDI | Modified Palmer Drought Severity Index |
| CDD | Cooling Degree Days |
| HDD | Heating Degree Days |
| SPnn | Standard Precipitation Index |
01 - Ar Drainage Baisin 02 - CO Drainage Basin 03 - KS Drainage Basin 04- Platte Drainage 05 Rio Grand Stuff about the index will go here
https://www.drought.gov/drought/content/products-current-drought-and-monitoring-drought-indicators/palmer-drought-severity-index
We are interested in looking at the general state of drought in Colorado before
the 2013 floods. The data are in the drought directory of your download here:
drought/CDODiv8721376888863_CO.txt. To begin, we will read the data into R.
Our data have a header - the first row repsents the variable name for each column.
Thus we will use header=TRUE when we import the data to tell R to use that row
as a column name rather than a row of data.
#Import State Wide index data
drought <- read.csv("drought/CDODiv8721376888863_CO.txt", header = TRUE)
head(drought)
## StateCode Division YearMonth PCP TAVG PDSI PHDI ZNDX PMDI CDD HDD
## 1 5 0 199001 0.96 26.4 -3.42 -3.42 -1.69 -3.42 0 1144
## 2 5 0 199002 1.08 27.7 -3.65 -3.65 -1.74 -3.65 0 1038
## 3 5 0 199003 1.99 36.6 -3.65 -3.65 -1.12 -3.65 0 903
## 4 5 0 199004 2.08 45.1 -3.37 -3.37 -0.31 -3.37 0 609
## 5 5 0 199005 1.97 50.5 -3.02 -3.02 0.01 -2.96 0 446
## 6 5 0 199006 0.74 65.3 -3.59 -3.59 -2.64 -3.59 97 73
## SP01 SP02 SP03 SP06 SP09 SP12 SP24 TMIN TMAX X
## 1 -0.03 -0.59 -1.40 -1.13 -0.98 -1.34 -1.38 13.8 39.1 NA
## 2 0.06 -0.08 -0.55 -1.22 -0.78 -1.45 -1.26 15.5 39.9 NA
## 3 0.90 0.74 0.52 -0.89 -0.76 -1.07 -1.08 24.7 48.5 NA
## 4 0.51 0.80 0.73 -0.21 -0.51 -0.56 -0.95 31.8 58.3 NA
## 5 0.09 0.32 0.59 0.25 -0.42 -0.33 -1.02 36.6 64.5 NA
## 6 -1.26 -0.62 -0.26 -0.01 -0.76 -0.75 -1.31 48.9 81.6 NA
#view data structure
str(drought)
## 'data.frame': 311 obs. of 21 variables:
## $ StateCode: int 5 5 5 5 5 5 5 5 5 5 ...
## $ Division : int 0 0 0 0 0 0 0 0 0 0 ...
## $ YearMonth: int 199001 199002 199003 199004 199005 199006 199007 199008 199009 199010 ...
## $ PCP : num 0.96 1.08 1.99 2.08 1.97 0.74 3.25 1.76 2.1 1.61 ...
## $ TAVG : num 26.4 27.7 36.6 45.1 50.5 65.3 66 64.8 61.2 46.7 ...
## $ PDSI : num -3.42 -3.65 -3.65 -3.37 -3.02 -3.59 -2.54 -2.45 -1.83 -1.39 ...
## $ PHDI : num -3.42 -3.65 -3.65 -3.37 -3.02 -3.59 -2.54 -2.45 -1.83 -1.39 ...
## $ ZNDX : num -1.69 -1.74 -1.12 -0.31 0.01 -2.64 2.06 -0.53 1.11 0.75 ...
## $ PMDI : num -3.42 -3.65 -3.65 -3.37 -2.96 -3.59 -1.67 -1.75 -0.67 0.04 ...
## $ CDD : int 0 0 0 0 0 97 74 62 35 0 ...
## $ HDD : int 1144 1038 903 609 446 73 45 55 141 563 ...
## $ SP01 : num -0.03 0.06 0.9 0.51 0.09 -1.26 1.84 -0.36 0.95 0.48 ...
## $ SP02 : num -0.59 -0.08 0.74 0.8 0.32 -0.62 0.43 0.95 0.48 0.87 ...
## $ SP03 : num -1.4 -0.55 0.52 0.73 0.59 -0.26 0.38 0.12 1.25 0.56 ...
## $ SP06 : num -1.13 -1.22 -0.89 -0.21 0.25 -0.01 0.63 0.46 0.56 0.53 ...
## $ SP09 : num -0.98 -0.78 -0.76 -0.51 -0.42 -0.76 0.05 0.21 0.68 0.77 ...
## $ SP12 : num -1.34 -1.45 -1.07 -0.56 -0.33 -0.75 -0.26 -0.31 -0.04 0.31 ...
## $ SP24 : num -1.38 -1.26 -1.08 -0.95 -1.02 -1.31 -0.85 -0.81 -0.77 -0.39 ...
## $ TMIN : num 13.8 15.5 24.7 31.8 36.6 48.9 52.2 50.1 47.1 32.1 ...
## $ TMAX : num 39.1 39.9 48.5 58.3 64.5 81.6 79.8 79.5 75.3 61.2 ...
## $ X : logi NA NA NA NA NA NA ...
##Date Fields
Let’s have a look at the date field in our data. Viewing the structure, it appears as if it is not in a standard format. It imported into R as an integer.
$ YearMonth: int 199001 199002 199003 199004 199005 ...
We want to convert these numbers into a date field. We can use the as.Date
function to convert our string of numbers into a date that R will recognize.
#view structure of date field
str(drought$YearMonth)
## int [1:311] 199001 199002 199003 199004 199005 199006 199007 199008 199009 199010 ...
#convert to date
drought$YearMonth <- as.Date(drought$YearMonth, format="%Y%m%d")
## Error in as.Date.numeric(drought$YearMonth, format = "%Y%m%d"): 'origin' must be supplied
When we tried to convert our numeric string to a date, R returned an origin error.
R needs a day of the month in order to properly convert this to a date class.
We can add this using the paste0 function. Let’s add 01 to each field so R
thinks each date represents the first of the month.
#convert date field (YearMonth) to date class
#add a day of the month to each year-month combination
drought$YearMonth <- paste0(drought$YearMonth,"01")
#convert to date
drought$YearMonth <- as.Date(drought$YearMonth, format="%Y%m%d")
We’ve now successfully converted our YearMonth column into a date class. Next,
let’s plot the data using ggplot.
#plot Palmer Drought Index
palmer.drought <- ggplot(data=drought,
aes(YearMonth,PDSI)) +
geom_bar(stat="identity",position = "identity") +
xlab("Year / Month") + ylab("Palmer Drought Severity Index") +
ggtitle("Palmer Drought Severity Index - Colorado\nNA values included")
palmer.drought

Great - we’ve successfully created a plot, but what is going on with the Y axis? It appears as if we have values that extend to -100. Let’s look at a quick summery of our data.
#viw summary stats of the Palmer Drought Severity Index
summary(drought$PDSI)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -99.9900 -2.1900 -0.0600 -0.7354 1.6300 5.0200
#view histogram of the data
hist(drought$PDSI,
main="Histogram of PDSI value",
col="wheat3")

It appears as if we have a negative value -99.99 that is throwing off our plot.
This value is a no data value. We need to assign this value to NA, which is R’s
representation of a null or no data value.
Then we can plot again!
#assign -99.99 to NA in the PDSI column
#note: you may want to do this across the entire data.frame or with other columns.
drought[drought$PDSI==-99.99,] <- NA
#view the histogram again - does the range look reasonable?
hist(drought$PDSI,
main="Histogram of PDSI value with -99.99 removed",
col="springgreen4")

#plot Palmer Drought Index
ggplot(data=drought,
aes(YearMonth,PDSI)) +
geom_bar(stat="identity",position = "identity") +
xlab("Year") + ylab("Palmer Drought Severity Index") +
ggtitle("Palmer Drought Severity Index - Colorado\n1990 to 2015")

##Plot.ly - Free Online Interactive Plots
We can create an interactive version of our plot using plot.ly.
#plot Palmer Drought Index
ggplot(data=drought,
aes(YearMonth,PDSI)) +
geom_bar(stat="identity",position = "identity") +
xlab("Year") + ylab("Palmer Drought Severity Index") +
ggtitle("Palmer Drought Severity Index - Colorado\n1990 to 2015")

#view plotly plot in R
#note - plotly doesn't recognize \n to create a second title line
ggplotly()
#publish plotly plot to your plot.ly online account when you are happy with it
#plotly_POST(droughtPlot.plotly)
#NOTE: we will have to manually present the plot.ly plot here as an embed as R won’t knit it out.
##Subsetting our data
Once we have a column of data defined as a date, class in R, we can quickly subset
the data to view portions of the data. We can subset the data directly in our
ggplot command however that will not translate to plot.ly. So let’s create a
new r object that is represents drought data from 2005-2015.
We will then plot that using ggplot and plot.ly.
To subset, we use the subset function, and specify:
- the R object that we wish to subset
- the date column and start date of the subset
- the date column and end date of the subset
Let’s subset our data.
#subset out date between 2005 and 2015
drought2005.2015 <- subset(drought,
YearMonth >= as.Date('2005-01-01', tz = "America/Denver") &
YearMonth <= as.Date('2015-11-01', tz = "America/Denver"))
#plot the data - September-October
droughtPlot.2005.2015 <- ggplot(data=drought2005.2015,
aes(YearMonth,PDSI)) +
geom_bar(stat="identity",position = "identity") +
xlab("Year / Month") + ylab("Palmer Drought Severity Index") +
ggtitle("Palmer Drought Severity Index - Colorado\n2005 - 2015")
droughtPlot.2005.2015

#view plotly plot in R
ggplotly()
#publish plotly plot to your plot.ly online account when you are happy with it
#plotly_POST(droughtPlot.plotly)
#Need to remember where i got the data from. I have a feeling all could be accessed via the API!! but i can’t remember how i found PDSI.
This is my order url - but it’s expired. http://www1.ncdc.noaa.gov/pub/orders/CDODiv5787606888799.txt
drought <- read.
#Challenge Another dataset - import and plot