This post shows some examples of how to find and visualize USGS stream gauge data using R. The {dataRetrieval} package (DeCicco et al. 2024) is used to find and download USGS stream gauge data.
This is not meant to be an exhaustive analysis of the storm or flooding; rather it is meant to demonstrate some ways you can visualize and analyze stream gauge data in R. I hope it inspires you to create your own analyses.
I used the {tidycensus} package (Walker and Herman 2024) to get county boundaries for North Carolina. I wanted to plot the county boundaries on the map and keep the option of filtering sites by county.
Note
{tidycensus} uses {tigris} to get the geometries if requested (geometry = TRUE). If you just want the geometries, you can use the {tigris} package (Walker 2024) directly. By using {tidycensus} you can get other census variables such as population in the same dataframe.
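If only the county shapes are needed, the direct {tigris} route described in the note might look like the sketch below. It assumes {tigris} is installed and that the Census servers are reachable; `get_nc_county_shapes()` is a hypothetical helper name used for illustration, not something from the original analysis.

```r
# Sketch: fetch county geometries only, without any census variables.
# `get_nc_county_shapes` is a hypothetical helper name (an assumption).
get_nc_county_shapes <- function(state = "NC") {
  if (!requireNamespace("tigris", quietly = TRUE)) {
    stop("Package 'tigris' is required for this sketch.")
  }
  # cb = TRUE requests the generalized cartographic boundary file,
  # which is smaller and usually sufficient for mapping.
  tigris::counties(state = state, cb = TRUE)
}

# Usage (needs network access):
# nc_shapes <- get_nc_county_shapes()
# plot(sf::st_geometry(nc_shapes))
```

The trade-off is that the result carries no population column, so anything like a population-weighted map would still need {tidycensus}.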
First we need to find what stream gauge sites are available. We can use the whatNWISsites() function from {dataRetrieval} to find all sites in North Carolina with a discharge parameter.
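The county filter mentioned earlier can be sketched with {sf}. The example below uses small made-up stand-ins for the site table and county polygons so that it runs offline; the coordinate column names match the ones returned by whatNWISsites(), and the NAD83 datum (EPSG:4269) is an assumption that should be checked against the datum column of your own data.

```r
library(sf)

# Hypothetical stand-ins for nc_sites and nc_counties (made-up values).
sites <- data.frame(
  site_no = c("A", "B", "C"),
  dec_lat_va = c(0.5, 0.5, 1.5),
  dec_long_va = c(0.5, 1.5, 0.5)
)

# Helper that builds a 1 x 1 degree square polygon (illustration only).
square <- function(x0, y0) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
  )))
}
counties <- st_sf(
  NAME = c("West County", "East County"),
  geometry = st_sfc(square(0, 0), square(1, 0), crs = 4269)
)

# Convert the site table to point geometries (assuming NAD83, EPSG:4269).
sites_sf <- st_as_sf(sites, coords = c("dec_long_va", "dec_lat_va"), crs = 4269)

# Keep only the sites that fall inside the chosen county.
target <- counties[counties$NAME == "West County", ]
sites_in_county <- st_filter(sites_sf, target)
sites_in_county$site_no
```

With the real data, the same pattern would apply to the site table and the county geometries from {tidycensus}, provided both are in the same coordinate reference system.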
We can use {leaflet} (Cheng et al. 2024) to make an interactive map of stream gauge locations. Clicking on a marker will display the name and id of a site.
Now we can plot the station data. Figure 2 shows the timeseries of discharge at the selected stream gauge. Note that data are missing during the peak flooding; I’m not sure whether the measurement capabilities of the instrument were exceeded.
Tip
I wrote a helper function called add_labels_nwis() to rename the plot labels using the variableInfo attributes from the data frame. This replaces the variable code with the actual name of the variable.
Code
### function to add better labels to plot using attributes of NWIS data frame
add_labels_nwis <- function(df, g){
  parameterInfo <- attr(df, "variableInfo")
  siteInfo <- attr(df, "siteInfo")

  g <- g +
    ylab(parameterInfo$variableDescription) +
    ggtitle(siteInfo$station_nm)

  return(g)
}
##

g <- df_inst |>
  ggplot(aes(dateTime, Flow_Inst)) +
  geom_point(color = "lightblue") +
  geom_vline(xintercept = as.numeric(as.POSIXct("2024-09-26"))) +
  labs(x = NULL) +
  theme_minimal()

g <- add_labels_nwis(df_inst, g)

g
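The missing data noted above can also be located programmatically rather than by eye. The sketch below is one possible approach, not part of the original analysis: it uses a small synthetic record in place of df_inst and assumes a 15-minute reporting interval, which may not hold for every gauge.

```r
# Synthetic 15-minute record standing in for df_inst (made-up values);
# the column names dateTime and Flow_Inst match the renamed NWIS columns.
times <- seq(
  as.POSIXct("2024-09-26 00:00", tz = "US/Eastern"),
  by = "15 min", length.out = 12
)
flow_example <- data.frame(
  dateTime = times[-(5:8)],  # drop four readings to mimic an outage
  Flow_Inst = c(100, 150, 400, 900, 700, 500, 300, 200)
)

# Time between consecutive readings, in minutes.
step_min <- as.numeric(diff(flow_example$dateTime), units = "mins")

# Flag any step longer than the expected interval (15 minutes is an
# assumption; check the reporting interval of your own site).
expected_min <- 15
gap_idx <- which(step_min > expected_min)

gaps <- data.frame(
  gap_start = flow_example$dateTime[gap_idx],
  gap_end = flow_example$dateTime[gap_idx + 1],
  minutes_missing = step_min[gap_idx] - expected_min
)
gaps
```

Each row of `gaps` gives the last reading before an outage and the first reading after it, which makes it easy to annotate the plot or report how much of the peak was not recorded.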
Make plot interactive
We can use the {plotly} package (Sievert 2020) to make an interactive plot that allows us to zoom in/out and hover over data points. Figure 3 shows an interactive version of the previous figure. You can zoom and hover over data to display the values.
The ggplotly() function from {plotly} lets you turn a {ggplot2} plot into an interactive plotly plot!
Code
ggplotly(g)
Full Plotly Version
You can also make the plot directly with {plotly} instead of converting it from a ggplot figure:
Code
# read the variable metadata from the downloaded data frame (df_inst)
parameterInfo <- attr(df_inst, "variableInfo")

p <- df_inst |>
  plot_ly(x = ~dateTime, y = ~Flow_Inst, type = "scatter", name = "Discharge") |>
  # dashed vertical line marking the landfall date
  add_lines(x = lubridate::ymd("2024-09-26"),
            y = range(df_inst$Flow_Inst, na.rm = TRUE),
            line = list(color = "red", dash = "dash"), name = "Landfall") |>
  # the variable description belongs on the y (discharge) axis
  layout(xaxis = list(title = ""),
         yaxis = list(title = parameterInfo$variableDescription))

p
Summary
This post showed how you can use {dataRetrieval} and other packages to find, download, and analyze USGS stream gauge data in R. Data for stream gauges in North Carolina during Hurricane Helene were used as an example. I hope you find this useful for getting started with your own analysis. My thoughts go out to everyone affected by Helene and the resulting flooding.
Cheng, Joe, Barret Schloerke, Bhaskar Karambelkar, and Yihui Xie. 2024. “Leaflet: Create Interactive Web Maps with the JavaScript ’Leaflet’ Library.” https://CRAN.R-project.org/package=leaflet.
DeCicco, Laura, Robert Hirsch, David Lorenz, Jordan Read, Jordan Walker, Lindsay Carr, David Watkins, David Blodgett, Mike Johnson, and Aliesha Krall. 2024. “dataRetrieval: R Packages for Discovering and Retrieving Water Data Available from U.S. Federal Hydrologic Web Services.” https://doi.org/10.5066/P9X4L3GE.
Sievert, Carson. 2020. “Interactive Web-Based Data Visualization with R, Plotly, and Shiny.” https://plotly-r.com.
Walker, Kyle, and Matt Herman. 2024. “Tidycensus: Load US Census Boundary and Attribute Data as ’Tidyverse’ and ’Sf’-Ready Data Frames.” https://walker-data.com/tidycensus/.
Source Code
---
title: "Exploring USGS Stream Gauge Data in North Carolina During Hurricane Helene Using R"
date: 2024-10-16
#date-modified: today
image: image.png
format:
  html:
    code-fold: show
    code-tools: true
    toc: true
    fig-width: 9
    fig-height: 7
    tbl-cap-location: bottom
code-annotations: hover
editor: visual
categories: [R,leaflet,mapping,weather]
freeze: true
draft: false
bibliography: references.bib
---

# Introduction

This post shows some examples of how to find and visualize USGS stream gauge data using R. The {dataRetrieval} package [@dataRetrieval] is used to find and download USGS stream gauge data.

::: callout-tip
This [tutorial on {dataRetrieval}](https://doi-usgs.github.io/dataRetrieval/articles/tutorial.html) was very helpful to get started using the package.
:::

I will look at data from stream gauges in North Carolina during [Hurricane Helene](https://en.wikipedia.org/wiki/Hurricane_Helene) in late September, 2024. Helene brought [extreme amounts of rainfall to western North Carolina](https://climate.ncsu.edu/blog/2024/09/rapid-reaction-historic-flooding-follows-helene-in-western-nc/), causing extremely destructive flooding and unfortunately killing and injuring many people.

::: callout-note
This is not meant to be an exhaustive analysis of the storm or flooding; rather it is meant to demonstrate some ways you can visualize and analyze stream gauge data in R. I hope it inspires you to create your own analyses.
:::

# Analysis

## Load Libraries

```{r}
#| label: libraries
#| message: false

library(dataRetrieval) # <1>
library(leaflet) # <2>
library(tidyverse) # <3>
library(DT) # <4>
library(plotly) # <5>
library(tidycensus) # <6>
options(tigris_use_cache = TRUE)
library(sf) # <7>
```

1.  Get USGS stream gauge data
2.  Mapping
3.  Data wrangling and plotting
4.  Nice data tables
5.  Interactive plots
6.  Get county boundaries
7.  Spatial stuff

## Get county boundaries

I used the {tidycensus} [@tidycensus] package to get county boundaries for the state. I wanted to plot the county boundaries on the map, and have the option to filter by county if I want.

::: callout-note
{tidycensus} uses {tigris} to get the geometries if requested (*geometry = TRUE*). If you just want the geometries, you can use the {tigris} package [@tigris] directly. By using {tidycensus} you can get other census variables such as population in the same dataframe.
:::

```{r}
#| label: tbl-counties
#| tbl-cap: North Carolina counties and geometries
#| message: false

nc_counties <- get_decennial(
  geography = "county",
  state = "NC",
  variables = "P001001", # total population
  year = 2010,
  geometry = TRUE
)

DT::datatable(nc_counties, options = list(pageLength = 5), rownames = FALSE)
```

## Find stream gauge sites

First we need to find what stream gauge sites are available. We can use the *whatNWISsites()* function from {dataRetrieval} to find all sites in North Carolina with a discharge parameter.

```{r}
#| label: tbl-ncsites
#| tbl-cap: Table of stream gauges in North Carolina with discharge measurement.

nc_sites <- whatNWISsites(
  stateCd = "NC",
  parameterCd = "00060" # discharge parameter code
)

DT::datatable(nc_sites, options = list(pageLength = 5), rownames = FALSE)
```

### Map of site locations

We can use {leaflet} [@leaflet] to make an interactive map of stream gauge locations. Clicking on a marker will display the name and id of a site.

```{r}
#| label: fig-ncsites-map
#| message: false
#| fig-cap: Interactive map of stream gauge locations and county boundaries in North Carolina.

leaflet(data = nc_sites) |>
  addProviderTiles(provider = providers$CartoDB.Voyager) |>
  addPolygons(data = nc_counties |> sf::st_transform(4326),
              color = "black", weight = 1, fillOpacity = 0.1, label = ~NAME) |>
  addMarkers(lat = ~dec_lat_va, lng = ~dec_long_va,
             popup = paste0(nc_sites$station_nm, "<br>", nc_sites$site_no),
             clusterOptions = markerClusterOptions())
```

## Get data for one station

Once we identify a site to look at, we can get the timeseries data for that site using the *readNWISuv()* function from {dataRetrieval}.

-   The *renameNWISColumns()* function renames variables with a readable description instead of their parameter code.

```{r}
#| label: tbl-timeseries
#| tbl-cap: Example timeseries data of discharge from stream gauge

df_inst <- readNWISuv(
  siteNumbers = "03451500",
  parameterCd = "00060",
  startDate = "2024-09-20",
  endDate = "2024-10-05",
  tz = "US/Eastern"
) |>
  renameNWISColumns()

df_inst |>
  head(5) |>
  DT::datatable(rownames = FALSE)
```

## Plot Timeseries from a station

Now we can plot the station data. @fig-timeseries shows the timeseries of discharge at the selected stream gauge. Note that data are missing during the peak flooding; I'm not sure whether the measurement capabilities of the instrument were exceeded.

::: callout-tip
I wrote a helper function called *add_labels_nwis()* to rename the plot labels using the *variableInfo* attributes from the data frame. This replaces the variable code with the actual name of the variable.
:::

```{r}
#| label: fig-timeseries
#| fig-cap: Timeseries of discharge from a USGS stream gauge in Western North Carolina. Vertical line is Sept. 26, 2024, the date Hurricane Helene made landfall.

### function to add better labels to plot using attributes of NWIS data frame
add_labels_nwis <- function(df, g){
  parameterInfo <- attr(df, "variableInfo")
  siteInfo <- attr(df, "siteInfo")

  g <- g +
    ylab(parameterInfo$variableDescription) +
    ggtitle(siteInfo$station_nm)

  return(g)
}
##

g <- df_inst |>
  ggplot(aes(dateTime, Flow_Inst)) +
  geom_point(color = "lightblue") +
  geom_vline(xintercept = as.numeric(as.POSIXct("2024-09-26"))) +
  labs(x = NULL) +
  theme_minimal()

g <- add_labels_nwis(df_inst, g)

g
```

## Make plot interactive

We can use the {plotly} package [@plotly] to make an interactive plot that allows us to zoom in/out and hover over data points. @fig-timeseries-ggplotly shows an interactive version of the previous figure. You can zoom and hover over data to display the values.

-   The *ggplotly()* function from {plotly} lets you turn a {ggplot2} plot into an interactive plotly plot!

```{r}
#| label: fig-timeseries-ggplotly
#| fig-cap: Interactive version of stream gauge timeseries plot using ggplotly.

ggplotly(g)
```

## Full Plotly Version

You can also make the plot directly with {plotly} instead of converting it from a ggplot figure:

```{r}
#| label: fig-timeseries-plotly
#| message: false
#| fig-cap: Interactive version of stream gauge timeseries plot using {plotly}.

# read the variable metadata from the downloaded data frame (df_inst)
parameterInfo <- attr(df_inst, "variableInfo")

p <- df_inst |>
  plot_ly(x = ~dateTime, y = ~Flow_Inst, type = "scatter", name = "Discharge") |>
  # dashed vertical line marking the landfall date
  add_lines(x = lubridate::ymd("2024-09-26"),
            y = range(df_inst$Flow_Inst, na.rm = TRUE),
            line = list(color = "red", dash = "dash"), name = "Landfall") |>
  # the variable description belongs on the y (discharge) axis
  layout(xaxis = list(title = ""),
         yaxis = list(title = parameterInfo$variableDescription))

p
```

# Summary

This post showed how you can use {dataRetrieval} and other packages to find, download, and analyze USGS stream gauge data in R. Data for stream gauges in North Carolina during Hurricane Helene were used as an example. I hope you find this useful for getting started with your own analysis. My thoughts go out to everyone affected by Helene and the resulting flooding.

# SessionInfo

::: {.callout-tip collapse="true"}
## Expand for Session Info

```{r, echo = FALSE}
sessionInfo()
```
:::