R version 4.4.1 (2024-06-14)
Platform: x86_64-apple-darwin20
Running under: macOS Sonoma 14.6.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Denver
tzcode source: internal
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1
[4] dplyr_1.1.4 purrr_1.0.2 readr_2.1.5
[7] tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.1
[10] tidyverse_2.0.0 DT_0.33 rgridstatus_0.0.0.9000
loaded via a namespace (and not attached):
[1] plotly_4.10.4 sass_0.4.9 utf8_1.2.4 generics_0.1.3
[5] renv_1.0.4 stringi_1.8.4 hms_1.1.3 digest_0.6.36
[9] magrittr_2.0.3 evaluate_0.24.0 grid_4.4.1 timechange_0.3.0
[13] fastmap_1.2.0 jsonlite_1.8.8 httr_1.4.7 fansi_1.0.6
[17] viridisLite_0.4.2 crosstalk_1.2.1 scales_1.3.0 lazyeval_0.2.2
[21] jquerylib_0.1.4 cli_3.6.3 rlang_1.1.4 munsell_0.5.1
[25] withr_3.0.1 cachem_1.1.0 yaml_2.3.10 tools_4.4.1
[29] tzdb_0.4.0 colorspace_2.1-0 curl_5.2.1 vctrs_0.6.5
[33] R6_2.5.1 lifecycle_1.0.4 htmlwidgets_1.6.4 pkgconfig_2.0.3
[37] pillar_1.9.0 bslib_0.8.0 gtable_0.3.5 data.table_1.15.4
[41] glue_1.7.0 xfun_0.46 tidyselect_1.2.1 rstudioapi_0.16.0
[45] knitr_1.48 farver_2.1.1 htmltools_0.5.8.1 labeling_0.4.3
[49] rmarkdown_2.27 compiler_4.4.1
---
title: "*rgridstatus*: An R Package to Retrieve Energy Data from the GridStatus API"
date: 2024-05-13
image: image.png
format:
html:
code-link: true
code-fold: show
code-tools: true
toc: true
fig-width: 10
fig-height: 7
tbl-cap-location: bottom
editor: visual
categories: [R, API, energy, package development]
freeze: auto
draft: false
bibliography: references.bib
---
# Introduction
[rgridstatus](https://github.com/andypicke/rgridstatus) is an R wrapper for the [GridStatus.io](https://www.gridstatus.io/home) [API](https://www.gridstatus.io/api), which provides data on the US electrical grid.
- [API Documentation](https://api.gridstatus.io/docs)
- If you prefer to work in Python, there is an existing [GridStatus.io Python client](https://github.com/gridstatus/gridstatusio)
## Installation
You can install the development version of [rgridstatus](https://github.com/andypicke/rgridstatus) from [GitHub](https://github.com/) with the *install_github()* function from the *devtools* package [@devtools]:
```{r}
#| code-fold: show
# install.packages("devtools")
#devtools::install_github("andypicke/rgridstatus")
```
::: callout-note
You will need to sign up for a free API key from [GridStatus.io](https://www.gridstatus.io/api). By default, the package functions assume you have stored your API key in your .Renviron file as GRIDSTATUS_API_KEY. I find that the easiest way to do this is with the *edit_r_environ()* function from the *usethis* package [@usethis].
:::
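As a quick check that the key is visible to R, something like the following sketch may help. The helper name `has_gridstatus_key()` is hypothetical (it is not part of rgridstatus); it only looks up the `GRIDSTATUS_API_KEY` environment variable described above and does not contact the API.

```r
# In .Renviron (opened with usethis::edit_r_environ()), add a line such as:
# GRIDSTATUS_API_KEY=your_key_here
# then restart R so the new value is picked up.

# Hypothetical helper (not part of rgridstatus): TRUE if the key is set
has_gridstatus_key <- function() {
  nzchar(Sys.getenv("GRIDSTATUS_API_KEY"))
}

if (!has_gridstatus_key()) {
  message("GRIDSTATUS_API_KEY is not set; add it to .Renviron and restart R.")
}
```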
# Usage
```{r}
#| label: Load Libraries
#| message: false
#| code-fold: true
#| code-summary: Load Libraries
library(rgridstatus)
library(DT)
options(DT.options = list(pageLength = 5))
library(tidyverse)
ggplot2::theme_set(theme_gray(base_size = 16))
```
## Get info on datasets available from GridStatus.io
```{r }
#| label: tbl-datasets
#| tbl-cap: Table of available datasets from GridStatus.io
info <- rgridstatus::get_available_datasets()
info |>
DT::datatable(rownames = FALSE)
```
### Get list of dataset updates
```{r}
#| label: tbl-updates
#| tbl-cap: Table of recent dataset updates from GridStatus.io API
# updates for a specific data set
#updates <- rgridstatus::get_dataset_updates(wh_dataset = "caiso_fuel_mix", limit = 20)
# all datasets
updates <- rgridstatus::get_dataset_updates(limit = 20)
updates |> DT::datatable(rownames = FALSE)
```
## Download a dataset
- By default, *get_gridstatus_dataset()* gets data for the previous 5 days. You can also request a specific date range with the *start_time* and *end_time* arguments.
```{r}
#| label: tbl-caiso-fuel-mix
#| tbl-cap: Table of CAISO fuel-mix data from GridStatus.io API
caiso_mix <- rgridstatus::get_gridstatus_dataset(wh_dataset = "caiso_fuel_mix")
caiso_mix |>
DT::datatable(rownames = FALSE)
```
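To request a specific window instead of the default, a sketch along these lines might work. The *start_time* and *end_time* argument names come from the package; the `YYYY-MM-DD` string format is my assumption, so check the function documentation before relying on it. The API call is commented out because it needs a key and network access.

```r
# Build a two-day window ending today
end_date <- Sys.Date()
start_date <- end_date - 2

# Assumption: the arguments accept dates formatted as "YYYY-MM-DD"
start_chr <- format(start_date, "%Y-%m-%d")
end_chr <- format(end_date, "%Y-%m-%d")

# caiso_recent <- rgridstatus::get_gridstatus_dataset(
#   wh_dataset = "caiso_fuel_mix",
#   start_time = start_chr,
#   end_time = end_chr
# )
```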
# Plotting
The rgridstatus package is focused on retrieving data from the API, but I want to show some examples of plotting the data. I may add some convenience functions for plotting to the package in the future.
## Plot CAISO solar generation timeseries
```{r}
#| label: fig-caiso-solar
#| fig-cap: Plot of CAISO solar generation. Data from GridStatus.io API
caiso_mix |>
ggplot(aes(datetime_local, solar)) +
geom_line(linewidth = 1.2) +
labs(title = "CAISO Solar Generation",
x = "",
y = "[MW]",
caption = "Data from GridStatus.io")
```
The GridStatus API returns the generation/fuel-mix data in a wide format, with one column per fuel type. When plotting multiple variables, it is easier to pivot the data from wide to long format.
- rgridstatus has a convenience function pivot_gen_long() to accomplish this:
```{r}
#| label: tbl-gen-long
#| tbl-cap: CAISO fuel-mix data from GridStatus.io after pivoting to a long format.
# df_long <- df |>
# pivot_longer(cols = -c(interval_start_utc, datetime_local),
# names_to = "Fuel Type",
# values_to = "MWh")
df_long <- rgridstatus::pivot_gen_long(caiso_mix)
df_long |>
DT::datatable(rownames = FALSE)
```
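To make the reshaping concrete, here is a small sketch on made-up data (the values are invented for illustration and are not from the API). It uses `tidyr::pivot_longer()` directly with the same `names_to` and `values_to` as the commented-out code above; I am not claiming this is exactly what `pivot_gen_long()` does internally.

```r
library(tidyr)

# Toy wide data: one row per timestamp, one column per fuel type
toy_wide <- data.frame(
  datetime_local = as.POSIXct(
    c("2024-05-01 12:00:00", "2024-05-01 12:05:00"),
    tz = "UTC"
  ),
  solar = c(100, 110),
  batteries = c(-20, -25)
)

# Long data: one row per timestamp and fuel type
toy_long <- pivot_longer(
  toy_wide,
  cols = -datetime_local,
  names_to = "Fuel Type",
  values_to = "MWh"
)

toy_long
```

Two rows with two fuel columns become four rows, each holding a single timestamp, fuel type, and value.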
With the data in long format, we can just specify in the plotting function that we want to color/label the lines by Fuel Type.
- Here I've filtered by Fuel Type to include just solar and batteries
- I'm also using *plotly* [@plotly] to create an interactive plot that allows you to zoom in and out, pan, etc.
```{r}
#| label: fig-solar-batteries
#| fig-cap: Interactive plot of CAISO solar and battery generation. Data from GridStatus.io
g <- df_long |>
filter(`Fuel Type` %in% c("solar", "batteries")) |>
ggplot(aes(datetime_local, MWh)) +
geom_line(aes(color = `Fuel Type`)) +
labs(title = "CAISO Fuel Mix",
x = glue::glue('Local Datetime {tz(df_long$datetime_local[1])}'),
caption = "Data from GridStatus.io")
plotly::ggplotly(g)
```
# Summary
- The [rgridstatus](https://github.com/andypicke/rgridstatus) package provides functions for retrieving data on the US electrical grid from the [GridStatus.io](https://www.gridstatus.io/home) [API](https://www.gridstatus.io/api) in R.
- You can install the development version of [rgridstatus](https://github.com/andypicke/rgridstatus) and try it out for yourself! The package is still in development; please report any issues or share feedback via the GitHub repository.
# SessionInfo
::: {.callout-tip collapse="true"}
## Expand for Session Info
```{r, echo = FALSE}
sessionInfo()
```
:::
# References