If you live in New York and rely on heating oil to keep your home warm during the colder months, you know how important it is to keep track of heating oil prices. Fortunately, with a bit of R code, you can easily access the latest heating oil prices in New York.
The code uses the {dplyr} package to clean and manipulate the data, as well as the {timetk} package to plot the time series. Here’s a breakdown of what the code does:
The page at the given URL is read with the read_html function from the xml2 package.
The html_node function from the rvest package is then used to extract the HTML node that contains the data table.
The resulting data table is then cleaned and transformed using functions such as html_table (from rvest), as_tibble (from tibble), set_names (from purrr), and the dplyr functions select, mutate, and arrange.
Finally, the resulting time series data is plotted using plot_time_series from the timetk package.
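The steps above can be tried offline before touching the real site. The following is a minimal sketch, assuming a small inline HTML table stands in for the real page: the column headers, the date format, and every value in `demo_html` are made-up placeholders, not real prices, and the XPath `//table` is a simplification of the one used in the script. Plotting is left out so the sketch only needs the scraping and cleaning packages.

```r
# Placeholder HTML standing in for the real page.
# All rows below are invented purely for illustration.
demo_html <- '<html><body><table>
  <tr><th>Series Name</th><th>Period</th><th>Frequency</th><th>Value</th><th>Units</th></tr>
  <tr><td>Example series</td><td>2023-01-09</td><td>W</td><td>2.0</td><td>$/gal</td></tr>
  <tr><td>Example series</td><td>2023-01-02</td><td>W</td><td>1.0</td><td>$/gal</td></tr>
</table></body></html>'

# Step 1: parse the HTML (read_html accepts a literal string as well as a URL).
demo_page <- xml2::read_html(demo_html)

# Step 2: pull out the node holding the table.
demo_node <- rvest::html_node(x = demo_page, xpath = "//table")

# Step 3: convert to a tibble, rename, reorder, parse dates, sort by date.
demo_tbl <- demo_node |>
  rvest::html_table() |>
  tibble::as_tibble() |>
  purrr::set_names("series_name", "period", "frequency", "value", "units") |>
  dplyr::select(period, frequency, value, units, series_name) |>
  dplyr::mutate(period = lubridate::ymd(period)) |>
  dplyr::arrange(period)

print(demo_tbl)
```

After the pipeline runs, `period` is a Date column and the rows are in ascending date order, which is the shape plot_time_series expects for its date and value arguments.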
To run this code, you will need to have these packages installed on your machine. You can install them using the install.packages function in R. Here’s how you can install the packages:
install.packages("dplyr")
install.packages("xml2")
install.packages("rvest")
install.packages("tibble")
install.packages("purrr")
install.packages("lubridate")
install.packages("timetk")
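If some of these packages are already on your machine, you may prefer to install only the ones that are missing. The following is one possible sketch; `find_missing_packages` is a hypothetical helper name introduced here for illustration, not something from the packages themselves. The `interactive()` guard is an assumption about how you want the script to behave: it keeps a non-interactive run from downloading packages without you noticing.

```r
# Packages used by the script in this post.
pkgs <- c("dplyr", "xml2", "rvest", "tibble", "purrr", "lubridate", "timetk")

# Hypothetical helper: return the subset of `pkgs` that is not installed.
# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
find_missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

missing_pkgs <- find_missing_packages(pkgs)

# install.packages() accepts a character vector, so one call is enough.
# Only install when running interactively (an assumption; drop the guard if unwanted).
if (length(missing_pkgs) > 0 && interactive()) {
  install.packages(missing_pkgs)
}
```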
Once you have installed the packages, you can copy and paste the code into your R console or RStudio and run it to get the latest heating oil prices in New York.
In conclusion, this code provides a simple and efficient way to access and visualize heating oil prices in New York using R. By keeping track of these prices, you can make informed decisions about when to buy heating oil and how much to purchase, ultimately saving you money on your heating bills.
Now let’s run it!
url <- "https://www.eia.gov/opendata/qb.php?sdid=PET.W_EPD2F_PRS_SNY_DPG.W"

page <- xml2::read_html(url)

node <- rvest::html_node(
  x = page,
  xpath = "/html/body/div[1]/section/div/div/div[2]/div[1]/table"
)

ny_tbl <- node |>
  rvest::html_table() |>
  tibble::as_tibble() |>
  purrr::set_names('series_name', 'period', 'frequency', 'value', 'units') |>
  dplyr::select(period, frequency, value, units, series_name) |>
  dplyr::mutate(period = lubridate::ymd(period)) |>
  dplyr::arrange(period)

ny_tbl |>
  timetk::plot_time_series(.date_var = period, .value = value)
Voila!
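One thing to keep in mind is that the script depends on an absolute XPath, so a change in the page layout could leave html_node with nothing to match. The sketch below shows one way to fail with a clear message in that case. `extract_table` is a hypothetical helper introduced here, and the demo page is a made-up stand-in with no table in it; neither is part of the original script.

```r
# Hypothetical helper: extract a table by XPath, stopping with a clear
# message when the XPath matches nothing.
extract_table <- function(page, xpath) {
  node <- rvest::html_node(x = page, xpath = xpath)
  # When nothing matches, html_node() returns an object of class "xml_missing".
  if (inherits(node, "xml_missing")) {
    stop("No node matched XPath: ", xpath, call. = FALSE)
  }
  rvest::html_table(node)
}

# Made-up page without a table, used only to trigger the error path.
demo_no_table <- xml2::read_html("<html><body><p>No table here</p></body></html>")

# Capture the error message instead of halting the session.
demo_result <- tryCatch(
  extract_table(demo_no_table, "//table"),
  error = function(e) conditionMessage(e)
)
print(demo_result)
```

In the script itself, the same helper could be called with the `page` object and the full XPath in place of the separate html_node and html_table steps.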