Merge pull request #1 from davidcarslaw/master
update
marcelooyaneder authored Dec 31, 2024
2 parents 3871d12 + 22ba291 commit bb3e002
Showing 11 changed files with 611 additions and 190 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Type: Package
Package: openair
Title: Tools for the Analysis of Air Pollution Data
Version: 2.18.2.9001
Date: 2024-03-11
Version: 2.18.2.9005
Date: 2024-10-01
Authors@R: c(
person("David", "Carslaw", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-0991-950X")),
32 changes: 31 additions & 1 deletion NEWS.md
@@ -1,14 +1,44 @@
# openair (development version)

## New Features

- The `source` argument of `importUKAQ()` now defaults to `NULL`. This option allows the function to assign the `source` of each `site` itself, with some caveats:

- Ambiguous codes (e.g., `"AD1"`, which corresponds to both a SAQN site and a locally managed site) will preferentially import from the national networks (AURN, then AQE/SAQN/WAQN/NIAQN) over locally managed networks. To override this, users should manually define `source`.

- Site codes that cannot be found in `importMeta()` will cause an error if `importUKAQ()` is left to assign the `source`.

- When `data_type` is one of the aggregate types (e.g., `"annual"`) and a `site` isn't defined, a `source` must be provided.

- It is likely *slightly* slower for the function to assign `source` itself than for users to specify it themselves.

- Added new features for `openColours()`:

- Added new qualitative colour palettes: the "tol" family are colour-blind friendly palettes based on the work of Paul Tol (<https://personal.sron.nl/~pault/>), and "tableau" and "observable" provide access to the "Tableau10" and "Observable10" palettes to aid in consistency with plots made in those platforms.

- When `n` isn't defined for a qualitative palette (e.g., "Dark2"), the full qualitative palette will be returned. Previously this errored with the default of `100`.

- `openColours()` will now check whether the provided `scheme` is either a known scheme name *or* a vector of valid R colours, and provide an informative error if this is not the case.

- The `formula.label` argument of `polarPlot()` will now control whether concentration information is printed when `statistic = "cpf"`.

- Added `calm.thresh` as an option to `windRose()`. This change allows users to set a non-zero wind speed threshold below which conditions are considered calm.

- DAQI information imported using `importUKAQ(data_type = "daqi")` will be returned with the relevant DAQI band appended as an additional factor column; either "Low" (1-3), "Moderate" (4-6), "High" (7-9), or "Very High" (10). See <https://uk-air.defra.gov.uk/air-pollution/daqi> for more information.

## Bug fixes

- Fixed an issue wherein `importUKAQ()` would drop sites if importing from `local` sites *and* another network.

- `polarCluster()` will no longer error with multiple `pollutant`s and a single `n.clusters`.

- `importUKAQ()` will correctly append site meta data when `meta = TRUE`, `source` has a length greater than 1, and a single site is repeated in more than one source (e.g., `importUKAQ(source = c("waqn", "aurn"), data_type = "daqi", year = 2024L)`).

# openair 2.18-2

## New Features

- add option to `corPlot` to carry through "use" option in `cor`.
- add option to `corPlot()` to carry through "use" option in `cor`.

## Bug fixes

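The NEWS entry above says `openColours()` now checks whether `scheme` is a known scheme name or a vector of valid R colours. A minimal sketch of the second half of such a check, using only base R, might look like the following. The helper name `is_valid_colour()` is hypothetical and this is not taken from the openair source; it only illustrates one common way to test colour validity via `grDevices::col2rgb()`.

```r
# Hypothetical helper (not part of openair): returns TRUE for each element
# that R can interpret as a colour, FALSE otherwise.
is_valid_colour <- function(x) {
  vapply(
    x,
    function(col) {
      tryCatch(
        {
          # col2rgb() throws an error for strings it cannot parse as a colour
          grDevices::col2rgb(col)
          TRUE
        },
        error = function(e) FALSE
      )
    },
    logical(1),
    USE.NAMES = FALSE
  )
}

is_valid_colour(c("red", "#1B9E77", "not-a-colour"))
```

A caller could then raise an informative error when `!all(is_valid_colour(scheme))` and `scheme` is not a recognised palette name.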
57 changes: 55 additions & 2 deletions R/importUKAQ-utils.R
@@ -326,12 +326,12 @@ readDAQI <- function(files, year, source) {
#' @param aq_data imported air quality data (annual, daqi, or otherwise)
#' @noRd
add_meta <- function(source, aq_data) {
meta_data <- importMeta(source = source)
meta_data <- importMeta(source = source, duplicate = TRUE)

meta_data <- distinct(meta_data, source, site, .keep_all = TRUE) %>%
select(source, site, code, latitude, longitude, site_type)

aq_data <- left_join(aq_data, meta_data, by = c("source", "code", "site"))
aq_data <- left_join(aq_data, meta_data, by = c("source", "code", "site"))

return(aq_data)
}
@@ -409,3 +409,56 @@ filter_site_pollutant <- function(aq_data, site, pollutant, to_narrow, data_type
# return output
return(aq_data)
}

#' Helper function to guess the source of UKAQ data
#' @param site Sites passed to [importUKAQ()]
#' @noRd
guess_source <- function(site) {
ukaq_meta <- importMeta("ukaq") %>%
dplyr::mutate(source = factor(.data$source, c("aurn", "saqn", "aqe", "waqn", "ni", "local"))) %>%
dplyr::arrange(.data$source) %>%
dplyr::distinct(.data$site, .data$latitude, .data$longitude, .keep_all = TRUE)

source_tbl <-
data.frame(code = toupper(site)) %>%
dplyr::left_join(ukaq_meta, by = "code")

if (any(is.na(source_tbl$source))) {
unknown_codes <-
source_tbl %>%
dplyr::filter(is.na(.data$source)) %>%
dplyr::pull(.data$code)

cli::cli_abort(
c(
"x" = "Unknown site codes detected. Please ensure all site codes can be found in {.fun importMeta}.",
"i" = "Unknown site codes: {unknown_codes}"
)
)
}

if (nrow(source_tbl) > length(site)) {
source_tbl_all <- source_tbl

source_tbl <- dplyr::slice_head(source_tbl, n = 1, by = "code")

source_tbl_other <-
dplyr::anti_join(source_tbl_all, source_tbl, by = join_by("code", "source", "site", "latitude", "longitude", "site_type"))

alternatives <-
source_tbl_other %>%
dplyr::mutate(str = paste0(.data$code, " (could also be '", .data$site, "' from the source: '", .data$source, "'.)")) %>%
dplyr::pull(.data$str)

msg <- c(
"x" = "Ambiguous site codes detected. National networks are imported preferentially to locally managed networks.",
alternatives,
"i" = "Specify {.field source} to import sites from specific monitoring networks.")

names(msg)[names(msg) == ""] <- "!"

cli::cli_warn(msg)
}

return(source_tbl$source)
}
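The priority logic in `guess_source()` can be illustrated without any network access. The sketch below is a simplified base-R analogue, assuming a small made-up metadata table (`toy_meta`) and a hypothetical function name (`resolve_source()`); it is not the package's implementation, and it omits the warning about alternatives that the real helper emits.

```r
# Toy metadata: "AD1" is deliberately listed under two networks to mimic an
# ambiguous code. This table is illustrative, not real importMeta() output.
toy_meta <- data.frame(
  code = c("MY1", "AD1", "AD1"),
  source = c("aurn", "local", "saqn"),
  stringsAsFactors = FALSE
)

# Network priority, matching the factor levels used in the diff above
priority <- c("aurn", "saqn", "aqe", "waqn", "ni", "local")

# Hypothetical simplified resolver: one source per requested site code
resolve_source <- function(site, meta, priority) {
  codes <- toupper(site)

  # fail early on codes that are absent from the metadata
  unknown <- setdiff(codes, meta$code)
  if (length(unknown) > 0) {
    stop("Unknown site codes: ", paste(unknown, collapse = ", "))
  }

  # sort by network priority, then keep the first row for each code
  meta <- meta[order(match(meta$source, priority)), ]
  meta <- meta[!duplicated(meta$code), ]

  # return sources in the order the sites were requested
  meta$source[match(codes, meta$code)]
}

resolve_source(c("my1", "ad1"), toy_meta, priority)
# returns c("aurn", "saqn")
```

Sorting before de-duplicating is what makes the national network win for an ambiguous code; passing `source` explicitly bypasses this step entirely.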
40 changes: 31 additions & 9 deletions R/importUKAQ.R
@@ -94,8 +94,10 @@
#' @param year Year(s) to import. To import a series of years use, e.g.,
#' `2000:2020`. To import several specific years use `year = c(2000, 2010,
#' 2020)`.
#' @param source The network to which the `site`(s) belong, defaulting to
#' `"aurn"`. Providing a single network will attempt to import all of the
#' @param source The network to which the `site`(s) belong. The default, `NULL`,
#' allows [importUKAQ()] to guess the correct `source`, preferring national
#' networks over locally managed networks. Alternatively, users can define a `source`.
#' Providing a single network will attempt to import all of the
#' given `site`s from the provided network. Alternatively, a vector of sources
#' can be provided of the same length as `site` to indicate which network each
#' `site` individually belongs. Available networks include:
@@ -183,7 +185,7 @@
importUKAQ <-
function(site = "my1",
year = 2022,
source = "aurn",
source = NULL,
data_type = "hourly",
pollutant = "all",
hc = FALSE,
@@ -205,6 +207,14 @@ importUKAQ <-
)
}

# guess sources
if (is.null(source)) {
if (data_type %in% c("annual", "monthly", "daqi") && missing(site)) {
cli::cli_abort("Please provide a {.field source} when {.field data_type} is '{data_type}'.")
}
source <- guess_source(site)
}

# obtain correct URL info for the source
url_domain <- dplyr::case_match(
source,
@@ -244,7 +254,7 @@ importUKAQ <-
if (data_type == "15_min") {
data_type <- "15min"
}

if (!tolower(data_type) %in% allowed_types) {
cli::cli_warn(
c(
@@ -292,15 +302,26 @@ importUKAQ <-
source <- unique(source)
files <- paste0(url_domain, "annual_DAQI", url_abbr)
files <- unique(files)

# import DAQI
aq_data <-
purrr::pmap(
tidyr::crossing(tidyr::nesting(files, source), year),
readDAQI,
.progress = ifelse(progress, "Importing DAQI", FALSE)
) %>%
purrr::list_rbind()
purrr::list_rbind() %>%
dplyr::tibble() %>%
dplyr::mutate(
band = dplyr::case_when(
.data$poll_index %in% 1:3 ~ "Low",
.data$poll_index %in% 4:6 ~ "Moderate",
.data$poll_index %in% 7:9 ~ "High",
.data$poll_index == 10 ~ "Very High"
),
band = factor(.data$band, c("Low", "Moderate", "High", "Very High")),
.after = "poll_index"
)
}

# Import any other stat
@@ -327,8 +348,9 @@ importUKAQ <-
source = source,
url_data = url_domain
) %>%
dplyr::left_join(pcodes,
by = "code") %>%
dplyr::left_join(pcodes,
by = "code"
) %>%
tidyr::crossing(year = year)
} else {
site_info <-
@@ -340,7 +362,7 @@ importUKAQ <-
dplyr::mutate(pcode = rep(NA, times = length(site))) %>%
tidyr::crossing(year = year)
}

aq_data <-
purrr::pmap(
.l = site_info,
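The DAQI banding added in this file maps index values 1-3, 4-6, 7-9 and 10 to "Low", "Moderate", "High" and "Very High". As a rough, self-contained illustration, the same mapping can be written with base R's `cut()` in place of the `dplyr::case_when()` call used in the commit. The function name `daqi_band()` is hypothetical and does not exist in openair.

```r
# Hypothetical base-R equivalent of the banding in the diff above.
# cut() uses right-closed intervals by default: (0,3], (3,6], (6,9], (9,10].
daqi_band <- function(poll_index) {
  cut(
    poll_index,
    breaks = c(0, 3, 6, 9, 10),
    labels = c("Low", "Moderate", "High", "Very High")
  )
}

# One value from each band
daqi_band(c(1, 4, 9, 10))
```

The result is a factor whose levels run from "Low" to "Very High" in that order, which keeps the bands sorted sensibly in tables and plot legends. Values outside 1 to 10 come back as `NA`.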