Faster assess_temporal_independence() #353

Open
Rafnuss opened this issue Dec 29, 2024 · 0 comments

Collaborator

Rafnuss commented Dec 29, 2024

assess_temporal_independence <- function(df, minDeltaTime_dur, deltaTimeComparedTo) {

Here is a suggestion to make assess_temporal_independence() faster, using a more vectorized approach and more base R functions. It is a slightly modified version that takes only a vector of timestamps rather than the whole data.frame, for increased modularity.

Let me know if you're interested and I can create a PR.

#' Assess temporal independence
#'
#' @param timestamp A vector of datetimes (or numeric values in seconds),
#'   sorted in increasing order.
#' @param minDeltaTime_dur Duration in minutes between records of the same
#'   species at the same station to be considered independent.
#' @param deltaTimeComparedTo Character, `"lastIndependentRecord"` or
#'   `"lastRecord"`.
#'   For two records to be considered independent, must the second one be at
#'   least `minDeltaTime_dur` minutes after the last independent record of the
#'   same species (`deltaTimeComparedTo = "lastIndependentRecord"`), or
#'   `minDeltaTime_dur` minutes after the last record (`deltaTimeComparedTo =
#'   "lastRecord"`)?
#'   If `minDeltaTime_dur` is 0, `deltaTimeComparedTo` should be NULL.
#' @noRd
assess_temporal_independence <- function(timestamp, minDeltaTime_dur = 60, deltaTimeComparedTo = "lastRecord") {
  # Convert to numeric
  t <- as.numeric(timestamp)

  # Compute for lastRecord:
  # A record is independent if the time since the previous record exceeds
  # minDeltaTime_dur. The first record always starts a new event.
  independent <- c(TRUE, diff(t) > minDeltaTime_dur * 60)

  # For lastIndependentRecord, it's a bit more complicated
  # identical() also copes with deltaTimeComparedTo = NULL, where `==` would fail
  if (identical(deltaTimeComparedTo, "lastIndependentRecord")) {
    # keep a copy to compare later in case.
    independent_old <- independent

    # lastIndependentRecord can only yield more sequences/events than lastRecord, so we start from the lastRecord sequences and split them further where required.
    # cumsum(independent) creates the group index from the independent vector
    independent <- split(t, cumsum(independent), drop = FALSE) %>%
      lapply(\(tt) {
        idpt <- rep(FALSE, length(tt))
        continue <- TRUE
        i <- 1
        while (continue) {
          idpt[i] <- TRUE
          # findInterval() quickly gives the index of the last record within minDeltaTime_dur of record i; the record after it starts a new sequence.
          e <- findInterval(tt[i] + minDeltaTime_dur * 60, tt)
          if (e == length(idpt)) {
            continue <- FALSE
          } else {
            i <- e + 1
          }
        }
        return(idpt)
      }) %>%
      unlist() %>%
      unname()

    # Should always be zero
    # sum(independent_old & !independent)
    # New group/event/sequence
    # sum(!independent_old & independent)
  }
  return(independent)
}
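
To illustrate how the two modes differ, here is a minimal sketch on made-up timestamps. The helper `naive_last_independent()` is hypothetical (it is not part of the package); it is only a slow reference loop that could be used to cross-check the faster version above on sorted input.

```r
# Toy timestamps in seconds (made-up data): records at 0, 30, 70, 100 and 200 minutes
t <- c(0, 30, 70, 100, 200) * 60
min_delta <- 60 * 60 # 60 minutes, expressed in seconds

# "lastRecord": compare each record with the record just before it
last_record <- c(TRUE, diff(t) > min_delta)

# "lastIndependentRecord": naive reference loop (hypothetical helper),
# comparing each record with the last record flagged as independent
naive_last_independent <- function(t, min_delta) {
  idpt <- logical(length(t))
  last <- -Inf
  for (i in seq_along(t)) {
    if (t[i] - last > min_delta) {
      idpt[i] <- TRUE
      last <- t[i]
    }
  }
  idpt
}
last_independent <- naive_last_independent(t, min_delta)

# findInterval() returns the index of the last record that is still within
# min_delta of the first record; the record after it starts a new sequence
e <- findInterval(t[1] + min_delta, t)

print(last_record)
print(last_independent)
print(e)
```

On this toy data the record at 70 minutes is only 40 minutes after the previous record, but 70 minutes after the last independent one, so it is flagged only in the `"lastIndependentRecord"` mode.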