Triggers enhancements #2522

Closed
1 task done
anna-geller opened this issue Nov 14, 2023 · 0 comments
Labels
enhancement New feature or request

Comments


anna-geller commented Nov 14, 2023

This is an epic to track work we plan to do around triggers in 0.14 and 0.15.

  1. Add a "null action", e.g. an action named "NONE", so that the trigger does not move the file. Users can then decide how and when they want to process the detected file (move, rename, or delete it).
  2. Allow users to choose when the MOVE or DELETE action is executed: 1) during trigger evaluation (the current behavior), or 2) post-execution, i.e. after the Execution has terminated, in the same way listeners were handled. This might be an Enum property actionOn with the values POST_TRIGGER and POST_EXECUTION. An alternative (and preferred) approach would be to keep only the post-execution behavior, so that the file persists until the Execution ends. The downside of this approach is that if processing takes longer, the trigger could potentially be evaluated multiple times (?)
  3. Allow skipping the download of the detected file to internal storage. Instead, provide the full path, e.g. s3://mybucket/myfile.csv, and let users download and process the file as they wish. This is especially desirable when dealing with large files and when the task doesn't need to download the file at all, e.g. a LoadFromGCS task may be used with a GCS Trigger to detect a file and then ingest it into BigQuery directly from GCS without downloading it.
  4. Trigger an execution when a file is modified. This would require the NONE action to avoid a never-ending loop. In theory, triggers must be stateless, so we cannot store metadata such as file size or last modified date in the backend. However, we can compare the file's last modified date (we store lastModifiedDate for all file detection triggers) with the time of the last trigger execution to evaluate whether the file has changed and the trigger needs to start a new execution.
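
To make items 1 and 2 more concrete, here is a minimal Python sketch of how a NONE action and an actionOn property could interact. It is not the actual trigger implementation: the names Action, ActionOn and should_apply_action are hypothetical and only mirror the property values proposed above.

```python
from enum import Enum


class Action(Enum):
    """File actions discussed in the issue; NONE is the proposed "null action"."""
    NONE = "NONE"
    MOVE = "MOVE"
    DELETE = "DELETE"


class ActionOn(Enum):
    """Proposed values for the hypothetical actionOn property."""
    POST_TRIGGER = "POST_TRIGGER"      # right after trigger evaluation (current behavior)
    POST_EXECUTION = "POST_EXECUTION"  # after the Execution has terminated


def should_apply_action(action: Action, action_on: ActionOn, phase: ActionOn) -> bool:
    """Return True if the file action should run in the given phase.

    `phase` is an assumption of this sketch: it tells us whether we are
    at the end of trigger evaluation or at the end of the Execution.
    """
    if action is Action.NONE:
        # The null action never touches the file.
        return False
    return action_on is phase
```

With this shape, a MOVE configured with POST_EXECUTION is skipped during trigger evaluation and applied only once the Execution has ended, while NONE is skipped in both phases.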

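The comparison described in item 4 could look roughly like the following Python sketch. The function name file_changed_since and its parameters are hypothetical; the sketch assumes both timestamps are timezone-aware and that a trigger that has never started an execution should treat any detected file as new.

```python
from datetime import datetime, timezone
from typing import Optional


def file_changed_since(
    last_modified: datetime,
    last_trigger_time: Optional[datetime],
) -> bool:
    """Decide whether a detected file should start a new execution.

    No per-file metadata is stored: only the file's last modified date
    and the time of the last trigger execution are compared.
    """
    if last_trigger_time is None:
        # Assumption: the trigger has not started an execution yet,
        # so the file counts as new.
        return True
    # Strictly newer than the last trigger execution means "modified".
    return last_modified > last_trigger_time


# Usage example with made-up timestamps.
last_run = datetime(2023, 11, 14, 12, 0, tzinfo=timezone.utc)
modified_later = datetime(2023, 11, 14, 12, 5, tzinfo=timezone.utc)
start_execution = file_changed_since(modified_later, last_run)
```

Because the file stays in place under the NONE action, an unchanged file keeps an older last modified date and does not start another execution on the next evaluation.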
Tasks

  1. enhancement
    loicmathieu
anna-geller added the enhancement (New feature or request) label on Nov 14, 2023
github-project-automation bot moved this to Backlog in All issues on Nov 14, 2023
anna-geller moved this from Backlog to Ready in All issues on Nov 30, 2023
anna-geller added this to the v0.15.0 milestone on Dec 4, 2023
anna-geller removed this from All issues on Dec 12, 2023
anna-geller modified the milestones: v0.15.0, v0.14.0 on Jan 14, 2024