Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: create TES task as k8s job #200

Open
wants to merge 23 commits into
base: main
Choose a base branch
from
Open

feat: create TES task as k8s job #200

wants to merge 23 commits into from

Conversation

JaeAeich
Copy link

@JaeAeich JaeAeich commented Jul 21, 2024

PS: This is WIP, consider this PR for POC, this maybe/will be further broken down into smaller PR once a concensus on code is reached and everthing starts to work.

Summary by Sourcery

This pull request introduces the ability to create TES tasks as Kubernetes jobs. It includes a new Kubernetes client wrapper, a converter for TES tasks to Kubernetes jobs, and updates to the deployment configuration to support these new features.

  • New Features:
    • Introduced a Kubernetes client wrapper to abstract interactions with the Kubernetes API.
    • Added functionality to convert TES tasks to Kubernetes jobs using a new converter module.
    • Implemented a new module for creating TES tasks, including the creation of Kubernetes jobs and config maps.
  • Enhancements:
    • Updated the deployment configuration to include a taskmaster template for Kubernetes jobs.
    • Enhanced the CreateTask function to convert TES tasks to Kubernetes jobs and print the resulting job object.
  • Deployment:
    • Added a new taskmaster template and environment properties to the deployment configuration.

@JaeAeich JaeAeich requested a review from uniqueg July 21, 2024 10:58
Copy link

sourcery-ai bot commented Jul 21, 2024

Reviewer's Guide by Sourcery

This pull request introduces the ability to create TES tasks as Kubernetes jobs. It includes new configurations, utility functions, and classes for managing Kubernetes interactions. The changes are primarily focused on integrating Kubernetes job creation and management into the existing TES task workflow.

File-Level Changes

Files Changes
tesk/api/kubernetes/client_wrapper.py
tesk/api/kubernetes/converter.py
tesk/api/kubernetes/constants.py
tesk/api/kubernetes/template.py
tesk/api/kubernetes/__init__.py
Introduced Kubernetes client wrapper, conversion logic, constants, and templates for managing Kubernetes jobs.
tesk/utils.py
tesk/custom_config.py
Added utility functions and custom configuration models for handling taskmaster settings and conversions.
tesk/api/ga4gh/tes/controllers.py
tesk/api/ga4gh/tes/task/create_task.py
tesk/api/ga4gh/tes/task/__init__.py
Updated TES task creation logic to integrate with Kubernetes job creation.

Tips
  • Trigger a new Sourcery review by commenting @sourcery-ai review on the pull request.
  • Continue your discussion with Sourcery by replying directly to review comments.
  • You can change your review settings at any time by accessing your dashboard:
    • Enable or disable the Sourcery-generated pull request summary or reviewer's guide;
    • Change the review language;
  • You can always contact us if you have any questions or feedback.

@JaeAeich JaeAeich requested review from lvarin and jemaltahir July 21, 2024 10:58
Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @JaeAeich - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 7 issues found
  • 🟡 Security: 1 issue found
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.

deployment/config.yaml Outdated Show resolved Hide resolved
tesk/api/ga4gh/tes/controllers.py Outdated Show resolved Hide resolved
tesk/api/ga4gh/tes/controllers.py Outdated Show resolved Hide resolved
tesk/exceptions.py Show resolved Hide resolved
tesk/api/kubernetes/client_wrapper.py Show resolved Hide resolved
tesk/api/kubernetes/client_wrapper.py Outdated Show resolved Hide resolved
tesk/api/kubernetes/client_wrapper.py Outdated Show resolved Hide resolved
tesk/api/kubernetes/template.py Outdated Show resolved Hide resolved
tesk/exceptions.py Show resolved Hide resolved
tesk/api/kubernetes/client_wrapper.py Show resolved Hide resolved
@JaeAeich
Copy link
Author

The CI will break, those are easy fixes, please ignore them and review the rest of the code and logic :)

Copy link

codecov bot commented Jul 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.21%. Comparing base (1de3bc2) to head (104aa05).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #200   +/-   ##
=======================================
  Coverage   98.21%   98.21%           
=======================================
  Files           8        8           
  Lines         561      561           
=======================================
  Hits          551      551           
  Misses         10       10           
Flag Coverage Δ
test_unit 98.21% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

deployment/config.yaml Outdated Show resolved Hide resolved
deployment/config.yaml Outdated Show resolved Hide resolved
deployment/config.yaml Outdated Show resolved Hide resolved
tesk/api/ga4gh/tes/controllers.py Outdated Show resolved Hide resolved
tesk/api/ga4gh/tes/controllers.py Outdated Show resolved Hide resolved
tesk/api/kubernetes/template.py Outdated Show resolved Hide resolved
tesk/custom_config.py Outdated Show resolved Hide resolved
tesk/custom_config.py Outdated Show resolved Hide resolved
tesk/custom_config.py Outdated Show resolved Hide resolved
tesk/custom_config.py Outdated Show resolved Hide resolved
@uniqueg uniqueg changed the title feat: create tes task as k8s job feat: create TES task as k8s job Jul 27, 2024
Comment on lines +58 to +83
┌─────────────────────────────────────────────────────────┐
│ Kubernetes │
│ │
│ ┌────────────────────────────┐ ┌───────────────────┐ │
│ │ Secret: ftp-secret │ │ ConfigMap/PVC │ │
│ │ - username │ │ - JSON_INPUT.gz │ │
│ │ - password │ │ │ │
│ └──────────▲─────────────────┘ └───────▲───────────┘ │
│ │ | │
│ │ | │
│ │ | │
│ ┌─────────┴────────────────────────────┴────────────┐ │
│ │ Job: taskmaster │ │
│ │ ┌───────────────────────────────────────────────┐ │ │
│ │ │ Pod: taskmaster │ │ │
│ │ │ - Container: taskmaster │ │ │
│ │ │ - Env: TESK_FTP_USERNAME │ │ │
│ │ │ - Env: TESK_FTP_PASSWORD │ │ │
│ │ │ - Args: -f /jsoninput/JSON_INPUT.gz │ │ │
│ │ │ - Mounts: /podinfo │ │ │
│ │ │ /jsoninput │ │ │
│ │ └───────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was just trying to understand k8s flow, please ignore, maybe will remove later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine for me, but the rendered version does not look good to me:

Screenshot from 2024-08-05 10-55-34

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle it's nice to have such a schema in the docs.

I am, however, wondering about the FTP secret. Maybe this will become clearer to me after looking at the code below, but seeing it represented like this in the doc, I have the concern that FTP is somehow treated special, when, ideally, storage solutions should all be treated in an abstract manner, like previously discussed: abstract storage handler and individual implementations for different storage/file transfer solutions. And that should probably extend to managing secrets/credentials as well, no?

@@ -9,7 +9,7 @@ host_name: ""
#

# 'openstack' or 's3'
storage: none
storage: s3
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: turn back to none.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏽

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be described in more detail somewhere. Also, if none is an allowed value, this should be listed, and not just openstack and s3.

Comment on lines +58 to +83
┌─────────────────────────────────────────────────────────┐
│ Kubernetes │
│ │
│ ┌────────────────────────────┐ ┌───────────────────┐ │
│ │ Secret: ftp-secret │ │ ConfigMap/PVC │ │
│ │ - username │ │ - JSON_INPUT.gz │ │
│ │ - password │ │ │ │
│ └──────────▲─────────────────┘ └───────▲───────────┘ │
│ │ | │
│ │ | │
│ │ | │
│ ┌─────────┴────────────────────────────┴────────────┐ │
│ │ Job: taskmaster │ │
│ │ ┌───────────────────────────────────────────────┐ │ │
│ │ │ Pod: taskmaster │ │ │
│ │ │ - Container: taskmaster │ │ │
│ │ │ - Env: TESK_FTP_USERNAME │ │ │
│ │ │ - Env: TESK_FTP_PASSWORD │ │ │
│ │ │ - Args: -f /jsoninput/JSON_INPUT.gz │ │ │
│ │ │ - Mounts: /podinfo │ │ │
│ │ │ /jsoninput │ │ │
│ │ └───────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine for me, but the rendered version does not look good to me:

Screenshot from 2024-08-05 10-55-34

@@ -9,7 +9,7 @@ host_name: ""
#

# 'openstack' or 's3'
storage: none
storage: s3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏽

logger.error("Nothing received in request body.")
raise BadRequest("No request body received.")
tes_task = TesTask(**request_body)
namespace = "tesk"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just seen that this value is hardwired to tesk. In the java implementation, the TESK API is running in the same namespace that it will create the jobs, so you can deploy it in a shared cluster and call the namespace whatever you want.

I am not sure what others think, but this will be an issue for CSC

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't be an issue I plan on removing it from here, I'll add this to docs config or as env variable so when launching the api anyone can configure it.

Copy link
Member

@uniqueg uniqueg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't review the whole PR, but here are some comments for now

Comment on lines +58 to +83
┌─────────────────────────────────────────────────────────┐
│ Kubernetes │
│ │
│ ┌────────────────────────────┐ ┌───────────────────┐ │
│ │ Secret: ftp-secret │ │ ConfigMap/PVC │ │
│ │ - username │ │ - JSON_INPUT.gz │ │
│ │ - password │ │ │ │
│ └──────────▲─────────────────┘ └───────▲───────────┘ │
│ │ | │
│ │ | │
│ │ | │
│ ┌─────────┴────────────────────────────┴────────────┐ │
│ │ Job: taskmaster │ │
│ │ ┌───────────────────────────────────────────────┐ │ │
│ │ │ Pod: taskmaster │ │ │
│ │ │ - Container: taskmaster │ │ │
│ │ │ - Env: TESK_FTP_USERNAME │ │ │
│ │ │ - Env: TESK_FTP_PASSWORD │ │ │
│ │ │ - Args: -f /jsoninput/JSON_INPUT.gz │ │ │
│ │ │ - Mounts: /podinfo │ │ │
│ │ │ /jsoninput │ │ │
│ │ └───────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle it's nice to have such a schema in the docs.

I am, however, wondering about the FTP secret. Maybe this will become clearer to me after looking at the code below, but seeing it represented like this in the doc, I have the concern that FTP is somehow treated special, when, ideally, storage solutions should all be treated in an abstract manner, like previously discussed: abstract storage handler and individual implementations for different storage/file transfer solutions. And that should probably extend to managing secrets/credentials as well, no?

compressed_data Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for? Consider giving it a more expressive (and less ambiguous - compressed could indicate sth like ZIP or GZ) name. Also consider giving it a file extension (.json, .txt ?). Finally, consider arranging test data in a separate folder data/ or tests/data/.

@@ -9,7 +9,7 @@ host_name: ""
#

# 'openstack' or 's3'
storage: none
storage: s3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be described in more detail somewhere. Also, if none is an allowed value, this should be listed, and not just openstack and s3.

deployment/config.yaml Outdated Show resolved Hide resolved
encoded_data.txt Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, consider moving to dedicated data folder.

Also, consider using a file naming pattern that, when read left to right, goes from more generic to more specific, e.g., data_encoded.txt and data_compressed.txt instead of encoded_data.txt and compressed_data.txt. This helps keeping related data together when sorted alphanumerically.

Even better to add something more expressive to explain the use case of these files, i.e., data_USE_CASE_encoded.txt and data_USE_CASE_compressed.txt.

@@ -4,9 +4,10 @@ requires = ["poetry-core"]

[tool.bandit]
skips = [
"B101", # Use of asserts is discouraged as it is removed during byte generation, we don't need that!
"B108", # Insecure usage of temp file/directory, false positive.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But are you sure that this will always be false positives?
If not, better to disable per line/block/module

@@ -26,14 +28,22 @@ def CancelTask(id, *args, **kwargs) -> dict: # type: ignore

# POST /tasks
@log_traffic
def CreateTask(*args, **kwargs) -> dict: # type: ignore
def CreateTask(**kwargs) -> dict[str, str]: # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need the type: ignore here after implementation?

Comment on lines 37 to 46
try:
request_body = kwargs.get("body")
if request_body is None:
logger.error("Nothing received in request body.")
raise BadRequest("No request body received.")
tes_task = TesTask(**request_body)
namespace = "tesk"
return CreateTesTask(tes_task, namespace).response()
except Exception as e:
raise InternalServerError from e
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error handling looks wrong/weird to me:

  1. The generic catching of Exception means that every exception will raise a 500 error, no? So there is no chance of the route ever raising a 400, 401, 403 etc. error. Surely this is not what we want.
  2. FOCA already handles errors in application context appropriately. Specifically, all errors not listed in the exceptions dictionary are raised as 500 errors (see here), so this code is redundant.

Comment on lines 39 to 41
if request_body is None:
logger.error("Nothing received in request body.")
raise BadRequest("No request body received.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Connexion should handle BadRequest/400 errors automatically, unless that functionality is specifically disabled in the config (which it isn't). So no need for this code, I believe (can be easily tested).

Comment on lines 43 to 44
namespace = "tesk"
return CreateTesTask(tes_task, namespace).response()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except when calling well known builtin functions, I think it improves readibility a lot to name params when passing args. It also helps prevent bugs that may arise from using positional args. Also, in this case, it avoids the need to define a variable just to be clear about what you are passing.

So consider using:

return CreateTesTask(task=tes_task, namespace="tesk").response()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants