feat: create TES task as k8s job #200

Open
wants to merge 23 commits into
base: main
Changes from 6 commits
26 changes: 26 additions & 0 deletions README.md
@@ -55,6 +55,32 @@ After the last executor, the `filer` is called once more to process the outputs
and push them to remote locations from the PVC. The PVC is then scrubbed and
deleted, and the taskmaster ends, completing the task.

┌─────────────────────────────────────────────────────────┐
│ Kubernetes │
│ │
│ ┌────────────────────────────┐ ┌───────────────────┐ │
│ │ Secret: ftp-secret │ │ ConfigMap/PVC │ │
│ │ - username │ │ - JSON_INPUT.gz │ │
│ │ - password │ │ │ │
│ └──────────▲─────────────────┘ └───────▲───────────┘ │
│ │ | │
│ │ | │
│ │ | │
│ ┌─────────┴────────────────────────────┴────────────┐ │
│ │ Job: taskmaster │ │
│ │ ┌───────────────────────────────────────────────┐ │ │
│ │ │ Pod: taskmaster │ │ │
│ │ │ - Container: taskmaster │ │ │
│ │ │ - Env: TESK_FTP_USERNAME │ │ │
│ │ │ - Env: TESK_FTP_PASSWORD │ │ │
│ │ │ - Args: -f /jsoninput/JSON_INPUT.gz │ │ │
│ │ │ - Mounts: /podinfo │ │ │
│ │ │ /jsoninput │ │ │
│ │ └───────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘

Comment on lines +58 to +83
Author

I was just trying to understand the k8s flow; please ignore. I may remove this later.

Contributor

Fine for me, but the rendered version does not look good to me:

[Image: Screenshot from 2024-08-05 10-55-34]

Member

In principle it's nice to have such a schema in the docs.

I am, however, wondering about the FTP secret. Maybe this will become clearer to me after looking at the code below, but seeing it represented like this in the doc, I have the concern that FTP is somehow treated specially. Ideally, storage solutions should all be treated in an abstract manner, as previously discussed: an abstract storage handler and individual implementations for different storage/file transfer solutions. And that should probably extend to managing secrets/credentials as well, no?
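
To make the suggestion above concrete, here is a minimal sketch of what an abstract storage handler that also owns its credential wiring might look like. All class and method names (`StorageHandler`, `SecretRef`, `FtpStorage`, `NoStorage`) are hypothetical and do not come from this PR; only the Secret name, keys, and environment variable names are taken from the diagram and config in this diff.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SecretRef:
    """Hypothetical reference to a Kubernetes Secret with backend credentials."""

    secret_name: str
    # Maps container environment variable name -> key inside the Secret.
    env_to_key: Dict[str, str]


class StorageHandler(ABC):
    """Hypothetical common interface for storage/file transfer backends."""

    @abstractmethod
    def secret_ref(self) -> Optional[SecretRef]:
        """Return the Secret this backend needs, or None if it needs none."""

    def env_vars(self) -> List[dict]:
        """Build Kubernetes-style env entries from the backend's Secret."""
        ref = self.secret_ref()
        if ref is None:
            return []
        return [
            {
                "name": env_name,
                "valueFrom": {
                    "secretKeyRef": {
                        "name": ref.secret_name,
                        "key": key,
                        "optional": True,
                    }
                },
            }
            for env_name, key in ref.env_to_key.items()
        ]


class FtpStorage(StorageHandler):
    """FTP becomes one implementation among many, not a special case."""

    def secret_ref(self) -> Optional[SecretRef]:
        return SecretRef(
            "ftp-secret",
            {"TESK_FTP_USERNAME": "username", "TESK_FTP_PASSWORD": "password"},
        )


class NoStorage(StorageHandler):
    """Backend that needs no credentials."""

    def secret_ref(self) -> Optional[SecretRef]:
        return None
```

With this shape, the Job template would not need to hard-code the FTP variables; the taskmaster container's `env` list could be assembled from whichever handler is configured.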

## Requirements

- A working [Kubernetes](https://kubernetes.io/) cluster, version 1.9 or later.
2 changes: 1 addition & 1 deletion deployment/charts/tesk/values.yaml
@@ -9,7 +9,7 @@ host_name: ""
#

# 'openstack' or 's3'
-storage: none
+storage: s3
Author

TODO: turn back to none.

Contributor

👍🏽

Member

This should probably be described in more detail somewhere. Also, if none is an allowed value, this should be listed, and not just openstack and s3.
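
One way to keep the documented and accepted values in sync is to validate the setting against a single enumeration. The sketch below assumes the three values mentioned in this thread (`none`, `openstack`, `s3`) are the complete set; `StorageBackend` and `parse_storage` are hypothetical names, not part of the chart or the API.

```python
from enum import Enum


class StorageBackend(str, Enum):
    """Assumed set of allowed values for the `storage` setting."""

    NONE = "none"
    OPENSTACK = "openstack"
    S3 = "s3"


def parse_storage(value: str) -> StorageBackend:
    """Validate a configured storage value, listing the allowed ones on error."""
    try:
        return StorageBackend(value.strip().lower())
    except ValueError:
        allowed = ", ".join(backend.value for backend in StorageBackend)
        raise ValueError(
            f"Unsupported storage {value!r}; expected one of: {allowed}"
        ) from None
```

The error message doubles as documentation, so a value missing from the comment in `values.yaml` would still be discoverable.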


# Configurable storage class.
storageClass:
72 changes: 72 additions & 0 deletions deployment/config.yaml
@@ -72,6 +72,78 @@ custom:
tesResources_backend_parameters:
- VmSize
- ParamToRecogniseDataComingFromConfig
  # Taskmaster configuration
  taskmaster_template:
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: taskmaster
      labels:
        app: taskmaster
    spec:
      template:
        metadata:
          name: taskmaster
        spec:
          serviceAccountName: default
          containers:
            - name: taskmaster
              image: docker.io/elixircloud/tesk-core-taskmaster:v0.10.2
              args:
                - -f
                - /jsoninput/JSON_INPUT.gz
              env:
                - name: TESK_FTP_USERNAME
                  valueFrom:
                    secretKeyRef:
                      name: ftp-secret
                      key: username
                      optional: true
                - name: TESK_FTP_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: ftp-secret
                      key: password
                      optional: true
              volumeMounts:
                - name: podinfo
                  mountPath: /podinfo
                  readOnly: true
                - name: jsoninput
                  mountPath: /jsoninput
                  readOnly: true
          volumes:
            - name: podinfo
              downwardAPI:
                items:
                  - path: labels
                    fieldRef:
                      fieldPath: metadata.labels
          restartPolicy: Never
  # Taskmaster environment properties
  taskmaster_env_properties:
    # Taskmaster image name
    imageName: docker.io/elixircloud/tesk-core-taskmaster
    # Taskmaster image version
    imageVersion: latest
    # Filer image name
    filerImageName: docker.io/elixircloud/tesk-core-filer
    # Filer image version
    filerImageVersion: latest
    # Test FTP account settings
    ftp:
      # Name of the secret with FTP account credentials
      secretName: account-secret
      # If FTP account enabled (based on non-emptiness of secretName)
      enabled: false
    # If verbose (debug) mode of taskmaster is on (passes additional flag to taskmaster and sets image pull policy to Always)
    debug: false
    # Environment variables, that will be passed to taskmaster
    environment:
      key: value
    # Service Account name for taskmaster
    serviceAccountName: default
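
The config above defines both a Job template and a set of properties that overlap with it (image, service account, FTP secret). How the two are combined is not shown in this part of the diff, so the following is only a sketch under the assumption that the properties override the template. `render_taskmaster_job` is a hypothetical helper, and both arguments are assumed to be plain dictionaries already loaded from the YAML.

```python
import copy


def render_taskmaster_job(template: dict, props: dict, job_name: str) -> dict:
    """Return a copy of the Job template with the properties applied."""
    job = copy.deepcopy(template)  # never mutate the shared template
    job["metadata"]["name"] = job_name

    pod_spec = job["spec"]["template"]["spec"]
    pod_spec["serviceAccountName"] = props["serviceAccountName"]

    container = pod_spec["containers"][0]
    container["image"] = f"{props['imageName']}:{props['imageVersion']}"

    env = container.get("env", [])
    ftp = props.get("ftp", {})
    if ftp.get("enabled"):
        # Point every Secret-backed variable at the configured Secret.
        for entry in env:
            ref = entry.get("valueFrom", {}).get("secretKeyRef")
            if ref is not None:
                ref["name"] = ftp["secretName"]
    else:
        # Drop Secret-backed variables when the FTP account is disabled.
        env = [e for e in env if "secretKeyRef" not in e.get("valueFrom", {})]

    # Append plain key/value environment variables from the properties.
    for key, value in props.get("environment", {}).items():
        env.append({"name": key, "value": str(value)})
    container["env"] = env
    return job
```

Working on a deep copy matters here because the template is loaded once and reused for every task.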

uniqueg marked this conversation as resolved.

# Logging configuration
# Cf. https://foca.readthedocs.io/en/latest/modules/foca.models.html#foca.models.config.LogConfig
21 changes: 18 additions & 3 deletions tesk/api/ga4gh/tes/controllers.py
@@ -2,10 +2,15 @@

import logging

# from connexion import request # type: ignore
from foca.utils.logging import log_traffic # type: ignore

from tesk.api.ga4gh.tes.models import TesTask
from tesk.api.ga4gh.tes.service_info.service_info import ServiceInfo
from tesk.api.ga4gh.tes.task.create_task import CreateTesTask
from tesk.api.kubernetes.convert.converter import TesKubernetesConverter
from tesk.api.kubernetes.convert.template import KubernetesTemplateSupplier
from tesk.exceptions import BadRequest, InternalServerError
from tesk.utils import get_custom_config

# Get logger instance
logger = logging.getLogger(__name__)
@@ -26,14 +31,24 @@ def CancelTask(id, *args, **kwargs) -> dict: # type: ignore

# POST /tasks
@log_traffic
-def CreateTask(*args, **kwargs) -> dict: # type: ignore
+def CreateTask(**kwargs) -> dict: # type: ignore
    """Create task.

    Args:
        *args: Variable length argument list.
        **kwargs: Arbitrary keyword arguments.
    """
    pass
    try:
        request_body = kwargs.get("body")
        if request_body is None:
            logger.error("Nothing received in request body.")
            raise BadRequest("No request body received.")
        tes_task = TesTask(**request_body)
        namespace = "tesk"
Contributor

I have just seen that this value is hardwired to `tesk`. In the Java implementation, the TESK API runs in the same namespace in which it creates the jobs, so you can deploy it in a shared cluster and call the namespace whatever you want.

I am not sure what others think, but this will be an issue for CSC.

Author

It won't be an issue: I plan on removing it from here. I'll add it to the docs/config or as an environment variable, so that anyone can configure it when launching the API.

        task_creater = CreateTesTask(tes_task, namespace)
        return task_creater.response()
    except Exception as e:
        raise InternalServerError from e
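
The namespace discussion in the review thread above could be resolved along these lines. This is a sketch, not the planned implementation: the environment variable name `TESK_API_TASKMASTER_NAMESPACE` and the function `resolve_namespace` are hypothetical. The file path is the standard location where Kubernetes mounts the pod's own namespace, which would reproduce the "same namespace as the API" behaviour described for the Java implementation.

```python
import os
from pathlib import Path

# Standard location of the namespace file mounted into pods by Kubernetes.
SERVICE_ACCOUNT_NAMESPACE_FILE = Path(
    "/var/run/secrets/kubernetes.io/serviceaccount/namespace"
)


def resolve_namespace(
    env_var: str = "TESK_API_TASKMASTER_NAMESPACE",  # hypothetical name
    namespace_file: Path = SERVICE_ACCOUNT_NAMESPACE_FILE,
    default: str = "tesk",
) -> str:
    """Pick the namespace: explicit env var, then the pod's own, then a default."""
    from_env = os.environ.get(env_var, "").strip()
    if from_env:
        return from_env
    try:
        from_file = namespace_file.read_text(encoding="utf-8").strip()
    except OSError:
        # Not running inside a pod, or the file is not readable.
        from_file = ""
    return from_file or default
```

The controller could then call `resolve_namespace()` once at start-up instead of assigning the literal `"tesk"`.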


# GET /tasks/service-info
1 change: 1 addition & 0 deletions tesk/api/ga4gh/tes/task/__init__.py
@@ -0,0 +1 @@
"""Task API controller logic."""
81 changes: 81 additions & 0 deletions tesk/api/ga4gh/tes/task/create_task.py
@@ -0,0 +1,81 @@
"""TESK API module for creating a task."""

import logging

from tesk.api.ga4gh.tes.models import TesTask
from tesk.api.kubernetes.client_wrapper import KubernetesClientWrapper
from tesk.api.kubernetes.constants import Constants
from tesk.api.kubernetes.convert.converter import TesKubernetesConverter
from tesk.constants import TeskConstants
from tesk.exceptions import KubernetesError

logger = logging.getLogger(__name__)


class CreateTesTask:
    """Create TES task."""

    # TODO: Add user to the class when auth implemented in FOCA
    def __init__(self, task: TesTask, namespace=TeskConstants.tesk_namespace):
        """Initialize the CreateTesTask class.

        Args:
            task: TES task to create.
            namespace: Kubernetes namespace where the task is created.
        """
        self.task = task
        # self.user = user
        self.kubernetes_client_wrapper = KubernetesClientWrapper()
        self.namespace = namespace
        self.tes_kubernetes_converter = TesKubernetesConverter(self.namespace)
        self.constants = Constants()

    def create_task(self):
        """Create TES task."""
        attempts_no = 0
        while attempts_no < self.constants.job_create_attempts_no:
            try:
                attempts_no += 1
                resources = self.task.resources

                # Raise the requested RAM to the cluster minimum if needed.
                if resources and resources.ram_gb:
                    minimum_ram_gb = self.kubernetes_client_wrapper.minimum_ram_gb()
                    if resources.ram_gb < minimum_ram_gb:
                        self.task.resources.ram_gb = minimum_ram_gb

                task_master_job = (
                    self.tes_kubernetes_converter.from_tes_task_to_k8s_job(
                        self.task,
                        # self.user
                    )
                )

                task_master_config_map = (
                    self.tes_kubernetes_converter.from_tes_task_to_k8s_config_map(
                        self.task,
                        task_master_job,
                        # user
                    )
                )
                _ = self.kubernetes_client_wrapper.create_config_map(
                    task_master_config_map
                )
                created_job = self.kubernetes_client_wrapper.create_job(task_master_job)
                print(task_master_config_map)
                print(task_master_job)
                return created_job.metadata.name

            except KubernetesError as e:
                # Retry only on duplicated object names, while attempts remain.
                if (
                    not e.is_object_name_duplicated()
                    or attempts_no >= self.constants.job_create_attempts_no
                ):
                    raise e

            except Exception as exc:
                logger.error("ERROR: In createTask", exc_info=True)
                raise exc

    def response(self) -> dict:
        """Create response."""
        return self.create_task()
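
The retry logic in `create_task` follows a common pattern: generate a name, try to create the object, and retry only when the name collides. A stripped-down sketch of that pattern, independent of Kubernetes, might look as follows. `NameConflictError`, `random_task_name`, and `create_with_retries` are hypothetical stand-ins, and the sketch assumes that a fresh name is generated on every attempt.

```python
import uuid
from typing import Callable, TypeVar

T = TypeVar("T")


class NameConflictError(Exception):
    """Stand-in for an 'object already exists' error from the cluster."""


def random_task_name(prefix: str = "task-") -> str:
    """Generate a short random name; collisions are unlikely but possible."""
    return prefix + uuid.uuid4().hex[:8]


def create_with_retries(
    create: Callable[[str], T],
    make_name: Callable[[], str] = random_task_name,
    max_attempts: int = 5,
) -> T:
    """Call `create` with fresh names until it succeeds or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        name = make_name()
        try:
            return create(name)
        except NameConflictError:
            # Only a name collision is retried; give up on the last attempt.
            if attempt == max_attempts:
                raise
    raise AssertionError("unreachable")
```

Any other exception propagates immediately, which mirrors the `not e.is_object_name_duplicated()` branch in the class above.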
1 change: 1 addition & 0 deletions tesk/api/kubernetes/__init__.py
@@ -0,0 +1 @@
"""Kubernetes API module."""