[WIP] Support nomad #750
base: develop
Conversation
Can one of the admins verify this patch?
Force-pushed from de57562 to 224ad71
Force-pushed from 69eece4 to f14a8ca
@antoinesauray Thanks for making this PR. This will be a really helpful feature. Would you be able to
Thank you!
Do you think it is possible to expose resource allocation and the port number to the containerManager? It seems like you hardcode those values in the job description.
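Something like the following could work (a rough sketch only; the class below is a simplified stand-in for the PR's container manager, and the parameter names cpu_mhz, memory_mb and query_frontend_port are hypothetical, not names used in this PR):

```python
# Hypothetical sketch: take resource allocation and the frontend port as
# constructor arguments instead of hardcoding them in the job description.
# All names below are illustrative, not the ones used in this PR.
class NomadContainerManager(object):
    def __init__(self,
                 nomad_ip,
                 dns,
                 cpu_mhz=500,            # would map to the Nomad task's Resources.CPU
                 memory_mb=256,          # would map to Resources.MemoryMB
                 query_frontend_port=1337):
        self.nomad_ip = nomad_ip
        self.dns = dns
        self.cpu_mhz = cpu_mhz
        self.memory_mb = memory_mb
        self.query_frontend_port = query_frontend_port

    def _task_resources(self):
        # Referenced when building the Nomad job description,
        # instead of literal values.
        return {"CPU": self.cpu_mhz, "MemoryMB": self.memory_mb}
```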
qf_http_thread_pool_size,
qf_http_timeout_request,
qf_http_timeout_content,
num_frontend_replicas=1):
Based on your description of DNS resolution, it seems like Consul should also run as part of the boot-up process? Is it going to be built into the Nomad cluster?
Nomad leaves the DNS responsibility to another service. Consul is one of them, but the integration should be flexible enough to support alternatives; an abstract class / interface is the better choice here, I think.
Also, it should be configured on the host (through dnsmasq). I'll document the setup I have in the PR.
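A minimal sketch of what that abstraction could look like (the class and method names here are illustrative; the PR's actual interface may differ):

```python
import abc

class DNS(abc.ABC):
    """Abstract DNS resolver. Concrete implementations (Consul, or any
    alternative service registry) turn a service name into (ip, port) pairs."""

    @abc.abstractmethod
    def resolve_srv(self, service_name):
        """Return a list of (ip, port) tuples for the given service name,
        or an empty list if the service is not registered yet."""
        raise NotImplementedError
```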
import dns.resolver
import socket

class ConsulDNS(DNS):
Couldn't find the usage in this PR. Where is it used?
It should be used on the client side as follows:
from clipper_admin.deployers import python as python_deployer
from clipper_admin import ClipperConnection, DockerContainerManager, NomadContainerManager, ConsulDNS
nomad_ip_addr = '10.65.30.43'
dns = ConsulDNS() # We use Consul for DNS resolution
container_manager = NomadContainerManager(
nomad_ip=nomad_ip_addr,
dns=dns
)
clipper_conn = ClipperConnection(container_manager)
clipper_conn.connect()
I will document this as well.
I was thinking about it; it can be done in the instantiation of
Force-pushed from 4cb15a9 to 2a3b9de
Need support here, I'm still stuck because of #751.
@antoinesauray I don't have enough bandwidth to handle this right now. For the testing stuff, I will try to resolve it by next Tuesday. Please leave one more message if I don't come back by next Tuesday.
Nomad API
We can use the Nomad API to place Docker containers on nodes. I chose to use a Python library to make things simpler: https://github.com/jrxFive/python-nomad
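For reference, a minimal sketch of submitting a job through python-nomad (the job payload below is a trimmed placeholder, not the job description generated by this PR):

```python
import nomad

# Connect to the Nomad HTTP API (default port 4646).
n = nomad.Nomad(host="10.65.30.43", port=4646)

# Heavily trimmed job payload; the real job description carries the Docker
# image, resources and service registration for each Clipper component.
job = {
    "Job": {
        "ID": "clipper-default-redis",
        "Name": "clipper-default-redis",
        "Datacenters": ["dc1"],
        "TaskGroups": [],
    }
}

# Register (submit) the job, then list the jobs known to the cluster.
n.job.register_job("clipper-default-redis", job)
for j in n.jobs:
    print(j["ID"], j["Status"])
```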
How to address containers
To address those containers through IP and port, we have two options.
I chose option (2) because it is more standard across environments. Option (1) can be supported but will require specific code.
Workflow when connecting
When the Clipper Admin connects, it will try to determine the IP and ports of each service in order to know whether it needs to submit a new job or use what already exists.
Example: for Redis, a redis.service.consul DNS request is sent. If it returns at least one ip:port, that is used; otherwise a Redis instance job is submitted. It will then keep sending SRV requests until the service is up, otherwise the process stops.
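A sketch of that lookup with dnspython, assuming the host forwards *.consul queries to Consul (the dnsmasq setup mentioned above); the helper name is illustrative:

```python
import dns.resolver

def lookup_service(name):
    """Resolve a Consul service name (e.g. 'redis.service.consul') into a
    list of (ip, port) tuples via an SRV query. Returns an empty list if
    the service has no registered instances yet."""
    try:
        # dns.resolver.resolve() in dnspython >= 2.0
        srv_answers = dns.resolver.query(name, "SRV")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return []

    results = []
    for srv in srv_answers:
        target = str(srv.target).rstrip(".")
        # Consul also serves A records for the SRV target.
        for a in dns.resolver.query(target, "A"):
            results.append((a.address, srv.port))
    return results

print(lookup_service("redis.service.consul"))
```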
Selecting containers
Nomad does not have the notion of selectors. I propose to use conventions to solve this problem. Jobs are prefixed with clipper-{cluster-name}. This allows us to select them based on their name (when we want to stop a container, for instance). This is how it looks in the Consul UI.
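With that convention, selecting a cluster's jobs is just a name-prefix filter; a rough sketch with python-nomad (the helper name is illustrative):

```python
import nomad

def get_cluster_jobs(nomad_client, cluster_name):
    """Return the Nomad jobs belonging to one Clipper cluster, relying only
    on the 'clipper-{cluster-name}' naming convention."""
    prefix = "clipper-{}".format(cluster_name)
    return [job for job in nomad_client.jobs if job["Name"].startswith(prefix)]

# Example: stop every job of the 'default' cluster.
n = nomad.Nomad(host="10.65.30.43", port=4646)
for job in get_cluster_jobs(n, "default"):
    n.job.deregister_job(job["ID"])
```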
Managing the connection between Models and Query Frontend
This one is tricky. The problem is that both the IP and port of the Query Frontend are going to change over time, meaning that we would have to submit a new job every time.
The only way I could solve this was to use a load balancer (namely Fabio, one of the previously mentioned) and do TCP forwarding. This leaves Fabio responsible for routing to the correct IP and port, but this implementation is specific.
That means we are booting the model containers with CLIPPER_IP='fabio.service.consul' and CLIPPER_PORT='7000'. This part needs to be improved, though.
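For concreteness, this is roughly what the model task's environment would look like inside the Nomad job description under this scheme (the surrounding task structure is abbreviated, and Fabio itself still needs a TCP route to the query frontend, typically configured through a tag on its Consul service):

```python
# Fragment of the model container task in the Nomad job description.
# The model always points at Fabio, which TCP-forwards port 7000 to the
# query frontend's current ip:port.
model_task = {
    "Name": "clipper-default-model",
    "Driver": "docker",
    "Env": {
        "CLIPPER_IP": "fabio.service.consul",
        "CLIPPER_PORT": "7000",
    },
    # Config, Resources, Services, ... omitted for brevity.
}
```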
If you have any questions, don't hesitate to ask. I know this is quite a big description.