ya-runtime-ai as library for frameworks #59

Open
nieznanysprawiciel opened this issue Feb 12, 2024 · 1 comment

Comments

@nieznanysprawiciel
Contributor

nieznanysprawiciel commented Feb 12, 2024

Why:

  • We would like to add new frameworks without creating a new runtime build

What:

  • ya-runtime-ai should become a library
  • Frameworks will use the library and specialize it to handle each runtime
  • We won't have a common binary. We will distribute a separate binary for each runtime

Alternative:

  • Create an API between ya-runtime-ai and Python
    • This way, integrating new frameworks would not require Rust expertise
@pwalski
Contributor

pwalski commented Apr 18, 2024

Each implementation of the ya-runtime-ai library will have:

  • framework-specific cmdline parameters, so they can be configured in the runtime descriptor instead of in additional config files
  • a framework-specific implementation of the offer-template/test cmdline args (e.g. testing whether the framework can start its API server)
  • framework process startup, adding framework-specific cmdline args (model, chat template, etc.)
  • framework process startup monitoring (e.g. watching logs to verify that the model was loaded properly)
  • healthcheck monitoring (pinging a framework-specific endpoint)
  • a framework-specific shutdown implementation (e.g. calling a /kill endpoint, doing some cleanup)
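
To make the list above concrete, here is a rough sketch of how those responsibilities could map onto a Rust trait that each framework binary implements. Everything in it is hypothetical and not taken from the existing ya-runtime-ai code: the `Framework` trait, `FrameworkError`, the `Dummy` framework, and `run_lifecycle` are names invented for illustration. No process is spawned and no endpoint is pinged; the log lines are passed in as plain strings so the flow stays easy to follow.

```rust
/// Errors a framework hook may report (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum FrameworkError {
    Startup(String),
    Unhealthy(String),
}

/// Hypothetical trait: one hook per responsibility from the list above.
pub trait Framework {
    /// Framework-specific cmdline parameters (configured in the runtime descriptor).
    type Args;

    fn name(&self) -> &'static str;
    /// Backs the offer-template/test commands: can the framework start at all?
    fn self_test(&self, args: &Self::Args) -> Result<(), FrameworkError>;
    /// Framework-specific cmdline args for process startup (model, chat template, ...).
    fn startup_args(&self, args: &Self::Args) -> Vec<String>;
    /// Startup monitoring: returns true once a log line shows the model is loaded.
    fn is_ready(&self, log_line: &str) -> bool;
    /// Healthcheck: a real implementation would ping a framework-specific endpoint.
    fn healthcheck(&self) -> Result<(), FrameworkError>;
    /// Framework-specific shutdown and cleanup.
    fn shutdown(&mut self) -> Result<(), FrameworkError>;
}

/// Made-up framework used only to exercise the trait.
pub struct DummyArgs {
    pub model: String,
}

pub struct Dummy {
    pub stopped: bool,
}

impl Framework for Dummy {
    type Args = DummyArgs;

    fn name(&self) -> &'static str {
        "dummy"
    }

    fn self_test(&self, args: &DummyArgs) -> Result<(), FrameworkError> {
        if args.model.is_empty() {
            Err(FrameworkError::Startup("no model given".to_string()))
        } else {
            Ok(())
        }
    }

    fn startup_args(&self, args: &DummyArgs) -> Vec<String> {
        vec!["--model".to_string(), args.model.clone()]
    }

    fn is_ready(&self, log_line: &str) -> bool {
        log_line.contains("model loaded")
    }

    fn healthcheck(&self) -> Result<(), FrameworkError> {
        if self.stopped {
            Err(FrameworkError::Unhealthy("stopped".to_string()))
        } else {
            Ok(())
        }
    }

    fn shutdown(&mut self) -> Result<(), FrameworkError> {
        self.stopped = true;
        Ok(())
    }
}

/// Generic driver that the shared library could own; each binary only supplies `F`.
/// Returns the cmdline it would have used to start the framework process.
pub fn run_lifecycle<F: Framework>(
    fw: &mut F,
    args: &F::Args,
    logs: &[&str],
) -> Result<Vec<String>, FrameworkError> {
    fw.self_test(args)?;
    let cmdline = fw.startup_args(args);
    // A real runtime would spawn the framework process with `cmdline` here
    // and read `logs` from its output instead of taking them as a parameter.
    if !logs.iter().any(|line| fw.is_ready(line)) {
        return Err(FrameworkError::Startup(format!(
            "{} never became ready",
            fw.name()
        )));
    }
    fw.healthcheck()?;
    fw.shutdown()?;
    Ok(cmdline)
}
```

With this shape, the `main` of each per-framework binary would mostly parse its own `Args` and call something like `run_lifecycle(&mut fw, &args, ...)`, while the shared lifecycle logic stays in the library. Whether the hooks should be synchronous or async is left open here.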
