Bot stuck due to high load caused by mongo #488

Open
ihor-chaban opened this issue Dec 9, 2024 · 3 comments

Comments

@ihor-chaban

As the title says, since I started running the bot with the latest changes (master HEAD, gpt-4o model), I've noticed that it sometimes becomes unresponsive.
When I went to check what was going on, I couldn't even log into the server because of the extremely high load caused by the bot (load average 25+ on a 1-CPU server).
When I finally logged in, I found that the load was caused by the mongo container.
It was doing constant read/write operations at a rate of ~45-50 M/s.
But I couldn't get any relevant logs or other useful information because even simple commands like top or ps took a while to run under such load, not to mention any docker commands.
After killing the mongod --bind_ip_all process, the server became responsive again.
I'm the only user of my bot and I use it quite rarely, so there is no legitimate reason for such load.
I've tried deleting all files and rebuilding the bot image with a clean MongoDB, but it always happens again after some time.
I never saw this problem before v1.5 and the gpt-4o model, so it is most likely related to the latest changes.
For now, I'm reverting to an older release, even though older releases don't have the gpt-4o model.
Has anyone else had this problem? It's hard to debug because when it happens the server is unresponsive.

ihor-chaban changed the title from "Bot to stuck due to high load caused by mongo" to "Bot stuck due to high load caused by mongo" on Dec 9, 2024
@ihor-chaban
Author

ihor-chaban commented Dec 14, 2024

This is the only recent change related to DB:
752f38b#diff-d7094880f6e1845d792ec1cb547780e39276fc2fb13321ce3b52e393fc1755a7L462-R469

Probably under some conditions it gets stuck in a loop, which makes the DB operations go crazy.

Unfortunately, the v1.5 release and everything after it feel like a step backwards, given the number of major issues, missed bugs, and code that was not properly reviewed.

@ihor-chaban
Author

https://github.com/father-bot/chatgpt_telegram_bot/blob/main/bot/bot.py#L462-L476

if current_model == "gpt-4-vision-preview" or current_model == "gpt-4o" or update.message.photo is not None and len(update.message.photo) > 0:
    ...
    if current_model != "gpt-4o":
        ...
    task = asyncio.create_task(_vision_message_handle_fn(...))
else:
    task = asyncio.create_task(message_handle_fn(...))

These conditions make no sense together for a number of reasons:

  1. It will always create a _vision_message_handle_fn task if the current model is "gpt-4-vision-preview" or "gpt-4o", regardless of whether the message includes a photo. There is no way to create a regular message_handle_fn task with these models selected.
  2. update.message.photo is not None and len(update.message.photo) > 0: why check both for None and for length? If the optional parameter is not set it will be None, and a plain truthiness check would cover an empty value as well.
  3. The parent condition includes current_model == "gpt-4o" and then the child condition is current_model != "gpt-4o", which looks odd and contradicts itself.
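
Point 1 comes down to operator precedence: in Python `and` binds tighter than `or`, so the original condition is evaluated as `A or B or (C and D)`. Here is a minimal sketch of that. The helper `takes_vision_branch` is hypothetical (it is not in bot.py), and a plain list or `None` stands in for `update.message.photo`:

```python
def takes_vision_branch(current_model, photo):
    """Mirror of the original condition.

    Hypothetical helper for illustration; `photo` stands in for
    update.message.photo (None or an empty sequence when there is no photo).
    """
    return bool(
        current_model == "gpt-4-vision-preview"
        or current_model == "gpt-4o"
        or photo is not None and len(photo) > 0  # parsed as (... and ...)
    )


# Text-only message with gpt-4o selected: still goes to the vision branch.
print(takes_vision_branch("gpt-4o", None))
# Text-only message with another model: regular branch.
print(takes_vision_branch("gpt-3.5-turbo", None))
# Photo with another model: vision branch.
print(takes_vision_branch("gpt-3.5-turbo", ["photo"]))
```

With either vision-capable model selected, the photo check is never reached, which is why plain text messages end up in the vision handler.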

I would change this code block to:

        if update.message.photo:
            if current_model != "gpt-4o" and current_model != "gpt-4-vision-preview":
                current_model = "gpt-4o"
                db.set_user_attribute(user_id, "current_model", "gpt-4o")
            task = asyncio.create_task(
                _vision_message_handle_fn(
                    update, context, use_new_dialog_timeout=use_new_dialog_timeout)
            )
        else:
            task = asyncio.create_task(
                message_handle_fn()
            )

I'm not sure if this will resolve the DB overload issue, but it looks much better than the original code.
I will test it for some time to see if I run into the same issue again.
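
To make the intended routing easier to check without running the bot, the proposed branching can be sketched as a pure function. This is only an illustration under my own assumptions: `route` and `VISION_MODELS` are hypothetical names, the function returns the handler name instead of creating an asyncio task, and it does not touch the DB.

```python
VISION_MODELS = ("gpt-4o", "gpt-4-vision-preview")  # hypothetical constant


def route(current_model, photo):
    """Return (handler_name, model_to_use) for the proposed branching.

    Hypothetical helper, not part of bot.py; `photo` stands in for
    update.message.photo.
    """
    if photo:  # falsy for both None and an empty sequence
        if current_model not in VISION_MODELS:
            # The real code would also persist this via db.set_user_attribute.
            current_model = "gpt-4o"
        return "_vision_message_handle_fn", current_model
    return "message_handle_fn", current_model


print(route("gpt-4o", None))            # text only: regular handler
print(route("gpt-3.5-turbo", ["img"]))  # photo: vision handler, model switched
```

The difference from the original condition is that a text-only message now reaches message_handle_fn even when a vision-capable model is selected.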

@ihor-chaban
Author

ihor-chaban commented Dec 15, 2024

Also, why not set "gpt-4o" model by default?
This could be changed in:

It's pretty stupid that there is no single place to easily change the default model.
All of these could be a single parameter in config/models.yml or config/config.yml.
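
A rough sketch of what I mean, assuming a `default_model` key that does not exist in the repo today. The dict stands in for the parsed config file, and `get_default_model` and `FALLBACK_MODEL` are made-up names for illustration:

```python
# Stand-in for the parsed contents of config/config.yml.
# The `default_model` key is hypothetical, not an existing setting.
config = {"default_model": "gpt-4o"}

# Assumed fallback when the key is missing or empty (illustration only).
FALLBACK_MODEL = "gpt-3.5-turbo"


def get_default_model(cfg):
    """Return the configured default model, or the fallback if unset."""
    return cfg.get("default_model") or FALLBACK_MODEL


print(get_default_model(config))
print(get_default_model({}))
```

Every place that currently hard-codes the default would then call one function like this instead.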

Is there any reason to keep obsolete models in the list?
