Python: Bug: Kernel Function plugin not working with AzureAssistantAgent #10141

Open
vslepakov opened this issue Jan 9, 2025 · 11 comments · May be fixed by #10191
Assignees: moonbox3
Labels: agents · bug (Something isn't working) · experimental (Associated with an experimental feature) · python (Pull requests for the Python Semantic Kernel)

Comments

@vslepakov
Member

Describe the bug
Testing the setup described here with a bugfix released in 1.18.0

To Reproduce
See the setup here.
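The actual setup lives in the linked repo; as a rough, illustrative sketch only (not the real repro code), the failing scenario has the shape below, assuming the 1.18-era semantic-kernel Python agent APIs. The GradingPlugin, agent names, and instructions are hypothetical placeholders.

import asyncio

from semantic_kernel import Kernel
from semantic_kernel.agents import AgentGroupChat, ChatCompletionAgent
from semantic_kernel.agents.open_ai import AzureAssistantAgent
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion
from semantic_kernel.contents import AuthorRole, ChatMessageContent
from semantic_kernel.functions import kernel_function


class GradingPlugin:
    """Hypothetical stand-in for the grading plugin used in the repro."""

    @kernel_function(description="Grade a poem and return a letter grade.")
    def grade_poem(self, poem: str) -> str:
        return "B+"


async def main() -> None:
    # Assistant agent that owns the kernel function plugin.
    grader_kernel = Kernel()
    grader_kernel.add_plugin(GradingPlugin(), plugin_name="grading")
    grader = await AzureAssistantAgent.create(
        service_id="grader",
        kernel=grader_kernel,
        name="grader",
        instructions="Grade the user's poem using the grading plugin.",
    )

    # Chat completion agent that consumes the shared history afterwards; this
    # is the agent that fails with the 400 when tool-call messages arrive out
    # of order.
    formatter_kernel = Kernel()
    formatter_kernel.add_service(AzureChatCompletion(service_id="formatter"))
    formatter = ChatCompletionAgent(
        service_id="formatter",
        kernel=formatter_kernel,
        name="formatter",
        instructions="Summarize the grade and the comments for the user.",
    )

    chat = AgentGroupChat(agents=[grader, formatter])
    await chat.add_chat_message(
        ChatMessageContent(role=AuthorRole.USER, content="Please grade my poem: ...")
    )
    try:
        async for response in chat.invoke():
            print(f"# {response.role} - {response.name}: {response.content}")
    finally:
        await grader.delete()


if __name__ == "__main__":
    asyncio.run(main())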

Expected behavior
AzureAssistantAgent with a kernel function plugin works as part of AgentGroupChat

Platform

  • OS: Windows
  • IDE: VS Code
  • Language: Python
  • Source: semantic-kernel==1.18.0

Additional context

ERROR:

semantic_kernel.exceptions.service_exceptions.ServiceResponseException: ("<class 'semantic_kernel.connectors.ai.open_ai.services.azure_chat_completion.AzureChatCompletion'> service failed to complete the prompt", BadRequestError('Error code: 400 - {\'error\': {\'message\': "An assistant message with \'tool_calls\' must be followed by tool messages responding to each \'tool_call_id\'. The following tool_call_ids did not have response messages: call_74vVFw3smVjsnsoCwcbrUNaN", \'type\': \'invalid_request_error\', \'param\': \'messages.[3].role\', \'code\': None}}'))

According to this, the tool_call_id should be included in messages with AuthorRole.TOOL. I believe this should be handled in Semantic Kernel.
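For reference, this is the message shape the (Azure) OpenAI Chat Completions API expects when tool calls are involved (raw wire format, shown only to illustrate the constraint behind the 400; the function name and arguments are placeholders):

messages = [
    {"role": "user", "content": "Please grade this poem."},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_74vVFw3smVjsnsoCwcbrUNaN",
                "type": "function",
                "function": {"name": "grade_poem", "arguments": "{\"poem\": \"...\"}"},
            }
        ],
    },
    # Every tool_call_id above must be answered by a role="tool" message that
    # directly follows the assistant message; a gap here triggers the 400.
    {
        "role": "tool",
        "tool_call_id": "call_74vVFw3smVjsnsoCwcbrUNaN",
        "content": "B+",
    },
    {"role": "assistant", "content": "The poem received a grade of B+."},
]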

Part of the stack trace:

...
 File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\agents\group_chat\agent_group_chat.py", line 144, in invoke
    async for message in super().invoke_agent(selected_agent):
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\agents\group_chat\agent_chat.py", line 144, in invoke_agent
    async for is_visible, message in channel.invoke(agent):
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\agents\channels\chat_history_channel.py", line 71, in invoke
    async for response_message in agent.invoke(self):
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\agents\chat_completion\chat_completion_agent.py", line 111, in invoke
    messages = await chat_completion_service.get_chat_message_contents(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\connectors\ai\chat_completion_client_base.py", line 142, in get_chat_message_contents
    return await self._inner_get_chat_message_contents(chat_history, settings)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\utils\telemetry\model_diagnostics\decorators.py", line 83, in wrapper_decorator
    return await completion_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\connectors\ai\open_ai\services\open_ai_chat_completion_base.py", line 88, in _inner_get_chat_message_contents
    response = await self._send_request(settings)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\connectors\ai\open_ai\services\open_ai_handler.py", line 59, in _send_request
    return await self._send_completion_request(settings)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\connectors\ai\open_ai\services\open_ai_handler.py", line 99, in _send_completion_request
    raise ServiceResponseException(
semantic_kernel.exceptions.service_exceptions.ServiceResponseException: ("<class 'semantic_kernel.connectors.ai.open_ai.services.azure_chat_completion.AzureChatCompletion'> service failed to complete the prompt", BadRequestError('Error code: 400 - {\'error\': {\'message\': "An assistant message with \'tool_calls\' must be followed by tool messages responding to each \'tool_call_id\'. The following tool_call_ids did not have response messages: call_74vVFw3smVjsnsoCwcbrUNaN", \'type\': \'invalid_request_error\', \'param\': \'messages.[3].role\', \'code\': None}}'))
@vslepakov vslepakov added the bug Something isn't working label Jan 9, 2025
@markwallace-microsoft markwallace-microsoft added python Pull requests for the Python Semantic Kernel triage labels Jan 9, 2025
@moonbox3
Contributor

moonbox3 commented Jan 9, 2025

Hi @vslepakov, it looks like one of the tool calls may be failing and we're not sending back a result for that particular tool call. Are you able to enable logging so we can get more information about the number of tool calls being made and what else could be going on?
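For example, something along these lines (plain standard-library logging, nothing SK-specific) should surface the tool-call activity:

import logging

# Turn on verbose logging for the semantic_kernel packages so the function
# calling steps show up in the console output.
logging.basicConfig(
    format="[%(asctime)s - %(name)s:%(lineno)d - %(levelname)s] %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
logging.getLogger("semantic_kernel").setLevel(logging.DEBUG)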

@moonbox3 moonbox3 self-assigned this Jan 10, 2025
@moonbox3 moonbox3 removed the triage label Jan 10, 2025
@vslepakov
Member Author

Hi @moonbox3, sure, here you go. Let me know if you need anything else:

https://gist.github.com/vslepakov/715e7eb0a85688564da987d1633ccbf6

@moonbox3
Contributor

moonbox3 commented Jan 10, 2025

Thanks for sending, @vslepakov. I'm not able to reproduce the tool calling issue with an AzureAssistantAgent. Are you able to share some code that I'd be able to use to reproduce it?

As a baseline, could you run this sample, please? https://github.com/microsoft/semantic-kernel/blob/main/python/samples/getting_started_with_agents/step7_assistant.py. It makes several tool calls. I'd like to know if you can run that sample, as well, or if you experience failures. Thanks.

As a note, I have the AZURE_OPENAI_API_VERSION in my .env file as 2024-09-01-preview.

@vslepakov
Member Author

Thanks @moonbox3. I just added you to my private repo playground.
It's on this branch: bug-repro-10141

I'm using the same AZURE_OPENAI_API_VERSION.

Not sure if it makes a difference, but I am using AgentGroupChat, whereas the sample you provided does not.

@moonbox3
Contributor

Thanks, @vslepakov. I will take a look at your repo soon. I just adjusted the mixed_agent_chats sample here so that there is an AzureAssistantAgent, and it uses the menu plugin as part of writing its copy. I am still not able to get it to fail right now... I dug into the chat history that is sent from the AzureAssistantAgent -> AzureChatCompletion service, and it does contain a message with FunctionCallContent from the Assistant, and then the next message is the FunctionResultContent from the Tool. This is acceptable ordering when sending to an OpenAI model:

[Image: chat history sent to the AzureChatCompletion service, showing the assistant message with FunctionCallContent followed by the tool message with FunctionResultContent]

I see in your logs, though, that right when a message is sent to the AzureChatCompletion agent, after first running the AzureAssistantAgent, it fails with the 400. This is puzzling.
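If it helps for comparing against your logs, a quick way to dump the ordering of a ChatHistory looks roughly like this (illustrative sketch, not the exact code I used; chat_history is whatever history gets handed to the chat completion service):

# Print the role and content item types for each message so gaps between
# FunctionCallContent and FunctionResultContent are easy to spot.
for i, message in enumerate(chat_history.messages):
    kinds = [type(item).__name__ for item in message.items]
    print(f"{i}: role={message.role} items={kinds}")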

@moonbox3
Contributor

Hi @vslepakov, I've been looking into this more and have been playing around with the code you gave me access to. I'm not sure you're running the same code that is in your GitHub repo. I had to make several changes in some files to make things work, based on your agent selection criteria (for example, simple_selection_function requires 5 args, but 4 were passed in). Additionally, based on your previous logging, the AzureAssistantAgent was running the grading plugin, correct? I updated the grading agent to use an AzureAssistantAgent instead of a ChatCompletionAgent to try to mimic what you showed in your previous logs, so we could see the error when sending the messages that contain tool calls to the AzureChatCompletion agent.

I have not been able to make it fail in the same way that your logging has shown. I've attached some redacted logs here so you can see the flow. I see we're making the grading tool call and then including that tool call's response in what is sent to the formatter (Azure Chat Completion agent), and I'm not getting the 400 that you previously showed.

https://gist.github.com/moonbox3/b0e90068273ec816cc4b42cafe73eb02

Do you have any other tips or code changes I would need to make to get it to fail? Are you able to have it fail 100% of the time?

Note: I tested both with gpt-4o and gpt-4o-mini.

@vslepakov
Member Author

Hi @moonbox3, thanks for your time and your help, really appreciate it!
I do not have any uncommitted changes on branch bug-repro-10141 and can run the code without any modifications (app.py is the entry point). You might have picked the main branch; the grading agent has been removed in bug-repro-10141 and grading_plugin is used instead.

I can repro this on my machine 100% of the time when the plugin is invoked (sometimes it isn't).

@moonbox3
Contributor

Thanks for your reply. I completely overlooked that you were on that separate branch. Let me try that tomorrow morning.

@moonbox3
Contributor

@vslepakov I can repro this issue now -- not 100% of the time, but mostly. I'll work on investigating what is going on. Somehow the order of messages from the OpenAI assistant is causing the issue. The order should be:

  1. user input msg
  2. Initial response from model with comments
  3. tool call for grading (FunctionCallContent)
  4. tool response (FunctionResultContent)

This causes an issue:

  1. user input msg
  2. tool call for grading (FunctionCallContent)
  3. Initial response from model with comments
  4. tool response (FunctionResultContent)

I need to track down where this race condition is happening. :)

@vslepakov
Member Author

Thanks for the heads up, @moonbox3! Sounds great, let me know if I can help.

@moonbox3
Contributor

moonbox3 commented Jan 15, 2025

This was a fun one to track down. Here's my analysis:

I found that, at times, when the (Azure) OpenAI Assistant Agent makes a tool call, that tool call's creation timestamp comes after the message creation timestamp (the message creation being the text the assistant responds with -- its textual analysis of the poem). Currently in our code, if we have a tool call (FunctionCallContent), we first yield that, and then we make a call to get the completed steps, to then yield more content like FunctionResultContent and TextContent. There will be two steps (or more, depending upon the number of tool calls).

Right now we sort the completed steps in this way:

completed_steps_to_process: list[RunStep] = sorted(
    [s for s in steps if s.completed_at is not None and s.id not in processed_step_ids],
    key=lambda s: s.created_at,
)

When there are no failures, it's because the tool call was created before the final message content (as has been the case since this assistant was first coded). However, it appears that processing on the server side can cause fluctuations in when the steps are created/processed. When we have a failure, the message_creation (TextContent) is yielded before the FunctionResultContent, which, if sent to an (Azure) OpenAI Chat Completion endpoint, will break with a 400 because of the gap in the ordering between the FunctionCallContent and the FunctionResultContent:

FunctionCallContent
TextContent # this should follow `FunctionResultContent` (and it does during times when we don't see a 400)
FunctionResultContent

As I've mentioned, the 400 isn't 100% repeatable because of the server-side processing, so we will sometimes get the correct ordering:

FunctionCallContent
FunctionResultContent
TextContent

I've tested changing how we sort the completed steps -- if we have both a step_type == "tool_calls" step and a "message_creation" step, we sort so that "tool_calls" comes first, and any ties are broken by the step.completed_at timestamp. This works, and I don't get any incorrect ordering anymore. Let me continue to test, and I should have a fix out soon.
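As a sketch of that adjustment (not necessarily the exact code that will land in the fix; the step type attribute follows the OpenAI Python SDK's RunStep model, and steps/processed_step_ids come from the surrounding method):

from openai.types.beta.threads.runs import RunStep

# Process "tool_calls" steps before "message_creation" steps; ties fall back
# to the completion timestamp so the original ordering is otherwise preserved.
completed_steps_to_process: list[RunStep] = sorted(
    [s for s in steps if s.completed_at is not None and s.id not in processed_step_ids],
    key=lambda s: (0 if s.type == "tool_calls" else 1, s.completed_at),
)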

@moonbox3 moonbox3 linked a pull request Jan 15, 2025 that will close this issue
@moonbox3 moonbox3 added experimental Associated with an experimental feature agents labels Jan 15, 2025