Python: Bug: Kernel Function plugin not working with AzureAssistantAgent #10141

Open
vslepakov opened this issue Jan 9, 2025 · 11 comments · May be fixed by #10191
Assignees: moonbox3
Labels: agents · bug (Something isn't working) · experimental (Associated with an experimental feature) · python (Pull requests for the Python Semantic Kernel)

Comments

@vslepakov
Member

Describe the bug
Testing the setup described here with a bugfix released in 1.18.0

To Reproduce
See the setup here.
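The actual setup lives in the linked repo; as a rough, illustrative sketch only (not the real repro code), the failing scenario has the shape below, assuming the 1.18-era semantic-kernel Python agent APIs. The GradingPlugin, agent names, and instructions are hypothetical placeholders.

import asyncio

from semantic_kernel import Kernel
from semantic_kernel.agents import AgentGroupChat, ChatCompletionAgent
from semantic_kernel.agents.open_ai import AzureAssistantAgent
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion
from semantic_kernel.contents import AuthorRole, ChatMessageContent
from semantic_kernel.functions import kernel_function


class GradingPlugin:
    """Hypothetical stand-in for the grading plugin used in the repro."""

    @kernel_function(description="Grade a poem and return a letter grade.")
    def grade_poem(self, poem: str) -> str:
        return "B+"


async def main() -> None:
    # Assistant agent that owns the kernel function plugin.
    grader_kernel = Kernel()
    grader_kernel.add_plugin(GradingPlugin(), plugin_name="grading")
    grader = await AzureAssistantAgent.create(
        service_id="grader",
        kernel=grader_kernel,
        name="grader",
        instructions="Grade the user's poem using the grading plugin.",
    )

    # Chat completion agent that consumes the shared history afterwards; this
    # is the agent that fails with the 400 when tool-call messages arrive out
    # of order.
    formatter_kernel = Kernel()
    formatter_kernel.add_service(AzureChatCompletion(service_id="formatter"))
    formatter = ChatCompletionAgent(
        service_id="formatter",
        kernel=formatter_kernel,
        name="formatter",
        instructions="Summarize the grade and the comments for the user.",
    )

    chat = AgentGroupChat(agents=[grader, formatter])
    await chat.add_chat_message(
        ChatMessageContent(role=AuthorRole.USER, content="Please grade my poem: ...")
    )
    try:
        async for response in chat.invoke():
            print(f"# {response.role} - {response.name}: {response.content}")
    finally:
        await grader.delete()


if __name__ == "__main__":
    asyncio.run(main())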

Expected behavior
AzureAssistantAgent with a kernel function plugin works as part of AgentGroupChat

Platform

  • OS: Windows
  • IDE: VS Code
  • Language: Python
  • Source: semantic-kernel==1.18.0

Additional context

ERROR:

semantic_kernel.exceptions.service_exceptions.ServiceResponseException: ("<class 'semantic_kernel.connectors.ai.open_ai.services.azure_chat_completion.AzureChatCompletion'> service failed to complete the prompt", BadRequestError('Error code: 400 - {\'error\': {\'message\': "An assistant message with \'tool_calls\' must be followed by tool messages responding to each \'tool_call_id\'. The following tool_call_ids did not have response messages: call_74vVFw3smVjsnsoCwcbrUNaN", \'type\': \'invalid_request_error\', \'param\': \'messages.[3].role\', \'code\': None}}'))

According to this, the tool_call_id should be included in messages with AuthorRole.TOOL. I believe this should be handled in Semantic Kernel.
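For reference, this is the message shape the (Azure) OpenAI Chat Completions API expects when tool calls are involved (raw wire format, shown only to illustrate the constraint behind the 400; the function name and arguments are placeholders):

messages = [
    {"role": "user", "content": "Please grade this poem."},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_74vVFw3smVjsnsoCwcbrUNaN",
                "type": "function",
                "function": {"name": "grade_poem", "arguments": "{\"poem\": \"...\"}"},
            }
        ],
    },
    # Every tool_call_id above must be answered by a role="tool" message that
    # directly follows the assistant message; a gap here triggers the 400.
    {
        "role": "tool",
        "tool_call_id": "call_74vVFw3smVjsnsoCwcbrUNaN",
        "content": "B+",
    },
    {"role": "assistant", "content": "The poem received a grade of B+."},
]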

Part of the stack trace:

...
 File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\agents\group_chat\agent_group_chat.py", line 144, in invoke
    async for message in super().invoke_agent(selected_agent):
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\agents\group_chat\agent_chat.py", line 144, in invoke_agent
    async for is_visible, message in channel.invoke(agent):
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\agents\channels\chat_history_channel.py", line 71, in invoke
    async for response_message in agent.invoke(self):
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\agents\chat_completion\chat_completion_agent.py", line 111, in invoke
    messages = await chat_completion_service.get_chat_message_contents(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\connectors\ai\chat_completion_client_base.py", line 142, in get_chat_message_contents
    return await self._inner_get_chat_message_contents(chat_history, settings)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\utils\telemetry\model_diagnostics\decorators.py", line 83, in wrapper_decorator
    return await completion_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\connectors\ai\open_ai\services\open_ai_chat_completion_base.py", line 88, in _inner_get_chat_message_contents
    response = await self._send_request(settings)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\connectors\ai\open_ai\services\open_ai_handler.py", line 59, in _send_request
    return await self._send_completion_request(settings)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\<snip>\Projects\semantic_kernel_agents\.venv\Lib\site-packages\semantic_kernel\connectors\ai\open_ai\services\open_ai_handler.py", line 99, in _send_completion_request
    raise ServiceResponseException(
semantic_kernel.exceptions.service_exceptions.ServiceResponseException: ("<class 'semantic_kernel.connectors.ai.open_ai.services.azure_chat_completion.AzureChatCompletion'> service failed to complete the prompt", BadRequestError('Error code: 400 - {\'error\': {\'message\': "An assistant message with \'tool_calls\' must be followed by tool messages responding to each \'tool_call_id\'. The following tool_call_ids did not have response messages: call_74vVFw3smVjsnsoCwcbrUNaN", \'type\': \'invalid_request_error\', \'param\': \'messages.[3].role\', \'code\': None}}'))
@vslepakov vslepakov added the bug Something isn't working label Jan 9, 2025
@markwallace-microsoft markwallace-microsoft added python Pull requests for the Python Semantic Kernel triage labels Jan 9, 2025
@moonbox3
Contributor

moonbox3 commented Jan 9, 2025

Hi @vslepakov, it looks like one of the tool calls may be failing and we're not sending back a result for that particular tool call. Are you able to enable logging so we can get more information about the number of tool calls being made and what else could be going on?
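For example, something along these lines (plain standard-library logging, nothing SK-specific) should surface the tool-call activity:

import logging

# Turn on verbose logging for the semantic_kernel packages so the function
# calling steps show up in the console output.
logging.basicConfig(
    format="[%(asctime)s - %(name)s:%(lineno)d - %(levelname)s] %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
logging.getLogger("semantic_kernel").setLevel(logging.DEBUG)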

@moonbox3 moonbox3 self-assigned this Jan 10, 2025
@moonbox3 moonbox3 removed the triage label Jan 10, 2025
@vslepakov
Member Author

Hi @moonbox3, sure, here you go. Let me know if you need anything else:

https://gist.github.com/vslepakov/715e7eb0a85688564da987d1633ccbf6

@moonbox3
Contributor

moonbox3 commented Jan 10, 2025

Thanks for sending, @vslepakov. I'm not able to reproduce the tool calling issue with an AzureAssistantAgent. Are you able to share some code that I'd be able to use to reproduce it?

As a baseline, could you run this sample, please? https://github.com/microsoft/semantic-kernel/blob/main/python/samples/getting_started_with_agents/step7_assistant.py. It makes several tool calls. I'd like to know if you can run that sample, as well, or if you experience failures. Thanks.

As a note, I have the AZURE_OPENAI_API_VERSION in my .env file as 2024-09-01-preview.

@vslepakov
Member Author

Thanks @moonbox3. I just added you to my private repo playground.
It's on this branch: bug-repro-10141

I'm using the same AZURE_OPENAI_API_VERSION.

Not sure if it makes a difference, but I am using AgentGroupChat, whereas the sample you provided does not.

@moonbox3
Contributor

Thanks, @vslepakov. I will take a look at your repo soon. I just adjusted the mixed_agent_chats sample here so that there is an AzureAssistantAgent, and it uses the menu plugin as part of writing its copy. I am still not able to get it to fail right now... I dug into the chat history that is sent from the AzureAssistantAgent -> AzureChatCompletion service, and it does contain a message with FunctionCallContent from the Assistant, and then the next message is the FunctionResultContent from the Tool. This is acceptable ordering when sending to an OpenAI model:

[Image: chat history sent to the AzureChatCompletion service, showing the assistant message with FunctionCallContent followed by the tool message with FunctionResultContent]

I see in your logs, though, that right when a message is sent to the AzureChatCompletion agent, after first running the AzureAssistantAgent, it fails with the 400. This is puzzling.
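If it helps for comparing against your logs, a quick way to dump the ordering of a ChatHistory looks roughly like this (illustrative sketch, not the exact code I used; chat_history is whatever history gets handed to the chat completion service):

# Print the role and content item types for each message so gaps between
# FunctionCallContent and FunctionResultContent are easy to spot.
for i, message in enumerate(chat_history.messages):
    kinds = [type(item).__name__ for item in message.items]
    print(f"{i}: role={message.role} items={kinds}")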

@moonbox3
Contributor

Hi @vslepakov, I've been looking into this more and have been playing around with the code you gave me access to. I'm not sure you're running the same code that is in your GitHub repo. I had to make several changes in some files to make things work, based on your agent selection criteria (for example, simple_selection_function requires 5 args, but 4 were passed in). Additionally, based on your previous logging, the AzureAssistantAgent was running the grading plugin, correct? I updated the grading agent to use an AzureAssistantAgent instead of a ChatCompletionAgent to try to mimic what you showed in your previous logs, so we could see the error when sending the messages that contain tool calls to the AzureChatCompletion agent.

I have not been able to make it fail in the same way that your logging has shown. I've attached some redacted logs here so you can see the flow. I see we're making the grading tool call and then including that tool call's response in what is sent to the formatter (Azure Chat Completion agent), and I'm not getting the 400 that you previously showed.

https://gist.github.com/moonbox3/b0e90068273ec816cc4b42cafe73eb02

Do you have any other tips or code changes I would need to make to get it to fail? Are you able to have it fail 100% of the time?

Note: I tested both with gpt-4o and gpt-4o-mini.

@vslepakov
Member Author

Hi @moonbox3, thanks for your time and your help, really appreciate it!
I do not have any uncommitted changes on branch bug-repro-10141 and can run the code without any modifications (app.py is the entry point). You might have picked the main branch; the grading agent has been removed in bug-repro-10141 and grading_plugin is used instead.

I can repro this on my machine 100% of the time when the plugin is invoked (sometimes it isn't).

@moonbox3
Contributor

Thanks for your reply. I completely overlooked that you were on that separate branch. Let me try that tomorrow morning.

@moonbox3
Contributor

@vslepakov I can repro this issue now -- not 100% of the time, but mostly. I'll work on investigating what is going on. Somehow the order of messages from the OpenAI assistant is causing the issue. The order should be:

  1. user input msg
  2. Initial response from model with comments
  3. tool call for grading (FunctionCallContent)
  4. tool response (FunctionResultContent)

This causes an issue:

  1. user input msg
  2. tool call for grading (FunctionCallContent)
  3. Initial response from model with comments
  4. tool response (FunctionResultContent)

I need to track down where this race condition is happening. :)

@vslepakov
Member Author

Thanks for the heads up, @moonbox3! Sounds great, let me know if I can help.

@moonbox3
Contributor

moonbox3 commented Jan 15, 2025

This was a fun one to track down. Here's my analysis:

I found that, at times, when the (Azure) OpenAI Assistant Agent makes a tool call, that tool call's creation timestamp comes after the message creation timestamp (the message creation being the text the assistant responds with -- its textual analysis of the poem). Currently in our code, if we have a tool call (FunctionCallContent), we first yield that, and then we make a call to get the completed steps, to then yield more content like FunctionResultContent and TextContent. There will be two steps (or more, depending upon the number of tool calls).

Right now we sort the completed steps in this way:

completed_steps_to_process: list[RunStep] = sorted(
    [s for s in steps if s.completed_at is not None and s.id not in processed_step_ids],
    key=lambda s: s.created_at,
)

When there are no failures, it's because the tool call was created before the final message content (as has been the case since this assistant was first coded). However, it appears that processing on the server side can cause fluctuations in when the steps are created/processed. When we have a failure, the message_creation (TextContent) is yielded before the FunctionResultContent, which, if sent to an (Azure) OpenAI Chat Completion endpoint, will break with a 400 because of the gap in the ordering between the FunctionCallContent and the FunctionResultContent:

FunctionCallContent
TextContent # this should follow `FunctionResultContent` (and it does during times when we don't see a 400)
FunctionResultContent

As I've mentioned, the 400 isn't 100% repeatable because of the server-side processing, so we will sometimes get the correct ordering:

FunctionCallContent
FunctionResultContent
TextContent

I've tested changing how we sort the completed steps -- if we have both a step_type == "tool_calls" step and a "message_creation" step, we sort so that "tool_calls" comes first, and any ties are broken by the step.completed_at timestamp. This works, and I don't get any incorrect ordering anymore. Let me continue to test, and I should have a fix out soon.
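As a sketch of that adjustment (not necessarily the exact code that will land in the fix; the step type attribute follows the OpenAI Python SDK's RunStep model, and steps/processed_step_ids come from the surrounding method):

from openai.types.beta.threads.runs import RunStep

# Process "tool_calls" steps before "message_creation" steps; ties fall back
# to the completion timestamp so the original ordering is otherwise preserved.
completed_steps_to_process: list[RunStep] = sorted(
    [s for s in steps if s.completed_at is not None and s.id not in processed_step_ids],
    key=lambda s: (0 if s.type == "tool_calls" else 1, s.completed_at),
)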

@moonbox3 moonbox3 linked a pull request Jan 15, 2025 that will close this issue
@moonbox3 moonbox3 added experimental Associated with an experimental feature agents labels Jan 15, 2025