Python: Document generator agent framework demo #10184

Open · wants to merge 24 commits into base: main
58 changes: 58 additions & 0 deletions python/samples/demos/document_generator/GENERATED_DOCUMENT.md
### Understanding Semantic Kernel AI Connectors

AI Connectors in Semantic Kernel are components that facilitate communication between the Kernel's core functionalities and various AI services. They abstract the intricate details of service-specific protocols, allowing developers to seamlessly interact with AI services for tasks like text generation, chat interactions, and more.

### Using AI Connectors in Semantic Kernel

Developers utilize AI connectors to connect their applications to different AI services efficiently. The connectors manage the requests and responses, providing a streamlined way to leverage the power of these AI services without needing to handle the specific communication protocols each service requires.

### Creating Custom AI Connectors in Semantic Kernel

To create a custom AI connector in Semantic Kernel, one must extend the base classes provided, such as `ChatCompletionClientBase` and `AIServiceClientBase`. Below is a guide and example for implementing a mock AI connector:

#### Step-by-Step Walkthrough

1. **Understand the Base Classes**: The foundational classes `ChatCompletionClientBase` and `AIServiceClientBase` provide necessary methods and structures for creating chat-based AI connectors.

2. **Implementing the Connector**: Here's a mock implementation example illustrating how to implement a connector without real service dependencies, ensuring compatibility with Pydantic's expectations within the framework:

```python
from semantic_kernel.connectors.ai.chat_completion_client_base import ChatCompletionClientBase
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.contents.utils.author_role import AuthorRole

class MockAIChatCompletionService(ChatCompletionClientBase):
    def __init__(self, ai_model_id: str):
        super().__init__(ai_model_id=ai_model_id)

    async def _inner_get_chat_message_contents(self, chat_history, settings):
        # Mock implementation: returns a dummy assistant message for demonstration.
        return [
            ChatMessageContent(
                role=AuthorRole.ASSISTANT,
                content="Mock response based on your history.",
            )
        ]

    def service_url(self):
        return "http://mock-ai-service.com"
```

### Usage Example

The following example demonstrates how to integrate and use the `MockAIChatCompletionService` in an application:

```python
import asyncio
from semantic_kernel.connectors.ai.prompt_execution_settings import PromptExecutionSettings
from semantic_kernel.contents.chat_history import ChatHistory

async def main():
    chat_history = ChatHistory()
    chat_history.add_user_message("Hello")
    settings = PromptExecutionSettings()

    service = MockAIChatCompletionService(ai_model_id="mock-model")

    response = await service.get_chat_message_contents(chat_history, settings)
    print(response)

# Run the main function
asyncio.run(main())
```

### Conclusion

By following the revised guide and understanding the base class functionalities, developers can effectively create custom connectors within Semantic Kernel. This structured approach enhances integration with various AI services while ensuring alignment with the framework's architectural expectations. Custom connectors offer flexibility, allowing developers to adjust implementations to meet specific service needs, such as additional logging, authentication, or modifications tailored to specific protocols. This guide provides a strong foundation upon which more complex and service-specific extensions can be built, promoting robust and scalable AI service integration.
104 changes: 104 additions & 0 deletions python/samples/demos/document_generator/README.md
# Document Generator

This sample app demonstrates how to create technical documents for a codebase using AI. More specifically, it uses the agent framework offered by **Semantic Kernel** to orchestrate multiple agents to create a technical document.

This sample app also provides telemetry to monitor the agents, making it easier to observe the inner workings of the agents.

To learn more about agents, please refer to this introduction [video](https://learn.microsoft.com/en-us/shows/generative-ai-for-beginners/ai-agents-generative-ai-for-beginners).
To learn more about the Semantic Kernel Agent Framework, please refer to the [Semantic Kernel documentation](https://learn.microsoft.com/en-us/semantic-kernel/frameworks/agent/agent-architecture?pivots=programming-language-python).

> Note: Due to the stochastic nature of the AI model, this sample app cannot guarantee a perfect technical document every time. Please see a version of the document generated by the app in [GENERATED_DOCUMENT.md](GENERATED_DOCUMENT.md).

## Design

### Tools/Plugins

- **Code Execution Plugin**: This plugin offers a sandbox environment for executing Python snippets. It returns the program's output, or any errors that occur.
- **Repository File Plugin**: This plugin allows the AI to retrieve files from the Semantic Kernel repository.
- **User Input Plugin**: This plugin allows the AI to present content to the user and receive feedback.
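
Each plugin is a plain class whose methods are exposed to the model via Semantic Kernel's `@kernel_function` decorator. The sketch below imitates that shape with a stand-in decorator so it runs standalone; the real plugins import `kernel_function` from `semantic_kernel.functions`, and the method body here is a placeholder, not the demo's actual implementation.

```python
# Stand-in for semantic_kernel.functions.kernel_function so this sketch runs
# without the framework installed; the real plugins use the SK decorator.
def kernel_function(description: str = ""):
    def wrap(fn):
        fn.__kernel_function__ = True
        fn.__kernel_function_description__ = description
        return fn
    return wrap

class RepoFilePlugin:
    """Lets the AI read files from the repository (illustrative sketch)."""

    @kernel_function(description="Read a file from the repository by relative path.")
    def read_file(self, path: str) -> str:
        # A real implementation would resolve `path` against the repo root and
        # return the file's content; here we return a placeholder.
        return f"<content of {path}>"

plugin = RepoFilePlugin()
print(plugin.read_file("README.md"))
# → <content of README.md>
```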

### Agents

- **Content Creation Agent**: This agent is responsible for creating the content of the document. This agent has access to the **Repository File Plugin** to read source files it deems necessary for reference.
- **Code Validation Agent**: This agent is responsible for validating the code snippets in the document. This agent has access to the **Code Execution Plugin** to execute the code snippets.
- **User Agent**: This agent is responsible for interacting with the user. This agent has access to the **User Input Plugin** to present content to the user and receive feedback.

### Agent Selection Strategy

A custom strategy (see `custom_selection_strategy.py`) decides which agent responds next in the group chat.
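
As a rough illustration of what a selection strategy does, here is a minimal round-robin sketch in plain Python. It is illustrative only: the demo's actual strategy lives in `custom_selection_strategy.py` and may pick the next agent differently (e.g. by asking the LLM), so treat the class and method names below as assumptions.

```python
class RoundRobinSelectionStrategy:
    """Illustrative only: picks agents in a fixed cycle."""

    def __init__(self, agent_names: list[str]):
        self._agent_names = agent_names
        self._index = 0

    def next(self) -> str:
        # Return the next agent's name and advance the cursor.
        name = self._agent_names[self._index]
        self._index = (self._index + 1) % len(self._agent_names)
        return name

strategy = RoundRobinSelectionStrategy(
    ["ContentCreationAgent", "CodeValidationAgent", "UserAgent"]
)
print([strategy.next() for _ in range(4)])
# → ['ContentCreationAgent', 'CodeValidationAgent', 'UserAgent', 'ContentCreationAgent']
```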

### Termination Strategy

A custom strategy (see `custom_termination_strategy.py`) decides when the document is complete and the conversation should stop.

## Prerequisites

1. Azure OpenAI
2. Azure Application Insights

## Additional packages

- `AICodeSandbox` - for executing AI generated code in a sandbox environment

```bash
pip install AICodeSandbox
```

> You must also have `docker` installed and running on your machine. Follow the instructions [here](https://docs.docker.com/get-started/introduction/get-docker-desktop/) to install docker for your platform. Images will be pulled during runtime if not already present. Containers will be created and destroyed during code execution.

## Running the app

### Step 1: Set up the environment

Make sure you have the following environment variables set:

```env
OPENAI_CHAT_MODEL_ID=<model-id>
OPENAI_API_KEY=<your-key>
```

> gpt-4o-2024-08-06 was used to generate [GENERATED_DOCUMENT.md](GENERATED_DOCUMENT.md).

### Step 2: Run the app

```bash
python ./main.py
```

Expected output:

```bash
==== ContentCreationAgent just responded ====
==== CodeValidationAgent just responded ====
==== ContentCreationAgent just responded ====
...
```

## Customization

Since this is a sample app that demonstrates the creation of a technical document on Semantic Kernel AI connectors, you can customize the app to suit your needs. You can try different tasks, add more agents, tune existing agents, change the agent selection strategy, or modify the termination strategy.

- To try a different task, modify the `TASK` prompt in `main.py`.
- To add more agents, create a new agent under `agents/` and add it to the `agents` list in `main.py`.
- To tune existing agents, modify the `INSTRUCTION` prompt in the agent's source code.
- To change the agent selection strategy, modify `custom_selection_strategy.py`.
- To change the termination strategy, modify `custom_termination_strategy.py`.

## Optional: Monitoring the agents

When you see the final document generated by the app, what you see is actually the product of multiple agents working together. You may wonder: how did the agents collaborate to create the document? What sequence of actions did they take? How did they interact with each other? To answer these questions, you need to **observe** the agents.

Semantic Kernel by default instruments all the LLM calls. However, for agents there is no default instrumentation. This sample app shows how one can extend the Semantic Kernel agent to add instrumentation.
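
The core of that instrumentation is wrapping each agent invocation in a tracing span named after the agent (see `custom_agent_base.py` in this PR). The sketch below imitates the pattern with a toy tracer so it runs standalone; the demo itself uses the `opentelemetry` API (`trace.get_tracer(__name__)` and `tracer.start_as_current_span(...)`), and `invoke_agent` here is a simplified stand-in for the agent's `invoke` method.

```python
from contextlib import contextmanager

# Toy tracer standing in for opentelemetry's trace.get_tracer(__name__);
# the demo calls tracer.start_as_current_span(agent_name) the same way.
spans: list[str] = []

@contextmanager
def start_as_current_span(name: str):
    spans.append(f"start:{name}")
    try:
        yield
    finally:
        spans.append(f"end:{name}")

def invoke_agent(agent_name: str) -> str:
    # Wrap the whole invocation in a span named after the agent so every
    # LLM call made inside becomes a child of that span.
    with start_as_current_span(agent_name):
        return f"{agent_name} responded"

print(invoke_agent("ContentCreationAgent"))
print(spans)
# → ['start:ContentCreationAgent', 'end:ContentCreationAgent']
```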

> There are currently no standards on what information needs to be captured for agents as the concept of agents is still relatively new. At the time of writing, the Semantic Convention for agents is still in the draft stage: <https://github.com/open-telemetry/semantic-conventions/issues/1732>

To monitor the agents, set the following environment variables:

```env
AZURE_APP_INSIGHTS_CONNECTION_STRING=<your-connection-string>

SEMANTICKERNEL_EXPERIMENTAL_GENAI_ENABLE_OTEL_DIAGNOSTICS=true
SEMANTICKERNEL_EXPERIMENTAL_GENAI_ENABLE_OTEL_DIAGNOSTICS_SENSITIVE=true
```

Follow this guide to inspect the telemetry data: <https://learn.microsoft.com/en-us/semantic-kernel/concepts/enterprise-readiness/observability/telemetry-with-app-insights?tabs=Powershell&pivots=programming-language-python#inspect-telemetry-data>

Or follow this guide to visualize the telemetry data on Azure AI Foundry: <https://learn.microsoft.com/en-us/semantic-kernel/concepts/enterprise-readiness/observability/telemetry-with-azure-ai-foundry-tracing#visualize-traces-on-azure-ai-foundry-tracing-ui-1>
57 changes: 57 additions & 0 deletions python/samples/demos/document_generator/agents/code_validation_agent.py
# Copyright (c) Microsoft. All rights reserved.

import sys
from collections.abc import AsyncIterable

if sys.version_info >= (3, 12):
from typing import override # pragma: no cover
else:
from typing_extensions import override # pragma: no cover

from samples.demos.document_generator.agents.custom_agent_base import CustomAgentBase
from samples.demos.document_generator.plugins.code_execution_plugin import CodeExecutionPlugin
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.contents.chat_message_content import ChatMessageContent

INSTRUCTION = """
You are a code validation agent in a collaborative document creation chat.
Your task is to validate Python code in the latest document draft and summarize any errors.
Follow the instructions in the document to assemble the code snippets into a single Python script.
If the snippets in the document are from multiple scripts, you need to modify them to work together as a single script.
Execute the code to validate it. If there are errors, summarize the error messages.
Do not try to fix the errors.
"""

DESCRIPTION = """
Select me to validate the Python code in the latest document draft.
"""


class CodeValidationAgent(CustomAgentBase):
    def __init__(self):
        kernel = self._create_kernel()
        kernel.add_plugin(plugin=CodeExecutionPlugin(), plugin_name="CodeExecutionPlugin")

        settings = kernel.get_prompt_execution_settings_from_service_id(service_id=CustomAgentBase.SERVICE_ID)
        settings.function_choice_behavior = FunctionChoiceBehavior.Auto(maximum_auto_invoke_attempts=1)

        super().__init__(
            kernel=kernel,
            execution_settings=settings,
            name="CodeValidationAgent",
            instructions=INSTRUCTION.strip(),
            description=DESCRIPTION.strip(),
        )

    @override
    async def invoke(self, history: ChatHistory) -> AsyncIterable[ChatMessageContent]:
        cloned_history = history.model_copy(deep=True)
        # In group chats, agents often struggle to follow the system prompt and
        # try to continue the conversation instead of executing the task they
        # are given. Adding a user message here keeps the agent on task.
        cloned_history.add_user_message_str(
            "Now validate the Python code in the latest document draft and summarize any errors."
        )

        async for response_message in super().invoke(cloned_history):
            yield response_message
54 changes: 54 additions & 0 deletions python/samples/demos/document_generator/agents/content_creation_agent.py
# Copyright (c) Microsoft. All rights reserved.

import sys
from collections.abc import AsyncIterable

if sys.version_info >= (3, 12):
from typing import override # pragma: no cover
else:
from typing_extensions import override # pragma: no cover

from samples.demos.document_generator.agents.custom_agent_base import CustomAgentBase
from samples.demos.document_generator.plugins.repo_file_plugin import RepoFilePlugin
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.contents.chat_message_content import ChatMessageContent

INSTRUCTION = """
You are part of a chat with multiple agents focused on creating technical content.

Your task is to generate informative and engaging technical content,
including code snippets to explain concepts or demonstrate features.
Incorporate feedback by providing the updated full content with changes.
"""

DESCRIPTION = """
Select me to generate new content or to revise existing content.
"""


class ContentCreationAgent(CustomAgentBase):
    def __init__(self):
        kernel = self._create_kernel()
        kernel.add_plugin(plugin=RepoFilePlugin(), plugin_name="RepoFilePlugin")

        settings = kernel.get_prompt_execution_settings_from_service_id(service_id=CustomAgentBase.SERVICE_ID)
        settings.function_choice_behavior = FunctionChoiceBehavior.Auto()

        super().__init__(
            kernel=kernel,
            execution_settings=settings,
            name="ContentCreationAgent",
            instructions=INSTRUCTION.strip(),
            description=DESCRIPTION.strip(),
        )

    @override
    async def invoke(self, history: ChatHistory) -> AsyncIterable[ChatMessageContent]:
        cloned_history = history.model_copy(deep=True)
        cloned_history.add_user_message_str(
            "Now generate new content or revise existing content to incorporate feedback."
        )

        async for response_message in super().invoke(cloned_history):
            yield response_message
55 changes: 55 additions & 0 deletions python/samples/demos/document_generator/agents/custom_agent_base.py
# Copyright (c) Microsoft. All rights reserved.

import sys
from abc import ABC
from collections.abc import AsyncIterable
from typing import ClassVar

from opentelemetry import trace

if sys.version_info >= (3, 12):
from typing import override # pragma: no cover
else:
from typing_extensions import override # pragma: no cover

from semantic_kernel.agents.chat_completion.chat_completion_agent import ChatCompletionAgent
from semantic_kernel.connectors.ai.open_ai.services.open_ai_chat_completion import OpenAIChatCompletion
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.kernel import Kernel


class CustomAgentBase(ChatCompletionAgent, ABC):
    SERVICE_ID: ClassVar[str] = "chat_completion"

    def _create_kernel(self) -> Kernel:
        kernel = Kernel()
        kernel.add_service(OpenAIChatCompletion(service_id=self.SERVICE_ID))

        return kernel

    @override
    async def invoke(self, history: ChatHistory) -> AsyncIterable[ChatMessageContent]:
        # This override wraps the agent invocation in a span whose name is the
        # name of the agent, a requested feature for the agent framework: #10174
        #
        # Since the history contains internal messages from other agents,
        # we will do our best to filter those out. Unfortunately, there will
        # be a side effect of losing the context of the conversation internal
        # to the agent when the conversation is handed back to the agent, i.e.
        # previous function call results.
        filtered_chat_history = ChatHistory()
        for message in history:
            content = message.content
            # We don't want to add messages whose text content is empty.
            # Those messages are likely messages from function calls and function results.
            if content:
                filtered_chat_history.add_message(message)

        tracer = trace.get_tracer(__name__)
        response_messages: list[ChatMessageContent] = []
        with tracer.start_as_current_span(self.name):
            # Cache the messages within the span such that subsequent spans
            # that process the message stream don't become children of this span
            async for response_message in super().invoke(filtered_chat_history):
                response_messages.append(response_message)

        for response_message in response_messages:
            yield response_message
55 changes: 55 additions & 0 deletions python/samples/demos/document_generator/agents/user_agent.py
# Copyright (c) Microsoft. All rights reserved.

import sys
from collections.abc import AsyncIterable

if sys.version_info >= (3, 12):
from typing import override # pragma: no cover
else:
from typing_extensions import override # pragma: no cover

from samples.demos.document_generator.agents.custom_agent_base import CustomAgentBase
from samples.demos.document_generator.plugins.user_plugin import UserPlugin
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.contents.chat_message_content import ChatMessageContent

INSTRUCTION = """
You are part of a chat with multiple agents working on a document.

Your task is to summarize the user's feedback on the latest draft from the author agent.
Present the draft to the user and summarize their feedback.

Do not try to address the user's feedback in this chat.
"""

DESCRIPTION = """
Select me if you want to ask the user to review the latest draft for publication.
"""


class UserAgent(CustomAgentBase):
    def __init__(self):
        kernel = self._create_kernel()
        kernel.add_plugin(plugin=UserPlugin(), plugin_name="UserPlugin")

        settings = kernel.get_prompt_execution_settings_from_service_id(service_id=CustomAgentBase.SERVICE_ID)
        settings.function_choice_behavior = FunctionChoiceBehavior.Auto(maximum_auto_invoke_attempts=1)

        super().__init__(
            kernel=kernel,
            execution_settings=settings,
            name="UserAgent",
            instructions=INSTRUCTION.strip(),
            description=DESCRIPTION.strip(),
        )

    @override
    async def invoke(self, history: ChatHistory) -> AsyncIterable[ChatMessageContent]:
        cloned_history = history.model_copy(deep=True)
        cloned_history.add_user_message_str(
            "Now present the latest draft to the user for feedback and summarize their feedback."
        )

        async for response_message in super().invoke(cloned_history):
            yield response_message