Orchestrating multiple AI agents with Durable Functions

I built a simple sample that coordinates multiple AI agents using the fan-out / fan-in pattern in Durable Functions.

Overview

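I summarized why Durable Functions in Azure Functions looks well suited to controlling multiple AI agents in the following article.

Trying out Durable Functions in Azure Functions

This time I used it to see how multiple AI agents can actually be put to work. A fan-out / fan-in setup in Durable Functions normally looks like the figure below; here, each function sends a request to an AI agent. At the end, the individual results are combined into a single LLM input, and the model is asked to summarize them.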

fan-in/fan-out

The conceptual diagram is as follows.

Multi Agent System

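This time I built a multi-agent system that creates a travel plan. The search agent searches for the user's question and returns the matching place name. The sightseeing research agent, which answers with tourist attractions for that place, and the local specialty research agent then investigate in parallel. The orchestration waits until both results are available, and the travel plan agent merges them into a plan.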
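
The flow described above can be approximated locally with plain asyncio, which may help when reasoning about the ordering before anything is deployed. This is only a sketch: it does not use Durable Functions, and every function name here (find_city, find_attractions, find_food, make_plan, run_pipeline) is a hypothetical stand-in that returns canned text instead of calling an agent.

```python
import asyncio


async def find_city(question: str) -> str:
    # Hypothetical stand-in for the search agent; returns a fixed city.
    await asyncio.sleep(0)
    return "Auckland"


async def find_attractions(city: str) -> str:
    # Hypothetical stand-in for the sightseeing research agent.
    await asyncio.sleep(0)
    return f"attractions in {city}"


async def find_food(city: str) -> str:
    # Hypothetical stand-in for the local specialty research agent.
    await asyncio.sleep(0)
    return f"local food in {city}"


async def make_plan(findings: list) -> str:
    # Hypothetical stand-in for the travel plan agent.
    return "Plan based on: " + "; ".join(findings)


async def run_pipeline(question: str) -> str:
    # Step 1: a single sequential call decides the city.
    city = await find_city(question)
    # Step 2 (fan-out / fan-in): start both research tasks and wait for both.
    # asyncio.gather returns results in the order the tasks were passed in.
    findings = await asyncio.gather(find_attractions(city), find_food(city))
    # Step 3: one final call consumes the combined results.
    return await make_plan(list(findings))


plan = asyncio.run(run_pipeline("Where was the movie filmed?"))
print(plan)
```

In the Durable Functions version, the waiting in step 2 is handled by the orchestrator rather than by an event loop in the caller's own process.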

Steps

As before, I followed the public documentation below.

Quickstart: Create a Python Durable Functions app

For the AI agents, I created agents that run Bing searches with Azure AI Agent Service; the service itself is covered in detail in the article below. The travel plan agent does not need Bing search, so it simply uses the Chat Completions API.

Deploying the Azure AI Agent Service sample demo

To run this, Azure Functions has to access Azure AI Foundry and invoke the agents. I therefore enabled the system-assigned managed identity on the Azure Functions app and assigned it the [AzureML Data Scientist] role on the Azure AI Foundry project.

RBAC

Environment variables must be defined on the Azure Functions app where the code runs, so I set them from Visual Studio Code, referring to the following.

Enable the v2 programming model

"ENDPOINT_URL"=
"DEPLOYMENT_NAME"=
"AZURE_OPENAI_API_KEY"=
"AZURE_PROJECTS_CONNECTION_STRING"=
"AZURE_OPENAI_MODEL"=
"AZURE_BING_SEARCH_CONNECTION_NAME"=

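A missing setting only shows up at request time as a failed client call, so a small startup check can save some debugging. The sketch below assumes the six setting names listed above; the helper name missing_settings is hypothetical and not part of the deployed code.

```python
import os

# Setting names taken from the list above.
REQUIRED_SETTINGS = [
    "ENDPOINT_URL",
    "DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_KEY",
    "AZURE_PROJECTS_CONNECTION_STRING",
    "AZURE_OPENAI_MODEL",
    "AZURE_BING_SEARCH_CONNECTION_NAME",
]


def missing_settings(environ=None) -> list:
    """Return the required setting names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]
```

Calling missing_settings() once at import time and logging the result would be one way to use it.
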
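Each agent is outlined below.

Search agent

For the user's question, this agent returns only a city name, and then returns the questions for the next agents based on that city. The Bing search part calls an Azure AI Agent Service agent, whose behavior is defined by the system prompt passed from f1. Because this is a sample, the follow-up questions are fixed templates; generating them automatically would add flexibility. If stable behavior matters more, however, templating them like this is a reasonable option.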

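def f1(input: str) -> list:
    # Search for the user's question, then build follow-up questions about the place that was found.
    # System prompt (Japanese): "You are a helpful assistant that uses Bing Search to answer every
    # user question. Follow the user's requests strictly."
    system_prompt = "あなたは、すべての質問に対して Bing Searchを使用してユーザーの質問に答える役立つアシスタントです。ユーザーの要望には厳密に従ってください。"
    city, _ = bing_search(input, system_prompt)
    # Templated questions (Japanese): "Tell me the tourist attractions of {city}." and
    # "Tell me the famous local foods of {city}."
    user_question = [f"{city}の観光名所を教えてください。", f"{city}の名物の食べ物を教えてください。"]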
    return user_question

def bing_search(prompt, system_prompt):
    try:
        # Initialize AI Project client
        project_client = AIProjectClient.from_connection_string(
            credential = DefaultAzureCredential(),
            conn_str = project_connstring
        )
    
        # Initialize Bing Grounding Tool
        bing_connection = project_client.connections.get(
            connection_name = bing_search_name
        )
        connection_id = bing_connection.id
        bing = BingGroundingTool(connection_id=connection_id)
        # Initiate Agent Service
        agent = project_client.agents.create_agent(
            model = gpt_model,
            name = "bing-search-agent",
            instructions = system_prompt,
            tools = bing.definitions,
            headers={"x-ms-enable-preview": "true"}
        )
        print(f"Created agent, agent ID: {agent.id}")
        
        # Create a thread
        thread = project_client.agents.create_thread()
        print(f"Created thread, thread ID: {thread.id}")
        
        # Create a message
        message = project_client.agents.create_message(
            thread_id = thread.id,
            role = "user",
            content = prompt
        )
        print(f"Created message, message ID: {message.id}")
        # Run the agent
        run = project_client.agents.create_and_process_run(
            thread_id = thread.id,
            agent_id = agent.id
        )
        # Check the run status
        if run.status == "failed":
            project_client.agents.delete_agent(agent.id)
            print(f"Deleted agent, agent ID: {agent.id}")
            return f"Run failed: {run.last_error}", ""
        # Retrieve messages from the agent
        messages = project_client.agents.list_messages(thread_id=thread.id)
        result = ""
        citations = []
        for message in messages.data:
            if message.role == "assistant":
                result = message.content[0].text.value                
                for annotation in message.content[0].text.annotations:
                    citation_text = annotation.text
                    citation_url = annotation['url_citation']['url']
                    citations.append(f"{citation_text}: {citation_url}")
                print("Retrieved groundings from Bing Search")
                
        # Delete the agent once done
        project_client.agents.delete_agent(agent.id)
        print(f"Deleted agent, agent ID: {agent.id}")
        
        result = result if result else "No response from agent."
        return result, citations
    except Exception as e:
        return f"An error occurred: {e}", ""
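
One consequence of the templating in f1 is that whatever bing_search returns is inserted into the questions as the city, including its error strings. The sketch below shows one possible guard; the helper name build_questions is hypothetical, and the prefixes it checks are the three fallback messages that bing_search returns in the code above.

```python
# Fallback messages that bing_search returns instead of a real answer.
ERROR_PREFIXES = ("An error occurred:", "Run failed:", "No response from agent.")


def build_questions(city: str) -> list:
    """Build the two templated follow-up questions for a city name."""
    city = city.strip()
    if not city or city.startswith(ERROR_PREFIXES):
        # Fail early rather than asking the next agents about an error message.
        raise ValueError(f"search agent did not return a city: {city!r}")
    return [
        f"{city}の観光名所を教えてください。",
        f"{city}の名物の食べ物を教えてください。",
    ]
```

Raising inside an activity marks that activity as failed, which the orchestrator could then handle or retry; whether that is preferable to passing the text through depends on the scenario.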

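Sightseeing research agent

The basic behavior is the same as f1: the system prompt defines the behavior and specializes the agent in searching for tourist attractions.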

def f2(input: str) -> str:
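    # Research tourist attractions and answer.
    # System prompt (Japanese): "You are an agent that researches tourist attractions and answers
    # with five. Suggest attractions the user will be happy with."
    system_prompt = "あなたは観光名所を調べて 5 つ回答するエージェントです。ユーザーが満足するような観光名所を提案してください。"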
    result, _ = bing_search(input, system_prompt)
    return result

Local specialty research agent

This one also works like f1, with a system prompt that specializes it in searching for local specialties.

def f3(input: str) -> str:
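    # Research local specialties and answer.
    # System prompt (Japanese): "You are an agent that researches the famous foods of the area and
    # answers with five. Suggest meals the user will be happy with."
    system_prompt = "あなたはその土地の有名な食べ物を調べて5つ回答するエージェントです。ユーザーが満足するような食事を提案してください。"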
    result, _ = bing_search(input, system_prompt)
    return result

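Travel plan agent

Combining the results of the sightseeing agent and the local specialty agent, I had this agent create a plan for a family with small children. Its task is to summarize information and it needs no Bing search, so it is a simple implementation on the Chat Completions API.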

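def f4(input: list) -> str:
    # Merge both research results and ask the model for a family-oriented plan.
    # Prompt (Japanese): "After listing all the information in input once, create a plan that
    # appeals to a family with small children."
    input_text = "\n".join(input)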
    chat_prompt = [{
        "role": "system",
        "content": f"input にある情報をすべて一度記載した上で、小さい子供もいる家族に魅力的なプランを作成してください。 input : {input_text}"
    }]
    messages = chat_prompt  
    completion = client.chat.completions.create(  
        model=deployment,
        messages=messages,
        max_tokens=1000,  
        temperature=0.7,  
        top_p=0.95,  
        frequency_penalty=0,  
        presence_penalty=0,
        stop=None,  
        stream=False
    )
    summary = completion.to_json()
    return summary
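
Because f4 returns completion.to_json(), the orchestration output is the whole response serialized as a JSON string rather than just the plan. If only the text is wanted, the caller could pull it out as sketched below. The helper name extract_plan_text is hypothetical; the choices[0].message.content path is the standard Chat Completions response shape.

```python
import json


def extract_plan_text(completion_json: str) -> str:
    """Return the assistant message text from a serialized chat completion."""
    data = json.loads(completion_json)
    choices = data.get("choices") or []
    if not choices:
        return ""
    message = choices[0].get("message") or {}
    # content can be null in some responses, so fall back to an empty string.
    return message.get("content") or ""
```

An alternative would be to return completion.choices[0].message.content from f4 directly, at the cost of losing the usage and metadata fields.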

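Results

As a sample, I had the system generate a plan for touring the filming locations of the live-action Minecraft movie, which was released in April 2025. The execution result is shown below.

Official site of the movie "A Minecraft Movie" (マインクラフト／ザ・ムービー)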

Result

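Looking at the generated plan, it is a tour of Auckland. The live-action Minecraft movie does appear to have been filmed in Auckland, New Zealand. The other attractions and specialties also seem to be ones listed in Auckland tourist guides.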

Where was A Minecraft Movie filmed? Filming locations of Jack Black’s latest adventure comedy, explored

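The generated plan is shown below, and it looks like a good fit for a family. Giving each AI agent its own role in this way is thought to make it easier to get the expected result even for complex tasks. Adding, for example, an agent that uses RAG over your own data should yield more accurate and more distinctive answers.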

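### A family-friendly plan for Auckland

Auckland has many attractions that families with children can enjoy. Below is a suggested one-day plan for the whole family.

#### Morning: Adventure at the Sky Tower
- Visit the **Sky Tower** and enjoy Auckland's beautiful scenery from the observation deck 328 meters above the ground. The SkyJump and SkyWalk, which children can also enjoy, are thrilling, but you can also take in the view safely, much like riding a Ferris wheel.

#### Lunch: Enjoy fish and chips
- There are many popular local restaurants around the Sky Tower. Enjoy **fish and chips** here and share them with the whole family. The portions are generous, and it is a dish children love.

#### Afternoon: Explore Hobbiton
- After lunch, head to Hobbiton. It is an irresistible spot for fans of the movies "The Lord of the Rings" and "The Hobbit". Children can enjoy touring the hobbit houses and taking photos.

#### Evening: Spectacular views from Mount Eden
- After returning from Hobbiton, stop by **Mount Eden**. Here you can enjoy a hike that even children can manage energetically. The 360-degree panoramic view from the summit is especially beautiful at dusk and offers a moment the family will remember.

#### Night: Enjoy local cuisine
- For dinner, enjoy **lamb chops** or **roast lamb** at a restaurant in Auckland. Many places have children's menus, so the whole family can share a delicious meal.

#### Optional: Play on the beaches of Waiheke Island
- If you have time to spare, taking the ferry to **Waiheke Island** and playing on its beautiful beaches is also recommended. The family can go swimming in the sea or have a picnic.

Through this plan, the whole family can make happy memories while enjoying Auckland's many charms.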

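Incidentally, the Bing search agents created with Azure AI Agent Service use gpt-4o. As the following page notes, its training data extends to October 2023, so the model alone cannot answer questions about anything after that.

GPT-4 and GPT-4 Turbo models

As a result, when I asked gpt-4o by itself about the filming locations of the live-action Minecraft movie released in April 2025, it could not give an answer. Combining it with Bing search gives it access to current information, so the agent was able to answer.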

Response

Code

The code actually deployed to Azure Functions, including every agent, is below.

import azure.functions as func
import azure.durable_functions as df
from openai import AzureOpenAI  
import os  
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import BingGroundingTool
from azure.identity import DefaultAzureCredential

endpoint = os.getenv("ENDPOINT_URL")  
deployment = os.getenv("DEPLOYMENT_NAME")  
subscription_key = os.getenv("AZURE_OPENAI_API_KEY")  
project_connstring = os.getenv("AZURE_PROJECTS_CONNECTION_STRING")
gpt_model = os.getenv("AZURE_OPENAI_MODEL")
bing_search_name = os.getenv("AZURE_BING_SEARCH_CONNECTION_NAME")

client = AzureOpenAI(  
    azure_endpoint=endpoint,  
    api_key=subscription_key,  
    api_version="2024-05-01-preview",
)

myApp = df.DFApp(http_auth_level=func.AuthLevel.ANONYMOUS)
# An HTTP-triggered function with a Durable Functions client binding
@myApp.route(route="orchestrators/{functionName}")
@myApp.durable_client_input(client_name="client")
async def http_start(req: func.HttpRequest, client):
    function_name = req.route_params.get('functionName')
    instance_id = await client.start_new(function_name)
    response = client.create_check_status_response(req, instance_id)
    return response
# Orchestrator

@myApp.orchestration_trigger(context_name="context")
def orchestrator_function(context: df.DurableOrchestrationContext):
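    # Prompt (Japanese): "Where was A Minecraft Movie filmed? Answer with one city only; no
    # information other than the city name is needed. Example output: Barcelona, Tokyo"
    query = yield context.call_activity("f1", "A Minecraft Movie の撮影地はどこですか。一都市だけ回答して、開催地の都市名以外の情報は必要ありません。出力例 : バルセロナ、東京")
    # Fan-out: run the sightseeing (f2) and local specialty (f3) agents in parallel.
    parallel_tasks = [ context.call_activity("f2", query[0]), context.call_activity("f3", query[1]) ]
    # Fan-in: wait until both activities have finished.
    outputs = yield context.task_all(parallel_tasks)
    # Aggregate both outputs and send the result to f4.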
    result = yield context.call_activity("f4", outputs)
    return result

# Activity functions
@myApp.activity_trigger(input_name="input")
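def f1(input: str) -> list:
    # Search for the user's question, then build follow-up questions about the place that was found.
    # System prompt (Japanese): "You are a helpful assistant that uses Bing Search to answer every
    # user question. Follow the user's requests strictly."
    system_prompt = "あなたは、すべての質問に対して Bing Searchを使用してユーザーの質問に答える役立つアシスタントです。ユーザーの要望には厳密に従ってください。"
    city, _ = bing_search(input, system_prompt)
    # Templated questions (Japanese): "Tell me the tourist attractions of {city}." and
    # "Tell me the famous local foods of {city}."
    user_question = [f"{city}の観光名所を教えてください。", f"{city}の名物の食べ物を教えてください。"]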
    return user_question

@myApp.activity_trigger(input_name="input")
def f2(input: str) -> str:
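    # Research tourist attractions and answer.
    # System prompt (Japanese): "You are an agent that researches tourist attractions and answers
    # with five. Suggest attractions the user will be happy with."
    system_prompt = "あなたは観光名所を調べて 5 つ回答するエージェントです。ユーザーが満足するような観光名所を提案してください。"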
    result, _ = bing_search(input, system_prompt)
    return result

@myApp.activity_trigger(input_name="input")
def f3(input: str) -> str:
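    # Research local specialties and answer.
    # System prompt (Japanese): "You are an agent that researches the famous foods of the area and
    # answers with five. Suggest meals the user will be happy with."
    system_prompt = "あなたはその土地の有名な食べ物を調べて5つ回答するエージェントです。ユーザーが満足するような食事を提案してください。"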
    result, _ = bing_search(input, system_prompt)
    return result

@myApp.activity_trigger(input_name="input")
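def f4(input: list) -> str:
    # Merge both research results and ask the model for a family-oriented plan.
    # Prompt (Japanese): "After listing all the information in input once, create a plan that
    # appeals to a family with small children."
    input_text = "\n".join(input)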
    chat_prompt = [{
        "role": "system",
        "content": f"input にある情報をすべて一度記載した上で、小さい子供もいる家族に魅力的なプランを作成してください。 input : {input_text}"
    }]
    messages = chat_prompt  
    completion = client.chat.completions.create(  
        model=deployment,
        messages=messages,
        max_tokens=1000,  
        temperature=0.7,  
        top_p=0.95,  
        frequency_penalty=0,  
        presence_penalty=0,
        stop=None,  
        stream=False
    )
    summary = completion.to_json()
    return summary

def bing_search(prompt, system_prompt):
    try:
        # Initialize AI Project client
        project_client = AIProjectClient.from_connection_string(
            credential = DefaultAzureCredential(),
            conn_str = project_connstring
        )
    
        # Initialize Bing Grounding Tool
        bing_connection = project_client.connections.get(
            connection_name = bing_search_name
        )
        connection_id = bing_connection.id
        bing = BingGroundingTool(connection_id=connection_id)
        # Initiate Agent Service
        agent = project_client.agents.create_agent(
            model = gpt_model,
            name = "bing-search-agent",
            instructions = system_prompt,
            tools = bing.definitions,
            headers={"x-ms-enable-preview": "true"}
        )
        print(f"Created agent, agent ID: {agent.id}")
        
        # Create a thread
        thread = project_client.agents.create_thread()
        print(f"Created thread, thread ID: {thread.id}")
        
        # Create a message
        message = project_client.agents.create_message(
            thread_id = thread.id,
            role = "user",
            content = prompt
        )
        print(f"Created message, message ID: {message.id}")
        # Run the agent
        run = project_client.agents.create_and_process_run(
            thread_id = thread.id,
            agent_id = agent.id
        )
        # Check the run status
        if run.status == "failed":
            project_client.agents.delete_agent(agent.id)
            print(f"Deleted agent, agent ID: {agent.id}")
            return f"Run failed: {run.last_error}", ""
        # Retrieve messages from the agent
        messages = project_client.agents.list_messages(thread_id=thread.id)
        result = ""
        citations = []
        for message in messages.data:
            if message.role == "assistant":
                result = message.content[0].text.value                
                for annotation in message.content[0].text.annotations:
                    citation_text = annotation.text
                    citation_url = annotation['url_citation']['url']
                    citations.append(f"{citation_text}: {citation_url}")
                print("Retrieved groundings from Bing Search")
                
        # Delete the agent once done
        project_client.agents.delete_agent(agent.id)
        print(f"Deleted agent, agent ID: {agent.id}")
        
        result = result if result else "No response from agent."
        return result, citations
    except Exception as e:
        return f"An error occurred: {e}", ""
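
Since http_start returns the response from create_check_status_response, a caller receives a set of management URLs rather than the plan itself and has to poll for completion. The sketch below shows one way to do that with only the standard library. The names fetch_json and wait_for_result are hypothetical, and the polling interval and limit are arbitrary choices, not values from the deployed code.

```python
import json
import time
import urllib.request

# Runtime statuses after which the orchestration will not change any more.
TERMINAL_STATES = {"Completed", "Failed", "Terminated", "Canceled"}


def fetch_json(url: str) -> dict:
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_result(status_url: str, fetch=fetch_json, interval: float = 2.0, max_polls: int = 150) -> dict:
    """Poll the status endpoint until the orchestration reaches a terminal state."""
    for _ in range(max_polls):
        status = fetch(status_url)
        if status.get("runtimeStatus") in TERMINAL_STATES:
            return status
        time.sleep(interval)
    raise TimeoutError(f"orchestration still running after {max_polls} polls")
```

With the Functions host running locally, the start URL would typically look like http://localhost:7071/api/orchestrators/orchestrator_function (host and port are assumptions). The JSON body of the 202 response includes a statusQueryGetUri field, which is the URL to pass to wait_for_result; the plan is then in the output field of the returned status.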

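Summary

I tried out a multi-agent configuration built on Durable Functions. Durable Functions handles the fan-out / fan-in implementation, so it was convenient to only have to think about the LLM logic deployed on top of it. I confirmed that an AI agent can answer questions that a standalone model cannot. The scenario was relatively simple, but the results suggest that dividing roles finely among agents makes it possible to get answers even for complex tasks. For more complex tasks too, using finely divided agents like these, together with your own data, should make AI agents even more dependable.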