{"id":4467,"date":"2026-02-12T20:44:55","date_gmt":"2026-02-12T20:44:55","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2026\/02\/12\/build-long-running-mcp-servers-on-amazon-bedrock-agentcore-with-strands-agents-integration\/"},"modified":"2026-02-12T20:44:55","modified_gmt":"2026-02-12T20:44:55","slug":"build-long-running-mcp-servers-on-amazon-bedrock-agentcore-with-strands-agents-integration","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2026\/02\/12\/build-long-running-mcp-servers-on-amazon-bedrock-agentcore-with-strands-agents-integration\/","title":{"rendered":"Build long-running MCP servers on Amazon Bedrock AgentCore with Strands Agents integration"},"content":{"rendered":"<div id=\"\">\n<p>AI agents are rapidly evolving from mere chat interfaces into sophisticated autonomous workers that handle complex, time-intensive tasks. As organizations deploy agents to train <a href=\"https:\/\/aws.amazon.com\/ai\/machine-learning\/\" target=\"_blank\" rel=\"noopener noreferrer\">machine learning<\/a> (ML) models, process large datasets, and run extended simulations, the <a href=\"https:\/\/modelcontextprotocol.io\/docs\/getting-started\/intro\" target=\"_blank\" rel=\"noopener noreferrer\">Model Context Protocol<\/a> (MCP) has emerged as a standard for agent-server integrations. But a critical challenge remains: these operations can take minutes or hours to complete, far exceeding typical session timeframes. By using <a href=\"https:\/\/aws.amazon.com\/bedrock\/agentcore\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Bedrock AgentCore<\/a> and <a href=\"https:\/\/strandsagents.com\/latest\/\" target=\"_blank\" rel=\"noopener noreferrer\">Strands Agents<\/a> to implement persistent state management, you can enable seamless, cross-session task execution in production environments. 
Imagine your AI agent initiating a multi-hour data processing job, your user closing their laptop, and the system seamlessly retrieving completed results when the user returns days later\u2014with full visibility into task progress, outcomes, and errors. This capability transforms AI agents from conversational assistants into reliable autonomous workers that can handle enterprise-scale operations. Without these architectural patterns, you\u2019ll encounter timeout errors, inefficient resource utilization, and potential data loss when connections terminate unexpectedly.<\/p>\n<p>In this post, we provide you with a comprehensive approach to achieve this. First, we introduce a context message strategy that maintains continuous communication between servers and clients during extended operations. Next, we develop an asynchronous task management framework that allows your AI agents to initiate long-running processes without blocking other operations. Finally, we demonstrate how to bring these strategies together with Amazon Bedrock AgentCore and Strands Agents to build production-ready AI agents that can handle complex, time-intensive operations reliably.<\/p>\n<h2>Common approaches to handle long-running tasks<\/h2>\n<p>When designing MCP servers for long-running tasks, you might face a fundamental architectural decision: should the server maintain an active connection and provide real-time updates, or should it decouple task execution from the initial request? This choice leads to two distinct approaches: <strong>context messaging<\/strong> and<strong> async task management<\/strong>.<\/p>\n<h3>Using context messaging<\/h3>\n<p>The context messaging approach maintains continuous communication between the MCP server and client throughout task execution. This is achieved by using MCP\u2019s built-in context object to send periodic notifications to the client. 
This approach is optimal for scenarios where tasks are typically completed within 10\u201315 minutes and network connectivity remains stable. The context messaging approach offers these advantages:<\/p>\n<ul>\n<li>Straightforward server-side implementation<\/li>\n<li>No additional polling logic required<\/li>\n<li>Straightforward client implementation<\/li>\n<li>Minimal overhead<\/li>\n<\/ul>\n<h3>Using async task management<\/h3>\n<p>The async task management approach separates task initiation from execution and result retrieval. When invoked, the MCP tool immediately returns a task initiation message and continues executing the task in the background. This approach excels in demanding enterprise scenarios where tasks might run for hours, users need flexibility to disconnect and reconnect, and system reliability is paramount. The async task management approach provides these benefits:<\/p>\n<ul>\n<li>True fire-and-forget operation<\/li>\n<li>Safe client disconnection while tasks continue processing<\/li>\n<li>Data loss prevention through persistent storage<\/li>\n<li>Support for long-running operations (hours)<\/li>\n<li>Resilience against network interruptions<\/li>\n<li>Asynchronous workflows<\/li>\n<\/ul>\n<h2>Context messaging<\/h2>\n<p>Let\u2019s begin by exploring the context messaging approach, which provides a straightforward solution for handling moderately long operations while maintaining active connections. This approach builds directly on existing capabilities of MCP and requires minimal additional infrastructure, making it an excellent starting point for extending your agent\u2019s processing time limits. Imagine you\u2019ve built an MCP server for an AI agent that helps data scientists train ML models. When a user asks the agent to train a complex model, the underlying process might take 10\u201315 minutes\u2014far beyond the typical 30-second to 2-minute HTTP timeout limit in most environments. 
Without a proper strategy, the connection would drop, the operation would fail, and the user would be left frustrated. For MCP clients using the Streamable HTTP transport, these timeout constraints are particularly limiting: when task execution exceeds the timeout limit, the connection aborts and the agent\u2019s workflow is interrupted. This is where context messaging comes in. The following diagram illustrates the workflow when implementing the context messaging approach. Context messaging uses the built-in context object of MCP to send periodic signals from the server to the MCP client, effectively keeping the connection alive throughout longer operations. Think of it as sending \u201cheartbeat\u201d messages that help prevent the connection from timing out.<\/p>\n<div id=\"attachment_124056\" class=\"wp-caption alignnone\">\n        <a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/image-1-1.png\"><img decoding=\"async\" loading=\"lazy\" aria-describedby=\"caption-attachment-124056\" class=\"wp-image-124056 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/image-1-1.png\" alt=\"Sequence diagram showing Model Context Protocol (MCP) architecture with four components: User, Agent (AI processor), MCP Server (communication manager), and MCP Tool (task executor). Flow: User queries Agent \u2192 Agent requests MCP Server \u2192 Server invokes Tool \u2192 Context messaging exchanges during execution \u2192 Tool returns output \u2192 Server processes and returns to Agent \u2192 Agent responds to User. 
Demonstrates layered architecture with intelligent intermediary and dynamic context messaging.\" width=\"762\" height=\"600\"><\/a><\/p>\n<p id=\"caption-attachment-124056\" class=\"wp-caption-text\">Figure 1: Illustration of workflow in context messaging approach<\/p>\n<\/p><\/div>\n<p>Here is a code example to implement the context messaging:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">from mcp.server.fastmcp import Context, FastMCP\nimport asyncio\n\nmcp = FastMCP(host=\"0.0.0.0\", stateless_http=True)\n\n@mcp.tool()\nasync def model_training(model_name: str, epochs: int, ctx: Context) -&gt; str:\n\u00a0\u00a0 \u00a0\"\"\"Execute a task with progress updates.\"\"\"\n\n\u00a0\u00a0 \u00a0for i in range(epochs):\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0# Simulate long running time training work\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0progress = (i + 1) \/ epochs\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0await asyncio.sleep(5)\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0await ctx.report_progress(\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0progress=progress,\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0total=1.0,\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0message=f\"Step {i + 1}\/{epochs}\",\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0)\n\n\u00a0\u00a0 \u00a0return f\"{model_name} training completed. The model artifact is stored in s3:\/\/templocation\/model.pickle . The model training score is 0.87, validation score is 0.82.\"\n\nif __name__ == \"__main__\":\n\u00a0\u00a0 \u00a0mcp.run(transport=\"streamable-http\")<\/code><\/pre>\n<\/p><\/div>\n<p>The key element here is the <code>Context<\/code> parameter in the tool definition. When you include a parameter with the <code>Context<\/code> type annotation, FastMCP automatically injects this object, giving you access to methods such as <code>ctx.info()<\/code> and <code>ctx.report_progress()<\/code>. 
These methods send messages to the connected client without terminating tool execution.<\/p>\n<p>The <code>report_progress()<\/code> calls within the training loop serve as those critical heartbeat messages, making sure the MCP connection remains active throughout the extended processing period.<\/p>\n<p>For many real-world scenarios, exact progress can\u2019t be easily quantified\u2014such as when processing unpredictable datasets or making external API calls. In these cases, you can implement a time-based heartbeat system:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">from mcp.server.fastmcp import Context, FastMCP\nimport time\nimport asyncio\n\nmcp = FastMCP(host=\"0.0.0.0\", stateless_http=True)\n\n@mcp.tool()\nasync def model_training(model_name: str, epochs: int, ctx: Context) -&gt; str:\n\u00a0\u00a0 \u00a0\"\"\"Execute a task with progress updates.\"\"\"\n\u00a0\u00a0 \u00a0done_event = asyncio.Event()\n\u00a0\u00a0 \u00a0start_time = time.time()\n\n\u00a0\u00a0 \u00a0async def timer():\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0while not done_event.is_set():\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0elapsed = time.time() - start_time\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0await ctx.info(f\"Processing ......: {elapsed:.1f} seconds elapsed\")\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0await asyncio.sleep(5) \u00a0# Check every 5 seconds\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0return\n\n\u00a0\u00a0 \u00a0timer_task = asyncio.create_task(timer())\n\n\u00a0\u00a0 \u00a0## main task#####################################\n\u00a0\u00a0 \u00a0for i in range(epochs):\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0# Simulate long running time training work\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0progress = (i + 1) \/ epochs\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0await asyncio.sleep(5)\n\u00a0\u00a0 \u00a0#################################################\n\n\u00a0\u00a0 \u00a0# Signal the timer to stop and clean up\n\u00a0\u00a0 
\u00a0done_event.set()\n\u00a0\u00a0 \u00a0await timer_task\n\n\u00a0\u00a0 \u00a0total_time = time.time() - start_time\n\u00a0\u00a0 \u00a0print(f\"\u23f1\ufe0f Total processing time: {total_time:.2f} seconds\")\n\n\u00a0\u00a0 \u00a0return f\"{model_name} training completed. The model artifact is stored in s3:\/\/templocation\/model.pickle . The model training score is 0.87, validation score is 0.82.\"\n\nif __name__ == \"__main__\":\n\u00a0\u00a0 \u00a0mcp.run(transport=\"streamable-http\")<\/code><\/pre>\n<\/p><\/div>\n<p>This pattern creates an asynchronous timer that runs alongside your main task, sending regular status updates every few seconds. Using <code>asyncio.Event()<\/code> for coordination facilitates clean shutdown of the timer when the main work is completed.<\/p>\n<h3>When to use context messaging<\/h3>\n<p>Context messaging works best when:<\/p>\n<ul>\n<li>Tasks take 1\u201315 minutes to complete*<\/li>\n<li>Network connections are generally stable<\/li>\n<li>The client session can remain active throughout the operation<\/li>\n<li>You need real-time progress updates during processing<\/li>\n<li>Tasks have predictable, finite execution times with clear termination conditions<\/li>\n<\/ul>\n<p>*Note: \u201c15 minutes\u201d reflects the maximum duration Amazon Bedrock AgentCore allows for synchronous requests. More details about Bedrock AgentCore service quotas can be found at <a href=\"https:\/\/docs.aws.amazon.com\/bedrock-agentcore\/latest\/devguide\/bedrock-agentcore-limits.html\" target=\"_blank\" rel=\"noopener noreferrer\">Quotas for Amazon Bedrock AgentCore<\/a>. If the infrastructure hosting the agent doesn\u2019t implement hard time limits, be extremely cautious when using this approach for tasks that might hang or run indefinitely. 
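One practical safeguard is to bound the tool's runtime with `asyncio.wait_for`, so a hung task fails fast with a clear message instead of holding the connection open. The sketch below is illustrative, not part of the original server: the 600-second ceiling and the `run_training` stand-in are assumptions.

```python
import asyncio

MAX_TOOL_SECONDS = 600  # hypothetical ceiling, set below the platform's hard limit

async def run_training(epochs: int) -> str:
    """Stand-in for the real long-running work."""
    for _ in range(epochs):
        await asyncio.sleep(0.01)
    return "training completed"

async def guarded_training(epochs: int, limit: float = MAX_TOOL_SECONDS) -> str:
    """Run the task, but abort with a clear error if it exceeds the time budget."""
    try:
        return await asyncio.wait_for(run_training(epochs), timeout=limit)
    except asyncio.TimeoutError:
        # Surface a definite failure instead of letting the connection hang
        return "training aborted: exceeded time budget"

if __name__ == "__main__":
    print(asyncio.run(guarded_training(3)))
```

The same wrapper can sit inside an MCP tool body, turning an indefinite hang into a bounded, reportable failure.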
Without proper safeguards, a stuck task could maintain an open connection indefinitely, leading to resource depletion, unresponsive processes, and potentially system-wide stability issues.<\/p>\n<p>Here are some important limitations to consider:<\/p>\n<ul>\n<li><strong>Continuous connection required<\/strong> \u2013 The client session must remain active throughout the entire operation. If the user closes their browser or the network drops, the work is lost.<\/li>\n<li><strong>Resource consumption<\/strong> \u2013 Keeping connections open consumes server and client resources, potentially increasing costs for long-running operations.<\/li>\n<li><strong>Network dependency<\/strong> \u2013 Network instability can still interrupt the process, requiring a full restart.<\/li>\n<li><strong>Ultimate timeout limits<\/strong> \u2013 Most infrastructures have hard timeout limits that can\u2019t be circumvented with heartbeat messages.<\/li>\n<\/ul>\n<p>Therefore, for truly long-running operations that might take hours or for scenarios where users need to disconnect and reconnect later, you\u2019ll need the more robust asynchronous task management approach.<\/p>\n<h2>Async task management<\/h2>\n<p>Unlike the context messaging approach where clients must maintain continuous connections, the async task management pattern follows a \u201cfire and forget\u201d model:<\/p>\n<ol>\n<li><strong>Task initiation<\/strong> \u2013 The client makes a request to start a task and immediately receives a task ID<\/li>\n<li><strong>Background processing<\/strong> \u2013 The server executes the work asynchronously, with no client connection required<\/li>\n<li><strong>Status checking<\/strong> \u2013 The client can reconnect at any time to check progress using the task ID<\/li>\n<li><strong>Result retrieval<\/strong> \u2013 Once the task is completed, results remain available for retrieval whenever the client reconnects<\/li>\n<\/ol>\n<p>The following figure illustrates the workflow in the asynchronous task 
management approach.<\/p>\n<div id=\"attachment_124058\" class=\"wp-caption alignnone\">\n        <a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/image-3-1.png\"><img decoding=\"async\" aria-describedby=\"caption-attachment-124058\" loading=\"lazy\" class=\"wp-image-124058 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/image-3-1.png\" alt=\"Sequence diagram showing Model Context Protocol (MCP) architecture with asynchronous task handling. Six components: User, Agent (AI processor), MCP Server, MCP Tool (task executor), Check Task Tool (status checker), and Cache (result storage). Flow: User queries Agent \u2192 Agent requests MCP Server \u2192 Server invokes MCP Tool \u2192 User receives immediate notice with Task ID \u2192 Tool executes and stores result in Cache \u2192 User checks task status via Agent \u2192 Agent requests Check Task Tool through MCP Server \u2192 Check Task Tool retrieves result from Cache using Task ID \u2192 Result returns through Server to Agent \u2192 Agent responds to User. Demonstrates asynchronous processing with task tracking and caching\" width=\"1210\" height=\"817\"><\/a><\/p>\n<p id=\"caption-attachment-124058\" class=\"wp-caption-text\">Figure 2: Illustration of workflow in asynchronous task management approach<\/p>\n<\/p><\/div>\n<p>This pattern mirrors how you interact with batch processing systems in enterprise environments\u2014submit a job, disconnect, and check back later when convenient. 
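From the client's side, that interaction is a submit-poll-fetch loop. The sketch below simulates the loop in-process with simplified stand-ins for the three tools; names, timings, and the in-memory dict are illustrative, not the article's server code.

```python
import asyncio
import uuid
from typing import Any, Dict

tasks: Dict[str, Dict[str, Any]] = {}

async def _work(task_id: str) -> None:
    """Stand-in for the background job."""
    tasks[task_id]["status"] = "running"
    await asyncio.sleep(0.05)
    tasks[task_id].update(status="completed", result="model trained")

def start_task() -> str:
    """Submit: register a record, kick off the work, return the ID at once."""
    task_id = str(uuid.uuid4())
    tasks[task_id] = {"status": "started"}
    asyncio.create_task(_work(task_id))
    return task_id

def check(task_id: str) -> str:
    """Poll: report current status for a task ID."""
    return tasks[task_id]["status"]

async def submit_poll_fetch() -> str:
    task_id = start_task()               # 1. initiate, receive a task ID immediately
    while check(task_id) != "completed":
        await asyncio.sleep(0.02)        # 2. poll at the client's convenience
    return tasks[task_id]["result"]      # 3. fetch the stored result

if __name__ == "__main__":
    print(asyncio.run(submit_poll_fetch()))
```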
Here\u2019s a practical implementation that demonstrates these principles:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">from mcp.server.fastmcp import Context, FastMCP\nimport asyncio\nimport uuid\nfrom typing import Dict, Any\n\nmcp = FastMCP(host=\"0.0.0.0\", stateless_http=True)\n\n# In-memory task storage (unsuitable for production; see limitations below)\ntasks: Dict[str, Dict[str, Any]] = {}\n\nasync def _execute_model_training(task_id: str, model_name: str, epochs: int):\n    \"\"\"Background task execution.\"\"\"\n    tasks[task_id][\"status\"] = \"running\"\n\n    for i in range(epochs):\n        tasks[task_id][\"progress\"] = (i + 1) \/ epochs\n        await asyncio.sleep(2)\n\n    tasks[task_id][\"result\"] = f\"{model_name} training completed. The model artifact is stored in s3:\/\/templocation\/model.pickle . The model training score is 0.87, validation score is 0.82.\"\n\n    tasks[task_id][\"status\"] = \"completed\"\n\n@mcp.tool()\nasync def model_training(model_name: str, epochs: int = 10) -&gt; str:\n    \"\"\"Start model training task.\"\"\"\n    task_id = str(uuid.uuid4())\n    tasks[task_id] = {\n        \"status\": \"started\",\n        \"progress\": 0.0,\n        \"task_type\": \"model_training\"\n    }\n    # create_task() requires a running event loop, so this tool is async\n    asyncio.create_task(_execute_model_training(task_id, model_name, epochs))\n    return f\"Model Training task has been initiated with task ID: {task_id}. Please check back later to monitor completion status and retrieve results.\"\n\n@mcp.tool()\ndef check_task_status(task_id: str) -&gt; Dict[str, Any]:\n    \"\"\"Check the status of a running task.\"\"\"\n    if task_id not in tasks:\n        return {\"error\": \"task not found\"}\n\n    task = tasks[task_id]\n    return {\n        \"task_id\": task_id,\n        \"status\": task[\"status\"],\n        \"progress\": task[\"progress\"],\n        \"task_type\": task.get(\"task_type\", \"unknown\")\n    }\n\n@mcp.tool()\ndef get_task_results(task_id: str) -&gt; Dict[str, Any]:\n    \"\"\"Get results from a completed task.\"\"\"\n    if task_id not in tasks:\n        return {\"error\": \"task not found\"}\n\n    task = tasks[task_id]\n    if task[\"status\"] != \"completed\":\n        return {\"error\": f\"task not completed. Current status: {task['status']}\"}\n\n    return {\n        \"task_id\": task_id,\n        \"status\": task[\"status\"],\n        \"result\": task[\"result\"]\n    }\n\nif __name__ == \"__main__\":\n    mcp.run(transport=\"streamable-http\")<\/code><\/pre>\n<\/p><\/div>\n<p>This implementation creates a task management system with three distinct MCP tools:<\/p>\n<ul>\n<li><code>model_training()<\/code> \u2013 The entry point that initiates a new task. 
Rather than performing the work directly, it:\n<ul>\n<li>Generates a unique task identifier using Universally Unique Identifier (UUID)<\/li>\n<li>Creates an initial task record in the storage dictionary<\/li>\n<li>Launches the actual processing as a background task using <code>asyncio.create_task()<\/code><\/li>\n<li>Returns immediately with the task ID, allowing the client to disconnect<\/li>\n<\/ul>\n<\/li>\n<li><code>check_task_status()<\/code> \u2013 Allows clients to monitor progress at their convenience by:\n<ul>\n<li>Looking up the task by ID in the storage dictionary<\/li>\n<li>Returning current status and progress information<\/li>\n<li>Providing appropriate error handling for missing tasks<\/li>\n<\/ul>\n<\/li>\n<li><code>get_task_results()<\/code>\u2013 Retrieves completed results when ready by:\n<ul>\n<li>Verifying the task exists and is completed<\/li>\n<li>Returning the results stored during background processing<\/li>\n<li>Providing clear error messages when results aren\u2019t ready<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>The actual work happens in the private <code>_execute_model_training()<\/code> function, which runs independently in the background after the initial client request is completed. 
It updates the task\u2019s status and progress in the shared storage as it progresses, making this information available for subsequent status checks.<\/p>\n<h3>Limitations to consider<\/h3>\n<p>Although the async task management approach helps solve connectivity issues, it introduces its own set of limitations:<\/p>\n<ul>\n<li><strong>User experience friction<\/strong> \u2013 The approach requires users to manually check task status, remember task IDs across sessions, and explicitly request results, increasing interaction complexity.<\/li>\n<li><strong>Volatile memory storage<\/strong> \u2013 Using in-memory storage (as in our example) means the tasks and results are lost if the server restarts, making the solution unsuitable for production without persistent storage.<\/li>\n<li><strong>Serverless environment constraints<\/strong> \u2013 In ephemeral serverless environments, instances are automatically terminated after periods of inactivity, causing the in-memory task state to be permanently lost. This creates a paradoxical situation where the solution designed to handle long-running operations becomes vulnerable to the very durations it aims to support. Unless users check in regularly to keep the session alive, both tasks and results could vanish.<\/li>\n<\/ul>\n<h3>Moving toward a robust solution<\/h3>\n<p>To address these critical limitations, you need external persistence that survives both server restarts and instance terminations. This is where integration with dedicated storage services becomes essential. By using external agent memory storage systems, you can fundamentally change where and how task information is maintained. 
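To see why external persistence changes the failure mode, even a minimal file-backed task store survives a process restart where the in-memory dict does not. This is only an illustrative sketch (the class name and JSON-file layout are assumptions); production systems would use a managed service such as AgentCore Memory instead.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional

class FileTaskStore:
    """Task records persisted as a JSON file so they outlive the process."""

    def __init__(self, path: str = "tasks.json"):
        self.path = Path(path)

    def _load(self) -> Dict[str, Any]:
        # Read the whole store; an absent file means no tasks yet
        if self.path.exists():
            return json.loads(self.path.read_text())
        return {}

    def save(self, task_id: str, record: Dict[str, Any]) -> None:
        data = self._load()
        data[task_id] = record
        self.path.write_text(json.dumps(data))

    def get(self, task_id: str) -> Optional[Dict[str, Any]]:
        return self._load().get(task_id)

if __name__ == "__main__":
    store = FileTaskStore("demo_tasks.json")
    store.save("t1", {"status": "completed", "result": "score 0.87"})
    # A fresh instance (simulating a restarted server) still sees the record
    print(FileTaskStore("demo_tasks.json").get("t1")["status"])
```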
Instead of relying on the MCP server\u2019s volatile memory, this approach uses persistent external agent memory storage services that remain available regardless of server state.<\/p>\n<p>The key innovation in this enhanced approach is that when the MCP server runs a long-running task, it writes the interim or final results directly into external memory storage, such as <a href=\"https:\/\/docs.aws.amazon.com\/bedrock-agentcore\/latest\/devguide\/memory.html\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Bedrock AgentCore Memory<\/a> that the agent can access, as illustrated in the following figure. This helps create resilience against two types of runtime failures:<\/p>\n<ol>\n<li>The instance running the MCP server can be terminated due to inactivity after task completion<\/li>\n<li>The instance hosting the agent itself can be recycled in ephemeral serverless environments<\/li>\n<\/ol>\n<div id=\"attachment_124060\" class=\"wp-caption alignnone\">\n        <a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/image-5-1.png\"><img decoding=\"async\" aria-describedby=\"caption-attachment-124060\" loading=\"lazy\" class=\"wp-image-124060 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/image-5-1.png\" alt=\"Sequence diagram showing Model Context Protocol (MCP) architecture with event-driven synchronization and memory management. Five components: User, Agent (AI processor), AgentCore Memory (event storage), MCP Server, and MCP Tool (task executor). 
Flow: User queries Agent \u2192 Agent requests MCP Server with Event Sync to AgentCore Memory \u2192 Server invokes MCP Tool \u2192 Tool sends immediate notice \u2192 User receives notification \u2192 Tool executes and outputs result, adding event to AgentCore Memory \u2192 Multiple Event Sync operations occur between Agent and AgentCore Memory \u2192 User checks task status \u2192 Agent retrieves information via Event Sync \u2192 Agent responds to User. Demonstrates event-driven architecture with synchronized memory management across agent sessions.\" width=\"1060\" height=\"806\"><\/a><\/p>\n<p id=\"caption-attachment-124060\" class=\"wp-caption-text\">Figure 3. MCP integration with external memory<\/p>\n<\/p><\/div>\n<p>With external memory storage, when users return to interact with the agent\u2014whether minutes, hours, or days later\u2014the agent can retrieve the completed task results from persistent storage. This approach minimizes runtime dependencies: even if both the MCP server and agent instances are terminated, the task results remain safely preserved and accessible when needed.<\/p>\n<p>The next section will explore how to implement this robust solution using <a href=\"https:\/\/docs.aws.amazon.com\/bedrock-agentcore\/latest\/devguide\/agents-tools-runtime.html\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Bedrock AgentCore Runtime<\/a> as a serverless hosting environment, AgentCore Memory for persistent agent memory storage, and the Strands Agents framework to orchestrate these components into a cohesive system that maintains task state across session boundaries.<\/p>\n<h2>Amazon Bedrock AgentCore and Strands Agents implementation<\/h2>\n<p>Before diving into the implementation details, it\u2019s important to understand the deployment options available for MCP servers on Amazon Bedrock AgentCore. 
There are two primary approaches: <a href=\"https:\/\/docs.aws.amazon.com\/bedrock-agentcore\/latest\/devguide\/gateway.html\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Bedrock AgentCore Gateway<\/a> and AgentCore Runtime. AgentCore Gateway has a 5-minute timeout for invocations, making it unsuitable for hosting MCP servers that provide tools requiring extended response times or long-running operations. AgentCore Runtime offers significantly more flexibility, with a 15-minute timeout for synchronous requests, an adjustable maximum session duration for asynchronous processes (the default is 8 hours), and an adjustable idle session timeout. Although you could host an MCP server in a traditional serverful environment for unlimited execution time, AgentCore Runtime provides an optimal balance for most production scenarios. You gain serverless benefits such as automatic scaling, pay-per-use pricing, and no infrastructure management, while the adjustable maximum session duration covers most real-world long-running tasks\u2014from data processing and model training to report generation and complex simulations. You can use this approach to build sophisticated AI agents without the operational overhead of managing servers, reserving serverful deployments for the rare cases that genuinely require multiday executions. For more information about AgentCore Runtime and AgentCore Gateway service quotas, refer to <a href=\"https:\/\/docs.aws.amazon.com\/bedrock-agentcore\/latest\/devguide\/bedrock-agentcore-limits.html\" target=\"_blank\" rel=\"noopener noreferrer\">Quotas for Amazon Bedrock AgentCore<\/a>.<\/p>\n<p>Next, we walk through the implementation, which is illustrated in the following diagram. This implementation consists of two interconnected components: the MCP server that executes long-running tasks and writes results to AgentCore Memory, and the agent that manages the conversation flow and retrieves those results when needed. 
This architecture creates a seamless experience where users can disconnect during lengthy processes and return later to find their results waiting for them.<\/p>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/image-7.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-124062\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/image-7.png\" alt=\"Architecture diagram showing AgentCore Runtime system with three main components and their interactions. Left: User interacts with Agent (dollar sign icon) within AgentCore Runtime, exchanging queries and responses. Agent connects to MCP Client which sends tasks and receives tool results. Center-right: AgentCore Runtime contains MCP Server with Tools component. Bottom-left: Bedrock LLM (brain icon) connects to Agent. Bottom-center: AgentCore Memory component stores session data. Three numbered interaction flows: (1) MCP Client connects to MCP Server using bearer token, content-type, and session\/memory\/actor IDs in request header; (2) Tools write results to AgentCore Memory upon task completion using session\/memory\/actor IDs for seamless continuity across disconnections; (3) Agent synchronizes with AgentCore Memory when new conversations are added for timely retrieval of tool-generated results. 
Demonstrates integrated architecture for agent-based task processing with persistent memory and LLM capabilities.\" width=\"1433\" height=\"678\"><\/a><\/p>\n<h3>MCP server implementation<\/h3>\n<p>Let\u2019s examine how our MCP server implementation uses AgentCore Memory to achieve persistence:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">from mcp.server.fastmcp import Context, FastMCP\nimport asyncio\nimport json\nfrom bedrock_agentcore.memory import MemoryClient\n\nmcp = FastMCP(host=\"0.0.0.0\", stateless_http=True)\nagentcore_memory_client = MemoryClient()\n\nasync def _execute_model_training(\n        model_name: str,\n        epochs: int,\n        session_id: str,\n        actor_id: str,\n        memory_id: str\n    ):\n    \"\"\"Background task: simulate training, then persist the results to AgentCore Memory.\"\"\"\n\n    for i in range(epochs):\n        await asyncio.sleep(2)  # placeholder for one epoch of real training work\n\n    try:\n        response = agentcore_memory_client.create_event(\n            memory_id=memory_id,\n            actor_id=actor_id,\n            session_id=session_id,\n            messages=[\n                (\n                    json.dumps({\n                        \"message\": {\n                            \"role\": \"user\",\n                            \"content\": [\n                                {\n                                    \"text\": f\"{model_name} training completed. The model artifact is stored in s3:\/\/templocation\/model.pickle. The model training score is 0.87, validation score is 0.82.\"\n                                }\n                            ]\n                        },\n                        \"message_id\": 0\n                    }),\n                    'USER'\n                )\n            ]\n        )\n        print(response)\n    except Exception as e:\n        print(f\"Memory save error: {e}\")\n\n    return\n\n@mcp.tool()\ndef model_training(\n        model_name: str,\n        epochs: int,\n        ctx: Context\n    ) -&gt; str:\n    \"\"\"Start a model training task in the background and return immediately.\"\"\"\n\n    # The composite identifier arrives in the standard Mcp-Session-Id header\n    # as session_id@@@memory_id@@@actor_id\n    mcp_session_id = ctx.request_context.request.headers.get(\"mcp-session-id\", \"\")\n    session_id, memory_id, actor_id = mcp_session_id.split(\"@@@\")\n\n    asyncio.create_task(_execute_model_training(\n        model_name,\n        epochs,\n        session_id,\n        actor_id,\n        memory_id\n    ))\n    return f\"Model {model_name} training task has been initiated. Total training epochs are {epochs}. The results will be updated once the training is completed.\"\n\n\nif __name__ == \"__main__\":\n    mcp.run(transport=\"streamable-http\")<\/code><\/pre>\n<\/p><\/div>\n<p>The implementation relies on two key components that enable persistence and session management:<\/p>\n<ol>\n<li>The <code>agentcore_memory_client.create_event()<\/code> method serves as the bridge between tool execution and persistent memory storage. When a background task completes, this method saves the results directly to the agent\u2019s memory in AgentCore Memory using the specified memory ID, actor ID, and session ID. Unlike traditional approaches where results might be stored temporarily or require manual retrieval, this integration makes task outcomes permanent parts of the agent\u2019s conversational memory. The agent can then reference these results in future interactions, creating a continuous knowledge-building experience across multiple sessions.<\/li>\n<li>The second crucial component involves extracting session context through <code>ctx.request_context.request.headers.get(\"mcp-session-id\", \"\")<\/code>. The <code>Mcp-Session-Id<\/code> header is part of the <a href=\"https:\/\/modelcontextprotocol.io\/specification\/2025-06-18\/basic\/transports#session-management\" target=\"_blank\" rel=\"noopener noreferrer\">standard MCP protocol<\/a>. 
You can use this header to pass a composite identifier containing three essential pieces of information in a delimited format: <code>session_id@@@memory_id@@@actor_id<\/code>. This approach lets the implementation retrieve the necessary context identifiers from a single header value. Headers are used instead of environment variables by necessity: these identifiers change dynamically with each conversation, whereas environment variables remain static from container startup. This design choice is particularly important in multi-tenant scenarios where a single MCP server simultaneously handles requests from multiple users, each with their own distinct session context.<\/li>\n<\/ol>\n<p>Another important aspect of this example is proper message formatting when storing events. Each message saved to AgentCore Memory requires two components: the content and a role identifier. These two components must be formatted so that the agent framework can recognize them. Here is an example for the Strands Agents framework:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">messages=[\n    (\n        json.dumps({\n            \"message\": {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"text\": &lt;message to the memory&gt;\n                    }\n                ]\n            },\n            \"message_id\": 0\n        }),\n        'USER'\n    
)\n]<\/code><\/pre>\n<\/p><\/div>\n<p>The content is an inner JSON object (serialized with <code>json.dumps()<\/code>) that contains the message details, including role, text content, and message ID. The outer role identifier (USER in this example) helps AgentCore Memory categorize the message source.<\/p>\n<h3>Strands Agents implementation<\/h3>\n<p>Integrating Amazon Bedrock AgentCore Memory with Strands Agents is remarkably straightforward using the <code>AgentCoreMemorySessionManager<\/code> class from the <a href=\"https:\/\/github.com\/aws\/bedrock-agentcore-sdk-python\/?tab=readme-ov-file\" target=\"_blank\" rel=\"noopener noreferrer\">Bedrock AgentCore SDK<\/a>. As shown in the following code example, implementation requires minimal configuration\u2014create an <code>AgentCoreMemoryConfig<\/code> with your session identifiers, initialize the session manager with this config, and pass it directly to your agent constructor. The session manager transparently handles the memory operations behind the scenes, maintaining conversation history and context across interactions while organizing memories using the combination of <code>session_id<\/code>, <code>memory_id<\/code>, and <code>actor_id<\/code>. 
For more information, refer to <a href=\"https:\/\/strandsagents.com\/latest\/documentation\/docs\/community\/session-managers\/agentcore-memory\/\" target=\"_blank\" rel=\"noopener noreferrer\">AgentCore Memory Session Manager<\/a>.<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">from bedrock_agentcore.memory.integrations.strands.config import AgentCoreMemoryConfig\nfrom bedrock_agentcore.memory.integrations.strands.session_manager import AgentCoreMemorySessionManager\n\n@app.entrypoint\nasync def strands_agent_main(payload, context):\n\n    session_id = context.session_id\n    if not session_id:\n        session_id = str(uuid.uuid4())\n    print(f\"Session ID: {session_id}\")\n\n    memory_id = payload.get(\"memory_id\")\n    if not memory_id:\n        memory_id = \"\"\n    print(f\"Memory ID: {memory_id}\")\n\n    actor_id = payload.get(\"actor_id\")\n    if not actor_id:\n        actor_id = \"default\"\n\n    agentcore_memory_config = AgentCoreMemoryConfig(\n        memory_id=memory_id,\n        session_id=session_id,\n        actor_id=actor_id\n    )\n\n    session_manager = AgentCoreMemorySessionManager(\n        agentcore_memory_config=agentcore_memory_config\n    )\n\n    user_input = payload.get(\"prompt\")\n\n    headers = {\n        \"authorization\": f\"Bearer {bearer_token}\",\n        \"Content-Type\": \"application\/json\",\n        \"Mcp-Session-Id\": session_id + \"@@@\" + memory_id + \"@@@\" + actor_id\n    }\n\n    # Connect to the MCP server using streamable HTTP transport\n    streamable_http_mcp_client = MCPClient(\n        lambda: streamablehttp_client(\n            mcp_url,\n            headers,\n            timeout=30\n        )\n    )\n\n    with streamable_http_mcp_client:\n        # Get the tools from the MCP server\n        tools = streamable_http_mcp_client.list_tools_sync()\n\n        # Create an agent with these tools and the memory-backed session manager\n        agent = Agent(\n            tools=tools,\n            callback_handler=call_back_handler,\n            session_manager=session_manager\n        )<\/code><\/pre>\n<\/p><\/div>\n<p>The session context management is particularly elegant here. The agent receives session identifiers through the payload and context parameters supplied by AgentCore Runtime. These identifiers form a crucial contextual bridge that connects user interactions across multiple sessions. The <code>session_id<\/code> can be extracted from the context object (generating a new one if needed), and the <code>memory_id<\/code> and <code>actor_id<\/code> can be retrieved from the payload. 
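The @@@-delimited round trip on both sides of the connection can be captured with a pair of small helpers. The following is a minimal sketch under the delimiter convention above; <code>compose_mcp_session_id</code> and <code>parse_mcp_session_id</code> are illustrative names, not part of the MCP or AgentCore SDKs:

```python
# Illustrative helpers (not part of any SDK) for the composite
# Mcp-Session-Id convention used in this post:
#   session_id@@@memory_id@@@actor_id
DELIMITER = "@@@"

def compose_mcp_session_id(session_id: str, memory_id: str, actor_id: str) -> str:
    """Pack the three identifiers into a single header value."""
    for name, value in (("session_id", session_id),
                        ("memory_id", memory_id),
                        ("actor_id", actor_id)):
        # Reject values that would corrupt the delimited format
        if DELIMITER in value:
            raise ValueError(f"{name} must not contain {DELIMITER!r}")
    return DELIMITER.join((session_id, memory_id, actor_id))

def parse_mcp_session_id(header_value: str) -> tuple[str, str, str]:
    """Split a header value back into (session_id, memory_id, actor_id)."""
    parts = header_value.split(DELIMITER)
    if len(parts) != 3:
        raise ValueError(
            f"expected 3 {DELIMITER}-delimited parts, got {len(parts)}")
    return parts[0], parts[1], parts[2]
```

Validating the part count on the server side turns a malformed or missing header into a clear error instead of the `IndexError` a bare split would raise.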
These identifiers are then packaged into a custom HTTP header (<code>Mcp-Session-Id<\/code>) that\u2019s passed to the MCP server during connection establishment.<\/p>\n<p>To maintain this persistent experience across multiple interactions, clients must consistently provide the same identifiers when invoking the agent:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\"># Invoke the AgentCore runtime through boto3\nboto3_response = agentcore_client.invoke_agent_runtime(\n    agentRuntimeArn=agent_arn,\n    qualifier=\"DEFAULT\",\n    payload=json.dumps(\n        {\n            \"prompt\": user_input,\n            \"actor_id\": actor_id,\n            \"memory_id\": memory_id\n        }\n    ),\n    runtimeSessionId=session_id,\n)<\/code><\/pre>\n<\/p><\/div>\n<p>By consistently providing the same <code>memory_id<\/code>, <code>actor_id<\/code>, and <code>runtimeSessionId<\/code> across invocations, users get a continuous conversational experience in which task results persist independently of session boundaries. When a user returns days later, the agent can automatically retrieve both the conversation history and the task results that were completed during their absence.<\/p>\n<p>This architecture represents a significant advancement in AI agent capabilities: it transforms long-running operations from fragile, connection-dependent processes into robust, persistent tasks that continue working regardless of connection state. 
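In practice, the cross-session contract amounts to reusing one identifier bundle for every call. The following sketch uses the same field names as the boto3 example above, but <code>build_invoke_kwargs</code> is a hypothetical helper, not an AWS API:

```python
import json

def build_invoke_kwargs(agent_arn: str, prompt: str,
                        session_id: str, memory_id: str, actor_id: str) -> dict:
    """Assemble invoke_agent_runtime arguments so that every call in a
    conversation carries the same session/memory/actor identifiers."""
    return {
        "agentRuntimeArn": agent_arn,
        "qualifier": "DEFAULT",
        "payload": json.dumps({
            "prompt": prompt,
            "actor_id": actor_id,
            "memory_id": memory_id,
        }),
        "runtimeSessionId": session_id,
    }

# Day 1: start the long-running job; days later: ask about it.
# Both calls reuse the identical bundle, so AgentCore Memory resolves
# them to the same conversation and the agent can surface the results
# the background task wrote while the user was away.
ids = {"session_id": "sess-42", "memory_id": "mem-42", "actor_id": "alice"}
start = build_invoke_kwargs("arn:aws:example", "Train model X for 50 epochs", **ids)
later = build_invoke_kwargs("arn:aws:example", "How did the training go?", **ids)
assert start["runtimeSessionId"] == later["runtimeSessionId"]
```

Centralizing the identifier bundle in one place makes it hard for a client to accidentally start a fresh session mid-conversation.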
The result is a system that can deliver truly asynchronous AI assistance, where complex work continues in the background and results are seamlessly integrated whenever the user returns to the conversation.<\/p>\n<h2>Conclusion<\/h2>\n<p>In this post, we\u2019ve explored practical ways to help AI agents handle tasks that take minutes or even hours to complete. Whether you use the more straightforward approach of keeping connections alive or the more advanced method of injecting task results into the agent\u2019s memory, these techniques enable your AI agent to tackle valuable, complex work without frustrating time limits or lost results.<\/p>\n<p>We invite you to try these approaches in your own AI agent projects. Start with context messaging for moderate tasks, then move to async task management as your needs grow. The solutions we\u2019ve shared can be quickly adapted to your specific needs, helping you build AI that delivers results reliably, even when users disconnect and return days later. What long-running tasks could your AI assistants handle better with these techniques?<\/p>\n<p>To learn more, see the\u00a0<a href=\"https:\/\/docs.aws.amazon.com\/bedrock-agentcore\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Bedrock AgentCore documentation<\/a>\u00a0and explore our\u00a0<a href=\"https:\/\/github.com\/aws-samples\/sample-mcp-for-long-runing-tasks-with-amazon-bedrock-agentcore\" target=\"_blank\" rel=\"noopener noreferrer\">sample notebook<\/a>.<\/p>\n<hr>\n<h2>About the Authors<\/h2>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/1516614244231.jpeg\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-124064 size-thumbnail alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/1516614244231-100x100.jpeg\" alt=\"\" width=\"100\" height=\"100\"><\/a><strong>Haochen Xie<\/strong> is a Senior Data Scientist at AWS Generative AI 
Innovation Center. He is an ordinary person.<\/p>\n<p><strong><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/Screenshot-2026-02-07-at-3.53.12%E2%80%AFPM.png\"><img decoding=\"async\" loading=\"lazy\" class=\"size-thumbnail wp-image-124069 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/Screenshot-2026-02-07-at-3.53.12%E2%80%AFPM-100x101.png\" alt=\"\" width=\"100\" height=\"101\"><\/a>Flora Wang<\/strong> is an Applied Scientist at AWS Generative AI Innovation Center, where she works with customers to architect and implement scalable Generative AI solutions that address their unique business challenges. She specializes in model customization techniques and agent-based AI systems, helping organizations harness the full potential of generative AI technology.<\/p>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/image.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignleft wp-image-124066\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/image-100x100.png\" alt=\"\" width=\"100\" height=\"100\"><\/a><strong>Yuan Tian<\/strong> is an Applied Scientist at the AWS Generative AI Innovation Center, where he works with customers across diverse industries\u2014including healthcare, life sciences, finance, and energy\u2014to architect and implement generative AI solutions such as agentic systems. 
He brings a unique interdisciplinary perspective, combining expertise in machine learning with computational biology.<\/p>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/Screenshot-2026-02-07-at-3.53.18%E2%80%AFPM.png\"><img decoding=\"async\" loading=\"lazy\" class=\"size-thumbnail wp-image-124070 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/02\/07\/Screenshot-2026-02-07-at-3.53.18%E2%80%AFPM-100x101.png\" alt=\"\" width=\"100\" height=\"101\"><\/a><strong>Hari Prasanna Das<\/strong> is an Applied Scientist at the AWS Generative AI Innovation Center, where he works with AWS customers across different verticals to expedite their use of Generative AI. Hari holds a PhD in Electrical Engineering and Computer Sciences from the University of California, Berkeley. His research interests include Generative AI, Deep Learning, Computer Vision, and Data-Efficient Machine Learning.<\/p>\n
<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/aws.amazon.com\/blogs\/machine-learning\/build-long-running-mcp-servers-on-amazon-bedrock-agentcore-with-strands-agents-integration\/<\/p>\n","protected":false},"author":0,"featured_media":4468,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/4467"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=4467"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/4467\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/4468"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=4467"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=4467"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=4467"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}