Windows 365 for Agents MCP server reference

Windows 365 for Agents is an MCP server that gives you full operational control of a Windows 365 cloud PC. Use this MCP server to drive a real Windows environment through desktop interaction (mouse, keyboard, screen capture, command execution), browser automation via Microsoft Edge, and semantic UI inspection via Windows UI Automation.

Note

Browser automation works only on Microsoft Edge, which launches automatically on the first browser tool call. mcp_desktop_focus_browser can also target Chrome or Firefox windows, but DOM-level browser tools operate only on the Edge instance.

To learn more about Windows 365 for Agents, see Windows 365 for Agents documentation.

Overview

  • Server ID: mcp_W365ComputerUse
  • Tenant-level URL: https://agent365.svc.cloud.microsoft/agents/tenants/{tenantId}/servers/mcp_W365ComputerUse
  • Display name: Windows 365 for Agents MCP server
  • Description: Full operational control of a Windows 365 cloud PC, including desktop interaction, browser automation, and UI inspection.
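
As a minimal sketch, the tenant-level URL can be assembled by filling the {tenantId} placeholder in the pattern above. The helper name server_url and the sample tenant ID are hypothetical and used only for illustration.

```python
SERVER_ID = "mcp_W365ComputerUse"
BASE = "https://agent365.svc.cloud.microsoft/agents/tenants"


def server_url(tenant_id: str) -> str:
    """Fill the {tenantId} placeholder in the tenant-level URL pattern."""
    if not tenant_id or "/" in tenant_id:
        raise ValueError("tenant_id must be a non-empty ID without slashes")
    return f"{BASE}/{tenant_id}/servers/{SERVER_ID}"


# Placeholder tenant ID, not a real tenant.
print(server_url("00000000-0000-0000-0000-000000000000"))
```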

Available tools

mcp_desktop_move_mouse

Moves the cursor to a screen position. Use mcp_desktop_click instead if you intend to click at the destination.

Required parameters:

  • x: X coordinate in screen pixels
  • y: Y coordinate in screen pixels

mcp_desktop_click

Clicks at a position, or at the current cursor location if coordinates are omitted. Supports single-click, double-click, and all five mouse buttons.

Optional parameters:

  • x: X coordinate in screen pixels (omit for current position)
  • y: Y coordinate in screen pixels (omit for current position)
  • button: Left, Right, Middle, Forward, or Backward (default Left)
  • clickCount: 1 = single click, 2 = double click (default 1)

mcp_desktop_get_cursor_position

Returns the current cursor coordinates. No parameters. Returns {cursorX, cursorY}.

mcp_desktop_drag_mouse

Drags from one position to another. Useful for moving objects, resizing windows, or pixel-precise scrolling.

Required parameters:

  • startX: Start X coordinate.
  • startY: Start Y coordinate.
  • endX: End X coordinate.
  • endY: End Y coordinate.

Optional parameters:

  • button: Left, Right, or Middle (default is Left)

mcp_desktop_scroll

Scrolls at a position by using notch units, not pixels. Three notches are approximately one page.

Required parameters:

  • x: Scroll position X
  • y: Scroll position Y

Optional parameters:

  • deltaX: Horizontal notches, positive = right (default 0)
  • deltaY: Vertical notches, positive = down (default 0)

Note

Values are clamped to the range [-20, 20].
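
The notch arithmetic can be sketched on the client side as follows. The helper scroll_arguments and its pages_down parameter are hypothetical; the sketch only applies the "three notches is about one page" rule of thumb and mirrors the documented clamp before the call is made.

```python
import json


def scroll_arguments(x: int, y: int, pages_down: float = 0.0) -> dict:
    """Build arguments for mcp_desktop_scroll from an approximate page count."""
    notches = round(pages_down * 3)        # three notches is roughly one page
    notches = max(-20, min(20, notches))   # mirror the documented [-20, 20] clamp
    return {"x": x, "y": y, "deltaX": 0, "deltaY": notches}


print(json.dumps(scroll_arguments(640, 400, pages_down=2)))
```

Because values outside the range are clamped, scrolling more than about six pages in one direction takes more than one call.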

mcp_desktop_type_text

Types text by simulating keyboard input. For keyboard shortcuts, use mcp_desktop_press_keys. For web form fields, use mcp_browser_type.

Required parameters:

  • text: Text to type.

mcp_desktop_press_keys

Presses a key combination simultaneously. Supports modifier keys, function keys, and standard keys.

Required parameters:

  • keys: Array of key names to press together (for example, ["ctrl","c"], ["alt","tab"], ["ctrl","shift","s"])

mcp_desktop_take_screenshot

Captures the full screen or a cropped region as a PNG image (base64-encoded).

Optional parameters:

  • x: Crop region left edge
  • y: Crop region top edge
  • width: Crop region width
  • height: Crop region height

Note

Provide all four crop parameters together, or omit all four for a full-screen capture.
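
A caller can enforce the all-or-none rule before sending the request. The helper below is a hypothetical client-side check, not part of the server.

```python
from typing import Optional


def screenshot_arguments(x: Optional[int] = None, y: Optional[int] = None,
                         width: Optional[int] = None,
                         height: Optional[int] = None) -> dict:
    """Return arguments for mcp_desktop_take_screenshot, or raise on a partial crop."""
    crop = {"x": x, "y": y, "width": width, "height": height}
    given = [key for key, value in crop.items() if value is not None]
    if not given:
        return {}                          # full-screen capture
    if len(given) != 4:
        missing = sorted(set(crop) - set(given))
        raise ValueError(f"missing crop parameters: {missing}")
    return crop


print(screenshot_arguments())                  # full screen
print(screenshot_arguments(0, 0, 800, 600))    # cropped region
```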

mcp_desktop_zoom_region

Captures a screen region at native resolution as a PNG image (base64-encoded). Use this feature to inspect small text or dense UI elements that are hard to read in a downscaled full-screen screenshot.

Required parameters:

  • x: Left edge X coordinate in screen pixels
  • y: Top edge Y coordinate in screen pixels
  • width: Region width in pixels
  • height: Region height in pixels

Note

The maximum region size is 1920x1080 pixels.
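
To inspect an area larger than the limit, one option is to split it into tiles and call mcp_desktop_zoom_region once per tile. The function zoom_tiles below is a hypothetical sketch of that tiling.

```python
def zoom_tiles(x, y, width, height, max_w=1920, max_h=1080):
    """Split a region into tiles no larger than max_w x max_h, row by row."""
    tiles = []
    for top in range(y, y + height, max_h):
        for left in range(x, x + width, max_w):
            tiles.append({
                "x": left,
                "y": top,
                "width": min(max_w, x + width - left),
                "height": min(max_h, y + height - top),
            })
    return tiles


# A 2560x1440 screen needs four tiles.
for tile in zoom_tiles(0, 0, 2560, 1440):
    print(tile)
```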

mcp_desktop_analyze_screen

Performs OCR on the entire screen. No parameters. Returns {fullText, averageConfidence, boxes[{text, confidence, x, y, width, height}], width, height}.
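
The OCR result can be turned into a click target. The sketch below assumes that x and y are the top-left corner of each box and that confidence is on a 0 to 1 scale; neither is stated in this reference, and the sample data is invented for illustration.

```python
def find_text_center(ocr_result, needle, min_confidence=0.8):
    """Return the center (x, y) of the first OCR box containing needle, or None."""
    needle = needle.lower()
    for box in ocr_result["boxes"]:
        if needle in box["text"].lower() and box["confidence"] >= min_confidence:
            # Assumption: (x, y) is the box's top-left corner.
            return (box["x"] + box["width"] // 2, box["y"] + box["height"] // 2)
    return None


# Invented sample shaped like the documented return value.
sample = {
    "fullText": "File Save",
    "averageConfidence": 0.95,
    "boxes": [
        {"text": "File", "confidence": 0.93, "x": 20, "y": 40, "width": 40, "height": 20},
        {"text": "Save", "confidence": 0.97, "x": 100, "y": 40, "width": 60, "height": 20},
    ],
    "width": 1920,
    "height": 1080,
}
print(find_text_center(sample, "save"))
```

The returned point can be passed as x and y to mcp_desktop_click.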

mcp_desktop_get_screen_size

Returns the screen resolution. No parameters. Returns {width, height}.

mcp_desktop_list_windows

Lists all visible windows with their titles, positions, and dimensions. No parameters. Returns an array of {title, processName, handle, x, y, width, height}.

mcp_desktop_activate_window

Brings a window to the foreground by using a fuzzy title match.

Required parameters:

  • titlePattern: Partial window title (case-insensitive substring)

mcp_desktop_focus_browser

Focuses a browser window (Edge, Chrome, or Firefox), optionally filtered by URL or title.

Optional parameters:

  • pattern: URL or title substring to match (omit for any browser window)

mcp_desktop_close_window

Closes a window gracefully by using a fuzzy title match. The system protects critical processes and you can't close them.

Required parameters:

  • titlePattern: Partial window title (80% match threshold).

Returns {matchedTitle, processName, closed}.

mcp_desktop_resize_window

Resizes, moves, maximizes, minimizes, or restores a window by using a fuzzy title match.

Required parameters:

  • title: Window title to match (case-insensitive fuzzy match)
  • action: Action to perform - Resize, Move, Maximize, Minimize, or Restore

Optional parameters:

  • x: Left edge X coordinate (used with Resize or Move)
  • y: Top edge Y coordinate (used with Resize or Move)
  • width: Width in pixels (used with Resize)
  • height: Height in pixels (used with Resize)

mcp_desktop_execute_shell_command

Runs a shell command in a sandboxed environment. The command is checked against an allow list, and dangerous patterns are blocked.

Required parameters:

  • command: Command to run

Optional parameters:

  • cwd: Working directory. Use forward slashes (for example, C:/Users/me/project).
  • timeoutMs: Timeout in milliseconds (default 30000, max 30000)

Note

  • Allowed commands: git, npm, dotnet, python, cargo, node, pip, dir, mkdir, del, copy, move, robocopy, findstr, where, type, and notepad.
  • Blocked patterns include shell metacharacters (|, ;, &, <, >), environment variable expansion (%VAR%), interpreter eval flags (python -c or node -e), git config --global, npm -g, path-prefixed executables, rm -rf, sudo, and disk or system commands.
  • Stdout and stderr are each truncated at 32 KB. For arbitrary computation, use mcp_desktop_execute_python_code. The command returns {stdout, stderr, exitCode, success, timedOut, resourceLimitsApplied}.
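
A client can catch the most common rejections before sending a command. The sketch below is a hypothetical pre-check that covers only part of the rules listed above (the allow list, metacharacters, %VAR% expansion, eval flags, and path-prefixed executables). The server's own validation remains authoritative and may differ in detail.

```python
import re

ALLOWED = {"git", "npm", "dotnet", "python", "cargo", "node", "pip", "dir", "mkdir",
           "del", "copy", "move", "robocopy", "findstr", "where", "type", "notepad"}

BLOCKED = [
    (re.compile(r"[|;&<>]"), "shell metacharacter"),
    (re.compile(r"%[^%\s]+%"), "environment variable expansion"),
    (re.compile(r"^(python\s+-c|node\s+-e)\b"), "interpreter eval flag"),
]


def precheck(command: str):
    """Return a reason string if the command would likely be rejected, else None."""
    command = command.strip()
    parts = command.split()
    if not parts:
        return "empty command"
    exe = parts[0]
    if "/" in exe or "\\" in exe:
        return "path-prefixed executable"
    if exe.lower() not in ALLOWED:
        return f"'{exe}' is not on the allow list"
    for pattern, reason in BLOCKED:
        if pattern.search(command):
            return reason
    return None


for cmd in ["git status", "dir | findstr txt", 'python -c "print(1)"']:
    print(cmd, "->", precheck(cmd))
```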

mcp_desktop_execute_python_code

Executes Python code in a sandboxed environment with resource limits. This function is ideal for data processing, calculations, file I/O, and any computation that goes beyond simple shell commands.

Required parameters:

  • code: Python code (max 262,144 characters).

Optional parameters:

  • cwd: Working directory. Use forward slashes.
  • timeoutMs: Timeout in milliseconds (default 30000, max 30000).

Returns the same schema as mcp_desktop_execute_shell_command.

Note

The sandbox enforces a 512 MB memory limit and a 30-second timeout.
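
The following sketch builds a request for this tool and applies the documented limits on the client side. It assumes the standard MCP JSON-RPC tools/call envelope; the helper name python_code_call is hypothetical, and the transport that sends the request is out of scope.

```python
import json

MAX_CODE_CHARS = 262_144
MAX_TIMEOUT_MS = 30_000


def python_code_call(code, cwd=None, timeout_ms=MAX_TIMEOUT_MS, request_id=1):
    """Build a tools/call request for mcp_desktop_execute_python_code."""
    if len(code) > MAX_CODE_CHARS:
        raise ValueError(f"code is {len(code)} characters; limit is {MAX_CODE_CHARS}")
    arguments = {"code": code, "timeoutMs": max(1, min(timeout_ms, MAX_TIMEOUT_MS))}
    if cwd is not None:
        arguments["cwd"] = cwd.replace("\\", "/")   # the tool expects forward slashes
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "mcp_desktop_execute_python_code", "arguments": arguments},
    }


request = python_code_call("print(sum(range(10)))", cwd="C:\\Users\\me\\project",
                           timeout_ms=60_000)
print(json.dumps(request, indent=2))
```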

mcp_desktop_wait_milliseconds

Pauses execution to allow animations or transitions to complete. Don't use this function in polling loops. Instead, use mcp_browser_wait_for for DOM polling.

Required parameters:

  • ms: Wait duration in milliseconds (clamped to [0, 5000])

mcp_desktop_clipboard_read

Reads the current content of the system clipboard. This command doesn't require any parameters. It returns a JSON object that describes the clipboard format and payload, which can be either a text string or a base64-encoded image.

mcp_desktop_clipboard_write

Writes text to the system clipboard, replacing the current content.

Required parameters:

  • text: Text to write to the clipboard

Returns a confirmation that includes the character count.

mcp_desktop_list_processes

Lists running processes in the current session. Each entry includes the PID, process name, memory usage, window title (if any), and startTimeTicks. Pair startTimeTicks with mcp_desktop_kill_process to prevent killing a recycled PID.

Optional parameters:

  • maxCount: Maximum number of processes to return (default 200)

Returns a JSON array of process info objects.

mcp_desktop_kill_process

Terminates a process by PID. Pass the startTimeTicks value from mcp_desktop_list_processes as startTime to protect against PID recycling.

Required parameters:

  • pid: Process ID returned by mcp_desktop_list_processes
  • startTime: Process start time ticks returned by mcp_desktop_list_processes

Optional parameters:

  • force: Force-kill without a graceful shutdown (default false)

Returns a JSON result describing the outcome.
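
Pairing a PID with its start time can be sketched as below. Only startTimeTicks is named in this reference; the keys pid and processName in the sample entries are assumptions about the shape of the mcp_desktop_list_processes output, and the sample values are invented.

```python
def kill_arguments(processes, process_name, force=False):
    """Build mcp_desktop_kill_process arguments for the first process with this name."""
    for proc in processes:
        # Assumed keys: "pid" and "processName". Documented key: "startTimeTicks".
        if proc["processName"].lower() == process_name.lower():
            return {
                "pid": proc["pid"],
                "startTime": proc["startTimeTicks"],
                "force": force,
            }
    return None


# Invented sample entries.
processes = [
    {"pid": 4312, "processName": "notepad", "startTimeTicks": 638500000000000000},
    {"pid": 5120, "processName": "msedge", "startTimeTicks": 638500000123456789},
]
print(kill_arguments(processes, "Notepad"))
```

Taking pid and startTime from the same list entry is what guards against a PID being reused by a different process between the two calls.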

mcp_desktop_launch_application

Launches a GUI application from an allowed directory. Use mcp_desktop_execute_shell_command for CLI commands instead.

Required parameters:

  • path: Absolute path to the executable. Use forward slashes (for example, C:/Program Files/app.exe).

Optional parameters:

  • args: Array of command-line arguments

Returns {path, pid}.

mcp_desktop_get_system_info

Returns the OS version, CPU, RAM, available disk space, and display resolution. No parameters. Returns a JSON object containing the system information.

mcp_browser_navigate

Navigates to a URL and waits for the page to load.

Required parameters:

  • url: Full URL including protocol (for example, https://example.com)

mcp_browser_back

Navigates back in browser history. No parameters.

mcp_browser_forward

Navigates forward in browser history. No parameters.

mcp_browser_reload

Reloads the current page. No parameters.

mcp_browser_get_url

Returns the current page URL as a plain string. No parameters.

mcp_browser_get_title

Returns the current page title as a plain string. No parameters.

mcp_browser_get_text

Returns the visible page text content as a plain string. No parameters. Truncated at 512 KB.

mcp_browser_get_html

Returns the full page HTML source as a plain string. No parameters. Truncated at 512 KB.

mcp_browser_get_page_state

Retrieves multiple page state fields in a single call. Useful for capturing several signals at once without issuing separate tool calls.

Required parameters:

  • fields: Array of fields to return. Allowed values: url, title, dom, screenshot, tabs

Returns a JSON object containing only the requested fields.

mcp_browser_click

Clicks a DOM element by CSS selector. More reliable than coordinate-based clicking for web content.

Required parameters:

  • selector: CSS selector (for example, #submit-btn or a.nav-link)

mcp_browser_type

Types text into a form element by using a CSS selector.

Required parameters:

  • selector: CSS selector of the input element.
  • text: Text to type.

mcp_browser_query_text

Gets the text content of the first element that matches a CSS selector.

Required parameters:

  • selector: CSS selector.

mcp_browser_wait_for

Waits for a DOM element to appear. This function is useful for dynamic content that loads asynchronously.

Required parameters:

  • selector: CSS selector to wait for.

Optional parameters:

  • timeoutMs: Timeout in milliseconds. The default is 5,000 and the maximum is 30,000.

mcp_browser_eval_js

Evaluates a JavaScript expression in the page context and returns the result as a string.

Required parameters:

  • expression: JavaScript expression that returns a string

Note

If your expression returns an object or number, convert it to a string explicitly (for example, JSON.stringify(obj) or .toString()).
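
One way to follow this rule is to wrap the expression in JSON.stringify on the way in and parse the returned string on the way out. Both helper names below are hypothetical, and the result string is an invented example.

```python
import json


def eval_js_arguments(js_value_expression: str) -> dict:
    """Wrap a JavaScript expression so mcp_browser_eval_js returns a JSON string."""
    return {"expression": f"JSON.stringify({js_value_expression})"}


def parse_eval_result(result_text: str):
    """Decode the string returned by the tool back into Python data."""
    return json.loads(result_text)


args = eval_js_arguments("{count: document.links.length, title: document.title}")
print(args["expression"])

# Invented example of what the tool might return for the expression above.
print(parse_eval_result('{"count": 3, "title": "Home"}'))
```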

mcp_browser_list_tabs

Lists all open tabs with their index, title, and URL. No parameters. Returns an array of {index, title, url}.

mcp_browser_switch_tab

Switches to a tab by index.

Required parameters:

  • tabIndex: 0-based tab index

mcp_browser_new_tab

Opens a new tab, optionally navigating to a URL.

Optional parameters:

  • url: URL to open (blank tab if omitted)

Returns {index, title, url}.

mcp_browser_create_tabs

Opens multiple tabs at once and optionally brings one of them to the foreground.

Required parameters:

  • urls: Array of URLs to open, one tab per URL

Optional parameters:

  • foregroundIndex: Index of the tab to bring to the foreground after creation (omit to keep the current tab focused)

Returns a text confirmation.

mcp_browser_close_tab

Closes a tab by index.

Required parameters:

  • tabIndex: 0-based tab index

mcp_browser_screenshot

Captures a PNG screenshot of the browser viewport only (not the full screen). No parameters. Returns a base64-encoded PNG.

mcp_browser_select_option

Selects one or more options in a <select> element by their value attribute.

Required parameters:

  • selector: CSS selector for the <select> element
  • values: Array of option value(s) to select

Returns a confirmation with the count of selected options.

mcp_browser_fill_form

Fills multiple form fields in a single call. Each entry is a {selector, value} pair. The operation stops on the first failure and reports which fields succeeded.

Required parameters:

  • fields: Array of {selector, value} pairs

Returns a confirmation with the count of filled fields.

mcp_browser_drag

Drags a source element onto a target element. Both elements are identified by CSS selector.

Required parameters:

  • sourceSelector: CSS selector of the drag source
  • targetSelector: CSS selector of the drop target

mcp_browser_pdf_save

Saves the current page as a PDF file. Destination paths are restricted to %USERPROFILE% or %TEMP%.

Required parameters:

  • filePath: Destination file path under %USERPROFILE% or %TEMP%. Use forward slashes.

Returns a confirmation including the saved file path.

mcp_browser_handle_dialog

Accepts or dismisses a pending browser dialog (alert, confirm, prompt, or beforeunload). Returns "No dialog pending" if no dialog is active.

Required parameters:

  • action: accept or dismiss

Optional parameters:

  • promptText: Text to supply to a prompt dialog (ignored for alert and confirm)

mcp_browser_get_cookies

Gets cookies for the current page, or for a specified set of URLs. Cookie values are always redacted for security; names, domains, paths, and flags are returned.

Optional parameters:

  • urls: Array of URLs to get cookies for (omit for the current page)

Returns an array of cookie objects with redacted values.

mcp_browser_set_cookies

Sets cookies on the current page's domain. This action adds or overwrites cookies but doesn't clear existing cookies.

Required parameters:

  • cookies: Array of cookie objects. Each entry requires name and value. Optional fields: domain, path, secure, httpOnly, sameSite.

Returns a text confirmation.

mcp_browser_execute_batch

Executes multiple browser actions sequentially in a single call. This action stops on the first failure and returns the results collected up to that point.

Required parameters:

  • actions: Array of {action, params} objects. Allowed actions: navigate, snapshot, click_ref, type_ref, hover_ref, scroll_ref, keypress_ref, wait_for, eval_js.

Returns an array of results, one per executed action.
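
A batch can be assembled and checked against the allowed action names before it is sent. In the sketch below, the params keys (url, selector, timeoutMs, expression) are assumed to mirror the parameters of the corresponding standalone tools; this reference does not spell them out. The helper name batch_arguments and the sample selector are hypothetical.

```python
ALLOWED_ACTIONS = {"navigate", "snapshot", "click_ref", "type_ref", "hover_ref",
                   "scroll_ref", "keypress_ref", "wait_for", "eval_js"}


def batch_arguments(steps):
    """Build mcp_browser_execute_batch arguments from (action, params) pairs."""
    actions = []
    for action, params in steps:
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"'{action}' is not an allowed batch action")
        actions.append({"action": action, "params": params})
    return {"actions": actions}


batch = batch_arguments([
    ("navigate", {"url": "https://example.com"}),             # assumed params key
    ("wait_for", {"selector": "#content", "timeoutMs": 5000}),  # assumed params keys
    ("eval_js", {"expression": "document.title"}),             # assumed params key
])
print(batch)
```

Because the batch stops on the first failure, the number of returned results shows how far the sequence got.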

mcp_browser_snapshot

Captures the page's accessibility tree with stable ref IDs (for example, e5) that map to DOM nodes. Use the refs with mcp_browser_click_ref, mcp_browser_type_ref, and mcp_browser_hover_ref. Refs expire when the page navigates, so retake a snapshot after navigation.

Optional parameters:

  • maxDepth: Maximum tree depth, 1-10 (default 5)
  • includeIframes: Include cross-origin iframes (default true)

Returns a JSON object containing the accessibility snapshot and ref IDs.

mcp_browser_click_ref

Clicks an element by ref ID from mcp_browser_snapshot. A hit-test verifies that no other element overlays the target. The operation fails if the snapshot has expired; retake the snapshot in that case.

Required parameters:

  • snapshotId: Snapshot ID returned by mcp_browser_snapshot
  • ref: Element ref (for example, e5) from the snapshot nodes

Optional parameters:

  • button: Left, Right, or Middle (default Left)
  • clickCount: 1 = single click, 2 = double click (default 1)

Returns a confirmation including the clicked coordinates.
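
The retake-and-retry pattern for expired snapshots can be sketched as follows. Everything here is hypothetical scaffolding: call_tool stands for whatever function sends a tool call in your client, locate_ref picks a ref out of a snapshot result, and StaleSnapshotError stands for however your client reports an expired snapshot. The snapshot result is assumed to carry a snapshotId key.

```python
class StaleSnapshotError(Exception):
    """Hypothetical error raised by the client when a snapshot has expired."""


def click_with_retry(call_tool, locate_ref, attempts=2):
    """Take a snapshot, click a ref from it, and retake the snapshot if it went stale."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        snapshot = call_tool("mcp_browser_snapshot", {})
        ref = locate_ref(snapshot)   # refs are re-resolved against the fresh snapshot
        try:
            return call_tool("mcp_browser_click_ref",
                             {"snapshotId": snapshot["snapshotId"], "ref": ref})
        except StaleSnapshotError as error:
            last_error = error
    raise last_error
```

The ref is looked up again after each new snapshot rather than reused, since a ref is only meaningful together with the snapshot it came from.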

mcp_browser_type_ref

Types text into an element by using the ref ID from mcp_browser_snapshot. The element is focused first, and existing text is cleared by default. The operation fails if the snapshot expires.

Required parameters:

  • snapshotId: Snapshot ID returned by mcp_browser_snapshot
  • ref: Element ref (for example, e5) from the snapshot nodes
  • text: Text to type

Optional parameters:

  • clear: Clear existing text first (default true)

Returns a confirmation that includes the character count.

mcp_browser_hover_ref

Hovers over an element by using the ref ID from mcp_browser_snapshot. Returns immediately. The operation fails if the snapshot has expired; retake the snapshot in that case.

Required parameters:

  • snapshotId: Snapshot ID returned by mcp_browser_snapshot
  • ref: Element ref (for example, e5) from the snapshot nodes

Returns a confirmation including the hover coordinates.

mcp_accessibility_get_accessibility_tree

Retrieves the UI element tree for the foreground window. Each element includes its role, name, value, and screen coordinates.

Optional parameters:

  • maxDepth: Maximum tree traversal depth, 1-10 (default 3)
  • maxElements: Maximum elements to return, 1-2000 (default 500)

Returns a hierarchical tree of {role, name, value, x, y, width, height, children[...]}.
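
The returned tree can be searched with a short recursive walk. The sketch assumes that x and y are the top-left corner of each element's bounding box, which this reference does not state, and the sample tree is invented.

```python
def iter_elements(node):
    """Yield a node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from iter_elements(child)


def center_of(tree, role, name):
    """Return the center (x, y) of the first element matching role and name, or None."""
    for element in iter_elements(tree):
        element_name = (element.get("name") or "").lower()
        if element.get("role") == role and name.lower() in element_name:
            # Assumption: (x, y) is the top-left corner of the element.
            return (element["x"] + element["width"] // 2,
                    element["y"] + element["height"] // 2)
    return None


# Invented sample shaped like the documented return value.
tree = {
    "role": "Window", "name": "Settings", "value": "",
    "x": 0, "y": 0, "width": 800, "height": 600,
    "children": [
        {"role": "Button", "name": "Apply", "value": "",
         "x": 600, "y": 540, "width": 80, "height": 30, "children": []},
    ],
}
print(center_of(tree, "Button", "apply"))
```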

mcp_browser_keypress_ref

Presses a single key on an element by ref ID from mcp_browser_snapshot. The element is focused first. Supports modifier keys. The operation fails if the snapshot has expired; retake the snapshot in that case.

Required parameters:

  • snapshotId: Snapshot ID returned by mcp_browser_snapshot
  • ref: Element ref (for example, e5) from the snapshot nodes
  • key: Key name (for example, Enter, Escape, Tab, ArrowUp, ArrowDown, or F1 through F12)

Optional parameters:

  • modifiers: Array of modifier keys to hold during the press — Ctrl, Shift, Alt, or Meta

Returns a text confirmation.

mcp_browser_scroll_ref

Scrolls an element into view by ref ID from mcp_browser_snapshot. Optionally, scrolls by a pixel delta within the element. Fails if the snapshot expires.

Required parameters:

  • snapshotId: Snapshot ID returned by mcp_browser_snapshot
  • ref: Element ref (for example, e5) from the snapshot nodes

Optional parameters:

  • deltaX: Horizontal scroll delta in pixels (default 0)
  • deltaY: Vertical scroll delta in pixels (default 0)

Returns a text confirmation.

mcp_browser_set_file_input_ref

Sets files on a file input element by ref ID from mcp_browser_snapshot. File paths are restricted to the user's Documents, Downloads, Desktop, or %TEMP% directories.

Required parameters:

  • snapshotId: Snapshot ID returned by mcp_browser_snapshot
  • ref: Element ref for the file input
  • filePaths: Array of file paths to upload

Returns a text confirmation.

mcp_accessibility_find_ui_element

Searches for UI elements by text content, accessibility role, or name (case-insensitive substring). Returns matching elements with their clickable screen coordinates.

Optional parameters:

  • text: Text to search for (used as name if name omitted)
  • role: UI role filter - Button, TextBox, CheckBox, MenuItem, ComboBox, and more
  • name: Accessible name (takes precedence over text if both provided)
  • windowHandle: Target window handle (null = foreground window)

Key features

Desktop interaction

  • Click, double-click, right-click, and five-button mouse control.
  • Pixel-precise drag and drop.
  • Notch-based scrolling (three notches ≈ one page).
  • Keyboard typing and multi-key shortcut combos.
  • Cursor position tracking.
  • Screen resolution detection.

Screen capture and analysis

  • Full-screen or cropped PNG screenshots.
  • OCR of the full screen with per-region confidence scores and bounding boxes.
  • Browser-viewport-only screenshots for web content.

Window management

  • Enumerate all visible windows with positions and dimensions.
  • Activate windows by fuzzy title match.
  • Focus browser windows (Edge, Chrome, Firefox) optionally filtered by URL or title.
  • Graceful window close with protection for system-critical processes.

Command execution

  • Sandboxed shell commands with an allow list (git, npm, dotnet, python, cargo, node, pip, dir, mkdir, del, copy, move, robocopy, findstr, where, type).
  • Sandboxed Python execution up to 262,144 characters of code.
  • Working-directory and per-call timeout control (max 30 seconds).
  • Resource limits and hardened block list against shell metacharacters, eval flags, privilege escalation, and destructive operations.

Browser automation

  • Navigate, back, forward, reload, and configurable wait conditions on navigation (load, networkidle0, networkidle2).
  • Read page URL, title, visible text (512 KB cap), and full HTML (512 KB cap).
  • Consolidated page state retrieval — URL, title, DOM, screenshot, and tab list in a single call.
  • DOM-level click, type, form fill, drag, and <select> option selection by CSS selector.
  • Accessibility-snapshot-based interaction by ref ID — click, type, hover, keypress with modifiers, scroll, and file-input upload.
  • Wait for dynamic elements with configurable timeout, optionally requiring visibility.
  • Evaluate JavaScript expressions in the page context.
  • Multi-tab management: list, switch, open one or many at once, and close.
  • Cookie inspection (values redacted) and assignment on the current domain.
  • Batched action execution — sequence multiple browser steps in one call, stopping on first failure.
  • Save the current page as a PDF under %USERPROFILE% or %TEMP%.
  • Dialog handling for alert, confirm, prompt, and beforeunload.
  • Runs on Microsoft Edge, launched automatically on first use.

UI accessibility

  • Retrieve the Windows UI Automation tree for the foreground window with configurable depth and element count.
  • Find UI elements by text, role, or accessible name.
  • Returns clickable screen coordinates for precise targeting of buttons, text boxes, checkboxes, menu items, and combo boxes.

Timing and synchronization

  • Use mcp_desktop_wait_milliseconds for short one-shot pauses (up to five seconds).
  • Use mcp_browser_wait_for for DOM-level polling (up to 30 seconds).

Notes

  • All coordinates are in screen pixels with (0,0) at the top-left corner. Coordinates from mcp_desktop_take_screenshot, mcp_desktop_analyze_screen, mcp_accessibility_find_ui_element, and mcp_desktop_list_windows all share the same coordinate space.
  • A cursor failsafe is active: If the cursor moves within five pixels of any screen corner, mouse operations are canceled. Avoid targeting the extreme edges of the screen.
  • Shell pipe operators (|), semicolons (;), ampersands (&), and output redirection (>, <) are blocked. To transform command output, capture it and process it with mcp_desktop_execute_python_code.
  • Interpreter eval flags are blocked, so python -c "..." and node -e "..." are rejected. Use mcp_desktop_execute_python_code for Python code, or write the code to a file first and run the file.
  • Command stdout/stderr is truncated at 32 KB each. Use flags to limit verbose output (for example, git log --oneline -20). Shell redirection is blocked, so to keep the full output, write it to a file from mcp_desktop_execute_python_code and read the file in parts.
  • Maximum timeout for mcp_desktop_execute_shell_command and mcp_desktop_execute_python_code is 30 seconds. For longer work, break it into smaller steps or launch a background process from Python and poll.
  • There's no dedicated file read/write tool. Read files with mcp_desktop_execute_shell_command using the type command. Write files with mcp_desktop_execute_python_code using Python's built-in file I/O. Shell output redirection (>, >>) is blocked.
  • mcp_browser_eval_js always returns a string. Convert objects or numbers explicitly before returning.
  • Browser DOM tools (mcp_browser_click, mcp_browser_type, mcp_browser_eval_js, and others) operate only on the Microsoft Edge instance. mcp_desktop_focus_browser can focus Chrome or Firefox windows, but DOM tools don't target them.
  • mcp_desktop_take_screenshot requires all four crop parameters (x, y, width, height) together, or none for a full-screen capture.
  • mcp_desktop_scroll uses notch units (clamped to [-20, 20]), not pixels. Three notches are approximately one page.
  • mcp_accessibility_find_ui_element requires at least one of text, role, or name. When both text and name are provided, name takes precedence.
  • mcp_browser_snapshot refs expire on navigation. If a _ref tool (click, type, hover, keypress, scroll, or set file input) fails because the snapshot is stale, retake the snapshot and retry.
  • mcp_browser_set_file_input_ref only accepts file paths under the user's Documents, Downloads, Desktop, or %TEMP% directories. Files outside those locations are rejected.
  • mcp_browser_get_cookies always returns redacted cookie values. Use it for inspection—names, domains, paths, and flags are returned in full, but values aren't exposed.
  • mcp_browser_set_cookies only adds or overwrites cookies. It doesn't clear existing cookies. To remove a cookie, overwrite it with an expired expires value via this tool, or clear it through the page itself.
  • mcp_browser_execute_batch stops on the first failed action and returns only the results collected up to that point. Subsequent actions in the array aren't attempted. Allowed batch actions are limited to: navigate, snapshot, click_ref, type_ref, hover_ref, scroll_ref, keypress_ref, wait_for, and eval_js.
  • mcp_browser_create_tabs opens tabs in the order provided. If foregroundIndex is omitted, focus stays on the currently active tab.
  • mcp_browser_get_page_state only returns the fields listed in the fields array. Request only what you need – including dom or screenshot can produce large payloads.
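
The cursor failsafe noted above can be approximated on the client side so that risky targets are caught before a mouse call is made. The sketch treats "within five pixels of a corner" as a five-pixel square around each corner; the exact distance metric the server uses is not documented here, so this is an assumption.

```python
def near_corner(x, y, screen_w, screen_h, margin=5):
    """Return True if (x, y) falls inside a margin-sized square at any screen corner."""
    near_x = x <= margin or x >= screen_w - 1 - margin
    near_y = y <= margin or y >= screen_h - 1 - margin
    return near_x and near_y


# Screen size would normally come from mcp_desktop_get_screen_size.
print(near_corner(2, 3, 1920, 1080))      # top-left corner
print(near_corner(960, 2, 1920, 1080))    # top edge, but not a corner
```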

Common use cases

Fill out a web form

  • Call mcp_browser_navigate to open the target page.
  • Call mcp_browser_wait_for to wait for the form to load.
  • Call mcp_browser_type to fill each field by CSS selector.
  • Call mcp_browser_click to submit the form.
  • Call mcp_browser_wait_for to wait for the confirmation element.
  • Call mcp_browser_get_text to read and verify the result.
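
The steps above can be sketched as an ordered list of requests. The sketch assumes the standard MCP JSON-RPC tools/call envelope; the URL, selectors, and field values are invented placeholders, and sending the requests is left to your client.

```python
import json


def tool_call(request_id, name, arguments):
    """Wrap a tool name and its arguments in a JSON-RPC tools/call request."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/call",
            "params": {"name": name, "arguments": arguments}}


def form_flow(url, form_selector, fields, submit_selector, done_selector):
    """Return the requests for: open, wait, fill each field, submit, wait, read."""
    steps = [("mcp_browser_navigate", {"url": url}),
             ("mcp_browser_wait_for", {"selector": form_selector})]
    steps += [("mcp_browser_type", {"selector": selector, "text": text})
              for selector, text in fields.items()]
    steps += [("mcp_browser_click", {"selector": submit_selector}),
              ("mcp_browser_wait_for", {"selector": done_selector}),
              ("mcp_browser_get_text", {})]
    return [tool_call(i, name, args) for i, (name, args) in enumerate(steps, start=1)]


requests = form_flow("https://example.com/signup", "form#signup",
                     {"#email": "user@example.com", "#name": "Ada"},
                     "#submit-btn", "#confirmation")
for request in requests:
    print(json.dumps(request))
```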

Automate a desktop application

  • Call mcp_desktop_activate_window to bring the application to the foreground.
  • Call mcp_desktop_take_screenshot to capture the current state.
  • Call mcp_accessibility_find_ui_element to locate a button or field by name.
  • Call mcp_desktop_click on the element's reported coordinates.
  • Call mcp_desktop_type_text to enter data.
  • Call mcp_desktop_press_keys for shortcuts (for example, ["ctrl","s"] to save).
  • Call mcp_desktop_take_screenshot to verify the result.

Extract data from a web page

  • Call mcp_browser_navigate to open the page.
  • Call mcp_browser_get_text to extract visible text content.
  • Call mcp_desktop_execute_python_code to parse and process the extracted data.
  • Call mcp_browser_eval_js to query specific values via JavaScript when text extraction isn't enough.

Run development tasks

  • Call mcp_desktop_execute_shell_command once per command (for example, git pull, then npm install, then dotnet build). Chaining commands with & or ; is blocked.
  • Call mcp_desktop_take_screenshot to capture build output.
  • Call mcp_desktop_execute_python_code to analyze logs or test results.
  • Call mcp_browser_navigate to open a local dev server in the browser.
  • Call mcp_browser_screenshot to capture the rendered page.

Read and write files

  • Read a file by using mcp_desktop_execute_shell_command with type C:\path\to\file.txt.
  • Write a file by using mcp_desktop_execute_python_code with Python's open(...) and write(...).
  • Verify by using mcp_desktop_execute_shell_command with dir C:\path\to\output.txt.

Navigate a complex UI

  • Call mcp_accessibility_get_accessibility_tree to understand the full UI structure.
  • Call mcp_accessibility_find_ui_element to find a specific control (for example, role: "MenuItem", name: "Settings").
  • Call mcp_desktop_click using the element's reported coordinates.
  • Call mcp_accessibility_find_ui_element again to find the next control in the dialog.
  • Call mcp_desktop_type_text or mcp_desktop_click to interact with it.

Keep a long-running session alive

  • Send any MCP request at least once every 30 minutes to prevent idle eviction.
  • mcp_desktop_get_screen_size is lightweight and works well as a heartbeat.
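
A heartbeat along these lines can be sketched as a small timer that a client loop calls periodically. The class name, the five-minute safety margin, and the send callable are hypothetical; send stands for whatever function issues a tool call in your client.

```python
import time

IDLE_LIMIT_S = 30 * 60      # documented idle window
SAFETY_MARGIN_S = 5 * 60    # assumed margin so the heartbeat fires early


class Heartbeat:
    """Send a lightweight tool call when the session has been idle too long."""

    def __init__(self, send, clock=time.monotonic):
        self._send = send
        self._clock = clock
        self._last = clock()

    def touch(self):
        """Record that some MCP request was just sent."""
        self._last = self._clock()

    def tick(self):
        """Send a heartbeat if the idle time is near the limit; return True if sent."""
        if self._clock() - self._last >= IDLE_LIMIT_S - SAFETY_MARGIN_S:
            self._send("mcp_desktop_get_screen_size", {})
            self.touch()
            return True
        return False
```

Call touch() after every regular request and tick() from a periodic loop, so the heartbeat fires only when nothing else has been sent for about 25 minutes.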