---
title: "Drive web and app UIs with vision-grounded steps when selectors are brittle or unavailable"
description: "Use Midscene.js when an agent needs screenshot-grounded UI actions and assertions across web, mobile, or desktop surfaces where DOM selectors are fragile, unavailable, or not the right abstraction."
verification: "listed"
source: "https://github.com/web-infra-dev/midscene"
author: "web-infra-dev"
publisher_type: "organization"
category:
  - "Browser Automation"
framework:
  - "Multi-Framework"
tool_ecosystem:
  github_repo: "web-infra-dev/midscene"
  github_stars: 12613
  npm_package: "@midscene/core"
  npm_weekly_downloads: 83670
---

# Drive web and app UIs with vision-grounded steps when selectors are brittle or unavailable

Use Midscene.js when an agent needs screenshot-grounded UI actions and assertions across web, mobile, or desktop surfaces where DOM selectors are fragile, unavailable, or not the right abstraction.

## Prerequisites

Midscene.js, Node.js, a supported vision model, and a target automation surface such as Playwright, Puppeteer, Android adb, or iOS WebDriverAgent

## Installation

Choose whichever fits your setup:

1. Copy this skill folder into your local skills directory.
2. Clone the repo and symlink or copy the skill into your agent workspace.
3. Add the repo as a git submodule if you manage shared skills centrally.
4. Install it through your internal provisioning or packaging workflow.
5. Download the folder directly from GitHub and place it in your skills collection.

Install command or upstream instructions:

```
Install the core package with `npm install @midscene/core`, connect it to your browser or device automation surface using the upstream setup guide, then author natural-language UI actions, assertions, and extraction steps through the SDK, YAML flow, or playground tooling.
```

## Documentation

- https://midscenejs.com

## Source

- [Agent Skill Exchange](https://agentskillexchange.com/skills/drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable/)
