This could play very well with building a managed agentic system around computer...

waleedlatif1 · 2025-04-28T17:31:31 1745861491

we have support for browser agents (browser-use and stagehand), but running this locally with computer use agents would change the game. will explore :)

nico · 2025-04-28T19:13:33 1745867613

Can the browser agents use my already running browser? It would be nice to automate some light workflows in platforms that require login, especially the ones that make it hard to use headless browsers

Right now my solution is to build extensions that I can manually start on my browser. But using extensions to gather and export data + maintaining them is a bit of a pain

waleedlatif1 · 2025-04-28T19:40:05 1745869205

yeah! for stagehand, I actually stitched together a way for you to login and authenticate on platforms without the LLM's ever seeing your login credentials. in the prompt, you can specify the username as %username% and provide the credentials right below, and then we use selectors to enter that value into the DOM and hand it back to the agent once the login is completed. you can also get structured output. afaik, stagehand themselves don't even offer these three in their SDK and there's no way to login without giving the LLM your credentials. it isn't the best, but its the only place I've seen you can get secure login + agents + structured output

nico · 2025-04-28T19:54:37 1745870077

Amazing. Can it also use multimodal local LLMs? For example, can it pass images to gemma3 running via ollama?

waleedlatif1 · 2025-04-28T20:00:36 1745870436

although I haven't experimented with gemma 3 locally, with ollama we have instructions in the README, and all you need to do is initialize the model, and pull it when running sim studio. let me know how it goes!