
This could play very well with building a managed agentic system around computer-use for RPA. Great stuff!


we support browser agents (browser-use and stagehand) today, but running this locally with computer-use agents would change the game. will explore :)


Can the browser agents use my already running browser? It would be nice to automate some light workflows in platforms that require login, especially the ones that make it hard to use headless browsers

Right now my solution is to build extensions that I start manually in my browser. But using extensions to gather and export data, and then maintaining them, is a bit of a pain


yeah! for stagehand, I actually stitched together a way for you to log in and authenticate on platforms without the LLM ever seeing your login credentials. in the prompt, you specify the username as %username% and provide the credentials right below it. we then use selectors to enter the real value into the DOM, and hand control back to the agent once the login is completed. you can also get structured output. afaik, stagehand itself doesn't offer all three (secure login, agents, structured output) in its SDK, and there's no way to log in without giving the LLM your credentials. it isn't the best, but it's the only place I've seen where you can get secure login + agents + structured output
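
To illustrate the placeholder idea described above, here is a minimal sketch in Python. It is not Sim Studio's or stagehand's actual code: the function names (`placeholders_in`, `resolve_for_dom`, `redact`) and the sample credentials are hypothetical, and the real implementation may differ. The point it shows is that the prompt only ever contains tokens like %username%, while the real values are substituted just before they are typed into the page.

```python
import re

# Matches tokens such as %username% or %password% (hypothetical convention,
# mirroring the %username% syntax mentioned in the comment above).
PLACEHOLDER = re.compile(r"%(\w+)%")


def placeholders_in(prompt: str) -> set[str]:
    """Return the placeholder names that appear in the prompt."""
    return set(PLACEHOLDER.findall(prompt))


def resolve_for_dom(value: str, secrets: dict[str, str]) -> str:
    """Swap each %name% token for its real secret.

    Intended to run only at the moment a value is written into a form
    field via a selector, never on text that is sent to the LLM.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in secrets:
            raise KeyError(f"no credential provided for %{name}%")
        return secrets[name]

    return PLACEHOLDER.sub(substitute, value)


def redact(text: str, secrets: dict[str, str]) -> str:
    """Replace any real secret in page text with its placeholder
    before that text is handed back to the agent."""
    for name, secret in secrets.items():
        if secret:  # skip empty values, which would match everywhere
            text = text.replace(secret, f"%{name}%")
    return text


# Example: the LLM sees `prompt`; only the browser layer sees `secrets`.
prompt = "Log in with %username% and %password%, then open the billing page."
secrets = {"username": "ada@example.com", "password": "hunter2"}
```

The agent can plan around `prompt` freely, since it contains no real values. The browser layer calls `resolve_for_dom("%password%", secrets)` when filling the field, and `redact` guards against the page echoing a credential back into the model's context.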


Amazing. Can it also use multimodal local LLMs? For example, can it pass images to gemma3 running via ollama?


I haven't experimented with gemma 3 locally, but we have ollama instructions in the README: all you need to do is initialize the model and pull it when running sim studio. let me know how it goes!
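
On the image question specifically, here is a rough sketch of what passing an image to a local ollama model looks like at the HTTP level, independent of Sim Studio. It assumes ollama is listening on its default port (11434) and that a multimodal `gemma3` tag has already been pulled. The helper names are hypothetical, and whether Sim Studio forwards images this way is not confirmed by this thread.

```python
import base64
import json
import urllib.request

# Default local ollama endpoint (assumption: stock install, default port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_image_request(model: str, prompt: str, image_bytes: bytes) -> dict:
    """Build a request body for ollama's generate endpoint.

    Images are sent as base64-encoded strings in the "images" list.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON response
    }


def describe_image(path: str, model: str = "gemma3") -> str:
    """Send one image to a local ollama server and return the text reply.

    Requires a running ollama instance; not called at import time.
    """
    with open(path, "rb") as f:
        payload = build_image_request(model, "Describe this image.", f.read())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

If `describe_image("screenshot.png")` returns a sensible description, the local model is handling images, and any remaining issue would be in how the workflow tool passes them along.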



