Self-Operating Computer

Using the same inputs and outputs as a human operator, this framework enables multimodal AI models to view the screen and decide on a series of mouse and keyboard actions to reach an objective.
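The view-decide-act cycle described above can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: the names run_loop, capture, decide, and execute, the (kind, argument) action format, and the max_steps cap are all assumptions made for the example, and the model call is left as a pluggable function.

```python
from typing import Callable, List, Tuple

# Hypothetical action format: (kind, argument), e.g. ("type", "hello").
# A kind of "done" signals that the model considers the objective reached.
Action = Tuple[str, str]


def run_loop(
    objective: str,
    capture: Callable[[], bytes],
    decide: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat: capture the screen, ask the model for one action, perform it.

    Stops when the model returns a "done" action or after max_steps
    iterations, whichever comes first. Returns the actions performed.
    """
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                            # what a human would see
        action = decide(objective, screenshot, history)   # model picks next step
        if action[0] == "done":
            break
        execute(action)                                   # mouse or keyboard input
        history.append(action)
    return history
```

In practice, capture would take a screenshot, decide would send it to a multimodal model, and execute would drive the mouse and keyboard. Passing them in as functions keeps the loop testable with stubs and leaves the choice of model open.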

Integration

Currently integrated with GPT-4-Vision as the default model.

Compatibility

Designed to work across operating systems and with various multimodal models.

Future Plans

At HyperwriteAI, we are developing Agent-1-Vision, a multimodal model designed for operating software and computer interfaces, with more accurate click location predictions.

Agent-1-Vision Model API Access

We will soon be offering API access to our Agent-1-Vision model. If you're interested in gaining access to this API, sign up here:

Sign up for API Access

Additional Thoughts

We recognize that some operating system functions may be executed more efficiently with hotkeys, such as focusing the browser address bar with Command + L, rather than by simulating a mouse click at the correct XY location.

We plan to make these improvements over time. However, it's important to note that many actions require the accurate selection of visual elements on the screen, necessitating precise XY mouse click locations.

A primary focus of this project is to refine the accuracy of determining these click locations. We believe this is essential for achieving a fully self-operating computer in the current technological landscape.
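The trade-off between hotkeys and XY clicks can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the HOTKEYS table, the intent names, the Action type, and the idea that the model reports click positions as fractions of the screen size are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical table of intents that have a reliable keyboard shortcut.
HOTKEYS = {
    "focus_address_bar": ("command", "l"),
}


@dataclass(frozen=True)
class Action:
    kind: str                       # "hotkey" or "click"
    keys: Tuple[str, ...] = ()      # used when kind == "hotkey"
    x: Optional[int] = None         # pixel column, used when kind == "click"
    y: Optional[int] = None         # pixel row, used when kind == "click"


def to_pixels(x_frac: float, y_frac: float, width: int, height: int) -> Tuple[int, int]:
    """Map fractional screen coordinates in [0, 1] to pixel coordinates."""
    if not (0.0 <= x_frac <= 1.0 and 0.0 <= y_frac <= 1.0):
        raise ValueError("fractions must lie in [0, 1]")
    # Clamp so that a fraction of exactly 1.0 still lands on the screen.
    x = min(int(x_frac * width), width - 1)
    y = min(int(y_frac * height), height - 1)
    return x, y


def plan_action(intent: str, x_frac: float, y_frac: float,
                width: int, height: int) -> Action:
    """Prefer a known hotkey; otherwise fall back to a click at the predicted spot."""
    keys = HOTKEYS.get(intent)
    if keys is not None:
        return Action(kind="hotkey", keys=keys)
    x, y = to_pixels(x_frac, y_frac, width, height)
    return Action(kind="click", x=x, y=y)
```

For example, on a 1920 by 1080 screen, plan_action("focus_address_bar", 0.5, 0.05, 1920, 1080) yields the Command + L hotkey, while an intent with no shortcut and a predicted position of (0.5, 0.25) falls back to a click at pixel (960, 270). The fallback path is the one that depends on accurate click location predictions.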

Join Our Discord Community

For real-time discussions and community support, join our Discord server.

If you're already a member, join the discussion in #self-operating-computer.

If you're new, first join our Discord server and then navigate to the #self-operating-computer channel.

An open-source framework to enable multimodal models to operate a computer.

Ask a computer to do anything

The all-encompassing AI solution you've been waiting for

AI is transforming our world, reshaping the way we work and live. We envision a future where one powerful AI agent streamlines your digital life, seamlessly integrating all your needs into a single, intelligent solution. Our Personal Assistant embodies this vision, offering you unparalleled convenience and efficiency in tackling everyday tasks, without the hassle of juggling multiple tools.

Personal Assistant

What can I help you with today?

Email Management

Effortlessly conquer your inbox. Stay organized, prioritize messages, and respond rapidly, all with the power of AI at your fingertips.

Personal Assistant

What can I help you with today?

Order me a large pizza to One Vanderbilt.

What kind of pizza would you like?

Everyday Tasks

Streamline your daily routine. From scheduling appointments and ordering food to online shopping and bill payments, let the power of AI optimize your everyday tasks for a smoother, more efficient lifestyle.

Personal Assistant

What are we working on today?

Research

Enhance your research capabilities. Dive into a wealth of knowledge, retrieve accurate information, and uncover valuable insights, all through the brilliance of AI-driven search and thought.