Operator: OpenAI’s AI Agent

OpenAI opened 2025 by introducing Operator in January, a general-purpose AI agent that can take control of a web browser and perform certain actions independently. It is available first to US users on a subscription plan and will be introduced in other countries soon. Later it will be integrated into all ChatGPT clients.

The tasks Operator can perform include booking travel accommodations, making restaurant reservations, and shopping online.

The agent uses a dedicated web browser to complete its tasks. Users can still take control of the screen while Operator is working.

Operator is powered by the Computer-Using Agent (CUA) model, which blends the vision capabilities of GPT-4o with the reasoning abilities of OpenAI’s advanced models. CUA is trained to interact with the front end of websites, so it does not have to use APIs to tap into different services. It can click buttons, navigate menus, and fill out forms on a web page just as humans do.

OpenAI is collaborating with eBay, Instacart, StubHub, Uber, Priceline, and DoorDash to ensure that Operator respects the terms and conditions of those businesses.

The CUA model is trained to ask for user confirmation before finalizing a task, so a user can double-check the model’s work before it becomes final.

OpenAI warns that CUA is not perfect and is not reliable across all scenarios. It recommends an abundance of caution and advises users to supervise certain tasks, such as banking transactions. Users may have to intervene for card transactions; Operator does not handle them on its own.

Operator has some limitations. There are rate limits, both daily and per task. It declines certain tasks for security reasons, though this may change in the future. It can also get stuck when a task is complex, in which case it asks the user to take over.
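The behaviour described above (asking for confirmation before finalizing, and handing control back when stuck) can be pictured as a simple control loop. The sketch below is only an illustration of that idea and is not OpenAI’s implementation or API: the `Step` class, the `run_task` function, and the `sensitive` and `blocked` flags are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One action in a task (hypothetical structure, for illustration only)."""
    description: str
    sensitive: bool = False  # assumption: e.g. placing an order or paying
    blocked: bool = False    # assumption: the agent cannot proceed alone


def run_task(steps: List[Step], confirm: Callable[[str], bool]) -> List[str]:
    """Run steps in order; pause for confirmation or hand over when needed."""
    log: List[str] = []
    for step in steps:
        if step.blocked:
            # The agent is stuck, so control returns to the user.
            log.append(f"handed over: {step.description}")
            break
        if step.sensitive and not confirm(step.description):
            # The user declined, so the action is never finalized.
            log.append(f"cancelled: {step.description}")
            break
        log.append(f"done: {step.description}")
    return log


if __name__ == "__main__":
    task = [Step("open the store page"), Step("place the order", sensitive=True)]
    # Here the user is simulated by a function that always approves.
    for line in run_task(task, confirm=lambda description: True):
        print(line)
```

In this sketch the user’s decision is passed in as a function, which keeps the final say outside the agent’s own logic.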

OpenAI has been slower than others to develop agents, taking into account the safety risks around the technology. It now considers Operator safe enough. It is a bold attempt.
