Even its maker, OpenAI, warns that ChatGPT Agent won't always ask for human confirmation where it is required before taking an action. Image / Getty Creative
Experts say “Agentic AI” will be the next big thing in artificial intelligence.
And that might already be the case in a finely honed corporate environment, where an AI agent – technology capable of carrying out tasks and making its own decisions, within certain parameters – can be deployed for tasks such as fielding helpdesk queries.
But the new general-purpose AI agents, aimed at everyday users, can be a much hairier proposition.
Experts say they’re too slow and clumsy to save you much time at this point, and that there are risks once they’ve got spending power.
Mastercard and Visa policies around purchases made by AI agents are emerging, but they are a work in progress that could require a test case to refine (more on that below).
In the domestic environment, use cases include everything from an AI agent “looking” at a photo of a dish, working out all the ingredients, ordering them for you and finding a recipe, to cleaning up and managing your email inbox, and sending basic replies on your behalf.
Not just about you
Deloitte NZ AI director Dr Amanda Williamson says it’s important to remember that when you give an AI agent full access to your email, it’s not just you deciding to trust a big tech firm with your sensitive information.
You’re also making a decision on behalf of all your friends, family and colleagues – anyone who might send you a message. The same goes for giving an agent access to your calendar and letting it make appointments for you.
When you give an AI agent full access to your inbox, you're also making a decision on behalf of friends, family and colleagues, says Deloitte artificial intelligence director Dr Amanda Williamson.
And Williamson, who has been using a number of agents, says they can’t be left to do their own thing.
“It can be a bit frustrating to use, because it asks for human input quite regularly ... If you ask it to research something, it might say after 10 minutes, ‘Oh, would you like that to be in a PowerPoint presentation?’ and if a human isn’t sitting there to click ‘yes’ it’s going to be a frustrating experience.”
Agents have been pushed towards the mainstream, with a number of home users experimenting with the recently released ChatGPT Agent (now available to New Zealanders on a US$20 ($34) or US$200 a month premium plan).
Protection from rogue spending
Early adopters who were happy to give ChatGPT Agent full access to their bank accounts – all the better for a snappy social clip – often found it went for top-rated items from top-rated sellers. One user did get coffee delivered in half an hour, even if it was the most expensive cup of his life (see clip below).
Video: Using OpenAI's new agent to order coffee from DoorDash.
Williamson found a grey area when she asked Perplexity’s AI agent to order her ingredients for making a pavlova. She assumed it would ask her to okay the final order, but instead the agent tried to go ahead and buy the items from New World’s online store, where Williamson’s credit card details were saved and the agent did not have to enter a security code.
In the event, the agent was thwarted because her card on file was out of date (see video below).
Whatever you think of Visa and Mastercard’s fees, they’ve helped fund a near black-and-white protection against unauthorised spending on your card. If you didn’t authorise the spending, you can get your money back in most circumstances.
AI agents have introduced an element of fuzziness.
“Today, Visa’s existing rules for disputes and liability apply to purchases made by AI agents,” a Visa spokesman said.
“If an agent places an order, it must show that the cardholder authorised and authenticated the transaction and received the goods or services.
“If a cardholder has stored their payment details and enrolled in an AI service, this may be considered consent for future purchases.
“However, Visa’s zero liability policy still protects users against genuinely unauthorised transactions.”
Mastercard was unable to comment by deadline.
Williamson says the liability questions are sweeping, extending well beyond credit card purchases.
What happens, she asks, if an agent books the wrong flight, makes an error screening job applicants or enters incorrect medical data on a form?
“These are all very fresh questions,” she says. And at this point, many institutions have not even grasped the power of agents, let alone formed policies around them, she adds.
‘Riskier than standard ChatGPT’
“An AI personal assistant might sound great at first glance but the risks of using something like ChatGPT Agent are much higher than when using standard ChatGPT,” Simply Privacy director Frith Tweedie says.
“The more you want it to do for you, the more access you have to give it,” the former digital and tech lawyer says.
That comes against the backdrop of OpenAI’s own testing, which found that, where confirmation is required before an action, its agent asks for it 91% of the time – implying that the other 9% of the time it may act without asking.
“So you really need to consider how comfortable you are handing over access to your credit card details, website logins, calendar or contacts to a black box.”
OpenAI has published the ChatGPT Agent System Card, its technical brief on how the system works, Tweedie says.
OpenAI acknowledges privacy risk
“While they’ve clearly thought carefully about the various risks and how to reduce them, they explicitly note that ‘ChatGPT agent may have access to sensitive and private data about the user (eg via their Google drive or email)’ and acknowledge ‘the risk that ChatGPT agent could mistakenly reveal this private data in ways the user doesn’t intend’.”
OpenAI acknowledges in its terms there is a risk a ChatGPT agent could mistakenly reveal private data "in ways the user doesn’t intend”, says Simply Privacy director Frith Tweedie.
‘Onus on us’
Even Sam Altman, not generally known for underselling OpenAI’s services, has said “bad actors may try to ‘trick’ users’ AI agents into giving private information they shouldn’t, in ways we can’t predict”, Tweedie says.
“He recommended that people give agents ‘the minimum access required to complete a task’. But once again, this puts the onus on us to manage these risks, rather than big tech fully addressing them.”
There’s also the paradox that Altman is recommending minimum access, but for maximum usefulness an agent needs access to all your logins and data.
Lastly, Tweedie says: “The well-established accuracy challenges of generative AI aren’t going anywhere with agentic AI – they’re multiplying.
“These kinds of agents typically involve a chain of individual LLMs [large language models] and the potential multiplication of errors.
“Plus, later models often accept earlier answers as gospel, so a single early hallucination can echo right through to the final action.
“That could look like buying the wrong thing, submitting your credit card details to a scam website or worse.”
While you can get agents to peer review each other’s accuracy, co-ordinating multiple agents requires another layer of logic (often itself an LLM) that might introduce fresh errors or rubber-stamp the wrong consensus, Tweedie says.
Privacy protections weak compared to elsewhere
The rise of agentic AI underscores a range of issues that increasingly challenge the Privacy Act and the extent of protection it provides, she says.
“While it provides a flexible, tech-agnostic and principles-based framework, unlike Australia, the UK and Europe, we have no rights to an explanation of automated decisions. And our tiny fines [$10,000 compared with up to A$50 million ($55m) in Australia] are absolutely no deterrent for global AI vendors or local corporates betting big on productivity gains.”
Chris Keall is an Auckland-based member of the Herald’s business team. He joined the Herald in 2018 and is the technology editor and a senior business writer.