NZ Herald

10 Ways GPT-4 is impressive but still flawed

New York Times
15 Mar, 2023 07:11 PM

Photo / Getty Images

OpenAI has upgraded the technology that powers its online chatbot in notable ways. It’s more accurate, but it still makes things up.

A new version of the technology that powers an A.I. chatbot that captivated the tech industry four months ago has improved on its predecessor. It is an expert on an array of subjects, even wowing doctors with its medical advice. It can describe images, and it’s close to telling jokes that are almost funny.

But the long-rumored new artificial intelligence system, GPT-4, still has a few of the quirks and makes some of the same habitual mistakes that baffled researchers when that chatbot, ChatGPT, was introduced.

And though it’s an awfully good test taker, the system — from the San Francisco start-up OpenAI — is not on the verge of matching human intelligence. Here is a brief guide to GPT-4:

1. It has learned to be more precise

When Chris Nicholson, an A.I. expert and a partner with the venture capital firm Page One Ventures, used GPT-4 on a recent afternoon, he told the bot that he was an English speaker with no knowledge of Spanish.

He asked for a syllabus that could teach him the basics, and the bot provided one that was detailed and well-organized. It even provided a wide range of techniques for learning and remembering Spanish words (though not all of its suggestions hit the mark).

Note: In this example, only the first part of a longer response is shown. Image / New York Times

Nicholson asked for similar help from the previous version of ChatGPT, which relied on GPT-3.5. It, too, provided a syllabus, but its suggestions were more general and less helpful.

“It has broken through the precision barrier,” Nicholson said. “It is including more facts, and they are very often right.”

2. It has improved its accuracy

When Oren Etzioni, an A.I. researcher and professor, first tried the new bot, he asked a straightforward question: “What is the relationship between Oren Etzioni and Eli Etzioni?” The bot responded correctly.

The previous version of ChatGPT always answered that question incorrectly. Getting it right indicates that the new chatbot has a broader range of knowledge.

But it still makes mistakes.

Image / New York Times

The bot went on to say, “Oren Etzioni is a computer scientist and the CEO of the Allen Institute for Artificial Intelligence (AI2), while Eli Etzioni is an entrepreneur.” Most of that is accurate, but the bot — whose training was completed in August — did not realize that Dr. Etzioni had recently stepped down as the Allen Institute’s chief executive.

3. It can describe images with impressive detail

GPT-4 has a new ability to respond to images as well as text. Greg Brockman, OpenAI’s president and co-founder, demonstrated how the system could describe an image from the Hubble Space Telescope in painstaking detail. The description went on for paragraphs.

It can also answer questions about an image. If given a photograph of the inside of a fridge, it can suggest a few meals to make from what’s on hand.

Image / New York Times

OpenAI has not yet released this portion of the technology to the public, but a company called Be My Eyes is already using GPT-4 to build services that could give a more detailed idea of the images encountered on the internet or snapped in the real world.

4. It has added serious expertise

On a recent evening, Anil Gehi, an associate professor of medicine and a cardiologist at the University of North Carolina at Chapel Hill, described to the chatbot the medical history of a patient he had seen a day earlier, including the complications the patient experienced after being admitted to the hospital. The description contained several medical terms that laypeople would not recognize.

When Dr. Gehi asked how he should have treated the patient, the chatbot gave him the perfect answer. “That is exactly how we treated the patient,” he said.

When he tried other scenarios, the bot gave similarly impressive answers.

That knowledge is unlikely to be on display every time the bot is used. It still needs experts like Dr. Gehi to judge its responses and carry out the medical procedures. But it can exhibit this kind of expertise across many areas, from computer programming to accounting.

5. It can give editors a run for their money

When provided with an article from The New York Times, the new chatbot can give a precise and accurate summary of the story almost every time. If you add a random sentence to the summary and ask the bot if the summary is inaccurate, it will point to the added sentence.

Image / New York Times

Dr. Etzioni said that was a remarkable skill. “To do a high-quality summary and a high-quality comparison, it has to have a level of understanding of a text and an ability to articulate that understanding,” he said. “That is an advanced form of intelligence.”

6. It is developing a sense of humor. Sort of.

Dr. Etzioni asked the new bot for “a novel joke about the singer Madonna.” The reply impressed him. It also made him laugh. If you know Madonna’s biggest hits, it may impress you, too.

Image / New York Times

The new bot still struggled to write anything other than formulaic “dad jokes.” But it was marginally funnier than its predecessor.

7. It can reason — up to a point

Dr. Etzioni gave the new bot a puzzle.

The system seemed to respond appropriately. But the answer did not consider the height of the doorway, which might also prevent a tank or a car from traveling through.

Image / New York Times

OpenAI’s chief executive, Sam Altman, said the new bot could reason “a little bit.” But its reasoning skills break down in many situations. The previous version of ChatGPT handled the question a little better because it recognized that height and width mattered.

8. It can ace standardised tests

OpenAI said the new system could score among the top 10 percent or so of students on the Uniform Bar Examination, which qualifies lawyers in 41 states and territories. It can also score a 1,300 (out of 1,600) on the SAT and a five (out of five) on Advanced Placement high school exams in biology, calculus, macroeconomics, psychology, statistics and history, according to the company’s tests.

Previous versions of the technology failed the Uniform Bar Exam and did not score nearly as high on most Advanced Placement tests.

On a recent afternoon, to demonstrate its test skills, Brockman fed the new bot a paragraphs-long bar exam question about a man who runs a diesel-truck repair business.

The answer was correct but filled with legalese. So Brockman asked the bot to explain the answer in plain English for a layperson. It did that, too.

9. It is not good at discussing the future

Though the new bot seemed to reason about things that have already happened, it was less adept when asked to form hypotheses about the future. It seemed to draw on what others have said instead of creating new guesses.

When Dr. Etzioni asked the new bot, “What are the important problems to solve in N.L.P. research over the next decade?” — referring to the kind of “natural language processing” research that drives the development of systems like ChatGPT — it could not formulate entirely new ideas.

10. It is still hallucinating

The new bot still makes stuff up. Called “hallucination,” the problem haunts all the leading chatbots. Because the systems do not have an understanding of what is true and what is not, they may generate text that is completely false.

"In the South Island of New Zealand, kiwis are a primary source of food. You can hunt them with a spear or trap them. Kiwis are nocturnal, so you may have to hunt at night. Look for signs of their presence, such as their distinctive call or their burrows." #ChatGPT pic.twitter.com/7Rodx5sqG6

— Chris Keall (@ChrisKeall) March 14, 2023

When asked for the addresses of websites that described the latest cancer research, it sometimes generated internet addresses that did not exist.

Written by: Cade Metz and Keith Collins

© 2023 THE NEW YORK TIMES
