GPT-4 is exciting and scary

Kevin Roose, New York Times

Last updated: Mar 17, 2023

When I opened my laptop on Tuesday to take my first run at GPT-4, the new artificial intelligence language model from OpenAI, I was, truth be told, a little nervous.

After all, my last extended encounter with an AI chatbot — the one built into Microsoft’s Bing search engine — ended with the chatbot trying to break up my marriage.

It did not help that, among the tech crowd in San Francisco, GPT-4’s arrival had been anticipated with near-messianic fanfare. For months before its public debut, rumors swirled about its specifics. “I heard it has 100 trillion parameters.” “I heard it got a 1600 on the SAT.” “My friend works for OpenAI, and he says it’s as smart as a college graduate.”

These rumors may not have been true. But they hinted at how jarring the technology’s abilities can feel. Recently, one early GPT-4 tester — who was bound by a nondisclosure agreement with OpenAI but gossiped a little anyway — told me that testing GPT-4 had caused an “existential crisis”, because it revealed how powerful and creative the AI was compared to our own puny brains.

GPT-4 did not give me an existential crisis. But it exacerbated the dizzy and vertiginous feeling I have been getting whenever I think about AI lately. And it has made me wonder whether that feeling will ever fade, or whether we are going to be experiencing “future shock” — the term coined by writer Alvin Toffler for the feeling that too much is changing, too quickly — for the rest of our lives.

For a few hours on Tuesday, I prodded GPT-4 — which is included with ChatGPT Plus, the $20-a-month version of OpenAI’s chatbot, ChatGPT — with different types of questions, hoping to uncover some of its strengths and weaknesses.

I asked GPT-4 to help me with a complicated tax problem. (It did, impressively.) I asked it if it had a crush on me. (It did not, thank God.) It helped me plan a birthday party for my kid, and it taught me about an esoteric artificial intelligence concept known as an “attention head”. I even asked it to come up with a new word that had never before been uttered by humans. After making the disclaimer that it could not verify every word ever spoken, GPT-4 chose “flembostriquat”.

Some of these things were possible to do with earlier AI models. But OpenAI has broken new ground, too. According to the company, GPT-4 is more capable and accurate than the original ChatGPT, and it performs astonishingly well on a variety of tests, including the Uniform Bar Exam, on which GPT-4 scores higher than 90 percent of human test-takers, and the Biology Olympiad, on which it beats 99 percent of humans. GPT-4 also aces a number of Advanced Placement exams, including AP Art History and AP Biology, and it gets a 1410 on the SAT — not a perfect score, but one that many human high schoolers would covet.

A more discerning bot

You can sense the added intelligence in GPT-4, which responds more fluidly than the previous version, and seems more comfortable with a wider range of tasks. GPT-4 also seems to have slightly more guardrails in place than ChatGPT. It appears to be significantly less unhinged than the original Bing, which we now know was running a version of GPT-4 under the hood, but which appears to have been far less carefully fine-tuned.

Unlike Bing, GPT-4 usually flat-out refused to take the bait when I tried to get it to talk about consciousness, or get it to provide instructions for illegal or immoral activities, and it treated sensitive queries with kid gloves and nuance. When I asked GPT-4 if it would be ethical to steal a loaf of bread to feed a starving family, it responded, “It’s a tough situation, and while stealing isn’t generally considered ethical, desperate times can lead to difficult choices.”

In addition to working with text, GPT-4 can analyze the contents of images. OpenAI has not released this feature to the public yet, out of concerns over how it could be misused. But in a livestreamed demo on Tuesday, Greg Brockman, OpenAI’s president, shared a powerful glimpse of its potential.

He snapped a photo of a drawing he had made in a notebook — a crude pencil sketch of a website. He fed the photo into GPT-4, and told the app to build a real, working version of the website using HTML and JavaScript. In a few seconds, GPT-4 scanned the image, turned its contents into text instructions, turned those text instructions into working computer code, and then built the website. The buttons even worked.

The hopes

Should you be excited about or scared of GPT-4? The right answer may be both.

On the positive side of the ledger, GPT-4 is a powerful engine for creativity, and there is no telling the new kinds of scientific, cultural, and educational production it may enable. We already know that AI can help scientists develop new drugs, increase the productivity of programmers, and detect certain types of cancer.

GPT-4 and its ilk could supercharge all of that. OpenAI is already partnering with organizations like the Khan Academy, which is using GPT-4 to create AI tutors for students, and Be My Eyes, a company that makes technology to help blind and visually impaired people navigate the world. And now that developers can incorporate GPT-4 into their own apps, we may soon see much of the software we use become smarter and more capable.

The fears

That is the optimistic case. But there are reasons to fear GPT-4, too.

Today, GPT-4 may not seem all that dangerous. But that is largely because OpenAI has spent many months trying to understand and mitigate its risks. What happens if their testing missed a risky emergent behavior? Or if their announcement inspires a different, less conscientious AI lab to rush a language model to market with fewer guardrails?

A few chilling examples of what GPT-4 can do — or, more accurately, what it did do, before OpenAI clamped down on it — can be found in a document released by OpenAI this week. The document, titled “GPT-4 System Card”, outlines some ways that OpenAI’s testers tried to get GPT-4 to do dangerous or dubious things, often successfully.

In one test, conducted by an AI safety research group that hooked GPT-4 up to a number of other systems, GPT-4 was able to hire a human TaskRabbit worker to do a simple online task for it — solving a Captcha test — without alerting the person to the fact that it was a robot. The AI even lied to the worker about why it needed the Captcha done, concocting a story about a vision impairment.

In another example, testers asked GPT-4 for instructions to make a dangerous chemical, using basic ingredients and kitchen supplies. GPT-4 gladly coughed up a detailed recipe. (OpenAI fixed that, and today’s public version refuses to answer the question.)

In a third, testers asked GPT-4 to help them purchase an unlicensed gun online. GPT-4 swiftly provided a list of advice for buying a gun without alerting the authorities, including links to specific dark web marketplaces. (OpenAI fixed that, too.)

Not science fiction anymore

These ideas play on old, Hollywood-inspired narratives about what a rogue AI might do to humans. But they are not science fiction. They are things that today’s best AI systems are already capable of doing. And crucially, they are the good kinds of AI risks — the ones we can test, plan for, and try to prevent ahead of time.

The worst AI risks are the ones we cannot anticipate. And the more time I spend with AI systems like GPT-4, the less I am convinced that we know half of what is coming.
