An AI detector mislabeled nearly every essay written by a non-native English speaker as being written by a bot

Stanford researchers found that some of the most popular AI-detection tools flagged essays by non-native English speakers as text generated by AI. Getty Images
  • Researchers found popular GPT-detectors flagged essays by non-native English speakers as AI-written.

  • One detection system marked almost 98% of their essays as AI-generated text.

  • The findings add to concerns about the effectiveness of AI-detection systems and about systemic bias in AI.

Systems that detect AI-generated writing are flagging essays written by non-native English speakers as bot-generated, researchers from Stanford University said.

In the study published Monday, the researchers ran more than 100 essays written by non-native English speakers through seven popular GPT detectors. (GPT refers to the family of large language models behind some of the most widely used chatbots today, including ChatGPT.) The essays were written for an English-proficiency exam.

The researchers also fed the detectors essays written by US eighth graders who speak English natively.

More than half of the essays written by non-native English speakers were marked as AI-generated by the detection systems, the Stanford researchers found. And one GPT-detection system flagged almost 98% of those essays as written by AI.

But when evaluating essays written by eighth-grade native English speakers, the detection systems performed much better, flagging about 20% of their essays as AI-produced. The systems were still mislabeling human-written essays as bot-written, just not at nearly the same rate.

The researchers warned the discrepancy could cause considerable harm to non-native English speakers, who could be falsely accused of submitting chatbot-produced work in school assignments, college applications, or their jobs.

AI-detection software is increasingly being marketed as a silver bullet against the flood of misinformation being produced with tools like ChatGPT, DALL-E, and Midjourney. But the study adds to the growing questions surrounding the effectiveness of these systems, and it echoes longstanding calls to address biases hardwired into AI systems.

Chatbots like ChatGPT generate written text by predicting the most likely word to come next in a sentence. They make these predictions using underlying large language models, or LLMs, which are trained on enormous datasets scraped from publicly available sources like social media platforms and Wikipedia.
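For a concrete picture of that process, here is a minimal sketch, assuming the openly released GPT-2 model and the Hugging Face transformers library as stand-ins for the proprietary models behind ChatGPT: the model assigns a probability to every possible next token given the prompt, and the simplest decoding strategy just keeps the most likely one.

```python
# A rough illustration, not ChatGPT's actual internals: score every token
# in GPT-2's vocabulary as a possible continuation of the prompt, then
# pick the single most likely next word (greedy decoding).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits            # one score per vocabulary token, per position
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_id = int(torch.argmax(next_token_probs))
print(tokenizer.decode(top_id))                # the model's most likely next word
```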

The researchers attributed the results of their study to a metric many GPT-detection systems use called "text perplexity." In an email to Insider, Stanford professor James Zou, who was the corresponding author of the study, said the metric "measures how surprising the word choices are in the text."

The text that ChatGPT and other chatbots spit out generally scores as low-perplexity because the models lean on the most common, statistically likely words and phrasings when constructing sentences. As a result, Zou said, "these detectors are more likely to flag text with low perplexity as AI-generated content."
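To see what that metric looks like in practice, here is a minimal sketch of perplexity scoring, again assuming the open GPT-2 model and the Hugging Face transformers library rather than any of the detectors the researchers actually tested. Perplexity is the exponential of the model's average per-token loss: ordinary, predictable phrasing scores low, while unusual word choices score high.

```python
# A minimal sketch of "text perplexity" scoring with GPT-2; the detectors
# in the study rely on their own models, but the idea is the same.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Lower values mean the model found the word choices less surprising."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing the input ids as labels makes the model return the average
        # cross-entropy loss over the tokens; exp(loss) is the perplexity.
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return torch.exp(loss).item()

print(perplexity("The weather was nice and we had a good time."))
print(perplexity("Cerulean dusk pooled over the quay as gulls bickered."))
# The plainer sentence typically gets the lower score, which is why
# detectors keyed to low perplexity can misfire on simple human writing.
```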

When the detection systems evaluated the essays written by non-native English speakers, they marked much of the writing as low-perplexity because the writers used "a more limited range of linguistic expression," the study said. This resulted in the vast majority of their content being flagged as AI-generated text.

The GPT detectors tested in the study aren't the only defensive tools producing poor results. In June, The New York Times evaluated five programs designed to spot AI-generated images. The newspaper fooled the detectors on several occasions, noting that the tools didn't apply logical reasoning when assessing the authenticity of an image.

Meanwhile, upon ChatGPT's initial release, OpenAI warned users in a blog post that the chatbot would "sometimes respond to harmful instructions or exhibit biased behavior." The company said at the time that it was moving to "block certain types of unsafe content," but that it expected the chatbot "to have some false negatives and positives" for the time being.

OpenAI CEO Sam Altman acknowledged in a tweet earlier this year that ChatGPT has "shortcomings around bias" and vowed the company would address them. "I'm optimistic that we will get to a world where these models can be a force to reduce bias in society, not reinforce it," Altman said in a May interview with Rest of World.

Read the original article on Business Insider
