Unmasking the automaton: detecting texts generated by Artificial Intelligence

Santiago Vikklegas-Ceballos
July 8, 2023

Artificial intelligence, particularly large language models (LLMs), has made significant progress in generating human-like text. As AI-generated text becomes more prevalent in our daily lives, it is essential to develop skills and tools to identify such content. This guide explains what an LLM is and how it works, presents manual detection techniques and automated tools for detecting AI-generated text, and recommends some key media and information literacy courses for navigating the new era of artificial intelligence.

What is a large language model and how does it work?

A large language model (LLM) is a type of artificial intelligence that can understand, generate, and translate text by learning from large amounts of written content, such as books, websites, articles, and social media posts. Once trained, LLMs can efficiently perform a variety of tasks, such as summarizing articles, answering questions, or even creating new text that sounds as if it was written by a human. They use deep learning, which means they rely on a network that combines a large number of mathematical functions to mimic the way our brains process information.

LLMs operate based on probabilities: they learn to recognize patterns and relationships between contexts, words, and phrases in large amounts of data. They create text by selecting the next most likely word or phrase based on their understanding of the context and the words they have already generated.
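
As a toy illustration of the "next most likely word" idea, the Python sketch below counts which word follows which in a tiny made-up corpus and then picks the most frequent follower. The corpus, the function names, and the word-pair counting are assumptions chosen for illustration only; real LLMs use neural networks over far longer contexts rather than a lookup table.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only).
CORPUS = "the cat sat on the mat . the cat ate the fish . the cat sat down ."


def build_bigram_counts(text):
    """Count how often each word follows each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the follower counts for `word` into probabilities."""
    followers = counts[word]
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()}


def most_likely_next(counts, word):
    """Pick the single most probable next word (word must appear in the corpus)."""
    probs = next_word_probabilities(counts, word)
    return max(probs, key=probs.get)


counts = build_bigram_counts(CORPUS)
print(next_word_probabilities(counts, "cat"))
print(most_likely_next(counts, "cat"))
```

In this corpus "cat" is followed by "sat" twice and "ate" once, so "sat" is chosen. Always picking the top candidate is one reason generated text can feel more uniform than human writing.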

As LLMs such as GPT-4 become increasingly sophisticated, it is difficult to determine whether a text was generated by AI or written by a human. Text generated by an LLM can be more consistent and less random than human writing, yet it remains difficult to identify with absolute certainty. Although tools exist to detect AI-generated text, their reported detection rates vary significantly, from 26% to 90%. This uncertainty highlights the importance of media and information literacy if people are to harness the potential of AI-generated content responsibly.

Tips for manual detection of AI texts

AI-generated text often has characteristics that can help distinguish it from human-written content. Some key points to check are:

Check for consistency: AI-generated text may contain inconsistencies in narrative, tone, or focus. Unexplained shifts in grammatical gender, pronouns, tense, or person may point to AI.
Analyze fluency: The text may be too fluent or lack common human errors, such as typos or colloquialisms. Depending on the writer's level, some common errors and colloquial language are to be expected.
Search for repetitions: The text may repeat certain phrases or use similar sentence structures over and over. In these cases, it is advisable to review the author's writing history to determine whether these are structures and words they frequently use.
Examine coherence: The text may lose coherence or context as it goes on, straying from the main topic. Reviewing the main ideas and the line of argument is key when dealing with more professional texts.
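
The repetition check in particular can be partly automated. The Python sketch below lists word sequences that appear more than once in a text. The function name, the three-word window, the threshold of two occurrences, and the sample text are all assumptions for illustration; a repeated phrase is a prompt for closer reading, not evidence of AI authorship.

```python
import re
from collections import Counter


def repeated_phrases(text, n=3, min_count=2):
    """Return word sequences of length `n` that occur at least `min_count` times."""
    words = re.findall(r"[a-z']+", text.lower())
    ngrams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


# Made-up sample text (an assumption for illustration only).
sample = (
    "It is important to note that results vary. "
    "It is important to note that tools differ. "
    "Readers should stay critical."
)
print(repeated_phrases(sample))
```

Running the same function over an author's earlier writing gives a baseline for the phrases they habitually repeat.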

Automated tools for detecting AI-generated text

Multiple tools and algorithms have been developed to detect AI-generated text, although none has achieved 100% reliability in independent tests. Some of the best known are:

OpenAI AI Classifier: This tool analyzes text and assigns a confidence level to indicate whether the content is AI-generated or human-written. It correctly flags around 26% of AI-written text, with a 9% false positive rate (mistakenly identifying text as AI-generated when it was actually written by a human).
Turnitin AI Writing Detector: This paid tool, soon to be launched by the market leader in anti-plagiarism software, claims an accuracy rate of 97% in detecting AI-generated text.
GPTZero: It is the most widely used free app. Although it claims effectiveness rates of over 90%, there are no independent studies to support this.
Originality AI: It claims to accurately detect text produced by GPT-3, GPT-3.5, and ChatGPT. It assigns a percentage probability that the text was generated by a human or by AI. However, it is not perfect, with a success rate of 71%.
DetectGPT: Created by professors and academic researchers, it is a method for detecting AI-generated text, tested on GPT-2 (a much older model), that reportedly improves detection rates from 81% to 95% in its latest version.
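
To see what such figures mean in practice, the Python sketch below works through the arithmetic for the OpenAI classifier's numbers quoted above (26% of AI-written text flagged, 9% of human-written text flagged by mistake). The share of AI-written texts in the pile being checked (`ai_share`) is a hypothetical value chosen only for illustration, as is the function name.

```python
def share_of_flags_that_are_ai(detection_rate, false_positive_rate, ai_share):
    """Among flagged texts, the fraction that was really AI-written (Bayes' rule)."""
    flagged_ai = detection_rate * ai_share
    flagged_human = false_positive_rate * (1 - ai_share)
    return flagged_ai / (flagged_ai + flagged_human)


# 0.26 and 0.09 are the figures quoted above; ai_share=0.20 is an assumption.
precision = share_of_flags_that_are_ai(0.26, 0.09, ai_share=0.20)
print(f"{precision:.0%} of flagged texts would actually be AI-written")
```

Under that assumed 20% share, fewer than half of the flagged texts would really be AI-written, which is why a detector's verdict should be treated as one clue among several rather than as proof.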

Keys to continuous learning

Media and information literacy (MIL) is crucial for understanding and evaluating AI-generated content. MIL helps users develop critical thinking skills and promotes the responsible consumption of digital content. Some key MIL skills, and courses to help you develop them, include:

Critical thinking: to evaluate information critically and question its credibility, which helps in recognizing disinformation, whether generated by AI or not. The open course platform edX (which hosts university programs from major institutions around the world) offers the free course “Critical thinking: reasoned decision making”.
Digital citizenship: to act responsibly and ethically when encountering AI-generated content. This includes understanding the potential risks and consequences of sharing or using unverified AI-generated text. A vast number of free classes for digital citizens are available; the Colombian Ministry of Education offers some featured courses through its Colombia Learns program.
Understanding and using AI technology: Media and information literacy contributes to a better overall understanding of technologies, helping users to distinguish between different forms of AI and their applications in everyday life. Recently, in conjunction with the Santiago Library (Chile), we launched the free course “Introduction to Artificial Intelligence for Librarians”, where you can find valuable information for any AI user.
Promoting resilience against fake news and disinformation: MIL enables users to detect and counter fake news and disinformation, including AI-generated text, which has the potential to disrupt societies and affect decision-making processes. A good first step is the guide to detecting fake news from the International Federation of Library Associations and Institutions (IFLA), which is free for everyone to use.

The ability to detect AI-generated text is becoming vital in the digital age. By learning manual detection techniques, using automated tools, and promoting media and information literacy, people can engage effectively with AI-generated content. The key is to remain vigilant, think critically, and stay informed about the evolving landscape of AI-generated text and its implications for our daily lives.

References

Akhtar, P., Ghouri, A. M., Khan, H. U. R., Haq, M. A., Awan, U., Zahoor, N., Khan, Z., & Ashraf, A. (2022). Detecting fake news and disinformation using artificial intelligence and machine learning to avoid supply chain disruptions. Annals of Operations Research. https://doi.org/10.1007/s10479-022-05015-5

Alimardani, A., & Jane, E. A. (2023, February 19). We pitted ChatGPT against tools for detecting AI-written text, and the results are troubling. The Conversation. Retrieved April 9, 2023, from https://theconversation.com/we-pitted-chatgpt-against-tools-for-detecting-ai-written-text-and-the-results-are-troubling-199774

Bryden, T. (2022, December 2). How Do Large Language Models Work? Speak Ai. Retrieved April 9, 2023, from https://speakai.co/how-do-large-language-models-work/

Decision Engines. (2023, February 27). Understanding Large Language Models: What They Are and How to Use Them. Decision Engines, Inc. Retrieved April 9, 2023, from https://decisionengines.ai/understanding-large-language-models/

Dilmegani, C. (2023, February 16). Large Language Models: Complete Guide in 2023. AIMultiple. Retrieved April 9, 2023, from https://research.aimultiple.com/large-language-models/

Hassan, M. H. (2023, February 18). DetectGPT – Detecting AI Generated Text. Medium. Retrieved April 9, 2023, from https://medium.com/@TheHaseebHassan/detectgpt-detecting-ai-generated-text-a0284f1d05de

Heikkilä, M. (2022, December 19). How to spot AI-generated text. MIT Technology Review. Retrieved April 9, 2023, from https://www.technologyreview.com/2022/12/19/1065596/how-to-spot-ai-generated-text/

Kirchner, J. H., Ahmad, L., Aaronson, S., & Leike, J. (2023, January 31). New AI classifier for indicating AI-written text. OpenAI. Retrieved April 9, 2023, from https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text

Mahmood, S. H. (2023, February 25). The 8 Most Accurate AI Text Detectors You Can Try. MUO – Make Use Of. Retrieved April 9, 2023, from https://www.makeuseof.com/accurate-ai-text-detectors/

Muehmel, K. (2023, March 15). What Is a Large Language Model, the Tech Behind ChatGPT? Retrieved April 9, 2023, from https://blog.dataiku.com/large-language-model-chatgpt

Rogers, R. (2023, February 8). How to Detect AI-Generated Text, According to Researchers. WIRED. Retrieved April 9, 2023, from https://www.wired.com/story/how-to-spot-generative-ai-text-chatgpt/

Sánchez, S., Rojo, A. F., Martínez, A., & Samaniego, C. M. (2021). Media and information literacy: a measurement instrument for adolescents. Educational Review, 73(4), 487–502. https://doi.org/10.1080/00131911.2019.1646708

Terra, J. (2023, February 21). What is Media and Information Literacy? Simplilearn.com. Retrieved April 9, 2023, from https://www.simplilearn.com/what-is-media-and-information-literacy-article

Turnitin. (2023, February 13). Turnitin announces AI writing detector and AI writing resource center for educators. Retrieved April 9, 2023, from https://www.turnitin.com/press/turnitin-announces-ai-writing-detector-and-ai-writing-resource-center-for-educators

UNESCO. (2022). Media and information literate citizens: Think critically, click wisely! UNESCO Institute for Information Technologies in Education. Retrieved April 9, 2023, from https://iite.unesco.org/publications/media-and-information-literate-citizens-think-critically-click-wisely/

Wang, B., Rau, P. P., & Yuan, T. (2022). Measuring user competence in using artificial intelligence: validity and reliability of artificial intelligence literacy scale. Behaviour & Information Technology, 1–14. https://doi.org/10.1080/0144929x.2022.2072768

Wiggers, K. (2022, April 28). The emerging types of language models and why they matter. TechCrunch. Retrieved April 9, 2023, from https://techcrunch.com/2022/04/28/the-emerging-types-of-language-models-and-why-they-matter/
