What is GPT?
GPT stands for Generative Pre-trained Transformer. It is a type of large language model and a prominent framework for generative artificial intelligence.
There are two broad ways to build a language model. The first is statistical: collect statistics on how words follow one another in human-written text and predict the next word from those frequencies. The second is to use a neural network, which performs better. Google also uses this kind of next-word prediction in its search engine to suggest what the user is about to type.
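To make the first approach concrete, here is a minimal sketch of a count-based (bigram) predictor; the toy corpus and the function name are invented purely for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for "a person's language"; purely illustrative.
corpus = "the cat sat on the mat and the cat ate the food".split()

# Count how often each word follows each other word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the word most frequently observed after `word`."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # 'cat': "cat" follows "the" more often than any other word here
```

A neural language model replaces these raw counts with learned parameters, which is what lets it generalize far beyond the exact word pairs it has already seen.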
At its core, GPT is a conditional probability predictor for natural language: given the text so far, it estimates which token is most likely to come next. The difference between human beings and GPT is the vast amount of learning and the writing process itself. When we write an article or essay, we draft, revise, and repeat until we think it is done. GPT does not follow the same process: when a user types a prompt, it draws on the patterns it learned from its training data and generates the requested text one token at a time.
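As a hedged illustration of what "conditional probability prediction" means, the sketch below generates text by repeatedly sampling the next word from a hand-written probability table. The table and its numbers are invented; a real GPT conditions on the entire context so far and computes these probabilities with a trained transformer.

```python
import random

# Invented table of P(next word | current word); it stands in for what a
# trained model would compute. "<start>" and "<end>" mark sequence boundaries.
cond_prob = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the":     {"cat": 0.5, "dog": 0.3, "mat": 0.2},
    "a":       {"cat": 0.7, "dog": 0.3},
    "cat":     {"sat": 0.6, "slept": 0.4},
    "dog":     {"sat": 0.5, "barked": 0.5},
    "sat":     {"on": 0.5, "<end>": 0.5},
    "on":      {"the": 1.0},
    "mat":     {"<end>": 1.0},
    "slept":   {"<end>": 1.0},
    "barked":  {"<end>": 1.0},
}

def generate(max_tokens=10):
    """Produce text one token at a time by sampling from the conditional distribution."""
    token, output = "<start>", []
    for _ in range(max_tokens):
        dist = cond_prob[token]
        token = random.choices(list(dist), weights=list(dist.values()))[0]
        if token == "<end>":
            break
        output.append(token)
    return " ".join(output)

print(generate())  # e.g. "the cat sat on the mat"
```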
GPT models are based on the transformer architecture, a concept first introduced in the 2017 Google Brain paper 'Attention Is All You Need'. After that, a competition began to build large models capable of handling various language tasks: Google's BERT, Microsoft's Turing NLG, and OpenAI's GPT-3. The generation used for ChatGPT is GPT-3.5. Because of this vast amount of training, developments in GPT language models have had a significant impact on natural language processing and other applications that rely on text data.
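The core building block of that architecture is scaled dot-product self-attention. The NumPy sketch below is a minimal, illustrative version of that operation; the tiny random matrices stand in for learned projection weights, and a real GPT stacks many such layers with multiple heads and a causal mask so each token only attends to earlier tokens.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                # project tokens into queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # how strongly each token attends to each other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                               # each output is a weighted mix of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                          # 4 tokens, 8-dimensional embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)           # (4, 8): one updated vector per token
```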
History of GPT
OpenAI introduced the first GPT in 2018; it was trained with 117 million parameters. A year later, in 2019, OpenAI released GPT-2, trained with 1.5 billion parameters, roughly ten times larger, and able to write a page of a book in about ten seconds. In 2020 came an upgraded version, GPT-3, trained with 175 billion parameters. With GPT-3 we can solve various language-related problems: open-ended writing, simple rule-based calculation, translation, and simple coding according to a given prompt.
Before GPT
Before the transformer architecture was introduced, 'generative pretraining' was already a long-established concept in machine learning. At that time, the most effective neural models for natural language processing (NLP) typically relied on supervised learning over extensive volumes of manually labeled data.
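To make the contrast concrete, here is a small, hypothetical sketch: supervised learning needs a human-written label for every example, while generative pretraining needs only raw text, because the training target at each position is simply the next word. The data below is invented for illustration.

```python
# Supervised NLP: every training example needs a manually created label.
labeled_data = [
    ("the movie was wonderful", "positive"),
    ("the plot made no sense",  "negative"),
]

# Generative pretraining: raw, unlabeled text is enough, because the target
# for each position is just the next word of the text itself.
raw_text = "the movie was wonderful and the acting was superb".split()
pretraining_pairs = [(raw_text[:i], raw_text[i]) for i in range(1, len(raw_text))]

print(pretraining_pairs[0])  # (['the'], 'movie')
print(pretraining_pairs[3])  # (['the', 'movie', 'was', 'wonderful'], 'and')
```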
Limitations
- Pre-training: GPT must be pre-trained, and this is a major difference between human beings and GPT. When we have a conversation with others, we remember what we talked about previously, an hour or a week ago, but GPT does not. GPT has no ongoing long-term memory that learns from each interaction.
- Limited input size: A user cannot provide unlimited text as input. GPT-3 has a prompt limit of about 2,048 tokens, which means, for example, that we cannot ask it to translate more text than it can handle (see the sketch after this list).
- Slow inference time: GPT-3 has a slow inference time because such a large model takes a long time to generate results.
- Lack of explainability: GPT-3 is prone to a problem many neural networks face: a lack of ability to explain and interpret why certain inputs lead to certain outputs. It also does not understand the nuance contained in a sentence or word, such as anger, happiness, or delight.
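As a hedged sketch of the input-size limitation, the snippet below checks whether a prompt fits within a 2,048-token budget using the tiktoken tokenizer (assuming it is installed); the limit constant simply mirrors the figure quoted above.

```python
import tiktoken

TOKEN_LIMIT = 2048  # the approximate GPT-3 prompt limit mentioned above

def fits_in_prompt(text: str, limit: int = TOKEN_LIMIT) -> bool:
    """Return True if `text` encodes to no more than `limit` tokens."""
    enc = tiktoken.get_encoding("gpt2")  # the byte-pair encoding used by GPT-2 and GPT-3 base models
    n_tokens = len(enc.encode(text))
    print(f"{n_tokens} tokens out of a {limit}-token budget")
    return n_tokens <= limit

print(fits_in_prompt("Translate this paragraph into Korean: ..."))
```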
Risks
- Mimicry: NLP models like GPT-3 are continuously improving in accuracy, blurring the line between machine-generated content and human-written text and making the two harder to distinguish. This could give rise to concerns about copyright infringement and plagiarism.
- Accuracy: Even though GPT-3 excels at mimicking the structure of human-written text, it has difficulty maintaining factual precision across various use cases.
- Bias: Language models are susceptible to the biases inherent in machine learning. Because these models are trained on internet text, they can internalize and reflect the biases prevalent in human online interactions. For instance, a pair of researchers at the Middlebury Institute of International Studies at Monterey found that GPT-2 is proficient at producing extremist content, such as dialogues resembling those of conspiracy theorists and white supremacists. This opens the door to the automation and amplification of hate speech, often unintentionally. To address this concern, ChatGPT, built on a version of GPT-3, tries to mitigate such risks through more extensive training and the incorporation of user input. A similar issue arose previously with Microsoft's chatbot.
Problem
Ethical issues come up every time the latest artificial intelligence technologies are introduced, such as generating fake news, deepfakes, and other misleading forms of content.
We should establish relevant organizations and global regulations before the technology develops further.
References
- Network, A. (2021, February 19). GPT 모델의 발전 과정 그리고 한계 [The development of GPT models and their limitations]. Medium.
- Wikimedia Foundation. (2023, August 13). Generative pre-trained transformer. Wikipedia.
- Lutkevich, B., & Schmelzer, R. (2023, August 17). What is GPT-3? Everything you need to know. TechTarget Enterprise AI.