What is Generative Pre-trained Transformer (GPT)? Explain Like I’m 5

Jericho Siahaya
5 min read · May 10, 2023
ChatGPT everywhere

2022 and 2023 have undoubtedly been years of extraordinary technological advancement, particularly in the field of artificial intelligence (AI). AI is making remarkable progress and impacting many aspects of our lives. From research institutions to businesses, the excitement surrounding AI is palpable.

ChatGPT — you have probably heard of it somewhere, or maybe everywhere. ChatGPT is an AI language model that has taken the world by storm since its release in late 2022. This AI is a powerful natural language processing tool that can generate human-like language, making it well suited for building chatbots, virtual assistants, and other applications that require natural language interaction.

So, we know about ChatGPT and its impact, but do we really understand what ChatGPT is and what lies behind it? What the heck is GPT?

Generative Pre-trained Transformer (GPT)

Photo by Levart_Photographer on Unsplash

Developed by OpenAI, GPT, which stands for “Generative Pre-trained Transformer,” is a neural network that has the ability to generate human-like language, making it an impressive tool for natural language processing (NLP).

GPT is like a really smart computer that can understand and use language to create new sentences and paragraphs that make sense. It does this by learning from a lot of examples of language that humans have written or spoken before.

Imagine you have a big book of stories and every time you read a sentence or a paragraph, you remember how it’s written and what it means. After you’ve read a lot of stories, you can start making up your own sentences and paragraphs that sound like they could be part of the stories you’ve read. GPT is like that, but it’s much faster and can remember a lot more stories than you could ever read in your lifetime. People can use GPT to do things like write articles, create chatbots, and even help with medical research.

What lies behind GPT is a type of artificial intelligence (AI) called a neural network. What the heck is neural network?

Neural Network

Photo by Michael Dziedzic on Unsplash

Imagine you have a big puzzle to solve, but you don’t know how to do it. You decide to ask your friends for help, but each friend only knows how to solve a small part of the puzzle.

A neural network is kind of like a big group of friends working together to solve a puzzle. Each “friend” in the network is called a neuron, and each neuron knows how to do a small part of the problem. The neurons are connected to each other, so they can talk to each other and share information.

When you give the neural network a problem to solve (like recognizing a picture of a dog), each neuron works on a different part of the problem. Some neurons might look for pointy ears, while others might look for whiskers. The neurons talk to each other and share their findings, until the whole network has figured out that the picture is of a dog.

Just like how you and your friends get better at solving puzzles the more you practice, a neural network can get better at solving problems the more it’s trained. The more examples of dog pictures you give it, the better it gets at recognizing dogs.

In simple terms, neural networks are made up of layers of interconnected nodes that can process and analyze information. GPT is a specific type of neural network called a transformer, which is designed to process sequences of data (like words in a sentence).
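As a toy illustration (this is nothing like GPT's actual transformer architecture, just the "interconnected nodes" idea above), each neuron computes a weighted sum of its inputs and squashes the result through an activation function, and a layer is simply many neurons looking at the same inputs:

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of inputs, squashed through a sigmoid activation
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))

def layer(inputs, weight_matrix, biases):
    # A "layer" is just many neurons reading the same inputs;
    # its outputs become the inputs to the next layer
    return [neuron(inputs, w, b) for w, b in zip(weight_matrix, biases)]

# Three made-up input features feeding a layer of two neurons
# (the weights here are arbitrary; training is what finds good ones)
inputs = [0.5, 0.8, 0.2]
weights = [[0.4, -0.6, 0.9], [0.1, 0.7, -0.3]]
biases = [0.0, 0.1]
print(layer(inputs, weights, biases))
```

Stacking layers like this, so each layer's outputs feed the next, is what makes a network "deep."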

GPT architecture

To train GPT, you let it consume a lot of text (like books, articles, and websites) and learn from that data. Specifically, it uses a type of training often called "unsupervised learning," which means you don't hand GPT labeled correct answers — it learns by predicting the next word in a passage and checking itself against what the text actually says. Once GPT is trained, you can use it to generate new text. For example, if you give GPT the beginning of a sentence ("Once upon a time, there was a…"), it can use its knowledge of language to generate the rest of the sentence ("…little girl who lived in a big, old house in the woods").
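A drastically simplified sketch of that loop — nothing like GPT's real transformer, but the same "learn patterns from text, then continue a prompt" idea — is a bigram model that counts which word tends to follow which, then generates text one word at a time:

```python
import random
from collections import defaultdict

# A tiny "training corpus" (real GPT training uses billions of words)
corpus = ("once upon a time there was a little girl "
          "who lived in a big old house in the woods").split()

# "Training": record which words follow each word in the text
following = defaultdict(list)
for w1, w2 in zip(corpus, corpus[1:]):
    following[w1].append(w2)

def generate(start, length=8, seed=0):
    # "Generation": repeatedly pick a plausible next word
    random.seed(seed)
    words = [start]
    for _ in range(length):
        options = following.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))
    return " ".join(words)

print(generate("once"))
```

GPT replaces the crude word-pair counts with a neural network that scores every possible next word given all the words before it, but the generation loop is conceptually the same.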

One of the reasons GPT is so powerful is that it can take a lot of context into account when interpreting text. For example, if we give it the sentence "I went to the bank to deposit some money," it can understand that "bank" in this context means a place where you keep your money, not a riverbank.
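A toy way to see the intuition: pick the sense of "bank" whose cue words overlap most with the rest of the sentence. (GPT does nothing this crude — it learns context automatically from huge amounts of text, and the cue words here are hand-written for illustration — but the principle of "surrounding words decide the meaning" is the same.)

```python
# Hand-written cue words for each sense of "bank" (illustrative only)
SENSES = {
    "financial institution": {"money", "deposit", "account", "loan"},
    "riverbank": {"river", "water", "fishing", "shore"},
}

def disambiguate(sentence):
    # Choose the sense whose cue words overlap most with the sentence
    words = set(sentence.lower().split())
    return max(SENSES, key=lambda sense: len(SENSES[sense] & words))

print(disambiguate("I went to the bank to deposit some money"))
# -> financial institution
```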

In addition to GPT, there are many other types of neural networks that are used in AI and machine learning applications. Here are a few examples:

  1. Convolutional Neural Networks (CNNs): These are often used in image recognition tasks, as they can identify patterns and features in images.
  2. Recurrent Neural Networks (RNNs): These are well-suited to tasks that involve sequential data, such as language processing and speech recognition.
  3. Deep Neural Networks (DNNs): These are neural networks that have multiple layers, enabling them to learn more complex patterns and features.
  4. Autoencoders: These neural networks are designed to learn efficient representations of input data, and are often used in tasks such as data compression and feature extraction.

Each type of neural network has its own strengths and weaknesses, and the choice of network architecture will depend on the specific task at hand. As AI and machine learning continue to evolve, we can expect to see new and innovative neural network architectures being developed to tackle increasingly complex problems.
