In this article we’ll explore the evolution of OpenAI’s GPT models. We’ll briefly cover the transformer, describe the variations of the transformer that led to the first GPT model, then go through GPT-1, GPT-2, GPT-3, and GPT-4 to build a complete conceptual understanding of the state of the art.
Who is this useful for? Anyone interested in natural language processing (NLP) or cutting-edge AI advancements.
How advanced is this post? This is not a complex post; it’s mostly conceptual. That said, there are a lot of concepts, so it may be daunting to less experienced data scientists.
Prerequisites: I’ll briefly cover transformers in this article, but you can refer to my dedicated article on the subject for more information.
Before we get into GPT, I want to briefly go over the transformer. In its most basic sense, the transformer is an encoder-decoder style model.
The encoder converts an input into an abstract representation which the decoder uses to iteratively generate output.
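To make that loop concrete, here is a toy Python sketch of the encode-once, decode-iteratively pattern. The `encode` and `decode_step` functions are illustrative stand-ins I've invented for this example, not real model components; the point is the control flow, where the input is encoded a single time and the decoder then produces output one token at a time until it emits an end token.

```python
def encode(tokens):
    # Stand-in for a transformer encoder: collapses the input into a
    # single toy "representation" (a real encoder produces rich vectors).
    return sum(tokens) / len(tokens)

def decode_step(encoding, generated):
    # Stand-in for one transformer decoder step: predicts the next token
    # from the encoding and everything generated so far (a toy rule here).
    return (int(encoding) + len(generated)) % 10

def generate(input_tokens, start_token=0, end_token=9, max_len=20):
    """Encode the input once, then decode autoregressively."""
    encoding = encode(input_tokens)
    output = [start_token]
    for _ in range(max_len):
        next_token = decode_step(encoding, output)
        output.append(next_token)
        if next_token == end_token:
            break
    return output

print(generate([3, 1, 4, 1, 5]))  # [0, 3, 4, 5, 6, 7, 8, 9]
```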
Both the encoder and the decoder work with abstract representations of text, which are created using multi-headed self-attention.
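Here is a minimal NumPy sketch of multi-headed self-attention, assuming random illustrative weight matrices and a tiny model width; it is meant to show the mechanics (project to queries, keys, and values, attend per head, then recombine), not any particular model's implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Multi-headed self-attention over a sequence x of shape (seq_len, d_model)."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    # Project the input into queries, keys, and values, then split into heads
    q = (x @ w_q).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    k = (x @ w_k).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    v = (x @ w_v).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    # Scaled dot-product attention per head: softmax(Q K^T / sqrt(d_head)) V
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    attended = softmax(scores, axis=-1) @ v               # (heads, seq, d_head)

    # Concatenate the heads and apply the output projection
    concat = attended.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o

# Toy usage: 4 tokens, model width 8, 2 heads, random (untrained) weights
rng = np.random.default_rng(0)
d_model, num_heads = 8, 2
x = rng.normal(size=(4, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))
out = multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads)
print(out.shape)  # (4, 8): one contextualized vector per input token
```

Each output row is a mixture of every input token's value vectors, weighted by attention; that is what lets the model build the abstract, context-aware representation described above.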