LLMs: Common Terms Explained, Simply
What LLMs are, how they are trained, and how they can be used, visually explained!
This newsletter is sponsored by DevStats.
Out-Ship, Out-Deliver, Out-Perform.
DevStats helps engineering leaders unpack metrics, experience flow, and ship faster so every release drives real business impact.
✅ Spot bottlenecks before they stall delivery
✅ Tie dev work to business goals
✅ Ship more, miss less, prove your impact
It’s time to ship more and make your impact impossible to ignore.
Thanks to DevStats for sponsoring this newsletter. Let’s get back to this week’s thought.
Intro
Large Language Models (LLMs) have quickly become a hot topic in tech, business, and everyday conversations, but the jargon can sometimes make them harder to discuss day to day.
On top of that, the same term can be interpreted differently by different people, depending on their understanding.
To ensure that you know all the main terms behind LLMs, read the simple explanations next and refer back to them whenever you need to.
When I first saw Ashish’s visuals, I was immediately struck by how clearly and simply these terms are explained!
Introducing Ashish Bamania
Ashish Bamania is an Emergency Medicine doctor by day and a self-taught Software Engineer and writer in his spare time. His goal with writing is to simplify the latest advances in AI, Quantum Computing & Software Engineering.
He recently published a book called LLMs In 100 Images, covering the most important concepts you need to understand Large Language Models.
I asked him to share some of these great visuals to simplify how LLMs work, and luckily for us, he kindly agreed to include many of them in today’s article!
Check out the book and use my code EL20 for 20% off.
LLMs are One of the Most Successful AI Technologies to Date
The LLM market is expected to reach a total value of $82 billion by 2033.
As of 2025, 67% of organizations worldwide have adopted LLMs to support their operations with generative AI.
If you’re new to LLMs and would love to deepen your knowledge, this article will help you do that.
Let’s begin!
What Are LLMs?
Large Language Models (LLMs) are AI systems that are trained on vast amounts of text data to understand and generate human-like language.
During training, they learn patterns, relationships, and structures in language by analyzing billions of text examples from books, articles, websites, and other written sources.
This gives them an understanding of grammar and semantics in human language.
Some of the popular LLMs used today are:
GPT-4o from OpenAI (in the form of ChatGPT)
Claude Sonnet 4 from Anthropic
Gemini 2.5 Flash from Google
These models are proprietary, which means that their internal details (weights, parameters, training data, training methods) aren’t publicly available.
The most widely used open-weight models, where model weights are publicly available, are:
Llama by Meta
DeepSeek-V3 by DeepSeek
Mistral Medium 3 by Mistral AI
What Powers An LLM?
The Transformer architecture is the backbone of all popular LLMs that we use today.
The Transformer was introduced by Google researchers in 2017, in the paper “Attention Is All You Need”.
What makes it so good is that, unlike previous methods, it lets LLMs understand and process all words in the input text at the same time (in parallel), rather than one after another (sequentially).
This is achieved through its mechanism called Self-attention, which helps figure out how each word relates to every other word in the text sequence.
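To make this concrete, here is a minimal NumPy sketch of scaled dot-product self-attention, the core operation this mechanism is built on. The matrices and sizes are invented for illustration; real LLMs use many attention heads and much larger dimensions.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Minimal scaled dot-product self-attention over a sequence of token vectors X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # project tokens into queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # how strongly each token attends to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                         # each token becomes a weighted mix of value vectors

# Toy example: 3 tokens with 4-dimensional embeddings (sizes are arbitrary)
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (3, 4): one updated vector per token
```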
What Is GPT?
GPT, or Generative Pre-trained Transformer, is one of the earliest and most widely known LLMs.
GPT was born out of research from OpenAI in 2018, just one year after Google introduced the Transformer architecture.
ChatGPT, the chatbot built on later GPT models, is one of the most popular LLM products used today.
GPTs generate text by predicting the next word/token given a prompt.
This process is called Autoregression, which means that each word is generated based on the previous ones.
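In code, autoregression is essentially a loop: predict a token, append it to the context, and repeat. Here is a minimal sketch, where `next_token_probabilities` is a hypothetical stand-in for a real model’s forward pass.

```python
def generate(prompt_tokens, next_token_probabilities, max_new_tokens=20, end_token=None):
    """Greedy autoregressive generation sketch. `next_token_probabilities` is hypothetical:
    it stands in for a real model call that returns a {token: probability} mapping."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = next_token_probabilities(tokens)   # distribution over the vocabulary, given the context
        next_token = max(probs, key=probs.get)     # greedy decoding: pick the most likely token
        if next_token == end_token:
            break
        tokens.append(next_token)                  # the new token becomes part of the context
    return tokens
```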
You’ll see in the image describing GPT that it accepts Input Embeddings and Positional Encodings as inputs.
This might seem strange, because you’d expect a word or sentence to go into the GPT for it to produce the next word.
The truth is that LLMs do not understand English (or any other human language).
Any word/sentence in English first has to be broken down into small pieces called Tokens, in a process called Tokenization.
In LLMs like ChatGPT, this is done using a Tokenization algorithm called Byte Pair Encoding.
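If you want to see tokenization in action, OpenAI’s open-source tiktoken library exposes the BPE tokenizers its models use. A small example (the exact token IDs depend on which vocabulary you pick):

```python
# Requires the `tiktoken` package (OpenAI's open-source BPE tokenizer).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")        # one of the BPE vocabularies used by OpenAI models
tokens = enc.encode("Tokenization splits text into small pieces.")
print(tokens)                                     # a list of integer token IDs
print([enc.decode([t]) for t in tokens])          # the text piece behind each ID
```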
The tokens obtained are then encoded into mathematical forms known as Embeddings.
Embeddings are high-dimensional vector representations that capture the semantic meaning of, and relationships between, different words/sentences.
Words with similar meanings have embeddings that are closer to each other in that high-dimensional space.
This is shown below, where the embedding of “Apple” is closer to that of “Orange” than “Pen”.
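A common way to measure this closeness is cosine similarity. Here is a toy sketch with invented 3-dimensional vectors; real embeddings have hundreds or thousands of dimensions.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1 means 'similar direction'."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy embeddings, invented purely for illustration
apple  = np.array([0.90, 0.80, 0.10])
orange = np.array([0.85, 0.75, 0.20])
pen    = np.array([0.10, 0.20, 0.90])

print(cosine_similarity(apple, orange))  # high -- similar meanings sit close together
print(cosine_similarity(apple, pen))     # low  -- unrelated meanings sit far apart
```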
We have previously discussed how the Transformer architecture in LLMs lets them process all words/tokens in parallel.
This could lead to issues, because in a language like English, the positioning of words is important for conveying meaning.
This is why Positional Encodings are used to combine the positional information of different words/tokens in a sentence with the input embeddings of those words/tokens.
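The original Transformer paper used fixed sinusoidal positional encodings that are simply added to the input embeddings (many modern LLMs use learned or rotary variants instead). A minimal sketch:

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings, as in the original Transformer paper."""
    positions = np.arange(seq_len)[:, None]           # (seq_len, 1): 0, 1, 2, ...
    dims = np.arange(0, d_model, 2)[None, :]          # even embedding dimensions
    angles = positions / (10000 ** (dims / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                      # sine on even dimensions
    pe[:, 1::2] = np.cos(angles)                      # cosine on odd dimensions
    return pe

# The encoding is added to the input embeddings, so each token vector carries
# both its meaning and its position in the sentence (sizes here are arbitrary).
embeddings = np.random.default_rng(0).normal(size=(10, 16))   # 10 tokens, 16-dim embeddings
model_input = embeddings + sinusoidal_positional_encoding(10, 16)
```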
Now that we know about the internals of LLMs, let’s discuss how they are trained.
Training An LLM To Generate Text
The first step to go from zero to a text-generating LLM is Pretraining.
During this stage, the LLM learns by processing massive unlabelled text datasets.
At each step, it is given the context (i.e., the previous words/tokens) and asked to predict the following word/token.
This makes it gradually learn grammar, facts, and common-sense reasoning.
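A simple way to picture pretraining data: every position in the text becomes a training example of “given the context so far, predict the next token”. A toy sketch:

```python
# Every position in the text becomes one "predict the next token" training example.
tokens = ["The", "cat", "sat", "on", "the", "mat"]   # in practice these are integer token IDs

examples = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
for context, target in examples:
    print(f"context={context!r} -> predict {target!r}")

# The model is trained to assign high probability to each target token,
# typically by minimizing cross-entropy loss over billions of such examples.
```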
Once we get a pre-trained LLM, it can be adapted to perform specific tasks by training it on labeled examples particular to those tasks.
These tasks can include answering questions, summarizing documents, or following instructions more reliably.
This step is called Supervised Fine-tuning (SFT).
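The exact data format varies between frameworks, but SFT data is essentially labeled prompt/response pairs. An illustrative sketch (the field names and file name are assumptions, not any specific framework’s schema):

```python
import json

# Illustrative SFT data: labeled prompt/response pairs, one JSON object per line.
sft_examples = [
    {"prompt": "Answer the question: What is tokenization?",
     "response": "Tokenization splits text into small units called tokens."},
    {"prompt": "Summarize: LLMs are trained on large amounts of text to learn language patterns.",
     "response": "LLMs learn language patterns from large text datasets."},
]

with open("sft_data.jsonl", "w") as f:
    for example in sft_examples:
        f.write(json.dumps(example) + "\n")
```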
After SFT, an LLM may perform a task well, but its responses may still diverge from human values and preferences.
As an example, if you ask an LLM, “When is Christmas?”, it might respond with “Isn’t it 25th December?”.
Although this response is correct, you’d prefer something which sounds more polite, like “Christmas is celebrated on 25th December every year.”
This is made possible using a technique called Reinforcement Learning from Human Feedback (RLHF).
RLHF aligns LLMs with human values, preferences, and expectations by using datasets of human judgments that guide them on what responses are considered better.
It is the key technique that enables modern LLMs, such as ChatGPT, to achieve high conversational quality and safety.
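The human judgments typically take the form of preference comparisons: given a prompt and two candidate responses, annotators mark which one they prefer, and a reward model trained on many such comparisons then guides the LLM during reinforcement learning. An illustrative example, reusing the Christmas answer from above:

```python
# An illustrative RLHF preference example: humans mark the preferred response.
# Many such comparisons are used to train a reward model, which then steers
# the LLM's behaviour during reinforcement learning.
preference_example = {
    "prompt": "When is Christmas?",
    "chosen": "Christmas is celebrated on 25th December every year.",
    "rejected": "Isn't it 25th December?",
}
```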
How Do You Get Better Responses From An LLM?
Prompting is a popular technique that can help you get better responses from an LLM, and a whole field called Prompt Engineering has emerged around this practice.
Two approaches to prompting are popular. These are:
Zero-shot Prompting: where one directly instructs an LLM to perform a task, without giving any examples
Few-shot Prompting: where one provides a few examples related to the task, along with the instructions to complete it. This usually results in better responses from an LLM. Both styles are sketched below.
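Here is what the two styles might look like in practice (the classification task and the reviews are invented for illustration):

```python
# Illustrative zero-shot vs. few-shot prompts (the task and reviews are made up).
zero_shot = (
    "Classify the sentiment of this review as positive or negative: "
    "'The battery dies in an hour.'"
)

few_shot = """Classify the sentiment of each review as positive or negative.

Review: 'Absolutely love this phone.' -> positive
Review: 'Stopped working after a week.' -> negative
Review: 'The battery dies in an hour.' ->"""
```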
Alongside these, many specialized techniques for prompting have been introduced, and one of them is called Chain-of-Thought (CoT) Prompting.
With Chain-of-Thought (CoT) prompting, the LLM is instructed to reason step by step before providing an answer.
This improves its accuracy in mathematical, logical, and reasoning tasks.
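An illustrative CoT prompt, with a worked example that shows the model the kind of step-by-step reasoning you want (the question and numbers are invented):

```python
# Illustrative chain-of-thought prompt: a worked example demonstrates the
# step-by-step reasoning style, followed by the question we actually want answered.
cot_prompt = """Q: A shop sells pens at 3 for $2. How much do 12 pens cost?
A: 12 pens is 12 / 3 = 4 groups of 3 pens. Each group costs $2, so 4 * 2 = $8.
Final answer: $8.

Q: A train travels 60 km per hour. How far does it travel in 2.5 hours?
A: Let's think step by step."""
```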
There is also a way of further training LLMs so that they internalize this Chain-of-thought approach. This helps them think and reason better when responding to complex problems.
This is achieved by training LLMs on massive datasets of examples of prompts and their Chain-of-thought responses, using reinforcement learning.
The resulting LLMs are referred to as Large Reasoning Models (LRMs). These models take their time thinking before answering a query.
Some of the popular LRMs used today are:
o3 and o4-mini by OpenAI
Claude Opus 4 by Anthropic
DeepSeek-R1 by DeepSeek
LLMs are not just text generators; they can do much more than that.
Modern-day LLMs are multi-modal, which means they can work with data from different modalities (audio, images, and video, in addition to text) as their inputs and outputs.
Modern-day LLMs can also have agency and be autonomous. This gives rise to their use as AI agents.
Being agentic means that an LLM can act as the brain of a system which, when given a task, can do the following (a minimal loop is sketched after this list):
Reason and plan the approach to complete the task
Use task-specific tools to interact with environments and other agents
Get feedback and amend their approach for completing the task
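Here is a deliberately simplified sketch of that loop. `llm_plan`, `run_tool`, and `is_done` are hypothetical stand-ins for a real model call, tool execution, and a stopping check:

```python
def agent_loop(task, llm_plan, run_tool, is_done, max_steps=10):
    """Hypothetical agent loop sketch: plan -> act with a tool -> feed the result back."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        action = llm_plan(history)                    # 1. reason and plan the next step
        result = run_tool(action)                     # 2. use a task-specific tool to act
        history.append(f"{action} -> {result}")       # 3. feed the outcome back as context
        if is_done(history):                          # stop once the task looks complete
            break
    return history
```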
Two important protocols have been introduced in the last few months that make agentic workflows more efficient and reliable. These are:
Model Context Protocol (MCP): This protocol, developed and open-sourced by Anthropic, enables agents to seamlessly access and work with external data sources, APIs, tools, and applications.
Agent2Agent (A2A) Protocol: This protocol, developed and open-sourced by Google, enables multiple independent AI agents to collaborate and work towards a given task.
That’s a brief overview of what LLMs are, how they are trained, and how they can be used to get a response that suits your tasks well.
Last words
Special thanks to Ashish for explaining LLMs in a simple way! Make sure to check him out on LinkedIn and his publication,
and also check out his book LLMs In 100 Images. If you’re looking to dive deeper into LLMs, this book covers the most important concepts you need to understand Large Language Models, from basic architecture to cutting-edge techniques.
We are not over yet!
How to Start, Grow and Monetize Your Engineering Newsletter
Check out my latest video. If you are thinking about starting a newsletter, or you are already writing one, this video is for you! Starting the newsletter has been one of the greatest decisions I have made.
New video every Sunday. Subscribe to not miss it here:
Liked this article? Make sure to 💙 click the like button.
Feedback or addition? Make sure to 💬 comment.
Know someone that would find this helpful? Make sure to 🔁 share this post.
Whenever you are ready, here is how I can help you further
Join the Cohort course Senior Engineer to Lead: Grow and thrive in the role here.
Interested in sponsoring this newsletter? Check the sponsorship options here.
Take a look at the cool swag in the Engineering Leadership Store here.
Want to work with me? You can see all the options here.
Get in touch
You can find me on LinkedIn, X, YouTube, Bluesky, Instagram or Threads.
If you wish to request a particular topic you would like to read about, you can send me an email to info@gregorojstersek.com.
This newsletter is funded by paid subscriptions from readers like yourself.
If you aren’t already, consider becoming a paid subscriber to receive the full experience!
You are more than welcome to find whatever interests you here and try it out in your particular case. Let me know how it went! Topics are normally about all things engineering-related: leadership, management, developing scalable products, building teams, etc.