Internode Team

Large Language Models (LLMs)

Large Language Models (LLMs) are artificial intelligence (AI) systems that can understand and generate human language in remarkably human-like ways. They are designed to process and generate text, among other functions, and they are trained on vast amounts of data, which is why they are called "large."

At its core, an LLM is a sophisticated mathematical function that predicts which word comes next for any piece of text. LLMs power popular AI assistants like ChatGPT and Claude, enabling them to write content, answer questions, and assist with a wide range of tasks.


How do LLMs achieve this?

LLMs are trained on vast datasets sourced from the internet, often amounting to thousands or even millions of gigabytes of text. They rely on a machine learning technique called deep learning to recognize patterns in characters, words, and sentences. After this initial training, LLMs undergo further refinement through tuning, such as fine-tuning or prompt-tuning, so they can specialize in specific tasks like answering questions, generating text, or translating languages.
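
To make "recognizing patterns" concrete, here is a toy Python sketch. It is not deep learning - just word-pair counting over a made-up corpus - but it shows the most basic form of the statistical pattern-learning described above.

```python
from collections import Counter, defaultdict

# A tiny made-up corpus; real training data runs to trillions of words.
corpus = "the cat sat on the mat . the cat ran away .".split()

# Count which word tends to follow which - the crudest possible version
# of "recognizing patterns in characters, words, and sentences".
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

# The most likely word after "the", based on the observed patterns:
print(follows["the"].most_common(1))  # [('cat', 2)]
```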

Vector databases

Vector databases store and manage both structured and unstructured data, such as text or images, along with their vector embeddings. These embeddings are numerical representations of the data, capturing their semantic meaning as long lists of numbers, and they are typically generated by machine learning models. Because embeddings capture meaning, they enable semantic search, which provides a more flexible and intuitive way to find relevant information than traditional keyword-based search.

Since similar objects are positioned close to each other in vector space, their similarity can be measured by the distance between their vector embeddings. This enables a powerful search method called vector search, which retrieves data based on similarity rather than exact keyword matches.
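
As a concrete sketch of vector search, the snippet below ranks stored items by cosine similarity to a query embedding. It assumes NumPy, and the 4-dimensional embeddings are made up for illustration; real models produce vectors with hundreds or thousands of dimensions.

```python
import numpy as np

# Made-up embeddings standing in for the output of an embedding model.
items = {
    "a photo of a cat":   np.array([0.9, 0.1, 0.0, 0.2]),
    "a picture of a dog": np.array([0.8, 0.2, 0.1, 0.3]),
    "a financial report": np.array([0.0, 0.9, 0.8, 0.1]),
}

def cosine_similarity(a, b):
    # Similar meanings -> nearby vectors -> similarity close to 1.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Vector search: rank stored items by similarity to the query embedding.
query = np.array([0.85, 0.15, 0.05, 0.25])  # e.g. the embedding of "kitten"
ranked = sorted(items, key=lambda k: cosine_similarity(items[k], query),
                reverse=True)
print(ranked[0])  # the semantically closest item, not a keyword match
```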

What LLMs actually do

LLMs function like an extremely well-read person who has processed billions of documents, articles, and websites. Through this extensive "reading," they learn language patterns - how words follow one another, how sentences flow together, and how ideas connect.

Imagine you are reading a movie script and have a magical machine that can predict what word comes next. With this machine, you could complete the rest of the script by repeatedly feeding in what you have and seeing what the machine predicts. When you interact with a chatbot, this is exactly what's happening: a large language model is predicting the next most likely word based on everything that came before.
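
Here is a minimal sketch of that loop in Python. The hand-written probability table is a stand-in for the model; in a real LLM, a neural network conditioned on the full context produces these probabilities.

```python
import random

# Hand-written stand-in for the model: given the last word, a probability
# distribution over possible next words.
MODEL = {
    "the": {"cat": 0.7, "mat": 0.3},
    "cat": {"sat": 1.0},
    "sat": {"on": 1.0},
    "on":  {"the": 1.0},
}

def generate(start, max_words=5):
    words = [start]
    for _ in range(max_words):
        dist = MODEL.get(words[-1])
        if dist is None:  # no prediction available: stop
            break
        choices, probs = zip(*dist.items())
        # Sample the next word according to the predicted probabilities.
        words.append(random.choices(choices, weights=probs)[0])
    return " ".join(words)

print(generate("the"))  # e.g. "the cat sat on the mat"
```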

Unlike traditional software that follows specific programmed rules, LLMs learn these patterns on their own from data. This self-learning ability allows them to generate text that feels natural and contextually appropriate across a wide range of topics. It also makes LLMs useful for a wide variety of tasks, including summarizing documents, generating text, and brainstorming ideas.

How LLMs might help you

The technical side

Behind the scenes, LLMs are built on a neural network architecture called the "transformer." Transformers are neural network designs that process entire sequences at once rather than word by word. Here's a simplified explanation of how they work:

  1. Training process: The model analyzes internet text, books, and articles (trillions of words). Imagine reading all of Wikipedia 1,000 times over. (For a human to read the amount of text used to train even GPT-3, reading non-stop 24/7, it would take over 2,600 years.)
  2. Prediction mechanism: When you provide input text, the model predicts which words should logically follow based on patterns learned during training. Instead of predicting one word with certainty, it assigns probabilities to all possible next words (see the softmax sketch after this list).
  3. Attention system: Unlike older AI that processed text one word at a time, transformers can look at your entire input at once. As the second post explains: "Transformers don't read text from the start to the finish, they soak it all in at once, in parallel." This makes them much better at understanding context - like reading a whole paragraph rather than word-by-word.
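
The probability assignment in step 2 uses the softmax function, which turns the model's raw scores ("logits") into a probability distribution. The scores below are made up for illustration.

```python
import numpy as np

# Made-up raw scores for candidate next words after
# the prompt "The weather today is".
words  = ["sunny", "cold", "over", "purple"]
logits = np.array([4.0, 3.1, 0.5, -1.2])

# Softmax: exponentiate, then normalize so the values sum to 1.
probs = np.exp(logits) / np.exp(logits).sum()

for word, p in zip(words, probs):
    print(f"{word}: {p:.1%}")
# sunny: 69.3%, cold: 28.2%, over: 2.1%, purple: 0.4%
```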

LLMs contain "parameters," which you can think of as adjustable knobs the AI uses to make decisions—similar to how your brain forms connections between neurons. The largest models have hundreds of billions of these parameters (GPT-4 has over a trillion), requiring the computing power equivalent to thousands of home computers running simultaneously.

To illustrate the scale of computation involved: if you could perform one billion additions and multiplications every second, it would take well over 100 million years to perform all the operations involved in training the largest language models.
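
A quick back-of-the-envelope check of that claim, using the figures in the paragraph above:

```python
ops_per_second   = 1e9                  # one billion operations per second
seconds_per_year = 60 * 60 * 24 * 365   # ≈ 3.15e7 seconds
years            = 100e6                # 100 million years

total_operations = ops_per_second * seconds_per_year * years
print(f"{total_operations:.1e}")        # ≈ 3.2e24 operations
```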

Attention mechanism

The attention mechanism is a key component of transformer-based large language models that allows the model to "focus" on different parts of the input text when making predictions.

Here's how it works:

  1. When processing text, instead of looking at each word in isolation or in a fixed sequence, the attention mechanism allows the model to consider relationships between all words in the text simultaneously.
  2. For each word, the model calculates "attention scores" that determine how much focus to place on every other word in the input when interpreting or generating the next part of the text.
  3. This process helps the model differentiate word meanings based on context. For example, in the phrase "the tree had a rough bark," the attention mechanism helps the model understand that "bark" refers to the covering of a tree rather than a dog's sound by creating stronger connections between "tree" and "bark."
  4. These connections are represented mathematically as weights that determine how much each word influences the interpretation of other words.


The attention mechanism is what gives transformers their power to understand context across long passages of text and what makes them more effective than earlier models that processed text sequentially. It essentially allows words to "communicate" with each other across the entire input, capturing complex relationships regardless of how far apart words are positioned.
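
For readers who want the math, here is a minimal NumPy sketch of scaled dot-product attention, the core computation behind steps 1-4 above. The word vectors are random toy values; in a real transformer, the queries, keys, and values come from learned projections of the word embeddings.

```python
import numpy as np

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # attention scores (step 2)
    scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # normalize to weights (step 4)
    return weights @ V, weights                     # each output mixes all values

# Three toy 4-dimensional word vectors, e.g. for "tree", "rough", "bark".
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))

# Self-attention: every word attends to every word, including itself.
output, weights = attention(X, X, X)
print(weights.round(2))  # row i shows how much word i attends to each word
```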

Beyond pre-training

The initial training process is called "pre-training," but that's only part of the story. To become good AI assistants, these models undergo another stage of training called "reinforcement learning from human feedback" (RLHF). Human workers flag unhelpful or problematic predictions, and their feedback further refines the model's parameters, making it more likely to give responses that users prefer.
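
As a loose illustration of that feedback loop, here is a toy sketch: the "model" is just a weighted list of canned responses and the "labeler" is a stand-in function, but repeated preference signals shift the model toward answers people prefer, which is the shape of RLHF.

```python
import random

# Toy "model": a weighted list of canned responses. Real RLHF adjusts
# billions of neural network parameters, but the loop has the same shape.
responses = ["Here is a clear, sourced answer.", "I dunno.", "Maybe. Who knows?"]
weights   = [1.0, 1.0, 1.0]

def human_prefers(a, b):
    # Stand-in for a human labeler who flags unhelpful answers.
    unhelpful = {"I dunno.", "Maybe. Who knows?"}
    return b if a in unhelpful and b not in unhelpful else a

for _ in range(200):
    a, b = random.choices(responses, weights=weights, k=2)
    better = human_prefers(a, b)
    weights[responses.index(better)] += 0.1  # nudge toward the preferred answer

print(max(zip(weights, responses)))  # the helpful answer now dominates
```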

Limitations to be aware of

Glossary of terms

If you have any questions or feedback, feel free to contact us at: perspectives@internode.app
