An Overview of Large Language Models
An introduction to large language models, including vectorization, transformers, common NLP models, and considerations for designing apps with OpenAI tokens and PEFT.
Introduction
What is a Language Model?
A language model is a probability distribution over sequences of words. [1]
For example, knowing that the phrase "apple juice" is more likely than "apple juice concentrate" is one of the simplest language models. Nowadays, artificial neural networks have largely replaced these purely probabilistic methods. To compare two words, we can take the dot product of their representations; if words are represented only by their surface form (syntax), for instance as one-hot vectors, this measures character overlap at best. But syntax cannot reveal the semantic (meaningful) relationship between two words: when the character combinations are unrelated, the dot product is simply zero. To solve this problem, we use dense vector representations (embeddings) of words. We can then train a model on these vectors and learn the contextual relations between words. [1]
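A minimal sketch of the problem, assuming a toy three-word vocabulary (the words and all vector values below are invented for illustration):

```python
import numpy as np

# One-hot vectors: each word gets its own axis, so any two
# distinct words have a dot product of exactly zero.
apple = np.array([1, 0, 0])
pear = np.array([0, 1, 0])
car = np.array([0, 0, 1])

print(apple @ pear)  # 0 -- says nothing about meaning
print(apple @ car)   # 0 -- same result even for unrelated words

# Dense embeddings (values invented for illustration): related
# words point in similar directions, so the dot product is large
# for related words and small for unrelated ones.
apple_emb = np.array([0.9, 0.8, 0.1])
pear_emb = np.array([0.8, 0.9, 0.2])
car_emb = np.array([0.1, 0.0, 0.9])

print(apple_emb @ pear_emb)  # ~1.46 -- related fruits
print(apple_emb @ car_emb)   # ~0.18 -- unrelated words
```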
Word2Vec

The Word2Vec ("word to vector") method learns a vector representation for a word from its relationships to the words that appear on its right and left. The vector values are adjusted so that words occurring in similar contexts across different texts converge toward each other, which lets us estimate a word's meaning from its neighbors. [1]
Thanks to the Word2Vec method, we can perform linear-algebra operations on words [1]:
king - man + woman ≈ queen
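As a sketch, the analogy can be reproduced with gensim's pretrained vectors (the model name below is assumed to be available through the gensim-data downloader; the first run downloads the vectors):

```python
import gensim.downloader as api

# Load 50-dimensional GloVe vectors trained on Wikipedia + Gigaword.
vectors = api.load("glove-wiki-gigaword-50")

# king - man + woman ≈ queen, expressed as vector arithmetic:
result = vectors.most_similar(positive=["king", "woman"],
                              negative=["man"], topn=1)
print(result)  # expected: [('queen', ...)] or a close neighbor
```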

Artificial neural networks are the best-performing methods today. A neural network is a set of equations that map an input to an output, and these equations have parameters. During training, we first measure the model's error, and then adjust the parameters to reduce it…
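To make the idea of parameter adjustment concrete, here is a minimal sketch (not from the article) that fits the single weight of a linear model by repeatedly measuring the error and nudging the parameter against the gradient; the data and learning rate are invented for illustration:

```python
import numpy as np

# Toy data: the true mapping is y = 3x (the constant 3 is
# invented for this example).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

w = 0.0    # the model's single parameter
lr = 0.01  # learning rate

for step in range(200):
    y_pred = w * x                 # the equation mapping input to output
    error = y_pred - y             # how wrong the model currently is
    grad = 2 * np.mean(error * x)  # gradient of mean squared error w.r.t. w
    w -= lr * grad                 # adjust the parameter to reduce the error

print(w)  # converges toward 3.0
```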