Better Programming

An Overview of Large Language Models

Cahit Barkin Ozer
Published in Better Programming
24 min read · May 28, 2023

For Access: https://betterprogramming.pub/an-overview-of-large-language-models-373c6ce6069a?source=friends_link&sk=5ea2348d4677a1acd9100bebe18c6883

Introduction

What is a Language Model?

A language model is a probability distribution over sequences of words. [1]
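As a minimal sketch of this definition, the snippet below estimates phrase probabilities by simple relative frequency over a tiny corpus. The corpus and the helper name `phrase_probability` are invented for illustration, and counting is only one rough way to estimate such a distribution; it is not the method used by modern models.

```python
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
corpus = [
    "i drink apple juice",
    "she likes apple juice",
    "we bought apple juice concentrate",
]

def phrase_probability(phrase, sentences):
    """Relative frequency of `phrase` among all word n-grams of the same length."""
    n = len(phrase.split())
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        # Slide a window of n words across the sentence and count each n-gram.
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    total = sum(counts.values())
    return counts[phrase] / total if total else 0.0

p_juice = phrase_probability("apple juice", corpus)
p_concentrate = phrase_probability("apple juice concentrate", corpus)

# In this toy corpus, "apple juice" comes out as the more likely phrase.
print(p_juice, p_concentrate)
```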

For example, the knowledge that the phrase "apple juice" is more likely than "apple juice concentrate" is one of the simplest language models. Nowadays, artificial neural networks have replaced these purely probabilistic methods. A dot product can measure how similar two words are in terms of their character combinations (syntax), but syntax cannot help us find the semantic (meaning-based) relationship between two words: when the character combinations of two words are unrelated, the dot product is zero even if the words mean nearly the same thing. To solve this problem, we use vector representations (embeddings) of words. We can then train a model on these vectors and learn how the words relate to each other in context. [1]
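The contrast between spelling-based similarity and embeddings can be sketched as follows. Two assumptions are made here: character bigram counts stand in for the "character combination" representation, and the 2-D embedding values are hand-picked for illustration rather than learned by any model.

```python
from collections import Counter

def char_bigrams(word):
    """Count the adjacent character pairs in a word (a purely spelling-based representation)."""
    return Counter(word[i:i + 2] for i in range(len(word) - 1))

def sparse_dot(a, b):
    """Dot product of two count vectors; a Counter returns 0 for missing keys."""
    return sum(a[key] * b[key] for key in a)

# Spelling-based similarity: "car" and "cat" share the bigram "ca",
# while "car" and "automobile" share no bigram at all.
spelling_car_cat = sparse_dot(char_bigrams("car"), char_bigrams("cat"))
spelling_car_auto = sparse_dot(char_bigrams("car"), char_bigrams("automobile"))

# Hypothetical hand-picked embeddings (NOT learned), chosen so that
# words with related meanings point in similar directions.
embedding = {
    "car": [0.9, 0.1],
    "automobile": [0.85, 0.15],
    "cat": [0.1, 0.9],
}

def dense_dot(u, v):
    return sum(x * y for x, y in zip(u, v))

embedded_car_auto = dense_dot(embedding["car"], embedding["automobile"])
embedded_car_cat = dense_dot(embedding["car"], embedding["cat"])

print(spelling_car_cat, spelling_car_auto)   # spelling favors "cat"
print(embedded_car_auto, embedded_car_cat)   # the toy embeddings favor "automobile"
```

With the spelling-based vectors, the synonym pair scores zero; with the toy embeddings, it scores higher than the unrelated pair.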

Word2Vec

Word2Vec visual representation [https://www.tensorflow.org/tutorials/text/word2vec?hl=tr]

Word2Vec (a method for finding vector representations of words) takes a word and models its relationship with the words to its left and right. It then adjusts the vector values accordingly. As the same word appears with its neighbors in many different texts, the vectors converge toward values that capture the meaning of each word. [1]
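The idea of relating a word to its left and right neighbors can be sketched as the extraction of (target, context) pairs from a sliding window. This shows only the pair-extraction step, not the training itself; the function name `context_pairs` and the window size are assumptions made for this illustration.

```python
def context_pairs(tokens, window=2):
    """Pair every word with the words up to `window` positions to its left and right."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs

tokens = "the quick brown fox".split()
pairs = context_pairs(tokens, window=1)

for target, context in pairs:
    print(target, "->", context)
```

A Word2Vec-style model would then be trained on pairs like these, so that words appearing in similar contexts end up with similar vectors.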

Thanks to the Word2Vec method, we can perform linear algebra operations on word vectors [1]:

king - man + woman ≈ queen
Word Vector Space Example [https://medium.com/opla/how-to-train-word-embeddings-using-small-datasets-9ced58b58fde]
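The analogy above can be reproduced with a small sketch. The 2-D vectors below are hand-picked toy values (roughly a "royalty" axis and a "gender" axis), not real Word2Vec output, and the helper `analogy` is a hypothetical name used only here.

```python
import math

# Hypothetical toy vectors, NOT learned: [royalty, gender].
vectors = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, vectors):
    """Return the word whose vector is closest to a - b + c, excluding a, b, and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [word for word in vectors if word not in {a, b, c}]
    return max(candidates, key=lambda word: cosine(vectors[word], target))

result = analogy("king", "man", "woman", vectors)
print(result)
```

With real embeddings the match is only approximate, which is why the relation is written with ≈ rather than =.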

Artificial neural networks are the best-performing methods today. These networks contain equations that map the input to the output, and these equations have parameters. First, we get an error during model training, during parameter adjustment…

Written by Cahit Barkin Ozer

I learn about innovations in technology, especially generative AI, and share them with you. My YouTube channel: https://www.youtube.com/@cbarkinozer
