Complete Summary of Absolute, Relative, and Rotary Position Embeddings!

Aziz Belaweid · Published in Generative AI · Mar 31, 2024

Position embeddings are used in nearly all recent LLMs. In this article, I explore the concept behind them and walk through the main types and how they differ.

What are Position Embeddings?

In RNNs, the hidden state is updated from the current input and the previous timestep’s hidden state, so word order is baked into the computation. Transformers, however, don’t naturally grasp the order of a sentence.

This is because the attention mechanism calculates relationships between tokens, with each token paying attention to all others in the sequence, without considering their order.

This means both of these sentences look exactly the same to a transformer:

“the dog chased the cat”
“the cat chased the dog”

Or any other permutation of those words.
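
To see why, here is a minimal sketch (in PyTorch, with a toy single-head attention and made-up dimensions, not code from any real model) showing that shuffling the input tokens just shuffles the outputs in the same way. Without position information, nothing in the computation depends on order:

```python
import torch

def attention(x):
    # Toy self-attention with identity Q/K/V projections to keep the
    # sketch minimal: tokens attend to each other by content only.
    scores = x @ x.transpose(-2, -1) / (x.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ x

tokens = torch.randn(5, 16)   # 5 token embeddings of dimension 16
perm = torch.randperm(5)      # a shuffled "sentence"

out = attention(tokens)
out_shuffled = attention(tokens[perm])

# Identical outputs, just reordered: the model cannot tell the orderings apart.
print(torch.allclose(out[perm], out_shuffled, atol=1e-6))  # True
```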

To address this, researchers suggested ways to introduce the idea of order, called position embeddings. Position embeddings are vectors that we add to the token embeddings to carry information about order. An example, introduced in the original Transformer paper, is the absolute position embedding.

What are Absolute Positional Embeddings?

Absolute position embeddings are vectors with the same dimension as the word embedding itself. Each position in the sentence gets a unique vector that encodes where the word sits in the sequence.

Then, we add these position embeddings to the token embeddings to inject position information, and feed the sum to the transformer.
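
In code, this is just an element-wise sum. A minimal sketch with made-up dimensions:

```python
import torch

seq_len, d_model = 10, 512

token_emb = torch.randn(seq_len, d_model)  # what each token means
pos_emb = torch.randn(seq_len, d_model)    # where each token sits

# The transformer input now carries both content and order.
x = token_emb + pos_emb
```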

How do we obtain these absolute embeddings?

There are two main ways you can create these embeddings.

  • Learn from data: you simply treat the position embeddings as weight parameters and learn a different vector for each position. This is bounded by the maximum sequence length you choose.
  • Sinusoidal functions: here we use a mathematical function that, given a position in the sentence, returns a vector encoding that position. In the original Transformer paper, sine and cosine functions alternate across the dimensions of the embedding, and each dimension oscillates at a different frequency. Both options are sketched in code after this list.
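
For reference, the original paper defines PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). Here is a minimal sketch of both options in PyTorch; the names and dimensions are mine, chosen for illustration:

```python
import torch
import torch.nn as nn

d_model, max_len = 512, 1024

# Option 1: learned absolute position embeddings. One trainable row per
# position, so nothing exists beyond max_len.
learned_pos = nn.Embedding(max_len, d_model)

# Option 2: fixed sinusoidal embeddings from the original Transformer paper.
def sinusoidal_pos(max_len, d_model):
    pos = torch.arange(max_len).unsqueeze(1)   # (max_len, 1)
    i = torch.arange(0, d_model, 2)            # even dimension indices
    freq = 1.0 / (10000 ** (i / d_model))      # lower frequency at higher dims
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(pos * freq)        # even dimensions: sine
    pe[:, 1::2] = torch.cos(pos * freq)        # odd dimensions: cosine
    return pe

positions = torch.arange(10)
x_learned = learned_pos(positions)               # (10, 512), trained with the model
x_fixed = sinusoidal_pos(max_len, d_model)[:10]  # (10, 512), no parameters at all
```

A practical difference: the learned table is stuck at max_len positions, while the sinusoidal function can be evaluated for any position, which is one reason the paper proposed it.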
