📰 Dev.to · Rijul Rajesh

Articles from Dev.to · Rijul Rajesh · 94 articles · Updated every 3 hours

Understanding Attention Mechanisms – Part 6: Final Step in Decoding
Dev.to · Rijul Rajesh 1w ago
In the previous article, we obtained the initial output, but we didn’t receive the EOS token yet. To...
Understanding Attention Mechanisms – Part 5: How Attention Produces the First Output
Dev.to · Rijul Rajesh 1w ago
In the previous article, we stopped at using the softmax function to scale the scores. When we scale...
Understanding Attention Mechanisms – Part 4: Turning Similarity Scores into Attention Weights
Dev.to · Rijul Rajesh 1w ago
In the previous article, we explored the benefits of using the dot product instead of cosine...
Cosine Similarity vs Dot Product in Attention Mechanisms
Dev.to · Rijul Rajesh 1w ago
For comparing the hidden states between the encoder and decoder, we need a similarity score. Two...
Understanding Attention Mechanisms – Part 3: From Cosine Similarity to Dot Product
Dev.to · Rijul Rajesh 2w ago
In the previous article, we explored the comparison between encoder and decoder outputs. In this...
Understanding Attention Mechanisms – Part 2: Comparing Encoder and Decoder Outputs
Dev.to · Rijul Rajesh 2w ago
In the previous article, we explored the main idea of attention and the modifications it requires in...
Understanding Attention Mechanisms – Part 1: Why Long Sentences Break Encoder–Decoders
Dev.to · Rijul Rajesh 2w ago
In the previous articles, we understood Seq2Seq models. Now, on the path toward transformers, we need...
Understanding Seq2Seq Neural Networks – Part 8: When Does the Decoder Stop?
Dev.to · Rijul Rajesh 2w ago
In the previous article, we saw the translation being done. But there is an issue. The decoder does...
Understanding Teacher Forcing in Seq2Seq Models
Dev.to · Rijul Rajesh 2w ago
When we learn about seq2seq neural networks, there is a term we should know called Teacher...
Understanding Seq2Seq Neural Networks – Part 7: Generating the Output with Softmax
Dev.to · Rijul Rajesh 3w ago
In the previous article, we were passing the outputs to the fully connected layer. A fully...
Understanding Seq2Seq Neural Networks – Part 6: Decoder Outputs and the Fully Connected Layer
Dev.to · Rijul Rajesh 3w ago
In the previous article, we were looking at the embedding values in the encoder and the...
Understanding Seq2Seq Neural Networks – Part 5: Decoding the Context Vector
Dev.to · Rijul Rajesh 3w ago
In the previous article, we stopped at the concept of the context vector. In this article, we will...
Understanding Seq2Seq Neural Networks – Part 4: The Encoder and the Context Vector
Dev.to · Rijul Rajesh 3w ago
In the previous article, we stopped with the problem where we wanted to add more weights and biases...
Understanding Seq2Seq Neural Networks – Part 3: Stacking LSTMs in the Encoder
Dev.to · Rijul Rajesh 3w ago
In the previous article, we created an embedding layer for the input vocabulary. In this article, we...
Understanding Seq2Seq Neural Networks – Part 2: Embeddings for Sequence Inputs
Dev.to · Rijul Rajesh 4w ago
In the previous article, we began with the concept of the sequence-to-sequence problem, and...
Understanding Seq2Seq Neural Networks – Part 1: The Seq2Seq Translation Problem
Dev.to · Rijul Rajesh 4w ago
There are problems where we have sequences of one type of thing that need to be translated into...
Understanding Word2Vec – Part 7: How Negative Sampling Speeds Up Word2Vec
Dev.to · Rijul Rajesh 1mo ago
In the previous article, we saw the huge number of weights and mentioned a technique called...
Understanding Word2Vec – Part 6: Two Ways Word2Vec Learns Context
Dev.to · Rijul Rajesh 1mo ago
In the previous article, we saw the concept of word embeddings, and how training causes similar words to...
Understanding Word2Vec – Part 5: How Training Creates Word Embeddings
Dev.to · Rijul Rajesh 1mo ago
In the previous article, we visualized the vectors on a graph and saw how we can represent similarity...
Understanding Word2Vec – Part 4: Visualizing Word Vectors
Dev.to · Rijul Rajesh 1mo ago
In the previous article, we saw how next-word prediction is done, and how a lack of training is...