Deep Learning with Pytorch -Text Generation – LSTMs – 3.3

Posted on July 1, 2019 by Aritra Sen

In this Deep Learning with Pytorch series we have so far seen how to work with tabular data, images and time series data; in this post we will see how to work with plain text data. Along with generating text with the help of LSTMs, we will also learn two other important concepts – gradient clipping and word embeddings.

Gradient Clipping:
The problem of exploding gradients is more common in recurrent neural networks like LSTMs, because error gradients accumulate as the network is unrolled over many input time steps (the sequence length). In neural network training, once the forward propagation is done we calculate the loss by comparing the predicted values with the actual ones, and then we update the weights using the derivative of the loss and the learning rate. The problem comes when these gradient values become extremely large or extremely small: the weights can take on values of “NaN” or “Inf” in these cases of exploding or vanishing gradients, and the network almost stops learning.
One solution to this problem is gradient clipping, where we force the gradient values to stay within a specified minimum and maximum whenever they exceed that range. The code example below shows how we can do this in Pytorch.
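A minimal sketch of one training step with gradient clipping, assuming a tiny LSTM and dummy data (the max_norm value of 5.0 and the layer sizes are illustrative choices, not taken from the notebook):

import torch
import torch.nn as nn

model = nn.LSTM(input_size=10, hidden_size=20, batch_first=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

x = torch.randn(4, 7, 10)        # batch of 4 sequences, 7 time steps, 10 features
target = torch.randn(4, 7, 20)   # dummy targets matching the LSTM output shape

output, _ = model(x)
loss = criterion(output, target)

optimizer.zero_grad()
loss.backward()
# Rescale all gradients in place so their global norm never exceeds 5.0,
# preventing an exploding gradient from blowing up the weight update.
nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)
optimizer.step()

Pytorch also provides nn.utils.clip_grad_value_, which instead clips every gradient element to a fixed [-value, value] range.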

Word Embedding:
Whenever we work with text, we need to convert it into numbers before feeding it into the neural network. One of the simplest and easiest-to-understand ways is to one-hot encode the words and then feed them into the network. Below is an example of one-hot encoding for the sentence –
I am going to office.
Here we have 5 unique words, so the vocabulary length is 5, and the sentence can be represented as shown below, where each vector is a single word.

1 0 0 0 0 – I
0 1 0 0 0 – am
0 0 1 0 0 – going
0 0 0 1 0 – to
0 0 0 0 1 – office
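As a quick sketch, the same encoding can be produced in Pytorch with torch.nn.functional.one_hot (the word-to-index mapping below is an illustrative assumption):

import torch
import torch.nn.functional as F

sentence = "I am going to office".split()
word_to_idx = {word: idx for idx, word in enumerate(sentence)}  # 5 unique words

indices = torch.tensor([word_to_idx[w] for w in sentence])
one_hot = F.one_hot(indices, num_classes=len(word_to_idx))
print(one_hot)
# tensor([[1, 0, 0, 0, 0],
#         [0, 1, 0, 0, 0],
#         [0, 0, 1, 0, 0],
#         [0, 0, 0, 1, 0],
#         [0, 0, 0, 0, 1]])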
This works well when the vocabulary is small; however, one-hot encoding suffers from the problems mentioned below –

> Consumes a lot of memory to store the words
> No relation or context between the words is preserved.

Also, in this example each word is independent and no notion of similarity is maintained. We might instead want to store the numerical values of these words in such a way that semantic similarity is preserved. Taking the below example from the Pytorch official tutorial –

Suppose we are building a language model, and in our training data we have seen the sentences –

  • The mathematician ran to the store.
  • The physicist ran to the store.
  • The mathematician solved the open problem.

Now suppose we come across the below sentence, which we have never seen in our training data –
The physicist solved the open problem.

Now, how can we capture the semantic similarity, i.e. the fact that a mathematician and a physicist are good at performing similar tasks (share attributes)? If we treat each attribute as a dimension and assign (or rather let the neural network learn through training) similar values for mathematician and physicist, then in that multidimensional space they would be close to each other, as shown below –

Word Embedding Similarity
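To make the idea of being close in a multidimensional space concrete, here is a tiny sketch with hand-picked, hypothetical attribute values (in practice these values would be learned by the network during training):

import torch
import torch.nn.functional as F

# Two hypothetical attribute dimensions, e.g. (can_run, is_good_at_math)
mathematician = torch.tensor([2.3, 9.4])
physicist     = torch.tensor([2.5, 9.1])
store         = torch.tensor([8.0, 0.3])

# Cosine similarity is close to 1 for vectors pointing in a similar direction
print(F.cosine_similarity(mathematician, physicist, dim=0))  # high similarity
print(F.cosine_similarity(mathematician, store, dim=0))      # low similarity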
Word Embedding in Pytorch:
(Full implementation: 3.2_Pytorch_TextGeneration_LSTM_Implementation.ipynb, hosted on GitHub.)
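The notebook above contains the full implementation; below is only a minimal sketch of how nn.Embedding typically plugs into an LSTM for next-word prediction. The class name, layer sizes and vocabulary size are illustrative assumptions, not the notebook's exact code:

import torch
import torch.nn as nn

class TextGenerationLSTM(nn.Module):
    def __init__(self, vocab_size, embedding_dim=128, hidden_dim=256, num_layers=2):
        super().__init__()
        # Maps each word index to a dense, learnable embedding vector
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim,
                            num_layers=num_layers, batch_first=True)
        # Projects the LSTM output back onto the vocabulary to score the next word
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, hidden=None):
        embedded = self.embedding(x)       # (batch, seq_len, embedding_dim)
        output, hidden = self.lstm(embedded, hidden)
        logits = self.fc(output)           # (batch, seq_len, vocab_size)
        return logits, hidden

# Quick shape check with a dummy batch of word indices
model = TextGenerationLSTM(vocab_size=5000)
dummy_batch = torch.randint(0, 5000, (4, 20))   # 4 sequences of 20 word indices
logits, _ = model(dummy_batch)
print(logits.shape)                             # torch.Size([4, 20, 5000])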

Below is a screenshot from the generated text –

Do like, share and comment if you have any questions.

Category: Machine Learning, Python

Post navigation

← Deep Learning with Pytorch -Sequence Modeling – LSTMs – 3.2
1.0 – Getting started with Transformers for NLP →

1 thought on “Deep Learning with Pytorch -Text Generation – LSTMs – 3.3”

  1. Kajal says:
    March 8, 2021 at 3:29 PM

    Thank you for such a nice explanation. I will try to implement it and come back.

