
Q/A System — Deep Learning (2/2)

How does an LSTM work?

I think it's unfair to say that a neural network has no memory at all. After all, the learned weights are a kind of memory of the training data. But that memory is static. Sometimes we want to remember an input for later use. There are many examples of such situations, the stock market being a classic one: to make a good investment judgement, we have to look at stock data over a time window, not just at a single point in time.
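As a quick illustration of that idea, here is a minimal sketch of turning a price series into fixed-length windows, which is exactly the shape an LSTM expects. The prices are made-up values, purely for illustration:

import numpy as np

# Hypothetical daily closing prices (illustrative values only)
prices = np.array([101.0, 102.5, 101.8, 103.2, 104.0, 103.5, 105.1])

window = 3  # look back three days to predict the next one
X = np.array([prices[i:i + window] for i in range(len(prices) - window)])
y = prices[window:]

# X has shape (samples, time steps); add a feature axis for an LSTM
X = X.reshape((X.shape[0], window, 1))
print(X.shape, y.shape)  # (4, 3, 1) (4,)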

An LSTM network has the following three aspects that differentiate it from a usual neuron in a recurrent neural network (a minimal sketch of the corresponding gate equations follows the list).

1. It controls when to let the input enter the neuron (the input gate).

2. It controls how much of what was computed in the previous time step to remember (the forget gate).

3. It controls when to let the output pass on to the next time step (the output gate).
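Here is a minimal single-step sketch of those gate equations in NumPy. The weights are random placeholders, not trained values:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # W, U, b hold the stacked weights for the four internal transforms:
    # input gate i, forget gate f, output gate o, and candidate state g
    z = W @ x + U @ h_prev + b
    n = h_prev.shape[0]
    i = sigmoid(z[0:n])        # 1. when to let the input in
    f = sigmoid(z[n:2*n])      # 2. how much previous state to remember
    o = sigmoid(z[2*n:3*n])    # 3. when to let the output out
    g = np.tanh(z[3*n:4*n])    # candidate cell state
    c = f * c_prev + i * g     # new cell state
    h = o * np.tanh(c)         # new hidden state / output
    return h, c

# Toy dimensions: 4 input features, 3 hidden units, random placeholder weights
rng = np.random.default_rng(0)
x, h, c = rng.normal(size=4), np.zeros(3), np.zeros(3)
W, U, b = rng.normal(size=(12, 4)), rng.normal(size=(12, 3)), np.zeros(12)
h, c = lstm_step(x, h, c, W, U, b)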

A simple example of an LSTM in Python using Keras:

import numpy
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.utils import np_utils

numpy.random.seed(7)

alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# Map each character to an integer, and back again
char_to_int = dict((c, i) for i, c in enumerate(alphabet))
int_to_char = dict((i, c) for i, c in enumerate(alphabet))

# Prepare the dataset of input-to-output pairs encoded as integers
seq_length = 1
dataX = []
dataY = []
for i in range(0, len(alphabet) - seq_length, 1):
    seq_in = alphabet[i:i + seq_length]
    seq_out = alphabet[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    dataY.append(char_to_int[seq_out])
    print(seq_in, '->', seq_out)

# Reshape X to be [samples, time steps, features] and normalize
X = numpy.reshape(dataX, (len(dataX), seq_length, 1))
X = X / float(len(alphabet))
# One-hot encode the output variable
y = np_utils.to_categorical(dataY)

# Create and fit the model
model = Sequential()
model.add(LSTM(32, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=600, batch_size=1, verbose=2)

# Summarize the performance of the model
scores = model.evaluate(X, y, verbose=0)
print("Model Accuracy: %.2f%%" % (scores[1] * 100))

# Demonstrate some predictions
for pattern in dataX:
    x = numpy.reshape(pattern, (1, len(pattern), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = numpy.argmax(prediction)
    result = int_to_char[index]
    seq_in = [int_to_char[value] for value in pattern]
    print(seq_in, "->", result)
OUTPUT

The script first prints each training pair (A -> B, B -> C, ...), then the model accuracy, and finally a predicted next letter for every input pattern. If you set seq_length = 3, the pairs become three-character windows instead: ABC -> D, BCD -> E, and so on.

Now let's work with some text data.

Create one Python file, Text_Preprocessing.py (train.py below imports from it):

Code:

import re
from nltk.tokenize import word_tokenize
from keras.preprocessing.sequence import pad_sequences
import numpy as np


def tokenize(sentence):
    return word_tokenize(sentence)


def vectorize_ques(data, word_id, text_max_length, ques_max_length):
    # Convert (story, question) pairs into padded sequences of word IDs
    X = []
    Xq = []
    for subtext, question in data:
        x = [word_id[w] for w in subtext]
        xq = [word_id[w] for w in question]
        X.append(x)
        Xq.append(xq)
    return (pad_sequences(X, maxlen=text_max_length),
            pad_sequences(Xq, maxlen=ques_max_length))


def vectorize_text(data, word_id, text_max_length, ques_max_length):
    X = []
    Xq = []
    Y = []
    for subtext, question, answer in data:
        # Save the IDs of the words in the subtext
        x = [word_id[w] for w in subtext]
        # Save the IDs of the words in the question
        xq = [word_id[w] for w in question]
        # Save the answer as a one-hot vector: a "1" at the answer's word ID
        y = np.zeros(len(word_id) + 1)
        y[word_id[answer]] = 1
        X.append(x)
        Xq.append(xq)
        Y.append(y)
    return (pad_sequences(X, maxlen=text_max_length),
            pad_sequences(Xq, maxlen=ques_max_length),
            np.array(Y))


def read_text():
    # Read a story from stdin, one sentence per line, until an empty line
    text = []
    input_line = input('Story, Empty to stop: ')
    while input_line != '':
        # For now, each line has to be a full sentence
        if not input_line.endswith('.'):
            input_line += '.'
        text.extend(tokenize(input_line))
        input_line = input('Story, Empty to stop: ')
    return text
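To see what these helpers produce, here is a toy example. The vocabulary below is made up purely for illustration, not the one the real training run will build:

# Illustrative only: a toy vocabulary mapping words to IDs
word_id = {'Mary': 1, 'moved': 2, 'to': 3, 'the': 4, 'bathroom': 5,
           'Where': 6, 'is': 7, '?': 8, '.': 9}

story = ['Mary', 'moved', 'to', 'the', 'bathroom', '.']
question = ['Where', 'is', 'Mary', '?']

X, Xq = vectorize_ques([(story, question)], word_id,
                       text_max_length=10, ques_max_length=4)
print(X)   # [[0 0 0 0 1 2 3 4 5 9]]  story IDs, left-padded to length 10
print(Xq)  # [[6 7 1 8]]              question IDs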

Another Python file, train.py:

# Kashyap
# Dataset from the Facebook AI Research page
import os
from keras.models import Sequential, Model
from keras.layers.embeddings import Embedding
from keras.preprocessing.sequence import pad_sequences
from keras.layers import Input, Activation, Dense, Permute, Dropout, add, dot, concatenate
from keras.layers import LSTM
from keras.utils.data_utils import get_file
from functools import reduce
from nltk.tokenize import word_tokenize
import tarfile
from Text_Preprocessing import *
import numpy as np
import pickle
import keras
import re
import random
def parse_text(lines, only_supporting=False):
    # Parse the bAbI file format: numbered story lines, plus question lines
    # containing a tab-separated question, answer, and supporting line IDs
    data = []
    text = []
    for line in lines:
        line = line.decode('utf-8').strip()
        line_id, line = line.split(' ', 1)
        line_id = int(line_id)
        if line_id == 1:
            # A new story starts; reset the accumulated text
            text = []
        if '\t' in line:
            ques, ans, supporting = line.split('\t')
            ques = tokenize(ques)
            subtext = None
            if only_supporting:
                # Keep only the sentences that support the answer
                supporting = list(map(int, supporting.split()))
                subtext = [text[i - 1] for i in supporting]
            else:
                # Keep the whole story so far
                subtext = [x for x in text if x]
            data.append((subtext, ques, ans))
            text.append('')
        else:
            sent = tokenize(line)
            text.append(sent)
    return data


def get_stories(file, only_supporting=False, max_length=None):
    data = parse_text(file.readlines(), only_supporting=only_supporting)
    # flatten: joins a story's sentences into one flat list of tokens
    flatten = lambda data: reduce(lambda x, y: x + y, data)
    # Format: [(flattened tokenized story, tokenized question, answer), ...]
    data = [(flatten(text), question, answer) for text, question, answer in data if
            not max_length or len(flatten(text)) < max_length]
    return data


class memoryNetwork(object):
    FILE_NAME = 'model'
    VOCAB_FILE_NAME = 'model_vocab'

    def __init__(self):
        # Load a previously trained model if one exists, otherwise train one
        if (os.path.exists(memoryNetwork.FILE_NAME) and
                os.path.exists(memoryNetwork.VOCAB_FILE_NAME)):
            self.load()
        else:
            self.train()
            self.store()

    def load(self):
        self.model = keras.models.load_model(memoryNetwork.FILE_NAME)
        with open(memoryNetwork.VOCAB_FILE_NAME, 'rb') as file:
            self.word_id = pickle.load(file)

    def store(self):
        self.model.save(memoryNetwork.FILE_NAME)
        with open(memoryNetwork.VOCAB_FILE_NAME, 'wb') as file:
            pickle.dump(self.word_id, file)

    def train(self):
        # Load the bAbI dataset
        try:
            dataPath = get_file('babi-tasks-v1-2.tar.gz',
                                origin='https://s3.amazonaws.com/text-datasets/babi_tasks_1-20_v1-2.tar.gz')
        except:
            print('Error downloading dataset, please download it manually:\n'
                  '$ wget http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz\n'
                  '$ mv tasks_1-20_v1-2.tar.gz ~/.keras/datasets/babi-tasks-v1-2.tar.gz')
            raise

        tar = tarfile.open(dataPath)

        challenges = {
            'single_supporting_fact_10k': 'tasks_1-20_v1-2/en-10k/qa1_single-supporting-fact_{}.txt',
            'two_supporting_facts_10k': 'tasks_1-20_v1-2/en-10k/qa2_two-supporting-facts_{}.txt',
        }
        challenge_type = 'single_supporting_fact_10k'
        challenge = challenges[challenge_type]

        # Extract the text from the single_supporting_fact_10k file
        print('Extracting stories for the challenge:', challenge_type)

        # Load the training and testing text data
        train_stories = get_stories(tar.extractfile(challenge.format('train')))
        test_stories = get_stories(tar.extractfile(challenge.format('test')))

        # Initialize the vocabulary as a set, so every word occurs only once
        vocab = set()
        for text, ques, answer in train_stories + test_stories:
            vocab |= set(text + ques + [answer])

        # Sort the words in the vocabulary list
        vocab = sorted(vocab)

        # Get the size of the vocabulary and the max lengths of texts and questions
        vocab_size = len(vocab) + 1
        # text_max_length: length of the longest subtext
        text_max_length = max(list(map(len, (x for x, _, _ in train_stories + test_stories))))
        # ques_max_length: length of the longest question in the input
        ques_max_length = max(list(map(len, (x for _, x, _ in train_stories + test_stories))))

        print('-')
        print('Vocab size:', vocab_size, 'unique words')
        print('Story max length:', text_max_length, 'words')
        print('Query max length:', ques_max_length, 'words')
        print('Number of training stories:', len(train_stories))
        print('Number of test stories:', len(test_stories))
        print('-')
        print('Here\'s what a "story" tuple looks like (input, query, answer):')
        print(train_stories[0])
        print('-')
        print('Vectorizing the word sequences...')

        # Vectorize the training and testing data
        self.word_id = dict((c, i + 1) for i, c in enumerate(vocab))
        # inputs_train: matrix of arrays, each a vectorized story
        # ques_train: matrix of arrays, each a vectorized question
        # answers_train: matrix of arrays, each one-hot with a "1" at the answer's index
        inputs_train, ques_train, answers_train = vectorize_text(train_stories,
                                                                 self.word_id,
                                                                 text_max_length,
                                                                 ques_max_length)
        inputs_test, ques_test, answers_test = vectorize_text(test_stories,
                                                              self.word_id,
                                                              text_max_length,
                                                              ques_max_length)

        # Dataset analysis
        print('-')
        print('inputs: integer tensor of shape (samples, max_length)')
        print('inputs_train shape:', inputs_train.shape)
        print('inputs_test shape:', inputs_test.shape)
        print('-')
        print('queries: integer tensor of shape (samples, max_length)')
        print('queries_train shape:', ques_train.shape)
        print('queries_test shape:', ques_test.shape)
        print('-')
        print('answers: binary (1 or 0) tensor of shape (samples, vocab_size)')
        print('answers_train shape:', answers_train.shape)
        print('answers_test shape:', answers_test.shape)
        print('-')
        print('Compiling...')

        # Define placeholders
        input_sequence = Input((text_max_length,))
        question = Input((ques_max_length,))

        # Encoders for the story: one embedding per memory component
        input_encoder_m = Sequential()
        input_encoder_m.add(Embedding(input_dim=vocab_size, output_dim=64))
        input_encoder_m.add(Dropout(0.3))

        input_encoder_c = Sequential()
        input_encoder_c.add(Embedding(input_dim=vocab_size, output_dim=ques_max_length))
        input_encoder_c.add(Dropout(0.3))

        # Encoder for the question
        question_encoder = Sequential()
        question_encoder.add(Embedding(input_dim=vocab_size,
                                       output_dim=64,
                                       input_length=ques_max_length))
        question_encoder.add(Dropout(0.3))

        # output: (samples, query_maxlen, embedding_dim)
        input_encoded_m = input_encoder_m(input_sequence)
        input_encoded_c = input_encoder_c(input_sequence)
        question_encoded = question_encoder(question)

        # Compute a 'match' between the first input vector sequence
        # and the question vector sequence
        # shape: `(samples, story_maxlen, query_maxlen)`
        match = dot([input_encoded_m, question_encoded], axes=(2, 2))
        match = Activation('softmax')(match)

        # Add the match matrix with the second input vector sequence
        response = add([match, input_encoded_c])  # (samples, story_maxlen, query_maxlen)
        response = Permute((2, 1))(response)  # (samples, query_maxlen, story_maxlen)

        # Concatenate the response with the question and reduce with an RNN
        answer = concatenate([response, question_encoded])
        answer = LSTM(32)(answer)  # (samples, 32)
        answer = Dropout(0.3)(answer)
        answer = Dense(vocab_size)(answer)  # (samples, vocab_size)
        # We output a probability distribution over the vocabulary
        answer = Activation('softmax')(answer)

        # Build the final model
        self.model = Model([input_sequence, question], answer)
        self.model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
                           metrics=['accuracy'])

        # Train the model
        self.model.fit([inputs_train, ques_train], answers_train,
                       batch_size=32,
                       epochs=120,
                       validation_data=([inputs_test, ques_test], answers_test))
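The heart of this architecture is the match step: a dot product between the story embeddings and the question embeddings, normalized with a softmax (over the last axis, Keras's default for Activation('softmax')). A minimal NumPy sketch of just that computation, with toy shapes and random placeholder embeddings:

import numpy as np

rng = np.random.default_rng(0)
story_len, query_len, embed_dim = 6, 4, 8

# Placeholders standing in for input_encoded_m and question_encoded
story_emb = rng.normal(size=(story_len, embed_dim))
query_emb = rng.normal(size=(query_len, embed_dim))

# dot(..., axes=(2, 2)) per sample: (story_len, embed_dim) @ (embed_dim, query_len)
match = story_emb @ query_emb.T  # (story_len, query_len)

# Softmax over the last axis turns raw scores into attention weights
weights = np.exp(match) / np.exp(match).sum(axis=-1, keepdims=True)
print(weights.sum(axis=-1))  # each row sums to 1.0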

Finally, main.py:

from Text_Preprocessing import *
from train import *
import numpy as np

memory_network = memoryNetwork()

while True:
    print('Use this Vocabulary to form Questions:\n' + ' , '.join(memory_network.word_id.keys()))
    story = read_text()
    print('Story: ' + ' '.join(story))
    question = input('q:')
    if question == '' or question == 'exit':
        break
    # 68 and 4 are the text and question max lengths from training
    story_vector, query_vector = vectorize_ques([(story, tokenize(question))],
                                                memory_network.word_id, 68, 4)
    prediction = memory_network.model.predict([np.array(story_vector), np.array(query_vector)])
    prediction_word_index = np.argmax(prediction)
    # Look up the predicted word ID back in the vocabulary
    for word, index in memory_network.word_id.items():
        if index == prediction_word_index:
            print('Answer: ', word)

python3 main.py

OUTPUT
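An illustrative session, with a made-up story in the style of bAbI task 1 (actual output depends on your trained model):

Story, Empty to stop: Mary moved to the bathroom
Story, Empty to stop: John went to the hallway
Story, Empty to stop:
Story: Mary moved to the bathroom . John went to the hallway .
q:Where is Mary ?
Answer:  bathroom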

Please let me know if you run into any problems.
