himanshurawat443/Twitter-Sentiment-Analysis-using-Deep-Neural-networks

Twitter-Sentiment-Analysis-using-Deep-Neural-networks

The objective of this project was to recognize the sentiment expressed in tweets and classify it using neural networks. The project uses the Sentiment140 dataset from Kaggle, which contains 1,600,000 tweets labeled “positive = 4” and “negative = 0”. Sentiment analysis is useful because it can help determine overall opinion about products, identify hate speech, or predict stock market movement for a given company.
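As a minimal sketch of how the 4/0 labels might be turned into binary targets, the snippet below reads Sentiment140-style CSV rows and maps “4” to 1 and “0” to 0. It assumes the polarity is the first column and the tweet text is the last column of each row; the function name `read_labeled_tweets` is hypothetical and not taken from this repository.

```python
import csv
import io

# Map the dataset's polarity codes to binary targets (assumed encoding: 1 = positive).
LABEL_MAP = {"0": 0, "4": 1}


def read_labeled_tweets(handle):
    """Yield (text, label) pairs from an open CSV file handle.

    Assumption: polarity is the first column and tweet text is the last.
    """
    for row in csv.reader(handle):
        polarity, text = row[0], row[-1]
        yield text, LABEL_MAP[polarity]


# Small in-memory example standing in for the real file.
sample = io.StringIO(
    '"0","1","Mon","NO_QUERY","user_a","sad day"\n'
    '"4","2","Tue","NO_QUERY","user_b","great day"\n'
)
pairs = list(read_labeled_tweets(sample))
```

For the real file, the same function could be given a handle opened with `open(path, encoding="latin-1", newline="")`, where `path` points to the downloaded CSV; the encoding is an assumption about the Kaggle download.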

Twitter is a micro-blogging service with user-created status messages termed tweets. The Twitter timeline displays tweets from users worldwide and is an extensive source of real-time information. Investigation of tweets reveals that the 140-character limit restricts the vocabulary, and the hyperlinks present in tweets reduce the usable vocabulary further. Misspellings and slang words are much more frequent in tweets than in other language resources. Micro-blogging language is characterized by expressive punctuation that conveys a lot of sentiment: bold-lettered phrases, exclamations, question marks, quoted text, and so on leave scope for sentiment extraction. The wide range of domains discussed also makes training harder.

I preprocessed the dataset by removing @user mentions, links, punctuation, special characters, numbers, and stop words. I also applied stemming and tokenization to normalize the data, which results in better training of the model.
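The cleaning steps described above can be sketched roughly as follows. This is an illustration and not the project's actual code: the stop-word set is a tiny hypothetical sample, the regular expressions are assumptions about what counts as a mention or a link, and stemming is left as an optional callable so the sketch has no third-party dependencies.

```python
import re

# Tiny illustrative stop-word set (assumption); a real run would use a fuller list.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on",
              "for", "it", "this", "that", "i", "my", "at", "so"}

MENTION_RE = re.compile(r"@\w+")                 # @user mentions
URL_RE = re.compile(r"https?://\S+|www\.\S+")    # links
NON_LETTER_RE = re.compile(r"[^a-z\s]")          # punctuation, digits, special characters


def preprocess(tweet, stem=None):
    """Return a list of cleaned tokens for one tweet.

    `stem` is an optional callable applied to each token (hypothetical parameter).
    """
    text = tweet.lower()
    text = MENTION_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    # Whitespace tokenization followed by stop-word removal.
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    if stem is not None:
        tokens = [stem(tok) for tok in tokens]
    return tokens


tokens = preprocess("@bob I love this phone!!! http://t.co/xyz 100%")
```

If NLTK is installed, a stemmer such as `nltk.stem.PorterStemmer().stem` could be passed as `stem`; which stemmer the project used is not stated here.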

In the model I used one Embedding layer, as it is preferred for textual data; one Dropout layer to prevent overfitting; two LSTM layers to maximize performance and accuracy; and one Dense layer to provide a densely connected neural network layer.
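A model with these five layers might look like the Keras sketch below. The layer order, vocabulary size, embedding dimension, LSTM units, dropout rate, sigmoid output, optimizer, and loss are all assumptions chosen for illustration; none of them are stated in this README. The import sits inside the function only so the snippet can be loaded on a machine without TensorFlow.

```python
def build_model(vocab_size=20000, embed_dim=64, lstm_units=64, dropout_rate=0.5):
    """Build an Embedding -> Dropout -> LSTM -> LSTM -> Dense classifier.

    All hyperparameter defaults are illustrative assumptions.
    """
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import LSTM, Dense, Dropout, Embedding

    model = Sequential([
        # Maps integer token ids to dense vectors.
        Embedding(input_dim=vocab_size, output_dim=embed_dim),
        # Randomly zeroes activations during training to reduce overfitting.
        Dropout(dropout_rate),
        # The first LSTM returns the full sequence so the second LSTM can consume it.
        LSTM(lstm_units, return_sequences=True),
        LSTM(lstm_units),
        # Single sigmoid unit for the binary positive/negative decision.
        Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

The model expects batches of integer-encoded, equal-length token sequences, so the cleaned tokens would first need to be mapped to ids and padded.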

With this model I obtained an accuracy of 98.66% on the training set (80% of the dataset). The model's predictions yielded an F1 score of 0.80 for the ‘Negative’ sentiment and 0.96 for the ‘Positive’ sentiment, with an overall weighted-average F1 score of 0.93.
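For reference, the weighted-average F1 score is the per-class F1 scores averaged with each class weighted by its number of true examples. The sketch below shows that calculation in plain Python on made-up labels; the function names are hypothetical and the sample data is not from this project.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score for one class, computed from true/false positives and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by each class's support in y_true."""
    labels = sorted(set(y_true))
    total = len(y_true)
    return sum(f1_for_label(y_true, y_pred, lab) * y_true.count(lab) for lab in labels) / total


# Made-up example: 0 = negative, 1 = positive.
y_true = [0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 0]
score = weighted_f1(y_true, y_pred)
```

Because the weights follow class support, the weighted average sits closer to the F1 score of whichever class has more examples in the evaluated data.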

Dataset Link : https://www.kaggle.com/kazanova/sentiment140
