Natural Language Processing

Podcast Sentiment and Topic Analysis

Intro

I am a bit of a podcast addict. One of the podcasts I listen to most, The Jordan Harbinger Show, has a Feedback Friday (FBF) episode every week. It's effectively an agony aunt column, except with two uncles. I have always loved it, but over time I've felt it has become too negative, with too many stories about addiction and abusive relationships. They're interesting to discuss, but they're a bit depressing. I wanted to test this hypothesis. Jordan is nice enough to provide all the transcripts on his website, so I thought I'd analyse the FBF episodes from the last year and see if they really are negative. While I had the data, I thought it would be interesting to extract common themes and topics too.

Thursday, January 2, 2025 | 11 minutes Read

Disaster Tweets Natural Language Processing

Intro

I have a dataset of tweets, each labelled with whether or not it refers to a disaster. The goal is to build a model that takes a tweet and predicts whether it is about a disaster. This could be useful during an actual disaster to ensure only the most relevant tweets are shown to emergency responders. The full code for this project can be found on my GitHub: https://github.com/jamesdeluk/data-science/tree/main/Projects/nlp-with-disaster-tweets

Exploring and cleaning the data

I started by looking at the raw data in a text editor; as it was only a few hundred kilobytes, it was easy enough to do:

Wednesday, December 11, 2024 | 26 minutes Read