# Capstone Project Notes

## Examples

<https://github.com/davidmasse/freelancer-rates>

<https://github.com/slieb74/NBA-Shot-Analysis>

<https://github.com/cpease00/etf_forecasting>

<https://github.com/NaokoSuga/gentrification_yelp>

<https://github.com/mrethana/news_bias_final>

<https://github.com/paulinaczheng/twitter_flu_tracking>

## Project

{% embed url="http://www.opensources.co/" %}
Lists of online sources
{% endembed %}

{% embed url="https://github.com/several27/FakeNewsCorpus" %}
Cleaned dataset (9.1 GB)
{% endembed %}

{% embed url="https://github.com/aws/aws-cli" %}

{% embed url="https://scrapy.org/" %}

{% embed url="https://stackoverflow.com/questions/45828616/streaming-large-training-and-test-files-into-tensorflows-dnnclassifier" %}

{% embed url="https://geoip2.readthedocs.io/en/latest/" %}

```python
import socket

import geoip2.database

# Resolve the domain to an IP address, then look up its country in the
# local GeoLite2 database.
ip = socket.gethostbyname('nike.com')
with geoip2.database.Reader('GeoLite2-Country_20190305/GeoLite2-Country.mmdb') as reader:
    response = reader.country(ip)
    print(response.country.iso_code)  # e.g. 'US'
```

{% embed url="https://makenewscredibleagain.github.io/#works" %}

{% embed url="https://realpython.com/python-keras-text-classification/" %}

{% embed url="https://towardsdatascience.com/multi-class-text-classification-model-comparison-and-selection-5eb066197568" %}

{% embed url="https://monkeylearn.com/blog/practical-explanation-naive-bayes-classifier/" %}

{% embed url="https://monkeylearn.com/text-classification-naive-bayes/" %}

{% embed url="https://www.datacamp.com/community/tutorials/naive-bayes-scikit-learn" %}

### Workflow

1. **Data Collection**

   Collect news articles from a set of credible and non-credible websites. Get training labels from [OpenSources](http://www.opensources.co/), a professionally curated database.
2. **Sampling**

   Sample from the corpus in such a way that the training set contains an even number of unique articles from both credible and non-credible sources for each day of data collection.
3. **Classifier**

   Build an ensemble classifier that considers the predictions of two separate models:\
   a) "Content-only" model (Multinomial Naive Bayes)\
   b) "Context-only" model (Adaptive Boosting)
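The balanced sampling in step 2 could be sketched with pandas; the column names (`article_id`, `date`, `label`) are assumptions about the corpus schema, not the project's actual code:

```python
import pandas as pd

def balanced_daily_sample(df, n_per_class, seed=0):
    """Draw up to n_per_class unique articles per (day, label) pair.

    Assumes df has 'article_id', 'date', and 'label' columns
    (label: 1 = credible source, 0 = non-credible).
    """
    # Drop duplicate articles so each one is counted once.
    df = df.drop_duplicates('article_id')
    # Sample the same number of rows from each class on each day.
    parts = [group.sample(min(len(group), n_per_class), random_state=seed)
             for _, group in df.groupby(['date', 'label'])]
    return pd.concat(parts, ignore_index=True)
```

Capping at `min(len(group), n_per_class)` keeps the function from failing on days where one class has fewer articles than requested, at the cost of a slightly imbalanced day.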
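The two-model ensemble in step 3 could look like the following scikit-learn sketch. The toy articles, the metadata features (domain age, ad count), and the soft-voting combination are illustrative assumptions, not the project's actual implementation:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy training data (stand-ins for the real corpus).
texts = [
    "officials confirm budget report released today",
    "shocking secret they do not want you to know",
    "city council approves new transit plan",
    "miracle cure that doctors hate revealed",
]
labels = np.array([1, 0, 1, 0])  # 1 = credible source, 0 = non-credible

# Hypothetical per-article context features, e.g. [domain_age_years, ad_count].
meta = np.array([[20, 2], [1, 15], [15, 3], [2, 12]])

# a) "Content-only" model: TF-IDF features into Multinomial Naive Bayes.
vectorizer = TfidfVectorizer()
content_model = MultinomialNB().fit(vectorizer.fit_transform(texts), labels)

# b) "Context-only" model: AdaBoost on the metadata features.
context_model = AdaBoostClassifier(n_estimators=50, random_state=0).fit(meta, labels)

def predict(new_texts, new_meta):
    """Soft-vote ensemble: average both models' class probabilities."""
    proba = (content_model.predict_proba(vectorizer.transform(new_texts))
             + context_model.predict_proba(new_meta)) / 2
    return (proba[:, 1] >= 0.5).astype(int)
```

Averaging predicted probabilities (a soft vote) is one simple way to combine the two models; weighting the average, or stacking a meta-classifier on top, are common alternatives.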
