PROJECT
Fake News Detection Using Scikit-learn
In this project, we will combine news from two different data sources into a single dataset. We will then use the scikit-learn library to create a classifier that determines whether a piece of news is fake.
You will learn to:
Create a data frame using data pulled from the News API.
Select the features from the textual data.
Create a classifier to classify the textual data.
Skills
Machine Learning
Natural Language Processing
Prerequisites
Intermediate knowledge of Python
Basic understanding of Scikit-learn
Basic understanding of classification problems
Intermediate knowledge of DataFrames
Technologies
Python
Scikit-learn
Project Description
Let’s start this project with a simple question: do you trust all the news you see on social media? How can we tell fake news from real news? It’s a tough question. Luckily, we can detect fake news using a supervised machine learning method.
Fake news is news that is untrue and deliberately designed to mislead people. It is usually spread via social media or other online platforms, and it is often politically driven, intended to help or harm a political party. Such news items may contain false or exaggerated claims and, because of content-personalization algorithms, can trap users in a filter bubble.
In this project, we’ll use two different datasets:
- A news dataset available on Kaggle.
- A second dataset that we’ll create ourselves using the News API. We will use this API to load some news articles and then append them to the Kaggle dataset.
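As a rough illustration of the second step, the sketch below flattens a News API-style JSON response into rows that could be appended to another dataset. The `articles`, `source`, `title`, `description`, and `content` fields follow the response shape the News API documents. The function name `articles_to_rows`, the output column names, and the `"REAL"` label are assumptions made for this example, not part of the project.

```python
def articles_to_rows(response, label):
    """Flatten a News API-style response into a list of flat dicts.

    `label` is a hypothetical column value; the project does not
    specify how API articles are labeled.
    """
    rows = []
    for article in response.get("articles", []):
        source = article.get("source") or {}
        rows.append({
            "source": source.get("name"),
            "title": article.get("title"),
            # Fall back to the description when no content is returned.
            "text": article.get("content") or article.get("description") or "",
            "label": label,
        })
    return rows


# Hand-written sample that mimics the shape of a News API response.
sample_response = {
    "status": "ok",
    "totalResults": 2,
    "articles": [
        {
            "source": {"id": None, "name": "Example News"},
            "title": "First headline",
            "description": "Short summary one.",
            "content": "Full text of the first article.",
        },
        {
            "source": {"id": None, "name": "Another Outlet"},
            "title": "Second headline",
            "description": "Short summary two.",
            "content": None,
        },
    ],
}

rows = articles_to_rows(sample_response, label="REAL")
for row in rows:
    print(row["source"], "|", row["title"], "|", row["text"])
```

A list of dicts like `rows` can be passed to `pandas.DataFrame(rows)` and combined with another DataFrame using `pandas.concat`, provided the column names are aligned first.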
Finally, we will use a passive-aggressive classifier to separate the fake news from the real news. A passive-aggressive classifier is an online learning algorithm that processes examples one at a time. When it predicts an example correctly, it leaves the model unchanged (the passive part). When it predicts an example incorrectly, it updates the model to correct the mistake (the aggressive part).
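The update behavior described above can be sketched in a few lines of plain Python. This is a minimal version of the basic passive-aggressive rule for labels in {-1, +1}, written for illustration only; it is not scikit-learn's implementation, which among other things adds a regularization parameter `C`. The helper names `dot` and `pa_update` are made up for this example. Note that this rule treats a prediction as "correct" only when its margin is at least 1.

```python
def dot(w, x):
    """Dot product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pa_update(w, x, y):
    """Apply one passive-aggressive step and return the new weights.

    w: current weight vector, x: feature vector, y: label (-1 or +1).
    """
    loss = max(0.0, 1.0 - y * dot(w, x))  # hinge loss on this example
    if loss == 0.0:
        return w  # passive: margin is large enough, keep the model
    norm_sq = dot(x, x)
    if norm_sq == 0.0:
        return w  # an all-zero example carries no information
    tau = loss / norm_sq  # smallest step that removes the loss
    # Aggressive: move the weights just enough to fix this example.
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


w0 = [0.0, 0.0]
w1 = pa_update(w0, [1.0, 1.0], +1)  # wrong (margin 0), so the model changes
w1_again = pa_update(w1, [1.0, 1.0], +1)  # now correct, so nothing changes
w2 = pa_update(w1, [1.0, 0.0], -1)  # wrong again, so the model changes
print(w1, w1_again, w2)
```

In this run, the first call moves the weights, the repeated call on the same example leaves them alone, and the third call moves them again.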
Project Tasks
1
News API
Task 1: Import the Necessary Modules
Task 2: Create a Get News Method
Task 3: Get News Sources
Task 4: Get News Using Multiple Sources
Task 5: Create a DataFrame of News
2
Scikit-learn
Task 6: Load and Concat the DataFrame
Task 7: Import the scikit-learn Modules
Task 8: Split the Training and Testing Data
Task 9: Feature Selection
Task 10: Initialize and Apply the Classifier
Task 11: Test the Classifier
Task 12: Load the Test Data
Task 13: Select Features and Get Predictions
Task 14: Evaluate the Predictions
Congratulations