Named Entity Recognition Using Transformer Architecture

In this project, we’ll learn how to apply transformers to named entity recognition (NER), a token classification task that labels each token in an input text with the type of entity it belongs to, if any.
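To make the idea of token classification concrete, here is a minimal sketch of how per-token tags can be grouped back into entities. The sentence, the tags, and the `group_entities` helper are hypothetical, and the BIO tagging scheme (B- begins an entity, I- continues it, O is outside) is an assumption rather than a confirmed detail of the project's dataset.

```python
# Hypothetical sentence and tags, for illustration only.
tokens = ["Ada", "Lovelace", "worked", "in", "London", "."]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "O"]


def group_entities(tokens, tags):
    """Collect (entity_text, entity_type) pairs from BIO-tagged tokens."""
    entities = []
    current_words, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity, closing any open one.
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type continues the open entity.
            current_words.append(token)
        else:
            # "O" (or an inconsistent I- tag) closes any open entity.
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [], None
    if current_words:
        entities.append((" ".join(current_words), current_type))
    return entities


print(group_entities(tokens, tags))
# [('Ada Lovelace', 'PER'), ('London', 'LOC')]
```

A model for NER predicts one tag per token; a grouping step like this one turns those predictions into the entities a reader actually cares about.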

You will learn to:

Understand the transformer architecture for NLP.

Preprocess the data for named entity recognition (NER).

Implement a transformer-based model for NER.

Fine-tune the model for better accuracy.

Skills

Transformer Models

Deep Learning

Text Preprocessing

Natural Language Processing

Prerequisites

Basic knowledge of Python programming

Basic knowledge of machine learning

Basic knowledge of natural language processing

Basic knowledge of the transformer architecture

Technologies

Python

PyTorch

Hugging Face

Project Description

In this project, we’ll develop a transformer model to recognize named entities such as persons, places, and organizations in the given input text.

We’ll work with the following Python libraries: NumPy, pandas, Matplotlib, PyTorch, and Hugging Face Transformers. First, we’ll preprocess a text dataset to make it suitable for training a transformer model. Next, we’ll build and train a small transformer model on the preprocessed dataset. Finally, we’ll pick a state-of-the-art pretrained transformer model from the Hugging Face Hub and fine-tune it for named entity recognition. The input to the project is a labeled text dataset, and the expected output is an NLP model that can accurately identify entities in a given text.
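One preprocessing step that often needs care is keeping word-level labels in sync with subword tokens. The sketch below shows one common convention in plain Python: label only the first subword of each word and mask everything else. The `align_labels` helper and the sample data are hypothetical; the `word_ids` list stands in for what a Hugging Face fast tokenizer returns from `word_ids()`, and `-100` is used because it is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def align_labels(word_ids, word_labels):
    """Expand word-level labels to token level.

    word_ids maps each token to the index of the word it came from,
    with None for special tokens such as [CLS] and [SEP].
    """
    aligned = []
    previous_word = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(IGNORE_INDEX)  # special token: no label
        elif word_id != previous_word:
            aligned.append(word_labels[word_id])  # first subword of a word
        else:
            aligned.append(IGNORE_INDEX)  # later subword: masked out
        previous_word = word_id
    return aligned


# Hypothetical input: four words, where the second word was split
# into two subword tokens by the tokenizer.
word_ids = [None, 0, 1, 1, 2, 3, None]
word_labels = [1, 2, 0, 3]

print(align_labels(word_ids, word_labels))
# [-100, 1, 2, -100, 0, 3, -100]
```

Masking later subwords is a design choice, not the only option; another convention repeats the word's label on every subword so that each token contributes to the loss.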

Project Tasks

Introduction

Task 0: Get Started

Task 1: Import Libraries

Data Preprocessing

Task 2: Load the Dataset

Task 3: Explore the Dataset

Task 4: Construct Validation Dataset

Task 5: Sub-Sample the Data

Task 6: Tokenize the Datasets

Task 7: Align Tokens with Labels

Task 8: Use Aligned Tokens and Labels to Map the Dataset

Building and Configuring the Transformer Model

Task 9: Build a Transformer Model

Task 10: Set Up the Model

Training and Evaluating the Model

Task 11: Train the Model Using a PyTorch Training Loop

Task 12: Evaluate the Model's Performance

Fine-Tuning the Model for Performance

Task 13: Initialize a Hugging Face Model for Token Classification

Task 14: Use a Data Collator for Token Classification

Task 15: Define a Custom Function to Compute Evaluation Metrics

Task 16: Set Up Training Arguments

Task 17: Fine-Tune, Evaluate, and Save the Trained Model

Task 18: Test the Model Performance on a Test Dataset

Congratulations!
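As a rough illustration of the kind of metrics function Task 15 refers to, the following pure-Python sketch computes entity-level precision, recall, and F1 for a single sentence. It assumes BIO tags and counts a predicted entity as correct only when its boundaries and type both match exactly. The helper names are hypothetical, and the project itself may rely on a dedicated evaluation library instead.

```python
def extract_spans(tags):
    """Return a set of (start, end, type) spans from a BIO tag sequence."""
    spans = set()
    start, span_type = None, None
    # A trailing "O" sentinel closes a span that runs to the last tag.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and span_type == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, span_type))
            start, span_type = None, None
        if tag.startswith("B-"):
            start, span_type = i, tag[2:]
    return spans


def entity_scores(true_tags, pred_tags):
    """Entity-level precision, recall, and F1 with exact span matching."""
    true_spans = extract_spans(true_tags)
    pred_spans = extract_spans(pred_tags)
    correct = len(true_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(true_spans) if true_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical gold and predicted tags for one sentence.
true_tags = ["B-PER", "I-PER", "O", "B-LOC"]
pred_tags = ["B-PER", "I-PER", "O", "O"]

scores = entity_scores(true_tags, pred_tags)
print(scores)
```

Here the model finds the person but misses the location, so precision is 1.0 while recall is 0.5. Scoring whole entities rather than individual tokens keeps the large number of easy "O" tokens from inflating the result.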
