PROJECT
Named Entity Recognition Using Transformer Architecture
In this project, we’ll learn how to apply transformers to named entity recognition (NER), a token classification task that labels each token in an input text with the type of entity it belongs to, if any.
You will learn to:
Understand the transformer architecture for NLP.
Preprocess the data for named entity recognition (NER).
Implement a transformer-based model for NER.
Fine-tune the model for better accuracy.
Skills
Transformer Models
Deep Learning
Text Preprocessing
Natural Language Processing
Prerequisites
Basic knowledge of Python programming
Basic knowledge of machine learning
Basic knowledge of natural language processing
Basic knowledge of the transformer architecture
Technologies
Python
PyTorch
Hugging Face
Project Description
In this project, we’ll develop a transformer model to recognize named entities such as persons, places, and organizations in the given input text.
We’ll work with the following Python libraries: NumPy, pandas, Matplotlib, PyTorch, and Hugging Face Transformers. First, we’ll preprocess a text dataset to make it suitable for training a transformer model. Next, we’ll build and train a small transformer model on the preprocessed dataset. Finally, we’ll pick a pretrained state-of-the-art transformer model from the Hugging Face Hub and fine-tune it for named entity recognition. The project takes a text dataset as input, and the expected output is an NLP model that can accurately identify entities in a given text.
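The label alignment step in preprocessing is often the least obvious part, so here is a minimal sketch of one common approach. It assumes each token carries the index of the word it came from (the kind of list a Hugging Face fast tokenizer returns from `word_ids()`), with `None` for special tokens. The function name `align_labels`, the label IDs, and the hand-written `word_ids` list are all hypothetical and only for illustration; the project's own implementation may differ.

```python
# Label value that PyTorch's CrossEntropyLoss ignores by default (ignore_index=-100).
IGNORE_INDEX = -100


def align_labels(word_ids, word_labels, label_all_subwords=False):
    """Expand word-level labels to token-level labels.

    word_ids: for each token, the index of its source word, or None for
        special tokens such as [CLS] and [SEP].
    word_labels: one integer label per word.
    label_all_subwords: if False, only the first sub-token of each word keeps
        the word's label and the remaining sub-tokens are ignored in the loss.
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special tokens never contribute to the loss.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # First sub-token of a new word takes the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Continuation sub-token of the same word.
            aligned.append(word_labels[word_id] if label_all_subwords else IGNORE_INDEX)
        previous = word_id
    return aligned


# Hypothetical example: "Merkel" is assumed to split into two sub-tokens.
words = ["Angela", "Merkel", "visited", "Paris"]
word_labels = [1, 2, 0, 5]                   # made-up label IDs, one per word
word_ids = [None, 0, 1, 1, 2, 3, None]       # [CLS] Angela Mer ##kel visited Paris [SEP]

aligned = align_labels(word_ids, word_labels)
print(aligned)  # [-100, 1, 2, -100, 0, 5, -100]
```

Labeling only the first sub-token keeps one prediction per word, which makes word-level evaluation straightforward. Passing `label_all_subwords=True` instead repeats the word's label on every sub-token.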
Project Tasks
1. Introduction
Task 0: Get Started
Task 1: Import Libraries
2. Data Preprocessing
Task 2: Load the Dataset
Task 3: Explore the Dataset
Task 4: Construct Validation Dataset
Task 5: Sub-Sample the Data
Task 6: Tokenize the Datasets
Task 7: Align Tokens with Labels
Task 8: Use Aligned Tokens and Labels to Map the Dataset
3. Building and Configuring the Transformer Model
Task 9: Build a Transformer Model
Task 10: Set Up the Model
4. Training and Evaluating the Model
Task 11: Train the Model Using a PyTorch Training Loop
Task 12: Evaluate the Model's Performance
5. Fine-Tuning the Model for Performance
Task 13: Initialize a Hugging Face Model for Token Classification
Task 14: Use a Data Collator for Token Classification
Task 15: Define a Custom Function to Compute Evaluation Metrics
Task 16: Set Up Training Arguments
Task 17: Fine-Tune, Evaluate, and Save the Trained Model
Task 18: Test the Model Performance on a Test Dataset
Congratulations!
Atabek BEKENOV
Senior Software Engineer
Pradip Pariyar
Senior Software Engineer
Renzo Scriber
Senior Software Engineer
Vasiliki Nikolaidi
Senior Software Engineer
Juan Carlos Valerio Arrieta
Senior Software Engineer