Generate Summary of Videos using Python

In this project, we’ll learn how to get transcripts of YouTube videos, tokenize them, and generate a summary using the Natural Language Toolkit (NLTK).

You will learn to:

Get transcripts of YouTube videos using Python.

Build a Python script to interact with human language data.

Tokenize text, that is, split a string into a list of tokens.

Generate a summary of the text using the Natural Language Toolkit.


Machine Learning

Natural Language Processing



Intermediate knowledge of Python.

Intermediate knowledge of Natural Language Processing.

Basic understanding of spaCy models.

Intermediate knowledge of sentence tokenizers.






Project Description

In this project, we’ll develop a YouTube video transcript summarizer that automatically extracts video transcripts from YouTube and generates concise summaries using the Natural Language Toolkit (NLTK) and sentence tokenization techniques.

To accomplish this, we’ll use the YouTube API to look up the video from a provided URL or video ID, and then retrieve the video’s transcript as text.
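As a rough illustration of the "URL or video ID" step, the sketch below pulls the ID out of a few common YouTube URL shapes using only the standard library. The function name `get_video_id` and the sample ID `abc123XYZ_0` are made up for this example, and the set of URL shapes handled is an assumption rather than a complete list. Fetching the transcript itself depends on whichever client library the project uses, so it is not shown here.

```python
from urllib.parse import urlparse, parse_qs


def get_video_id(url_or_id: str) -> str:
    """Return the video ID from a YouTube URL, or the input itself if it looks like a bare ID."""
    parsed = urlparse(url_or_id)
    host = (parsed.hostname or "").lower()

    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/")

    if host.endswith("youtube.com"):
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter
            return parse_qs(parsed.query)["v"][0]
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0]

    # No recognised YouTube host: assume the caller passed a bare ID
    return url_or_id


print(get_video_id("https://www.youtube.com/watch?v=abc123XYZ_0"))
print(get_video_id("https://youtu.be/abc123XYZ_0?t=10"))
print(get_video_id("abc123XYZ_0"))
```

Accepting either form in one function keeps the rest of the script working with a plain ID, regardless of what the user pasted in.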

With the video transcript in hand, we’ll use NLTK, a widely used natural language processing library, to tokenize the transcript into individual sentences. Sentence tokenization breaks the transcript into smaller units that can each be analyzed and scored on their own.
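Sentence tokenization can be sketched as follows. The helper name `split_sentences` and the sample text are invented for illustration. The sketch calls NLTK's `sent_tokenize` when NLTK and its tokenizer data are available, and otherwise falls back to a simple regular expression, which is a much cruder stand-in and not what NLTK does internally.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split text into sentences, preferring NLTK and falling back to a regex."""
    try:
        from nltk.tokenize import sent_tokenize
        return sent_tokenize(text)
    except (ImportError, LookupError):
        # ImportError: NLTK is not installed.
        # LookupError: NLTK is installed but its tokenizer data is not downloaded.
        # Fallback: split on whitespace that follows ., ! or ?
        parts = re.split(r"(?<=[.!?])\s+", text.strip())
        return [p for p in parts if p]


transcript = "Welcome to the channel. Today we look at tokenizers. Are you ready?"
for sentence in split_sentences(transcript):
    print(sentence)
```

The regex fallback will mis-split abbreviations such as "Dr." or "e.g.", which is one reason to prefer a trained sentence tokenizer for real transcripts.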

To generate the summary, we’ll apply extractive summarization on top of NLTK’s tokenizers: we’ll count how often each token occurs, normalize those frequencies, score every sentence by the tokens it contains, and keep the highest-scoring sentences as a condensed summary that captures the key points and main ideas of the video.
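A minimal sketch of frequency-based extractive summarization is shown below. It is an illustration under stated assumptions, not the project's actual solution: the function name `summarize`, the tiny `STOP_WORDS` set, and the regex-based tokenization are all stand-ins chosen to keep the example dependency-free, where the project itself would use NLTK's tokenizers.

```python
import re
from collections import Counter
from heapq import nlargest

# A tiny hand-picked stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "it", "we", "on"}


def summarize(text: str, max_sentences: int = 2) -> str:
    """Return the highest-scoring sentences of text, in their original order."""
    # Split into sentences and lower-cased word tokens (regex stand-ins).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]

    # Count token frequencies, ignoring stop words.
    freq = Counter(t for tokens in tokenized for t in tokens if t not in STOP_WORDS)
    if not freq:
        return ""

    # Normalize so the most frequent token has weight 1.0.
    top = max(freq.values())
    weights = {t: c / top for t, c in freq.items()}

    # Score each sentence as the sum of the weights of its tokens.
    scores = {i: sum(weights.get(t, 0.0) for t in tokens)
              for i, tokens in enumerate(tokenized)}

    # Pick the best sentences, then restore document order.
    best = sorted(nlargest(max_sentences, scores, key=scores.get))
    return " ".join(sentences[i] for i in best)


text = (
    "Python is a popular language. "
    "Python libraries make text processing in Python simple. "
    "The weather was nice. "
    "Text processing starts with tokenizing text."
)
print(summarize(text))
```

Because a sentence's score is a plain sum, longer sentences tend to score higher. Dividing by sentence length is a common variation if that bias is unwanted.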

Project Tasks


Video-to-Text Conversion

Task 0: Get Started

Task 1: Import Modules

Task 2: Get the ID of the YouTube Video

Task 3: Get a Transcript of the Video


Text to Summary Conversion

Task 4: Get All Available Sentences

Task 5: Get All Tokens from the Document

Task 6: Calculate the Frequency of Tokens

Task 7: Normalize the Frequency of Tokens

Task 8: Calculate the Score of Sentences

Task 9: Generate the Summary