High-level Overview of the spaCy Library

Let's get an overview of the spaCy library and how it compares to other libraries.

What is spaCy?

spaCy is an open-source Python library for modern natural language processing (NLP). Its creators describe it as industrial-strength NLP. spaCy supports 60+ languages and ships with pre-trained language models and word vectors for many of them.
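To make this concrete, here is a minimal sketch of what using the library looks like. It assumes spaCy is installed (for example with `pip install spacy`) and uses `spacy.blank("en")`, which builds a tokenizer-only English pipeline, so no pre-trained model needs to be downloaded. The sample sentences are made up for illustration.

```python
import spacy

# Build a blank English pipeline: tokenizer only, no pre-trained
# components, so nothing has to be downloaded for this sketch.
nlp = spacy.blank("en")

# Calling the pipeline on a string returns a Doc, a sequence of Token objects.
doc = nlp("spaCy processes text quickly.")
tokens = [token.text for token in doc]
print(tokens)

# nlp.pipe streams many texts through the pipeline in batches, which is
# the usual way to process larger collections of documents.
texts = ["First sample text.", "Second sample text."]  # illustrative data
docs = list(nlp.pipe(texts))
print(len(docs))
```

A pre-trained pipeline would be loaded with `spacy.load` and a model name instead of `spacy.blank`, but that requires downloading the model first.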

Unlike its more academic predecessors, spaCy focuses on production use and shipping code. The best-known and most frequently used Python predecessor is NLTK. NLTK's main goal was to introduce students and researchers to language processing. It never made claims about efficiency, model accuracy, or being an industrial-strength library. spaCy, by contrast, has aimed to provide production-ready code from day one. You can expect its models to perform well on real-world data, its code to be efficient, and the library to process large amounts of text in a reasonable time. The following table is an efficiency comparison from the spaCy documentation.
