Introduction

Learn about the course, its prerequisites, and learning objectives.

What is the Microsoft Computer Vision API?

Microsoft Cognitive Services provides a cloud-based computer vision API called the Microsoft Computer Vision API. It gives access to advanced algorithms that process images and return information about them. Users do not need any prior expertise in machine learning. Instead, they upload an image or video, or provide a URL, and the service analyzes it according to the options they choose.

The API offers several features, such as extracting text from images, describing an image in human-readable language, moderating the content of an image, and more.
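
To make the two input styles concrete (uploading image data versus providing a URL), here is a minimal sketch of how such a request might be assembled in Python. It assumes the v3.2 REST route `/vision/v3.2/analyze` and the `Ocp-Apim-Subscription-Key` header; the helper name `build_analyze_request`, the endpoint, and the key are hypothetical placeholders, and your resource may use a different API version. The sketch only builds the request and does not send it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_analyze_request(endpoint, key, image_url=None, image_bytes=None,
                          features=("Description",)):
    """Build (but do not send) an image analysis request.

    Hypothetical helper for illustration. Assumes the v3.2 REST route;
    `endpoint` and `key` come from your own Azure resource.
    """
    # Exactly one input style is allowed: a public URL or raw image bytes.
    if (image_url is None) == (image_bytes is None):
        raise ValueError("Provide exactly one of image_url or image_bytes")

    # The requested features are passed as a comma-separated query value.
    query = urlencode({"visualFeatures": ",".join(features)}, safe=",")
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?{query}"

    headers = {"Ocp-Apim-Subscription-Key": key}
    if image_url is not None:
        # URL input: a small JSON body pointing at the image.
        headers["Content-Type"] = "application/json"
        body = json.dumps({"url": image_url}).encode("utf-8")
    else:
        # Upload input: the image bytes themselves form the body.
        headers["Content-Type"] = "application/octet-stream"
        body = image_bytes

    return Request(url, data=body, headers=headers, method="POST")


# Placeholder values; replace with your own endpoint and key.
request = build_analyze_request(
    "https://example.cognitiveservices.azure.com/",
    "YOUR_KEY",
    image_url="https://example.com/cat.jpg",
    features=("Description", "Tags"),
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request; with a valid endpoint and key, the service is expected to reply with JSON describing the image.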

Prerequisites

To get the most out of this course, you’ll need basic knowledge of the Python programming language. However, you don’t need to have any prior knowledge of computer vision or deep learning.

Learning objectives

In this course, we’ll learn how to use the Microsoft Computer Vision service. Specifically, we’ll cover the following topics:

  1. Optical Character Recognition (OCR): This feature helps extract handwritten and printed text from inputs like images and documents. These inputs can have multiple writing styles and languages.
  2. Image Analysis: This feature extracts various kinds of visual information from images. It can be used for several purposes, such as detecting elements like human faces and brand logos or checking whether an image contains inappropriate content. It can also categorize objects into more than 10,000 categories, which helps make visual assets easier to organize and search.