The Reality is More Complicated

Learn how YouTube can use different techniques to effectively deliver content to the end-user.

Introduction

Now that we understand the design reasonably well, let us see how YouTube can optimize its storage and network usage while maintaining a good Quality of Experience (QoE) for the end-user. QoE is a measure of how satisfied a customer is with their experience of a particular service.

When talking about providing effective service to the end-users, the following three steps are important:

  1. Encode: The raw videos uploaded to YouTube have significant storage requirements. Various encoding schemes can reduce the size of these raw video files. Apart from its compression capability, the choice of encoding scheme also depends on the types of end devices used to stream the video content. Since multiple devices could stream the same video, we may have to encode it using different encoding schemes, so that one raw video file is converted into multiple files, each encoded differently. This strategy results in a good user-perceived experience for two reasons: 1) users save bandwidth because the encoded video file is compressed, and 2) the encoded video file suits the client, which allows smooth playback.
  2. Deploy: For low latency, content must be intelligently deployed such that it is closer to a large number of end-users. Not only will this reduce latency, but it will also put less burden on the networks as well as YouTube’s core servers.
  3. Deliver: Delivering to the client requires knowledge about the client/device used for playing the video. This knowledge helps us adapt to the client and network conditions, and thus serve content efficiently.
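
To make the delivery step concrete, here is a minimal sketch of how a player might pick a version of a video based on the client and network conditions. The `Rendition` type, the `LADDER` values, the `pick_rendition` function, and the `safety` margin are all hypothetical names and numbers chosen for illustration; they are not YouTube's actual logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # average bitrate needed to stream this version


# Hypothetical set of encoded versions of one video (illustrative values only).
LADDER = [
    Rendition(240, 400),
    Rendition(480, 1000),
    Rendition(720, 2500),
    Rendition(1080, 5000),
]


def pick_rendition(bandwidth_kbps: float, max_device_height: int,
                   safety: float = 0.8) -> Rendition:
    """Pick the best version that fits both the device and the network.

    Only a fraction (`safety`) of the measured bandwidth is used, an assumed
    margin that leaves headroom for fluctuations in network conditions.
    """
    budget = bandwidth_kbps * safety
    candidates = [
        r for r in LADDER
        if r.height <= max_device_height and r.bitrate_kbps <= budget
    ]
    if not candidates:
        # Nothing fits: fall back to the cheapest version rather than stalling.
        return min(LADDER, key=lambda r: r.bitrate_kbps)
    return max(candidates, key=lambda r: r.bitrate_kbps)


if __name__ == "__main__":
    print(pick_rendition(bandwidth_kbps=4000, max_device_height=1080))
    print(pick_rendition(bandwidth_kbps=10000, max_device_height=480))
```

The same video is served differently to a phone on a slow connection and to a large screen on a fast one, which is why the encode step produces several files per video.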

Let’s understand each phase in detail now.

Encode

Until now, we have considered encoding a whole video with different encoding schemes. However, what if we encode videos on a per-shot basis? This means dividing the video into shorter time frames, which we refer to as segments, and encoding them individually. Each segment is encoded using multiple encoding schemes to generate different files called chunks. The choice of encoding scheme for a segment is based on the level of detail within the segment, so that we get optimized quality with lower storage requirements. Eventually, each segment is encoded into chunks of different sizes, depending on the segment's content and the encoding scheme used. Dividing the raw video into segments also brings advantages during the deployment and delivery phases.
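
The relationship between a video, its segments, and its chunks can be sketched as follows. This is only an illustration under simplifying assumptions: segments are cut at a fixed length rather than at real shot boundaries, and the function names, the scheme labels, and the chunk naming pattern are all hypothetical.

```python
def split_into_segments(duration_s: float, segment_s: float) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds covering the whole video.

    Assumption: fixed-length segments. A real per-shot encoder would cut at
    detected shot boundaries instead.
    """
    if duration_s <= 0 or segment_s <= 0:
        raise ValueError("duration_s and segment_s must be positive")
    segments = []
    start = 0.0
    while start < duration_s:
        end = min(start + segment_s, float(duration_s))
        segments.append((start, end))
        start = end
    return segments


def chunk_names(video_id: str, segments: list[tuple[float, float]],
                schemes: list[str]) -> list[str]:
    """One chunk per (segment, encoding scheme) pair."""
    return [
        f"{video_id}_seg{index:04d}_{scheme}"
        for index, _ in enumerate(segments)
        for scheme in schemes
    ]


if __name__ == "__main__":
    segments = split_into_segments(duration_s=10, segment_s=4)
    for name in chunk_names("vid", segments, ["h264_720p", "vp9_720p"]):
        print(name)
```

A 10-second video cut into 4-second segments yields three segments, and with two encoding schemes that gives six chunks that can be stored, deployed, and delivered independently.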

Let us understand how per-segment encoding works. A segment with dynamic colors and high depth is encoded differently from a segment with fewer colors. This means that a not-so-dynamic segment is compressed more to save additional storage space. As a result, we transfer smaller files and consequently save bandwidth during the deployment and streaming phases.
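
A small worked example may help show where the savings come from. The sketch below assumes each segment has a complexity score between 0 and 1 (how that score is measured is left out), and maps it linearly to a bitrate. The function names, the score, the linear mapping, and the bitrate bounds are illustrative assumptions, not a description of a real encoder.

```python
def bitrate_for_segment(complexity: float, min_kbps: int = 500,
                        max_kbps: int = 4000) -> int:
    """Map a complexity score in [0, 1] to a bitrate (assumed linear mapping)."""
    if not 0.0 <= complexity <= 1.0:
        raise ValueError("complexity must be between 0 and 1")
    return round(min_kbps + complexity * (max_kbps - min_kbps))


def chunk_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Size in megabytes: kilobits -> kilobytes (/8) -> megabytes (/1000)."""
    return bitrate_kbps * duration_s / 8 / 1000


if __name__ == "__main__":
    # Three 10-second segments: static, highly dynamic, mostly static.
    complexities = [0.1, 0.9, 0.2]
    duration_s = 10

    per_segment = sum(
        chunk_size_mb(bitrate_for_segment(c), duration_s) for c in complexities
    )
    # Baseline: every segment encoded at the maximum bitrate.
    fixed = sum(chunk_size_mb(4000, duration_s) for _ in complexities)

    print(f"per-segment encoding: {per_segment:.3f} MB")
    print(f"fixed-bitrate encoding: {fixed:.3f} MB")
```

With these made-up numbers, the per-segment approach stores about 7.1 MB instead of 15 MB, because the two mostly static segments are compressed far more than the dynamic one.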
