TF Lite Framework
Explore how TensorFlow Lite compresses and converts large deep learning models into a compact FlatBuffers format suited for resource-limited mobile devices. Understand model serialization, deployment methods, and hardware acceleration techniques to run efficient on-device inference without cloud dependency.
DL models trained on huge amounts of data can have millions of trained weights and a size of hundreds of megabytes. Running inference on even a single data point or image can take a few seconds. Therefore, we can’t deploy large DL models directly to resource-constrained mobile devices. TF provides a lightweight framework, TF Lite, that can compress, optimize, and deploy DL models to mobile devices.
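As a minimal sketch of this workflow, the snippet below converts a trained model into the TF Lite format with default optimizations enabled. It assumes TensorFlow 2.x; the small Keras model here is only a placeholder standing in for a large trained network.

```python
import tensorflow as tf

# A tiny placeholder Keras model (hypothetical, for illustration only).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Convert the model to the compact TF Lite (FlatBuffers) format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable default size/latency optimizations
tflite_model = converter.convert()

# The resulting bytes can be written to a .tflite file and bundled with a mobile app.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```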
Let’s explore various model serialization formats and understand model development and deployment using TF Lite.
Model serialization formats
Model serialization is the process of converting an ML model into a format that can be stored in a file or transmitted over a network, which lets us save and share trained DL models. Protocol buffers, GraphDef, and FlatBuffers are all serialization formats that represent data in a compact, efficient, and platform-independent way.
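For example, a trained Keras model can be serialized to TensorFlow's SavedModel format (a protocol-buffer-based directory on disk) and restored later. The sketch below assumes TensorFlow 2.x and uses an untrained placeholder model purely for illustration:

```python
import tensorflow as tf

# A tiny placeholder model (hypothetical, for illustration only).
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

# Serialize the model to the SavedModel format (stored as protocol buffers on disk).
tf.saved_model.save(model, "exported_model")

# The serialized model can be shared and restored elsewhere.
restored = tf.saved_model.load("exported_model")
```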
Protocol buffers
Google and its TF framework use the protocol buffers (Protobuf) file format to store data. Protobuf efficiently compacts the ...