11:[["$","$La0",null,{"props":{"lessonContent":{"components":[{"type":"MarkdownEditor","content":{"version":"2.0","text":"# Inference\n\nInference is the process of using a trained machine learning model to make predictions. Below are some techniques for scaling inference in a production environment. \n\n\n\n## 1. Imbalanced workload\n- During inference, one common pattern is to split the workload across multiple inference servers. Load balancers use a similar architecture. The component that splits the workload is sometimes called an Aggregator Service.\n\n\n","mdHtml":"

Inference

Inference is the process of using a trained machine learning model to make predictions. Below are some techniques for scaling inference in a production environment.

1. Imbalanced workload

During inference, one common pattern is to split the workload across multiple inference servers. Load balancers use a similar architecture. The component that splits the workload is sometimes called an Aggregator Service.

\n","comp_id":"-vJiS-dTjJ92pj1QQ3WAS","cursorPosition":0},"iteration":0,"hash":0,"saveVersion":38,"children":[{"text":""}],"status":"normal"},{"type":"DrawIOWidget","mode":"edit","content":{"path":"/api/collection/5184083498893312/5582183480688640/page/6343824887513088/image/5985140971536384?page_type=collection_lesson","caption":"Dispatcher diagram","editorImagePath":"/api/collection/5184083498893312/5582183480688640/page/6343824887513088/image/5391570484985856?page_type=collection_lesson","version":2,"height":321,"width":551,"editorGCSImagePath":"educative-us-central1/uc/v5/5184083498893312/collections/5582183480688640/rev-116/pages/6343824887513088/images/5391570484985856-2024-03-21T10:37:11.541031","slidesEnabled":true,"isSlides":false,"slidesCaption":[],"redirectionUrl":"","borderColor":"#ffffff","comp_id":"uULfyXIuMjhRVEizUeQkR","slidesId":null},"iteration":0,"hash":1,"children":[{"text":""}],"status":"normal","contentID":"GCAfaADZK7aC3A-O1NTUm","saveVersion":1},{"type":"MarkdownEditor","mode":"edit","content":{"version":"2.0","text":"1. Clients (upstream process) send requests to the Aggregator Service. If the workload is too high, the Aggregator Service splits the workload and sends it to workers in the worker pool. The Aggregator Service can pick workers in one of the following ways:\n\n a) Workload \n\n b) Round robin \n\n c) Request parameter\n\n\n2. Wait for the responses from the workers. \n\n3. Forward the response to the client.","mdHtml":"

\n
Clients (upstream process) send requests to the ...
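
The worker-selection options named in the lesson source (workload, round robin, request parameter) can be sketched roughly as follows. This is a minimal, single-process illustration rather than the course's implementation: the `Worker` and `AggregatorService` classes, the `in_flight` counter, and the `user_id` routing key are all hypothetical names chosen for the example. For simplicity the sketch routes whole requests and does not split one large request across workers.

```python
import itertools
import zlib
from dataclasses import dataclass


@dataclass
class Worker:
    # Hypothetical stand-in for one inference server in the worker pool.
    name: str
    in_flight: int = 0  # requests currently being processed

    def predict(self, request: dict) -> dict:
        # Placeholder for a real model call.
        return {'worker': self.name, 'request_id': request['id']}


class AggregatorService:
    def __init__(self, workers):
        self.workers = list(workers)
        self._rr = itertools.cycle(self.workers)

    def pick_by_workload(self):
        # a) Workload: choose the worker with the fewest in-flight requests.
        return min(self.workers, key=lambda w: w.in_flight)

    def pick_round_robin(self):
        # b) Round robin: rotate through the pool in a fixed order.
        return next(self._rr)

    def pick_by_parameter(self, request, key='user_id'):
        # c) Request parameter: a stable hash keeps one key on one worker.
        digest = zlib.crc32(str(request[key]).encode('utf-8'))
        return self.workers[digest % len(self.workers)]

    def handle(self, request, strategy='round_robin'):
        if strategy == 'workload':
            worker = self.pick_by_workload()
        elif strategy == 'parameter':
            worker = self.pick_by_parameter(request)
        else:
            worker = self.pick_round_robin()
        worker.in_flight += 1
        try:
            # Steps 2 and 3: wait for the worker, then forward its response.
            return worker.predict(request)
        finally:
            worker.in_flight -= 1


pool = [Worker('w0'), Worker('w1'), Worker('w2')]
service = AggregatorService(pool)
replies = [service.handle({'id': i, 'user_id': 42}) for i in range(4)]
print([r['worker'] for r in replies])  # prints ['w0', 'w1', 'w2', 'w0']
```

Routing by request parameter is useful when a worker keeps per-key state such as a cache, while the workload strategy adapts better when requests vary widely in cost.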

","comp_id":"vA9v71u3hv_yuJCYfj6MR"},"iteration":0,"hash":2,"saveVersion":7,"children":[{"text":""}],"status":"normal"}],"summary":{"description":"Learn common techniques to scale inference in production environments. ","titleUpdated":true,"title":"Inference"},"content":[{"type":"MarkdownEditor","content":{"version":"2.0","text":"# Inference\n\nInference is the process of using a trained machine learning model to make a prediction. Below are some of the techniques to scale inference in the production environment. \n\n\n\n## 1. Imbalance workload\n- During inference, one common pattern is to split workloads onto multiple inference servers. We use similar architecture in Load Balancers. It is also sometimes called an Aggregator Service.\n\n\n","mdHtml":"


","comp_id":"vA9v71u3hv_yuJCYfj6MR"},"iteration":0,"hash":2,"saveVersion":7,"children":[{"text":""}],"status":"normal"}],"darkModeContent":[{"type":"MarkdownEditor","content":{"version":"2.0","text":"# Inference\n\nInference is the process of using a trained machine learning model to make a prediction. Below are some of the techniques to scale inference in the production environment. \n\n\n\n## 1. Imbalance workload\n- During inference, one common pattern is to split workloads onto multiple inference servers. We use similar architecture in Load Balancers. It is also sometimes called an Aggregator Service.\n\n\n","mdHtml":"


","comp_id":"vA9v71u3hv_yuJCYfj6MR"},"iteration":0,"hash":2,"saveVersion":7,"children":[{"text":""}],"status":"normal"}]},"isPreviewLesson":false,"pageType":"collection_lesson","aiCoachVideoUrl":"https://youtu.be/kgl8y9J3O6c","collectionDetailsSSR":{"title":"Machine Learning System Design","summary":"Machine Learning System Design is an important component of any ML interview. The ability to address problems, identify requirements, and discuss tradeoffs helps you stand out among hundreds of other candidates. Readers of this course able to get offers from Snapchat, Facebook, Coupang, Stitchfix and LinkedIn. \n\nThis course will help you understand the state of the practice on model techniques along with best practices in applying ML models in production at scale. Once you finished the course you can learn more use-cases at: http://mlengineer.io/\n\nOnce you're done with the course, you will be able to apply and leverage knowledge from top researchers at tech companies. You will have up to date knowledge in model techniques from hundreds of the latest research and industry papers. There is even a chance that the interviewers will be surprised at the depth of your knowledge.","details":"","clos":["Improve your Machine Learning System Design skills. Apply the best techniques in order to structure and drive your interview. 
"],"toc":{"categories":[{"editMode":false,"pages":[{"is_recovered":false,"editMode":false,"is_preview":true,"title":"Introduction","parentIndex":null,"type":"collection_lesson","id":6570189005520896,"slug":"introduction"},{"is_recovered":false,"editMode":false,"is_preview":true,"title":"Feature Selection and Feature Engineering","parentIndex":null,"type":"collection_lesson","id":5099278691729408,"slug":"feature-selection-and-feature-engineering"},{"is_recovered":false,"editMode":false,"is_preview":true,"title":"Training Pipeline","parentIndex":null,"type":"collection_lesson","id":4556318946361344,"slug":"training-pipeline"},{"is_recovered":false,"editMode":false,"is_preview":false,"page_id":6343824887513088,"title":"Inference","parentIndex":null,"is_lesson":true,"collection_id":5582183480688640,"author_id":5184083498893312,"type":"collection_lesson","id":6343824887513088,"slug":"inference"},{"is_recovered":false,"editMode":false,"title":"Metrics Evaluation","type":"collection_lesson","is_preview":false,"parentIndex":null,"collection_id":5582183480688640,"author_id":5184083498893312,"page_id":6317611737415680,"id":6317611737415680,"slug":"metrics-evaluation"}],"type":"COLLECTION_CATEGORY","id":"t4fjpfgct","title":"Machine Learning Primer","summary":"Get familiar with core machine learning principles, from feature engineering to model deployment."},{"editMode":false,"pages":[{"is_recovered":false,"editMode":false,"is_preview":true,"page_id":6252410509983744,"title":"Problem Statement and Metrics","parentIndex":null,"is_lesson":true,"collection_id":5582183480688640,"author_id":5184083498893312,"type":"collection_lesson","id":6252410509983744,"slug":"problem-statement-and-metrics-g7p515EBD5r"},{"is_recovered":false,"editMode":false,"is_preview":true,"title":"Candidate Generation and Ranking Model","parentIndex":null,"type":"collection_lesson","id":5172148363591680,"slug":"candidate-generation-and-ranking-model"},{"is_recovered":false,"editMode":false,"title":"Video 
Recommendation System Design","is_preview":true,"parentIndex":null,"type":"collection_lesson","id":6229580709363712,"slug":"video-recommendation-system-design"}],"type":"COLLECTION_CATEGORY","id":"8ov50432s","title":"Video Recommendation","summary":"Discover the logic behind developing and optimizing scalable video recommendation systems for enhanced user engagement."},{"editMode":false,"pages":[{"is_recovered":false,"editMode":false,"is_preview":false,"page_id":6418828748652544,"title":"Problem Statement and Metrics","parentIndex":null,"collection_id":5582183480688640,"author_id":5184083498893312,"type":"collection_lesson","id":6418828748652544,"slug":"problem-statement-and-metrics-g7DrxM64mxZ"},{"is_recovered":false,"editMode":false,"is_preview":false,"title":"Feed Ranking Model","parentIndex":null,"is_lesson":true,"type":"collection_lesson","id":5162713008308224,"slug":"feed-ranking-model"},{"is_recovered":false,"editMode":false,"title":"Feed Ranking System Design","is_preview":false,"parentIndex":null,"type":"collection_lesson","id":5899569032855552,"slug":"feed-ranking-system-design"}],"type":"COLLECTION_CATEGORY","id":"wlhd3jyeb","title":"Feed Ranking","summary":"Work your way through optimizing feed ranking with personalized models for enhanced user engagement."},{"editMode":false,"pages":[{"is_recovered":false,"editMode":false,"is_preview":false,"page_id":5181007664775168,"title":"Problem Statement and Metrics","parentIndex":null,"collection_id":5582183480688640,"author_id":5184083498893312,"type":"collection_lesson","id":5181007664775168,"slug":"problem-statement-and-metrics-N8K35PGw4nm"},{"is_recovered":false,"editMode":false,"is_preview":false,"title":"Ad Click Prediction Model","parentIndex":null,"type":"collection_lesson","id":5316646917898240,"slug":"ad-click-prediction-model"},{"is_recovered":false,"editMode":false,"is_preview":false,"title":"Ads Recommendation System 
Design","parentIndex":null,"type":"collection_lesson","id":5286670873133056,"slug":"ads-recommendation-system-design"}],"type":"COLLECTION_CATEGORY","id":"upl3nhc0z","title":"Ad Click Prediction","summary":"Enhance your skills in designing and optimizing ad click prediction models for better ad performance."},{"id":"5uepyieiv","title":"Rental Search Ranking","type":"COLLECTION_CATEGORY","editMode":false,"pages":[{"is_recovered":false,"editMode":false,"title":"Problem Statement and Metrics","type":"collection_lesson","is_preview":false,"parentIndex":null,"collection_id":5582183480688640,"author_id":5184083498893312,"page_id":6741669822070784,"id":6741669822070784,"slug":"problem-statement-and-metrics-7DooyyP35Ej"},{"is_recovered":false,"editMode":false,"title":"Booking Model","is_preview":false,"parentIndex":null,"type":"collection_lesson","id":5595089396039680,"slug":"booking-model"},{"is_recovered":false,"editMode":false,"title":"Rental Search Ranking System Design","is_preview":false,"parentIndex":null,"type":"collection_lesson","id":5110775847845888,"slug":"rental-search-ranking-system-design"}],"summary":"Take a closer look at designing Airbnb's rental search ranking system with a booking prediction model and performance metrics."},{"editMode":false,"pages":[{"is_recovered":false,"editMode":false,"is_preview":false,"page_id":5337643898896384,"title":"Problem Statement and Metrics","parentIndex":null,"collection_id":5582183480688640,"author_id":5184083498893312,"type":"collection_lesson","id":5337643898896384,"slug":"problem-statement-and-metrics"},{"is_recovered":false,"editMode":false,"is_preview":false,"page_id":5329076395442176,"title":"Estimated Delivery Model","parentIndex":null,"collection_id":5582183480688640,"author_id":5184083498893312,"type":"collection_lesson","id":5329076395442176,"slug":"estimated-delivery-model"},{"is_recovered":false,"editMode":false,"title":"Estimate Food Delivery System 
Design","type":"collection_lesson","is_preview":false,"parentIndex":null,"collection_id":5582183480688640,"author_id":5184083498893312,"page_id":5375530661052416,"id":5375530661052416,"slug":"estimate-food-delivery-system-design"}],"type":"COLLECTION_CATEGORY","id":"n02b1z2hm","title":"Estimate Food Delivery Time","summary":"See how it works to design an accurate, scalable food delivery time estimation system."},{"editMode":false,"pages":[{"is_recovered":false,"editMode":false,"is_preview":false,"page_id":6147848060534784,"title":"Summary","parentIndex":null,"collection_id":5582183480688640,"author_id":5184083498893312,"type":"collection_lesson","id":6147848060534784,"slug":"summary"}],"type":"COLLECTION_CATEGORY","id":"elpoes222","title":"Conclusion","summary":"Build on comprehensive insights into designing practical Machine Learning systems for diverse applications."},{"editMode":false,"pages":[],"type":"COLLECTION_ASSESSMENT","id":5339568514007040,"title":"Machine Learning Knowledge","slug":"machine-learning-knowledge","summary":""},{"editMode":false,"pages":[],"type":"COLLECTION_ASSESSMENT","id":4959619479240704,"title":"Machine Learning Model Diagnosis","slug":"machine-learning-model-diagnosis","summary":""}]},"page_titles":{"6280584893038592":null,"6014921392259072":null,"5110775847845888":"Rental Search Ranking System Design","6570189005520896":"Introduction","6637822639865856":null,"4845965987545088":null,"5982141087744000":null,"5971400094384128":null,"6161866727358464":null,"5286670873133056":"Ads Recommendation System Design","4754213377146880":null,"5459970456289280":null,"5014743744512000":null,"6343824887513088":"Inference","5099278691729408":"Feature Selection and Feature Engineering","5375530661052416":"Estimate Food Delivery System Design","5745440704692224":null,"6501376233308160":null,"6315677014032384":null,"5172148363591680":"Candidate Generation and Ranking Model","4556318946361344":"Training Pipeline","5337643898896384":"Problem Statement and 
Metrics","5432841528672256":null,"5162713008308224":"Feed Ranking Model","6317611737415680":"Metrics Evaluation","5316646917898240":"Ad Click Prediction Model","5339568514007040":"Machine Learning Knowledge","6561015318708224":null,"5077937643585536":null,"6013303776608256":null,"6741669822070784":"Problem Statement and Metrics","4959619479240704":"Machine Learning Model Diagnosis","5619548707880960":null,"6048250338476032":null,"6418828748652544":"Problem Statement and Metrics","5161974132899840":null,"5329076395442176":"Estimated Delivery Model","5627005490954240":null,"4959041599045632":null,"4661881166888960":null,"4593621811068928":null,"6311486675222528":null,"6412406698803200":null,"6147848060534784":"Summary","6229580709363712":"Video Recommendation System Design","5595089396039680":"Booking Model","5181007664775168":"Problem Statement and Metrics","6456611240411136":null,"6069742230568960":null,"5764748548243456":null,"5081699562553344":null,"5793771904565248":null,"6056746027057152":null,"6371796843495424":null,"6287273818062848":null,"5899569032855552":"Feed Ranking System Design","5595150775484416":null,"6252410509983744":"Problem Statement and Metrics","5604326553681920":null},"page_tags":{"6280584893038592":"","6014921392259072":"","5110775847845888":"","6570189005520896":"machine learning,feature engineering,system design","6637822639865856":"","4845965987545088":"","5982141087744000":"","5971400094384128":"","6161866727358464":"","5286670873133056":"","4754213377146880":"","5459970456289280":"","6343824887513088":"","5099278691729408":"machine learning,feature engineering,feature selection","5375530661052416":"","5745440704692224":"","6501376233308160":"","6315677014032384":"","5172148363591680":"recommendation,youtube,machine learning","4556318946361344":"machine learning,training 
pipeline","5337643898896384":"","5432841528672256":"","5162713008308224":"","6317611737415680":"","5316646917898240":"","5329076395442176":"","5339568514007040":"","6561015318708224":"","5077937643585536":"","6013303776608256":"","6741669822070784":"","4959619479240704":"","5619548707880960":"","6048250338476032":"","6418828748652544":"","5161974132899840":"","5014743744512000":"","5627005490954240":"","4959041599045632":"","5793771904565248":"","4593621811068928":"","6311486675222528":"","6412406698803200":"","6147848060534784":"","6229580709363712":"recommendation,youtube,machine learning","5595089396039680":"","5181007664775168":"","6456611240411136":"","6069742230568960":"","5764748548243456":"","5081699562553344":"","4661881166888960":"","6056746027057152":"","6371796843495424":"","6287273818062848":"","5899569032855552":"","5595150775484416":"","6252410509983744":"recommendation,youtube,machine learning","5604326553681920":""},"collection_toc_is_enabled":true,"page_count":null,"docker":{"container":{"buildLogUrl":"","imageName":"","file":{},"buildStatusUrl":"","track":false},"envs":[],"jobs":[],"testRunners":[],"version":3,"loaded":true},"discounted_price":39,"cover_image_id":6166861937377280,"cover_image_metadata":"{\"width\":1024,\"height\":512,\"sizeInBytes\":81922,\"name\":\"Machine Learning System Design.png\"}","cover_image_serving_url":"/v2api/collection/5184083498893312/5582183480688640/image/6166861937377280","tags":["Machine learning","System design","Feature engineering","Training pipeline","Feature 
selection"],"intro_video_url":"","intro_video_thumbnail_url":"","aggregated_widget_stats":{"MarkdownEditor":62,"codeExerciseCount":0,"codeRunnableCount":0,"codeSnippetCount":4,"illustrations":39,"MxGraphWidget":1,"HashTable":0,"Quiz":6,"SpoilerEditor":2,"CanvasAnimation":0,"projects":0,"assessments":2,"DrawIOWidget":37,"SlateHTML":4,"cloudlabs":0},"default_themes":{"code_themes":{"Code":"default","Markdown":"default","RunJS":"default","SPA":"default","isForced":{"Code":false,"Markdown":false,"RunJS":false,"SPA":false}}},"api_keys":{"api_keys":[]},"skills":["Machine Learning","System Design"],"testimonials":[{"name":"TV","text":"I really found the quizzes very helpful for testing my ML understanding. Also, the resources shared helped me a lot for revising concepts for my interview preparation. This course will definitely help engineers crack Machine Learning Engineering and Data Science interviews","title":"Senior Data Scientist at Amazon","pic":{},"image_id":""},{"name":"D","text":"I have been using your github repo to prep for my interviews and got an offer with NVIDIA with their data science team. Thanks again for your help!","title":"Data Scientist at NVIDIA","pic":{},"image_id":""},{"name":"K","text":"I really like what you've built, it'll help a lot of engineers.","title":"MLE at Facebook","pic":{},"image_id":""},{"name":"VL","text":"It's well organized and the illustrations are well done. Being able to visualize and walk through the steps in order is really helpful in system design. The hints for quizzes is a nice addition, a hint is given if you get stuck in a real interview, mimicking how a real interview ","title":"DS at Fortune 500","pic":{},"image_id":""},{"name":"Andrew","text":"I just heard back from the recruiter that I passed the Google L5 HC. 
Thank you very much for sharing the resources on GitHub and for the course on educative.io!","title":"Google Machine Learning Engineer, L5","pic":{},"image_id":""},{"name":"B","text":"I got the offer from Intuit. Thanks so much, it would not have been possible without your help.","title":"Senior Machine Learning Engineer, Intuit","pic":{},"image_id":""},{"name":"J","text":"I got Google, Facebook, Apple, Tesla, Cruise offer for Senior ML engineer. I thought the course is super helpful. ","title":"Senior Machine Learning Engineer at Cruise","pic":{},"image_id":""}],"licensing":null,"target_audience":"intermediate","author_id":"5184083498893312","collection_id":"5582183480688640","approval_status":3005,"price":59,"is_private":false,"path_type":"regular","organization_id":null,"is_mini":false,"is_priced":true,"brief_summary":"Gain insights into ML system design, state-of-the-art techniques, and best practices for scalable production. Learn from top researchers and stand out in your next ML interview.","approval_update_time":"2021-02-09T20:24:51.711Z","rating_visibility":true,"update_last_published_on_homepage":true,"show_developed_by":true,"udata_files":[],"CodeThemes":{"Code":"default","Markdown":"default","RunJS":"default","SPA":"default","isForced":{"Code":false,"Markdown":false,"RunJS":false,"SPA":false}},"is_marked_for_deletion":false,"transition_page_title":"","is_redirectable":false,"collection_type":"collection","adaptive_learning_mode":false,"HLOs_to_toc":{},"is_guide":false,"read_time":5400,"allow_logged_out_executions":false,"unique_live_widget_urls":false,"metadata_status":101},"pageSummarySSR":{"title":"Inference","description":"Learn common techniques to scale inference in production 
environments.","discourse_page_url":"https://discuss.educative.io/tag/inference__machine-learning-primer__machine-learning-system-design?open=true&ctag=machine-learning-system-design__khang-pham&cslug=machine-learning-system-design&pslug=inference"},"adaptiveLearningConfigConstantSSR":0,"enableLessonPageLockedBannerV2":true,"allowAllLessonPreview":false,"lockedBannerStatsSSR":{"b2cTrialStats":{"is_b2c_trial_active":true,"b2c_trial_active_duration":7,"b2c_trial_categories":"$a1"},"b2cStatus":100,"learnerTags":"$a2","workStats":1430,"interviewWorksStats":76,"inL2cStarterPack":false,"l2cWorkStats":38,"enableL2cStarterPackPaymentWidget":"true"},"pageTocSSR":"

Inference

","authorId":"5184083498893312","collectionId":"5582183480688640","pageId":"6343824887513088","isCollectionPageLockedCachingEnabled":true,"aceFeatureFlags":{"enableAceEditor":true,"enableAceEditorForAnswers":true},"meta":{"type":["Article","TechArticle"],"title":"Inference","name":"Machine Learning System Design","description":"Learn common techniques to scale inference in production environments.","image":"https://educative.io/api/collection/5184083498893312/5582183480688640/image/6166861937377280.png","isAccessibleForFree":false,"keywords":"$a2","provider":"Educative","publisher":"Educative","id":"courses/machine-learning-system-design/inference","author":"Khang Pham","educationalLevel":"intermediate","noIndex":false,"isForcedNoIndex":false,"noFollow":false,"redirectInfo":{"isDeletedCollectionPageRedirectable":false},"page_titles":"$a3","is_marked_for_deletion":false,"transition_page_title":"","is_redirectable":false,"deleted_course_lesson_redirect":{"author_id":null,"collection_id":null,"page_id":null,"redirect_url_slug":null},"metadata_status":101,"additional_course_alternatives":[]},"requestUrl":"/courses/machine-learning-system-design/inference","requestUrlInfo":{"authorId":"5184083498893312","collectionId":"5582183480688640","pageId":"6343824887513088","courseUrlSlug":"machine-learning-system-design","pageUrlSlug":"inference"},"isExternalContent":false}}],[["$","script",null,{"id":"generate-data","type":"application/ld+json","dangerouslySetInnerHTML":{"__html":"$a4"}}],false,"$undefined"]]