Hadoop
Introduction
Rise of Big Data
Types of Big Data
Big Data Defined
Big Data vs Data Warehouse
YARN
Introduction
Workflow
Scheduling
Map Reduce
Basics
Mapper
Testing Mapper
Mapper Input
Reducer
Testing Reducer
Testing MapReduce Program
Running MapReduce End to End
Exploring MapReduce Runs
Combiner and Partitioner
Putting it Together
Resiliency
HDFS
Filesystem
The Big Picture
Disk Blocks & HDFS Blocks
Block Replication
Namenode
Datanode
Writing and Reading
High Availability
HDFS in Practice
HDFS in Practice II
Distcp
Spark
Introduction
Architecture
Spark Application Life Cycle
Spark API
Resilient Distributed Datasets (RDDs)
DataFrames
Datasets
An Example
Running Spark Applications
Anatomy of a Spark Application
Execution of a Spark Application
Input & Output Formats
Sequence File: Intro
Sequence File: Reading & Writing
SerDe
Rows vs Columnar Databases
Avro: Intro
Avro: Code Generation
Avro: IDL & RPC
Parquet: Intro
Parquet: Definition Level
Parquet: Repetition Level
Parquet: Reading & Writing
Parquet: Projection Schema & Misc. Tools
Misc
Zookeeper: Intro
Zookeeper: Example
Zookeeper: Practical
Pig: Overview
Summary
Quiz
Quiz 1
Quiz 2
Quiz 3
Quiz 4
Quiz 5
Quiz 6
Reference: Replication
Introduction
Single Leader Replication
Asynchronous VS Synchronous Replication
Followers in Log Replication
Log Replication
Issues in Single Leader Replication
Multi Leader Replication
Issues in Multi Leader Replication
Multi-Leader Topologies
Leaderless Replication
R + W > N
Quorum Variations
Concurrent Writes
Video Streaming Queue a Concurrent Writes Example
Reference: Partitioning
Introduction
Partitioning Schemes
Secondary Indexes
Number of Partitions
Reference: Transactions
Introduction to Transactions
More On Transactions
Isolation Levels
Read Skew and Snapshot Isolation
Concurrent Writes and Lost Updates
Write Skew
Serializability
Two Phase Locking
Serializable Snapshot Isolation
Reference: Issues in Distributed Systems
Introduction
Network
Agreeing On Time
Working with Time issues
Big Data Fundamentals