Learn By Example: Hadoop, MapReduce for Big Data Problems
Why is Big Data a Big Deal?
Hadoop Introduction (1:52)
The Big Data Paradigm (14:20)
Serial vs Distributed Computing (8:37)
What is Hadoop? (7:25)
HDFS or the Hadoop Distributed File System (11:00)
MapReduce Introduced (11:39)
YARN or Yet Another Resource Negotiator (4:00)
Installing Hadoop in a Local Environment
Hadoop Install Modes (8:32)
Hadoop Standalone Mode Install (15:46)
Hadoop Pseudo-Distributed Mode Install (11:44)
The MapReduce "Hello World"
The Basic Philosophy Underlying MapReduce (8:49)
MapReduce - Visualized And Explained (9:03)
MapReduce - Digging a little deeper at every step (10:21)
"Hello World" in MapReduce (10:29)
The Mapper (9:48)
The Reducer (7:46)
The Job (12:27)
Run a MapReduce Job
Get comfortable with HDFS (10:58)
Run your first MapReduce Job (14:30)
Juicing your MapReduce - Combiners, Shuffle and Sort and The Streaming API
Parallelize the Reduce Phase - Use the Combiner (14:39)
Not all Reducers are Combiners (14:31)
How Many Mappers and Reducers does your MapReduce Have? (8:23)
Parallelizing reduce using Shuffle And Sort (14:55)
MapReduce is not limited to the Java language - Introducing the Streaming API (5:05)
Python for MapReduce (12:19)
HDFS and Yarn
HDFS - Protecting Against Data Loss Using Replication (15:38)
HDFS - Name Nodes and Why They're Critical (6:54)
HDFS - Checkpointing to Backup Name Node Information (11:16)
Yarn - Basic Components (8:39)
Yarn - Submitting a Job to Yarn (13:16)
Yarn - Plug in Scheduling Policies (14:27)
Yarn - Configure the Scheduler (12:32)
MapReduce Customizations For Finer Grained Control
Setting up your MapReduce to Accept Command Line Arguments (13:47)
The Tool, ToolRunner and GenericOptionsParser (12:35)
Configuring Properties of the Job Object (10:41)
Customizing the Partitioner, Sort Comparator, and Group Comparator (15:16)
The Inverted Index, Custom Data Types for Keys, Bigram Counts and Unit Tests!
The Heart of Search Engines - The Inverted Index (14:47)
Generating the Inverted index using MapReduce (10:31)
Custom Data Types for Keys - The Writable Interface (10:29)
Represent a Bigram using a WritableComparable (13:19)
MapReduce to Count the Bigrams in Input Text (8:32)
Hadoop Project
Test your MapReduce job using MRUnit (13:47)
Input and Output Formats and Customized Partitioning
Introducing the File Input Format (12:48)
Text And Sequence File Formats (10:21)
Data Partitioning using a Custom Partitioner (7:11)
Lecture 47_w8-m4-FormatsAndSortingCustomizedPartition (10:25)
Lecture 48_w8-m5-FormatsAndSortingTotalOrderPartitionerI (10:10)
Lecture 49_w8-m6-FormatsAndSortingSamplingDistributi (9:04)
Lecture 50_w8-m7-FormatsAndSortingSecondarySortP6 (14:34)
Recommendation Systems using Collaborative Filtering
Lecture 51_w7-m1-RecosCollaborativeFilteringIntroduced (7:25)
Lecture 52_w7-m2-RecosImplementingchainedMRsforfriendrecommendationsP2 (17:15)
Lecture 53_w7-m3-RecosMapReduce1implementedP3 (14:50)
Lecture 54_w7-m4-Recos - MapReduce 2 implemented along with the main class - P4 (13:46)
Hadoop as a Database
Lecture 55_w9-m1-HadoopSQLHadoopvsDatabasesP1 (14:08)
Lecture 56_w9-m2-HadoopSQLSelectWhereP2 (15:31)
Lecture 57_w9-m3-HadoopSQLGroupByHavingP3 (14:02)
Lecture 58_w9-m4-HadoopSQLJoinsTheMapInputOutputP5 (14:19)
Lecture 59_w9-m4-HadoopSQLTheReduceInputOutputP6 (13:07)
Lecture 60_w9-m5-HadoopSQLJoinsSortPartitionGroupByP7 (8:49)
Lecture 61_w9-m6-HadoopSQLJoinsPuttingItAllTogetherP8 (13:46)
K-Means Clustering
Lecture 62_w10-m1-KMeansWhatisClusteringP1 (14:04)
Lecture 63_w10-m2-KMeansKMeansClusteringUsingMapReduceP2 (16:33)
Lecture 64_w10-m3-KMeansDoubleVectorAndDistanceMeasurerP3 (13:52)
Lecture 65_w10-m4-KMeansVectorWritableAndClusterCenterP4 (8:26)
Lecture 66_w10-m5-KMeansTheClusteringJobConfigurationP5 (10:49)
Lecture 67_w10-m5-KMeansTheMapperAndReducerP6 (11:23)
Lecture 68_w10-m6-KMeansIterativeMapReduceSetupP7 (3:39)
Setting up a Hadoop Cluster
Lecture 69_w12-m3-1VBoxPD (13:50)
Lecture 70_w12-m2-2EC2 (6:25)
Lecture 71_w12-m1-3Cloudera (13:04)
Appendix
Lecture 72_-VBoxSetup (15:58)
Lecture 73_-PathVariable (8:25)
Lecture 49_w8-m6-FormatsAndSortingSamplingDistributi