Map Reduce and Hadoop

Hadoop Behind the Scenes

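  1. Hadoop tries to run each map task on the same machine that stores its input data.
  2. If a machine dies, its tasks are re-run automatically on another machine.
  3. If a particular key-value pair keeps crashing a task, that record can be skipped.
  4. If a map task is slow, backup copies of it are started; the first copy to finish wins.
  5. A combiner (mini-reduce) can run on the same machine as a map, shrinking the map output before it is sent to the reducers.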
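
To make point 5 concrete, here is a minimal sketch in plain Python (not the Hadoop API) of a word count with and without a combiner. The function names map_words, combine, and reduce_counts, and the two sample input splits, are assumptions made up for this illustration.

```python
from collections import Counter, defaultdict

def map_words(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Combiner: sum counts locally, before anything leaves the mapper."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return list(local.items())

def reduce_counts(all_pairs):
    """Reduce step: sum the counts for each word across all mappers."""
    totals = defaultdict(int)
    for word, count in all_pairs:
        totals[word] += count
    return dict(totals)

# Hypothetical data: each string stands in for the split stored on one machine.
splits = ["a b a a", "b b a"]

raw = [map_words(s) for s in splits]       # map output per machine
combined = [combine(p) for p in raw]       # after the local mini-reduce

# Number of pairs that would have to cross the network to the reducers.
pairs_without_combiner = sum(len(p) for p in raw)
pairs_with_combiner = sum(len(p) for p in combined)

totals = reduce_counts(pair for part in combined for pair in part)
print(pairs_without_combiner, pairs_with_combiner, totals)
```

In this toy run the combiner cuts the pairs shipped to the reducers from 7 to 4 while the final counts stay the same. This works because addition is associative and commutative; a reduce function without those properties cannot safely be reused as a combiner.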

José M. Vidal .
