E-MapReduce Service

A Big Data service that uses Apache Hadoop and Spark to process and analyze data


Alibaba Cloud Elastic MapReduce (E-MapReduce) is a big data processing solution for quickly processing huge amounts of data. It runs on Alibaba Cloud Elastic Compute Service (ECS) and is based on open source Apache Hadoop and Apache Spark. E-MapReduce supports big data use cases such as trend analysis, data warehousing, and analysis of continuously streaming data.

E-MapReduce simplifies big data processing, making it easy, fast, scalable, and cost-effective to provision distributed Hadoop clusters and process your data. It helps you streamline your business through better decisions based on real-time analysis of massive data.

Product Details

Alibaba Cloud Elastic MapReduce (E-MapReduce) is a big data processing solution for easy set-up and management of clusters. It lets you launch Hadoop clusters within minutes for massive data processing. This simplifies complex big data processing by running data-intensive tasks for applications in machine learning, data mining, financial analysis, data warehousing, etc.

Alibaba Cloud E-MapReduce is a fully managed service for analyzing data through a user-friendly web interface. It uses Apache Storm, Spark, Hue, Hive, MapReduce, etc. as its backend services. The resources provisioned for a workload are released automatically when the processing task completes, so you pay only for the resources you consume. It also integrates with Alibaba Cloud services such as RAM, ApsaraDB for RDS, and ApsaraDB for Redis, as your requirements dictate. You can also develop and run custom applications on E-MapReduce to suit your business requirements.



  • Lets you select the required ECS model (CPU or memory) and disks, along with the software to deploy automatically
  • Creates an E-MapReduce Hadoop cluster as needed within minutes, and releases the cluster once an offline job is complete
  • Adds nodes dynamically as and when needed
  • Facilitates the provisioning, configuration, and tuning of Hadoop clusters


  • Saves the extra overhead of managing the underlying instances
  • Lets you pay on demand for every instance that you use


  • Lets you scale the number of instances up or down to match your requirements


  • Integrates seamlessly with other Alibaba Cloud products, which can serve as the input source or output destination of the Hadoop/Spark compute engine



  • Quickly provisions as many instances as needed, and then releases those instances once the job is complete
  • Lets you deploy multiple new Hadoop clusters as well as resize existing clusters as required
  • Supports scaling up of Hadoop clusters as and when needed

Flexible Cluster Configuration

  • Allows you to freely select the ECS model and relevant configuration, including CPU, memory, and disks
  • Permits selection of the required number of Master nodes (NameNode and ResourceManager nodes) and Core nodes (DataNode and NodeManager nodes)


  • Provides flexible payment options based on the cluster payment type: Subscription or Pay-As-You-Go
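
The difference between the two payment types comes down to simple arithmetic. The sketch below compares a cluster that runs only for a nightly job against one that is subscribed for a full month. The rates, node count, and function names are hypothetical placeholders chosen for illustration and do not reflect actual Alibaba Cloud pricing.

```python
def pay_as_you_go_cost(nodes: int, hours: float, hourly_rate: float) -> float:
    """Cost of a cluster that is billed only for the hours it runs."""
    return nodes * hours * hourly_rate


def subscription_cost(nodes: int, months: int, monthly_rate: float) -> float:
    """Cost of a cluster that is reserved for whole months."""
    return nodes * months * monthly_rate


# Hypothetical per-node rates, NOT real prices.
HOURLY_RATE = 0.50
MONTHLY_RATE = 200.0

# A 5-node cluster that runs a 2-hour job every night for 30 days.
nightly_job = pay_as_you_go_cost(nodes=5, hours=2 * 30, hourly_rate=HOURLY_RATE)

# The same 5 nodes kept under subscription for one month.
always_on = subscription_cost(nodes=5, months=1, monthly_rate=MONTHLY_RATE)

# Hours per month at which both options cost the same per node.
break_even_hours = MONTHLY_RATE / HOURLY_RATE

print(nightly_job, always_on, break_even_hours)
```

Under these assumed rates, a cluster that is released after each job is cheaper until it runs more than the break-even number of hours per month.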

Support for Multiple Data Stores and Databases

  • Supports object storage through Alibaba Cloud OSS
  • Supports Alibaba Cloud RDS (MySQL), Table Store Service, and ApsaraDB for Redis, depending on your architecture requirements

Integration With Third-Party Tools

  • Supports integration with

    • Frameworks: Apache Spark, MapReduce, Apache Pig
    • Tools: Apache Sqoop, Spark SQL
    • Data Storage: Apache HDFS, HBase
  • Supports machine learning, orchestration of processes, stream processing and graph analytics
  • Supports offline data processing, ad hoc data analysis, live streaming, etc.
  • Ensures efficient processing of massive data while reducing data processing cost and time
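
For readers new to the MapReduce framework listed above, the following is a minimal single-process sketch of its three phases (map, shuffle, reduce) applied to a word count. It uses plain Python only to illustrate the programming model; it is not Hadoop code, and the function names and sample input are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical sample input.
lines = ["big data on Hadoop", "big clusters process big data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)
```

On a real cluster the map and reduce steps run in parallel across Core nodes, and the framework performs the shuffle over the network.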


  • Ensures security through configurable firewall settings for Alibaba Cloud ECS instances
  • Offers security configurations for encrypting data stored and processed using E-MapReduce
  • Lets you isolate service permissions between the primary account and sub-accounts through integration with Alibaba Cloud RAM

Flexible Execution of Jobs

  • Lets you chain jobs (Hive, Pig, Apache Spark, etc.) together and run them efficiently to produce detailed analysis
  • Allows you to schedule regular workloads in an automated manner
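
Connecting jobs amounts to running them in dependency order. The sketch below shows one way to derive such an order with Python's standard `graphlib` module (Python 3.9 or later). The job names and dependency graph are hypothetical, and this is an illustration of the idea rather than the E-MapReduce scheduler or its API.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
dependencies = {
    "hive_clean": {"sqoop_import"},
    "spark_aggregate": {"hive_clean"},
    "pig_report": {"hive_clean"},
    "publish": {"spark_aggregate", "pig_report"},
}


def execution_order(deps):
    """Return the jobs in an order where every job follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Jobs with no dependency on each other, such as `spark_aggregate` and `pig_report` here, may appear in either order and could run concurrently.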