
Transforming data with Apache Spark

June 3, 2019

Via: CIO

Apache Spark is a fast data processing framework built for big data. It processes large datasets in a distributed manner across a cluster of machines (cluster computing). Popular for several years now, the framework is often presented as a replacement for Hadoop's MapReduce engine. Its main advantages are speed, ease of use, and versatility.

Spark is open source and enables large-scale analysis across clusters of machines. Written primarily in Scala, it can process data from sources such as the Hadoop Distributed File System (HDFS), NoSQL databases, or SQL-queryable data stores such as Apache Hive.
