|Original author(s)||Matei Zaharia|
|Initial release||May 26, 2014|
|Stable release||3.0.0 / June 18, 2020|
|Operating system||Microsoft Windows, macOS, Linux|
|Available in||Scala, Java, SQL, Python, R|
|Type||Data analytics, machine learning algorithms|
|License||Apache License 2.0|
Apache Spark is an open-source distributed general-purpose cluster-computing framework. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since.