apache-spark (664)

  1. bigdata kafka 2018 - Apache Spark vs. Apache Storm
  2. hadoop vs 2018 - What is the difference between Apache Spark and Apache Flink?
  3. scala org.apache.spark.sparkexception: sparkexception: - Task not serializable: java.io.NotSerializableException when calling function outside closure only on classes not objects
  4. apache-spark spark example - What is the difference between cache and persist?
  5. apache-spark was introduced - Difference between DataFrame (in Spark 2.0, i.e. Dataset[Row]) and RDD in Spark
  6. apache-spark java examples - What is the difference between map and flatMap and a good use case for each?
  7. out-of-memory local scala - Spark java.lang.OutOfMemoryError: Java heap space
  8. apache-spark vs setup - What are workers, executors, cores in Spark Standalone cluster?
  9. hadoop total-executor-cores how - Apache Spark: The number of cores vs. the number of executors
  10. apache-spark spark dataframe - How to read multiple text files into a single RDD?
  11. apache-spark machine learning - Spark performance for Scala vs Python
  12. scala spark vs - (Why) do we need to call cache or persist on an RDD
  13. apache-spark dataframe coalesce(1) - Spark - repartition() vs coalesce()
  14. apache-spark spark-submit set - How to stop INFO messages displaying on Spark console?
  15. scala pyspark cast - How to change column types in Spark SQL's DataFrame?
  16. scala spark dataframe - How to store custom objects in Dataset?
  17. python spark-submit set - How to turn off INFO logging in Spark?
  18. scala spark python - How to print the contents of RDD?
  19. java pyspark --properties-file - Add jars to a Spark Job - spark-submit
  20. scala java pyspark - How to convert an RDD object to a DataFrame in Spark