hadoop 474

  1. Difference between Pig and Hive? Why have both?
  2. What is the difference between Apache Spark and Apache Flink?
  3. Hadoop “Unable to load native-hadoop library for your platform” warning
  4. When to use Hadoop, HBase, Hive and Pig?
  5. Apache Spark: The number of cores vs. the number of executors
  6. Chaining multiple MapReduce jobs in Hadoop
  7. Is there a .NET equivalent to Apache Hadoop?
  8. How does the MapReduce sort algorithm work?
  9. How does Hadoop process records split across block boundaries?
  10. Difference between HBase and Hadoop/HDFS
  11. Name node is in safe mode. Not able to leave
  12. How to turn off INFO logging in PySpark?
  13. What is the difference between partitioning and bucketing a table in Hive?
  14. Large scale data processing: HBase vs Cassandra
  15. What is the purpose of the shuffling and sorting phase in the reducer in MapReduce programming?
  16. Failed to locate the winutils binary in the hadoop binary path
  17. Difference between Hive internal tables and external tables?
  18. Merge output files after reduce phase
  19. Integration testing Hive jobs
  20. How to copy file from HDFS to the local file system
  21. When do reduce tasks start in Hadoop?
  22. How do I output the results of a HiveQL query to CSV?
  23. Building Hadoop with Eclipse / Maven - Missing artifact jdk.tools:jdk.tools:jar:1.6
  24. Hadoop: No FileSystem for scheme: file
  25. Life without JOINs… understanding, and common practices
  26. Hadoop on OSX “Unable to load realm info from SCDynamicStore”
  27. How does Hive compare to HBase?
  28. HDFS error: could only be replicated to 0 nodes, instead of 1
  29. What's the difference between “hadoop fs” shell commands and “hdfs dfs” shell commands?
  30. Difference between Amazon S3 and S3n in Hadoop
  31. Out of memory error in Hadoop
  32. How to set variables in Hive scripts
  33. How to check an HDFS directory's size?
  34. How to know Hive and Hadoop versions from command prompt?
  35. connect to host localhost port 22: Connection refused
  36. Container is running beyond memory limits
  37. Java vs Python on Hadoop
  38. How to get/generate the create statement for an existing Hive table?
  39. Scalable Image Storage
  40. Write to multiple outputs by key Spark - one Spark job
  41. Is there any way to get the column names along with the output when executing a query in Hive?
  42. Namenode not getting started
  43. Where does the Hadoop MapReduce framework send my System.out.print() statements? (stdout)
  44. Pig: how to count the number of rows in an alias
  45. How does Impala provide faster query response compared to Hive?
  46. Stop Java Coffee Cup icon from appearing in the Dock on Mac OSX
  47. Spark - load CSV file as DataFrame?
  48. Why is there no 'hadoop fs -head' shell command?
  49. Cascading examples failed to compile?
  50. Is it better to use the mapred or the mapreduce package to create a Hadoop Job?