hadoop 474

  1. Difference between Pig and Hive? Why have both?
  2. What is the difference between Apache Spark and Apache Flink?
  3. Hadoop “Unable to load native-hadoop library for your platform” warning
  4. When to use Hadoop, HBase, Hive and Pig?
  5. Apache Spark: The number of cores vs. the number of executors
  6. Chaining multiple MapReduce jobs in Hadoop
  7. Is there a .NET equivalent to Apache Hadoop?
  8. How does the MapReduce sort algorithm work?
  9. How does Hadoop process records split across block boundaries?
  10. Difference between HBase and Hadoop/HDFS
  11. Name node is in safe mode. Not able to leave
  12. What is the difference between partitioning and bucketing a table in Hive?
  13. How to turn off INFO logging in PySpark?
  14. Large scale data processing Hbase vs Cassandra
  15. What is the purpose of shuffling and sorting phase in the reducer in Map Reduce Programming?
  16. Difference between Hive internal tables and external tables?
  17. Failed to locate the winutils binary in the hadoop binary path
  18. merge output files after reduce phase
  19. When do reduce tasks start in Hadoop?
  20. Integration testing Hive jobs
  21. How to copy file from HDFS to the local file system
  22. Building Hadoop with Eclipse / Maven - Missing artifact jdk.tools:jdk.tools:jar:1.6
  23. How do I output the results of a HiveQL query to CSV?
  24. hadoop No FileSystem for scheme: file
  25. Life without JOINs… understanding, and common practices
  26. Hadoop on OSX “Unable to load realm info from SCDynamicStore”
  27. How does Hive compare to HBase?
  28. HDFS error: could only be replicated to 0 nodes, instead of 1
  29. what's the difference between “hadoop fs” shell commands and “hdfs dfs” shell commands?
  30. Difference between Amazon S3 and S3n in Hadoop
  31. out of Memory Error in Hadoop
  32. How to set variables in HIVE scripts
  33. The way to check a HDFS directory's size?
  34. How to know Hive and Hadoop versions from command prompt?
  35. connect to host localhost port 22: Connection refused
  36. Container is running beyond memory limits
  37. Java vs Python on Hadoop
  38. Scalable Image Storage
  39. How to get/generate the create statement for an existing hive table?
  40. Write to multiple outputs by key Spark - one Spark job
  41. Namenode not getting started
  42. Is there any way to get the column names along with the output when executing a query in Hive?
  43. Where does hadoop mapreduce framework send my System.out.print() statements? (stdout)
  44. PIG how to count a number of rows in alias
  45. How does impala provide faster query response compared to hive
  46. Stop Java Coffee Cup icon from appearing in the Dock on Mac OSX
  47. Spark - load CSV file as DataFrame?
  48. Cascading examples failed to compile?
  49. Why is there no 'hadoop fs -head' shell command?
  50. Is it better to use the mapred or the mapreduce package to create a Hadoop Job?