Wednesday, April 11, 2018

Most Frequently Asked Hive Interview Questions


1. What is speculative execution, and how does it improve job execution?
2. You have been asked to create a Hive table. What optimization techniques should you apply so that queries against it run faster?
3. How will you decide which column(s) of a table to partition or bucket on?
4. On what basis is bucketing applied to a table?
5. Describe the Hive architecture in brief.
6. Write the Hive Word Count program and describe each step.
7. What are the key differences between partitioning and bucketing?
8. What are the key differences between managed and external tables in Hive? Why do we need an external table? Give a practical example.
9. Let's say my data has 3 partitions, e.g. Bangalore, Pune, and Hyderabad, and each partition holds employee IDs. I want to give read/write/update access on individual partitions to specific people: say the Bangalore partition is accessible only by person A, the Hyderabad partition only by person B, and so on. Person A can't access the Hyderabad partition, and so on. How do we achieve that?
10. Can we use the same column for both partitioning and bucketing? What happens if we do?
11. Write a Hive query to find the duplicate rows in a table.
12. Why do we use the OVER() window function in Hive when the GROUP BY clause offers similar functionality?
13. What is the key difference between the -copyFromLocal and -put commands in Hadoop?
14. An increase in the number of partitions is an overhead for the NameNode, since it has to store the metadata. Given that data volume grows day by day, how do we keep the NameNode from crashing or going down?
15. What is a Hive server?
16. What Hive services are available?
17. Write the ALTER TABLE syntax to change the data type of a column from string to bigint.
18. Write a Unix shell script to remove logs that are more than 5 days old.
19. How does bucketing enhance joins?
20. What are the key differences between Bucket Map Join and Sort Merge Bucket Join (SMB Join)?
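
As a starting point for question 11, one common approach is to group on every column and keep the groups that occur more than once. The sketch below is only an illustration: it runs the query against an in-memory SQLite database from Python as a stand-in for Hive, and the employee table, its columns, and the sample rows are all hypothetical. The GROUP BY ... HAVING COUNT(*) > 1 pattern itself should carry over to HiveQL, but verify it against your own table.

```python
import sqlite3

# SQLite stands in for Hive here; table, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1, "Asha", "Bangalore"),
        (2, "Ravi", "Pune"),
        (1, "Asha", "Bangalore"),  # exact duplicate of the first row
        (3, "Meena", "Hyderabad"),
    ],
)

# Group on every column; any group with more than one row is a duplicate.
duplicate_query = """
    SELECT emp_id, name, city, COUNT(*) AS occurrences
    FROM employee
    GROUP BY emp_id, name, city
    HAVING COUNT(*) > 1
"""
duplicates = conn.execute(duplicate_query).fetchall()
print(duplicates)
```

If only some columns define a duplicate (for example emp_id alone), list just those columns in both the SELECT and the GROUP BY.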
      
