Tiger Analytics Data Engineer Interview Questions

 



By Karthik Kondpak






The Tiger Analytics interview pattern consists of two rounds:

=======================================

1) Online Round
=============

Total: 10 questions

1) Data engineering concepts, theoretical questions (2 questions)

  Ex:- Write about Spark optimization

         How ACID compliance is applied in Spark

2) SQL queries (3 questions)

3) Problem-solving questions in Python/Scala (3 questions)

4) Spark RDDs, DataFrames, and Datasets (2 questions)


In total: 2 + 3 + 3 + 2 = 10 questions


Note:-

======

1) To qualify in the first round, focus more on Spark; you should attempt all the Spark questions.

2) There is no execution environment or compiler in the online round; you just have to write the code in the text box provided.




2nd Round (Technical Round Questions)

==============================================

1) What is the difference between coalesce and repartition?

2) Suppose I have a CSV file and you have to load its data into an RDD. Find the word with the
   maximum number of occurrences. I need only the most frequent word, using RDDs.

3) Why AWS and not Azure or GCP? Why did your organisation choose AWS over the others?

4) Explain the execution order of a SQL query.

5) Tell me how a Spark job runs behind the scenes.

6) Tell me about broadcast join.

7) Tell me about your project use case. (They specifically ask about the exact tables and columns in your project.)

8) Tell me about your day-to-day activities as a data engineer.

9) Tell me about a challenging scenario you faced while building a technical pipeline.

10) What is the cluster size in your project? How many executors are there, and how many CPU cores and how much memory are allocated to each executor?

11) Can we read 50 GB of data into Spark at one time? If it is possible, how? If it is not possible, what is the need for Spark when it can't deal with big data?

12) How do you add new columns in Spark using DataFrames?

13) Find the second-highest number in a given list.

14) Tell me about the hash map data structure.

15) What is the time complexity of merge sort (best, average, worst)? Why do all three have the same time complexity?

16) Explain how a broadcast join works.

17) What is the difference between RANK() and DENSE_RANK()?

18) Find the sum of salary for each department in the given table.
    

        Employee Table
        ====================

        Employee_id    Dept       Salary

        123            IT         2000
        134            Finance    3000
        144            IT         4000
        156            Hr         5000
        184            IT         9000
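
One possible way to answer question 18 is a GROUP BY with SUM. The sketch below is only an illustration, not the interviewer's expected answer: it loads the five rows shown above into an in-memory SQLite database through Python's standard sqlite3 module. The table name employee, the column types, and the helper name salary_by_department are assumptions made for the example.

```python
import sqlite3

# Rows copied from the Employee table shown in question 18.
EMPLOYEES = [
    (123, "IT", 2000),
    (134, "Finance", 3000),
    (144, "IT", 4000),
    (156, "Hr", 5000),
    (184, "IT", 9000),
]


def salary_by_department(rows):
    """Return {dept: total_salary} computed with a GROUP BY query.

    The table name and column types are assumptions for this sketch.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE employee "
            "(employee_id INTEGER, dept TEXT, salary INTEGER)"
        )
        conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT dept, SUM(salary) AS total_salary "
            "FROM employee "
            "GROUP BY dept "
            "ORDER BY dept"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


totals = salary_by_department(EMPLOYEES)
print(totals)  # {'Finance': 3000, 'Hr': 5000, 'IT': 15000}
```

The SELECT statement is the part you would write in the interview text box; the Python around it only exists so the query can be tried locally.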


19) Which is faster, Hive or Spark? When there is Hive, why use Spark?

20) Find the duplicates in the given table.
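
A common way to approach question 20 is GROUP BY on every column with HAVING COUNT(*) > 1. The post does not show the table for this question, so the rows below are hypothetical, as are the table name employee and the helper name find_duplicates. This is a minimal sketch that treats a "duplicate" as a fully repeated row; if the interviewer means a repeated key only, group by that key instead.

```python
import sqlite3

# Hypothetical sample rows: the source does not give a table for question 20.
ROWS = [
    (123, "IT", 2000),
    (134, "Finance", 3000),
    (123, "IT", 2000),       # repeats the first row
    (156, "Hr", 5000),
    (134, "Finance", 3000),  # repeats the second row
    (134, "Finance", 3000),  # and again
]


def find_duplicates(rows):
    """Return (employee_id, dept, salary, occurrences) for repeated rows."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE employee "
            "(employee_id INTEGER, dept TEXT, salary INTEGER)"
        )
        conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT employee_id, dept, salary, COUNT(*) AS occurrences "
            "FROM employee "
            "GROUP BY employee_id, dept, salary "
            "HAVING COUNT(*) > 1 "   # keep only groups seen more than once
            "ORDER BY employee_id"
        )
        return cursor.fetchall()
    finally:
        conn.close()


duplicates = find_duplicates(ROWS)
print(duplicates)  # [(123, 'IT', 2000, 2), (134, 'Finance', 3000, 3)]
```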

