Tiger Analytics Data Engineer Interview Questions
By Karthik Kondpak
The Tiger Analytics interview pattern has two rounds:-
=======================================
1) Online Round
=============
Total 10 Questions
1) Theoretical questions on data engineering concepts (2 such questions)
Ex:- Write about Spark optimization.
How is ACID compliance applied in Spark?
2) SQL queries (3 such questions)
3) Problem-solving questions in Python/Scala (3 such questions)
4) Questions on Spark RDDs, DataFrames, and Datasets (2 such questions)
In total = 2 + 3 + 3 + 2 = 10 questions
Note:-
======
1) To qualify in the first round, focus more on Spark; you should attempt all the Spark questions.
2) There is no execution environment or compiler in the online round; you just have to write the code in the text box provided.
2nd Round (Technical Round Questions)
==============================================
1) What is the difference between coalesce and repartition?
2) Let's say I have a CSV file and you have to load the data into an RDD. Now you have to find only
the word with the maximum number of occurrences. I need only the most frequent word, using RDDs.
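For question 2, the usual shape of the answer is split, pair, reduce by key, then take the maximum. The sketch below walks through those four steps in plain Python so it can be run without a cluster; it is not Spark code. In PySpark the corresponding RDD calls would be `flatMap`, `map`, `reduceByKey`, and `max(key=...)` on an RDD loaded with `sc.textFile`. Treating commas and whitespace as separators and lowercasing the words are assumptions made for illustration, as are the sample lines.

```python
from functools import reduce


def most_frequent_word(lines):
    """Return (word, count) for the most frequent word in an iterable of text lines.

    Raises ValueError if there are no words at all.
    """
    # "flatMap" step: split every line into words (commas treated as separators).
    words = [w.lower() for line in lines for w in line.replace(",", " ").split()]

    # "map" step: pair each word with a count of 1.
    pairs = [(w, 1) for w in words]

    # "reduceByKey" step: add up the counts for each distinct word.
    def add_pair(counts, pair):
        word, n = pair
        counts[word] = counts.get(word, 0) + n
        return counts

    counts = reduce(add_pair, pairs, {})

    # "max" step: keep only the pair with the largest count.
    return max(counts.items(), key=lambda kv: kv[1])


# Hypothetical sample data standing in for the CSV file.
sample_lines = ["spark,hive,spark", "hive,spark,kafka"]
top_word, top_count = most_frequent_word(sample_lines)
print(top_word, top_count)  # spark 3
```

If two words tie for the highest count, this sketch returns whichever one was seen first; the interviewer may want you to state how ties should be handled.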
3) Why AWS and not Azure or GCP? Why did your organisation choose AWS over the others?
4) Explain the execution order of a SQL query.
5) Tell me how a Spark job runs behind the scenes.
6) Tell me about broadcast join.
7) Tell me about your project use case. They specifically ask about the tables and columns, and exactly what is in your project.
8) Tell me about your day-to-day activities as a data engineer.
9) Tell me about a challenging scenario you faced while building a technical pipeline.
10) What is the cluster size in your project? How many executors are there, and how many CPU cores and how much memory are allocated to each executor?
11) Can we read 50 GB of data into Spark at one time? If it is possible, how? If it is not possible, what is the need for Spark when it can't deal with big data?
12) How do you add new columns in Spark using DataFrames?
13) Find the second highest number in the given list.
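For question 13, one possible answer is a single pass that tracks the two largest distinct values, which avoids sorting the whole list. This is a minimal sketch: the function name and the sample list are made up, and it assumes that repeated copies of the maximum do not count as the "second highest".

```python
def second_highest(numbers):
    """Return the second largest distinct value, or None if there is none."""
    highest = second = None
    for n in numbers:
        if highest is None or n > highest:
            # New maximum: the old maximum becomes the runner-up.
            highest, second = n, highest
        elif n != highest and (second is None or n > second):
            # Smaller than the maximum but larger than the current runner-up.
            second = n
    return second


print(second_highest([10, 40, 30, 40, 20]))  # 30
```

Since there is no compiler in the online round, it helps to trace the loop by hand on a list with repeated values like the one above before submitting.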
14) Tell me about HashMap in data structures.
15) What is the time complexity of merge sort (best, average, worst)? Why do all three have the same time complexity?
16) Explain how broadcast join works.
17) What is the difference between RANK() and DENSE_RANK()?
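For question 17, the difference shows up only when there are ties: RANK() leaves gaps after tied rows, while DENSE_RANK() does not. The sketch below reproduces both numberings in plain Python for values ordered from highest to lowest; it is an illustration of the semantics, not how a database computes them, and the sample salaries are made up.

```python
def rank_and_dense_rank(values):
    """Return (value, rank, dense_rank) tuples for values sorted descending."""
    rows = []
    rank = dense = 0
    previous = object()  # sentinel that is not equal to any real value
    for position, value in enumerate(sorted(values, reverse=True), start=1):
        if value != previous:
            rank = position  # RANK jumps to the row position, leaving gaps after ties
            dense += 1       # DENSE_RANK always moves up by exactly one
            previous = value
        rows.append((value, rank, dense))
    return rows


for row in rank_and_dense_rank([9000, 5000, 5000, 3000]):
    print(row)
# (9000, 1, 1)
# (5000, 2, 2)
# (5000, 2, 2)
# (3000, 4, 3)
```

The last row is the one to point at in an interview: after the two tied rows, RANK() gives 4 while DENSE_RANK() gives 3.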
18) Find the sum of salary of each department in the given table.
Employee Table
====================
Employee_id   Dept      salary
123           IT        2000
134           Finance   3000
144           IT        4000
156           Hr        5000
184           IT        9000
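For question 18, a GROUP BY with SUM is one way to answer. The sketch below loads the Employee table above into an in-memory SQLite database through Python's built-in `sqlite3` module so the query can be tried out; the table name `employee` and the column types are assumptions, and in the interview only the SQL itself would be expected.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_id INTEGER, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (123, "IT", 2000),
        (134, "Finance", 3000),
        (144, "IT", 4000),
        (156, "Hr", 5000),
        (184, "IT", 9000),
    ],
)

# One output row per department with the total of its salaries.
dept_totals = conn.execute(
    "SELECT dept, SUM(salary) AS total_salary "
    "FROM employee GROUP BY dept ORDER BY dept"
).fetchall()
print(dept_totals)  # [('Finance', 3000), ('Hr', 5000), ('IT', 15000)]
```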
19) Which is faster, Hive or Spark? When there is Hive, why Spark?
20) Find the duplicates in the given table.
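For question 20, the common pattern is GROUP BY on the column being checked with HAVING COUNT(*) > 1. The question does not say which column to check, so this sketch assumes "duplicates" means Dept values that occur more than once in the Employee table from question 18. As a hedge, the table name `employee` and the use of Python's built-in `sqlite3` module are illustrative choices only.

```python
import sqlite3

# Rebuild the Employee table from question 18 in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_id INTEGER, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (123, "IT", 2000),
        (134, "Finance", 3000),
        (144, "IT", 4000),
        (156, "Hr", 5000),
        (184, "IT", 9000),
    ],
)

# Keep only the dept values that appear in more than one row.
duplicates = conn.execute(
    "SELECT dept, COUNT(*) AS occurrences "
    "FROM employee GROUP BY dept HAVING COUNT(*) > 1"
).fetchall()
print(duplicates)  # [('IT', 3)]
```

To check for fully duplicated rows instead, the same pattern applies with every column listed in the GROUP BY.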