How GroupBy works Internally in SQL?

August 10, 2022

How GroupBy works Internally in SQL?

🤔How #GroupBy Works Internally In #SQL?
.
.
.
.
.
.

✍️Lets Understand in depth

=====================

Lets consider a #Student table

Name || Subject || Marks

A || Maths || 60

A || che || 85

B || Che || 70

C || phy || 80

C || Maths || 75

✍️lets consider the query

✍️ SELECT name,Count(*) from Student Group By name;

Lets understand Step by Step How group by works
=============================

1)Firstly the column which we mentioned with group by(name) ,

✍️#sql enginee goes to that column and #creates #distinct #buckets on #name column.

✍️As we have Mentioned Name Column so Buckets are Created on Name Column, If u mention Subject Buckets will be created on Subject Column

✍️For Example :-
===============

✏️#Above u can see #A is 2 times but only one bucket is created

✏️#Above u can see #C is repeated 2 times only one bucket is created

✏️Now from above Table 3 distinct Bukcets are Created.

🚀 For "A" one Bucket

🚀For "B" one Bucket

🚀For "C" one bucket

2) Now #sql enginee Bucket by Bucket perfoms the count operation

a) First it starts from "A" Bucket , now it compares #A with #name column record by record.

✏️on First scan it finds #A presence "One" Time

✏️Again it scans it finds #A presence "second" Time

✏️Now it will update the count in #A bucket as "2"

Note:-

====

🚀Dont think that #sql enginee will stop once it find "A" it never stops it scans all the record where present or not.

B) Now it starts with #Bucket "B"

✏️It will repeat the same thing , it compares "B" with #name Column and

✏️ finds #B's Presence one time

✏️Now it updates #B count in Bucket "B"

C) Now it starts from "C" Bucket , now it compares #C with #name column record by record.

✏️on First scan it finds #C presence "One" Time

✏️Again it scans it finds #C presence "second" Time

✏️Now it will update the count in #C bucket as "2"

✍️Now from above we got output in bucket as

#A #B #C

2 1 2

#Finally select (which is projection projects ) as

✏️Count of Name colum as🔥🔥

#A #B #C

2 1 2

Three steps:-

=========

1) It creates Distnct buckets on the column which u mentioned in Group by

2) It Comares the each bucket with ColumName mentioned

3) It updates the count in the bucket

Comments

BigdataTechStack10 August 2022 at 02:38
Like Share and Comment
ReplyDelete
Replies

Add comment

Search This Blog

BigData Tech Stack

How GroupBy works Internally in SQL?

Comments

Post a Comment

Popular Posts

Impetus DataEngineer Interview Questions

Optum DataEngineer Interview Questions ?