What is Big Data? What Does Big Data Mean?

🚀✍️ What is "Big Data"? What does "Big Data" mean? 😎💻


What is Big Data?

=============

✏️ Some say that when a dataset is large, or its volume is huge, the data is termed "Big Data".


✏️ Some say that when the data can no longer be stored on a single laptop, it is termed "Big Data".


✏️ There are many such informal definitions of "Big Data".


✍️ FORMAL DEFINITION GIVEN BY IBM

==============================


πŸ“ Any data which is characterized by 3v's is termed as "BIGDATA"


 ✍️ They are:

     1) Volume

     2) Variety

     3) Velocity

      

 1) Volume

 ======

 ✏️ The volume of data should be large, typically in the range of terabytes or petabytes.


 ✏️ A single system (machine) is incapable of handling it.


  Example

  =======

  ✏️ Facebook users upload more than 900 million photos a day. This is a huge volume of data that a traditional single-machine system is incapable of handling.
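To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. The 900 million photos per day comes from the example above; the average photo size of 2 MB is purely an assumption for illustration, not a figure published by Facebook.

```python
# Rough estimate of the daily photo upload volume.
PHOTOS_PER_DAY = 900_000_000   # figure from the example above
ASSUMED_PHOTO_SIZE_MB = 2      # assumption: illustrative average size only


def daily_volume_tb(photos_per_day: int, photo_size_mb: float) -> float:
    """Return the estimated daily volume in terabytes (1 TB = 1,000,000 MB)."""
    return photos_per_day * photo_size_mb / 1_000_000


estimated_tb = daily_volume_tb(PHOTOS_PER_DAY, ASSUMED_PHOTO_SIZE_MB)
print(f"Estimated volume: {estimated_tb:,.0f} TB per day")
```

Under that assumed photo size, the estimate lands in the petabyte-per-day range, which is why a single machine cannot keep up.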


 2) Variety

 =======

 📌 Data can be of any type:

      a) Structured (RDBMS tables, e.g. Oracle, MySQL)


      b) Semi-structured (CSV, XML, JSON)


      c) Unstructured (audio, video, images, log files)


✏️ Unlike a traditional DBMS, where data arrives in a structured form, Big Data can be of any of the types shown above.
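The three types can be illustrated with a small Python sketch. All sample values (names, tags, the log line) are made up for illustration; the sketch only uses the standard `csv` and `json` modules.

```python
import csv
import io
import json

# Structured: a fixed schema, like one row of an RDBMS table (hypothetical row).
structured_row = {"id": 1, "name": "Asha", "age": 30}

# Semi-structured: self-describing formats such as JSON or CSV.
json_record = json.loads('{"id": 2, "name": "Ravi", "tags": ["admin", "beta"]}')
csv_rows = list(csv.DictReader(io.StringIO("id,name\n3,Meena\n")))

# Unstructured: raw text with no schema; any structure has to be extracted by us.
log_line = "2024-01-01 12:00:00 ERROR disk full on /dev/sda1"
log_level = log_line.split()[2]  # third whitespace-separated token

print(structured_row["name"], json_record["tags"], csv_rows[0]["name"], log_level)
```

Note how the structured and semi-structured records can be read by field name, while the log line needs ad hoc parsing before it yields anything useful.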


 3) Velocity

 =========

 ✏️ The speed at which data arrives is termed "Velocity".


 ✏️ In simple words, "Velocity" covers the speed of ingesting data, processing data, and retrieving data (the response).

  

 Example

=========== 

 ✏️ Remember our Facebook example? 250 billion images may seem like a lot. But if you want your mind blown, consider this: Facebook users upload more than 900 million photos a day.


 ✏️A day. So that 250 billion number from last year will seem like a drop in the bucket in a few months. 


 ✏️Velocity is the measure of how fast the data is coming in.


 ✏️Facebook has to handle a tsunami of photographs every day. 


 ✏️It has to ingest it all, process it, file it, and somehow, later, be able to retrieve it.
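As a quick sanity check on what that daily figure means as a rate, here is a minimal Python sketch that converts it into an average per-second number. It assumes uploads are spread evenly across the day, which real traffic is not, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def uploads_per_second(uploads_per_day: int) -> float:
    """Average upload rate, assuming uploads are spread evenly over the day."""
    return uploads_per_day / SECONDS_PER_DAY


rate = uploads_per_second(900_000_000)  # figure from the example above
print(f"{rate:,.0f} photos per second on average")
```

That works out to a little over ten thousand photos arriving every second, each of which has to be ingested, processed, and stored.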


✍️ According to IBM's formal definition, a dataset with the above 3 characteristics is termed "Big Data".


🔥 Some definitions extend this to 4 V's, 5 V's, and so on.



✍️ The 4th V is relevant enough to discuss, so let's talk about it:

======================


πŸ“ 3v's are same as above discussed and 4th 'V' is "Veracity"


 ✍️Veracity:

 ==========

  ✏️ The quality of the data being analyzed is termed "Veracity".

   

 ✍️ Low veracity:

  ============

  ✏️ Low-veracity data contains meaningless or poor-quality values, for example:

   a) a lot of null values

   b) impossible values, such as a negative age



 ✏️ Such low-veracity data doesn't contribute any meaningful insight.
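A simple way to picture a veracity check is a filter that rejects records with nulls or a negative age. The sketch below is a minimal illustration in Python; the records and the `is_valid` helper are hypothetical, and `None` stands in for a null value.

```python
# Hypothetical records; None stands for a null value.
records = [
    {"name": "Asha", "age": 30},
    {"name": None, "age": 25},      # null name
    {"name": "Ravi", "age": -4},    # negative age
    {"name": "Meena", "age": None}, # null age
]


def is_valid(record: dict) -> bool:
    """A record passes if no field is null and the age is non-negative."""
    if any(value is None for value in record.values()):
        return False
    return record["age"] >= 0


clean = [r for r in records if is_valid(r)]
print(f"{len(clean)} of {len(records)} records passed the checks")
```

Real pipelines apply many more rules than these two, but the idea is the same: measure how much of the data is usable before trusting any insight drawn from it.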


 ✍️ High Veracity:

  =============

  ✏️ High-veracity data, on the other hand, contributes meaningful insights.


 ✍️Example:

 ======= 

a) A high-veracity dataset would be data from a medical experiment or trial, which gives us meaningful insight into the experiment.
