BI Insights

Big Data 101: Intro To Probabilistic Data Structures

17 April, 2017

When analyzing big data, we often need to check basic properties of a dataset, such as the number of items, the number of unique items, and how frequently each item occurs. Hash tables or hash sets are usually employed for this purpose. But when the dataset becomes so large that it cannot fit in memory all at once, we need special kinds of data structures known as probabilistic data structures. Streaming applications usually require processing data in a single pass and then applying incremental updates, and probabilistic data structures fit that processing model well. Rather than resolving hash collisions, these structures tolerate them while keeping the resulting error below a specified threshold. They trade a small margin of error for a considerably smaller memory footprint and constant query time. This article discusses some commonly used probabilistic data structures.
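To make the memory-for-accuracy trade-off concrete, here is a minimal sketch of one well-known probabilistic data structure, a Bloom filter, which answers "have I seen this item?" without storing the items themselves. The teaser above does not name the structures the full article covers, so the choice of a Bloom filter is an assumption, as are the class name `BloomFilter`, its parameters, and the use of SHA-256 with double hashing. It is an illustration, not the article's own implementation.

```python
import hashlib
import math


class BloomFilter:
    """Hypothetical membership sketch: false positives possible, false negatives not."""

    def __init__(self, expected_items: int, false_positive_rate: float) -> None:
        # Standard sizing formulas: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(1, math.ceil(
            -expected_items * math.log(false_positive_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Assumption: derive k positions from two 64-bit slices of one SHA-256
        # digest (double hashing) instead of k independent hash functions.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the stride is never 0
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit the item maps to; colliding bits are simply shared.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # "Possibly present" only if all of the item's bits are set.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter(expected_items=10_000, false_positive_rate=0.01)
for i in range(10_000):
    seen.add(f"user-{i}")  # single pass, incremental updates

print("user-42" in seen)             # an added item is always reported as present
print(len(seen.bits), "bytes used")  # about 12 KB, versus storing 10,000 strings
```

Collisions are never resolved here: two items may share bits, which is why a lookup can occasionally report an item that was never added. The sizing formulas keep that error near the requested rate, and each query touches a fixed number of bits regardless of how many items were inserted.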

Related Articles

13 April, 2017

No Digital Transformation Without Big Data

Publication: IT Business Edge

14 April, 2017

Where's The Value In Big Data?

Publication: Forbes

17 April, 2017

Everything You Wanted To Know About Big Data But Were Afraid To Ask

Publication: Huffington Post

17 April, 2017

Persistence Pays Off For Software Containers In Big Data

Publication: TechTarget

20 April, 2017

Here's How Penn Is Using Big Data To Recruit New Students

Publication: The Daily Pennsylvanian
