Introduction to Big Data Platforms


Big Data is ubiquitous. It relates to all aspects of our lives. While Big Data gives us intelligence and offers new capabilities, it also causes grief and concern for many of us. Reflecting these concerns, I have come across memes calling Big Data "Big Brother".

Big Data is a broad topic. Thus, I plan to share my experience in several articles, in order of importance.

As a technologist, I treat Big Data as my hobby, my passion, and part of my profession. Data analytics and analysis serve as fuel for my metalhead. My passion helps me earn my living in technology, so I am grateful to share my expertise with aspiring data professionals and technology enthusiasts.

To introduce Big Data, I want to start with platforms and cover other aspects in my upcoming posts.

In this article, I introduce Big Data platforms to beginners. Data platforms are critical because every Big Data business solution requires a specific platform. A Big Data platform consists of several layers. These layers perform different functions, but they are interrelated.

Let me briefly introduce these layers with some practical examples.

Layer One

The first layer of the Big Data platform is the shared operational information zone.

The information zone consists of data types such as:

  • data in motion,
  • data at rest, and
  • data in several other forms.

The information zone also includes:

  • legacy data sources,
  • new data sources,
  • master data hubs,
  • reference data hubs, and
  • content repositories.

Layer Two

The second layer of the data platform is called processing. This substantial layer includes:

  • data ingestion,
  • operational information,
  • landing area,
  • analytics zone,
  • archive,
  • real-time analytics,
  • exploration,
  • integrated warehouse,
  • data lakes, and
  • data mart zones.

This layer needs a governance model for the metadata catalogue. The model should cover data security, disaster recovery of systems, storage, hosting, and other infrastructure components such as local processing and storage.

The critical infrastructure for Big Data platforms is processing and storage provided by Cloud Computing and Edge Computing. The IoT (Internet of Things) backbone also relates to this layer.

Layer Three

The third layer of the data platform is the analytics platform.

The analytics platform consists of:

  • functions,
  • processes, and
  • tools.

These functions, processes, and tools can include:

  • real-time analytics,
  • information planning,
  • forecasting,
  • decision making,
  • predictive analytics,
  • descriptive analytics,
  • prognostic analytics,
  • data discovery,
  • data visualisations,
  • executive dashboards, and
  • other analytics features as required in a particular Big Data solution.

This layer is also comprehensive and involves many practitioners such as data architects, data scientists, data specialists, implementers, and administrators.

In addition, substantial input may be required from business stakeholders such as executive decision-makers, the CDO (Chief Data Officer), the CMO (Chief Marketing Officer), and even the CFO (Chief Financial Officer).

Layer Four

The fourth layer of the data platform consists of outputs such as:

  • business processes,
  • decision-making schemes, and
  • points of interaction.

This layer of the data platform must be well-governed. Access needs to be provided with established controls for the data platform professionals such as data scientists, data architects, analytics experts, and business users.
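As a rough illustration of "access with established controls", here is a minimal Python sketch of a role-based check. The role names follow the professionals listed above, but the permitted actions (read, write, manage_schema) and the function name is_allowed are purely hypothetical assumptions; a real platform would take them from its own governance model.

```python
# Hypothetical mapping of roles to permitted actions.
# The actions are illustrative assumptions, not a recommended policy.
ROLE_PERMISSIONS = {
    "data_architect": {"read", "write", "manage_schema"},
    "data_scientist": {"read", "write"},
    "analytics_expert": {"read"},
    "business_user": {"read"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True only if the role is known and explicitly grants the action."""
    # Unknown roles get an empty permission set, so access is denied by default.
    return action in ROLE_PERMISSIONS.get(role, frozenset())


if __name__ == "__main__":
    print(is_allowed("data_scientist", "write"))   # True
    print(is_allowed("business_user", "write"))    # False
```

The design choice worth noting is the default: a role or action that is not listed is refused rather than permitted.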

After introducing these essential layers, I want to highlight a critical point: level of schema.

Level of schema

The level of schema for the data platform is a crucial architectural and design consideration. We can classify the schema level under three categories:

  • no schema,
  • partially structured schema, and
  • fully structured schema.

Schema reflects the structure of data and databases. We can think of a schema as a blueprint for data management.

Some examples of no schema are:

  • video files,
  • audio files,
  • picture files, and
  • social media feeds.

Some examples of partial schema are:

  • email,
  • instant messaging logs,
  • system logs, and
  • call centre logs.

Some examples of fully structured schema are:

  • structured sensor data, and
  • relational transaction data.
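To make the three schema levels concrete, here is a minimal Python sketch that maps the example sources listed above to a schema level. The names SchemaLevel, SOURCE_SCHEMA_LEVELS, and classify_source are illustrative assumptions, and a lookup table is only a stand-in: a real platform would more likely inspect the data itself.

```python
from enum import Enum


class SchemaLevel(Enum):
    NONE = "no schema"
    PARTIAL = "partially structured schema"
    FULL = "fully structured schema"


# Hypothetical lookup built only from the examples in this article.
SOURCE_SCHEMA_LEVELS = {
    "video": SchemaLevel.NONE,
    "audio": SchemaLevel.NONE,
    "picture": SchemaLevel.NONE,
    "social media feed": SchemaLevel.NONE,
    "email": SchemaLevel.PARTIAL,
    "instant messaging log": SchemaLevel.PARTIAL,
    "system log": SchemaLevel.PARTIAL,
    "call centre log": SchemaLevel.PARTIAL,
    "structured sensor data": SchemaLevel.FULL,
    "relational transaction data": SchemaLevel.FULL,
}


def classify_source(source_type: str) -> SchemaLevel:
    """Return the schema level for a known source type (case-insensitive)."""
    key = source_type.strip().lower()
    if key not in SOURCE_SCHEMA_LEVELS:
        raise ValueError(f"Unknown source type: {source_type!r}")
    return SOURCE_SCHEMA_LEVELS[key]


if __name__ == "__main__":
    print(classify_source("Email").value)  # partially structured schema
```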

Another critical point related to platforms is data processing levels.

Data processing levels

Data processing levels are another architectural consideration.

The processing levels could be:

  • raw data,
  • validated data,
  • transformed data, and
  • calculated data.
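The four processing levels can be pictured as stages that the same records pass through. The sketch below is a minimal Python illustration under my own assumptions: the temperature readings, the field names sensor_id and celsius, and the functions validate, transform, and calculate are all hypothetical and not taken from any particular platform.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor_id: str
    celsius: float


# Raw data: records exactly as they arrived, including bad ones.
RAW_ROWS = [
    {"sensor_id": "a", "celsius": "20.0"},
    {"sensor_id": "a", "celsius": "22.0"},
    {"sensor_id": "b", "celsius": "n/a"},   # non-numeric value
    {"sensor_id": "", "celsius": "19.0"},   # empty sensor id
    {"celsius": "18.0"},                    # missing sensor id
    {"sensor_id": "b", "celsius": 30},
]


def validate(raw_rows):
    """Raw -> validated: drop rows with no sensor id or a non-numeric value."""
    valid = []
    for row in raw_rows:
        try:
            float(row["celsius"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric temperature
        if row.get("sensor_id"):
            valid.append(row)
    return valid


def transform(valid_rows):
    """Validated -> transformed: convert loose dicts into typed records."""
    return [Reading(str(r["sensor_id"]), float(r["celsius"])) for r in valid_rows]


def calculate(readings):
    """Transformed -> calculated: derive the average temperature per sensor."""
    values_by_sensor = {}
    for reading in readings:
        values_by_sensor.setdefault(reading.sensor_id, []).append(reading.celsius)
    return {sid: sum(vals) / len(vals) for sid, vals in values_by_sensor.items()}


if __name__ == "__main__":
    print(calculate(transform(validate(RAW_ROWS))))  # {'a': 21.0, 'b': 30.0}
```

Keeping each level as a separate step makes it possible to store and audit the output of every stage, which is one reason to treat the levels as an architectural consideration.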

Another structural classification of data in data platforms relates to business relevance.

Business relevance

We can categorise the business relevance of data as:

  • external data,
  • personal data,
  • departmental data, and
  • enterprise data.
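One simple way to use these categories is to tag each dataset in a catalogue with its business relevance. The following Python sketch assumes a hypothetical DatasetEntry record and made-up dataset names; it only shows how such a tag could be stored and used for grouping.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class BusinessRelevance(Enum):
    EXTERNAL = "external data"
    PERSONAL = "personal data"
    DEPARTMENTAL = "departmental data"
    ENTERPRISE = "enterprise data"


@dataclass(frozen=True)
class DatasetEntry:
    """Hypothetical catalogue record: a dataset name plus its relevance tag."""
    name: str
    relevance: BusinessRelevance


def group_by_relevance(entries):
    """Return a dict mapping each relevance category to its dataset names."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.relevance].append(entry.name)
    return dict(grouped)


if __name__ == "__main__":
    # Made-up dataset names, for illustration only.
    catalogue = [
        DatasetEntry("public_weather_feed", BusinessRelevance.EXTERNAL),
        DatasetEntry("marketing_campaign_results", BusinessRelevance.DEPARTMENTAL),
        DatasetEntry("customer_master", BusinessRelevance.ENTERPRISE),
        DatasetEntry("sales_forecast_draft", BusinessRelevance.DEPARTMENTAL),
    ]
    for relevance, names in group_by_relevance(catalogue).items():
        print(relevance.value, "->", names)
```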

Understanding the functions and components of a Big Data platform can be useful for all stakeholders of a Big Data solution in business organisations. While business executives such as the CIO, CISO, CDO, CMO, and CFO need to understand these layers at a high level, data architects, data scientists, data specialists, implementers, testers, and administrators need to understand them in more detail.

I hope this brief introduction gives you a useful overview of Big Data platforms.

Thank you for reading my perspectives.

Reference: Architecting Big Data & Analytics Solutions by Dr Mehmet Yildiz.
