Data democratization is the term which is making the buzz all around the Business. Data is never restricted to a subset of users or groups like Data Engineers, Data Analysts or Data scientists.
According to Collibra
Data democratization is when an organization makes data accessible to all employees and stakeholders, and educates them on how to work with data, regardless of their technical background.
What is data democratization?
Data democratization is defined as all users having access to data, with no gatekeepers or barriers in the way of…
Why do we need to Democratize the Data
Usage of data in different areas of the organization is different.
Every team has their own set of requirements and perception of the data. By letting these teams to use the data and educating them about the different areas and aspects of it would make everyone in the team more self reliable and more agnostic.
We are basically removing all the middlemen from the equation and all the team go and query the data.
Semantic layer is also known as a Translation layer that lies between the your data and the business stakeholders.
Semantic Layer helps the business users to use the data directly without having to know any of the nuances of the data technical terms.
Consider you have 100s of tables representing the dimension of a product and its sales. But business might only be asking simple questions like total sales and the items. Here by using the Semantic layer, you remove all the complexity of managing and cleaning of the data and only exposing the set of data which is required and keeping it as simple as possible.
Semantic layer keeps all the complexity withing by integrating the data with different sources like API, Database, Data warehouse etc, and exposing the given data in a simple term to the business users.
Cube is an API-first, four-part semantic layer that enables data engineers and developers to make their data consistent, secure, performant, and accessible across every application.
Cube, is nothing but a Semantic Layer as we discussed in the beginning.
Cube is built to help the business to be able to access the data without having to know any of the complex data terms.
Cube helps you to connect to most of the Data sources like Snowflake, BigQuery, Redshift, Postgres etc and build the semantic layer on top of it.
Cube exposes the data using REST, GraphQL, and SQL APIs and also it controls the user accesses.
Cube has the built in charting library using which you can build the visualization on top of the given datasets.
Using Cube you can model your data. Semantic layer helps you to pull the given data into a perspecitve.
Once the data is ready the very next step is to model your data at different dimenttions and adding any measures.
Example for Product table:
public.products AS "products"
(count("products".id) <= $ 1)
Cube supports 2 types of caching.
- In-memory: which is enabled by default. When the same data is accessed multiple times, the given dataset is cached for future requests for a quick response to the queries.
- pre-aggregated: In here, the Cube reads the source table and runs your aggregation and stores the result in the per-aggregation schema in the source database.
Cube has rich integration of APIs to different source systems.
At a high level, Cube has an API connectivity of SQL, REST, GraphQL and orchestration API.
APIs & Integrations
APIs & Integrations | Documentation for working with Cube, the open-source analytics framework
Orchestration API | Documentation for working with Cube, the open-source analytics framework
Cube out of the box has the support to connect to Airflow, Dagster and Prefect
With that Cube has many more features like
Single Sign On: You can manage the user accesses and permissions.
SQL Runner: Where you can build and run your queries.
Playground: Its a web-based tool which allows executing queries and previewing their results.
CLI: to help you create new Cube services or build data models.
Monitoring: To alert for any outage or issues. Cube can also connect to Datadog or Grafana.
Create a new folder and create a docker-compose.yaml
Run the docker
docker compose up -d
Hit the URL
Let’s use the sample database: PostgresSQL
Build The Data Model using YAML
Let’s query the data
You can get the SQL quey for the given view
You can build the pre-aggregation
In the era of Data where everyone is asking questions based on the the historical performance and future implications, its very important to have the data democraised.
Semantic Layer is jut built for that. It acts as a middleman between your data and the organisation or end users. It takes away all the complecity and provides the simple solution where anyone can can business questions without having to know the data terms and nuances.
In this blog we have only provided very high level information about Semantic layer and how Cube is helping the organisation to connect your end users with the data.
I strogly recommend you to read the Cube documentation to get the full picture and the all the features provided by them in the Cube Cloud.
Cube - Semantic Layer for Building Data Applications
Access data, organize it, and deliver it to every tool so developers can build powerful, fast, and consistent data…
GitHub - cube-js/cube: 📊 Cube - The Semantic Layer for Building Data Applications
📊 Cube - The Semantic Layer for Building Data Applications - GitHub - cube-js/cube: 📊 Cube - The Semantic Layer for…
Use Cube as a semantic layer to define metrics upstream and provide consistent data to every stakeholder.
Introduction | Documentation for working with Cube, the open-source analytics framework
Bigdata Engineer — Bigdata, Analytics, Cloud and Infrastructure.
Interested in getting the weekly newsletter on the big data analytics around the world, do subscribe to my: Newsletter Just Enough Data