Very much real time with Apache Pinot


When we talk about real-time metrics, underneath it is usually batch processing, or more specifically micro-batches, through which our queries run and return results.

As the data size increases, the time it takes to process that data grows as well.

Now the need for real time has evolved so much that we need to see metrics within milliseconds and take action based on them.

To help solve this problem, LinkedIn and Uber partnered to create Apache Pinot.

Quick fact

The Apache Software Foundation Announces Apache® Pinot™ as a Top-Level Project.

What is Pinot in a nutshell

Apache Pinot is a distributed datastore that delivers real-time data with ultra-low latency and very high throughput.


  1. Highly Available
  2. Scalable
  3. Cost effective
  4. As fresh as it can be
  5. Millisecond latency


Apache Pinot is a real-time distributed OLAP datastore, built to deliver scalable real-time analytics with low latency. It can ingest from batch data sources (such as Hadoop HDFS, Amazon S3, Azure ADLS, Google Cloud Storage) as well as stream data sources (such as Apache Kafka).

Pinot was built by engineers at LinkedIn and Uber and is designed to scale up and out with no upper bound. Performance always remains constant based on the size of your cluster and an expected query per second (QPS) threshold.

For getting started guides, deployment recipes, tutorials, and more, please visit the Apache Pinot project documentation.

We will talk about the architecture later, but at a high level, here is how Pinot stores and handles data:

  1. Apache Pinot stores data in a columnar format.
  2. It can ingest data from stream sources like Kafka and Amazon Kinesis, as well as batch data from S3, HDFS, JSON files, etc.
  3. It supports a wide variety of indexes to speed up your read queries.
  4. You can segment or isolate your tables or queries with the help of server tagging.
  5. It uses ZooKeeper for high availability and Apache Helix for state management.
  6. A cool thing is that we can ingest real-time and offline data into a single table, where Pinot takes care of merging them.
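As a loose illustration of point 6, a broker serving such a hybrid table picks a time boundary, answers the older part of a query from the offline data and the newer part from the real-time data, and merges the two. The sketch below is a toy model of that idea; the data, variable names, and boundary logic are illustrative assumptions, not Pinot APIs.

```python
# Toy model of how a hybrid table merges offline and realtime data.
# The broker answers from offline data up to a "time boundary" and from
# realtime data after it, so overlapping events are counted only once.

offline_rows = [  # older, batch-ingested data: (timestamp, clicks)
    (1, 10), (2, 20), (3, 30),
]
realtime_rows = [  # fresher, stream-ingested data; overlaps offline at t=3
    (3, 30), (4, 40), (5, 50),
]

time_boundary = 3  # offline data is assumed complete up to t=3


def total_clicks(up_to):
    """Sum clicks over both sides without double counting."""
    offline_part = sum(c for t, c in offline_rows
                       if t <= time_boundary and t <= up_to)
    realtime_part = sum(c for t, c in realtime_rows
                        if t > time_boundary and t <= up_to)
    return offline_part + realtime_part


print(total_clicks(5))  # 10+20+30 offline, 40+50 realtime -> 150
```

Note that the event at `t=3` exists on both sides but is counted only once, from the offline table.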



We can load data in two ways:

  1. Online (streaming) sources like Kafka and Amazon Kinesis
  2. Offline (batch) sources like ADLS, S3, etc.

Note: Offline data requires a slightly different loading mechanism. We can make use of Spark or Flink to ingest into the Pinot segment store, or we can use the native Pinot ingestion process.

Core Layer

We have a Controller which takes the incoming data and distributes it across a group of servers. The controller decides where to store what and how to replicate it, and this state is managed through ZooKeeper.


The Broker serves user queries and distributes them to the right servers using the routing information maintained by the controller.

Secret Sauce: Segments

Pinot has the concept of a table, which is a logical abstraction referring to a collection of related data. Pinot has a distributed architecture and scales horizontally, and it expects the size of a table to grow indefinitely over time. To achieve this, the entire dataset needs to be distributed across multiple nodes. Pinot does this by breaking the data into smaller chunks known as segments (similar to shards/partitions in relational databases). Segments can also be seen as time-based partitions.
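The idea of time-based partitioning can be sketched in a few lines. This is a toy illustration, not Pinot's actual segment-naming scheme; the hourly window and naming convention are assumptions made for the example.

```python
# Toy sketch: breaking an ever-growing table into time-based segments.

from collections import defaultdict

SEGMENT_WINDOW = 3600  # assume one segment per hour of event time


def segment_name(table, epoch_seconds):
    """Map an event timestamp to the segment that should hold it."""
    bucket = epoch_seconds // SEGMENT_WINDOW
    return f"{table}__{bucket}"


events = [("clicks", 10), ("clicks", 3599), ("clicks", 3600), ("clicks", 7300)]

segments = defaultdict(list)
for table, ts in events:
    segments[segment_name(table, ts)].append(ts)

for name in sorted(segments):
    print(name, segments[name])
# clicks__0 [10, 3599]
# clicks__1 [3600]
# clicks__2 [7300]
```

Because segments are bounded chunks, they can be distributed, replicated, and pruned independently, which is what lets the table grow without bound.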

Below are the different types of indexes it supports.

  1. Forward Index
  • Dictionary-encoded forward index with bit compression
  • Raw value forward index
  • Sorted forward index with run-length encoding

  2. Inverted Index
  • Bitmap inverted index
  • Sorted inverted index

  3. Star-tree Index

  4. Bloom Filter

  5. Range Index

  6. Text Index

  7. Geospatial Index

  8. JSON Index
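To make two of these concrete, here is a toy version of a dictionary-encoded forward index and a bitmap-style inverted index (plain Python sets stand in for the compressed bitmaps Pinot actually uses; the column data is invented for the example):

```python
# Toy versions of two index types from the list above.

values = ["US", "IN", "US", "FR", "IN", "US"]  # one column, six rows

# Dictionary encoding: store each distinct value once, then store small ids.
dictionary = sorted(set(values))               # ['FR', 'IN', 'US']
dict_id = {v: i for i, v in enumerate(dictionary)}
forward_index = [dict_id[v] for v in values]   # [2, 1, 2, 0, 1, 2]

# Inverted index: for each dictionary id, the set of rows where it occurs.
inverted = {i: set() for i in range(len(dictionary))}
for row, v in enumerate(values):
    inverted[dict_id[v]].add(row)

# Answering `WHERE country = 'US'` becomes a single dictionary lookup:
print(sorted(inverted[dict_id["US"]]))  # [0, 2, 5]
```

The forward index stores small integer ids instead of repeated strings (saving space), while the inverted index turns an equality filter into a direct lookup instead of a full column scan.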

Ingestion Architecture

  1. Brokers accept queries from clients and forward them to the servers.
  2. They receive results back from the servers, merge them, and forward the final result to the client.
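This scatter-gather pattern can be sketched as follows. The server names, per-segment partial results, and the merge function are all invented for the illustration; in Pinot the merge depends on the query's aggregation.

```python
# Toy scatter-gather: a broker fans a query out to servers, each of which
# answers from its own segments, then the broker merges the partial results.

server_segments = {
    "server1": [{"US": 5, "IN": 2}],             # per-segment partial counts
    "server2": [{"US": 1}, {"FR": 4, "IN": 3}],
}


def broker_query():
    """Merge per-segment partial counts into one final result."""
    merged = {}
    for server, segments in server_segments.items():  # scatter to servers
        for partial in segments:                      # each segment's answer
            for key, count in partial.items():        # gather + merge
                merged[key] = merged.get(key, 0) + count
    return merged


print(broker_query())  # {'US': 6, 'IN': 5, 'FR': 4}
```

Each server only touches its own segments, so the work parallelizes across the cluster and only small partial results cross the network.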

How does the data get into the servers?

Pinot stores data as segments. Each segment is data packed in a columnar format along with dictionaries and indexes.

In the real-time case, data is consumed directly from Kafka into the real-time servers and periodically pushed to the segment store.

In the offline case, data at rest is ingested into the segment store by a native Pinot job, Spark, or Flink.

Once a segment is ingested, the controller is notified, and it instructs the offline servers where to find that data.

The controller manages all the components of the cluster with the help of Apache Helix, and Apache ZooKeeper is used as the metadata store.

Core Concepts


Servers, or groups of servers, process the data, either from real-time sources or from offline stores like S3.


The Controller takes the incoming data and distributes it across the servers. It decides where to store what and how to replicate, and this state is managed through ZooKeeper.


A Tenant is a logical component defined as a group of servers with the same tag.

In the table config we need to define the broker and server tenant names, and our queries will hit the respective groups.

Note: we can still have a shared broker but isolate queries at the server level by defining tenants for the server groups only.
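The relevant piece of a Pinot table config is its `tenants` block, which names the broker and server tags. The sketch below (written as a Python dict for readability) shows the shared-broker setup from the note above; the table and tenant names are illustrative, and only the `tenants` fields are shown, not a complete table config.

```python
# Sketch of the tenant-related part of a Pinot table config.
# A shared broker tenant routes queries, while the data lives only on
# servers tagged with this team's server tenant.

table_config = {
    "tableName": "clicks",
    "tableType": "OFFLINE",
    "tenants": {
        "broker": "sharedBroker",   # brokers tagged sharedBroker serve queries
        "server": "teamA_servers",  # only teamA_servers hold this table's data
    },
}

print(table_config["tenants"]["server"])  # teamA_servers
```

With this config, another team's heavy queries cannot compete for CPU or memory on `teamA_servers`, even though both teams go through the same brokers.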

Offline Segments

  1. We define the schema with the table name and server tenant.
  2. This table config gets stored in ZooKeeper.
  3. We start our ingestion job and generate a segment.
  4. The ingestion job pushes this data to the deep store.
  5. The job notifies the controller about the new segment.
  6. The controller gathers the following information:

  • table config: the server tag
  • the list of servers belonging to that server tenant
  • the current mapping: previous segments, if any, already assigned

  7. The controller decides which servers (including replicas) should store the segment, choosing the servers with the least number of segments.
  8. The controller updates the mapping in ZooKeeper with the servers where the data will be stored. This is called the ideal state.
  9. Helix listens to this state and informs the servers, which then download the segments from the segment store to local disk.
  10. Once the segments are loaded, Helix updates the mapping in the external view. This view reflects the actual state of the cluster.

Helix is responsible for keeping the ideal state in sync with the external view.

  11. Once the external view is updated, the segment is ready to serve queries.
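The steps above can be sketched as a toy model: the controller writes an ideal state (segment to servers, picking the least-loaded servers first), and a Helix-like loop drives the external view until it matches. Server names, the replica count, and the reconcile function are all illustrative assumptions, not Helix APIs.

```python
# Toy model of the controller's segment assignment and the Helix
# ideal-state / external-view reconciliation.

servers = {"server1": [], "server2": [], "server3": []}
REPLICAS = 2


def assign(segment):
    """Pick the REPLICAS servers currently holding the fewest segments."""
    ranked = sorted(servers, key=lambda s: len(servers[s]))
    return ranked[:REPLICAS]


ideal_state = {}                      # what the controller wants
for seg in ["seg_0", "seg_1", "seg_2"]:
    targets = assign(seg)
    ideal_state[seg] = targets
    for s in targets:                 # bookkeeping for the next assignment
        servers[s].append(seg)

external_view = {}                    # what servers actually host (empty so far)


def reconcile():
    """Helix-like loop: instruct servers to download until views converge."""
    for seg, targets in ideal_state.items():
        external_view[seg] = list(targets)   # servers fetch from deep store


reconcile()
print(external_view == ideal_state)  # True once converged
```

The separation matters: the ideal state changes instantly when the controller decides something, while the external view only catches up once servers have actually downloaded the segments, and queries are served based on the latter.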

Realtime segments

  1. Brokers route queries to the real-time servers.
  2. Real-time servers serve queries from the Pinot segments they hold. These servers also consume real-time data from streams.
  3. Data is indexed as it is consumed and kept in memory. These are referred to as consuming segments, and they also serve queries.
  4. Periodically the consuming segments are converted into completed Pinot segments, which are flushed to the segment store.
  5. The controller coordinates all these activities across all the components using Helix and ZooKeeper.
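The lifecycle of a consuming segment can be sketched as a small class: rows are queryable the moment they are consumed, and once a threshold is hit the segment is sealed and pushed to the segment store. The class, threshold, and "index" (a plain list) are simplifications invented for the illustration.

```python
# Toy consuming segment: rows are indexed in memory as they arrive and are
# queryable immediately; on flush the segment is sealed and persisted.

class ConsumingSegment:
    def __init__(self, flush_threshold_rows=3):
        self.rows = []                      # in-memory, already queryable
        self.flush_threshold = flush_threshold_rows
        self.sealed = False

    def consume(self, row):
        """Ingest one row; return True when it is time to flush."""
        self.rows.append(row)               # index-on-ingest, simplified
        return len(self.rows) >= self.flush_threshold

    def query_count(self):
        return len(self.rows)               # consuming segments serve queries

    def flush(self, segment_store):
        """Seal the segment and hand it to the (deep) segment store."""
        self.sealed = True
        segment_store.append(list(self.rows))


store = []                                  # stands in for the segment store
seg = ConsumingSegment()
for event in ["e1", "e2", "e3"]:
    if seg.consume(event):
        seg.flush(store)

print(seg.query_count(), len(store))  # 3 1
```

This is why Pinot data is "as fresh as it can be": there is no waiting for a batch to close before rows become visible to queries.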

Kafka Realtime stream

  1. We define the table config, which gets stored in ZooKeeper by the controller and includes the topic to read, the replica count, etc.
  2. The controller asks for the server list in the given tenant.
  3. The controller fetches the number of partitions from the Kafka stream.
  4. The controller defines the mapping of partitions to servers, including replicas. E.g. partition A maps to servers 1 and 2, based on the replica count.
  5. The controller also defines the earliest offset of the partition from which to read.
  6. Helix sees the new entry and instructs the servers to start consuming.
  7. Servers consume directly from the stream for the partitions they are assigned.
  8. Periodically, the consuming segments are flushed, based on time or on the number of rows ingested.
  9. The server lets the controller know the offset it has reached.
  10. The controller tells the server to flush; the server converts the consuming segment into a completed segment, builds its indexes, and stores it in the segment store.
  11. Once the flush is completed, the controller updates the state mapping table from "CONSUMING" to "ONLINE", which means the segment is ready to serve queries. It also stores the offset up to which data has been persisted.
  12. Since only one replica has written the segment, the other replicas are notified and given a chance to catch up. If they succeed, they build their own local copy; if not, the segment is downloaded from the segment store.
  13. A new entry is made in the state table for the same partition, with the start offset set just after the last committed offset.
  14. Helix then drives the same process again.

E.g. a partition's data is consumed on servers 1 and 2, and the state is CONSUMING.

Once the flush is completed, the state is changed to ONLINE and the end offset is written.

Another row is created for the same partition and servers, starting from just after the committed offset; in our example it starts from 51.
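The state transitions in this example can be sketched as a small state table. The row layout and the `flush` function are invented for the illustration; the real mapping lives in ZooKeeper and is managed by Helix.

```python
# Toy version of the state table from the example above: a partition is
# CONSUMING on two replica servers, commits through offset 50, and a new
# CONSUMING row resumes at offset 51.

state_table = [
    # [partition, servers, state, start_offset, end_offset]
    ["partition_0", ["server1", "server2"], "CONSUMING", 0, None],
]


def flush(row, end_offset):
    """Seal the current row and open a new consuming row after it."""
    row[2] = "ONLINE"          # completed segment can now serve queries
    row[4] = end_offset        # data persisted through this offset
    state_table.append(
        [row[0], row[1], "CONSUMING", end_offset + 1, None]
    )


flush(state_table[0], 50)

for row in state_table:
    print(row)
# ['partition_0', ['server1', 'server2'], 'ONLINE', 0, 50]
# ['partition_0', ['server1', 'server2'], 'CONSUMING', 51, None]
```

Because every row records its offset range, a restarted or lagging replica always knows exactly where to resume, and no Kafka message is lost or double-counted across the flush boundary.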

Sources and References:

Ajith Shetty

BigData Engineer — Love for Bigdata, Analytics, Cloud and Infrastructure.
