Ticket 1235 :: Add new column values to backward records in clickhouse database.

Ticket 1235 :: Add new column values to backward records in clickhouse database.





Solution Document


Add new column values to backward records in clickhouse database.

Overview

General/Customer specific

Customer specific

Author

Seema

Reviewer


Approver

 

Release date

23-02-2024

Product Version

NG 2.5.1


Audience: CSG/TechWarriors/PAC/Platform/Product teams


What’s the Issue? 

The challenge arose from the addition of new columns for enrichment purposes. While the values are successfully appending to upcoming records, the client wanted these values to be retroactively added to previous records as well.


Issue Description and Solution Approaches : In response to a new request from the Sahamati team, five new columns were added to two tables, and the data processing layer was modified to populate these columns for incoming data. Subsequently, there is a need to update the historical dataset, consisting of approximately 71 lakhs records.

Three approaches were considered:

Approach 1: Traditional DB Update Statement 

  1. Not recommended due to potential performance issues and pressure on the database.

Approach 2: Create a new table, Update historical values, and Swap it.

  1. Time-consuming and involves downtime during the table renaming process. Live data needs to be updated to the new tables to ensure zero data loss.

Approach 3:

  1. Perform data transformation/enrichment at Data Delivery Layer
  2. Create master tables, build a dynamic view by joining existing tables with master tables.
  3. No downtime required, offers flexibility, and supports future requests for additional columns.
  4. Leverages the benefits of hyperscale DB by performing data enrichment at the time of data delivery.

Opted Solution: Approach 3

  1. Chosen for its dynamic nature, efficiency in handling both historical and current datasets. 
  2. Provides an opportunity to leverage hyperscale DB benefits in data enrichment at the time of data delivery.
  3. Offers flexibility and seamless support for future requests involving additional columns.


Solution Steps in Detail

Sahamati New Tables & Views


Tables Schema:

Master AA - To store the organization name for the AA’s

create table master_aa on CLUSTER vusmart (
   aaId LowCardinality(String),
   aa_org_name LowCardinality(String),
   last_update_date DateTime default now()
Engine = ReplicatedReplacingMergeTree()
ORDER BY (aaId);



Master FIP - To store the organization name for the FIP’s and other associated attributes

create table master_fip on CLUSTER vusmart (
   fipId LowCardinality(String),
   fip_org_name LowCardinality(String),
   regulator LowCardinality(String),
   license_type LowCardinality(String),

   weights Float 
   last_update_date DateTime default now()
)Engine = ReplicatedReplacingMergeTree()
ORDER BY (fipId);


Master Business Hour - To store the business hour flag for each hour in a day

create table master_business_hour on CLUSTER vusmart (
   hour UInt8,
   is_biz_hr LowCardinality(String)
Engine = ReplicatedReplacingMergeTree()
ORDER BY (hour);



Views:

Push Metric View:
This view query combines the push metric dataset with the master table of AAs, FIPs and business hours to bring a consolidated view. This view should be used in the existing storyboard queries instead of “push metric” table.


CREATE OR REPLACE VIEW push_metric_data_view ON CLUSTER vusmart AS
SELECT pm.timestamp AS timestamp
    , pm.aaId AS aaId
    , pm.fipId AS fipId
    , pm.event_name AS event_name
    , pm.success_percent AS success_percent
    , pm.timeout_percent AS timeout_percent
    , pm.acc_notfound_percent AS acc_notfound_percent
    , pm.server_error_percent AS server_error_percent
    , pm.client_error_percent AS client_error_percent
    , pm.latencyP50_ms AS latencyP50_ms
    , pm.latencyP95_ms AS latencyP95_ms
    , pm.latencyP99_ms AS latencyP99_ms
    , pm.latency_avg_ms AS latency_avg_ms
    , pm.window AS window
    , aa.aa_org_name AS aa_org_name
    , fip.fip_org_name AS fip_org_name
    , fip.regulator AS regulator
    , fip.license_type AS license_type
    , bh.is_biz_hr AS business_hour
 FROM (SELECT *, toHour(toDateTime(timestamp,'Asia/Kolkata')) as hour
         FROM push_metric_data
      ) pm
 JOIN master_business_hour bh
   ON bh.hour = pm.hour
 JOIN (SELECT fipId, fip_org_name, regulator, license_type
         FROM master_fip
        WHERE (fipId, last_update_date) IN (SELECT fipId, MAX(last_update_date) FROM master_fip GROUP BY fipId)
      ) fip
   ON pm.fipId = fip.fipId
 JOIN (SELECT aaId, aa_org_name
         FROM master_aa
        WHERE (aaId, last_update_date) IN (SELECT aaId, MAX(last_update_date) FROM master_aa GROUP BY aaId)
      ) aa
   ON pm.aaId = aa.aaId



Push Notification  View:
This view query combines the push notification dataset with the master table of AAs, FIPs and business hours to bring a consolidated view. This view should be used in the existing storyboard queries instead of “push notification”  table.


CREATE OR REPLACE VIEW push_notification_metric_data_view ON CLUSTER vusmart AS
SELECT pnm.timestamp AS timestamp
    , pnm.aaId AS aaId
    , pnm.fipId AS fipId
    , pnm.notification_event_name AS notification_event_name
    , pnm.success_response_percent AS success_response_percent
    , pnm.failure_response_percent AS failure_response_percent
    , pnm.no_response_percent AS no_response_percent
    , pnm.latencyP50_ms AS latencyP50_ms
    , pnm.latencyP95_ms AS latencyP95_ms
    , pnm.latencyP99_ms AS latencyP99_ms
    , pnm.latency_avg_ms AS latency_avg_ms
    , pnm.window AS window
    , aa.aa_org_name AS aa_org_name
    , fip.fip_org_name AS fip_org_name
    , fip.regulator AS regulator
    , fip.license_type AS license_type
    , bh.is_biz_hr AS business_hour
 FROM (SELECT *, toHour(toDateTime(timestamp,'Asia/Kolkata')) AS hour
         FROM push_notification_metric_data) pnm
 JOIN master_business_hour bh
   ON bh.hour = pnm.hour
 JOIN (SELECT fipId, fip_org_name, regulator, license_type
              FROM master_fip
             WHERE (fipId, last_update_date) IN (SELECT fipId, MAX(last_update_date) FROM master_fip GROUP BY fipId)) fip
   ON pnm.fipId = fip.fipId
 JOIN (SELECT aaId, aa_org_name
              FROM master_aa
             WHERE (aaId, last_update_date) IN (SELECT aaId, MAX(last_update_date) FROM master_aa GROUP BY aaId)) aa
   ON pnm.aaId = aa.aaId

Data Load via CSV command:

kubectl exec -i chi-clickhouse-vusmart-0-0-0 -n vsmaps -- clickhouse-client --user vusmart --password Vunet#7814 -q "insert into {table_name}  FORMAT CSVWithNames" < {path_to_csv}/{file_name.csv}  



Sample CSV:

AA

FIP

Business Hour



Corrective actions if any to avoid in future

  1. No


More Help


References



    • Related Articles

    • Ticket Id: #835 -ICICI || VuLogAgent is getting stopped

      Solution Document ICICI || VuLogAgent is getting stopped Overview General/Customer specific General Author Komali Buddha Reviewer NA Approver NA Release date 29/04/2023 Product Version 8.5r5 Audience: CSG/TechWarriors/PAC/Platform/Product teams ...
    • Ticket ID 1192 : Data Retention is not working

      Solution Document [1192] Data Retention is not working Overview General/Customer specific General Author Yash Gawhad Reviewer NA Approver NA Release date 03/11/2023 Product Version 9.5r0 Audience: CSG/TechWarriors/PAC/Platform/Product teams What’s ...
    • Container based setup migration issue

      In the container-based setup with the new developer update, there is an error as below: (Access denied for user) Please suggest how to fix this issue. To fix this issue, kindly login into the database using the below command and create one more user. ...
    • Ticket #1158 :: Remove or edit auto-refresh time-interval and time-interval in quick ranges

      Solution Document Remove or edit auto-refresh time-interval and time-interval in quick ranges Overview General/Customer specific General Author SHIVANUR VIMARSHA VINOD Reviewer - Approver - Release date 28/02/2024 Product Version 8.5r11 Audience: ...
    • Ticket ID 1194 : Getting request to ES failed on multiple dashboards

      Solution Document [1194] Getting request to ES failed on multiple dashboards Overview General/Customer specific General Author Yash Gawhad Reviewer NA Approver NA Release date 03/11/2023 Product Version 9.5r0 Audience: ...