
Overview

HCLTech’s Intelligent Ingestion solution demonstrates how to define an end-to-end ETL workflow that is triggered by an AWS Step Functions execution, which internally invokes multiple state machines in parallel for a seamless ETL experience. It provides reliable parallel processing from data ingestion through curation and aggregation to consumption in an efficient and cost-effective manner. The entire solution is parameterized to provide a no-code/low-code experience: users add any desired data source simply by passing the source data details in a standardized format, and the solution builds the ETL ingestion jobs on the fly for the newly added data sources.
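The parameterized approach described above can be sketched in a few lines of Python. This is a minimal illustration only: the listing does not publish its standardized format, so the `SourceConfig` fields, the argument keys, and the bucket layout below are all hypothetical. The only convention borrowed from AWS is that Glue job arguments are string key/value pairs whose keys start with `--`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceConfig:
    """Hypothetical standardized description of one data source."""
    name: str             # logical source name, unique within the pipeline
    source_type: str      # e.g. "rdbms", "logs", "batch"
    connection_name: str  # name of a pre-created connection (assumed)
    tables: tuple         # tables or datasets to ingest
    raw_bucket: str       # S3 bucket that holds the raw zone (assumed layout)


def to_job_arguments(src: SourceConfig) -> dict:
    """Turn one source description into '--key': 'value' job arguments."""
    if not src.tables:
        raise ValueError(f"source {src.name!r} lists no tables")
    return {
        "--source_name": src.name,
        "--source_type": src.source_type,
        "--connection_name": src.connection_name,
        "--tables": ",".join(src.tables),
        "--target_path": f"s3://{src.raw_bucket}/raw/{src.name}/",
    }


def build_ingestion_plan(sources) -> list:
    """Build one argument set per source, rejecting duplicate source names."""
    names = [s.name for s in sources]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        raise ValueError(f"duplicate source names: {duplicates}")
    return [to_job_arguments(s) for s in sources]


# Adding a data source means adding one entry here; no job code changes.
sources = [
    SourceConfig("orders_db", "rdbms", "orders-jdbc", ("orders", "customers"), "example-lake"),
    SourceConfig("app_logs", "logs", "logs-s3", ("access",), "example-lake"),
]
plan = build_ingestion_plan(sources)
```

Each entry in `plan` could then be handed to a single generic ingestion job, which is one way to get the "add a source without writing code" behavior the overview describes.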

  • HCLTech’s Intelligent Ingestion solution can start execution either from an automatic event trigger or on a schedule.

  • The main state machine starts the process by invoking the crawler state machine, which internally runs AWS Glue crawlers to connect to and scan data from various sources, such as RDBMS, edge devices, logs, and batch data, and simultaneously creates metadata in the centralized AWS Glue Data Catalog. A single AWS Glue job ingests data in parallel into the Amazon S3 raw zone for the different databases and tables.

  • Once data ingestion completes successfully, an email alert is sent to the user via Amazon SNS, and data quality validation is executed automatically by a separate state machine that is plugged into the main state machine.

  • Data quality validation is performed by AWS Glue DataBrew jobs. Pass or fail results are published based on a predefined, configured threshold.

  • If data quality validation fails, an AWS Lambda function creates a high-severity ServiceNow ticket and an email alert is sent to the respective team for further troubleshooting. If data quality validation passes, a success email alert is sent to the respective team and the solution executes the curation state machine next.

  • The curation state machine internally executes a single AWS Glue curation job, which runs in parallel and applies the transformations for the respective data sources.

  • If the curation job executes successfully, an email alert is sent to the respective team via Amazon SNS and the curated data is stored in the Amazon S3 curation zone.

  • Once the curation state machine completes successfully, the final enrichment state machine executes. Internally, a single AWS Glue job performs the aggregations in parallel for the respective data sources.

  • Upon completion of all jobs configured in the Intelligent Ingestion solution, a final success email is sent to the respective teams and the aggregated data is stored in the Amazon S3 enrichment zone.

  • The state machines are loosely coupled, enabling the business to easily plug any specific state machine into or out of the main state machine.

  • The AWS Glue Data Catalog and Amazon S3 are governed by AWS Lake Formation, and access privileges are granted based on the relevant IAM groups or IAM roles.
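The orchestration described in the list above (a main state machine that runs pluggable child state machines in sequence and sends alerts) can be sketched as an Amazon States Language definition generated from a list of stages. This is an assumption-laden illustration, not HCLTech’s actual definition: the stage names, the account ID, the ARNs, and the `build_main_definition` helper are hypothetical, and the data quality branch that raises a ServiceNow ticket is simplified into one generic failure path. The `Resource` values are the standard Step Functions service integrations for nested executions and SNS publishing.

```python
import json

# Step Functions service-integration resources (optimized integrations).
START_EXECUTION_SYNC = "arn:aws:states:::states:startExecution.sync:2"
SNS_PUBLISH = "arn:aws:states:::sns:publish"


def notify_state(topic_arn, message, **transition):
    """Task state that publishes an alert; transition is End=True or Next=..."""
    return {
        "Type": "Task",
        "Resource": SNS_PUBLISH,
        "Parameters": {"TopicArn": topic_arn, "Message": message},
        **transition,
    }


def build_main_definition(stages, topic_arn):
    """Chain child state machines; any stage error routes to a failure alert.

    stages is an ordered list of (state_name, child_state_machine_arn).
    """
    if not stages:
        raise ValueError("at least one stage is required")
    states = {}
    for index, (name, child_arn) in enumerate(stages):
        is_last = index == len(stages) - 1
        states[name] = {
            "Type": "Task",
            "Resource": START_EXECUTION_SYNC,  # wait for the child to finish
            "Parameters": {"StateMachineArn": child_arn, "Input.$": "$"},
            "ResultPath": None,  # discard child output, pass the input along
            "Catch": [{
                "ErrorEquals": ["States.ALL"],
                "ResultPath": "$.error",
                "Next": "NotifyFailure",
            }],
            "Next": "NotifySuccess" if is_last else stages[index + 1][0],
        }
    states["NotifySuccess"] = notify_state(topic_arn, "Pipeline succeeded", End=True)
    states["NotifyFailure"] = notify_state(topic_arn, "Pipeline failed", Next="PipelineFailed")
    states["PipelineFailed"] = {"Type": "Fail", "Error": "StageFailed"}
    return {"StartAt": stages[0][0], "States": states}


# Hypothetical ARNs; replace with real resources before deploying.
PREFIX = "arn:aws:states:us-east-1:123456789012:stateMachine:"
TOPIC = "arn:aws:sns:us-east-1:123456789012:pipeline-alerts"
stages = [
    ("Crawler", PREFIX + "crawler"),
    ("DataQuality", PREFIX + "data-quality"),
    ("Curation", PREFIX + "curation"),
    ("Enrichment", PREFIX + "enrichment"),
]
full = build_main_definition(stages, TOPIC)
# "Plugging out" a stage is just removing it from the list and rebuilding.
without_dq = build_main_definition([s for s in stages if s[0] != "DataQuality"], TOPIC)
definition_json = json.dumps(full, indent=2)
```

Because every stage is only a name plus a child state machine ARN, the main definition never needs to know what a stage does internally, which is the loose coupling the listing refers to.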

This solution comprises the following key pillars, each with several features that are crucial for building a robust and complete ingestion solution:

  • Schema Evolution
  • Fully Automated ETL Pipeline
  • Reusable ETL Workflow & Transformation
  • Centralized Data Catalog
  • Data Quality & Governance
  • Dynamic Alert Mechanism, Incident Reporting & Management
Sold by HCLTech
Fulfillment method: Professional Services

Pricing Information

This service is priced based on the scope of your request. Please contact the seller for pricing details.

Support

Please contact us at digitaltransformation@hcl.com, naming the solution you are interested in, to learn more about deployment and our support.