Gopal G.

Data Engineer

Gopal is a data engineer with more than eight years of experience in regulated sectors such as automotive, technology, and energy. He is highly proficient with GCP, Azure, AWS, and Snowflake, and has expertise in full-lifecycle development, data modeling, database architecture, and performance optimization.

His proudest achievements include building and optimizing ETL/ELT pipelines in multi-cloud environments. Gopal's Google Cloud, AWS, Microsoft Azure, and Snowflake certifications underline his commitment to continuous learning and professional excellence.

He holds a master's degree in computer software engineering.

Main expertise
  • Fact Data Modeling 8 years
  • ETL 8 years
  • Unix shell 7 years
Other skills
  • Pandas 4 years
  • MySQL 4 years
  • Apache ZooKeeper 4 years
Gopal G.

United Kingdom


Selected experience

Employment

  • Data Engineer

    Nissan Motor Corporation - 1 year 1 month

    • Designing and implementing efficient and scalable data pipelines on Google Cloud Platform (GCP) to collect, process, and transform raw data into usable formats for analysis and consumption;

    • Leading and managing offshore teams to successfully implement various data engineering tasks, ensuring alignment with project goals and maintaining high-quality standards through regular communication, clear documentation, and effective task delegation;

    • Overseeing governance and compliance of data stored in BigQuery, ensuring adherence to UK and EU GDPR regulations;

    • Conducting Data Privacy Impact Assessments (DPIA) for various projects at Nissan UK Limited and implementing necessary measures to mitigate or reduce risks;

    • Building and maintaining data warehouses, data lakes, and data lakehouses on GCP using services like BigQuery, Google Cloud Storage (GCS), and Bigtable;

    • Integrating data from various sources into GCP using services like Cloud Storage, Cloud Pub/Sub, and Cloud SQL;

    • Implementing proper data governance and security measures using GCP Identity and Access Management (IAM) and Data Loss Prevention (DLP) for compliance;

    • Building data pipelines using Google Dataflow to handle large volumes of data efficiently;

    • Implementing ETL/ELT processes to extract data from various sources and load it into data warehouses or data lakes;

    • Developing streaming pipelines for real-time data ingestion utilizing Kafka and Kafka Connect;

    • Implementing Python-based transformations and BigQuery procedures, orchestrating their execution seamlessly with Google Cloud Composer;

    • Engineering transformations using Apache Beam, optimized for peak performance on Google Dataproc clusters (a minimal sketch follows below).
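
    To illustrate the kind of Beam transformation mentioned in the last bullet, here is a minimal sketch; the bucket path, field names, and aggregation are hypothetical rather than details of the Nissan project, and it can run locally with the DirectRunner or be submitted to Dataflow/Dataproc by changing the pipeline options.

    ```python
    # Hypothetical sketch of a Beam batch transformation; bucket, file, and
    # field names are illustrative only.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions


    def parse_csv_line(line: str) -> dict:
        """Split a raw CSV line into a simple record (no quoting handled)."""
        vehicle_id, region, units = line.split(",")
        return {"vehicle_id": vehicle_id, "region": region, "units": int(units)}


    def run() -> None:
        options = PipelineOptions()  # e.g. --runner=DataflowRunner --project=...
        with beam.Pipeline(options=options) as p:
            (
                p
                | "Read raw CSV" >> beam.io.ReadFromText("gs://example-bucket/raw/sales.csv")
                | "Parse" >> beam.Map(parse_csv_line)
                | "Key by region" >> beam.Map(lambda r: (r["region"], r["units"]))
                | "Sum per region" >> beam.CombinePerKey(sum)
                | "Format" >> beam.Map(lambda kv: f"{kv[0]},{kv[1]}")
                | "Write" >> beam.io.WriteToText("gs://example-bucket/curated/region_totals")
            )


    if __name__ == "__main__":
        run()
    ```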

    Technologies:

    • Fact Data Modeling
    • ETL
    • Unix shell
    • Performance Testing
    • Unit Testing
    • AWS S3
    • Data Analytics
    • Looker
    • Snowflake
    • BigQuery
    • Pandas
    • MySQL
    • Data Modeling
    • Database testing
    • Apache ZooKeeper
    • AWS Athena
    • Redshift
    • Python
    • SQL
    • Apache Kafka
    • Apache Airflow
    • Apache Spark
    • Hadoop
    • Google Cloud
    • Data Engineering
  • Lead Data Engineer

    Technovert - 2 years 7 months

    • Developing ETL processes using Python and SQL to transform raw data into usable formats and load it into BigQuery for analysis;

    • Building and architecting multiple data pipelines, managing end-to-end ETL and ELT processes for data ingestion and transformation in GCP, and coordinating tasks among the team;

    • Designing and implementing data pipelines using GCP services such as Dataflow, Dataproc, and Pub/Sub;

    • Migrating Oracle DSR to BigQuery using Dataproc, Python, Airflow, and Looker;

    • Designing and developing a Python ingestion framework to load data from various source systems, including AR modules, inventory modules, files, and web services, into BigQuery;

    • Developing pipelines to load data from customer-placed manual files in Google Drive to GCS and subsequently to BigQuery using BigQuery stored procedures (a minimal load sketch follows this list);

    • Participating in code reviews and contributing to the development of best practices for data engineering on GCP;

    • Implementing data security and access controls using GCP's Identity and Access Management (IAM) and Security Command Center.
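
    As a rough illustration of the GCS-to-BigQuery loading mentioned above, here is a minimal sketch using the google-cloud-bigquery client library; the project, dataset, table, and bucket names are placeholders, and the actual pipelines may instead rely on BigQuery stored procedures as noted in the bullet.

    ```python
    # Hypothetical sketch: load a CSV file staged in GCS into a BigQuery table.
    # Bucket, dataset, and table names are illustrative, not taken from the CV.
    from google.cloud import bigquery


    def load_csv_to_bigquery(uri: str, table_id: str) -> None:
        client = bigquery.Client()  # uses application-default credentials

        job_config = bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.CSV,
            skip_leading_rows=1,          # skip the header row
            autodetect=True,              # infer the schema from the file
            write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
        )

        load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
        load_job.result()  # block until the load job finishes

        table = client.get_table(table_id)
        print(f"Loaded {table.num_rows} rows into {table_id}")


    if __name__ == "__main__":
        load_csv_to_bigquery(
            "gs://example-bucket/manual_uploads/customers.csv",
            "example-project.analytics.customers",
        )
    ```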

    Technologies:

    • Databricks
    • Fact Data Modeling
    • ETL
    • Unix shell
    • Performance Testing
    • Unit Testing
    • AWS S3
    • Oracle
    • Salesforce
    • Data Analytics
    • Microsoft Power BI
    • Snowflake
    • BigQuery
    • Pandas
    • MySQL
    • Data Modeling
    • Database testing
    • Apache ZooKeeper
    • Azure
    • Azure Data Factory
    • Azure Synapse
    • Python
    • SQL
    • Apache Kafka
    • Apache Airflow
    • Apache Spark
    • Hadoop
    • Google Cloud
    • Data Engineering
  • Data Engineer

    Accenture - 1 year 8 months

    • Designing and implementing Snowflake data warehouses, developing schemas, tables, and views optimized for performance and data accessibility;

    • Extracting data from Oracle databases, transforming it into CSV files, and loading these files into a Snowflake data warehouse stage hosted on AWS S3, ensuring secure and efficient data transfer and storage;

    • Creating and utilizing virtual warehouses in Snowflake according to business requirements, effectively tracking credit usage to enhance business insights and resource allocation;

    • Designing and configuring Snowpipe pipelines for seamless and near-real-time data loading, reducing manual intervention, and enhancing data freshness;

    • Parsing XML data and organizing it into structured Snowflake tables for efficient data storage and seamless data analysis;

    • Designing and implementing JSON data ingestion pipelines, leveraging Snowflake's capabilities to handle nested and complex JSON structures (see the sketch after this list);

    • Designing and deploying Amazon Redshift clusters, optimizing schema design, distribution keys, and sort keys for optimal query performance;

    • Leveraging AWS Lambda functions and Step Functions to orchestrate ETL workflows, ensuring data accuracy and timely processing;

    • Creating and maintaining data visualizations and reports using Amazon QuickSight to facilitate data analysis and insights.
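
    A minimal sketch of the JSON ingestion pattern referenced above, using the snowflake-connector-python package; the account, credentials, table, and column names are placeholders rather than details from the Accenture engagement.

    ```python
    # Hypothetical sketch: stage a local JSON file and copy it into a Snowflake
    # table with a VARIANT column. All names and credentials are placeholders.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="example_account",
        user="example_user",
        password="example_password",
        warehouse="LOAD_WH",
        database="ANALYTICS",
        schema="RAW",
    )

    try:
        cur = conn.cursor()
        # Target table with a single VARIANT column for nested JSON documents.
        cur.execute("CREATE TABLE IF NOT EXISTS ORDERS_RAW (DOC VARIANT)")
        # Upload the local file to the table stage, then copy it in as JSON.
        cur.execute("PUT file:///tmp/orders.json @%ORDERS_RAW AUTO_COMPRESS=TRUE")
        cur.execute(
            "COPY INTO ORDERS_RAW FROM @%ORDERS_RAW "
            "FILE_FORMAT = (TYPE = 'JSON' STRIP_OUTER_ARRAY = TRUE)"
        )
        # Query one nested attribute to verify the load.
        cur.execute("SELECT DOC:customer.id::STRING, DOC:total::NUMBER FROM ORDERS_RAW LIMIT 5")
        for row in cur.fetchall():
            print(row)
    finally:
        conn.close()
    ```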

    Technologies:

    • Fact Data Modeling
    • ETL
    • Unix shell
    • Performance Testing
    • Unit Testing
    • Oracle
    • Data Analytics
    • Tableau
    • Data Modeling
    • Database testing
    • Python
    • SQL
    • Data Engineering
  • BI Consultant, General Electric

    Tech Mahindra - 2 years 7 months

    • Designing and implementing Teradata packages to facilitate seamless data extraction, transformation, and loading (ETL) operations from diverse sources into data warehouses;

    • Developing interactive and dynamic reports using SSRS, providing stakeholders with timely and insightful data visualizations for informed decision-making;

    • Conducting rigorous data validation and quality checks to ensure the integrity and accuracy of processed data;

    • Optimizing ETL performance by employing advanced techniques, resulting in a 25% reduction in processing time;

    • Developing the ingestion strategy for loading data from multiple source systems to the operational layer in the data warehouse using Python, SQL, and stored procedures;

    • Understanding and developing design documents as deliverables for the project;

    • Implementing SCD Type 1 and Type 2 functionality and developing custom scripts in Teradata for integration and functionality development for different modules like Primavera P6 and the Oracle Project module (an illustrative SCD Type 2 sketch follows this list);

    • Managing and troubleshooting issues as a DWH analyst to ensure the smooth flow of business operations;

    • Preparing unit test cases and performing end-to-end integration testing;

    • Actively participating in design discussions and reviewing solutions;

    • Participating in peer review discussions on development work before promoting it to higher environments;

    • Loading data from multiple files to a single target table using ODI variables;

    • Configuring and developing ETL mappings to load data from XML and complex (unstructured/semi-structured) files;

    • Utilizing Power BI to design and develop insightful visualizations and interactive dashboards, enabling data-driven decision-making for stakeholders and enhancing overall data engineering solutions.
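
    The SCD Type 2 handling mentioned above was implemented in Teradata; as a language-neutral illustration of the same idea, here is a small pandas sketch with hypothetical columns and data: changed rows are closed out and new current versions are appended.

    ```python
    # Hypothetical sketch of Slowly Changing Dimension Type 2 logic in pandas;
    # column names and data are illustrative, not taken from the project above.
    from datetime import date

    import pandas as pd

    # Existing dimension: one current row per customer, tracked with validity dates.
    dim = pd.DataFrame(
        {
            "customer_id": [1, 2],
            "city": ["London", "Leeds"],
            "valid_from": [date(2020, 1, 1), date(2020, 1, 1)],
            "valid_to": [None, None],
            "is_current": [True, True],
        }
    )

    # Incoming snapshot from the source system (customer 2 moved to Manchester).
    incoming = pd.DataFrame({"customer_id": [1, 2], "city": ["London", "Manchester"]})

    today = date(2024, 1, 1)

    # Compare current rows against the incoming snapshot to find changed customers.
    current = dim[dim["is_current"]]
    merged = current.merge(incoming, on="customer_id", suffixes=("_old", "_new"))
    changed_ids = merged.loc[merged["city_old"] != merged["city_new"], "customer_id"]

    # Type 2: close out the old versions of changed customers...
    expired = dim["customer_id"].isin(changed_ids) & dim["is_current"]
    dim.loc[expired, ["valid_to", "is_current"]] = [today, False]

    # ...and append the new versions as fresh current rows.
    new_rows = incoming[incoming["customer_id"].isin(changed_ids)].assign(
        valid_from=today, valid_to=None, is_current=True
    )
    dim = pd.concat([dim, new_rows], ignore_index=True)

    print(dim.sort_values(["customer_id", "valid_from"]))
    ```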

    Technologies:

    • Fact Data Modeling
    • ETL
    • Unix shell
    • Performance Testing
    • Unit Testing
    • Oracle
    • Data Analytics
    • Tableau
    • Data Modeling
    • SQL
    • Data Engineering

Education

  • MSc. Computer Software Engineering

    University of West London · 2022 - 2023

  • MSc. Electronics and Communications

    Jawaharlal University of Hyderabad · 2012 - 2016
