Cleber M.

Data Engineer

Cleber is an experienced and versatile Data Engineer with over six years of expertise in the data field. He demonstrates proficiency in various technologies and is committed to continuous learning and exploring new technologies to stay ahead of the curve.

Cleber excels in designing, implementing, and deploying reliable data products. He has a strong ability to collaborate with stakeholders to define business requirements and ensure effective data ingestion from external sources.

Additionally, he thrives in agile environments, promoting data-driven quality and quick iterations.

Main expertise

  • BigQuery 2 years
  • Data Engineering 5 years
  • ETL 4 years

Other skills

  • Docker 4 years
  • MySQL 3 years
  • Machine Learning 2 years

Brazil

Selected experience

Employment

  • PyData & Ontology Developer

    Related Sciences LLC (via Proxify) - 1 year

    Related Sciences is a biotechnology firm building an AI-driven platform to map global scientific innovation by analyzing scholarly publications, patents, and research networks.

    • Developed and maintained Python-based data pipelines integrating patent and publication data from OpenAlex, USPTO, and PubMed to power drug discovery analytics.

    • Designed and optimized network graph models and ontologies using Neo4j and NetworkX, improving entity linking and semantic search accuracy.

    • Built visualization tools to represent relationships across scientific disciplines, leveraging Plotnine and interactive visualization libraries.

    • Implemented continuous integration workflows with GitHub Actions, mypy type checking, and pre-commit hooks to ensure code reliability.

    • Collaborated closely with AI researchers and data scientists to embed LLM-driven data enrichment and embedding models into production workflows.

    • Contributed to internal documentation and open-source components supporting the “All of Science” knowledge graph initiative.

    Technologies:

    • Python
    • Google Cloud
    • Neo4j
    • BigQuery
    • dbt
    • Large Language Models (LLM)
    • Pipeline optimization
    • PySpark
    • GitHub Actions
  • Senior Data Engineer

    OneTrack - 8 months

    • Implemented a scalable real-time data pipeline using Kafka and ClickHouse to parse events and enrich customer data across hundreds of tenants, supporting cross-provider ad tracking and personalized analytics.

    • Developed and tested different technologies for a new identity resolution system, exploring graph database and Python-based solutions to improve user matching and deduplication while keeping downstream data accurate and consistent.

    • Improved query performance in ClickHouse by implementing key-hashed dictionaries, aggregation tables, and optimizing partitioning and indexing. These enhancements significantly reduced query latency and improved overall system efficiency, enabling faster access to large volumes of analytical data.

    • Introduced dbt to build and maintain incremental data pipelines for preprocessing analytical data, enhancing data reliability and enabling scalable analytics.

    Technologies:

    • MySQL
    • Docker
    • PostgreSQL
    • Redis
    • Python
    • SQL
    • AWS Lambda
    • AWS S3
    • Data Engineering
    • Neo4j
    • Git
    • Data Modeling
    • Dagster
    • dbt
    • ClickHouse
    • Memgraph
    • Pipeline optimization
  • Database Engineer

    One More DMCC (via Proxify) - 1 year 7 months

    One More DMCC is a data engineering and analytics company based in Germany, providing advanced tracking and marketing optimization solutions for digital advertisers and agencies.

    • Designed and optimized high-performance databases in PostgreSQL and ClickHouse to support large-scale marketing analytics workloads.

    • Improved query performance and reduced storage costs by implementing table partitioning, materialized views, and advanced indexing strategies.

    • Automated ETL workflows to ingest multi-source tracking data, ensuring data consistency and low-latency availability for reporting.

    • Collaborated with founders and engineers on schema design, query tuning, and architecture decisions to support rapid feature iteration.

    • Actively participated in code reviews and architecture discussions, promoting transparency, accountability, and technical excellence within the distributed team.

    Technologies:

    • Docker
    • PostgreSQL
    • Python
    • Git
    • Data Modeling
    • ETL
    • ClickHouse
    • Pipeline optimization
  • Senior Data Engineer

    TVA2 LLC (via Proxify) - 3 months

    TVA2 LLC is a U.S.-based data analytics consultancy that builds scalable data pipelines and automates reporting systems for clients across medical, financial, retail, and eCommerce sectors.

    • Built and automated ETL pipelines in Python and AWS Lambda to clean, transform, and deliver analytical data across multiple client environments.

    • Developed SQL data models and optimized database architecture to ensure efficient storage and querying across large datasets.

    • Designed cloud-based automation processes for error logging, tracking, and fault-tolerant execution of data workflows.

    • Supported data visualization initiatives by preparing datasets for Tableau and Power BI, enabling clients to derive actionable insights.

    • Collaborated with cross-functional teams and clients to define project requirements, ensuring data accuracy and timely delivery.

    Technologies:

    • Python
    • SQL
    • AWS Lambda
    • Pandas
    • Linux
    • Git
    • Data Modeling
    • ETL
    • REST API
  • Data Engineer

    FreshBooks - 1 year 6 months

    • Implemented a reliable and centralized single source of truth for users' and customers' data, ensuring daily refreshes and storage of historical changes.
    • Planned and built Python API integrations to ingest external data sources that were not supported by Fivetran. Implemented dbt transformation layers and conducted data quality testing to ensure the accuracy and reliability of the integrated data.
    • Transferred resource-intensive data transformations from Looker to dbt, resulting in optimized processes and reduced costs.

    Technologies:

    • Docker
    • Python
    • SQL
    • Data Engineering
    • BigQuery
    • Git
    • Data Modeling
    • Dimensional modeling
    • Fact Data Modeling
    • dbt
    • Data Quality
  • Data Engineer

    Pmweb / Oto CRM - 8 months

    • Took charge of refactoring a critical legacy pipeline, significantly improving its performance and scalability.
    • Designed and implemented a Python command line interface to streamline the data engineering team's preparation of new Oto CRM client infrastructures.
    • Strategized and executed the development of a Python API integration to ingest external data sources and enrich them with internal data from various sources.
    • Demonstrated strong skills in data warehousing, data quality, Git, data modeling, data ingestion, Apache Airflow, communication, problem-solving, data pipelines, NoSQL, ETL (Extract, Transform, Load), data governance, Python programming language, SQL, big data, team leadership, machine learning, and agile methodologies.
  • Lead Data Engineer

    Pmweb / Oto CRM - 8 months

    • Successfully led a team of five data engineers, providing mentorship, task delegation, and code reviews to ensure high-quality deliverables across all data team projects;

    • Took charge of refactoring a critical legacy pipeline, significantly improving its performance and scalability. The release of the new version granted customers more autonomy in building the campaign creation platform, reducing the average time for client requests received by the data engineering team by 1 hour;

    • Developed new data routines utilized by Oto CRM's 50+ Brazilian retail clients, enabling the processing of customer data for millions of customers. These routines facilitated the implementation of optional features on the Oto CRM user interface, such as campaign priorities, preferential customer store, best customer-seller matching, best customer-store matching, random redistribution, weighted redistribution, and multiple ready-to-use campaigns;

    • Designed and implemented a Python command line interface to streamline the data engineering team's preparation of new Oto CRM client infrastructures. This interface significantly reduced human error and improved the onboarding experience for new clients by expediting the data provisioning process;

    • Strategized and executed the development of a Python API integration to ingest external data sources and enrich them with internal data from various sources;

    • Demonstrated strong skills in data warehousing, data quality, Git, data modeling, data ingestion, Apache Airflow, communication, problem-solving, data pipelines, NoSQL, ETL (Extract, Transform, Load), data governance, Python programming language, SQL, big data, team leadership, machine learning, and agile methodologies.

    Technologies:

    • MySQL
    • Docker
    • Python
    • SQL
    • Data Engineering
    • Git
    • Data Analytics
    • Data Modeling
    • Dimensional modeling
    • Fact Data Modeling
    • dbt
    • ClickHouse
    • Data Quality
  • Business Intelligence Specialist

    Pmweb / Oto CRM - 2 years 5 months

    • Developed a customer identifier engine capable of identifying and assigning internal user identifiers to behavioral website, CRM, and omnichannel transactional data.
    • Restructured the RFM Ledger, a critical data pipeline that generates historical Recency, Frequency, and Monetary Value groupings for each customer based on custom rules.
    • Refactored MySQL queries and built new ones in ClickHouse to align with specific business requirements.
    • Created a Python reporting automation tool that automated data extraction from analytical tools and generated ready-to-use reporting files.
    • Demonstrated expertise in data warehousing, Git, data modeling, data ingestion, Apache Airflow, communication, problem-solving, data pipelines, NoSQL, ETL (Extract, Transform, Load), Python programming language, SQL, big data, machine learning, data analysis, and agile methodologies.
  • Senior Business Intelligence Analyst

    Pmweb / Oto CRM - 2 years 5 months

    • Developed a customer identifier engine capable of identifying and assigning internal user identifiers to behavioral website, CRM, and omnichannel transactional data. This engine facilitated scalable and customized data integration from diverse external data sources;

    • Restructured the RFM Ledger, a critical data pipeline that generates historical Recency, Frequency, and Monetary Value groupings for each customer based on custom rules. By implementing the SQL template technique to parameterize the query according to Oto CRM clients' settings, the average processing speed of this pipeline increased by over 75%;

    • Refactored MySQL queries and built new ones in ClickHouse to align with specific business requirements. These queries were parameterized and made available as metrics for Oto CRM users, significantly reducing latency by up to 95%. The metrics included Omnichannel media attribution, CRM analytical view of store performance, CRM analytical view of product sales, analytical view of personas and clusters by store, Digital channel performance, and Ranking of sales reps by different KPIs;

    • Created a Python reporting automation tool that automated data extraction from analytical tools and generated ready-to-use reporting files. This automation significantly reduced the time spent on repetitive tasks within the team;

    • Demonstrated expertise in data warehousing, Git, data modeling, data ingestion, Apache Airflow, communication, problem-solving, data pipelines, NoSQL, ETL (Extract, Transform, Load), Python programming language, SQL, big data, machine learning, data analysis, and agile methodologies.

    Technologies:

    • MySQL
    • Python
    • SQL
    • Data Engineering
    • Data Analytics
    • ClickHouse
    • Pipeline optimization
  • Data Analyst - Conversion Rate Optimization

    Pmweb, a Wunderman company - 1 year 1 month

    • Conducted A/B testing experiments, designing and analyzing experiments for clients with diverse key business requirements;
    • Utilized various tools such as Maxymiser, Google Optimize, Adobe Target, Hotjar heatmaps, Google Analytics, and Adobe Analytics to facilitate A/B testing processes;
    • Designed experiment frameworks, including hypothesis formulation, test group definition, sample size determination, and measurement metrics selection;
    • Collected and analyzed experiment data, applying statistical techniques to derive insights and make data-driven recommendations;
    • Demonstrated strong skills in communication, problem-solving, data analytics, Python programming language, SQL, Machine Learning, Data analysis, and Agile methodologies.

    Technologies:

    • Data Analytics
    • Machine Learning
  • Metrics Analyst

    Herval Group - 4 months

    • Conducted e-commerce data analysis and web analytics for Grupo Herval, leveraging external tools and internal databases to gather relevant data;

    • Identified opportunities for improving marketing results by reducing costs and optimizing campaigns based on data analysis;

    • Gathered requirements from stakeholders to understand their critical business needs for reporting purposes;

    • Developed interactive dashboards to provide stakeholders with insightful and actionable information for informed decision-making;

    • Collaborated closely with stakeholders to ensure the dashboards met their specific requirements and provided valuable insights;

    • Played a key role in improving marketing strategies and overall business performance by utilizing data-driven insights and analytics.

    Technologies:

    • Data Analytics
  • Digital Marketing Analyst

    Calçados Beira Rio S/A - 2 years 7 months

    • Managed marketing analytics responsibilities, specifically handling Facebook and Instagram promotions for new products;

    • Oversaw the reporting of customer feedback and analyzed the data to derive actionable insights for marketing strategies;

    • Conducted website analytics to monitor and optimize performance, ensuring effective online presence and customer engagement;

    • Collaborated with the IT sector to implement automation processes in various websites, including Beira Rio Conforto, Moleca, Vizzano, Molekinha, and Modare Ultraconforto;

    • Utilized effective communication skills to collaborate with cross-functional teams and stakeholders, ensuring alignment and understanding of marketing goals and strategies;

    • Demonstrated proficiency in data analytics and data analysis techniques to drive informed decision-making in marketing initiatives;

    • Successfully leveraged marketing analytics and automation to improve campaign performance, customer satisfaction, and overall online presence.

    Technologies:

    • Data Analytics

Education

  • MSc. MBA - Data Engineering

    FIAP · 2021 - 2022

  • MSc. MBA Business Analytics

    Federal University of Rio Grande do Sul · 2018 - 2019

  • Standalone course: NanoDegree - Predictive Analytics for Business

    Udacity · 2017 - 2018

  • Standalone course: NanoDegree - Data Analytics

    Udacity · 2017 - 2018

  • BSc. Advertising and Marketing

    Unisinos · 2013 - 2017
