Cleber M.

Data Engineer

Top performer 9.3/10
Brazil
Trusted member since 2023
6 years of experience

Cleber excels in designing, implementing, and deploying reliable data products. He has a strong ability to collaborate with stakeholders to define business requirements and ensure effective data ingestion from external sources.

Additionally, he thrives in agile environments, promoting data-driven quality and quick iterations. Cleber is dedicated to staying updated with the latest technologies and trends in the field, showcasing his commitment to continuous learning and professional growth.

Core expertise

BigQuery · 2 years
Data Engineering · 5 years
ETL · 4 years
Python · 6 years

Experience (12)

PyData & Ontology Developer

Related Sciences LLC (via Proxify)
Biotechnology
May 2024 - May 2025 · 1y

Related Sciences is a biotechnology firm building an AI-driven platform to map global scientific innovation by analyzing scholarly publications, patents, and research networks.

  • Developed and maintained Python-based data pipelines integrating patent and publication data from OpenAlex, USPTO, and PubMed to power drug discovery analytics.

  • Designed and optimized network graph models and ontologies using Neo4j and NetworkX, improving entity linking and semantic search accuracy.

  • Built visualization tools to represent relationships across scientific disciplines, leveraging Plotnine and interactive visualization libraries.

  • Implemented continuous integration workflows with GitHub Actions, mypy type checking, and pre-commit hooks to ensure code reliability.

  • Collaborated closely with AI researchers and data scientists to embed LLM-driven data enrichment and embedding models into production workflows.

  • Contributed to internal documentation and open-source components supporting the “All of Science” knowledge graph initiative.

Python
Google Cloud
Neo4j
BigQuery
dbt

Senior Data Engineer

OneTrack
Ad Tech
Oct 2023 - Jun 2024 · 8m
  • Implemented a scalable real-time data pipeline using Kafka and ClickHouse to parse events and enrich customer data across hundreds of tenants, supporting cross-provider ad tracking and personalized analytics.

  • Developed and tested different technologies for a new identity resolution system, exploring graph database and Python-based solutions to improve user matching and deduplication while keeping downstream data accurate and consistent.

  • Improved query performance in ClickHouse by implementing key-hashed dictionaries, aggregation tables, and optimizing partitioning and indexing. These enhancements significantly reduced query latency and improved overall system efficiency, enabling faster access to large volumes of analytical data.

  • Introduced dbt to build and maintain incremental data pipelines for preprocessing analytical data, enhancing data reliability and enabling scalable analytics.

MySQL
Docker
PostgreSQL
Redis
Python

Database Engineer

One More DMCC (via Proxify)
Marketing and Advertising
Oct 2023 - May 2025 · 1y 7m

One More DMCC is a data engineering and analytics company based in Germany, providing advanced tracking and marketing optimization solutions for digital advertisers and agencies.

  • Designed and optimized high-performance databases in PostgreSQL and ClickHouse to support large-scale marketing analytics workloads.

  • Improved query performance and reduced storage costs by implementing table partitioning, materialized views, and advanced indexing strategies.

  • Automated ETL workflows to ingest multi-source tracking data, ensuring data consistency and low-latency availability for reporting.

  • Collaborated with founders and engineers on schema design, query tuning, and architecture decisions to support rapid feature iteration.

  • Actively participated in code reviews and architecture discussions, promoting transparency, accountability, and technical excellence within the distributed team.

Docker
PostgreSQL
Python
Git
Data Modeling

Senior Data Engineer

TVA2 LLC (via Proxify)
Information Technology (IT) and Services
Jul 2023 - Oct 2023 · 3m

TVA2 LLC is a U.S.-based data analytics consultancy that builds scalable data pipelines and automates reporting systems for clients across medical, financial, retail, and eCommerce sectors.

  • Built and automated ETL pipelines in Python and AWS Lambda to clean, transform, and deliver analytical data across multiple client environments.

  • Developed SQL data models and optimized database architecture to ensure efficient storage and querying across large datasets.

  • Designed cloud-based automation processes for error logging, tracking, and fault-tolerant execution of data workflows.

  • Supported data visualization initiatives by preparing datasets for Tableau and Power BI, enabling clients to derive actionable insights.

  • Collaborated with cross-functional teams and clients to define project requirements, ensuring data accuracy and timely delivery.

Python
SQL
AWS Lambda
Pandas
Linux

Data Engineer

FreshBooks
Financial Technology (FinTech)
Apr 2022 - Oct 2023 · 1y 6m
  • Implemented a reliable and centralized single source of truth for users' and customers' data, ensuring daily refreshes and storage of historical changes.
  • Planned and built Python API integrations to ingest external data sources that Fivetran did not support, then implemented dbt transformation layers and data quality tests to ensure the accuracy and reliability of the integrated data.
  • Transferred resource-intensive data transformations from Looker to dbt, resulting in optimized processes and reduced costs.
Docker
Python
SQL
Data Engineering
BigQuery

Data Engineer

Pmweb / Oto CRM
Aug 2021 - Apr 2022 · 8m
  • Took charge of refactoring a critical legacy pipeline, significantly improving its performance and scalability.
  • Designed and implemented a Python command line interface to streamline the data engineering team's preparation of new Oto CRM client infrastructures.
  • Strategized and executed the development of a Python API integration to ingest external data sources and enrich them with internal data from various sources.
  • Demonstrated strong skills in data warehousing, data quality, Git, data modeling, data ingestion, Apache Airflow, communication, problem-solving, data pipelines, NoSQL, ETL (Extract, Transform, Load), data governance, Python programming language, SQL, big data, team leadership, machine learning, and agile methodologies.

Lead Data Engineer

Pmweb / Oto CRM
Customer Relationship Management (CRM)
Aug 2021 - Apr 2022 · 8m
  • Successfully led a team of five data engineers, providing mentorship, task delegation, and code reviews to ensure high-quality deliverables across all data team projects;

  • Took charge of refactoring a critical legacy pipeline, significantly improving its performance and scalability. The release of the new version granted customers more autonomy in building the campaign creation platform, reducing the average time for client requests received by the data engineering team by 1 hour;

  • Developed new data routines utilized by Oto CRM's 50+ Brazilian retail clients, enabling the processing of customer data for millions of customers. These routines facilitated the implementation of optional features on the Oto CRM user interface, such as campaign priorities, preferential customer store, best customer-seller matching, best customer-store matching, random redistribution, weighted redistribution, and multiple ready-to-use campaigns;

  • Designed and implemented a Python command line interface to streamline the data engineering team's preparation of new Oto CRM client infrastructures. This interface significantly reduced human error and improved the onboarding experience for new clients by expediting the data provisioning process;

  • Strategized and executed the development of a Python API integration to ingest external data sources and enrich them with internal data from various sources;

  • Demonstrated strong skills in data warehousing, data quality, Git, data modeling, data ingestion, Apache Airflow, communication, problem-solving, data pipelines, NoSQL, ETL (Extract, Transform, Load), data governance, Python programming language, SQL, big data, team leadership, machine learning, and agile methodologies.

MySQL
Docker
Python
SQL
Data Engineering

Business Intelligence Specialist

Pmweb / Oto CRM
Mar 2019 - Aug 2021 · 2y 5m
  • Developed a customer identifier engine capable of identifying and assigning internal user identifiers to behavioral website, CRM, and omnichannel transactional data.
  • Restructured the RFM Ledger, a critical data pipeline that generates historical Recency, Frequency, and Monetary Value groupings for each customer based on custom rules.
  • Refactored MySQL queries and built new ones in ClickHouse to align with specific business requirements.
  • Created a Python reporting automation tool that automated data extraction from analytical tools and generated ready-to-use reporting files.
  • Demonstrated expertise in data warehousing, Git, data modeling, data ingestion, Apache Airflow, communication, problem-solving, data pipelines, NoSQL, ETL (Extract, Transform, Load), Python programming language, SQL, big data, machine learning, data analysis, and agile methodologies.

Senior Business Intelligence Analyst

Pmweb / Oto CRM
Customer Relationship Management (CRM)
Mar 2019 - Aug 2021 · 2y 5m
  • Developed a customer identifier engine capable of identifying and assigning internal user identifiers to behavioral website, CRM, and omnichannel transactional data. This engine facilitated scalable and customized data integration from diverse external data sources;

  • Restructured the RFM Ledger, a critical data pipeline that generates historical Recency, Frequency, and Monetary Value groupings for each customer based on custom rules. By implementing the SQL template technique to parameterize the query according to Oto CRM clients' settings, the average processing speed of this pipeline increased by over 75%;

  • Refactored MySQL queries and built new ones in ClickHouse to align with specific business requirements. These queries were parameterized and made available as metrics for Oto CRM users, significantly reducing latency by up to 95%. The metrics included Omnichannel media attribution, CRM analytical view of store performance, CRM analytical view of product sales, analytical view of personas and clusters by store, Digital channel performance, and Ranking of sales reps by different KPIs;

  • Created a Python reporting automation tool that automated data extraction from analytical tools and generated ready-to-use reporting files. This automation significantly reduced the time spent on repetitive tasks within the team;

  • Demonstrated expertise in data warehousing, Git, data modeling, data ingestion, Apache Airflow, communication, problem-solving, data pipelines, NoSQL, ETL (Extract, Transform, Load), Python programming language, SQL, big data, machine learning, data analysis, and agile methodologies.

MySQL
Python
SQL
Data Engineering
Data Analytics

Data Analyst - Conversion Rate Optimization

Pmweb, a Wunderman company
Marketing and Advertising
Feb 2018 - Mar 2019 · 1y 1m
  • Designed, ran, and analyzed A/B tests for clients with diverse key business requirements;
  • Utilized various tools such as Maxymiser, Google Optimize, Adobe Target, Hotjar heatmaps, Google Analytics, and Adobe Analytics to facilitate A/B testing processes;
  • Designed experiment frameworks, including hypothesis formulation, test group definition, sample size determination, and measurement metrics selection;
  • Collected and analyzed experiment data, applying statistical techniques to derive insights and make data-driven recommendations;
  • Demonstrated strong skills in communication, problem-solving, data analytics, Python, SQL, machine learning, data analysis, and agile methodologies.
Data Analytics
Machine Learning

Metrics Analyst

Herval Group
Retail
Oct 2017 - Feb 2018 · 4m
  • Conducted e-commerce data analysis and web analytics for Grupo Herval, leveraging external tools and internal databases to gather relevant data;

  • Identified opportunities for improving marketing results by reducing costs and optimizing campaigns based on data analysis;

  • Gathered requirements from stakeholders to understand their critical business needs for reporting purposes;

  • Developed interactive dashboards to provide stakeholders with insightful and actionable information for informed decision-making;

  • Collaborated closely with stakeholders to ensure the dashboards met their specific requirements and provided valuable insights;

  • Played a key role in improving marketing strategies and overall business performance by utilizing data-driven insights and analytics.

Data Analytics

Digital Marketing Analyst

Calçados Beira Rio S/A
Fashion and Apparel
Jul 2014 - Feb 2017 · 2y 7m
  • Managed marketing analytics responsibilities, specifically handling Facebook and Instagram promotions for new products;

  • Oversaw the reporting of customer feedback and analyzed the data to derive actionable insights for marketing strategies;

  • Conducted website analytics to monitor and optimize performance, ensuring effective online presence and customer engagement;

  • Collaborated with the IT department to implement automation processes on several brand websites, including Beira Rio Conforto, Moleca, Vizzano, Molekinha, and Modare Ultraconforto;

  • Utilized effective communication skills to collaborate with cross-functional teams and stakeholders, ensuring alignment and understanding of marketing goals and strategies;

  • Demonstrated proficiency in data analytics and data analysis techniques to drive informed decision-making in marketing initiatives;

  • Successfully leveraged marketing analytics and automation to improve campaign performance, customer satisfaction, and overall online presence.

Data Analytics

Assessments

Engineering excellence

Cleber's overall performance in a 90-minute live technical assessment ranks in the top 25% of Data Engineers assessed at Proxify.

Certificates (1)

ClickHouse Certified Developer
ClickHouse

Issued Aug 2025

SQL
ClickHouse

Education

FIAP
MBA - Data Engineering · 2021 - 2022
Federal University of Rio Grande do Sul
MBA Business Analytics · 2018 - 2019
Udacity
Nanodegree - Predictive Analytics for Business · 2017 - 2018
Udacity
Nanodegree - Data Analytics · 2017 - 2018
Unisinos
Advertising and Marketing · 2013 - 2017
