Cleber M.
Data Engineer
Cleber is an experienced and versatile Data Engineer with over six years of expertise in the data field. He is proficient in a wide range of technologies and committed to continuous learning.
Cleber excels in designing, implementing, and deploying reliable data products. He collaborates effectively with stakeholders to define business requirements and ensure effective data ingestion from external sources.
He also thrives in agile environments, promoting data-driven quality and quick iterations, and stays current with the latest technologies and trends in the field.
Main expertise
- BigQuery, 2 years
- Data Engineering, 5 years
- ETL, 4 years

Other skills
- Docker, 4 years
- MySQL, 3 years
- Machine Learning, 2 years

Selected experience
Employment
PyData & Ontology Developer
Related Sciences LLC (via Proxify) - 1 year
Related Sciences is a biotechnology firm building an AI-driven platform to map global scientific innovation by analyzing scholarly publications, patents, and research networks.
- Developed and maintained Python-based data pipelines integrating patent and publication data from OpenAlex, USPTO, and PubMed to power drug discovery analytics.
- Designed and optimized network graph models and ontologies using Neo4j and NetworkX, improving entity linking and semantic search accuracy.
- Built visualization tools to represent relationships across scientific disciplines, leveraging Plotnine and interactive visualization libraries.
- Implemented continuous integration workflows with GitHub Actions, mypy type checking, and pre-commit hooks to ensure code reliability.
- Collaborated closely with AI researchers and data scientists to embed LLM-driven data enrichment and embedding models into production workflows.
- Contributed to internal documentation and open-source components supporting the “All of Science” knowledge graph initiative.
Technologies:
- Python
- Google Cloud
- Neo4j
- BigQuery
- dbt
- Large Language Models (LLM)
- Pipeline optimization
- PySpark
- GitHub Actions
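The entity-linking work above can be shown in miniature (the identifiers and the DOI join key here are invented for illustration): records from different sources are grouped when they share a normalised key.

```python
from collections import defaultdict

def link_by_key(records):
    """Group record ids that share the same normalised DOI."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["doi"].lower()].append(rec["id"])
    # keep only keys that actually link two or more records
    return {k: v for k, v in groups.items() if len(v) > 1}

records = [
    {"id": "openalex:W1", "doi": "10.1000/XYZ"},
    {"id": "pubmed:999",  "doi": "10.1000/xyz"},
    {"id": "uspto:US123", "doi": "10.2000/abc"},
]
links = link_by_key(records)
```

In a production knowledge graph the linked groups would become merged nodes or `same_as` edges in Neo4j; the toy version only shows the grouping step.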
Senior Data Engineer
OneTrack - 8 months
- Implemented a scalable real-time data pipeline using Kafka and ClickHouse to parse events and enrich customer data across hundreds of tenants, supporting cross-provider ad tracking and personalized analytics.
- Developed and tested different technologies for a new identity resolution system, exploring graph database and Python-based solutions to improve user matching and deduplication while keeping downstream data accurate and consistent.
- Improved query performance in ClickHouse by implementing key-hashed dictionaries, aggregation tables, and optimizing partitioning and indexing. These enhancements significantly reduced query latency and improved overall system efficiency, enabling faster access to large volumes of analytical data.
- Introduced dbt to build and maintain incremental data pipelines for preprocessing analytical data, enhancing data reliability and enabling scalable analytics.
Technologies:
- MySQL
- Docker
- PostgreSQL
- Redis
- Python
- SQL
- AWS Lambda
- AWS S3
- Data Engineering
- Neo4j
- Git
- Data Modeling
- Dagster
- dbt
- ClickHouse
- Memgraph
- Pipeline optimization
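The incremental-pipeline pattern mentioned in the last bullet can be sketched with SQLite standing in for the warehouse (table names and rows are made up): each run copies only rows past the destination's high-water mark, so reruns stay cheap and idempotent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_events   (id INTEGER, ts INTEGER, payload TEXT);
    CREATE TABLE clean_events (id INTEGER, ts INTEGER, payload TEXT);
    INSERT INTO raw_events VALUES (1, 100, 'a'), (2, 200, 'b');
""")

def incremental_load(conn):
    """Copy only rows newer than the destination's high-water mark."""
    (watermark,) = conn.execute(
        "SELECT COALESCE(MAX(ts), -1) FROM clean_events").fetchone()
    conn.execute(
        "INSERT INTO clean_events SELECT * FROM raw_events WHERE ts > ?",
        (watermark,))
    conn.commit()

incremental_load(conn)   # first run copies the two existing rows
conn.execute("INSERT INTO raw_events VALUES (3, 300, 'c')")
incremental_load(conn)   # second run copies only the new row
```

A dbt incremental model expresses the same idea declaratively, with the warehouse (here ClickHouse) doing the filtering.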
Database Engineer
One More DMCC (via Proxify) - 1 year 7 months
One More DMCC is a data engineering and analytics company based in Germany, providing advanced tracking and marketing optimization solutions for digital advertisers and agencies.
- Designed and optimized high-performance databases in PostgreSQL and ClickHouse to support large-scale marketing analytics workloads.
- Improved query performance and reduced storage costs by implementing table partitioning, materialized views, and advanced indexing strategies.
- Automated ETL workflows to ingest multi-source tracking data, ensuring data consistency and low-latency availability for reporting.
- Collaborated with founders and engineers on schema design, query tuning, and architecture decisions to support rapid feature iteration.
- Actively participated in code reviews and architecture discussions, promoting transparency, accountability, and technical excellence within the distributed team.
Technologies:
- Docker
- PostgreSQL
- Python
- Git
- Data Modeling
- ETL
- ClickHouse
- Pipeline optimization
Senior Data Engineer
TVA2 LLC (via Proxify) - 3 months
TVA2 LLC is a U.S.-based data analytics consultancy that builds scalable data pipelines and automates reporting systems for clients across medical, financial, retail, and eCommerce sectors.
- Built and automated ETL pipelines in Python and AWS Lambda to clean, transform, and deliver analytical data across multiple client environments.
- Developed SQL data models and optimized database architecture to ensure efficient storage and querying across large datasets.
- Designed cloud-based automation processes for error logging, tracking, and fault-tolerant execution of data workflows.
- Supported data visualization initiatives by preparing datasets for Tableau and Power BI, enabling clients to derive actionable insights.
- Collaborated with cross-functional teams and clients to define project requirements, ensuring data accuracy and timely delivery.
Technologies:
- Python
- SQL
- AWS Lambda
- Pandas
- Linux
- Git
- Data Modeling
- ETL
- REST API
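A Lambda-style transform step of the kind described above can be sketched as follows; the handler name, event shape, and field names are invented for illustration, not the actual client code.

```python
def handler(event, context=None):
    """Drop records missing an id and normalise amounts to two decimals."""
    cleaned = [
        {"id": r["id"], "amount": round(float(r["amount"]), 2)}
        for r in event.get("records", [])
        if r.get("id") is not None
    ]
    return {"count": len(cleaned), "records": cleaned}

# Example invocation with a hand-made event payload
sample = handler({"records": [{"id": 1, "amount": "19.99"}, {"amount": "5"}]})
```

In AWS the same function would be triggered by S3 or an API Gateway event; keeping the transform pure makes it straightforward to unit-test outside the cloud.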
Data Engineer
FreshBooks - 1 year 6 months
- Implemented a reliable and centralized single source of truth for users' and customers' data, ensuring daily refreshes and storage of historical changes.
- Took charge of planning and building Python API integrations to ingest external data sources that were not supported by Fivetran. Implemented dbt transformation layers and conducted data quality testing to ensure the accuracy and reliability of the integrated data.
- Transferred resource-intensive data transformations from Looker to dbt, resulting in optimized processes and reduced costs.
Technologies:
- Docker
- Python
- SQL
- Data Engineering
- BigQuery
- Git
- Data Modeling
- Dimensional modeling
- Fact Data Modeling
- dbt
- Data Quality
Lead Data Engineer
Pmweb / Oto CRM - 8 months
- Successfully led a team of five data engineers, providing mentorship, task delegation, and code reviews to ensure high-quality deliverables across all data team projects;
- Took charge of refactoring a critical legacy pipeline, significantly improving its performance and scalability. The release of the new version granted customers more autonomy in building the campaign creation platform, reducing the average time for client requests received by the data engineering team by 1 hour;
- Developed new data routines utilized by Oto CRM's 50+ Brazilian retail clients, enabling the processing of customer data for millions of customers. These routines facilitated the implementation of optional features on the Oto CRM user interface, such as campaign priorities, preferential customer store, best customer-seller matching, best customer-store matching, random redistribution, weighted redistribution, and multiple ready-to-use campaigns;
- Designed and implemented a Python command line interface to streamline the data engineering team's preparation of new Oto CRM client infrastructures. This interface significantly reduced human error and improved the onboarding experience for new clients by expediting the data provisioning process;
- Strategized and executed the development of a Python API integration to ingest external data sources and enrich them with internal data from various sources;
- Demonstrated strong skills in data warehousing, data quality, Git, data modeling, data ingestion, Apache Airflow, communication, problem-solving, data pipelines, NoSQL, ETL (Extract, Transform, Load), data governance, Python programming language, SQL, big data, team leadership, machine learning, and agile methodologies.
Technologies:
- MySQL
- Docker
- Python
- SQL
- Data Engineering
- Git
- Data Analytics
- Data Modeling
- Dimensional modeling
- Fact Data Modeling
- dbt
- ClickHouse
- Data Quality
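A provisioning CLI of the kind described above might look like this minimal argparse sketch; the command name, arguments, and defaults are invented and are not the actual Oto CRM tooling.

```python
import argparse

def build_parser():
    """One entry point that validates input instead of manual steps."""
    parser = argparse.ArgumentParser(
        prog="provision-client",
        description="Prepare infrastructure for a new client (sketch).")
    parser.add_argument("client_id", help="slug identifying the client")
    parser.add_argument("--region", default="us-east-1",
                        help="deployment region")
    parser.add_argument("--dry-run", action="store_true",
                        help="print the plan without applying it")
    return parser

# Example: parse a hypothetical invocation
args = build_parser().parse_args(["acme-retail", "--dry-run"])
```

Centralising provisioning behind a parser like this is what reduces human error: invalid invocations fail fast with a usage message instead of half-configured infrastructure.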
Senior Business Intelligence Analyst
Pmweb / Oto CRM - 2 years 5 months
- Developed a customer identifier engine capable of identifying and assigning internal user identifiers to behavioral website, CRM, and omnichannel transactional data. This engine facilitated scalable and customized data integration from diverse external data sources;
- Restructured the RFM Ledger, a critical data pipeline that generates historical Recency, Frequency, and Monetary Value groupings for each customer based on custom rules. By implementing the SQL template technique to parameterize the query according to Oto CRM clients' settings, the average processing speed of this pipeline increased by over 75%;
- Refactored MySQL queries and built new ones in ClickHouse to align with specific business requirements. These queries were parameterized and made available as metrics for Oto CRM users, significantly reducing latency by up to 95%. The metrics included Omnichannel media attribution, CRM analytical view of store performance, CRM analytical view of product sales, analytical view of personas and clusters by store, Digital channel performance, and Ranking of sales reps by different KPIs;
- Created a Python reporting automation tool that automated data extraction from analytical tools and generated ready-to-use reporting files. This automation significantly reduced the time spent on repetitive tasks within the team;
- Demonstrated expertise in data warehousing, Git, data modeling, data ingestion, Apache Airflow, communication, problem-solving, data pipelines, NoSQL, ETL (Extract, Transform, Load), Python programming language, SQL, big data, machine learning, data analysis, and agile methodologies.
Technologies:
- MySQL
- Python
- SQL
- Data Engineering
- Data Analytics
- ClickHouse
- Pipeline optimization
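The SQL-templating idea behind the RFM Ledger speed-up can be sketched as follows: one query skeleton, parameterized per client. The table name and settings here are invented, and identifiers come from trusted configuration, never from user input.

```python
from string import Template

# One RFM skeleton; per-client values are substituted at run time.
RFM_TEMPLATE = Template("""
SELECT customer_id,
       julianday('now') - julianday(MAX(order_date)) AS recency_days,
       COUNT(*)                                      AS frequency,
       SUM(order_value)                              AS monetary
FROM $orders_table
WHERE order_date >= '$since'
GROUP BY customer_id
""")

query = RFM_TEMPLATE.substitute(orders_table="client_42_orders",
                                since="2023-01-01")
```

Rendering one vetted skeleton per client keeps the logic in a single place, which is what makes tuning it once speed up every client's pipeline.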
Data Analyst - Conversion Rate Optimization
Pmweb, a Wunderman company - 1 year 1 month
- Conducted A/B testing experiments, designing and analyzing experiments for clients with diverse key business requirements;
- Utilized various tools such as Maxymiser, Google Optimize, Adobe Target, Hotjar heatmaps, Google Analytics, and Adobe Analytics to facilitate A/B testing processes;
- Designed experiment frameworks, including hypothesis formulation, test group definition, sample size determination, and measurement metrics selection;
- Collected and analyzed experiment data, applying statistical techniques to derive insights and make data-driven recommendations;
- Demonstrated strong skills in communication, problem-solving, data analytics, Python programming language, SQL, Machine Learning, Data analysis, and Agile methodologies.
Technologies:
- Data Analytics
- Machine Learning
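The experiment analysis described above rests on standard significance testing; a minimal sketch (with made-up conversion counts) of a two-sided two-proportion z-test:

```python
from math import sqrt, erf

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test comparing conversion rates of groups A and B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)       # pooled conversion rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value via the normal CDF approximation
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Invented example: 5.0% vs 6.5% conversion on 2,400 visitors per arm
z, p = two_proportion_z(120, 2400, 156, 2400)
```

Sample size determination works the same formula in reverse: fix the detectable lift and significance level, then solve for n per arm.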
Metrics Analyst
Herval Group - 4 months
- Conducted e-commerce data analysis and web analytics for Grupo Herval, leveraging external tools and internal databases to gather relevant data;
- Identified opportunities for improving marketing results by reducing costs and optimizing campaigns based on data analysis;
- Gathered requirements from stakeholders to understand their critical business needs for reporting purposes;
- Developed interactive dashboards to provide stakeholders with insightful and actionable information for informed decision-making;
- Collaborated closely with stakeholders to ensure the dashboards met their specific requirements and provided valuable insights;
- Played a key role in improving marketing strategies and overall business performance by utilizing data-driven insights and analytics.
Technologies:
- Data Analytics
Digital Marketing Analyst
Calçados Beira Rio S/A - 2 years 7 months
- Managed marketing analytics responsibilities, specifically handling Facebook and Instagram promotions for new products;
- Oversaw the reporting of customer feedback and analyzed the data to derive actionable insights for marketing strategies;
- Conducted website analytics to monitor and optimize performance, ensuring effective online presence and customer engagement;
- Collaborated with the IT sector to implement automation processes in various websites, including Beira Rio Conforto, Moleca, Vizzano, Molekinha, and Modare Ultraconforto;
- Utilized effective communication skills to collaborate with cross-functional teams and stakeholders, ensuring alignment and understanding of marketing goals and strategies;
- Demonstrated proficiency in data analytics and data analysis techniques to drive informed decision-making in marketing initiatives;
- Successfully leveraged marketing analytics and automation to improve campaign performance, customer satisfaction, and overall online presence.
Technologies:
- Data Analytics
Education
Master of Science · MBA - Data Engineering
FIAP · 2021 - 2022
Master of Science · MBA - Business Analytics
Federal University of Rio Grande do Sul · 2018 - 2019
Training · NanoDegree - Predictive Analytics for Business
Udacity · 2017 - 2018
Training · NanoDegree - Data Analytics
Udacity · 2017 - 2018
Bachelor of Science · Advertising and Marketing
Unisinos · 2013 - 2017