Rihab B.

Data Engineer

Tunisia

Trusted member since 2024

7 years of experience

Alongside her technical abilities, Rihab has broad experience in leadership and project management. One of her key achievements is building a data curation service while also serving as Scrum Master, where she managed a team and implemented a new data service using Scala.

Rihab’s mix of strong technical skills and leadership experience makes her a great fit for projects in regulated industries.

Main expertise

AWS S3: 5 years

ETL: 5 years

MLOps: 2 years

Jenkins: 4 years

Experience

Senior Data Engineer

Data4Geeks

Data Analytics

Jan 2023 · 3y 3m

Designed and implemented data pipelines for both batch and stream processing, optimizing data flow and efficiency;
Explored and implemented data pipelines using AWS Glue and PySpark, ensuring scalability and robustness;
Integrated Delta Lake into the pipelines to enable delta processing, enhancing data management capabilities;
Developed job templating using Jinja to streamline the creation and management of data processing jobs;
Built and automated data validation pipelines, ensuring the accuracy and reliability of processed data;
Deployed and configured Trino to facilitate efficient data access and querying across various sources;
Prepared comprehensive documentation for each component and tool explored, ensuring knowledge transfer and easy maintenance;
Utilized tools such as Python, PySpark, Glue (Jobs, Crawlers, Catalogs), Athena, AWS, MWAA (Airflow), Kubernetes, Trino, and Jinja to achieve project goals.

AWS

Databricks

Apache Spark

Python

AWS S3

Senior Data Engineer

Data4Geeks

Jan 2023 · 3y 3m

Design & Implementation of a Forecasting Platform – Engie (French Global Energy Company)

Designed and implemented a comprehensive forecasting platform tailored to the global energy sector.
Developed data pipelines using Python and PySpark, ensuring efficient and scalable data processing.
Orchestrated job workflows using Airflow and Databricks, optimizing task management and execution.
Implemented data engineering processes utilizing Databricks’ Delta Live Tables (DLT) for robust data management.
Built and deployed data stream processing pipelines using DLTs, enabling real-time data processing capabilities.
Developed Feature Store APIs for interaction with components and created reusable templates to standardize processes.
Utilized MLflow to build, manage, and track experiments and machine learning models, ensuring rigorous experimentation.
Managed the lifecycle of ML models using MLOps techniques, implementing reusable templates for consistency and efficiency.
Created dashboards for data analysis and visualization, facilitating data-driven decision-making.
Developed APIs using .NET/C# to expose data, ensuring seamless integration and accessibility across systems.
Employed tools such as Databricks, PySpark, Python, R, SQL, Glue, Athena, Kubernetes, and Airflow to deliver a robust and scalable solution.

AI/Data Engineer

Data4Geeks

Data Analytics

Jan 2022 - Dec 2023 · 1y 11m

Led projects focused on integrating Large Language Models (LLMs) and AI technologies, driving innovation within the organization;
Assisted in designing and implementing data migration solutions, ensuring seamless transitions for various clients;
Developed integrations and clients for vector databases, leveraging different open-source AI tools to enhance capabilities;
Actively communicated with clients to gather requirements and ensure alignment with their specific needs;
Utilized tools such as Python, Google Cloud Platform (GCP), and Datastax to deliver robust solutions.

Cassandra

Python

Google Cloud

TensorFlow

Git

LangChain

Software Engineering Manager/Senior Data Engineer

Cognira

Information Technology (IT) and Services

Jan 2022 - Jul 2022 · 6m

Building and supporting promotion planning demo solution

Developed generic data pipelines to transform raw client data into a format compatible with the data model of the promotion planning demo system;
Wrote scripts to generate meaningful business data, ensuring alignment with the needs of the application;
Collaborated with the science team to understand business requirements and determine the necessary data transformations to enhance data utility;
Designed and implemented a generic PySpark codebase that efficiently transforms data to fit the required data model;
Utilized tools such as PySpark, JupyterHub, Kubernetes, and Azure Data Lake to execute and support the project.

Docker

Databricks

Apache Spark

Maven

Kubernetes

Senior Data Engineer

Data4Geeks

Financial Technology (FinTech)

Oct 2021 - Jul 2024 · 2y 9m

Implementing and Migrating Data Pipelines, and Supporting Legacy Systems – SumUp (German Fintech Company)

Designed and implemented data pipelines for both batch and stream processing, optimizing data flow and efficiency;
Explored and implemented data pipelines using AWS Glue and PySpark, ensuring scalability and robustness;
Integrated Delta Lake into the pipelines to enable delta processing, enhancing data management capabilities;
Developed job templating using Jinja to streamline the creation and management of data processing jobs;
Built and automated data validation pipelines, ensuring the accuracy and reliability of processed data;
Deployed and configured Trino to facilitate efficient data access and querying across various sources;
Prepared comprehensive documentation for each component and tool explored, ensuring knowledge transfer and easy maintenance;
Utilized tools such as Python, PySpark, Glue (Jobs, Crawlers, Catalogs), Athena, AWS, MWAA (Airflow), Kubernetes, Trino, and Jinja to achieve project goals.

PostgreSQL

AWS

Python

Terraform

AWS Athena

Software Engineering Manager/Senior Data Engineer

Cognira

Retail

Jan 2019 - Jan 2022 · 3y

Building a Data Curation Platform

Implemented a platform designed to make building data pipelines generic, easy, scalable, and quick to assemble for any new client;
Prepared detailed design documents, architectural blueprints, and specifications for the platform;
Gathered and documented requirements, creating specific epics and tasks, and efficiently distributed work among team members;
Developed command-line and pipeline functionalities that enable chaining transformations, facilitating the creation of generic data pipelines;
Supported the management of metadata for various entities defined within the platform;
Conducted runtime analysis and optimized the performance of different platform functionalities;
Studied scalability requirements and designed performance improvement strategies to enhance the platform's robustness;
Built a PySpark interface to facilitate seamless integration with data science workflows.

Scala

Azure Blob storage

Software Engineering Manager/Senior Data Engineer

Cognira

Retail

Sep 2017 - Aug 2022 · 4y 11m

Developed generic data pipelines to transform raw client data into a format compatible with the data model of the promotion planning demo system;
Wrote scripts to generate meaningful business data, ensuring alignment with the needs of the application;
Collaborated with the science team to understand business requirements and determine the necessary data transformations to enhance data utility;
Designed and implemented a generic PySpark codebase that efficiently transforms data to fit the required data model;
Utilized tools such as PySpark, JupyterHub, Kubernetes, and Azure Data Lake to execute and support the project.

Scala

Azure Blob storage

Software Engineering Manager/Senior Data Engineer

Cognira

Sep 2017 - Aug 2022 · 4y 11m

Led the team in building data pipelines to support a retailer's promotion planning solution;
Participated in meetings with business and data science teams to understand and identify project needs;
Collaborated with the team to translate business requirements into actionable epics and stories;
Designed and implemented the identified business requirements, ensuring alignment with project goals;
Developed and executed unit tests to ensure the functional correctness of implementations;
Created a data loader application using Scala Spark to load data from Parquet files into Cosmos DB/Cassandra API;
Implemented an online forecaster API using Scala, Akka, and Docker to enable real-time promotion forecasting;
Managed the deployment of the project on the client’s Kubernetes cluster, ensuring smooth operation and integration;
Utilized tools such as Scala, Spark, Azure Databricks, Azure Data Lake, and Kubernetes to achieve project objectives.

R&D Engineer

Cognira

Information Technology (IT) and Services

Sep 2017 - May 2019 · 1y 8m

Project 1: Building a Speech Recognition Solution

Developed a speech recognition solution aimed at transforming retailers' questions and commands into actionable tasks executed against a user interface (UI);
Utilized TensorFlow, Python, AWS, and Node.js to design and implement the solution, ensuring seamless interaction between the speech recognition engine and the UI.

Project 2: Design and Implementation of a Short Life Cycle Forecasting System

Prepared comprehensive design documents and conducted studies on existing AI solutions, with a focus on voice and speech recognition capabilities;
Collaborated with the team to prepare and collect relevant data for the project;
Executed the processes of data augmentation, validation, and transformation to extract essential information for forecasting purposes;
Contributed to building a user interface and integrated backend functionalities using tools such as TensorFlow, Python, AWS, JavaScript, Node.js, Scala, and Spark.

Python

Scala

Azure Blob storage

TensorFlow

Machine Learning

Fullstack Data Scientist

Infor

Information Technology (IT) and Services

Aug 2014 - Sep 2017 · 3y 1m

Designed and structured the architecture for various components of a retail forecasting project;
Implemented and deployed key components, ensuring seamless functionality within the overall system;
Integrated all components, automating the processes and establishing an end-to-end batch process for streamlined operations;
Optimized the runtime and performance of each component, enhancing the system's overall efficiency;
Developed forecast comparison templates to facilitate the evaluation of forecast quality, aiding in accurate performance assessments;
Utilized LogicBlox, Python, and Tableau to achieve project goals, ensuring high-quality results.

Python

Data Science

Data Engineering

Machine Learning

Integration Testing

Tableau

Assessments

Engineering excellence

Rihab’s overall performance in a 90-minute live technical assessment ranks in the top 25% of vetted Data Engineers at Proxify.

Certificates 1

Databricks Certified Data Engineer Associate, Databricks, Inc.

Issued Feb 2025 - Expires Feb 2027
Credential ID 133741658

Databricks

Data Engineering

Education

NSO

National School Of Computer Science

Computer Science, 2011 - 2014
