Arthur J.

Data Engineer

Data engineer with 6+ years of experience in Python, Apache Spark, Data Engineering, Apache Hive, and ETL.

Arthur is a passionate, results-focused data engineer with ten years of experience building reliable data pipelines and BI dashboards and solving problems through data for international companies. He has worked mainly on big data and machine learning.

He has a strategic mindset focused on understanding context, testing hypotheses, drawing conclusions based on facts, and establishing a data-driven culture with team members. Analytical and well organized, with a strong theoretical background in engineering and mathematics, Arthur quickly learns new technologies.

He can play a vital role throughout the engagement's development/support life cycle to ensure quality solutions.

Main expertise
  • Python 8 years
  • Data Engineering 6 years
  • Apache Spark 6 years
Other skills
  • Git 6 years
  • Scrum 5 years
  • Java 5 years

Brazil

Selected experience

Employment

  • Data Engineer

    Thoughtworks - 1 year 9 months

    • Data migration using Azure Data Factory. Data processing using Apache Spark on Databricks. Processing automation using Python.

    Technologies:

    • Python
    • Apache Spark
    • ETL
    • Databricks
    • Scrum
    • Azure Data Factory
  • I.T. Analyst/Data Engineer

    Grupo Pão de Açúcar - 1 year 5 months

    • Data ETL from Teradata DW using Sqoop on Hive, Impala, and Apache Kudu. Data processing with Apache Spark 2 in a Hadoop environment. Maintenance of legacy systems using Python, Shell Script (Bash), and Java.

    Technologies:

    • Apache Spark
    • ETL
    • Bash
    • Apache Hive
    • Java
  • I.T. Analyst/Data Engineer

    Nextel (Stefanini IT Solutions contractor) - 3 months

    • Data loading from PostgreSQL using Sqoop, Apache Spark 2, and Python 3. Data versioning in a “snapshot” table with Apache Spark 2.

    Technologies:

    • Python
    • Apache Spark
    • ETL
    • PostgreSQL
  • I.T. Analyst/Data Engineer

    Semantix - 1 year 1 month

    • Data analysis using Hive and Impala (Cloudera distribution). Data processing in the Hadoop environment. Automation script development using Python and Shell script. Result of IoT engagements. Real-time batch processing using Apache Spark, Kafka, and Elasticsearch.

    Technologies:

    • Python
    • Apache Spark
    • ETL
    • Shell
    • Apache Hive
    • Apache Kafka
    • Elasticsearch

Education

  • BSc, Computer and Information Sciences (Dropout)

    Universidade Federal do ABC · 2015 - 2019
