Victor Sotero

Senior Data Engineer

Professional Experience

Geely Technology Europe

Senior Data Engineer (through Software by Quokka)

  • Continuing the EU-integrated data platform work across Geely Technology Europe operations
  • Evolving Databricks + Unity Catalog data products from PoC and stakeholder demos into production solutions
  • Designing and orchestrating governed pipelines that ingest, canonicalize, and serve data back to users and markets

PySpark · Delta Lake · Databricks · AWS · Redshift · MySQL · NoSQL

Zeekr Technology Europe

Senior Data Engineer (through Software by Quokka)

  • Building an EU-integrated data platform uniting scattered sources into a single Databricks + Unity Catalog layer
  • Aligning customer-data flows with EU Data Act compliance requirements
  • Designing and orchestrating pipelines to ingest, canonicalize, govern, and serve data back to users and markets
  • Integrating data resources across European operations
  • Running PoCs and stakeholder demos that are maturing into production solutions

PySpark · Delta Lake · Databricks · AWS · Redshift · MySQL · NoSQL

Software by Quokka

Senior Data Engineer

  • Developed HR solutions, including an Employee Portal and a User Management Service
  • Created a KPI dashboard for employee metrics with integrated feedback services
  • Built a standardized CV generator to streamline employee documentation

TypeScript · Node.js · React · AWS

JCPenney

Senior Data Engineer (through Globant)

  • Built the entire Marketing Technology data platform in-house, replacing an outsourced solution and significantly reducing data processing costs
  • Enabled data ownership for JCPenney by migrating data and by optimizing and fixing pipelines and dashboards
  • Leveraged AWS EMR, Airflow, and Spark capabilities to deliver high-quality datasets for decision-making
  • Responsible for daily development and maintenance of Big Data processing pipelines on AWS EMR
  • Ingested data from various sources into Redshift for business intelligence and analytics

PySpark · AWS EMR · Airflow · Redshift · Marketing Tech

Adobe

Data Engineer (through Globant)

  • Part of the team that migrated terabyte-scale data and processing from on-premises to AWS
  • Participated in workload migration from on-premises (Hive, Cloudera, Trino, Oozie) to the cloud (AWS EMR, S3, KDA, QuickSight)
  • Utilized AWS EMR for processing data with S3 for storage and data lake purposes

AWS EMR · S3 · Hive · PySpark · Trino · QuickSight

Banco Inter

Data Engineer (through Dadosfera)

  • Migrated SQL procedures to PySpark for scalable processing
  • Processed data that a Kafka-based CDC platform delivered to S3 in Parquet format
  • Curated and materialized data into Delta Tables
  • Used Airflow to orchestrate ephemeral EMR cluster deployments

EMR · S3 · PySpark · Kafka · Delta Lake · Airflow

Ciclic

Data Analyst

  • Led implementation of a data catalog tool enabling self-serve data analytics
  • Developed ETL pipelines using Airflow from multiple sources (MySQL, RDS, S3, APIs) to Redshift
  • Conducted data cleansing, transformation, and modeling using dbt
  • Analyzed CRM data, generating insights that drove a 15x higher conversion rate on top campaigns

Airflow · Redshift · dbt · Python

Hamoye.com

Data Intern

  • Assisted in data cleaning and preprocessing to ensure data quality
  • Collaborated with data science and engineering teams on ETL pipelines
  • Performed ad-hoc data analysis and generated reports for business insights
  • Automated repetitive data tasks through scripting

Ilhasoft Tecnologia (now Weni)

NLP Junior Researcher

  • Conducted Natural Language Processing research focusing on chatbot interactions
  • Developed and evaluated models using Python and NLP libraries
  • Collaborated on refining conversational flows for Weni's platform

NEES - Núcleo de Excelência em Tecnologias Sociais

Graduate Research Assistant

  • Published research at an international conference (CSEDU) on the use of educational technologies in medical education

Key Projects

EU Data Platform - Zeekr & Geely Technology Europe

Building an EU-integrated data platform that started under Zeekr Technology Europe and continues under Geely Technology Europe, uniting scattered data sources into a single Databricks + Unity Catalog layer and ensuring compliance with the EU Data Act across European operations.

Databricks · Unity Catalog · PySpark · Terraform · GitHub Actions · SQL · AWS

Cloud Migration - Adobe

Engineered the migration of terabyte-scale big data workloads from legacy on-premises infrastructure (Cloudera, Hive, Oozie) to a modern AWS cloud architecture. Optimized data pipelines for cost and performance.

AWS EMR · S3 · Hive · PySpark · AWS Glue · Terraform · SQL

Marketing Technology Platform - JCPenney

Built the entire Marketing Technology data platform from scratch, replacing an expensive outsourced solution. Ingested data from various marketing sources into a unified platform.

PySpark · AWS EMR · Airflow · Redshift · Python · SQL · Terraform

CDC Data Platform - Banco Inter

Worked as a Data Engineer on projects within Banco Inter, migrating SQL procedures to PySpark to scale processing for Brazil's leading digital bank. Used EMR, S3, PySpark, Kafka, Delta Lake, and Airflow to build robust data pipelines.

Kafka · Delta Lake · PySpark · Airflow · Spark Streaming · AWS EMR