What will you do:

  • Design, build, and maintain the data pipeline architecture in line with data governance principles, covering everything from external data source ingestion jobs to ETL/ELT jobs (a minimal ingestion sketch follows this list).
  • Establish robust data governance and ensure data integrity by implementing systems that track and maintain data quality within the data warehouse.
  • Continuously explore strategies to refine and streamline existing data processes for greater cost- and time-efficiency.
  • Drive the evolution of services that meet and exceed set technical benchmarks, from coding standards and comprehensive documentation to data input, accurate logging, effective error handling, thorough testing, and sound security practices.
  • Collaborate with cross-functional teams (engineering, product, marketing, business) to tackle their data-related needs.
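
As a rough illustration of the ingestion and data-quality work described above, here is a minimal Python sketch of an ETL job. The API endpoint, connection string, target table, and the 5% quality threshold are all hypothetical placeholders, not a prescribed design:

```python
import requests
import psycopg2

API_URL = "https://api.example.com/orders"  # hypothetical external source
DSN = "postgresql://user:pass@localhost:5432/warehouse"  # illustrative connection string


def extract() -> list[dict]:
    """Pull raw records from the external source."""
    resp = requests.get(API_URL, timeout=30)
    resp.raise_for_status()
    return resp.json()


def transform(rows: list[dict]) -> list[tuple]:
    """Light cleansing plus a data-quality gate: drop rows missing the key."""
    clean = [(r["id"], r["amount"]) for r in rows if r.get("id") is not None]
    if rows and len(clean) < 0.95 * len(rows):  # illustrative quality threshold
        raise ValueError("More than 5% of rows failed validation; aborting load")
    return clean


def load(rows: list[tuple]) -> None:
    """Idempotent upsert into a staging table in the warehouse."""
    with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
        cur.executemany(
            "INSERT INTO staging.orders (id, amount) VALUES (%s, %s) "
            "ON CONFLICT (id) DO UPDATE SET amount = EXCLUDED.amount",
            rows,
        )


if __name__ == "__main__":
    load(transform(extract()))
```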

What are we looking for:

  • 3+ years of experience in data engineering, analytics engineering, or software engineering
  • Extensive knowledge of Python, SQL, and prompt engineering
  • Proficiency with dbt (Data Build Tool) for data transformation and modeling in a modern data stack is essential (see the dbt sketch after this list)
  • Exposure to relational databases (e.g. MySQL, PostgreSQL) and non-relational databases (e.g. ArangoDB, MongoDB, Redis)
  • Experience with at least one cloud data stack is a must, for example:
      • GCP: Composer (workflow orchestration), BigQuery (data warehousing), Dataflow (stream/batch data processing), Pub/Sub (messaging), Cloud Run (containerized apps)
      • AWS: Step Functions (workflow orchestration), Redshift (data warehousing), Data Pipeline/Kinesis (stream/batch data processing), SNS/SQS (messaging), Lambda with Amazon API Gateway (serverless APIs)
  • Able to navigate and work effectively within a DevOps model (e.g. Git, Jenkins, CI/CD, virtual machines, Docker, Kubernetes)
  • Exposure to data governance principles is preferred: data security (PII masking, access control), stewardship, metadata management, data lifecycle, etc. (a masking sketch follows this list)
  • A commitment to continuous learning and staying up-to-date with the latest trends in data engineering
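
For the dbt requirement, a minimal sketch of driving a dbt project programmatically via dbt-core's Python entry point (available in dbt-core 1.5+); the `staging` selector is hypothetical and assumes the project defines models under that name:

```python
from dbt.cli.main import dbtRunner, dbtRunnerResult

# Programmatic equivalent of `dbt run --select staging` (dbt-core >= 1.5).
# The selector is illustrative; it assumes the project defines staging models.
runner = dbtRunner()
result: dbtRunnerResult = runner.invoke(["run", "--select", "staging"])

# Surface failures the same way the CLI would via its exit code.
if not result.success:
    raise RuntimeError(f"dbt run failed: {result.exception}")
```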
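And on the data governance side, one common pattern for the PII masking mentioned above is deterministic hashing, so records stay joinable while raw values are hidden. A standard-library-only sketch; the salt handling and field list are illustrative, not a production design:

```python
import hashlib
import os

# Illustrative: in production the salt would come from a secret manager.
SALT = os.environ.get("PII_SALT", "change-me")

PII_FIELDS = {"email", "phone"}  # hypothetical set of columns to mask


def mask_value(value: str) -> str:
    """Deterministically hash a PII value: joinable token, raw value hidden."""
    return hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()[:16]


def mask_record(record: dict) -> dict:
    """Return a copy of the record with PII fields replaced by masked tokens."""
    return {
        k: mask_value(v) if k in PII_FIELDS and isinstance(v, str) else v
        for k, v in record.items()
    }


print(mask_record({"id": 1, "email": "jane@example.com", "amount": 42}))
```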