Software Engineer II (Storage Services)

What you’ll do

  • Implement, manage, and monitor databases in our Apache Iceberg-powered data lake environment to ensure high levels of data availability and performance.
  • Work with data engineering teams to design and implement scalable database schemas and optimize data storage and retrieval processes.
  • Perform regular database maintenance tasks such as backups, indexing, and performance tuning to ensure data integrity and efficiency.
  • Develop and implement data security measures, including access controls and encryption, to protect sensitive information.
  • Collaborate with data analysts and business teams to understand data requirements and ensure the database meets business needs.
  • Troubleshoot and resolve database-related issues in a timely manner.
  • Stay current with emerging technologies and advancements in lakehouse architectures, specifically Apache Iceberg, and recommend and implement improvements to our data infrastructure.
  • Document database architectures, procedures, and processes for internal use and compliance purposes.

What we look for

  • Knowledge of SQL:
    • Proven experience as a Database Administrator, with a strong preference for experience in managing data lakes and using Apache Iceberg.
    • Deep understanding of database principles, architecture, and data modeling techniques.
  • Python: 1-3 years of general Python experience
  • Spark:
    • Extensive experience writing Spark SQL and working with DataFrames
    • Experience debugging Spark applications using metrics, the history server, etc.
    • Understanding of shuffling and re-partitioning concepts
    • Understanding of off-heap and on-heap memory usage in Spark
    • Understanding of joins in a distributed context, e.g. sort-merge vs. broadcast joins (nice to have)