What you’ll do
- Implement, manage, and monitor databases in our Apache Iceberg-powered data lake environment to ensure high levels of data availability and performance.
 - Work with data engineering teams to design and implement scalable database schemas and optimize data storage and retrieval processes.
 - Perform regular database maintenance tasks such as backups, indexing, and performance tuning to ensure data integrity and efficiency.
 - Develop and implement data security measures, including access controls and encryption, to protect sensitive information.
 - Collaborate with data analysts and business teams to understand data requirements and ensure the database meets business needs.
 - Troubleshoot and resolve database-related issues in a timely manner.
 - Stay current with emerging technologies and advancements in lakehouse architectures, specifically Apache Iceberg, and recommend and implement improvements to our data infrastructure.
 - Document database architectures, procedures, and processes for internal use and compliance purposes.
 
What we look for
- Knowledge of SQL:
- Proven experience as a Database Administrator, with a strong preference for experience in managing data lakes and using Apache Iceberg.
 - Deep understanding of database principles, architecture, and data modeling techniques.
 
 
- Python: 1-3 years of general Python experience
 - Spark:
- Extensive experience writing Spark SQL and working with DataFrames
 - Experience debugging Spark applications via metrics, the history server, etc.
 - Understanding of shuffling and re-partitioning concepts
 - Understanding of off-heap and on-heap memory usage in Spark
 - Understanding of joins in a distributed context, e.g. sort-merge vs. broadcast joins (nice to have)