General
We are looking for Data Engineers with strong Databricks expertise to join our client’s Data Team and contribute to building scalable, reliable, and cloud-native data solutions.
Responsibilities/Activities
- Design, develop, and maintain scalable, high-performance data pipelines using Python, PySpark, Apache Spark, and Databricks
- Build, optimize, and monitor ETL/ELT pipelines for performance, reliability, cost-efficiency, and scalability
- Work extensively with the Databricks Platform, including notebooks, jobs, clusters, workflows, Delta Lake, and Lakehouse architecture
- Develop data transformation processes, data ingestion flows, and data warehousing solutions using SQL, PySpark, and modern ETL/ELT frameworks
- Design and implement scalable Data Lake and Lakehouse solutions to support analytics, reporting, and business intelligence needs
- Build and optimize Delta Lake tables, covering efficient data storage, partitioning, schema evolution, and performance tuning
- Design and implement data models to support data warehouse (DWH), BI, analytics, and downstream data consumption
- Ensure data quality, validation, lineage, consistency, and reliability across data pipelines and datasets
- Work with cloud platforms to build secure, scalable, cloud-native data engineering solutions integrated with Databricks
- Collaborate with BI, analytics, and business teams to ensure seamless data accessibility and accurate delivery of insights
- Implement CI/CD pipelines for automated deployment, testing, monitoring, and lifecycle management of data solutions
- Apply data engineering best practices for performance optimization, governance, security, observability, and maintainability
Requirements
Technical
- At least 4 years of experience in Data Engineering
- Strong hands-on experience with the Databricks Platform
- Strong proficiency in Python, Apache Spark, and PySpark
- Solid experience with Delta Lake and Lakehouse architecture
- Experience designing, building, and optimizing ETL/ELT pipelines in large-scale data environments
- Experience with Databricks workflows, jobs, notebooks, clusters, and performance optimization
- Strong experience in big data processing and distributed data systems
- Advanced SQL skills for data manipulation, validation, transformation, and optimization
- Strong knowledge of Data Lakes, Data Warehousing, Data Modeling, and modern data architecture principles
- Experience with cloud platforms and cloud-native data services
- Experience with workflow orchestration tools
- Experience working with BI tools
- Proven ability to scale data pipelines for efficiency, reliability, and cost-effectiveness
- Solid understanding of CI/CD pipelines and automated deployment practices for data engineering solutions
- Good understanding of data quality, monitoring, governance, and security principles in data platforms
Education
- University degree in Computer Science, Mathematics, or a related field
Other
- Good command of English (spoken and written)
- Strong analytical and problem-solving skills
- Ability to quickly learn and adapt to new technologies
- Ability to work in a fast-paced, agile environment
Nice-to-have requirements
- Experience with streaming data solutions (e.g., Kafka, Pub/Sub)
- Experience with data observability, pipeline monitoring, and automated data quality checks
- Experience with Databricks Asset Bundles, Terraform, or infrastructure-as-code practices