Tech Lead - Data Engineering

Posted on Sept. 25, 2025 by Citco

  • Full Time

Company Overview

Citco is a global leader in financial services, delivering innovative solutions to some of the world’s largest institutional clients. We harness the power of data to drive operational efficiency and informed decision-making. We are looking for a Tech Lead – Data Engineering with extensive Databricks expertise and AWS experience to lead mission-critical data initiatives.

Role Summary

As the Tech Lead – Data Engineering, you will be responsible for architecting, implementing, and optimizing end-to-end data solutions on Databricks (Spark, Delta Lake, MLflow, etc.) while integrating with core AWS services (S3, Glue, Lambda, etc.). You will lead a technical team of data engineers, ensuring best practices in performance, security, and scalability. This role requires a deep, hands-on understanding of Databricks internals and a track record of delivering large-scale data platforms in a cloud environment.


Key Responsibilities

  • Databricks Platform & Architecture
    • Architect and maintain Databricks Lakehouse solutions using Delta Lake for ACID transactions and efficient data versioning.
    • Leverage Databricks SQL Analytics for interactive querying and report generation.
    • Manage cluster lifecycle (provisioning, sizing, scaling) and optimize Spark jobs for cost and performance.
    • Implement structured streaming pipelines for near real-time data ingestion and processing.
    • Configure and administer Databricks Repos, notebooks, and job scheduling/orchestration to streamline development workflows.
  • AWS Cloud Integration
    • Integrate Databricks with AWS S3 as the primary data lake storage layer.
    • Design and implement ETL/ELT pipelines using AWS Glue catalog, AWS Lambda, and AWS Step Functions where needed.
    • Ensure proper networking configuration (VPC, security groups, AWS PrivateLink) for secure and compliant data access.
    • Automate infrastructure deployment and scaling using AWS CloudFormation or Terraform.
  • Data Pipeline & Workflow Management
    • Develop and maintain scalable, reusable ETL frameworks using Spark (Python/Scala).
    • Orchestrate complex workflows, applying CI/CD principles (Git-based version control, automated testing).
    • Implement Delta Live Tables or similar frameworks to handle real-time data ingestion and transformations.
    • Integrate with MLflow (if applicable) for experiment tracking and model versioning, ensuring data lineage and reproducibility.
  • Performance Tuning & Optimization
    • Conduct advanced Spark job tuning (caching strategies, shuffle partitions, broadcast joins, memory optimization).
    • Fine-tune Databricks clusters (autoscaling policies, instance types) to manage cost without compromising performance.
    • Optimize I/O performance and concurrency for large-scale data sets.
  • Security & Governance
    • Implement Unity Catalog or equivalent Databricks features for centralized governance, access control, and data lineage.
    • Ensure compliance with industry standards (e.g., GDPR, SOC, ISO) and internal security policies.
    • Apply IAM best practices across Databricks and AWS to enforce least-privilege access.
  • Technical Leadership & Mentorship
    • Lead and mentor a team of data engineers, conducting code reviews, design reviews, and knowledge-sharing sessions.
    • Champion Agile or Scrum development practices, coordinating sprints and deliverables.
    • Serve as a primary technical liaison, working closely with product managers, data scientists, DevOps, and external stakeholders.
  • Monitoring & Reliability
    • Configure observability solutions (e.g., Datadog, CloudWatch, Prometheus) to proactively identify performance bottlenecks.
    • Set up alerting mechanisms for latency, cost overruns, and cluster health.
    • Maintain SLAs and KPIs for data pipelines, ensuring robust data quality and reliability.
  • Innovation & Continuous Improvement
    • Stay current with the Databricks roadmap and emerging data engineering trends (e.g., Photon, Lakehouse features).
    • Evaluate new tools and technologies, driving POCs to improve data platform capabilities.
    • Collaborate with business units to identify data-driven opportunities and craft solutions that align with strategic goals.

Qualifications

  • Educational Background
    • Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or equivalent experience.
  • Technical Experience
    • Databricks Expertise: 5+ years of hands-on Databricks (Spark) experience, with a focus on building and maintaining production-grade pipelines.
    • AWS Services: Proven track record with AWS S3, EC2, Glue, EMR, Lambda, Step Functions, and security best practices (IAM, VPC).
    • Programming Languages: Strong proficiency in Python (PySpark) or Scala; SQL for analytics and data modeling.
    • Data Warehousing & Modeling: Familiarity with RDBMS (e.g., Postgres, Redshift) and dimensional modeling techniques.
    • Infrastructure as Code: Hands-on experience using Terraform or AWS CloudFormation to manage cloud infrastructure.
    • Version Control & CI/CD: Git-based workflows (GitHub/GitLab), Jenkins or similar CI/CD tools for automated builds and deployments.
  • Leadership & Soft Skills
    • Demonstrated experience leading a team of data engineers in a complex, high-traffic data environment.
    • Outstanding communication and stakeholder management skills, with the ability to translate technical jargon into business insights.
    • Adept at problem-solving, with a track record of quickly diagnosing and resolving data performance issues.
  • Certifications (Preferred)
    • Databricks Certified Associate/Professional (e.g., Databricks Certified Professional Data Engineer).
    • AWS Solutions Architect (Associate or Professional).

Advertised until: Oct. 25, 2025

