Role Summary
As a Senior Data Engineer, you’ll design and build highly scalable data pipelines, architect foundational data systems, and support machine learning and GenAI capabilities. You’ll also contribute to the backend service layer, where working with Java is mandatory, to ensure seamless data integration between internal systems and our broader platform. This is a highly cross-functional role that blends data engineering, backend software design, cloud architecture, and AI/ML enablement.
What You’ll Do
Platform & Infrastructure Engineering
- Build and maintain robust data pipelines (batch and streaming) using Airflow, AWS Glue, Step Functions, Lambda, and more
- Develop data-centric APIs in Java with clean, modular architecture and secure data access patterns (microservices skills are also expected)
- Deploy and monitor services in AWS using infrastructure-as-code and container tooling such as Terraform and Docker
Data Modeling, Observability & Lineage
- Design and implement reliable data models to support analytics, data products, and AI workloads
- Establish data lineage, quality monitoring, and testing frameworks using tools like Great Expectations, Marquez, or Monte Carlo
- Maintain metadata management and documentation for compliance and discoverability
Data Science & GenAI Enablement
- Collaborate with data scientists to provision training datasets, feature stores, and model pipelines
- Build orchestration and evaluation workflows to support LLM and GenAI development (e.g., RAG pipelines, embedding search, document intelligence)
- Integrate unstructured data (PDFs, documents, messages) into structured datasets for analytics and AI
Security & Compliance
- Implement best practices aligned with SOC 2, GDPR, and internal infosec standards
- Ensure secure access controls, audit logging, and encrypted storage for sensitive data
- Work with cybersecurity and infrastructure teams to ensure end-to-end data governance
Cross-functional Collaboration
- Partner with engineering, product, analytics, and operations teams to support cross-cutting data initiatives
- Collaborate closely with backend and DevOps engineers to align services, APIs, and deployment patterns
What You Bring
Required
- 7+ years of experience in data engineering or backend software development
- Proficiency in Java, with experience developing scalable APIs
- Strong expertise in SQL, data modeling, and building reliable ETL/ELT pipelines
- Deep familiarity with AWS services (Step Functions, Lambda, Glue, S3, Redshift)
- Hands-on experience with Airflow, dbt, or similar orchestration and transformation tools
- Knowledge of data lineage, quality frameworks, and monitoring systems
- Prior experience working alongside data scientists or ML engineers
Send your CV by email to applications@zindi.africa and by WhatsApp to (082) 519 5189