Data Engineer
SolarAfrica Energy
The Data Engineer will architect, build, and continuously enhance enterprise-grade data infrastructure and pipelines, ensuring the delivery of high-quality, trusted, and scalable data that underpins effective decision-making across SolarAfrica’s expanding portfolio of IPP plants, off-takers, and commercial partners.
Position of the job in the organisation
The Data Engineer will report directly to the Head of IT.
Principal responsibilities
- Architect and implement scalable ELT/ETL pipelines on AWS and/or Azure, leveraging cloud-native services (e.g., AWS Glue/S3/Redshift, Azure Data Factory/ADLS/Synapse) and Databricks where applicable
- Integrate and transform data from Financial, ERP (Dynamics 365), CRM, Sales, and operational systems into a unified analytical data platform
- Implement and enforce Medallion Architecture (Bronze → Silver → Gold) patterns, ensuring clean separation of raw ingestion, cleansed/conformed, and business-ready data layers
- Develop rigorous pipeline testing frameworks including unit tests, data reconciliation, schema validation, row count checks, and end-to-end UAT coordination with business stakeholders
- Implement data quality monitoring, alerting, and governance controls, and maintain full lineage and documentation across all data assets
- Build and maintain Power BI semantic models and enterprise reports for operational, investor, and executive reporting
- Optimise cloud resource usage to keep solutions cost-effective, including compute right-sizing, scheduling optimisation, storage tiering, and active cloud spend monitoring
- Engage independently with internal and external stakeholders to translate business requirements into technical data solutions, managing delivery from scoping through to production deployment
- Maintain data documentation, including pipeline runbooks, schema registries, and architectural decision records (ADRs)
- Perform any other duties associated with this role
Experience & Skills
- Bachelor’s degree in Computer Science, Information Systems, Engineering, Mathematics, or a related field
- 5–8 years of hands-on Data Engineering experience with demonstrable ownership of senior-level projects
- Strong SQL skills, including complex query optimisation, window functions, and performance tuning on large datasets
- Proven experience building production-grade ELT/ETL pipelines using Python (PySpark preferred)
- Hands-on experience with AWS (Glue, S3, Redshift, Lambda, Step Functions) and/or Azure (ADF, ADLS Gen2, Synapse Analytics, Azure Databricks)
- Experience implementing Medallion Architecture (Bronze/Silver/Gold) in a cloud lakehouse environment
- Proficiency in Power BI, including Power Query, DAX modelling, semantic layer design, and publishing enterprise-grade reports
- Demonstrated experience integrating ERP (Dynamics 365 preferred), CRM, and Financial/Sales data platforms into analytical environments
- Strong understanding of data governance, data quality frameworks, and access control best practices
- Experience with data warehousing solutions (Redshift, Synapse, Snowflake, or equivalent)
- Ability to produce architectural documentation: ERDs, data flow diagrams, source-to-target mappings
Strong Differentiators
- Hands-on experience with Databricks (Delta Lake, Unity Catalog, PySpark notebooks, Databricks Workflows)
- Experience in renewable energy, IPP, asset management, or infrastructure environments
Specific attributes
- An architectural thinker who approaches new integrations with a design-first mentality before writing code
- Applies test-driven practices to data pipelines and is not satisfied until data accuracy is independently verified
- Proactively manages cloud costs and flags inefficiencies without being prompted
- Can operate fully independently in an ambiguous environment, self-managing priorities across concurrent projects
- Communicates technical concepts clearly to non-technical stakeholders (finance, operations, executives)
Desirable attributes
- Familiarity with Prefect 3 for orchestrating data ingestion workflows
- Exposure to creating monitoring dashboards/alerts in Grafana Cloud
What Does Success Look Like?
By month 3
- I will understand all core data sources, pipelines, and dependencies
- I will have documented the existing data architecture and known data issues
- I will be able to independently maintain and troubleshoot existing data pipelines
- I will be able to engage confidently with stakeholders on data requirements
By month 6
- I will have improved data reliability, accuracy, and pipeline stability
- I will be able to design and deploy new data pipelines with minimal oversight
- I will have implemented basic data quality checks and monitoring
- I will be able to support analysts and business users with trusted datasets
By month 12
- I will have contributed to a scalable, future-ready data architecture
- I will be able to proactively identify and resolve data risks before they impact the business
- I will have reduced data-related incidents and rework
- I will be enabling faster, better decision-making through reliable data
Core Values
We hire, reward, and recognise our team against these values. It is imperative that you believe in these values and demonstrate them consistently.
- We are passionate about and proud of what we do.
- We communicate candidly, especially when it is difficult.
- We take the initiative, share our mistakes, and grow together.
- We are dependable and take accountability.
- No one person is bigger than the solution - no egos.
We are currently only considering candidates based in South Africa with valid working rights. Unfortunately, we are unable to consider applications from outside the country at this time.