Data Engineer

SolarAfrica Energy

The Data Engineer will architect, build, and continuously enhance enterprise-grade data infrastructure and pipelines, ensuring the delivery of high-quality, trusted, and scalable data that underpins effective decision-making across SolarAfrica’s expanding portfolio of IPP plants, off-takers, and commercial partners.

Position of the job in the organisation

The Data Engineer will report directly to the Head of IT.

Principal responsibilities

Architect and implement scalable ELT/ETL pipelines on AWS and/or Azure, leveraging cloud-native services (e.g., AWS Glue/S3/Redshift, Azure Data Factory/ADLS/Synapse) and Databricks where applicable
Integrate and transform data from Financial, ERP (Dynamics 365), CRM, Sales, and operational systems into a unified analytical data platform
Implement and enforce Medallion Architecture (Bronze → Silver → Gold) patterns, ensuring clean separation of raw ingestion, cleansed/conformed, and business-ready data layers
Develop rigorous pipeline testing frameworks including unit tests, data reconciliation, schema validation, row count checks, and end-to-end UAT coordination with business stakeholders
Implement data quality monitoring, alerting, and governance controls — maintaining full lineage and documentation across all data assets
Build and maintain Power BI semantic models and enterprise reports for operational, investor, and executive reporting
Optimise cloud resource usage to ensure solutions are cost-effective — including compute right-sizing, scheduling optimisation, storage tiering, and active cloud spend monitoring
Engage independently with internal and external stakeholders to translate business requirements into technical data solutions, managing delivery from scoping through to production deployment
Maintain data documentation, including pipeline runbooks, schema registries, and architectural decision records (ADRs)
Any other duties associated with this role

Experience & Skills

Bachelor’s degree in Computer Science, Information Systems, Engineering, Mathematics, or a related field
5–8 years of hands-on Data Engineering experience with demonstrable ownership of senior-level projects
Strong SQL skills, including complex query optimisation, window functions, and performance tuning on large datasets
Proven experience building production-grade ELT/ETL pipelines using Python (PySpark preferred)
Hands-on experience with AWS (Glue, S3, Redshift, Lambda, Step Functions) and/or Azure (ADF, ADLS Gen2, Synapse Analytics, Azure Databricks)
Experience implementing Medallion Architecture (Bronze/Silver/Gold) in a cloud lakehouse environment
Proficiency in Power BI, including Power Query, DAX modelling, semantic layer design, and publishing enterprise-grade reports
Demonstrated experience integrating ERP (Dynamics 365 preferred), CRM, and Financial/Sales data platforms into analytical environments
Strong understanding of data governance, data quality frameworks, and access control best practices
Experience with data warehousing solutions (Redshift, Synapse, Snowflake, or equivalent)
Ability to produce architectural documentation: ERDs, data flow diagrams, source-to-target mappings
Previous experience with ERP (preferably Dynamics 365)

Strong Differentiators

Hands-on experience with Databricks (Delta Lake, Unity Catalog, PySpark notebooks, Databricks Workflows)
Experience in renewable energy, IPP, asset management, or infrastructure environments

Specific attributes

An architectural thinker who approaches new integrations with a design-first mentality before writing code
Applies test-driven practices to data pipelines and is not satisfied until data accuracy has been independently verified
Proactively manages cloud costs and flags inefficiencies without being prompted
Can operate fully independently in an ambiguous environment, self-managing priorities across concurrent projects
Communicates technical concepts clearly to non-technical stakeholders (finance, operations, executives)

Desirable attributes

Familiarity with Prefect 3 for orchestrating data ingestion workflows
Exposure to creating monitoring dashboards/alerts in Grafana Cloud

What Does Success Look Like?

By month 3

I will understand all core data sources, pipelines, and dependencies
I will have documented existing data architecture and known data issues
I will be able to independently maintain and troubleshoot existing data pipelines
I will be able to engage confidently with stakeholders on data requirements

By month 6

I will have improved data reliability, accuracy, and pipeline stability
I will be able to design and deploy new data pipelines with minimal oversight
I will have implemented basic data quality checks and monitoring
I will be able to support analysts and business users with trusted datasets

By month 12

I will have contributed to scalable, future-ready data architecture
I will be able to proactively identify and resolve data risks before they impact the business
I will have reduced data-related incidents and rework
I will be able to enable faster, better decision-making through reliable data

Core Values

We hire, reward, and recognise our team against these values. It is imperative that you believe in these values and demonstrate them consistently.

We are passionate and proud of what we do.
We communicate candidly, especially when it is difficult.
We take the initiative, share our mistakes, and grow together.
We are dependable and take accountability.
No one person is bigger than the solution: no egos.

We are currently only considering candidates based in South Africa with valid working rights. Unfortunately, we are unable to consider applications from outside the country at this time.