Data Engineer

SolarAfrica Energy

The Data Engineer will architect, build, and continuously enhance enterprise-grade data infrastructure and pipelines, ensuring the delivery of high-quality, trusted, and scalable data that underpins effective decision-making across SolarAfrica’s expanding portfolio of IPP plants, off-takers, and commercial partners.

Position of the job in the organisation

The Data Engineer will report directly to the Head of IT.

Principal responsibilities

  • Architect and implement scalable ELT/ETL pipelines on AWS and/or Azure, leveraging cloud-native services (e.g., AWS Glue/S3/Redshift, Azure Data Factory/ADLS/Synapse) and Databricks where applicable
  • Integrate and transform data from Financial, ERP (Dynamics 365), CRM, Sales, and operational systems into a unified analytical data platform
  • Implement and enforce Medallion Architecture (Bronze → Silver → Gold) patterns, ensuring clean separation of raw ingestion, cleansed/conformed, and business-ready data layers
  • Develop rigorous pipeline testing frameworks including unit tests, data reconciliation, schema validation, row count checks, and end-to-end UAT coordination with business stakeholders
  • Implement data quality monitoring, alerting, and governance controls — maintaining full lineage and documentation across all data assets
  • Build and maintain Power BI semantic models and enterprise reports for operational, investor, and executive reporting
  • Optimise cloud resource usage to keep solutions cost-effective through compute right-sizing, scheduling optimisation, storage tiering, and active monitoring of cloud spend
  • Engage independently with internal and external stakeholders to translate business requirements into technical data solutions, managing delivery from scoping through to production deployment
  • Maintain data documentation, including pipeline runbooks, schema registries, and architectural decision records (ADRs)
  • Perform any other duties associated with this role

Experience & Skills

  • Bachelor’s degree in Computer Science, Information Systems, Engineering, Mathematics, or a related field
  • 5–8 years of hands-on Data Engineering experience with demonstrable ownership of senior-level projects
  • Strong SQL skills, including complex query optimisation, window functions, and performance tuning on large datasets
  • Proven experience building production-grade ELT/ETL pipelines using Python (PySpark preferred)
  • Hands-on experience with AWS (Glue, S3, Redshift, Lambda, Step Functions) and/or Azure (ADF, ADLS Gen2, Synapse Analytics, Azure Databricks)
  • Experience implementing Medallion Architecture (Bronze/Silver/Gold) in a cloud lakehouse environment
  • Proficiency in Power BI, including Power Query, DAX modelling, semantic layer design, and publishing enterprise-grade reports
  • Demonstrated experience integrating ERP (Dynamics 365 preferred), CRM, and Financial/Sales data platforms into analytical environments
  • Strong understanding of data governance, data quality frameworks, and access control best practices
  • Experience with data warehousing solutions (Redshift, Synapse, Snowflake, or equivalent)
  • Ability to produce architectural documentation: ERDs, data flow diagrams, source-to-target mappings
  • Previous experience working with ERP systems (preferably Dynamics 365)

Strong Differentiators

  • Hands-on experience with Databricks (Delta Lake, Unity Catalog, PySpark notebooks, Databricks Workflows)
  • Experience in renewable energy, IPP, asset management, or infrastructure environments

Specific attributes

  • Thinks architecturally, approaching new integrations with a design-first mentality before writing code
  • Applies test-driven practices to data pipelines and is not satisfied until data accuracy is independently verified
  • Proactively manages cloud costs and flags inefficiencies without being prompted
  • Can operate fully independently in an ambiguous environment, self-managing priorities across concurrent projects
  • Communicates technical concepts clearly to non-technical stakeholders (finance, operations, executives)

Desirable attributes

  • Familiarity with Prefect 3 for orchestrating data ingestion workflows
  • Exposure to creating monitoring dashboards/alerts in Grafana Cloud

What Does Success Look Like?

By month 3

  • I will understand all core data sources, pipelines, and dependencies
  • I will have documented existing data architecture and known data issues
  • I can independently maintain and troubleshoot existing data pipelines
  • I will be able to engage confidently with stakeholders on data requirements

By month 6

  • I will have improved data reliability, accuracy, and pipeline stability
  • I can design and deploy new data pipelines with minimal oversight
  • I will have implemented basic data quality checks and monitoring
  • I will be able to support analysts and business users with trusted datasets

By month 12

  • I will have contributed to scalable, future-ready data architecture
  • I can proactively identify and resolve data risks before they impact the business
  • I will have reduced data-related incidents and rework
  • I will be able to enable faster, better decision-making through reliable data

Core Values

We hire, reward, and recognise our team against these values. It is imperative that you believe in these values and demonstrate them consistently.

  • We are passionate about and proud of what we do
  • We communicate candidly, especially when it is difficult
  • We take the initiative, share our mistakes, and grow together
  • We are dependable and take accountability
  • No one person is bigger than the solution: no egos

We are currently only considering candidates based in South Africa with valid working rights. Unfortunately, we are unable to consider applications from outside the country at this time.

How to apply

To apply for this job, you need to log in on our website. If you don't have an account yet, please register.