Junior Data Engineer: Group Insights & Analytics

PPS Recruitment

Job Advert Summary

To support the development, maintenance, and optimisation of data pipelines and data management processes within the Group Analytics Data Ecosystem on Microsoft Fabric. Working under the guidance of senior team members, the role focuses on designing data pipelines and ensuring strong data quality while supporting analytics and reporting initiatives.

Minimum Requirements

Education

A relevant tertiary qualification (BSc/BCom/BTech in IT, Computer Science, Information Systems, Data Engineering, or a related field).

Experience

Exposure to data engineering, data analytics, or database work (internships, academic projects, or entry-level roles).
Basic working knowledge of SQL and experience with relational databases.
Experience with visualisation tools (e.g., Power BI, Tableau) is advantageous.
Experience with, or academic exposure to, data pipelines, ETL concepts, or cloud environments is beneficial.
Exposure to cloud environments (Azure preferred) or cloud-based data platforms is a strong advantage.

Knowledge and Skills

Foundational understanding of data modelling, relational and dimensional design, and data governance principles.
Working knowledge of Python for data manipulation and scripting (pandas, PySpark) is an added advantage.
Familiarity with Microsoft Fabric, Azure Synapse, Databricks, or equivalent modern data platforms is advantageous.
Awareness of version control practices (Git) and collaborative development workflows.
Interest in financial services data and analytics.

Duties and Responsibilities

Data Pipeline Development and Support

Assist in designing, building, testing, and maintaining data ingestion and transformation pipelines on Microsoft Fabric and Azure Data Factory.
Support senior engineers with end-to-end data extraction, transformation, and loading (ETL) activities across structured and semi-structured data sources.
Conduct routine data quality checks, and investigate, document, and escalate anomalies or inconsistencies.
Write clean, well-documented Python and SQL code that follows team standards and passes peer review.

Data Management & Governance

Maintain and improve data quality, accessibility, and lineage across the Data Ecosystem.
Document data sources, metadata, transformation logic, and pipeline dependencies in our central knowledge base.
Support root-cause analysis when data issues arise and propose preventive controls.
Adhere to data governance policies and help embed best practices across the team.

Collaboration & Stakeholder Engagement

Work with the data engineering, data science, business analysis, and IT teams to ensure pipelines meet business requirements.
Help business users discover, understand, and trust the data available to them.
Prepare clean, analysis-ready datasets and data models that accelerate the work of the analytics and data science teams.

Operational Support

Monitor scheduled data processes and pipeline health, and respond promptly to failures or alerts.
Assist in the development of Power BI reports and basic analytics outputs aligned to business needs.
Actively learn and apply best practices in cloud data engineering, Lakehouse architecture, and governance.