Data Scientist

Imizizi

Reference: JHB001446-NS-1

ESSENTIAL SKILLS

  • Proven experience designing and building agentic system architectures using Amazon Bedrock AgentCore and agent frameworks (e.g., LangChain, LangGraph, Strands Agents).
  • Strong expertise in orchestrating multi-step reasoning, tool invocation, state management, and workflow automation for AI agents.
  • Deep hands-on knowledge of training and deploying models with PyTorch and TensorFlow.
  • Experience defining model strategy, including architecture selection, fine-tuning approaches, inference patterns, and cost/performance trade-offs.
  • Containerization and orchestration skills: Docker and Kubernetes for scalable, fault-tolerant ML/GenAI deployments.
  • Solid understanding of networking for ML workloads, including VPC design, ingress/egress, private and internet-facing communication patterns, and low-latency design.
  • Systems-level performance engineering: selecting CPU/GPU/accelerator hardware, plus experience with load testing, stress testing, and capacity planning for ML systems.
  • MLOps and GenAI Ops experience: CI/CD for models, model versioning, observability, logging, monitoring, and incident response practices.
  • Strong software engineering skills in Python and familiarity with building robust, production-ready APIs and back-end services.
  • Experience integrating foundation models with Retrieval-Augmented Generation (RAG) pipelines, tool use, and agentic workflows for enterprise use cases.

ADVANTAGEOUS SKILLS

  • Prior experience working with Amazon Bedrock and other cloud-managed foundation model services.
  • Familiarity with LangChain extensions, LangGraph, Strands Agents or similar orchestration toolkits at scale.
  • Experience with infrastructure-as-code tools (e.g., Terraform, Terragrunt) for reproducible cloud infrastructure.
  • Knowledge of serverless components (Lambda, Step Functions, EventBridge) for orchestration and event-driven workflows.
  • Background in secure cloud architectures, IAM best practices, and security hardening for AI platforms.
  • Experience with data engineering and building reliable ETL/data pipelines for model training and feature stores.
  • Familiarity with observability stacks (Prometheus, Grafana, CloudWatch) and distributed tracing for ML services.
  • Experience optimizing inference costs through batching, quantization, and model distillation techniques.
  • Prior work with enterprise customers in regulated industries (e.g., automotive, pharma, finance) and an understanding of compliance considerations.
  • Knowledge of hybrid and multi-cloud deployment patterns for AI workloads.

ROLE & RESPONSIBILITIES

  • Define and build agentic system architectures that leverage Amazon Bedrock AgentCore and agent frameworks to enable multi-step reasoning and automated workflows.
  • Lead technical strategy for model selection, fine-tuning, and inference, advising on cost vs. performance trade-offs.
  • Design and implement containerized deployment standards using Docker and Kubernetes to ensure consistent, scalable, and fault-tolerant ML operations.
  • Architect secure, low-latency networking for model-to-service and service-to-service communication across private and public networks.
  • Perform systems-level performance engineering: select appropriate compute accelerators, run load and stress tests, and conduct capacity planning for production readiness.
  • Establish and operate MLOps and GenAI Ops practices, including CI/CD pipelines, model versioning, and deployment automation.
  • Implement observability, logging, monitoring, and incident response for production AI systems to ensure operational excellence.
  • Own end-to-end system design for AI workloads: data pipelines, model training, inference, orchestration, and lifecycle management.
  • Integrate foundation models into enterprise RAG and tool-use pipelines, enabling complex, real-world use cases.
  • Provide technical leadership and mentorship to engineers and stakeholders on architecture, best practices, and operational standards.

QUALIFICATIONS/EXPERIENCE

  • Appropriate academic qualification in a field such as Computer Science, Engineering, or Statistics.
  • Demonstrated track record of delivering large-scale AI solutions for enterprise customers, including end-to-end ownership of architecture, operations, and stakeholder engagement.

Submit your CV to: ***email_hidden*** with the role title as the subject line.