Senior DevOps & Platform Infrastructure Engineer
I build the infrastructure cities run on.
100,000+ IoT sensors · bare-metal Kubernetes · GitOps from commit to cluster · public-sector grade
I'm Yannis Belhadj Kessas, infrastructure lead of one of France's largest municipal Smart City IoT deployments, at Montpellier Méditerranée Métropole — a metropolitan authority serving 500,000+ citizens. Air quality, smart street lighting, water and energy metering, mobility: not a pilot, not a demo. A live city, running on a platform I designed, built, and operate.
Montpellier, France · CKA — Certified Kubernetes Administrator, in progress (CNCF, 2026)
GitOps
Provisioned entirely from code
The platform is fully on-premises and GitOps-driven: three bare-metal Kubernetes clusters with every layer declared in Git, from cluster provisioning and application rollout to drift detection and environment promotion.
A git push triggers a self-hosted runner that calls the hypervisor API and stands up a complete cluster (RKE2, Cilium, MetalLB, ArgoCD) with zero manual steps. Destroy it, push again, get an identical one back.
Data sovereignty, auditability, open source end to end — infrastructure built to serve citizens and built to last.
$ git push origin main
→ self-hosted runner picks up the pipeline
→ hypervisor API provisions bare-metal VMs
→ RKE2 bootstraps the cluster
→ Cilium · MetalLB · ArgoCD roll out
✓ complete cluster, 0 manual steps
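The flow above reads as an ordered, deterministic pipeline. Here is a minimal Python sketch of that idea, not the production pipeline: `ClusterSpec`, `ClusterState`, and the three stage functions are hypothetical stand-ins that only record what each step would do, with no real hypervisor, RKE2, or ArgoCD calls. It illustrates why rebuilding from the same declared spec yields the same cluster.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ClusterSpec:
    """Desired cluster as it might be declared in Git (hypothetical schema)."""
    name: str
    nodes: int
    addons: Tuple[str, ...] = ("cilium", "metallb", "argocd")


@dataclass
class ClusterState:
    """What the pipeline has built so far (illustrative only)."""
    vms: List[str] = field(default_factory=list)
    rke2_ready: bool = False
    addons: List[str] = field(default_factory=list)


def provision_vms(spec: ClusterSpec, state: ClusterState) -> None:
    # Stand-in for the hypervisor API call: VM names derive only from the spec.
    state.vms = [f"{spec.name}-node-{i}" for i in range(spec.nodes)]


def bootstrap_rke2(spec: ClusterSpec, state: ClusterState) -> None:
    # The cluster can only be bootstrapped once its nodes exist.
    if not state.vms:
        raise RuntimeError("no VMs provisioned")
    state.rke2_ready = True


def roll_out_addons(spec: ClusterSpec, state: ClusterState) -> None:
    # Add-ons are installed only on a ready cluster, in the declared order.
    if not state.rke2_ready:
        raise RuntimeError("cluster not ready")
    state.addons = list(spec.addons)


Stage = Callable[[ClusterSpec, ClusterState], None]
STAGES: List[Stage] = [provision_vms, bootstrap_rke2, roll_out_addons]


def run_pipeline(spec: ClusterSpec) -> ClusterState:
    """Run every stage in order; any exception stops the pipeline."""
    state = ClusterState()
    for stage in STAGES:
        stage(spec, state)
    return state
```

Because every stage depends only on the spec and on the output of earlier stages, two runs with the same `ClusterSpec` compare equal, which is the property behind "destroy it, push again".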
Edge to cloud
One platform, sensor to dashboard
What I master
Depth where it counts
Kubernetes & platform engineering
RKE2/Rancher multi-cluster on bare metal — production on VMware vSphere, pre-production on Proxmox. MetalLB, Rook-Ceph, Cilium (eBPF), Envoy Gateway, Helm.
CKA in progress — CNCF, expected 2026
GitOps & infrastructure as code
ArgoCD, GitLab CI/CD, OpenTofu, Terraform, Ansible. Full lifecycle as code: cluster provisioning, app rollout, drift detection, environment promotion. Built a self-service platform where external partners deploy via Git without ever touching cluster internals.
Large-scale IoT & networking
End-to-end LoRaWAN at city scale: RF planning, gateway deployment, VLAN segmentation, a multi-tenant LoRaWAN network server on Kubernetes. Automated device provisioning and OTA updates for a 100k+ fleet. Edge-to-cloud pipelines that survive partial outages.
Observability & SRE
Designed a dual-layer telemetry stack from scratch: Zabbix outside the clusters, Prometheus/Grafana/Loki/Alloy inside. Proactive degradation thresholds, not post-failure alarms.
MTTD cut from days to minutes
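To make "degradation thresholds, not post-failure alarms" concrete, here is a small Python sketch. It is an illustration under assumptions, not the Zabbix or Prometheus configuration in use: the `DegradationMonitor` class, the packet-loss metric, and the threshold values are all hypothetical. The point is that a warning level sits well below the failure level and is evaluated on a rolling average, so a worsening trend is flagged before anything actually fails.

```python
from collections import deque
from statistics import mean


class DegradationMonitor:
    """Rolling-average check with a warning level below the failure level."""

    def __init__(self, warn: float, fail: float, window: int = 5) -> None:
        if not warn < fail:
            raise ValueError("warn threshold must be below fail threshold")
        self.warn = warn
        self.fail = fail
        # Only the most recent `window` samples are kept.
        self.samples = deque(maxlen=window)

    def observe(self, value: float) -> str:
        """Record one sample and classify the current rolling average."""
        self.samples.append(value)
        average = mean(self.samples)
        if average >= self.fail:
            return "failed"
        if average >= self.warn:
            return "degraded"
        return "ok"
```

For example, with `DegradationMonitor(warn=0.05, fail=0.5, window=3)` fed packet-loss ratios of 0.01, 0.02, and 0.15, the third sample brings the rolling average to about 0.06 and the state moves to "degraded", long before the 0.5 failure level.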
Programming & emerging tech
Python (Django), Rust, Bash. Computer vision in production: edge inference and GPU-aware Kubernetes scheduling. Prototyping on-prem Kubeflow for AI-assisted log analysis.
Proof, not promises
Independently verifiable
- T+0:00 A LoRa gateway goes silent — physical power failure.
- T+1 min Monitoring fires. Not when sensor data goes missing — when the gateway stops responding.
- Same hour A technician is on site and finds the fault — resolved before the data gap mattered.
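The timeline above hinges on watching the gateway itself rather than waiting for a gap in sensor data. A minimal Python sketch of that kind of liveness check follows. It assumes each gateway's last response time is already being recorded somewhere; the function name `silent_gateways`, the gateway IDs, and the 60-second timeout are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def silent_gateways(
    last_seen: Dict[str, datetime],
    now: datetime,
    timeout: timedelta = timedelta(seconds=60),
) -> List[str]:
    """Return the IDs of gateways whose last response is older than `timeout`.

    The check looks only at gateway responses, so it fires even when the
    sensors behind a gateway report rarely and no data gap is visible yet.
    """
    return sorted(
        gateway_id
        for gateway_id, seen in last_seen.items()
        if now - seen > timeout
    )
```

With a one-minute timeout, a gateway that last answered 90 seconds ago is reported while one that answered 10 seconds ago is not, regardless of how often the devices behind them transmit.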
“The real measure of an observability platform isn't how many incidents it helps resolve — it's how many it prevents.”
Beyond the day job
Radio is the passion
Contact
Let's talk.
If you're building infrastructure people depend on — or you just want to talk radio and Kubernetes — I'd like to hear from you.
kessas.belhadj@gmail.com