Observability Engineer
Metrics, traces, logs at scale. Prometheus, OTel, the cost spreadsheet.
Underrated specialism, almost always understaffed, with a career arc that compounds quietly. Pick it if you actually care about what production is telling you.
- You enjoy structured data, sampling and cardinality problems
- You like working closely with engineering and SRE on incidents
- You're happy quantifying things others guess at
- You can keep cost under control while improving signal
Avoid it if
- You see observability as 'just dashboards'
- You don't enjoy long-running platform work
- You'd struggle being the team that explains its value every budget cycle
What good looks like
- Incidents get resolved faster because of telemetry you shipped
- Other teams ship with instrumentation by default, not as an afterthought
- You can defend your sampling and retention choices on paper
- Your vendor spend goes down at least once a year without losing fidelity
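Defending sampling and retention choices "on paper" usually comes down to simple arithmetic. The sketch below is illustrative only: the function names, the traffic figures and the label counts are all assumptions, not numbers from any real system or vendor.

```python
import math


def max_series(label_cardinalities: dict[str, int]) -> int:
    """Upper bound on time series for one metric: the product of the
    distinct-value counts of its labels (assumes labels vary independently)."""
    return math.prod(label_cardinalities.values())


def retained_trace_gb(spans_per_second: float,
                      avg_span_bytes: float,
                      sample_rate: float,
                      retention_days: int) -> float:
    """Rough steady-state trace storage in GB: sampled daily ingest
    multiplied by the retention window. Ignores compression and indexes."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    seconds_per_day = 86_400
    bytes_per_day = spans_per_second * avg_span_bytes * sample_rate * seconds_per_day
    return bytes_per_day * retention_days / 1e9


# Hypothetical inputs, chosen only to show the shape of the calculation.
series = max_series({"service": 40, "endpoint": 25, "status": 5})
storage = retained_trace_gb(spans_per_second=50_000, avg_span_bytes=500,
                            sample_rate=0.10, retention_days=14)
print(f"worst-case series: {series}, retained traces: {storage:.0f} GB")
```

With these made-up inputs, one metric can fan out to 5,000 series, and a 10% sample kept for 14 days holds roughly 3,000 GB. Halving either the sample rate or the retention window halves the storage figure, which is the kind of tradeoff the role is expected to put in writing.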
The role lives or dies on leadership maturity. Without an engineering culture that takes signal seriously, you'll spend two years building dashboards nobody opens. With one, you'll be one of the most influential infrastructure engineers in the org. Interview hard for whether observability is owned, funded, and used in decision-making. If it's a side-of-desk job for the platform team, the seat will frustrate you.
Tradeoffs at a glance
Ratings are directional, not absolute.
Promotion ceiling
High. Observability platform leads at organizations operating at scale are well paid.
- +SRE
- +Platform Engineer
- +Backend dev
- −That it's vendor-tool config; schema design and cost are the real work.
Where this leads
- SRE
- Platform Engineer
- Data Engineer
Tech you'll see
- Prometheus
- OpenTelemetry
The serious next step
You've read about the role. The harder question is whether it's the right one for you.
A Career Verdict is the written, practitioner-authored call on your specific route into and out of this role. Six primitives, same format every time.
Built on POST's practitioner-authored assessment framework, calibrated by James from twenty years across helpdesk, infrastructure and security. Framework is human-authored; the verdict applies it to your inputs.