Health and Safety Engineer
Engineers hazards out of systems and, where they remain, controls them by the most reliable means — never by relying on people to be careful — to prevent injury, illness, and catastrophic loss.
Also known as: Safety Engineer, EHS Engineer, Process Safety Engineer, Occupational Safety Engineer
Purpose
People are hurt and killed by systems that were designed without their bodies and mistakes in mind — machines that amputate, processes that release energy or toxins, environments that poison slowly. Health and safety engineering exists to design the hazard out before anyone is exposed to it, and where it can't be designed out, to guard, control, and warn against it in that order. The discipline is engineering applied to the prevention of injury, illness, and catastrophic loss, fusing mechanical, chemical, and human-factors knowledge with the law and the unforgiving statistics of how accidents actually happen. Without it, safety is left to luck, blame, and the assumption that workers will simply be careful — an assumption every serious incident disproves.
Core Mission
Prevent injury, illness, and catastrophic loss by engineering hazards out of systems — and where they remain, control them by the most reliable means, never by relying on people to be careful.
Primary Responsibilities
The work is hazard identification, risk assessment, and control design across products, workplaces, and processes. That means analyzing systems for what can release harmful energy or substances (mechanical, electrical, fall, fire, confined-space, chemical-exposure, and ergonomic hazards); quantifying risk (likelihood × severity) and prioritizing the hazards accordingly; designing and specifying controls in hierarchy order (elimination, substitution, engineering controls, administrative controls, PPE); ensuring regulatory compliance (OSHA, NFPA, ANSI, ISO); investigating incidents to root cause; and managing process safety where the failure is catastrophic (PSM: the prevention of fires, explosions, and toxic releases). A large part of the job is making the safe way the easy, default way, because controls people bypass don't control anything.
Guiding Principles
- The hierarchy of controls is a non-negotiable order. Eliminate, then substitute, then engineer, then administrate, then PPE. PPE is the last resort, not the first answer, because it depends on the person every single time.
- Design for the human who errs and the worst case. People will be tired, rushed, untrained, and wrong. Safe systems assume it.
- Hazard energy wants out. Every accident is uncontrolled energy or substance reaching a person; control the energy, not just the behavior.
- Prevention beats protection beats reaction. Stop the release, then contain it, then respond — in that priority and that order of reliability.
- Leading indicators over lagging ones. Counting injuries tells you you've already failed; near-misses and unsafe conditions warn you before anyone is hurt.
- Safety is engineered in, not inspected in. Bolt-on safety and posters don't change a fundamentally unsafe design.
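The ordering in the first principle above can be sketched as a small selection routine: given the candidate controls for one hazard, pick the feasible one that sits highest in the hierarchy. This is an illustrative sketch only. The names `Control`, `most_reliable_feasible`, and `press_options`, and the feasibility flags, are assumptions made for the example, not part of any standard.

```python
from enum import IntEnum


class Control(IntEnum):
    """Hierarchy of controls; a lower value means a more reliable control."""
    ELIMINATION = 1
    SUBSTITUTION = 2
    ENGINEERING = 3
    ADMINISTRATIVE = 4
    PPE = 5


def most_reliable_feasible(options):
    """Return the feasible option that sits highest in the hierarchy.

    `options` is a list of (Control, description, feasible) tuples.
    """
    feasible = [opt for opt in options if opt[2]]
    if not feasible:
        raise ValueError("no feasible control: the hazard is uncontrolled")
    return min(feasible, key=lambda opt: opt[0])


# Hypothetical candidate controls for a press that jams.
press_options = [
    (Control.PPE, "cut-resistant gloves", True),
    (Control.ADMINISTRATIVE, "warning sign and jam-clearing procedure", True),
    (Control.ENGINEERING, "light curtain with two-hand control", True),
    (Control.ELIMINATION, "redesign the feed so it cannot jam", False),
]

level, description, _ = most_reliable_feasible(press_options)
print(level.name, "-", description)
```

In practice lower-ranked controls are usually kept as additional layers; the routine only identifies which control the design should rest on.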
Mental Models
- The hierarchy of controls. The spine of the field: reliability of a control is inversely proportional to how much it depends on human behavior.
- The energy-barrier (Haddon) model. Injury is harmful energy transferred to a person; prevention means barriers between the energy and the body at each phase (pre-event, event, post-event).
- Swiss cheese / defense in depth. Accidents happen when holes in independent layers of defense line up; safety is keeping the layers independent and the holes from aligning.
- The accident pyramid (Heinrich/Bird). Many near-misses and unsafe acts underlie each minor injury, and many minor injuries underlie each fatality — work the base, not the tip.
- Risk = likelihood × severity. The matrix that prioritizes finite resources toward the hazards that matter, not the ones that feel scary.
- Inherent safety (the Kletz principle). "What you don't have can't leak." The safest plant minimizes the hazardous inventory rather than controlling a large one.
- Latent vs. active failures. Front-line errors (active) are usually triggered by management and design decisions made long before (latent); fix the latent.
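As a rough illustration of the risk = likelihood × severity model above, the sketch below scores hazards on a 5×5 matrix. The 1 to 5 scales, the band thresholds, and the function names are assumptions chosen for the example; real matrices are defined by each organization. The severity override is one hedged way to keep rare but potentially fatal hazards from being ranked low.

```python
def risk_score(likelihood, severity):
    """Score a hazard on a 5x5 matrix; both inputs are integers from 1 to 5."""
    for name, value in (("likelihood", likelihood), ("severity", severity)):
        if value not in range(1, 6):
            raise ValueError(f"{name} must be an integer from 1 to 5")
    return likelihood * severity


def risk_band(likelihood, severity):
    """Map a hazard to a priority band (illustrative thresholds, not from a standard)."""
    score = risk_score(likelihood, severity)
    if severity == 5:
        # Severity override: a potentially fatal hazard is never ranked
        # below "high", however unlikely it is judged to be.
        return "high"
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


print(risk_band(1, 5))  # rare but potentially fatal
print(risk_band(4, 2))  # frequent but minor
print(risk_band(2, 2))  # infrequent and minor
```

Without the override, the rare fatal hazard would score 5 and land in the lowest band, below a frequent minor hazard scoring 8.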
First Principles
- Every injury is the transfer of uncontrolled energy or a harmful substance to a human body — prevention is about controlling that, not exhorting caution.
- Humans are fallible by nature; a system that requires perfect behavior to be safe is unsafe.
- The reliability of a safety measure falls as its dependence on a person rises.
- An accident is almost never one cause; it's an alignment of latent conditions with a trigger.
Questions Experts Constantly Ask
- What's the hazardous energy or substance here, and what keeps it from a person?
- Can I eliminate or substitute this hazard before I try to guard it?
- What happens when — not if — someone does this wrong, tired, or in a hurry?
- Is this control one people will actually use, or one they'll bypass to get the job done?
- What's the worst credible outcome, and how many independent layers stand between us and it?
- Are my indicators leading or just counting bodies after the fact?
- What latent decision upstream made this front-line error likely?
Decision Frameworks
- Apply the hierarchy of controls, in order. For each hazard, exhaust elimination and substitution before engineering controls, and never stop at PPE if a more reliable control is feasible.
- Risk assessment and prioritization. Score hazards on a likelihood-severity matrix; drive resources to high-severity hazards even when rare, because the tail is what kills.
- Process safety (for catastrophic hazards). Use HAZOP, LOPA, and bow-tie analysis to ensure enough independent protection layers for the worst-case release.
- Incident root-cause analysis. Drive past the active error to latent and systemic causes (5-whys, fault tree, MORT); the corrective action must fix the system, not blame the worker.
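The LOPA item above comes down to simple arithmetic: the frequency of the initiating event is multiplied by the probability of failure on demand (PFD) of each independent protection layer, and the result is compared with a tolerable frequency. The sketch below assumes hypothetical numbers throughout; the layer names, PFDs, and target are placeholders, not data for any real facility.

```python
import math


def mitigated_frequency(initiating_per_year, layer_pfds):
    """Consequence frequency after all independent protection layers.

    Multiplying the PFDs is only valid if the layers are truly
    independent of each other and of the initiating event.
    """
    pfds = list(layer_pfds)
    for pfd in pfds:
        if not 0 < pfd <= 1:
            raise ValueError("each PFD must be greater than 0 and at most 1")
    return initiating_per_year * math.prod(pfds)


# Hypothetical scenario: all values are assumptions for illustration.
initiating = 0.1  # initiating events per year
layers = {
    "basic process control": 0.1,
    "alarm with operator response": 0.1,
    "relief valve": 0.01,
}
target = 1e-6  # tolerable consequence frequency per year

freq = mitigated_frequency(initiating, layers.values())
shortfall = freq / target  # extra risk-reduction factor still needed

print(f"mitigated frequency: {freq:.1e} per year")
print(f"target met: {freq <= target}")
print(f"additional risk reduction needed: about {shortfall:.0f}x")
```

Here the three layers leave the scenario roughly ten times more frequent than the target, which is the signal to add another independent layer or reduce the hazard itself.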
Workflow
- Identify hazards. Walk the process, review designs, analyze tasks and substances; involve the people who do the work.
- Assess and prioritize risk. Likelihood × severity; rank against resources.
- Design controls. Up the hierarchy; specify guarding, ventilation, interlocks, lockout/tagout, and only then administrative measures and PPE.
- Verify compliance. Against OSHA, NFPA, ANSI, ISO 45001, and process-safety regulation; document the rationale.
- Implement and train. Make the safe way the easy default; train and check understanding, not just attendance.
- Monitor leading indicators. Near-misses, inspections, exposure monitoring; act before the lagging numbers move.
- Investigate and improve. Every incident and near-miss to root cause; close the loop into design and procedure.
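The monitoring step above can be made concrete with a rate calculation. The lagging figure below uses the common incidence-rate convention of cases × 200,000 ÷ hours worked, where 200,000 hours represents 100 full-time workers for a year. Normalizing near-miss reports on the same base is an assumption made here for comparison, and the counts are invented for the example.

```python
HOURS_BASE = 200_000  # 100 full-time workers x 40 hours/week x 50 weeks


def rate_per_200k_hours(count, hours_worked):
    """Normalize an event count to 200,000 hours worked."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return count * HOURS_BASE / hours_worked


# Hypothetical site data for one year.
hours = 400_000
recordable_injuries = 3   # lagging: harm that already happened
near_miss_reports = 46    # leading: warnings before harm

trir = rate_per_200k_hours(recordable_injuries, hours)
near_miss_rate = rate_per_200k_hours(near_miss_reports, hours)

print(f"recordable incident rate: {trir:.2f}")
print(f"near-miss report rate: {near_miss_rate:.2f}")
```

A falling near-miss report rate is ambiguous: it can mean fewer hazards or less reporting, so it is worth reading alongside inspection findings rather than alone.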
Common Tradeoffs
- Productivity vs. protection. Guards, lockout, and procedures cost time; controls that cost too much time get bypassed, so usability is a safety property.
- Cost of control vs. cost of the loss. Engineering controls cost capital now against a probabilistic future loss; severity, not just frequency, justifies the spend.
- Reliable vs. cheap controls. Engineering controls cost more than PPE and signage but don't depend on behavior; cheaping out moves you down the hierarchy.
- Compliance minimum vs. actual safety. Meeting the regulation is a floor, not a guarantee; some compliant systems are still unsafe.
- Centralized rules vs. front-line flexibility. Rigid procedures are auditable but brittle; workers need enough latitude to be safe in situations the rules didn't foresee.
Rules of Thumb
- If your control depends on someone remembering, it will fail; engineer it instead.
- PPE is the last line, never the plan.
- A guard that slows the job will be removed; design it not to.
- Investigate the near-miss as if it were the fatality it nearly was.
- What you don't store can't leak, burn, or explode — minimize the inventory.
- Blame stops learning; find the latent cause, not the careless worker.
- If you can't measure the exposure, you can't claim it's safe.
Failure Modes
- PPE-first thinking — reaching for gloves, goggles, and signs instead of removing or guarding the hazard.
- Compliance theater — binders, posters, and toolbox talks that satisfy an auditor while the real hazard is untouched.
- Blaming the worker — closing incidents as "human error" and missing the design and management decisions behind it.
- Lagging-indicator complacency — declaring safety because no one's been hurt lately, while near-misses pile up.
- Bypassed controls — guards and interlocks defeated because they obstruct the job, leaving the hazard fully exposed.
- Catastrophic-tail blindness — managing frequent minor injuries while ignoring the rare, fatal process hazard.
Anti-patterns
- Safety by exhortation — "be careful" campaigns substituting for engineering.
- Procedure proliferation — answering every incident with another rule until no one can follow them all.
- Audit-driven safety — optimizing for the inspection rather than the hazard.
- PPE as the first and only control for a hazard that could be engineered out.
- Normalizing deviance — accepting a bypassed guard or skipped lockout because "we've always done it that way and nothing happened."
Vocabulary
- Hierarchy of controls — the ranked order of control reliability: elimination → substitution → engineering → administrative → PPE.
- Hazard vs. risk — the source of harm vs. the combined likelihood and severity of harm from it.
- Lockout/tagout (LOTO) — isolating hazardous energy before servicing.
- HAZOP / LOPA — hazard and operability study / layer of protection analysis for process safety.
- Bow-tie — a diagram of causes, the top event, and consequences with barriers on each side.
- PSM — process safety management, for catastrophic-hazard facilities.
- PEL / TLV — permissible exposure limit (OSHA, legally enforceable) / threshold limit value (ACGIH, a recommended guideline) for airborne contaminants.
- Leading vs. lagging indicators — predictive measures vs. counts of past harm.
- Inherently safer design — reducing hazard by reducing the hazardous inventory or condition itself.
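Exposure limits such as a PEL are typically expressed as an 8-hour time-weighted average (TWA): the sum of each concentration times its duration, divided by 8. The sketch below applies that formula. The sample values and the 100 ppm limit are hypothetical and not tied to any real substance, and the function name is an assumption.

```python
def twa_8h(samples):
    """8-hour time-weighted average exposure.

    `samples` is a list of (concentration, hours) pairs. Time not covered
    by a sample is treated as zero exposure, which is an assumption that
    must be justified in a real assessment.
    """
    for concentration, hours in samples:
        if concentration < 0 or hours < 0:
            raise ValueError("concentration and hours must be non-negative")
    return sum(concentration * hours for concentration, hours in samples) / 8


# Hypothetical shift: (ppm, hours) for three tasks.
shift = [(150, 2), (75, 2), (50, 4)]
limit_ppm = 100  # hypothetical 8-hour limit

exposure = twa_8h(shift)
print(f"8-hour TWA: {exposure} ppm")
print(f"within limit: {exposure <= limit_ppm}")
```

The example shift averages 81.25 ppm even though one task ran at 150 ppm, which is why short-term and ceiling limits are checked separately from the TWA.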
Tools
- Risk-assessment methods and matrices — JSA/JHA, FMEA, risk scoring.
- Process-safety techniques — HAZOP, LOPA, fault and event trees, bow-tie software.
- Exposure-monitoring instruments — gas detectors, dosimeters, sound-level meters, air-sampling pumps.
- Standards and regulations — OSHA, NFPA, ANSI, ISO 45001 as the design reference.
- EHS management systems — incident, audit, and corrective-action tracking.
- Ergonomics and human-factors tools — task analysis, anthropometric and lifting assessments (NIOSH lifting equation).
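The NIOSH lifting equation mentioned above can be sketched in its revised metric form: the recommended weight limit (RWL) is a 23 kg load constant reduced by multipliers for horizontal distance, vertical height, travel distance, and asymmetry, plus frequency and coupling multipliers read from the NIOSH tables. In this sketch the two table values are passed in as inputs rather than reproduced, and the task measurements are hypothetical.

```python
LOAD_CONSTANT_KG = 23.0


def recommended_weight_limit(h_cm, v_cm, d_cm, a_deg, fm, cm):
    """Revised NIOSH lifting equation, metric form.

    h_cm: horizontal hand distance, v_cm: vertical hand height,
    d_cm: vertical travel distance, a_deg: asymmetry angle.
    fm and cm are the frequency and coupling multipliers, assumed to
    have been looked up in the NIOSH tables by the caller.
    """
    h = max(h_cm, 25.0)  # distances under 25 cm are treated as 25 cm
    d = max(d_cm, 25.0)
    hm = 25.0 / h if h <= 63.0 else 0.0
    vm = 1.0 - 0.003 * abs(v_cm - 75.0) if 0.0 <= v_cm <= 175.0 else 0.0
    dm = 0.82 + 4.5 / d if d <= 175.0 else 0.0
    am = 1.0 - 0.0032 * a_deg if 0.0 <= a_deg <= 135.0 else 0.0
    return LOAD_CONSTANT_KG * hm * vm * dm * am * fm * cm


def lifting_index(load_kg, rwl_kg):
    """Ratio of actual load to RWL; values above 1 indicate increased risk."""
    if rwl_kg <= 0:
        raise ValueError("RWL is zero: the lift is outside the equation's limits")
    return load_kg / rwl_kg


# Hypothetical lift: all measurements and multipliers are assumptions.
rwl = recommended_weight_limit(h_cm=40, v_cm=75, d_cm=50, a_deg=0, fm=0.94, cm=1.0)
li = lifting_index(15.0, rwl)
print(f"RWL: {rwl:.1f} kg, lifting index: {li:.2f}")
```

In this example most of the reduction comes from the 40 cm horizontal reach, so bringing the load closer to the body would raise the limit more than any change to the worker's technique.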
Collaboration
Health and safety engineers work across the whole organization: design and process engineers (to build safety in early, where it's cheapest), operations and maintenance crews (who know where the real hazards and workarounds are), industrial hygienists and occupational-health staff, management (who own the resources and the safety culture), and regulators and insurers. The hardest and most important relationship is with the front-line workers, whose buy-in determines whether controls are used or bypassed — which is why the best safety engineers design with them, not for them. Friction lives at the productivity-vs-protection line and in incident investigations, where the temptation to blame an individual collides with the duty to fix the system.
Ethics
The work is, plainly, about whether people go home unharmed — and the engineer often stands between a worker's safety and a schedule or budget that would compromise it. Duties: place worker and public safety above production and cost, and have the authority and spine to stop unsafe work; tell the truth about risk to workers and management, including the hazards that are inconvenient to name; refuse to let "compliant" substitute for "safe," or to scapegoat a worker for a systemic failure; and protect the vulnerable — temporary, untrained, or non-native-speaking workers who bear hazards disproportionately. The gray zones — accepting a residual risk, allocating finite safety budget, balancing privacy against exposure monitoring — demand that the engineer name the trade-off honestly rather than let it be made silently by default.
Scenarios
A machine that occasionally amputates. A press has injured operators reaching in to clear jams. The plant's first instinct is a warning sign and a glove policy. The engineer applies the hierarchy: can the jam be eliminated by fixing the feed (elimination)? If not, a light curtain and two-hand control that make it physically impossible for the machine to cycle with a hand in the danger zone (engineering control). PPE and signage are the last, weakest layer — and never the plan. The fix is judged by whether a rushed operator can still get hurt, not by whether a rule now exists.
A near-miss that was nearly a fatality. A worker is almost crushed when a suspended load shifts; no injury, no lost time. The temptation is to log it and move on. The engineer investigates it as if it had killed someone, traces it past the rigger's "error" to latent causes — a lifting procedure that didn't specify the right rigging for that load, and schedule pressure that led the crew to skip the check — and fixes the system. The accident pyramid says the next one like it could be the fatality.
A new process with a toxic inventory. A design calls for storing a large quantity of a hazardous gas. Rather than design ever-more-elaborate containment and detection, the engineer pushes inherently safer design: can the process run with a far smaller inventory, generate the reagent on demand, or substitute a less hazardous chemical? Reducing what's stored cuts the worst-case release directly — the Kletz principle that what you don't have can't leak — before adding the layers of protection that a residual inventory still requires.
Related Occupations
Health and safety engineers apply engineering to a goal — human protection — that cuts across every other field. Mechanical, chemical, and electrical engineers are both their collaborators and the source of the hazards they control. The environmental engineer shares the exposure, emissions, and mass-balance discipline aimed outward at the public rather than the worker. The nuclear engineer shares the defense-in-depth and catastrophic-risk mindset. The fire inspector and construction inspector enforce overlapping safety codes in the field. The epidemiologist studies the population-level health outcomes the safety engineer works to prevent at the source.
References
- Safeware: System Safety and Computers — Nancy Leveson
- What Went Wrong? and Lees' Loss Prevention in the Process Industries — Kletz / Mannan
- Industrial Safety and Health Management — Asfahl & Rieske
- Human Error — James Reason
- OSHA standards (29 CFR 1910/1926), NFPA, ANSI, ISO 45001
- NIOSH publications and the Lifting Equation