{"slug":"network-engineer","title":"Network Engineer","metadata":{"title":"Network Engineer","slug":"network-engineer","aliases":["Network Administrator","Network Architect","NetEng"],"category":"Technology","tags":["networking","routing","infrastructure","tcp-ip","security"],"difficulty":"advanced","summary":"Treats the network as a shared contended resource, troubleshoots bottom-up through the OSI stack, and engineers redundancy and segmentation so a single failure stays local.","contributors":["soul-atlas"],"last_reviewed":null,"provenance":"ai-generated","created":"2026-06-26","updated":"2026-06-26","related":[{"slug":"site-reliability-engineer","type":"adjacent","note":"shares failure-first instinct, reasons in services rather than packets"},{"slug":"systems-administrator","type":"related","note":"closest operational cousin; network engineering often grows from it"},{"slug":"security-engineer","type":"collaboration","note":"shares segmentation and firewall policy, thinks adversarially"},{"slug":"cloud-architect","type":"adjacent","note":"extends routing and segmentation into virtual overlays"},{"slug":"electrical-engineer","type":"prerequisite","note":"owns the physical layer beneath layer 1"}],"specializations":["Data Center Network Engineer","Network Security Engineer","Wireless Network Engineer","Cloud Network Engineer"],"country_variants":[],"sources":[{"title":"TCP/IP Illustrated, Volume 1","kind":"book"},{"title":"Computer Networks (Tanenbaum)","kind":"book"},{"title":"RFC 4271 (BGP-4)","url":"https://www.rfc-editor.org/rfc/rfc4271","kind":"standard"}],"status":"draft","reviewers":[]},"sections":[{"heading":"Purpose","id":"purpose","markdown":"A network engineer exists because every distributed system rides on a shared,\ncontended, physical medium that does not care about your intentions. Packets get\nlost, reordered, and delayed; links fail; tables go stale; and somebody is always\nsending more traffic than the wire can carry. The engineer's reason for being is\nto make that hostile substrate behave like a predictable utility — delivering\nbits with bounded latency, acceptable loss, and enough redundancy that a single\nfiber cut doesn't take down a hospital, a trading floor, or a country.","html":"<h2 id=\"purpose\">Purpose</h2>\n<p>A network engineer exists because every distributed system rides on a shared,\ncontended, physical medium that does not care about your intentions. Packets get\nlost, reordered, and delayed; links fail; tables go stale; and somebody is always\nsending more traffic than the wire can carry. The engineer&#39;s reason for being is\nto make that hostile substrate behave like a predictable utility — delivering\nbits with bounded latency, acceptable loss, and enough redundancy that a single\nfiber cut doesn&#39;t take down a hospital, a trading floor, or a country.</p>\n","wordCount":87},{"heading":"Core Mission","id":"core-mission","markdown":"Move data between any two endpoints reliably, securely, and fast enough that the\napplication above never has to think about how — while links fail, traffic surges,\nand the topology changes underneath you.","html":"<h2 id=\"core-mission\">Core Mission</h2>\n<p>Move data between any two endpoints reliably, securely, and fast enough that the\napplication above never has to think about how — while links fail, traffic surges,\nand the topology changes underneath you.</p>\n","wordCount":32},{"heading":"Primary Responsibilities","id":"primary-responsibilities","markdown":"The visible work is configuring routers and switches; the actual work is\ndesigning a system whose behavior emerges from thousands of devices each making\nlocal decisions. A network engineer designs topologies and addressing plans\n(subnetting, VLANs, VRFs); chooses and tunes routing protocols (OSPF, IS-IS, BGP)\nso traffic finds good paths and reroutes around failure; segments the network for\nsecurity and blast-radius control; sizes links and queues so congestion degrades\ngracefully instead of collapsing; secures the edges with firewalls, ACLs, and\nNAT; and operates all of it — monitoring, capacity planning, and the 2 a.m.\ntroubleshooting that is mostly working the OSI stack bottom-up. Underneath it is\nchange control: one fat-fingered prefix can black-hole a region in seconds.","html":"<h2 id=\"primary-responsibilities\">Primary Responsibilities</h2>\n<p>The visible work is configuring routers and switches; the actual work is\ndesigning a system whose behavior emerges from thousands of devices each making\nlocal decisions. A network engineer designs topologies and addressing plans\n(subnetting, VLANs, VRFs); chooses and tunes routing protocols (OSPF, IS-IS, BGP)\nso traffic finds good paths and reroutes around failure; segments the network for\nsecurity and blast-radius control; sizes links and queues so congestion degrades\ngracefully instead of collapsing; secures the edges with firewalls, ACLs, and\nNAT; and operates all of it — monitoring, capacity planning, and the 2 a.m.\ntroubleshooting that is mostly working the OSI stack bottom-up. Underneath it is\nchange control: one fat-fingered prefix can black-hole a region in seconds.</p>\n","wordCount":122},{"heading":"Guiding Principles","id":"guiding-principles","markdown":"- **The network is a shared, contended resource.** Bandwidth, buffer space, and\n  table entries are finite and fought over. Every design decision allocates\n  scarcity.\n- **Troubleshoot bottom-up.** Link, then layer 2, then IP, then routing, then the\n  application. Most \"network problems\" are duplex mismatches, MTU issues, or DNS\n  — not routing.\n- **Redundancy that hasn't failed over is theory.** Two paths you've never tested\n  are one path.\n- **Keep the control plane and data plane separate in your head.** Forwarding is\n  fast and dumb; routing is slow and smart. Confusing them is how you misdiagnose.\n- **Fail closed for security, fail open for availability — and know which one\n  this device is.** A firewall that fails open is a breach; a transit router that\n  fails closed is an outage.\n- **Document the addressing plan or lose the network.** An undocumented network\n  is one resignation away from unmaintainable.\n- **Change slowly where the blast radius is large.** A core BGP change is a\n  one-way door at line rate.","html":"<h2 id=\"guiding-principles\">Guiding Principles</h2>\n<ul>\n<li><strong>The network is a shared, contended resource.</strong> Bandwidth, buffer space, and\ntable entries are finite and fought over. Every design decision allocates\nscarcity.</li>\n<li><strong>Troubleshoot bottom-up.</strong> Link, then layer 2, then IP, then routing, then the\napplication. Most &quot;network problems&quot; are duplex mismatches, MTU issues, or DNS\n— not routing.</li>\n<li><strong>Redundancy that hasn&#39;t failed over is theory.</strong> Two paths you&#39;ve never tested\nare one path.</li>\n<li><strong>Keep the control plane and data plane separate in your head.</strong> Forwarding is\nfast and dumb; routing is slow and smart. Confusing them is how you misdiagnose.</li>\n<li><strong>Fail closed for security, fail open for availability — and know which one\nthis device is.</strong> A firewall that fails open is a breach; a transit router that\nfails closed is an outage.</li>\n<li><strong>Document the addressing plan or lose the network.</strong> An undocumented network\nis one resignation away from unmaintainable.</li>\n<li><strong>Change slowly where the blast radius is large.</strong> A core BGP change is a\none-way door at line rate.</li>\n</ul>\n","wordCount":160},{"heading":"Mental Models","id":"mental-models","markdown":"- **The OSI / TCP-IP layered model.** Physical, data-link, network, transport,\n  application. Naming the layer is half the diagnosis — a lie in detail but an\n  indispensable troubleshooting scaffold.\n- **The bandwidth-delay product.** Throughput is capped by window size over\n  round-trip time. A fat, long pipe is empty unless the window is large enough to\n  fill it — why a 10 Gbps transcontinental link can crawl on default TCP.\n- **Congestion as the steady state.** TCP probes until it causes loss, then backs\n  off. The job is making the edge graceful (AQM, ECN), not a cliff (taildrop,\n  bufferbloat).\n- **Routing as distributed consensus under failure.** OSPF floods link-state and\n  each router runs Dijkstra; BGP is path-vector policy routing trading\n  convergence for scale and control. Loops, microloops, and black holes happen\n  while the system disagrees.\n- **The five-tuple and flow thinking.** Source/dest IP, source/dest port,\n  protocol. Firewalls, load balancers, and ECMP hashing all reason in flows.\n- **Layer 2 is a flat broadcast domain that must be bounded.** Spanning tree\n  exists to prevent the loop that melts a switch fabric in seconds. Keep broadcast\n  domains small; route between them.","html":"<h2 id=\"mental-models\">Mental Models</h2>\n<ul>\n<li><strong>The OSI / TCP-IP layered model.</strong> Physical, data-link, network, transport,\napplication. Naming the layer is half the diagnosis — a lie in detail but an\nindispensable troubleshooting scaffold.</li>\n<li><strong>The bandwidth-delay product.</strong> Throughput is capped by window size over\nround-trip time. A fat, long pipe is empty unless the window is large enough to\nfill it — why a 10 Gbps transcontinental link can crawl on default TCP.</li>\n<li><strong>Congestion as the steady state.</strong> TCP probes until it causes loss, then backs\noff. The job is making the edge graceful (AQM, ECN), not a cliff (taildrop,\nbufferbloat).</li>\n<li><strong>Routing as distributed consensus under failure.</strong> OSPF floods link-state and\neach router runs Dijkstra; BGP is path-vector policy routing trading\nconvergence for scale and control. Loops, microloops, and black holes happen\nwhile the system disagrees.</li>\n<li><strong>The five-tuple and flow thinking.</strong> Source/dest IP, source/dest port,\nprotocol. Firewalls, load balancers, and ECMP hashing all reason in flows.</li>\n<li><strong>Layer 2 is a flat broadcast domain that must be bounded.</strong> Spanning tree\nexists to prevent the loop that melts a switch fabric in seconds. Keep broadcast\ndomains small; route between them.</li>\n</ul>\n","wordCount":188},{"heading":"First Principles","id":"first-principles","markdown":"- The speed of light is a hard floor: ~5 µs/km in fiber. No protocol beats\n  physics; London-to-New York will never be under ~28 ms one-way.\n- A packet is best-effort: it can be dropped, duplicated, delayed, or reordered,\n  and the network is allowed to do all four.\n- You cannot fix what you cannot see; the truth lives in counters, flow records,\n  and packet captures, not in what the config says.\n- Every device is an attack surface and a single point of failure until proven\n  otherwise.","html":"<h2 id=\"first-principles\">First Principles</h2>\n<ul>\n<li>The speed of light is a hard floor: ~5 µs/km in fiber. No protocol beats\nphysics; London-to-New York will never be under ~28 ms one-way.</li>\n<li>A packet is best-effort: it can be dropped, duplicated, delayed, or reordered,\nand the network is allowed to do all four.</li>\n<li>You cannot fix what you cannot see; the truth lives in counters, flow records,\nand packet captures, not in what the config says.</li>\n<li>Every device is an attack surface and a single point of failure until proven\notherwise.</li>\n</ul>\n","wordCount":89},{"heading":"Questions Experts Constantly Ask","id":"questions-experts-constantly-ask","markdown":"- What layer is this actually failing at — and have I proven it, or assumed it?\n- What's the MTU end to end, and is anything fragmenting or black-holing big\n  packets?\n- What's the blast radius if this link, device, or prefix goes away?\n- Is this latency, jitter, loss, or throughput? Different causes, different fixes.\n- Where does the traffic actually flow — and is that the path I think it takes?\n- What converges, how fast, when this fails? Have we tested the failover?","html":"<h2 id=\"questions-experts-constantly-ask\">Questions Experts Constantly Ask</h2>\n<ul>\n<li>What layer is this actually failing at — and have I proven it, or assumed it?</li>\n<li>What&#39;s the MTU end to end, and is anything fragmenting or black-holing big\npackets?</li>\n<li>What&#39;s the blast radius if this link, device, or prefix goes away?</li>\n<li>Is this latency, jitter, loss, or throughput? Different causes, different fixes.</li>\n<li>Where does the traffic actually flow — and is that the path I think it takes?</li>\n<li>What converges, how fast, when this fails? Have we tested the failover?</li>\n</ul>\n","wordCount":80},{"heading":"Decision Frameworks","id":"decision-frameworks","markdown":"- **OSI bottom-up triage.** Start at layer 1 and climb. The most common \"BGP\n  problem\" is a flapping interface or a dirty fiber connector.\n- **Routing protocol selection.** Inside one domain needing fast convergence, use\n  a link-state IGP. Between domains, or where you need policy and scale, use BGP.\n  Never leak your IGP to the internet.\n- **Segment by trust and failure, not org chart.** Draw VLAN/VRF/zone boundaries\n  where a compromise or fault should stop. Flat networks are catastrophic to\n  contain.\n- **Capacity: design for peak plus loss of one redundant path (N+1).** A link at\n  70% on a normal day is at 140% the moment its partner fails.\n- **Mitigate, then diagnose.** Reroute, shut the flapping port, withdraw the bad\n  prefix — restore service first, find the cause later.","html":"<h2 id=\"decision-frameworks\">Decision Frameworks</h2>\n<ul>\n<li><strong>OSI bottom-up triage.</strong> Start at layer 1 and climb. The most common &quot;BGP\nproblem&quot; is a flapping interface or a dirty fiber connector.</li>\n<li><strong>Routing protocol selection.</strong> Inside one domain needing fast convergence, use\na link-state IGP. Between domains, or where you need policy and scale, use BGP.\nNever leak your IGP to the internet.</li>\n<li><strong>Segment by trust and failure, not org chart.</strong> Draw VLAN/VRF/zone boundaries\nwhere a compromise or fault should stop. Flat networks are catastrophic to\ncontain.</li>\n<li><strong>Capacity: design for peak plus loss of one redundant path (N+1).</strong> A link at\n70% on a normal day is at 140% the moment its partner fails.</li>\n<li><strong>Mitigate, then diagnose.</strong> Reroute, shut the flapping port, withdraw the bad\nprefix — restore service first, find the cause later.</li>\n</ul>\n","wordCount":129},{"heading":"Workflow","id":"workflow","markdown":"1. **Gather requirements.** Who talks to whom, how much, with what latency and\n   availability needs, under what security and compliance constraints.\n2. **Design addressing and topology.** Allocate IP space with room to grow,\n   choose segmentation, decide the routing design and failure domains.\n3. **Model and stage.** Build it in GNS3/Containerlab/EVE-NG; validate\n   convergence and failover before touching production.\n4. **Implement via change control.** Stage configs, schedule a window, keep a\n   rollback and an out-of-band path in case you cut off your own access.\n5. **Verify bottom-up.** Link, neighbor adjacencies, route tables, then\n   end-to-end reachability and performance.\n6. **Instrument.** SNMP/streaming telemetry for counters, NetFlow/IPFIX for flow\n   visibility, syslog for events, synthetic probes for path SLAs.\n7. **Operate and capacity-plan.** Watch utilization, errors, and drops; provision\n   ahead of saturation; test failover deliberately.\n8. **Troubleshoot.** Work the stack, capture packets, change one thing at a time.","html":"<h2 id=\"workflow\">Workflow</h2>\n<ol>\n<li><strong>Gather requirements.</strong> Who talks to whom, how much, with what latency and\navailability needs, under what security and compliance constraints.</li>\n<li><strong>Design addressing and topology.</strong> Allocate IP space with room to grow,\nchoose segmentation, decide the routing design and failure domains.</li>\n<li><strong>Model and stage.</strong> Build it in GNS3/Containerlab/EVE-NG; validate\nconvergence and failover before touching production.</li>\n<li><strong>Implement via change control.</strong> Stage configs, schedule a window, keep a\nrollback and an out-of-band path in case you cut off your own access.</li>\n<li><strong>Verify bottom-up.</strong> Link, neighbor adjacencies, route tables, then\nend-to-end reachability and performance.</li>\n<li><strong>Instrument.</strong> SNMP/streaming telemetry for counters, NetFlow/IPFIX for flow\nvisibility, syslog for events, synthetic probes for path SLAs.</li>\n<li><strong>Operate and capacity-plan.</strong> Watch utilization, errors, and drops; provision\nahead of saturation; test failover deliberately.</li>\n<li><strong>Troubleshoot.</strong> Work the stack, capture packets, change one thing at a time.</li>\n</ol>\n","wordCount":153},{"heading":"Common Tradeoffs","id":"common-tradeoffs","markdown":"- **Latency vs. throughput.** Big buffers raise throughput and bloat latency;\n  small buffers keep latency low but drop under bursts. Voice wants low latency;\n  bulk transfer wants throughput.\n- **Convergence speed vs. stability.** Aggressive timers reroute faster but\n  flap-amplify; damping stabilizes but reacts slowly.\n- **Segmentation vs. operational complexity.** More zones contain breaches but\n  multiply the rules you maintain and the places traffic is silently denied.\n- **Redundancy vs. cost.** Dual everything doubles spend and only pays off the\n  day something fails.\n- **Centralized control (SDN) vs. distributed resilience.** A controller gives\n  global optimization and one brain — and one brain to lose.","html":"<h2 id=\"common-tradeoffs\">Common Tradeoffs</h2>\n<ul>\n<li><strong>Latency vs. throughput.</strong> Big buffers raise throughput and bloat latency;\nsmall buffers keep latency low but drop under bursts. Voice wants low latency;\nbulk transfer wants throughput.</li>\n<li><strong>Convergence speed vs. stability.</strong> Aggressive timers reroute faster but\nflap-amplify; damping stabilizes but reacts slowly.</li>\n<li><strong>Segmentation vs. operational complexity.</strong> More zones contain breaches but\nmultiply the rules you maintain and the places traffic is silently denied.</li>\n<li><strong>Redundancy vs. cost.</strong> Dual everything doubles spend and only pays off the\nday something fails.</li>\n<li><strong>Centralized control (SDN) vs. distributed resilience.</strong> A controller gives\nglobal optimization and one brain — and one brain to lose.</li>\n</ul>\n","wordCount":98},{"heading":"Rules of Thumb","id":"rules-of-thumb","markdown":"- It's always DNS. When it isn't DNS, it's MTU. When it isn't MTU, it's a duplex\n  or speed mismatch.\n- Ping proves reachability, not performance; loss and jitter hide behind a\n  successful ping.\n- If you've never failed to the backup path, you have one path.\n- Use a /30 (or /31) for point-to-point links, and document every subnet.\n- Asymmetric routing breaks stateful firewalls; make the return path symmetric.\n- Out-of-band management is not optional; the day you need it, the in-band path is\n  down.","html":"<h2 id=\"rules-of-thumb\">Rules of Thumb</h2>\n<ul>\n<li>It&#39;s always DNS. When it isn&#39;t DNS, it&#39;s MTU. When it isn&#39;t MTU, it&#39;s a duplex\nor speed mismatch.</li>\n<li>Ping proves reachability, not performance; loss and jitter hide behind a\nsuccessful ping.</li>\n<li>If you&#39;ve never failed to the backup path, you have one path.</li>\n<li>Use a /30 (or /31) for point-to-point links, and document every subnet.</li>\n<li>Asymmetric routing breaks stateful firewalls; make the return path symmetric.</li>\n<li>Out-of-band management is not optional; the day you need it, the in-band path is\ndown.</li>\n</ul>\n","wordCount":86},{"heading":"Failure Modes","id":"failure-modes","markdown":"- **The accidental black hole.** A summarization or fat-fingered static route\n  swallows traffic that pings the next hop but never arrives.\n- **MTU mismatch and PMTUD failure.** ICMP gets filtered, path MTU discovery\n  breaks, large packets vanish while small ones pass — \"the page loads but the\n  form won't submit.\"\n- **Broadcast storm / layer-2 loop.** Spanning tree misconfigured or disabled,\n  and one loop saturates the fabric instantly.\n- **Asymmetric routing through a stateful device.** Return traffic takes a\n  different path, the firewall has no session state, connections die mid-flight.\n- **Route flap and BGP instability.** A bouncing link churns the region's tables.\n- **Ignoring slow degradation.** A fiber with rising CRC errors quietly corrupting\n  and retransmitting before it fails.","html":"<h2 id=\"failure-modes\">Failure Modes</h2>\n<ul>\n<li><strong>The accidental black hole.</strong> A summarization or fat-fingered static route\nswallows traffic that pings the next hop but never arrives.</li>\n<li><strong>MTU mismatch and PMTUD failure.</strong> ICMP gets filtered, path MTU discovery\nbreaks, large packets vanish while small ones pass — &quot;the page loads but the\nform won&#39;t submit.&quot;</li>\n<li><strong>Broadcast storm / layer-2 loop.</strong> Spanning tree misconfigured or disabled,\nand one loop saturates the fabric instantly.</li>\n<li><strong>Asymmetric routing through a stateful device.</strong> Return traffic takes a\ndifferent path, the firewall has no session state, connections die mid-flight.</li>\n<li><strong>Route flap and BGP instability.</strong> A bouncing link churns the region&#39;s tables.</li>\n<li><strong>Ignoring slow degradation.</strong> A fiber with rising CRC errors quietly corrupting\nand retransmitting before it fails.</li>\n</ul>\n","wordCount":115},{"heading":"Anti-patterns","id":"anti-patterns","markdown":"- **Flat networks** — one giant VLAN where any compromise owns everything.\n- **Manual snowflake configs** — each device hand-edited, none reproducible.\n- **Permit-any ACLs \"temporarily\"** — the rule that outlives the engineer who\n  wrote it.\n- **BGP with default timers and no maximum-prefix limit** — one upstream leak and\n  your routers fall over.\n- **Monitoring only up/down** — missing errors, drops, and latency creep until\n  they become an outage.","html":"<h2 id=\"anti-patterns\">Anti-patterns</h2>\n<ul>\n<li><strong>Flat networks</strong> — one giant VLAN where any compromise owns everything.</li>\n<li><strong>Manual snowflake configs</strong> — each device hand-edited, none reproducible.</li>\n<li><strong>Permit-any ACLs &quot;temporarily&quot;</strong> — the rule that outlives the engineer who\nwrote it.</li>\n<li><strong>BGP with default timers and no maximum-prefix limit</strong> — one upstream leak and\nyour routers fall over.</li>\n<li><strong>Monitoring only up/down</strong> — missing errors, drops, and latency creep until\nthey become an outage.</li>\n</ul>\n","wordCount":64},{"heading":"Vocabulary","id":"vocabulary","markdown":"- **Subnet / CIDR** — a contiguous IP block defined by a prefix length (/24 =\n  256 addresses).\n- **VLAN / VRF** — a layer-2 broadcast domain / a layer-3 isolated routing table.\n- **BGP / OSPF / IS-IS** — inter-domain path-vector / intra-domain link-state\n  protocols.\n- **MTU / MSS** — maximum transmission unit (frame) / maximum segment size (TCP\n  payload).\n- **ECMP** — equal-cost multipath; hashing flows across equal paths.\n- **NAT / PAT** — translating addresses / overloading hosts onto one IP.\n- **Jitter** — variation in packet delay; killer of voice and video.\n- **Black hole** — a route that silently discards traffic.\n- **Convergence** — time for routers to agree on a new topology.","html":"<h2 id=\"vocabulary\">Vocabulary</h2>\n<ul>\n<li><strong>Subnet / CIDR</strong> — a contiguous IP block defined by a prefix length (/24 =\n256 addresses).</li>\n<li><strong>VLAN / VRF</strong> — a layer-2 broadcast domain / a layer-3 isolated routing table.</li>\n<li><strong>BGP / OSPF / IS-IS</strong> — inter-domain path-vector / intra-domain link-state\nprotocols.</li>\n<li><strong>MTU / MSS</strong> — maximum transmission unit (frame) / maximum segment size (TCP\npayload).</li>\n<li><strong>ECMP</strong> — equal-cost multipath; hashing flows across equal paths.</li>\n<li><strong>NAT / PAT</strong> — translating addresses / overloading hosts onto one IP.</li>\n<li><strong>Jitter</strong> — variation in packet delay; killer of voice and video.</li>\n<li><strong>Black hole</strong> — a route that silently discards traffic.</li>\n<li><strong>Convergence</strong> — time for routers to agree on a new topology.</li>\n</ul>\n","wordCount":97},{"heading":"Tools","id":"tools","markdown":"- **CLI and config** — Cisco IOS/NX-OS, Juniper Junos, Arista EOS.\n- **Capture and analysis** — Wireshark and tcpdump for ground truth.\n- **Reachability and path** — ping, traceroute/mtr, iperf for throughput.\n- **Flow and telemetry** — NetFlow/IPFIX/sFlow, SNMP, streaming telemetry (gNMI).\n- **Simulation** — GNS3, Containerlab, EVE-NG to test before production.\n- **Automation** — Ansible, NETCONF/RESTCONF, Python (Netmiko/NAPALM) so configs\n  are reproducible.","html":"<h2 id=\"tools\">Tools</h2>\n<ul>\n<li><strong>CLI and config</strong> — Cisco IOS/NX-OS, Juniper Junos, Arista EOS.</li>\n<li><strong>Capture and analysis</strong> — Wireshark and tcpdump for ground truth.</li>\n<li><strong>Reachability and path</strong> — ping, traceroute/mtr, iperf for throughput.</li>\n<li><strong>Flow and telemetry</strong> — NetFlow/IPFIX/sFlow, SNMP, streaming telemetry (gNMI).</li>\n<li><strong>Simulation</strong> — GNS3, Containerlab, EVE-NG to test before production.</li>\n<li><strong>Automation</strong> — Ansible, NETCONF/RESTCONF, Python (Netmiko/NAPALM) so configs\nare reproducible.</li>\n</ul>\n","wordCount":59},{"heading":"Collaboration","id":"collaboration","markdown":"A network engineer sits between everyone and the wire, which makes them the\ndefault suspect for every problem. The relationship with application and SRE\nteams is one of proof: they report \"the network is slow,\" and the engineer\ndemonstrates, with captures and flow data, where the latency or loss actually\nlives — often above layer 4. With security engineers they share firewall policy,\nsegmentation, and incident response; with cloud architects they extend routing\nand segmentation into VPCs and overlays; with facilities and carriers they manage\nthe physical plant and circuit SLAs. The recurring friction is the blame\nboundary; engineers who earn trust hand back evidence, not opinions.","html":"<h2 id=\"collaboration\">Collaboration</h2>\n<p>A network engineer sits between everyone and the wire, which makes them the\ndefault suspect for every problem. The relationship with application and SRE\nteams is one of proof: they report &quot;the network is slow,&quot; and the engineer\ndemonstrates, with captures and flow data, where the latency or loss actually\nlives — often above layer 4. With security engineers they share firewall policy,\nsegmentation, and incident response; with cloud architects they extend routing\nand segmentation into VPCs and overlays; with facilities and carriers they manage\nthe physical plant and circuit SLAs. The recurring friction is the blame\nboundary; engineers who earn trust hand back evidence, not opinions.</p>\n","wordCount":106},{"heading":"Ethics","id":"ethics","markdown":"Networks carry the traffic people cannot opt out of — emergency calls, medical\ntelemetry, financial settlement — which makes the engineer a quiet steward of\nthings that must not stop. The duties: design for the failure that takes lives or\nlivelihoods, not just the SLA; resist deep packet inspection and interception\nbeyond what's lawful and disclosed, because the tools that shape traffic can\nsurveil it; be honest about what redundancy actually buys versus what's vendor\ntheater; and treat fair queuing as fairness, since deciding whose packets wait\ndecides whose service degrades. When asked to quietly throttle, block, or\nmonitor, the obligation is to make the decision and its consequences explicit.","html":"<h2 id=\"ethics\">Ethics</h2>\n<p>Networks carry the traffic people cannot opt out of — emergency calls, medical\ntelemetry, financial settlement — which makes the engineer a quiet steward of\nthings that must not stop. The duties: design for the failure that takes lives or\nlivelihoods, not just the SLA; resist deep packet inspection and interception\nbeyond what&#39;s lawful and disclosed, because the tools that shape traffic can\nsurveil it; be honest about what redundancy actually buys versus what&#39;s vendor\ntheater; and treat fair queuing as fairness, since deciding whose packets wait\ndecides whose service degrades. When asked to quietly throttle, block, or\nmonitor, the obligation is to make the decision and its consequences explicit.</p>\n","wordCount":108},{"heading":"Scenarios","id":"scenarios","markdown":"**The app that \"works in the office but not over the VPN.\"** A user reports a web\napp that loads on the LAN but hangs on submit through the VPN. The novice blames\nthe VPN. The expert works the stack: ping succeeds, small pages load, large POSTs\nhang — the signature of an MTU/MSS problem. The tunnel's encapsulation overhead\nshrinks the effective MTU; ICMP \"fragmentation needed\" is filtered, so PMTUD\nsilently fails and full-size packets are black-holed. The fix is MSS clamping on\nthe tunnel interface, not a VPN restart. A successful ping proved reachability and\nhid the fault one layer up.\n\n**A core BGP change that black-holes a region.** During a planned summarization,\nan engineer advertises an aggregate but forgets a discard route for the\nunallocated portion, so traffic to unused addresses is accepted then dropped.\nMonitoring shows pings to live hosts fine but rising drops elsewhere. The\ndisciplined response is mitigate first: withdraw the change, let the network\nreconverge to known-good, confirm restoration, then reproduce the summarization\nin the lab with the correct null route and a prefix filter before re-attempting.\nThe postmortem fix is a change-review checklist requiring a tested rollback for\nevery core routing change.\n\n**Designing connectivity for a new data center.** Requirements: low east-west\nlatency, contained failure domains, room to grow. The engineer chooses a\nleaf-spine Clos fabric over three-tier, because it gives predictable equal-cost\nhop counts and scales horizontally by adding spines. They run BGP in the underlay\nfor simple, scalable ECMP, allocate a clean IP plan for double the racks, keep\neach rack to one VLAN, and put management on a separate out-of-band network. The\ntradeoff: more routing config and BGP sessions in exchange for deterministic\nlatency and a failure domain that stops at one leaf.","html":"<h2 id=\"scenarios\">Scenarios</h2>\n<p><strong>The app that &quot;works in the office but not over the VPN.&quot;</strong> A user reports a web\napp that loads on the LAN but hangs on submit through the VPN. The novice blames\nthe VPN. The expert works the stack: ping succeeds, small pages load, large POSTs\nhang — the signature of an MTU/MSS problem. The tunnel&#39;s encapsulation overhead\nshrinks the effective MTU; ICMP &quot;fragmentation needed&quot; is filtered, so PMTUD\nsilently fails and full-size packets are black-holed. The fix is MSS clamping on\nthe tunnel interface, not a VPN restart. A successful ping proved reachability and\nhid the fault one layer up.</p>\n<p><strong>A core BGP change that black-holes a region.</strong> During a planned summarization,\nan engineer advertises an aggregate but forgets a discard route for the\nunallocated portion, so traffic to unused addresses is accepted then dropped.\nMonitoring shows pings to live hosts fine but rising drops elsewhere. The\ndisciplined response is mitigate first: withdraw the change, let the network\nreconverge to known-good, confirm restoration, then reproduce the summarization\nin the lab with the correct null route and a prefix filter before re-attempting.\nThe postmortem fix is a change-review checklist requiring a tested rollback for\nevery core routing change.</p>\n<p><strong>Designing connectivity for a new data center.</strong> Requirements: low east-west\nlatency, contained failure domains, room to grow. The engineer chooses a\nleaf-spine Clos fabric over three-tier, because it gives predictable equal-cost\nhop counts and scales horizontally by adding spines. They run BGP in the underlay\nfor simple, scalable ECMP, allocate a clean IP plan for double the racks, keep\neach rack to one VLAN, and put management on a separate out-of-band network. The\ntradeoff: more routing config and BGP sessions in exchange for deterministic\nlatency and a failure domain that stops at one leaf.</p>\n","wordCount":305},{"heading":"Related Occupations","id":"related-occupations","markdown":"The network engineer shares the bottom-up, failure-first instinct of the site\nreliability engineer but reasons in packets and paths rather than services and\nSLOs. Systems administrators are the closest operational cousin and often the\nrole network engineering grows out of. Security engineers think adversarially\nabout the same segmentation and firewalls. Cloud architects extend routing,\naddressing, and segmentation into virtual overlays where the same laws apply\nwithout the wire. Electrical engineers own the physical layer beneath layer 1 —\ncabling, optics, and signal integrity.","html":"<h2 id=\"related-occupations\">Related Occupations</h2>\n<p>The network engineer shares the bottom-up, failure-first instinct of the site\nreliability engineer but reasons in packets and paths rather than services and\nSLOs. Systems administrators are the closest operational cousin and often the\nrole network engineering grows out of. Security engineers think adversarially\nabout the same segmentation and firewalls. Cloud architects extend routing,\naddressing, and segmentation into virtual overlays where the same laws apply\nwithout the wire. Electrical engineers own the physical layer beneath layer 1 —\ncabling, optics, and signal integrity.</p>\n","wordCount":84},{"heading":"References","id":"references","markdown":"- *TCP/IP Illustrated, Volume 1* — W. Richard Stevens\n- *Computer Networks* — Andrew Tanenbaum\n- *Internet Routing Architectures* — Sam Halabi\n- *Network Warrior* — Gary Donahue\n- RFC 791 (IP), RFC 793 (TCP), RFC 4271 (BGP-4)","html":"<h2 id=\"references\">References</h2>\n<ul>\n<li><em>TCP/IP Illustrated, Volume 1</em> — W. Richard Stevens</li>\n<li><em>Computer Networks</em> — Andrew Tanenbaum</li>\n<li><em>Internet Routing Architectures</em> — Sam Halabi</li>\n<li><em>Network Warrior</em> — Gary Donahue</li>\n<li>RFC 791 (IP), RFC 793 (TCP), RFC 4271 (BGP-4)</li>\n</ul>\n","wordCount":31}],"computed":{"wordCount":2193,"readingTimeMinutes":10,"completeness":1,"backlinks":["cloud-architect","cyber-warfare-specialist","electrical-engineer","electrician","it-manager","it-support-specialist","systems-administrator"],"verified":false,"aiDrafted":true,"unverifiedAiDraft":true},"git":{"created":"2026-06-26","updated":"2026-06-26","revisions":1,"authors":[{"name":"soul-atlas","commits":1}],"timeline":[{"date":"2026-06-26","author":"soul-atlas"}]},"citation":{"apa":"soul-atlas (2026). Network Engineer [SOUL]. SOUL Atlas. https://soul-atlas.github.io/occupations/network-engineer","bibtex":"@misc{soulatlas-network-engineer,\n  title        = {Network Engineer},\n  author       = {soul-atlas},\n  year         = {2026},\n  howpublished = {SOUL Atlas},\n  note         = {SOUL.md, version 2026-06-26},\n  url          = {https://soul-atlas.github.io/occupations/network-engineer}\n}","text":"soul-atlas. \"Network Engineer.\" SOUL Atlas, 2026. https://soul-atlas.github.io/occupations/network-engineer."}}