{"slug":"sports-analyst","title":"Sports Analyst","metadata":{"title":"Sports Analyst","slug":"sports-analyst","aliases":["Performance Analyst","Sports Data Analyst","Sabermetrician"],"category":"Sports","tags":["sports-analytics","performance-analysis","expected-value","scouting","statistics"],"difficulty":"advanced","summary":"Separates skill from luck and signal from noise in competition, then distills it into one actionable insight a coach will use, with the uncertainty stated.","contributors":["soul-atlas"],"last_reviewed":null,"provenance":"ai-generated","created":"2026-06-26","updated":"2026-06-26","related":[{"slug":"data-scientist","type":"adjacent","note":"shares distributional, uncertainty-first reasoning but under a coach's time pressure"},{"slug":"coach","type":"collaboration","note":"owns the decision the analysis is meant to inform"},{"slug":"athlete","type":"related","note":"the subject the numbers describe and increasingly a consumer of them"},{"slug":"data-engineer","type":"prerequisite","note":"maintains the clean data pipelines analysis depends on"},{"slug":"broadcast-journalist","type":"adjacent","note":"turns the same numbers into a public story"},{"slug":"financial-analyst","type":"related","note":"also prices uncertainty and fights regression to the mean for a decision-maker"}],"specializations":["Recruitment / Scouting Analyst","Tactical / Opposition Analyst","Broadcast Analyst","Sabermetrician"],"country_variants":[],"sources":[{"title":"Moneyball","kind":"book"},{"title":"The Numbers Game","kind":"book"},{"title":"Basketball on Paper","kind":"book"},{"title":"The Signal and the Noise","kind":"book"}],"status":"draft","reviewers":[]},"sections":[{"heading":"Purpose","id":"purpose","markdown":"A sports analyst exists to convert the noise of competition into a decision a\ncoach, scout, or front office will actually make. 
Games throw off enormous\namounts of data and almost all of it is irrelevant to the next choice that has\nto be made — who to play, who to sign, how to defend a corner kick, when to pull\na starter. The analyst's reason for being is to find the small signal inside\nthat noise, attach an honest level of confidence to it, and hand it over in a\nform a coaching staff can act on under time pressure. The job is judgment about\nevidence, not the production of charts.","html":"<h2 id=\"purpose\">Purpose</h2>\n<p>A sports analyst exists to convert the noise of competition into a decision a\ncoach, scout, or front office will actually make. Games throw off enormous\namounts of data and almost all of it is irrelevant to the next choice that has\nto be made — who to play, who to sign, how to defend a corner kick, when to pull\na starter. The analyst&#39;s reason for being is to find the small signal inside\nthat noise, attach an honest level of confidence to it, and hand it over in a\nform a coaching staff can act on under time pressure. The job is judgment about\nevidence, not the production of charts.</p>\n","wordCount":111},{"heading":"Core Mission","id":"core-mission","markdown":"Answer the question the coach actually needs answered — separating skill from\nluck and signal from noise well enough to change one real decision, and saying\nhow sure you are while you do it.","html":"<h2 id=\"core-mission\">Core Mission</h2>\n<p>Answer the question the coach actually needs answered — separating skill from\nluck and signal from noise well enough to change one real decision, and saying\nhow sure you are while you do it.</p>\n","wordCount":33},{"heading":"Primary Responsibilities","id":"primary-responsibilities","markdown":"The visible work is models, dashboards, and clips; the actual work is framing\nquestions and managing uncertainty. 
An analyst spends their time turning a vague\nbrief (\"are we creating enough?\") into an answerable one (\"are our open-play\nchances worth more xG per possession than last season, controlling for game\nstate?\"); building and maintaining metrics from event data and tracking data;\nscouting opponents for exploitable tendencies; evaluating players for\nrecruitment against role and budget; preparing pre-match and post-match reports;\ntagging video so a number points back to something a coach can watch; and, on\nthe broadcast or media side, explaining all of this to an audience that has\nnever heard of regression to the mean. Underneath everything sits the same\ndiscipline: deciding what a number means in *this* context before reporting it.","html":"<h2 id=\"primary-responsibilities\">Primary Responsibilities</h2>\n<p>The visible work is models, dashboards, and clips; the actual work is framing\nquestions and managing uncertainty. An analyst spends their time turning a vague\nbrief (&quot;are we creating enough?&quot;) into an answerable one (&quot;are our open-play\nchances worth more xG per possession than last season, controlling for game\nstate?&quot;); building and maintaining metrics from event data and tracking data;\nscouting opponents for exploitable tendencies; evaluating players for\nrecruitment against role and budget; preparing pre-match and post-match reports;\ntagging video so a number points back to something a coach can watch; and, on\nthe broadcast or media side, explaining all of this to an audience that has\nnever heard of regression to the mean. Underneath everything sits the same\ndiscipline: deciding what a number means in <em>this</em> context before reporting it.</p>\n","wordCount":134},{"heading":"Guiding Principles","id":"guiding-principles","markdown":"- **Describe, then predict — never confuse the two.** What happened and what\n  will happen are different questions. 
A player who shot 60% from three over ten\n  games described a hot streak; he did not predict one.\n- **One actionable insight beats ten true ones.** A coach can act on a single\n  clear instruction before a match. A report with forty findings gets ignored\n  entirely. Pick the one that changes a decision.\n- **Sample size is the first question, not the last.** Before interpreting any\n  rate, ask whether there's enough of it to mean anything. Most \"trends\" are\n  small samples regressing in plain sight.\n- **A metric that drives no decision is a vanity metric.** If knowing the number\n  wouldn't change selection, tactics, or recruitment, don't report it.\n- **Context is part of the number.** A 55% true-shooting wing and a 55%\n  true-shooting center are not equally good shooters; role, usage, and scheme\n  change what the same figure means.\n- **Communicate the uncertainty, not just the point estimate.** \"About a goal\n  better, but I'm not confident\" is more useful and more honest than a false\n  decimal.\n- **Watch the games.** Numbers without film drift into nonsense; film without\n  numbers drifts into bias. You need both eyes open.","html":"<h2 id=\"guiding-principles\">Guiding Principles</h2>\n<ul>\n<li><strong>Describe, then predict — never confuse the two.</strong> What happened and what\nwill happen are different questions. A player who shot 60% from three over ten\ngames described a hot streak; he did not predict one.</li>\n<li><strong>One actionable insight beats ten true ones.</strong> A coach can act on a single\nclear instruction before a match. A report with forty findings gets ignored\nentirely. Pick the one that changes a decision.</li>\n<li><strong>Sample size is the first question, not the last.</strong> Before interpreting any\nrate, ask whether there&#39;s enough of it to mean anything. 
Most &quot;trends&quot; are\nsmall samples regressing in plain sight.</li>\n<li><strong>A metric that drives no decision is a vanity metric.</strong> If knowing the number\nwouldn&#39;t change selection, tactics, or recruitment, don&#39;t report it.</li>\n<li><strong>Context is part of the number.</strong> A 55% true-shooting wing and a 55%\ntrue-shooting center are not equally good shooters; role, usage, and scheme\nchange what the same figure means.</li>\n<li><strong>Communicate the uncertainty, not just the point estimate.</strong> &quot;About a goal\nbetter, but I&#39;m not confident&quot; is more useful and more honest than a false\ndecimal.</li>\n<li><strong>Watch the games.</strong> Numbers without film drift into nonsense; film without\nnumbers drifts into bias. You need both eyes open.</li>\n</ul>\n","wordCount":201},{"heading":"Mental Models","id":"mental-models","markdown":"- **Expected value over outcomes.** Judge process, not the bounce. Expected goals\n  (xG), expected points added (EPA), and similar models score the quality of a\n  decision or shot independent of whether it went in. A team can lose 1–0 having\n  dominated 2.4 xG to 0.3 — they played well and got unlucky, and you say so.\n- **Regression to the mean.** Extreme performances are part true talent, part\n  luck, and luck doesn't repeat. The more extreme and the smaller the sample, the\n  harder it regresses. This is the single most violated idea in sport.\n- **Signal vs. noise / true talent vs. variance.** Every observed stat is true\n  ability plus randomness. The analyst's job is to estimate the ability and\n  discount the noise — shrinking small samples back toward the relevant base\n  rate.\n- **Base rates first.** Before judging a player or play, anchor on the population\n  rate. 
A 45% conversion on a chance type that converts 30% league-wide is the\n  story; the raw count is not.\n- **Composite metrics as compression, with known blind spots.** PER, WAR, EPA,\n  possession-adjusted defensive stats — each compresses many things into one\n  number and each lies about something (PER undervalues defense; WAR depends on\n  positional adjustments). Use them as a first screen, never as a verdict.\n- **Pitch control / Voronoi.** Tracking data lets you model the space each player\n  controls and the value of a pass into space, capturing off-ball work that event\n  data misses entirely.\n- **Leverage and clutch context.** Not all moments are equal. A late-game,\n  one-possession situation carries more leverage; \"clutch\" is mostly small-sample\n  noise dressed as a trait, but the leverage weighting is real.","html":"<h2 id=\"mental-models\">Mental Models</h2>\n<ul>\n<li><strong>Expected value over outcomes.</strong> Judge process, not the bounce. Expected goals\n(xG), expected points added (EPA), and similar models score the quality of a\ndecision or shot independent of whether it went in. A team can lose 1–0 having\ndominated 2.4 xG to 0.3 — they played well and got unlucky, and you say so.</li>\n<li><strong>Regression to the mean.</strong> Extreme performances are part true talent, part\nluck, and luck doesn&#39;t repeat. The more extreme and the smaller the sample, the\nharder it regresses. This is the single most violated idea in sport.</li>\n<li><strong>Signal vs. noise / true talent vs. variance.</strong> Every observed stat is true\nability plus randomness. The analyst&#39;s job is to estimate the ability and\ndiscount the noise — shrinking small samples back toward the relevant base\nrate.</li>\n<li><strong>Base rates first.</strong> Before judging a player or play, anchor on the population\nrate. 
A 45% conversion on a chance type that converts 30% league-wide is the\nstory; the raw count is not.</li>\n<li><strong>Composite metrics as compression, with known blind spots.</strong> PER, WAR, EPA,\npossession-adjusted defensive stats — each compresses many things into one\nnumber and each lies about something (PER undervalues defense; WAR depends on\npositional adjustments). Use them as a first screen, never as a verdict.</li>\n<li><strong>Pitch control / Voronoi.</strong> Tracking data lets you model the space each player\ncontrols and the value of a pass into space, capturing off-ball work that event\ndata misses entirely.</li>\n<li><strong>Leverage and clutch context.</strong> Not all moments are equal. A late-game,\none-possession situation carries more leverage; &quot;clutch&quot; is mostly small-sample\nnoise dressed as a trait, but the leverage weighting is real.</li>\n</ul>\n","wordCount":273},{"heading":"First Principles","id":"first-principles","markdown":"- The scoreboard is a lagging, low-sample summary of a process you can measure\n  more directly.\n- Randomness looks like a pattern to the human eye; the eye is a hypothesis\n  generator, not a verdict.\n- You can only manage what you can measure honestly, and most things are measured\n  with error.\n- A model is a deliberate simplification; knowing what it ignores matters as much\n  as what it captures.","html":"<h2 id=\"first-principles\">First Principles</h2>\n<ul>\n<li>The scoreboard is a lagging, low-sample summary of a process you can measure\nmore directly.</li>\n<li>Randomness looks like a pattern to the human eye; the eye is a hypothesis\ngenerator, not a verdict.</li>\n<li>You can only manage what you can measure honestly, and most things are measured\nwith error.</li>\n<li>A model is a deliberate simplification; knowing what it ignores matters as much\nas what it captures.</li>\n</ul>\n","wordCount":67},{"heading":"Questions Experts Constantly 
Ask","id":"questions-experts-constantly-ask","markdown":"- What decision is this for, and who is making it?\n- Is the sample big enough to mean anything yet?\n- How much of this is skill and how much is variance?\n- What's the base rate I should be comparing against?\n- Does this metric mean the same thing for this role and this scheme?\n- If I'm wrong, how wrong, and how would I know?\n- Can the coach watch the clip that this number points to?\n- Am I describing the past or predicting the future — and which did they ask for?","html":"<h2 id=\"questions-experts-constantly-ask\">Questions Experts Constantly Ask</h2>\n<ul>\n<li>What decision is this for, and who is making it?</li>\n<li>Is the sample big enough to mean anything yet?</li>\n<li>How much of this is skill and how much is variance?</li>\n<li>What&#39;s the base rate I should be comparing against?</li>\n<li>Does this metric mean the same thing for this role and this scheme?</li>\n<li>If I&#39;m wrong, how wrong, and how would I know?</li>\n<li>Can the coach watch the clip that this number points to?</li>\n<li>Am I describing the past or predicting the future — and which did they ask for?</li>\n</ul>\n","wordCount":88},{"heading":"Decision Frameworks","id":"decision-frameworks","markdown":"- **Description vs. prediction split.** First decide which question is being\n  asked. Descriptive work can use raw observed data; predictive work must\n  regress, weight by sample, and quote uncertainty.\n- **Stabilization thresholds.** Know roughly how many events each metric needs\n  before it predicts itself — shot volume stabilizes fast, shooting percentage\n  slowly, on-ice plus/minus barely at all. Trust the fast-stabilizing inputs.\n- **Signal triage for a report.** Rank every candidate finding by (effect size ×\n  reliability × actionability). 
Report the top one or two; archive the rest.\n- **Scout the tendency, not the average.** For opponents, the exploitable thing is\n  conditional: what they do on third-and-long, where the left-back steps up,\n  which release the closer throws when behind in the count.\n- **Build vs. buy a metric.** Use an established public model (xG, EPA) unless you\n  have data or a question it can't serve; bespoke models are a maintenance debt\n  you own forever.","html":"<h2 id=\"decision-frameworks\">Decision Frameworks</h2>\n<ul>\n<li><strong>Description vs. prediction split.</strong> First decide which question is being\nasked. Descriptive work can use raw observed data; predictive work must\nregress, weight by sample, and quote uncertainty.</li>\n<li><strong>Stabilization thresholds.</strong> Know roughly how many events each metric needs\nbefore it predicts itself — shot volume stabilizes fast, shooting percentage\nslowly, on-ice plus/minus barely at all. Trust the fast-stabilizing inputs.</li>\n<li><strong>Signal triage for a report.</strong> Rank every candidate finding by (effect size ×\nreliability × actionability). Report the top one or two; archive the rest.</li>\n<li><strong>Scout the tendency, not the average.</strong> For opponents, the exploitable thing is\nconditional: what they do on third-and-long, where the left-back steps up,\nwhich release the closer throws when behind in the count.</li>\n<li><strong>Build vs. buy a metric.</strong> Use an established public model (xG, EPA) unless you\nhave data or a question it can&#39;t serve; bespoke models are a maintenance debt\nyou own forever.</li>\n</ul>\n","wordCount":151},{"heading":"Workflow","id":"workflow","markdown":"1. **Frame.** Sit with the coach or scout and turn the brief into one answerable\n   question tied to a decision. Most of the value is created here.\n2. 
**Pull and clean.** Gather event data (Opta/StatsBomb-style), tracking/optical\n   data, and box scores; reconcile sources, fix tagging errors, define the\n   sample.\n3. **Establish the base rate.** Compute the population comparison before looking\n   at the subject.\n4. **Model or measure.** Apply xG/EPA or the relevant metric; for prediction,\n   regress small samples and estimate uncertainty.\n5. **Cross-check with film.** Pull the clips behind the number. If the tape and\n   the table disagree, find out why before trusting either.\n6. **Distill.** Reduce to the single insight and the action it implies.\n7. **Communicate.** One slide, one sentence, one clip — calibrated to the\n   audience, with the confidence stated.\n8. **Close the loop.** After the match, check whether the read held and feed the\n   error back into the next model and the next conversation.","html":"<h2 id=\"workflow\">Workflow</h2>\n<ol>\n<li><strong>Frame.</strong> Sit with the coach or scout and turn the brief into one answerable\nquestion tied to a decision. Most of the value is created here.</li>\n<li><strong>Pull and clean.</strong> Gather event data (Opta/StatsBomb-style), tracking/optical\ndata, and box scores; reconcile sources, fix tagging errors, define the\nsample.</li>\n<li><strong>Establish the base rate.</strong> Compute the population comparison before looking\nat the subject.</li>\n<li><strong>Model or measure.</strong> Apply xG/EPA or the relevant metric; for prediction,\nregress small samples and estimate uncertainty.</li>\n<li><strong>Cross-check with film.</strong> Pull the clips behind the number. 
If the tape and\nthe table disagree, find out why before trusting either.</li>\n<li><strong>Distill.</strong> Reduce to the single insight and the action it implies.</li>\n<li><strong>Communicate.</strong> One slide, one sentence, one clip — calibrated to the\naudience, with the confidence stated.</li>\n<li><strong>Close the loop.</strong> After the match, check whether the read held and feed the\nerror back into the next model and the next conversation.</li>\n</ol>\n","wordCount":161},{"heading":"Common Tradeoffs","id":"common-tradeoffs","markdown":"- **Accuracy vs. timeliness.** A perfect model after kickoff is worthless. Often\n  a rough number before the meeting beats a precise one after it.\n- **Model complexity vs. interpretability.** A black box that predicts better but\n  nobody trusts loses to a simpler model a coach will act on.\n- **Sample size vs. recency.** More data is more reliable but may describe a\n  player or scheme that no longer exists. Weight recent, discount old.\n- **Depth vs. attention budget.** Coaches have minutes, not hours. Every extra\n  finding spends attention you needed for the important one.\n- **Optical/tracking richness vs. event-data coverage.** Tracking sees off-ball\n  movement but isn't available everywhere; event data is ubiquitous but blind to\n  space.","html":"<h2 id=\"common-tradeoffs\">Common Tradeoffs</h2>\n<ul>\n<li><strong>Accuracy vs. timeliness.</strong> A perfect model after kickoff is worthless. Often\na rough number before the meeting beats a precise one after it.</li>\n<li><strong>Model complexity vs. interpretability.</strong> A black box that predicts better but\nnobody trusts loses to a simpler model a coach will act on.</li>\n<li><strong>Sample size vs. recency.</strong> More data is more reliable but may describe a\nplayer or scheme that no longer exists. Weight recent, discount old.</li>\n<li><strong>Depth vs. attention budget.</strong> Coaches have minutes, not hours. 
Every extra\nfinding spends attention you needed for the important one.</li>\n<li><strong>Optical/tracking richness vs. event-data coverage.</strong> Tracking sees off-ball\nmovement but isn&#39;t available everywhere; event data is ubiquitous but blind to\nspace.</li>\n</ul>\n","wordCount":114},{"heading":"Rules of Thumb","id":"rules-of-thumb","markdown":"- If a stat surprises you, check the sample size before you check anything else.\n- The hot hand is usually regression waiting to happen.\n- Percentages without denominators are propaganda.\n- When two metrics disagree, go to the tape.\n- Possession-adjust defensive stats or you'll reward bad teams for being on\n  defense a lot.\n- Lead with the answer; put the methodology in an appendix nobody opens.\n- If you can't name the decision a number serves, cut it.\n- A point estimate without an error bar is a guess in a suit.","html":"<h2 id=\"rules-of-thumb\">Rules of Thumb</h2>\n<ul>\n<li>If a stat surprises you, check the sample size before you check anything else.</li>\n<li>The hot hand is usually regression waiting to happen.</li>\n<li>Percentages without denominators are propaganda.</li>\n<li>When two metrics disagree, go to the tape.</li>\n<li>Possession-adjust defensive stats or you&#39;ll reward bad teams for being on\ndefense a lot.</li>\n<li>Lead with the answer; put the methodology in an appendix nobody opens.</li>\n<li>If you can&#39;t name the decision a number serves, cut it.</li>\n<li>A point estimate without an error bar is a guess in a suit.</li>\n</ul>\n","wordCount":87},{"heading":"Failure Modes","id":"failure-modes","markdown":"- **Overfitting.** A model that nails last season's results and predicts nothing.\n  The more knobs, the more it memorizes noise.\n- **Narrative-chasing.** Building the analysis to confirm what the coach already\n  believes, or what makes a clean broadcast story.\n- **Small-sample certainty.** Declaring a trait after six games, eight at-bats, or\n  a single tournament.\n- 
**Metric tunnel vision.** Optimizing the number instead of the thing the number\n  was meant to proxy (chasing xG by taking low-value long shots).\n- **Context blindness.** Comparing players across roles, leagues, or schemes as\n  if the stat means the same thing everywhere.\n- **The data dump.** Burying the one usable insight under thirty true but useless\n  ones.\n- **Mistaking precision for accuracy.** Reporting 0.347 when you know it to maybe\n  ±0.1.","html":"<h2 id=\"failure-modes\">Failure Modes</h2>\n<ul>\n<li><strong>Overfitting.</strong> A model that nails last season&#39;s results and predicts nothing.\nThe more knobs, the more it memorizes noise.</li>\n<li><strong>Narrative-chasing.</strong> Building the analysis to confirm what the coach already\nbelieves, or what makes a clean broadcast story.</li>\n<li><strong>Small-sample certainty.</strong> Declaring a trait after six games, eight at-bats, or\na single tournament.</li>\n<li><strong>Metric tunnel vision.</strong> Optimizing the number instead of the thing the number\nwas meant to proxy (chasing xG by taking low-value long shots).</li>\n<li><strong>Context blindness.</strong> Comparing players across roles, leagues, or schemes as\nif the stat means the same thing everywhere.</li>\n<li><strong>The data dump.</strong> Burying the one usable insight under thirty true but useless\nones.</li>\n<li><strong>Mistaking precision for accuracy.</strong> Reporting 0.347 when you know it to maybe\n±0.1.</li>\n</ul>\n","wordCount":125},{"heading":"Anti-patterns","id":"anti-patterns","markdown":"- **Predicting from descriptive stats** — using last month's shooting % as next\n  month's forecast with no regression.\n- **p-hacking the highlight** — slicing the data until some split looks\n  significant, then telling that story.\n- **Cherry-picked clips** — three videos that confirm the model and none that\n  challenge it.\n- **Composite-metric worship** — citing a single WAR/PER figure as the final word\n  on a 
player.\n- **Decimal theater** — false precision that implies confidence you don't have.\n- **Answering the question you can model instead of the one they asked.**","html":"<h2 id=\"anti-patterns\">Anti-patterns</h2>\n<ul>\n<li><strong>Predicting from descriptive stats</strong> — using last month&#39;s shooting % as next\nmonth&#39;s forecast with no regression.</li>\n<li><strong>p-hacking the highlight</strong> — slicing the data until some split looks\nsignificant, then telling that story.</li>\n<li><strong>Cherry-picked clips</strong> — three videos that confirm the model and none that\nchallenge it.</li>\n<li><strong>Composite-metric worship</strong> — citing a single WAR/PER figure as the final word\non a player.</li>\n<li><strong>Decimal theater</strong> — false precision that implies confidence you don&#39;t have.</li>\n<li><strong>Answering the question you can model instead of the one they asked.</strong></li>\n</ul>\n","wordCount":83},{"heading":"Vocabulary","id":"vocabulary","markdown":"- **xG (expected goals)** — the probability a chance becomes a goal given its\n  characteristics; sums to a shot- or possession-quality estimate.\n- **EPA (expected points added)** — the change in a team's expected points from a\n  play, the football analog of expected value per decision.\n- **True shooting %** — shooting efficiency that accounts for threes and free\n  throws, not just field goals.\n- **Possession-adjusted** — a per-opportunity rate that normalizes for how much a\n  player or team is exposed to the situation.\n- **Regression to the mean** — the tendency of extreme results to be followed by\n  more average ones.\n- **Base rate** — the underlying population frequency of an event.\n- **Leverage** — how much a single moment can swing the outcome.\n- **Pitch control / Voronoi** — a tracking-data model of which player owns which\n  region of the field.\n- **Stabilization** — the sample size at which a metric reliably predicts itself.\n- **Event data vs. 
tracking data** — logged on-ball actions vs. continuous x/y\n  positions of every player.","html":"<h2 id=\"vocabulary\">Vocabulary</h2>\n<ul>\n<li><strong>xG (expected goals)</strong> — the probability a chance becomes a goal given its\ncharacteristics; sums to a shot- or possession-quality estimate.</li>\n<li><strong>EPA (expected points added)</strong> — the change in a team&#39;s expected points from a\nplay, the football analog of expected value per decision.</li>\n<li><strong>True shooting %</strong> — shooting efficiency that accounts for threes and free\nthrows, not just field goals.</li>\n<li><strong>Possession-adjusted</strong> — a per-opportunity rate that normalizes for how much a\nplayer or team is exposed to the situation.</li>\n<li><strong>Regression to the mean</strong> — the tendency of extreme results to be followed by\nmore average ones.</li>\n<li><strong>Base rate</strong> — the underlying population frequency of an event.</li>\n<li><strong>Leverage</strong> — how much a single moment can swing the outcome.</li>\n<li><strong>Pitch control / Voronoi</strong> — a tracking-data model of which player owns which\nregion of the field.</li>\n<li><strong>Stabilization</strong> — the sample size at which a metric reliably predicts itself.</li>\n<li><strong>Event data vs. tracking data</strong> — logged on-ball actions vs. 
continuous x/y\npositions of every player.</li>\n</ul>\n","wordCount":157},{"heading":"Tools","id":"tools","markdown":"- **Event-data feeds** (Opta, StatsBomb, Stats Perform) — the backbone of most\n  match analysis.\n- **Tracking/optical data** (Second Spectrum, Hawk-Eye, Catapult wearables) — for\n  off-ball movement, load, and space.\n- **Video platforms** (Hudl, Sportscode, Wyscout) — for tagging and clipping so\n  numbers point back to film.\n- **R and Python** (tidyverse, pandas, scikit-learn) — for modeling and quick\n  analysis; SQL for pulling it.\n- **Visualization** (ggplot, shot maps, pass networks) — to make a finding\n  legible in one glance.\n- **Public models** (xG, EPA, WAR implementations) — proven baselines you don't\n  have to rebuild.","html":"<h2 id=\"tools\">Tools</h2>\n<ul>\n<li><strong>Event-data feeds</strong> (Opta, StatsBomb, Stats Perform) — the backbone of most\nmatch analysis.</li>\n<li><strong>Tracking/optical data</strong> (Second Spectrum, Hawk-Eye, Catapult wearables) — for\noff-ball movement, load, and space.</li>\n<li><strong>Video platforms</strong> (Hudl, Sportscode, Wyscout) — for tagging and clipping so\nnumbers point back to film.</li>\n<li><strong>R and Python</strong> (tidyverse, pandas, scikit-learn) — for modeling and quick\nanalysis; SQL for pulling it.</li>\n<li><strong>Visualization</strong> (ggplot, shot maps, pass networks) — to make a finding\nlegible in one glance.</li>\n<li><strong>Public models</strong> (xG, EPA, WAR implementations) — proven baselines you don&#39;t\nhave to rebuild.</li>\n</ul>\n","wordCount":87},{"heading":"Collaboration","id":"collaboration","markdown":"The analyst is a translator standing between the data and the people who act on\nit. 
With the head coach and assistants, the work is framing and brevity — find\nthe question, deliver one usable answer in their language and their time window.\nWith sports scientists and athletic trainers, they share load and injury data\nand argue over what's signal. With recruitment and scouting, they pair models\nwith human eyes; the best calls come from the two agreeing or from understanding\nexactly why they don't. On the technical side they lean on data engineers to\nkeep pipelines clean and data scientists for heavier modeling. On broadcast,\nthe partner is the producer and the audience, and the craft is making regression\nto the mean sound like common sense. The recurring friction is trust: a number\nthat contradicts a coach's eyes is rejected unless it comes with film and\nhumility.","html":"<h2 id=\"collaboration\">Collaboration</h2>\n<p>The analyst is a translator standing between the data and the people who act on\nit. With the head coach and assistants, the work is framing and brevity — find\nthe question, deliver one usable answer in their language and their time window.\nWith sports scientists and athletic trainers, they share load and injury data\nand argue over what&#39;s signal. With recruitment and scouting, they pair models\nwith human eyes; the best calls come from the two agreeing or from understanding\nexactly why they don&#39;t. On the technical side they lean on data engineers to\nkeep pipelines clean and data scientists for heavier modeling. On broadcast,\nthe partner is the producer and the audience, and the craft is making regression\nto the mean sound like common sense. The recurring friction is trust: a number\nthat contradicts a coach&#39;s eyes is rejected unless it comes with film and\nhumility.</p>\n","wordCount":147},{"heading":"Ethics","id":"ethics","markdown":"Analysts increasingly hold sway over who gets paid, played, and cut, which makes\nhonest uncertainty a duty rather than a style choice. 
Core obligations: never\noverstate confidence to win an argument; disclose what the model ignores; resist\npressure to manufacture a number that justifies a decision already made; protect\nathletes' private biometric and medical data, which can end careers if leaked or\nmisused; and be careful that a metric doesn't entrench a bias — penalizing a\nplaying style, body type, or background the model was never built to judge. In\nbroadcast, the duty is to the audience's understanding, not to the cleanest\nstory; reporting a fluke as destiny is a small lie that compounds. The quiet\npower here is that people believe numbers more than they should, so the person\nproducing them owes them extra care.","html":"<h2 id=\"ethics\">Ethics</h2>\n<p>Analysts increasingly hold sway over who gets paid, played, and cut, which makes\nhonest uncertainty a duty rather than a style choice. Core obligations: never\noverstate confidence to win an argument; disclose what the model ignores; resist\npressure to manufacture a number that justifies a decision already made; protect\nathletes&#39; private biometric and medical data, which can end careers if leaked or\nmisused; and be careful that a metric doesn&#39;t entrench a bias — penalizing a\nplaying style, body type, or background the model was never built to judge. In\nbroadcast, the duty is to the audience&#39;s understanding, not to the cleanest\nstory; reporting a fluke as destiny is a small lie that compounds. The quiet\npower here is that people believe numbers more than they should, so the person\nproducing them owes them extra care.</p>\n","wordCount":135},{"heading":"Scenarios","id":"scenarios","markdown":"**The team that \"can't finish.\"** The head coach, after three 1–0 losses, wants\nto drill shooting. The analyst reframes: are we creating good chances and missing\nthem, or creating bad chances? 
The numbers show 2.1 xG per game against 0.7\ngoals — chance quality is fine; the conversion is a small-sample cold streak that\nwill regress. Reporting that alone would feel like excuse-making, so the analyst\ngoes to the tape and finds the one real issue: the chances are coming from wide,\nlow-value cutbacks rather than central positions. The deliverable is one\ninstruction — get a runner to the penalty spot on cutbacks — plus the calm\nmessage that the finishing will return. The team isn't broken; the coach was\nabout to fix the wrong thing.\n\n**Scouting the closer.** A baseball staff wants to game-plan a relief pitcher.\nThe average line says nothing. The analyst slices by leverage and count: when\nahead, he throws the slider 70% of the time low-and-away; when behind, he\nabandons it for fastballs middle. The actionable insight is a single hitting cue\n— sit fastball when the count favors you, lay off the low slider when it doesn't.\nThe analyst states the sample (forty-one plate appearances in that split) and the\nconfidence honestly, because a tendency on a small sample can vanish, but it's\nenough to shift the approach.\n\n**The recruitment red flag.** Scouting hands up a winger with a gaudy goal tally\nin a weaker league; the model and the eye disagree. The analyst checks the base\nrate and the underlying numbers: the goals far outrun the xG, meaning he's\nfinished above expectation — a number that regresses hard, especially stepping up\nin competition. League-adjusting his output drops him from a star to a rotation\npiece. The recommendation isn't \"don't sign him\" but \"don't pay for the goals;\npay for the chance creation, which is real and translates, and expect the tally\nto fall.\" That distinction is the difference between a smart signing and an\nexpensive one.","html":"<h2 id=\"scenarios\">Scenarios</h2>\n<p><strong>The team that &quot;can&#39;t finish.&quot;</strong> The head coach, after three 1–0 losses, wants\nto drill shooting. 
The analyst reframes: are we creating good chances and missing\nthem, or creating bad chances? The numbers show 2.1 xG per game against 0.7\ngoals — chance quality is fine; the conversion is a small-sample cold streak that\nwill regress. Reporting that alone would feel like excuse-making, so the analyst\ngoes to the tape and finds the one real issue: the chances are coming from wide,\nlow-value cutbacks rather than central positions. The deliverable is one\ninstruction — get a runner to the penalty spot on cutbacks — plus the calm\nmessage that the finishing will return. The team isn&#39;t broken; the coach was\nabout to fix the wrong thing.</p>\n<p><strong>Scouting the closer.</strong> A baseball staff wants to game-plan a relief pitcher.\nThe average line says nothing. The analyst slices by leverage and count: when\nahead, he throws the slider 70% of the time low-and-away; when behind, he\nabandons it for fastballs middle. The actionable insight is a single hitting cue\n— sit fastball when the count favors you, lay off the low slider when it doesn&#39;t.\nThe analyst states the sample (forty-one plate appearances in that split) and the\nconfidence honestly, because a tendency on a small sample can vanish, but it&#39;s\nenough to shift the approach.</p>\n<p><strong>The recruitment red flag.</strong> Scouting hands up a winger with a gaudy goal tally\nin a weaker league; the model and the eye disagree. The analyst checks the base\nrate and the underlying numbers: the goals far outrun the xG, meaning he&#39;s\nfinished above expectation — a number that regresses hard, especially stepping up\nin competition. League-adjusting his output drops him from a star to a rotation\npiece. 
The recommendation isn&#39;t &quot;don&#39;t sign him&quot; but &quot;don&#39;t pay for the goals;\npay for the chance creation, which is real and translates, and expect the tally\nto fall.&quot; That distinction is the difference between a smart signing and an\nexpensive one.</p>\n","wordCount":339},{"heading":"Related Occupations","id":"related-occupations","markdown":"A sports analyst shares the distributional, uncertainty-first thinking of a data\nscientist but applies it under a coach's time pressure to a single decision. The\ncoach owns the call; the analyst sharpens the evidence behind it. The athlete is\nthe subject the numbers describe and, increasingly, a consumer of them. Data\nengineers keep the pipelines that the analysis depends on. On the media side, the\nbroadcast journalist turns the same numbers into a public story, with the analyst\nguarding against the fluke-as-destiny temptation. The financial analyst is a\nsurprisingly close cousin: both separate signal from noise, fight regression to\nthe mean, and price uncertainty for a decision-maker.","html":"<h2 id=\"related-occupations\">Related Occupations</h2>\n<p>A sports analyst shares the distributional, uncertainty-first thinking of a data\nscientist but applies it under a coach&#39;s time pressure to a single decision. The\ncoach owns the call; the analyst sharpens the evidence behind it. The athlete is\nthe subject the numbers describe and, increasingly, a consumer of them. Data\nengineers keep the pipelines that the analysis depends on. On the media side, the\nbroadcast journalist turns the same numbers into a public story, with the analyst\nguarding against the fluke-as-destiny temptation. 
The financial analyst is a\nsurprisingly close cousin: both separate signal from noise, fight regression to\nthe mean, and price uncertainty for a decision-maker.</p>\n","wordCount":111},{"heading":"References","id":"references","markdown":"- *Moneyball* — Michael Lewis\n- *The Numbers Game* — Chris Anderson & David Sally\n- *Soccermatics* — David Sumpter\n- *Basketball on Paper* — Dean Oliver\n- *Thinking, Fast and Slow* — Daniel Kahneman\n- *The Signal and the Noise* — Nate Silver","html":"<h2 id=\"references\">References</h2>\n<ul>\n<li><em>Moneyball</em> — Michael Lewis</li>\n<li><em>The Numbers Game</em> — Chris Anderson &amp; David Sally</li>\n<li><em>Soccermatics</em> — David Sumpter</li>\n<li><em>Basketball on Paper</em> — Dean Oliver</li>\n<li><em>Thinking, Fast and Slow</em> — Daniel Kahneman</li>\n<li><em>The Signal and the Noise</em> — Nate Silver</li>\n</ul>\n","wordCount":31}],"computed":{"wordCount":2635,"readingTimeMinutes":12,"completeness":1,"backlinks":["athlete","coach","referee"],"verified":false,"aiDrafted":true,"unverifiedAiDraft":true},"git":{"created":"2026-06-26","updated":"2026-06-26","revisions":1,"authors":[{"name":"soul-atlas","commits":1}],"timeline":[{"date":"2026-06-26","author":"soul-atlas"}]},"citation":{"apa":"soul-atlas (2026). Sports Analyst [SOUL]. SOUL Atlas. https://soul-atlas.github.io/occupations/sports-analyst","bibtex":"@misc{soulatlas-sports-analyst,\n  title        = {Sports Analyst},\n  author       = {soul-atlas},\n  year         = {2026},\n  howpublished = {SOUL Atlas},\n  note         = {SOUL.md, version 2026-06-26},\n  url          = {https://soul-atlas.github.io/occupations/sports-analyst}\n}","text":"soul-atlas. \"Sports Analyst.\" SOUL Atlas, 2026. https://soul-atlas.github.io/occupations/sports-analyst."}}