{"slug":"linguist","title":"Linguist","metadata":{"title":"Linguist","slug":"linguist","aliases":["Language Scientist","Philologist","Linguistics Researcher"],"category":"Science","tags":["language","phonology","syntax","fieldwork","sociolinguistics"],"difficulty":"advanced","summary":"Describes how language actually works by treating it as a natural object to observe rather than manners to enforce, separating what speakers know from what they think they know.","contributors":["soul-atlas"],"last_reviewed":null,"provenance":"ai-generated","created":"2026-06-26","updated":"2026-06-26","related":[{"slug":"anthropologist","type":"adjacent","note":"shares the descriptive fieldwork stance toward a human system"},{"slug":"philosopher","type":"related","note":"overlaps in semantics and philosophy of language"},{"slug":"speech-language-pathologist","type":"adjacent","note":"applies sound and structure analysis clinically"},{"slug":"prompt-engineer","type":"collaboration","note":"builds language technology where linguistic questions resurface"},{"slug":"writer","type":"related","note":"manipulates from the inside the system the linguist describes"},{"slug":"neuroscientist","type":"adjacent","note":"asks where the language faculty lives in the brain"}],"specializations":["Phonologist","Syntactician","Sociolinguist","Documentary Linguist"],"country_variants":[],"sources":[{"title":"Course in General Linguistics","kind":"book"},{"title":"Syntactic Structures","kind":"book"},{"title":"Sociolinguistic Patterns","kind":"book"},{"title":"Language","kind":"book"}],"status":"draft","reviewers":[]},"sections":[{"heading":"Purpose","id":"purpose","markdown":"Language is the most complex behavior any species routinely performs, and almost\nall of it runs below conscious awareness. A linguist makes that hidden machinery\nvisible — describing the rules speakers follow without being able to state them.\nThe job is not to teach people to talk \"properly\" but to figure out what every\nfluent speaker already knows implicitly: which sounds count as different, which\nsentences feel wrong and why, how meaning crosses from one mind to another. The\nthing we use most fluently is the thing we least understand.","html":"<h2 id=\"purpose\">Purpose</h2>\n<p>Language is the most complex behavior any species routinely performs, and almost\nall of it runs below conscious awareness. A linguist makes that hidden machinery\nvisible — describing the rules speakers follow without being able to state them.\nThe job is not to teach people to talk &quot;properly&quot; but to figure out what every\nfluent speaker already knows implicitly: which sounds count as different, which\nsentences feel wrong and why, how meaning crosses from one mind to another. The\nthing we use most fluently is the thing we least understand.</p>\n","wordCount":89},{"heading":"Core Mission","id":"core-mission","markdown":"Describe how human language actually works — what speakers know, how they use it,\nand how it changes — by treating language as a natural object to be observed, not\na set of manners to be enforced.","html":"<h2 id=\"core-mission\">Core Mission</h2>\n<p>Describe how human language actually works — what speakers know, how they use it,\nand how it changes — by treating language as a natural object to be observed, not\na set of manners to be enforced.</p>\n","wordCount":35},{"heading":"Primary Responsibilities","id":"primary-responsibilities","markdown":"The work splits across analysis and observation. A linguist collects data —\nrecording speakers, running elicitation, querying corpora, transcribing into the\nIPA — and analyzes it at every level, breaking a sound stream into phonemes,\nwords into morphemes, sentences into constituents, utterances into speech acts.\nThey test hypotheses about the underlying system, usually rules or constraints,\nagainst grammaticality judgments and frequency counts. Many document endangered\nlanguages before the last fluent speakers die, producing grammars, dictionaries,\nand collections that outlive their consultants. Others run experiments, model\nvariation, reconstruct proto-languages, or advise on policy, forensics, and\nlanguage technology. Underneath it all is the discipline of separating what\nspeakers *say* from what they *think they say*, what they *do* from what they're\n*told* to do.","html":"<h2 id=\"primary-responsibilities\">Primary Responsibilities</h2>\n<p>The work splits across analysis and observation. A linguist collects data —\nrecording speakers, running elicitation, querying corpora, transcribing into the\nIPA — and analyzes it at every level, breaking a sound stream into phonemes,\nwords into morphemes, sentences into constituents, utterances into speech acts.\nThey test hypotheses about the underlying system, usually rules or constraints,\nagainst grammaticality judgments and frequency counts. Many document endangered\nlanguages before the last fluent speakers die, producing grammars, dictionaries,\nand collections that outlive their consultants. Others run experiments, model\nvariation, reconstruct proto-languages, or advise on policy, forensics, and\nlanguage technology. Underneath it all is the discipline of separating what\nspeakers <em>say</em> from what they <em>think they say</em>, what they <em>do</em> from what they&#39;re\n<em>told</em> to do.</p>\n","wordCount":121},{"heading":"Guiding Principles","id":"guiding-principles","markdown":"- **Describe, don't prescribe.** \"Ain't,\" double negatives, and \"between you and\n  I\" are facts about how language is used, not errors. Explain the pattern, never\n  grade it. Every variety has a grammar; none more logical than another.\n- **The native speaker is the authority.** If a fluent speaker says a sentence\n  sounds wrong, it is wrong — for that grammar. Theory answers to judgments.\n- **Form and meaning are separable.** The sign is arbitrary (Saussure): nothing\n  about *dog* connects it to the animal. Analyze sound, structure, and meaning\n  separately before relating them.\n- **Know vs. do.** Competence (the internal system) and performance (messy speech,\n  slips and false starts) are different objects. Don't reject a rule for a\n  stutter.\n- **Variation is structured, not noise.** When speakers vary — *-in'* vs.\n  *-ing* — the variation is patterned by social and linguistic factors (Labov).\n- **Frequency is data,** not a side note: how often a form occurs, and where.","html":"<h2 id=\"guiding-principles\">Guiding Principles</h2>\n<ul>\n<li><strong>Describe, don&#39;t prescribe.</strong> &quot;Ain&#39;t,&quot; double negatives, and &quot;between you and\nI&quot; are facts about how language is used, not errors. Explain the pattern, never\ngrade it. Every variety has a grammar; none more logical than another.</li>\n<li><strong>The native speaker is the authority.</strong> If a fluent speaker says a sentence\nsounds wrong, it is wrong — for that grammar. Theory answers to judgments.</li>\n<li><strong>Form and meaning are separable.</strong> The sign is arbitrary (Saussure): nothing\nabout <em>dog</em> connects it to the animal. Analyze sound, structure, and meaning\nseparately before relating them.</li>\n<li><strong>Know vs. do.</strong> Competence (the internal system) and performance (messy speech,\nslips and false starts) are different objects. Don&#39;t reject a rule for a\nstutter.</li>\n<li><strong>Variation is structured, not noise.</strong> When speakers vary — <em>-in&#39;</em> vs.\n<em>-ing</em> — the variation is patterned by social and linguistic factors (Labov).</li>\n<li><strong>Frequency is data,</strong> not a side note: how often a form occurs, and where.</li>\n</ul>\n","wordCount":148},{"heading":"Mental Models","id":"mental-models","markdown":"- **The levels of analysis.** Language is a stack: phonetics (physical sounds),\n  phonology (the sound system), morphology (word structure), syntax (sentence\n  structure), semantics (literal meaning), pragmatics (meaning in context), and\n  discourse (above the sentence). Confusion comes from arguing across levels.\n- **Phoneme vs. allophone.** A phoneme distinguishes meaning; its allophones are\n  predictable variants that don't. English [pʰ] in *pin* and [p] in *spin* are\n  one phoneme; the minimal pair (*pin* / *bin*) is the test.\n- **Langue vs. parole** (Saussure). Langue is the shared system; parole the\n  individual speech act. Study the system through its instances.\n- **Generative grammar and recursion** (Chomsky). A finite grammar generates\n  infinitely many sentences because rules embed inside themselves — why a\n  memorized list could never describe a language.\n- **Arbitrariness and duality of the sign.** Meaningless units (sounds) combine\n  into meaningful ones (words); that double layer lets a few dozen phonemes build\n  an unbounded vocabulary.\n- **The comparative method.** Systematic sound correspondences across related\n  languages reconstruct an unattested parent. Grimm's Law (PIE *p → f*, *t → θ* in\n  Germanic) is the model: change is regular, hence reversible.\n- **Grammaticalization.** Content words erode into grammar over time — Latin\n  *cantare habeo* (\"I have to sing\") became the French future *chanterai*.\n- **The variable** (Labov). A point where speakers choose between equivalent\n  forms; its distribution across class, age, and style is the data of\n  sociolinguistics.","html":"<h2 id=\"mental-models\">Mental Models</h2>\n<ul>\n<li><strong>The levels of analysis.</strong> Language is a stack: phonetics (physical sounds),\nphonology (the sound system), morphology (word structure), syntax (sentence\nstructure), semantics (literal meaning), pragmatics (meaning in context), and\ndiscourse (above the sentence). Confusion comes from arguing across levels.</li>\n<li><strong>Phoneme vs. allophone.</strong> A phoneme distinguishes meaning; its allophones are\npredictable variants that don&#39;t. English [pʰ] in <em>pin</em> and [p] in <em>spin</em> are\none phoneme; the minimal pair (<em>pin</em> / <em>bin</em>) is the test.</li>\n<li><strong>Langue vs. parole</strong> (Saussure). Langue is the shared system; parole the\nindividual speech act. Study the system through its instances.</li>\n<li><strong>Generative grammar and recursion</strong> (Chomsky). A finite grammar generates\ninfinitely many sentences because rules embed inside themselves — why a\nmemorized list could never describe a language.</li>\n<li><strong>Arbitrariness and duality of the sign.</strong> Meaningless units (sounds) combine\ninto meaningful ones (words); that double layer lets a few dozen phonemes build\nan unbounded vocabulary.</li>\n<li><strong>The comparative method.</strong> Systematic sound correspondences across related\nlanguages reconstruct an unattested parent. Grimm&#39;s Law (PIE <em>p → f</em>, <em>t → θ</em> in\nGermanic) is the model: change is regular, hence reversible.</li>\n<li><strong>Grammaticalization.</strong> Content words erode into grammar over time — Latin\n<em>cantare habeo</em> (&quot;I have to sing&quot;) became the French future <em>chanterai</em>.</li>\n<li><strong>The variable</strong> (Labov). A point where speakers choose between equivalent\nforms; its distribution across class, age, and style is the data of\nsociolinguistics.</li>\n</ul>\n","wordCount":216},{"heading":"First Principles","id":"first-principles","markdown":"- Every living language is rule-governed and adequate to its speakers' needs.\n- Language changes constantly, and the change is regular enough to study.\n- What a speaker can judge beats what they can explain.\n- A description that can't be falsified by a speaker isn't a description.","html":"<h2 id=\"first-principles\">First Principles</h2>\n<ul>\n<li>Every living language is rule-governed and adequate to its speakers&#39; needs.</li>\n<li>Language changes constantly, and the change is regular enough to study.</li>\n<li>What a speaker can judge beats what they can explain.</li>\n<li>A description that can&#39;t be falsified by a speaker isn&#39;t a description.</li>\n</ul>\n","wordCount":45},{"heading":"Questions Experts Constantly Ask","id":"questions-experts-constantly-ask","markdown":"- Is this a minimal pair — does swapping this sound change the word?\n- Is this difference contrastive, or just predictable from context?\n- What's the morpheme breakdown, and what does each piece mean?\n- Is this sentence ungrammatical, or just pragmatically odd?\n- Am I describing the language, or my own prescriptive prejudice?\n- Does this hold cross-linguistically, or only in languages I happen to know?\n- Is the speaker telling me what they say, or what they think they should say?","html":"<h2 id=\"questions-experts-constantly-ask\">Questions Experts Constantly Ask</h2>\n<ul>\n<li>Is this a minimal pair — does swapping this sound change the word?</li>\n<li>Is this difference contrastive, or just predictable from context?</li>\n<li>What&#39;s the morpheme breakdown, and what does each piece mean?</li>\n<li>Is this sentence ungrammatical, or just pragmatically odd?</li>\n<li>Am I describing the language, or my own prescriptive prejudice?</li>\n<li>Does this hold cross-linguistically, or only in languages I happen to know?</li>\n<li>Is the speaker telling me what they say, or what they think they should say?</li>\n</ul>\n","wordCount":77},{"heading":"Decision Frameworks","id":"decision-frameworks","markdown":"- **Contrastive analysis.** To decide whether two sounds are separate phonemes,\n  search for a minimal pair. Found one → contrastive. None, with the variants in\n  complementary distribution → allophones of one phoneme.\n- **Elicitation vs. corpus vs. experiment.** For a rare construction or an\n  undocumented language, elicit; for frequency and natural usage, use a corpus;\n  for processing or acquisition, experiment. Match method to whether you need\n  possibility, frequency, or cause.\n- **Comparative method vs. internal reconstruction.** Multiple related languages →\n  compare cognates for regular correspondences. A single language → reconstruct\n  internally from its alternations alone.\n- **Competence vs. performance data.** Grammaticality judgments probe the system;\n  corpora and recordings show it under load. Judgments for the *possible*, usage\n  data for the *probable*.\n- **Sapir-Whorf, handled carefully.** Entertain that language shapes thought, but\n  demand the strong claim be tested behaviorally, never assumed. Color terms and\n  spatial frames are defensible cases; \"Eskimos have N words for snow\" the\n  cautionary tale.","html":"<h2 id=\"decision-frameworks\">Decision Frameworks</h2>\n<ul>\n<li><strong>Contrastive analysis.</strong> To decide whether two sounds are separate phonemes,\nsearch for a minimal pair. Found one → contrastive. None, with the variants in\ncomplementary distribution → allophones of one phoneme.</li>\n<li><strong>Elicitation vs. corpus vs. experiment.</strong> For a rare construction or an\nundocumented language, elicit; for frequency and natural usage, use a corpus;\nfor processing or acquisition, experiment. Match method to whether you need\npossibility, frequency, or cause.</li>\n<li><strong>Comparative method vs. internal reconstruction.</strong> Multiple related languages →\ncompare cognates for regular correspondences. A single language → reconstruct\ninternally from its alternations alone.</li>\n<li><strong>Competence vs. performance data.</strong> Grammaticality judgments probe the system;\ncorpora and recordings show it under load. Judgments for the <em>possible</em>, usage\ndata for the <em>probable</em>.</li>\n<li><strong>Sapir-Whorf, handled carefully.</strong> Entertain that language shapes thought, but\ndemand the strong claim be tested behaviorally, never assumed. Color terms and\nspatial frames are defensible cases; &quot;Eskimos have N words for snow&quot; the\ncautionary tale.</li>\n</ul>\n","wordCount":149},{"heading":"Workflow","id":"workflow","markdown":"1. **Define the question and the language.** Phonology, syntax, change,\n   variation? The question dictates the method.\n2. **Gather data.** Record sessions, run elicitation with a paradigm in mind\n   (\"how do you say *I went*, *you went*, *they went*?\"), or pull from a corpus.\n3. **Transcribe.** Render speech in the IPA; for morphosyntax, produce\n   interlinear glosses per the Leipzig rules — text, morpheme gloss, translation.\n4. **Segment and identify.** Break the stream into units at the relevant level —\n   minimal pairs, morpheme boundaries, constituents.\n5. **Hypothesize a rule or constraint,** precise enough to be wrong.\n6. **Test against judgments and more data.** Hunt for counterexamples; check the\n   prediction with a new context or speaker.\n7. **Quantify if it varies,** modeling the conditioning factors (often in R).\n8. **Write it up with examples,** each claim anchored to glossed data; for\n   documentation, deposit the recordings in an archive.","html":"<h2 id=\"workflow\">Workflow</h2>\n<ol>\n<li><strong>Define the question and the language.</strong> Phonology, syntax, change,\nvariation? The question dictates the method.</li>\n<li><strong>Gather data.</strong> Record sessions, run elicitation with a paradigm in mind\n(&quot;how do you say <em>I went</em>, <em>you went</em>, <em>they went</em>?&quot;), or pull from a corpus.</li>\n<li><strong>Transcribe.</strong> Render speech in the IPA; for morphosyntax, produce\ninterlinear glosses per the Leipzig rules — text, morpheme gloss, translation.</li>\n<li><strong>Segment and identify.</strong> Break the stream into units at the relevant level —\nminimal pairs, morpheme boundaries, constituents.</li>\n<li><strong>Hypothesize a rule or constraint,</strong> precise enough to be wrong.</li>\n<li><strong>Test against judgments and more data.</strong> Hunt for counterexamples; check the\nprediction with a new context or speaker.</li>\n<li><strong>Quantify if it varies,</strong> modeling the conditioning factors (often in R).</li>\n<li><strong>Write it up with examples,</strong> each claim anchored to glossed data; for\ndocumentation, deposit the recordings in an archive.</li>\n</ol>\n","wordCount":143},{"heading":"Common Tradeoffs","id":"common-tradeoffs","markdown":"- **Elicitation control vs. naturalness.** Elicited data is clean and targeted\n  but artificial; spontaneous speech is natural but messy and may never contain\n  the form you need. Usually you need both.\n- **Descriptive coverage vs. theoretical elegance.** A theory that captures 95%\n  of cases beautifully may force you to ignore the awkward 5% — often where the\n  insight is.\n- **Breadth vs. depth.** A typological survey of 200 languages trades the depth a\n  single grammar gives for claims that generalize.\n- **The observer's paradox** (Labov). You want to record how people speak\n  unobserved, but the recording is the observation. Long interviews and emotional\n  topics try to lower the speaker's guard.\n- **Speed vs. the speakers.** Endangered-language work races a clock, but rushing\n  consultants or extracting data without reciprocity damages both.","html":"<h2 id=\"common-tradeoffs\">Common Tradeoffs</h2>\n<ul>\n<li><strong>Elicitation control vs. naturalness.</strong> Elicited data is clean and targeted\nbut artificial; spontaneous speech is natural but messy and may never contain\nthe form you need. Usually you need both.</li>\n<li><strong>Descriptive coverage vs. theoretical elegance.</strong> A theory that captures 95%\nof cases beautifully may force you to ignore the awkward 5% — often where the\ninsight is.</li>\n<li><strong>Breadth vs. depth.</strong> A typological survey of 200 languages trades the depth a\nsingle grammar gives for claims that generalize.</li>\n<li><strong>The observer&#39;s paradox</strong> (Labov). You want to record how people speak\nunobserved, but the recording is the observation. Long interviews and emotional\ntopics try to lower the speaker&#39;s guard.</li>\n<li><strong>Speed vs. the speakers.</strong> Endangered-language work races a clock, but rushing\nconsultants or extracting data without reciprocity damages both.</li>\n</ul>\n","wordCount":125},{"heading":"Rules of Thumb","id":"rules-of-thumb","markdown":"- If you can't find a minimal pair, the contrast may not be phonemic — keep\n  looking, but suspect allophony.\n- Gloss everything; an example without a gloss is an assertion, not evidence.\n- \"Sounds weird\" is not \"ungrammatical\" — separate semantics and pragmatics from\n  syntax.\n- The exception you want to dismiss is usually the most informative datum.\n- Ask the speaker to translate *into* the language, then back — discrepancies\n  reveal structure.\n- Never trust your own intuitions about a language you didn't grow up speaking.","html":"<h2 id=\"rules-of-thumb\">Rules of Thumb</h2>\n<ul>\n<li>If you can&#39;t find a minimal pair, the contrast may not be phonemic — keep\nlooking, but suspect allophony.</li>\n<li>Gloss everything; an example without a gloss is an assertion, not evidence.</li>\n<li>&quot;Sounds weird&quot; is not &quot;ungrammatical&quot; — separate semantics and pragmatics from\nsyntax.</li>\n<li>The exception you want to dismiss is usually the most informative datum.</li>\n<li>Ask the speaker to translate <em>into</em> the language, then back — discrepancies\nreveal structure.</li>\n<li>Never trust your own intuitions about a language you didn&#39;t grow up speaking.</li>\n</ul>\n","wordCount":79},{"heading":"Failure Modes","id":"failure-modes","markdown":"- **Anglocentrism.** Building a \"universal\" on European languages, then being\n  blindsided by a language with no adjectives, no tense, or free word order.\n- **Confusing the writing system with the language.** Spelling is a recent,\n  conservative artifact; the spoken system is the object.\n- **Over-reading Sapir-Whorf.** Leaping from a lexical difference to a sweeping\n  claim about how a people *thinks*.\n- **Cherry-picking judgments.** Reporting examples that fit and dropping those\n  that don't.\n- **The single-consultant trap.** Generalizing one person's idiosyncratic\n  idiolect to a whole language.","html":"<h2 id=\"failure-modes\">Failure Modes</h2>\n<ul>\n<li><strong>Anglocentrism.</strong> Building a &quot;universal&quot; on European languages, then being\nblindsided by a language with no adjectives, no tense, or free word order.</li>\n<li><strong>Confusing the writing system with the language.</strong> Spelling is a recent,\nconservative artifact; the spoken system is the object.</li>\n<li><strong>Over-reading Sapir-Whorf.</strong> Leaping from a lexical difference to a sweeping\nclaim about how a people <em>thinks</em>.</li>\n<li><strong>Cherry-picking judgments.</strong> Reporting examples that fit and dropping those\nthat don&#39;t.</li>\n<li><strong>The single-consultant trap.</strong> Generalizing one person&#39;s idiosyncratic\nidiolect to a whole language.</li>\n</ul>\n","wordCount":84},{"heading":"Anti-patterns","id":"anti-patterns","markdown":"- **Armchair data.** Inventing example sentences and judging them yourself, then\n  building a theory on them.\n- **Etymological fallacy in reverse.** Claiming a word \"really means\" its origin,\n  or that a sound change \"shouldn't\" have happened.\n- **Theory-first fieldwork.** Forcing a foreign language into your favorite\n  framework's categories instead of letting its system emerge.\n- **Notation theater.** Dense formalism that adds only the appearance of rigor,\n  hiding a thin empirical claim.","html":"<h2 id=\"anti-patterns\">Anti-patterns</h2>\n<ul>\n<li><strong>Armchair data.</strong> Inventing example sentences and judging them yourself, then\nbuilding a theory on them.</li>\n<li><strong>Etymological fallacy in reverse.</strong> Claiming a word &quot;really means&quot; its origin,\nor that a sound change &quot;shouldn&#39;t&quot; have happened.</li>\n<li><strong>Theory-first fieldwork.</strong> Forcing a foreign language into your favorite\nframework&#39;s categories instead of letting its system emerge.</li>\n<li><strong>Notation theater.</strong> Dense formalism that adds only the appearance of rigor,\nhiding a thin empirical claim.</li>\n</ul>\n","wordCount":68},{"heading":"Vocabulary","id":"vocabulary","markdown":"- **Phoneme** — the smallest sound unit that distinguishes meaning.\n- **Allophone** — a predictable variant of a phoneme that never changes meaning.\n- **Minimal pair** — two words differing in one sound, proving contrast.\n- **Morpheme** — the smallest unit carrying meaning or grammar.\n- **Langue / parole** — the shared language system vs. the individual speech act.\n- **Competence / performance** — what a speaker knows vs. produces.\n- **Implicature** — meaning conveyed beyond what is said (Grice).\n- **Grammaticalization** — the drift of content words into grammatical markers.\n- **The variable** — a choice point between equivalent forms, socially patterned.\n- **Complementary distribution** — forms that never share an environment, the\n  hallmark of allophones.\n- **Gloss** — the morpheme-by-morpheme translation under an example.","html":"<h2 id=\"vocabulary\">Vocabulary</h2>\n<ul>\n<li><strong>Phoneme</strong> — the smallest sound unit that distinguishes meaning.</li>\n<li><strong>Allophone</strong> — a predictable variant of a phoneme that never changes meaning.</li>\n<li><strong>Minimal pair</strong> — two words differing in one sound, proving contrast.</li>\n<li><strong>Morpheme</strong> — the smallest unit carrying meaning or grammar.</li>\n<li><strong>Langue / parole</strong> — the shared language system vs. the individual speech act.</li>\n<li><strong>Competence / performance</strong> — what a speaker knows vs. produces.</li>\n<li><strong>Implicature</strong> — meaning conveyed beyond what is said (Grice).</li>\n<li><strong>Grammaticalization</strong> — the drift of content words into grammatical markers.</li>\n<li><strong>The variable</strong> — a choice point between equivalent forms, socially patterned.</li>\n<li><strong>Complementary distribution</strong> — forms that never share an environment, the\nhallmark of allophones.</li>\n<li><strong>Gloss</strong> — the morpheme-by-morpheme translation under an example.</li>\n</ul>\n","wordCount":104},{"heading":"Tools","id":"tools","markdown":"- **The IPA** — a one-symbol-per-sound alphabet for transcribing any language\n  unambiguously.\n- **Praat** — acoustic analysis: spectrograms, formants, pitch tracks; the\n  ground truth of what was said.\n- **ELAN** — time-aligned annotation of audio and video, the workhorse of\n  documentation and discourse work.\n- **Corpora and concordancers** (COCA, BNC, treebanks) — for frequency,\n  collocation, and natural usage at scale.\n- **Elicitation kits** — paradigm sheets, picture tasks, and stimulus sets for\n  drawing out targeted forms.\n- **R and statistical models** — mixed-effects regression for variation, Rbrul\n  for sociolinguistic variables.\n- **Leipzig Glossing Rules** — the convention that makes examples legible across\n  the field.","html":"<h2 id=\"tools\">Tools</h2>\n<ul>\n<li><strong>The IPA</strong> — a one-symbol-per-sound alphabet for transcribing any language\nunambiguously.</li>\n<li><strong>Praat</strong> — acoustic analysis: spectrograms, formants, pitch tracks; the\nground truth of what was said.</li>\n<li><strong>ELAN</strong> — time-aligned annotation of audio and video, the workhorse of\ndocumentation and discourse work.</li>\n<li><strong>Corpora and concordancers</strong> (COCA, BNC, treebanks) — for frequency,\ncollocation, and natural usage at scale.</li>\n<li><strong>Elicitation kits</strong> — paradigm sheets, picture tasks, and stimulus sets for\ndrawing out targeted forms.</li>\n<li><strong>R and statistical models</strong> — mixed-effects regression for variation, Rbrul\nfor sociolinguistic variables.</li>\n<li><strong>Leipzig Glossing Rules</strong> — the convention that makes examples legible across\nthe field.</li>\n</ul>\n","wordCount":95},{"heading":"Collaboration","id":"collaboration","markdown":"Linguistics sits between the humanities, social science, and cognitive science,\nso collaborators vary by subfield. Documentary linguists work *with* language\ncommunities, not on them — consultants are co-authors and rights-holders, not\ndata sources. Computational linguists pair with software and prompt engineers\nbuilding language technology. Psycholinguists run studies with neuroscientists;\nhistorical linguists trade evidence with archaeologists and geneticists tracking\nmigration. The recurring tension is between formal theorists who prize elegant\nmodels and fieldworkers who prize messy coverage; the best work lets data and\ntheory discipline each other.","html":"<h2 id=\"collaboration\">Collaboration</h2>\n<p>Linguistics sits between the humanities, social science, and cognitive science,\nso collaborators vary by subfield. Documentary linguists work <em>with</em> language\ncommunities, not on them — consultants are co-authors and rights-holders, not\ndata sources. Computational linguists pair with software and prompt engineers\nbuilding language technology. Psycholinguists run studies with neuroscientists;\nhistorical linguists trade evidence with archaeologists and geneticists tracking\nmigration. The recurring tension is between formal theorists who prize elegant\nmodels and fieldworkers who prize messy coverage; the best work lets data and\ntheory discipline each other.</p>\n","wordCount":87},{"heading":"Ethics","id":"ethics","markdown":"Language is identity, and studying it carries obligations. The first is to the\ncommunities whose languages we document: informed consent, fair credit, and\nreturning materials in usable form — not extracting a dissertation and vanishing.\nEndangered-language work must serve the speakers' own goals at least as much as\nthe researcher's. The second is to the public: no dialect is broken, \"bad\ngrammar\" is usually a class or race judgment in disguise, and we have a duty to\nsay so against prescriptive shaming. Forensic and policy work — voice\nidentification, asylum language analysis, official-language legislation — can\ndecide whether someone is believed or deported, so the method's limits must be\nstated as plainly as its findings.","html":"<h2 id=\"ethics\">Ethics</h2>\n<p>Language is identity, and studying it carries obligations. The first is to the\ncommunities whose languages we document: informed consent, fair credit, and\nreturning materials in usable form — not extracting a dissertation and vanishing.\nEndangered-language work must serve the speakers&#39; own goals at least as much as\nthe researcher&#39;s. The second is to the public: no dialect is broken, &quot;bad\ngrammar&quot; is usually a class or race judgment in disguise, and we have a duty to\nsay so against prescriptive shaming. Forensic and policy work — voice\nidentification, asylum language analysis, official-language legislation — can\ndecide whether someone is believed or deported, so the method&#39;s limits must be\nstated as plainly as its findings.</p>\n","wordCount":114},{"heading":"Scenarios","id":"scenarios","markdown":"**An unfamiliar sound in the field.** Working on an undocumented language, the\nlinguist hears what might be two vowels where English has one. Instead of\nguessing, they build a paradigm, elicit words that should differ only in that\nvowel, and hunt for a minimal pair. They find *tási* \"rope\" vs. *tàsi* \"river\" — same\nsegments, different pitch, different meaning. That settles it: the language is\ntonal, and tone is phonemic. After confirming distinct pitch contours in Praat,\nthey revise the phoneme inventory and re-transcribe earlier sessions — a contrast\nthey'd ignored had been changing meanings all along.\n\n**A \"grammar error\" that isn't.** A school district asks whether a community's\nchildren \"can't speak correctly\" because they say *she don't* and drop\ncopulas — *he tired*. The linguist shows these are not mistakes but a\nrule-governed grammar (here, features of African American English): copula\ndeletion happens in exactly the environments where standard English allows\ncontraction, and nowhere else. The right framing is not remediation but\nbidialectalism — teaching the standard as an added register without pathologizing\nthe home variety. The data, not the prejudice, drives the call.\n\n**Reconstructing a parent language.** Three related languages show a pattern: one\nhas *p* where the others have *f* and *h* in the same cognates — *pata*, *fata*,\n*hata* \"foot.\" The correspondence holds across dozens of cognates, so the linguist\nreconstructs a proto-form with *\\*p* and posits two regular sound changes in the\ndaughter branches. The apparent exceptions turn out to be borrowings, identifiable\nprecisely because they *don't* obey the law — regularity is the tool that exposes\nits own exceptions.","html":"<h2 id=\"scenarios\">Scenarios</h2>\n<p><strong>An unfamiliar sound in the field.</strong> Working on an undocumented language, the\nlinguist hears what might be two vowels where English has one. Instead of\nguessing, they build a paradigm, elicit words that should differ only in that\nvowel, and hunt for a minimal pair. They find <em>tási</em> &quot;rope&quot; vs. <em>tàsi</em> &quot;river&quot; — same\nsegments, different pitch, different meaning. That settles it: the language is\ntonal, and tone is phonemic. After confirming distinct pitch contours in Praat,\nthey revise the phoneme inventory and re-transcribe earlier sessions — a contrast\nthey&#39;d ignored had been changing meanings all along.</p>\n<p><strong>A &quot;grammar error&quot; that isn&#39;t.</strong> A school district asks whether a community&#39;s\nchildren &quot;can&#39;t speak correctly&quot; because they say <em>she don&#39;t</em> and drop\ncopulas — <em>he tired</em>. The linguist shows these are not mistakes but a\nrule-governed grammar (here, features of African American English): copula\ndeletion happens in exactly the environments where standard English allows\ncontraction, and nowhere else. The right framing is not remediation but\nbidialectalism — teaching the standard as an added register without pathologizing\nthe home variety. The data, not the prejudice, drives the call.</p>\n<p><strong>Reconstructing a parent language.</strong> Three related languages show a pattern: one\nhas <em>p</em> where the others have <em>f</em> and <em>h</em> in the same cognates — <em>pata</em>, <em>fata</em>,\n<em>hata</em> &quot;foot.&quot; The correspondence holds across dozens of cognates, so the linguist\nreconstructs a proto-form with <em>*p</em> and posits two regular sound changes in the\ndaughter branches. The apparent exceptions turn out to be borrowings, identifiable\nprecisely because they <em>don&#39;t</em> obey the law — regularity is the tool that exposes\nits own exceptions.</p>\n","wordCount":264},{"heading":"Related Occupations","id":"related-occupations","markdown":"The linguist shares the descriptive, fieldwork-driven stance of the\nanthropologist — both observe a human system on its own terms rather than judging\nit — and the conceptual rigor of the philosopher, especially in semantics and the\nphilosophy of language. The speech-language pathologist applies the same sound\nand structure analysis clinically, to repair language rather than describe it. The\nwriter manipulates the system from the inside. The prompt engineer and\ncomputational linguist build machines that model language statistically, where\nquestions about meaning and ambiguity resurface as engineering problems. The\nneuroscientist asks where in the brain this machinery lives.","html":"<h2 id=\"related-occupations\">Related Occupations</h2>\n<p>The linguist shares the descriptive, fieldwork-driven stance of the\nanthropologist — both observe a human system on its own terms rather than judging\nit — and the conceptual rigor of the philosopher, especially in semantics and the\nphilosophy of language. The speech-language pathologist applies the same sound\nand structure analysis clinically, to repair language rather than describe it. The\nwriter manipulates the system from the inside. The prompt engineer and\ncomputational linguist build machines that model language statistically, where\nquestions about meaning and ambiguity resurface as engineering problems. The\nneuroscientist asks where in the brain this machinery lives.</p>\n","wordCount":98},{"heading":"References","id":"references","markdown":"- *Course in General Linguistics* — Ferdinand de Saussure\n- *Syntactic Structures* — Noam Chomsky\n- *Language* — Leonard Bloomfield\n- *Sociolinguistic Patterns* — William Labov\n- *Studies in the Way of Words* — H. P. Grice\n- The Leipzig Glossing Rules — Max Planck Institute / DFG","html":"<h2 id=\"references\">References</h2>\n<ul>\n<li><em>Course in General Linguistics</em> — Ferdinand de Saussure</li>\n<li><em>Syntactic Structures</em> — Noam Chomsky</li>\n<li><em>Language</em> — Leonard Bloomfield</li>\n<li><em>Sociolinguistic Patterns</em> — William Labov</li>\n<li><em>Studies in the Way of Words</em> — H. P. Grice</li>\n<li>The Leipzig Glossing Rules — Max Planck Institute / DFG</li>\n</ul>\n","wordCount":35}],"computed":{"wordCount":2176,"readingTimeMinutes":10,"completeness":1,"backlinks":["anthropologist","court-reporter","editor","interpreter","philosopher","poet"],"verified":false,"aiDrafted":true,"unverifiedAiDraft":true},"git":{"created":"2026-06-26","updated":"2026-06-27","revisions":6,"authors":[{"name":"soul-atlas","commits":6}],"timeline":[{"date":"2026-06-26","author":"soul-atlas"},{"date":"2026-06-27","author":"soul-atlas"},{"date":"2026-06-27","author":"soul-atlas"},{"date":"2026-06-27","author":"soul-atlas"},{"date":"2026-06-27","author":"soul-atlas"},{"date":"2026-06-27","author":"soul-atlas"}]},"citation":{"apa":"soul-atlas (2026). Linguist [SOUL]. SOUL Atlas. https://soul-atlas.github.io/occupations/linguist","bibtex":"@misc{soulatlas-linguist,\n  title        = {Linguist},\n  author       = {soul-atlas},\n  year         = {2026},\n  howpublished = {SOUL Atlas},\n  note         = {SOUL.md, version 2026-06-27},\n  url          = {https://soul-atlas.github.io/occupations/linguist}\n}","text":"soul-atlas. \"Linguist.\" SOUL Atlas, 2026. https://soul-atlas.github.io/occupations/linguist."}}