The Four Whispers
What SEC Filing Language Reveals Before Analysts Notice
Every public company in the US files a 10-K once a year and a 10-Q three times a year. The documents are drafted by disclosure counsel, reviewed by the audit committee, and signed by the CEO and CFO. They are the most carefully worded text any company produces. That fastidiousness is exactly what makes them useful. When a risk factor that read the same way for four years is rewritten in the fifth year, when a term that appeared in every quarterly filing in a sector quietly vanishes from one company's filing, when an MD&A section substantially rewrites the same story it told the previous year, these are not accidents. Something changed in what management felt it had to say, and the record carries the change in a form a reader can measure.
The Language Analysis module measures those changes at the sector level. It is not a predictive model, and it does not claim to know what divergence means. It claims only that divergence exists and that it can be located precisely enough for a human analyst to read the filing and decide what to do about it. The distinction is the whole point. A detection system that surfaces the filings most worth reading, without pretending to classify them, produces a genuinely useful surface for pre-analyst-downgrade research. A classifier that labels filings as bullish or bearish from the sentiment scores alone is claiming to know the future from a piece of audited regulatory text, and the claim does not hold.
The sections below explain why divergence is worth measuring, what a concrete example looks like in the managed-care sector, and how the Language Analysis output sits alongside the other research modules on this site. None of it is mysterious. All of it reduces to arithmetic on tokenized filing text against sector-specific lexicons, with enough statistical care to distinguish real anomalies from chance.
Why Filing Language Moves Before Analysts Do
The standard equity-research information flow runs in a predictable order. A company reports earnings. The CFO gives forward guidance on the call. The analyst community revises models within a week. Consensus numbers shift. The stock price adjusts. Coverage notes are published over the following two to four weeks. Price targets move. Ratings, if they move, typically lag the price change rather than leading it. This cadence is well documented in the sell-side literature and it is not controversial.
What is less widely discussed is that the audited 10-K and 10-Q filings usually arrive with or slightly after the earnings release, and their narrative sections (Risk Factors in Item 1A; MD&A in Item 7 for annuals and Item 2 for quarterlies; Financial Statements commentary in Item 8) contain detail that does not make it into the earnings call transcript or the press release. The narrative sections carry the work product of the disclosure counsel and audit committee: the risk language that the company's own lawyers judged necessary to include as a condition of signing the filing. Management cannot soften a Risk Factors disclosure by picking careful words on the call if the underlying risk has crystallized enough that counsel requires it on the record. The filing language, therefore, tends to be less optimistic and more precise than the call commentary, because the liability attaches to the filing rather than to the call.
The interesting claim in practice is narrower than "filing language predicts price." It is that changes in filing language (the appearance of a new risk paragraph, the disappearance of a term that had been consistent across prior filings, a material rewrite of a section that had stabilized) can precede analyst recognition of the underlying change by one to two quarters. Analysts do not ignore 10-K text. They are simply processing a large number of filings against tight deadlines and they tend to focus on numerical disclosures and guidance updates rather than on careful diffs of the narrative sections. A reader with a quantitative instrument that measures narrative divergence across a peer group simultaneously has a small but real time advantage, and the advantage is largest at the moments when language is shifting most.
The Four Whispers, Concretely
The module decomposes language divergence into four measurable signals. Each one asks a different question of the same text, and any one of them in isolation is noisier than the four of them in agreement. The data page renders all four; the timeline compresses them into a per-filing signal-strength tier so that a reader can scan a multi-year sector history and locate the filings worth reading in full.
The first whisper is term-frequency deviation. Every sector has a curated lexicon: single words, bigrams, and trigrams that matter specifically for how companies in that sector describe their economics and their risks. For each term, for each filing, the pipeline asks two questions: how does the target company's usage rate compare to its peers filing in the same period, and how does it compare to the company's own history. Both questions produce a z-score, and extreme values at either tail are flagged. A managed-care company that uses a term about rate adequacy at three standard deviations below the peer mean is telling a reader something about its current posture toward that topic, and the signal is measurable.
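The two z-scores can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the module's actual code: the function names are hypothetical, the usage rate is taken as occurrences per 1,000 tokens, the peer sample is assumed to exclude the target company, and the 2.0 cutoff (which the managed-care configuration sets for the longitudinal channel) is reused for the cross-sectional channel purely for illustration.

```python
from statistics import mean, stdev

def usage_rate(term_count, total_tokens):
    """Occurrences of a lexicon term per 1,000 tokens of filing text."""
    return 1000.0 * term_count / total_tokens

def z_score(value, reference):
    """z-score of `value` against a reference sample, or None if undefined."""
    if len(reference) < 2:
        return None
    spread = stdev(reference)  # sample standard deviation
    if spread == 0:
        return None
    return (value - mean(reference)) / spread

def flag_term(target_rate, peer_rates, own_history, threshold=2.0, min_history=4):
    """Score one term in one filing against peers and against the company's own past.

    `threshold` and `min_history` are assumed defaults, not confirmed settings.
    """
    cross = z_score(target_rate, peer_rates)
    # The longitudinal channel only fires with enough prior observations.
    longi = z_score(target_rate, own_history) if len(own_history) >= min_history else None
    flagged = any(z is not None and abs(z) >= threshold for z in (cross, longi))
    return {"cross_sectional_z": cross, "longitudinal_z": longi, "flagged": flagged}

# Made-up rates for illustration: eight peers, four prior filings, one low reading.
peer_rates = [2.0, 2.2, 1.8, 2.1, 1.9, 2.0, 2.3, 1.7]
own_history = [2.1, 2.0, 1.9, 2.0]
result = flag_term(usage_rate(5, 10_000), peer_rates, own_history)
```

Both tails are flagged because the absolute value of the z-score is compared to the threshold; a term used far less than peers is as interesting as one used far more.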
The second whisper is Loughran-McDonald sentiment. The LM dictionary is the standard finance-specific tonal lexicon, with categories for negative, positive, uncertain, litigious, and constraining language plus a separate complexity score. The module applies LM scoring section by section, because the Risk Factors, MD&A, and Financials sections each have different baseline tones and averaging them together produces a meaningless composite. The observation that matters is usually the drift of the sector-mean sentiment across years, not the absolute level in any one year. A sector whose Risk Factors section is collectively trending more negative over three consecutive annual filings is registering a joint change in posture across the peer group, and that trend is visible on the sentiment chart before it is priced.
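A rough sketch of section-level scoring and the sector-mean drift follows. The word sets below are a tiny illustrative stand-in for the real Loughran-McDonald lists, which are far larger; the function names, the record layout, and the choice to score each category as a share of section tokens are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict
from statistics import mean

# Toy stand-in for two LM categories; NOT the actual dictionary.
TOY_LEXICON = {
    "negative": {"loss", "adverse", "decline", "impairment"},
    "uncertain": {"may", "could", "uncertain", "approximately"},
}

def tokenize(text):
    """Lowercase alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def section_scores(text, lexicon=TOY_LEXICON):
    """Share of a section's tokens that fall in each tonal category."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    return {
        category: (sum(counts[w] for w in words) / total if total else 0.0)
        for category, words in lexicon.items()
    }

def sector_mean_by_year(rows, section, category):
    """rows: iterable of (ticker, year, section_name, scores) tuples.

    Returns {year: mean score across peers} for one section type,
    so sections with different baseline tones are never averaged together.
    """
    by_year = defaultdict(list)
    for _ticker, year, section_name, scores in rows:
        if section_name == section:
            by_year[year].append(scores[category])
    return {year: mean(values) for year, values in sorted(by_year.items())}

scores = section_scores("Costs may rise and a decline could cause loss")
```

Keeping the section name in every record is the design choice that matters here: the drift is only meaningful when each year's mean is computed within one section type.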
The third whisper is cosine-similarity drift. Each filing section is represented as a term-frequency vector over the sector lexicon, and the cosine similarity between the current-year section and the prior-year section is computed. A similarity near 1.0 means the section reads as an update to the previous year's template with small numerical changes; a similarity below the configured threshold (default 0.80 in managed care) means the section has been substantively rewritten. A full rewrite of a Risk Factors section is expensive in legal review time, and companies do not perform them lightly. When they do, the reason is usually worth reading.
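The comparison reduces to a standard cosine between two count vectors. The sketch below assumes single-word lexicon terms for brevity (the real lexicon also contains bigrams and trigrams), and the function names and sample terms are hypothetical; only the 0.80 threshold comes from the managed-care configuration described here.

```python
import math
from collections import Counter

def tf_vector(tokens, lexicon):
    """Term-frequency vector of a section over an ordered list of lexicon terms."""
    counts = Counter(tokens)
    return [counts[term] for term in lexicon]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors, or None if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return None
    return dot / (norm_u * norm_v)

def is_rewritten(similarity, threshold=0.80):
    """True when year-over-year similarity falls below the rewrite threshold."""
    return similarity is not None and similarity < threshold

# Hypothetical three-term lexicon and two toy sections.
lexicon = ["medicaid", "redetermination", "utilization"]
prior = tf_vector("medicaid medicaid utilization".split(), lexicon)
current = tf_vector("redetermination redetermination utilization".split(), lexicon)
similarity = cosine(prior, current)
```

Because cosine similarity ignores vector length, a section that simply grows while keeping the same term mix still scores near 1.0; only a change in the mix of lexicon terms pulls the score down.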
The fourth whisper is conspicuous silence. The pipeline tracks every lexicon term across every filing in the peer group over time, and flags two specific patterns. The first is term disappearance: a term that appeared consistently in a company's prior four or more filings but is absent from the current filing. The second is stability against peer motion: a term whose peer-group mean frequency has shifted materially while the target company has kept its language fixed. Both patterns carry information. The disappeared term is the sharpest anomaly in the dataset, because management had previously judged the topic worth discussing and has now chosen to stop.
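Both silence patterns can be expressed as simple checks on a term's history. The sketch below is one possible reading, not the pipeline's code: "appeared consistently" is interpreted as a nonzero count in each of the four most recent prior filings, and the two cutoffs used for "shifted materially" and "kept its language fixed" (`peer_shift`, `own_shift`) are hypothetical values chosen only to make the example concrete.

```python
def term_disappeared(counts, min_prior=4):
    """counts: chronological term counts for one company; the last entry is the current filing.

    Flags a term that was present in each of the `min_prior` most recent
    prior filings and is absent from the current one.
    """
    if len(counts) < min_prior + 1:
        return False
    *prior, current = counts
    return current == 0 and all(c > 0 for c in prior[-min_prior:])

def _relative_change(previous, now):
    """Absolute relative change, or None when the starting value is zero."""
    if previous == 0:
        return None
    return abs(now - previous) / previous

def stable_against_peers(target_prev, target_now, peer_mean_prev, peer_mean_now,
                         peer_shift=0.50, own_shift=0.10):
    """Flags a company whose usage rate held still while the peer mean moved.

    `peer_shift` and `own_shift` are illustrative thresholds, not configured values.
    """
    peers = _relative_change(peer_mean_prev, peer_mean_now)
    own = _relative_change(target_prev, target_now)
    if peers is None or own is None:
        return False
    return peers >= peer_shift and own <= own_shift

# A term used in four straight filings, then dropped.
dropped = term_disappeared([3, 2, 4, 3, 0])
```

Requiring an unbroken run before the absence is what separates a deliberate omission from a term the company only ever used sporadically.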
A Worked Example: The Managed Care Sector
Managed care is the most-calibrated sector currently enabled on this site. The peer group contains nine tickers: Molina Healthcare (MOH), UnitedHealth Group (UNH), Elevance Health (ELV), Centene (CNC), Humana (HUM), Alignment Healthcare (ALHC), Oscar Health (OSCR), Clover Health (CLOV), and Cigna Group (CI). CI is included with an explicit caveat in the peer-yaml notes: its revenue is dominated by the Evernorth pharmacy-benefit business, and its filing language is heavily weighted toward PBM terminology rather than MCO-core language. It stays in the corpus so that readers see the full sector surface, but the PBM weighting is flagged wherever it affects a signal.
The sector-configuration file (sectors/managed-care/config.yaml) sets the thresholds that govern flag generation. The minimum peer count for cross-sectional scoring is eight, meaning a term-year observation is only scored against the peer distribution when at least eight of the nine peers have a filing in that year above the 500-word extraction floor. The longitudinal z-score threshold is 2.0, with a minimum of four prior observations required before the longitudinal channel fires. The cosine-similarity drop threshold for structural rewriting is 0.80. The signal-strength tier cutoffs on the filing timeline are: up to five flags is quiet, up to thirty flags is notable, up to eighty flags is elevated, and higher than that is loud. A filing with a cosine drop below 0.75 or a section-length change above fifty percent reaches the loud tier regardless of raw flag count.
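The thresholds above translate directly into a tier function. In the sketch below the numeric values are the ones just listed, but the dictionary keys and function names are hypothetical and do not reflect the actual layout of config.yaml. Two readings are assumed: "cosine drop below 0.75" is taken to mean a section's year-over-year cosine similarity falling below 0.75, and the section-length change is treated as a fraction whose absolute value is compared to 0.50.

```python
# Values from the managed-care description; key names are illustrative only.
MANAGED_CARE = {
    "min_peer_count": 8,
    "min_section_words": 500,
    "longitudinal_z": 2.0,
    "min_prior_observations": 4,
    "cosine_rewrite": 0.80,
    "tier_cutoffs": [(5, "quiet"), (30, "notable"), (80, "elevated")],
    "loud_cosine": 0.75,
    "loud_length_change": 0.50,
}

def cross_sectional_eligible(peer_word_counts, cfg=MANAGED_CARE):
    """True when enough peers have a filing above the extraction floor for that year."""
    usable = sum(1 for words in peer_word_counts if words > cfg["min_section_words"])
    return usable >= cfg["min_peer_count"]

def signal_tier(flag_count, min_cosine=None, length_change=None, cfg=MANAGED_CARE):
    """Map a filing's flag count (plus structural overrides) to a timeline tier."""
    # Structural overrides reach the loud tier regardless of raw flag count.
    if min_cosine is not None and min_cosine < cfg["loud_cosine"]:
        return "loud"
    if length_change is not None and abs(length_change) > cfg["loud_length_change"]:
        return "loud"
    for upper_bound, tier in cfg["tier_cutoffs"]:
        if flag_count <= upper_bound:
            return tier
    return "loud"

tiers = [signal_tier(n) for n in (5, 6, 30, 31, 80, 81)]
```

Checking the overrides first keeps the rule "regardless of raw flag count" explicit: a filing with three flags and a heavily rewritten section still lands in the loud tier.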
What the reader sees on the Language Analysis data page for managed care is the full output of this pipeline. The filing timeline plots every 10-K and 10-Q in the sector across years, with dot size proportional to flag count and color reflecting signal-strength tier. The term-frequency heatmap shows the most divergent term-ticker-year cells in either cross-sectional or longitudinal mode: terms one company emphasizes far more or far less than its peers in the same year, or terms the company itself has shifted materially against its own baseline. The sentiment chart tracks the sector-mean LM scores by year, filtered to one section type at a time. The conspicuous-silence list shows the most recent term-disappearance flags and stability flags, sortable by ticker and date.
The practical reader flow is: scan the timeline for filings that reach the notable or elevated tier, pull up the specific filing, and use the heatmap and silence list to locate the specific terms or sections driving the flag. The filing itself is a click away on EDGAR. The whole point of the pipeline is to shorten the distance between "which filing in this sector is worth reading carefully" and the reading itself. The classification step (what the divergence means for the business, for the valuation, for the thesis) is left where it belongs: with the reader, who has the filing open.
Why a Reader Checks This Surface
The use case is narrow and worth being explicit about. A reader is building or monitoring a thesis on a company in a sector where the peer group is large enough to support cross-sectional scoring (currently managed care, cable-broadband, athletic retail, or the Chinese ADR cohort). The thesis is fundamental: it rests on some combination of unit economics, management quality, balance sheet strength, and valuation. Language Analysis does not replace any of that work. What it adds is a routine check against the filings themselves, in a form that scales.
A reader who is long a position uses the surface to monitor for early signs that the fundamental story is changing faster than the reported numbers. A rising flag count, a cosine drop in the Risk Factors section, or a term disappearance on a topic that matters for the thesis is a cue to read the filing against the prior year's filing with specific attention. A reader who is building a new thesis uses the surface to check whether the candidate's language is in line with its peers or whether it sits at an extreme that deserves explanation. A reader who is short, or considering a short, uses the surface for the mirror image: rising negative sentiment, a newly appeared risk-factor paragraph, or a deviation from peer language in a direction that matches the short thesis.
What the surface is not: a standalone trading system, a crystal ball, or a substitute for reading filings. Every flagged filing is an invitation to read the filing. The pipeline is an instrument that narrows a multi-year sector history down to the small number of filings where the narrative shifted materially. It does not tell the reader what the shift means. That is interpretation, and interpretation is what the reader brings.
Position in the Research Stack
The research architecture on this site runs in layers. Each layer operates at a different cadence, asks a different question, and produces a different kind of output. The layers are complementary and not redundant.
Language Analysis operates at the filing cadence: quarterly, with the richest signal concentrated in the annual 10-K Risk Factors section. It is a qualitative early-warning layer. Its output is a list of filings where the language has shifted materially against peers or against the company's own prior filings, with enough specificity that a reader can locate the exact sections and terms driving the flag. It answers the question: where should I read more carefully. It does not answer the question of what the language change means.
The Signal Sweep module operates at the daily price cadence. It runs sixty-one technical indicators across a curated peer group, measures hit rates at four forward-return horizons, aggregates across tickers to identify indicator-direction combinations that produce consistent beta-adjusted expected value, and applies the Deflated Sharpe Ratio correction for multiple testing. The output is a small number of surviving indicator-direction-horizon combinations that have cleared the full filtering chain. It answers a different question: given a thesis I already hold, is this an advantageous entry point. The sweep does not build or validate theses; it provides timing confirmation for entries on positions the reader has already decided to take.
The valuation work (reverse DCF and the value-scoring scorecard) operates at the multi-year horizon. It prices the thesis itself: what return does the stock offer at the current price under a defensible earnings scenario, and what does the quality and safety scorecard (Piotroski F-score, Altman Z-score, Graham number, Beneish M-score, Greenwald EPV, and the rest) report about the company's financial posture. Valuation is the anchor that makes a position decision defensible. Without it, the other layers are directionless.
The Real Estate vs Index paper is a separate module addressing a different asset-class question: whether retail 1-4 unit owner-operated rental property beats passive S&P 500 exposure over thirty years after pricing the landlord's imputed labor. It is not part of the equity research cadence. It operates at the asset-allocation level and informs the question of conviction in equities as an asset class relative to the popular retail alternative.
The ordering among the equity layers matters. The right sequence is to establish a thesis fundamentally, price it through the valuation layer, monitor it through Language Analysis as filings arrive, and time entries or accumulation using the Signal Sweep. A surviving sweep signal on a stock with no thesis is not an actionable finding. A language flag on a stock with no position is an invitation to do the fundamental work, not a reason to act. The layers combine into conviction; none of them produces conviction alone.
What the Surface Is Honest About
The claims on this page and on the data page are scoped narrowly for a reason. The system detects measurable divergence. It does not predict price. It does not classify filings as bullish or bearish from sentiment alone. It does not claim that every flagged filing contains a meaningful signal. Some flags are boilerplate updates required by SEC guidance that every filer in the sector had to incorporate in the same quarter; in those cases the divergence is measurable, but the correct interpretation is that nothing happened. The surface is useful as a reading-prioritization tool and it is used on this site exactly that way.
The system is also honest about where it has not done the work. A formal study linking flagged filings to downstream price performance is deferred. Calibration has been focused on managed care; the other three enabled sectors carry their own threshold configurations and have not been tuned with the same intensity. The lexicon for each sector is curated and the curation is the bottleneck for signal quality. The system cannot distinguish between genuine concern and boilerplate updates in the general case, and it does not pretend to.
What the system gets right is the decomposition. Filing-language divergence is not one signal; it is four, and the four are independent views of the same underlying question. A filing that registers anomalies on three of four whispers is a stronger candidate for reading than one that registers on only one. Coincident term-frequency, sentiment, and cosine-drift anomalies on the same filing are unusual in the ordinary flow and, in the calibration window for managed care, tend to mark the filings that a careful reader would want to spend time with. The surface is what it looks like. A reader who uses it as a reading prioritizer will get value from it. A reader who expects it to replace the reading will not.