GOBLIN HOUSE
Claim investigated: The SentinelOne SEC filing dataset exhibits three distinct indicators of third-party aggregation failure: universal absence of mandatory accession numbers, regulatory-impossible future dating, and exact duplicate entries suggesting database replication errors.
Entity: SentinelOne
Original confidence: inferential
Result: STRENGTHENED → SECONDARY
The strongest case for this inference is that the prior, rejected SEC filing dataset was explicitly flagged for containing universal missing accession numbers, future-dated entries, and duplicate filings on identical dates. These are indeed textbook signs of third-party data aggregation pipeline failure rather than legitimate SEC submissions. The strongest case against elevating this inference is that these indicators could also be explained by intentional data collection strategies (e.g., scraping incomplete fields from EDGAR's RSS feed rather than full XML filings) without implying broader data quality issues. The claim is well-supported as an explanation for the observed data problems but requires direct verification against primary EDGAR records.
Reasoning: The inference is strengthened because the three indicators described align precisely with what the prior fact-checking process found and rejected. Multiple filings on the same dates (2025-03-26 and 2026-03-19) could be legitimate (e.g., a 10-K and a concurrent 8-K), but the universal absence of accession numbers is a definitive data quality failure in any properly constructed SEC filing dataset. Future dating (a 2026-03-19 filing in a dataset described as covering 2023-2026) is consistent with either a timestamp error or forward-scheduled filings, but in practice a 10-K for a fiscal year ending January 31 would be due in late March, and no filing could validly be dated more than a few days in the future. The exact duplicates further suggest database replication errors. However, this inference has not yet been directly verified against EDGAR's primary records (SEC.gov, CIK 0001583708) for the actual dates and accession numbers, so it remains at secondary confidence: well supported by the chain of reasoning but still requiring primary source confirmation.
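The three indicators discussed above can be checked mechanically. The following is a minimal sketch, not the method used in the original fact-check: the record layout (`form`, `filing_date`, `accession_number`), the function name `audit_filings`, and the retrieval date are all assumptions made for illustration. The only external fact relied on is the standard EDGAR accession number layout of 10 digits, 2 digits, and 6 digits separated by hyphens.

```python
import re
from collections import Counter
from datetime import date

# EDGAR accession numbers follow the pattern NNNNNNNNNN-YY-NNNNNN.
ACCESSION_RE = re.compile(r"^\d{10}-\d{2}-\d{6}$")


def audit_filings(records, retrieved_on):
    """Flag the three indicators in a list of filing records.

    `records` is a list of dicts with the hypothetical keys
    'form', 'filing_date' (ISO string) and 'accession_number'.
    `retrieved_on` is the date the dataset was pulled (an assumption
    the caller must supply).
    """
    # Indicator 1: accession number missing or malformed.
    missing = [
        r for r in records
        if not ACCESSION_RE.match(r.get("accession_number") or "")
    ]
    # Indicator 2: filing dated after the dataset was retrieved.
    future = [
        r for r in records
        if date.fromisoformat(r["filing_date"]) > retrieved_on
    ]
    # Indicator 3: rows identical in every field (exact duplicates).
    row_counts = Counter(tuple(sorted(r.items())) for r in records)
    duplicates = [dict(row) for row, n in row_counts.items() if n > 1]
    return {
        "missing_accession": missing,
        "future_dated": future,
        "exact_duplicates": duplicates,
        "all_missing_accession": bool(records) and len(missing) == len(records),
    }
```

Note that the duplicate check only flags rows that match in every field. Two different forms filed on the same date, such as a 10-K and a concurrent 8-K, are not counted as duplicates, which mirrors the distinction drawn in the reasoning above.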
SEC EDGAR: CIK 0001583708, all filings from 2023-01-01 to 2026-12-31, filter by form type (10-K, 10-Q, 8-K) — directly extract accession numbers, filing dates, and counts per date
This is the definitive primary source. Confirming whether EDGAR actually contains multiple filings on 2025-03-26 and 2026-03-19, whether accession numbers exist, and whether any filing is dated after the retrieval date would either confirm the aggregation failure or prove the third-party dataset is simply corrupt.
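As a rough illustration of that extraction step, the sketch below tallies filings per date from an EDGAR submissions document. It assumes the JSON served at `https://data.sec.gov/submissions/CIK0001583708.json`, in which `filings.recent` holds parallel arrays named `accessionNumber`, `filingDate`, and `form`. The function name `filings_per_date` and its default arguments are hypothetical, and the fetch itself is left out so the example runs offline. SEC asks automated clients to send a descriptive `User-Agent` header, and `filings.recent` may not contain a filer's full history, so older filings may need the additional files the document lists.

```python
from collections import Counter


def filings_per_date(submissions, forms=("10-K", "10-Q", "8-K"),
                     start="2023-01-01", end="2026-12-31"):
    """Return (rows, per_date) for the selected forms and date window.

    `submissions` is the parsed JSON dict; dates are ISO strings, so
    plain string comparison orders them correctly.
    """
    recent = submissions["filings"]["recent"]
    rows = [
        (filing_date, form, accession)
        for filing_date, form, accession in zip(
            recent["filingDate"], recent["form"], recent["accessionNumber"]
        )
        if form in forms and start <= filing_date <= end
    ]
    # Count how many matching filings share each filing date.
    per_date = Counter(filing_date for filing_date, _, _ in rows)
    return rows, per_date
```

Comparing `per_date["2025-03-26"]` and `per_date["2026-03-19"]` from the primary record with the counts in the third-party dataset would show whether the same-date entries are genuine concurrent filings or replication artifacts.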
SEC EDGAR: All SEC filings with future dates (after dataset retrieval dates) for any company — pattern analysis to determine if this is a known EDGAR time-stamping issue or unique to this dataset
If future-dated filings appear systematically in EDGAR's raw index data (e.g., due to EDGAR pre-release scheduling), then the 'future dating' indicator is less anomalous. If they appear only in this specific dataset, it strongly suggests a data manipulation or aggregation error.
USASpending.gov: SentinelOne (exact entity name and DUNS/UEI number) — all years, all contract types, including subcontracts
The established fact of 'no federal contracts found' needs to be re-verified. If SentinelOne uses an alternate legal name, a subsidiary, or contracts through a reseller, the absence could be a search miss. Confirmed absence would strengthen the inference about the company's strategic focus.
CRITICAL — This finding directly addresses why prior investigative work on SentinelOne's SEC filings had to be rejected, and provides a methodological explanation for data reliability failures that could affect other SEC datasets used in open-source investigations. It matters to the public record because the quality and provenance of SEC data determines whether regulatory oversight, investor protection, and due diligence work built on that data is sound. A systematic aggregation failure at this scale undermines confidence in third-party financial data used by journalists, researchers, and regulators.