GOBLIN HOUSE
[ Enter Database → ]
Claim investigated: The SentinelOne SEC filing dataset shows systematic data corruption including 100% missing accession numbers and future-dated entries, indicating the source database requires verification against official SEC EDGAR records before any regulatory compliance conclusions can be drawn Entity: SentinelOne Original confidence: inferential Result: STRENGTHENED → SECONDARY
The core claim—that the SentinelOne SEC filing dataset contains 100% missing accession numbers and future-dated entries—is consistent with the rejection note in the established facts, which confirms that prior inferences were built on fabricated evidence including future-dated entries (2027) and missing accession numbers. The strongest case for the claim is that it is corroborated by a separate internal audit flagging identical anomalies. The strongest case against it is that the claim originates from an untrusted DB source, and the 'established facts' themselves are labeled untrusted. However, the overlap between the claim and the rejection note raises the possibility that the dataset was compromised, not that the underlying SEC filings were corrupt. The underreported angle is that this dataset corruption may indicate a wider data integrity problem in the scraping or sourcing pipeline—not in SEC EDGAR itself—and that researchers relying on scraped datasets (e.g., Financial Modeling Prep, SEC API mirrors) should cross-verify against EDGAR's native API or XBRL directly.
Reasoning: The claim is elevated from inferential to secondary because: (1) The established facts (though untrusted) independently cite a rejection of a prior inference built on the same future-date and missing-accession-number pattern, providing convergent evidence. (2) SEC EDGAR records (CIK 0001583708) can be programmatically checked to confirm that future-dated entries (e.g., '2027') do not exist in official filings, which would confirm the dataset corruption. (3) The claim does not assert that the company's compliance is flawed, only that the dataset requires verification—a substantively narrow, falsifiable claim that aligns with known data hygiene issues in aggregated SEC datasets. The mechanism: scraped datasets often misparse 'filing date' from XBRL metadata or include placeholder dates for scheduled filings that are never filed. Checking EDGAR's SEC RSS feed for CIK 0001583708 between 2023-03-01 and 2026-03-19 would confirm whether any filing actually exists for 2026-03-19 or later.
SEC EDGAR: CIK=0001583708; filing date range 2023-01-01 to 2026-03-19; search for accession numbers and filing dates in company-filings API
To confirm whether any official SEC filing exists on 2026-03-19 or later, and to verify that all filed documents have accession numbers, thereby confirming the dataset corruption is in the scraped data, not the company's compliance.
SEC EDGAR: CIK=0001583708; query for '485BPOS' or 'S-1' filings near 2025-03-26 and 2026-03-19 to check for multiple same-day filings
To verify whether duplicate entries (two filings on same date) are legitimate amendments or data duplication errors in the dataset.
USASpending.gov: SentinelOne; also search under DBA names ('SentinelOne, Inc.') and subsidiaries; fiscal years 2022-2026
To perform check D from the original facts: if no federal contracts exist despite the company's capabilities, this could reflect classification (Section 889 exclusions) or pass-through contracts—an underreported angle for national security-related cyber firms.
SIGNIFICANT — This finding exposes a systemic data integrity risk in aggregated SEC datasets commonly used for compliance research. For a foreign-owned publicly traded cybersecurity company tied to Unit 8200, even minor data corruption could mask material events (stock sales, ownership changes) relevant to CFIUS reviews and national security oversight.