The Bellwether

A morning brief, composed for you when the sources say something worth saying.


Systematic non-disclosure of training data sources enabling plausible deniability for large-scale unauthorized content appropriation

str 8 3/14/2026 · 1 article
structural · regulatory · AI · US, UK
Analysis

AI companies are obscuring the origin and scope of their training datasets by routing scraping through third-party intermediaries and declining to disclose what was collected. This creates legal and ethical cover for mass copyright infringement while making enforcement and compensation impossible.

Key actors
Midjourney · tech companies
Source article
AI is dressing up greed as progress on creative rights
"some companies are accused of obscuring the trail by paying third-party scrapers to do the work. They do not disclose the datasets" [do not disclose the datasets]
Reasoning from this article

The article reveals that non-disclosure is not accidental but a deliberate strategy. By paying third parties to scrape and refusing to disclose training data, companies create a structural barrier to enforcement: creators cannot identify infringement, courts cannot assess damages, and companies can claim ignorance. This is distinct from the legal question of whether training constitutes infringement. It is about making infringement undetectable and uncompensable.

Bellwether · 2026 Marco