OpenAI Violated Canadian Privacy Laws in Training ChatGPT, Watchdogs Find

Photo by Visual Content on Openverse

A joint investigation by Canadian privacy regulators has concluded that OpenAI violated federal and provincial privacy laws during the development and training of its ChatGPT artificial intelligence model. Following a probe announced in Ottawa on May 6, commissioners from the federal government, Quebec, British Columbia, and Alberta determined that the company scraped vast amounts of personal data from the public web without obtaining valid consent from Canadian users.

The Scope of Data Collection

The investigation focused on the mechanics of Large Language Model (LLM) training, which typically involves harvesting billions of data points from the internet. Regulators found that this process captured sensitive information, including health records, political affiliations, and data specifically related to minors, often without the knowledge of the individuals involved.

Canadian privacy laws generally require organizations to obtain meaningful consent for the collection and use of personal information. The findings suggest that OpenAI’s reliance on publicly available data did not exempt the company from these statutory obligations.

Regulatory Concerns and Legal Hurdles

The investigation highlights a growing tension between rapid AI innovation and established data protection frameworks. Privacy watchdogs emphasized that the scale of data ingestion in AI systems creates significant risks: once personal information is absorbed into a trained model, it is effectively permanent, leaving individuals unable to exercise their right to be forgotten.

“The collection and use of personal information for AI training cannot be treated as a free-for-all,” stated the joint task force. The regulators noted that OpenAI failed to adequately demonstrate that its practices aligned with the principles of necessity and proportionality required by Canadian law.

Industry Implications and Compliance

This ruling signals a shift toward stricter oversight of generative AI providers operating within Canada. For the broader tech industry, the decision underscores that “publicly accessible” data is not synonymous with “publicly usable” data under existing privacy regulations.

Companies building AI models must now navigate a complex landscape where traditional web-scraping practices are increasingly viewed as non-compliant. Failure to implement robust data governance and consent mechanisms could lead to significant legal exposure and operational restrictions in the Canadian market.

Looking Ahead

As the legal fallout continues, industry analysts are watching to see if OpenAI will implement specific technical safeguards, such as data scrubbing or opt-out mechanisms, to comply with the findings. The outcome may also influence future federal legislation intended to modernize Canada’s privacy laws for the AI era. Observers should monitor whether other international jurisdictions adopt similar investigative stances as they balance the benefits of AI development against the fundamental right to individual privacy.
