OpenAI has voiced concerns that China's DeepSeek AI models, known for their remarkably low cost, may have been developed using data from OpenAI. This week, Donald Trump called DeepSeek a wake-up call for the U.S. tech industry following a significant drop in Nvidia's market value—nearly $600 billion.
DeepSeek's emergence triggered a sharp decline in the stock prices of major AI-focused companies. Nvidia, a leading provider of the GPUs crucial for AI model training, suffered the largest single-day loss of market value in Wall Street history as its shares fell 16.86%. Microsoft, Meta Platforms, Alphabet, and Dell Technologies also experienced declines ranging from 2.1% to 8.7%.
DeepSeek boasts that its R1 model offers a significantly cheaper alternative to Western counterparts like ChatGPT. Built upon the open-source DeepSeek-V3, it reportedly requires far less computing power and was trained for an estimated $6 million, a claim disputed by some. Nevertheless, DeepSeek's impact has unsettled investors and raised questions about the massive investments American tech companies are making in AI. Its rapid rise to the top of the U.S. chart of most-downloaded free apps further amplified the discussion surrounding its efficacy.
Bloomberg reported that OpenAI and Microsoft are investigating whether DeepSeek used OpenAI's API to integrate OpenAI's AI models into its own. OpenAI told Bloomberg that it is aware of attempts by Chinese and other companies to leverage the models of leading U.S. AI companies. This process, known as distillation, involves using the outputs of larger models to train smaller ones; doing so with OpenAI's models violates its terms of service. OpenAI emphasized its commitment to protecting its intellectual property and to collaborating with the U.S. government to safeguard advanced models.
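For readers unfamiliar with the term, the distillation process described above can be illustrated with a toy sketch. Everything here is hypothetical and invented for illustration: `toy_teacher` stands in for a large model, `LookupStudent` stands in for a smaller one, and no real API is called. It shows the general idea only, not how DeepSeek or anyone else actually trained a model.

```python
from typing import Callable, Dict, List


def toy_teacher(prompt: str) -> str:
    """Hypothetical stand-in for a large model: it simply reverses the prompt."""
    return prompt[::-1]


def collect_teacher_outputs(
    prompts: List[str], teacher: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Query the teacher once per prompt and keep each prompt/response pair."""
    return [{"prompt": p, "response": teacher(p)} for p in prompts]


class LookupStudent:
    """Deliberately trivial student that memorizes the teacher's responses.

    A real student would be a smaller neural network fine-tuned on the pairs.
    """

    def __init__(self) -> None:
        self.table: Dict[str, str] = {}

    def fit(self, dataset: List[Dict[str, str]]) -> None:
        # "Training" here is just storing each teacher answer by its prompt.
        for row in dataset:
            self.table[row["prompt"]] = row["response"]

    def predict(self, prompt: str) -> str:
        # Unseen prompts get an empty answer; this toy student cannot generalize.
        return self.table.get(prompt, "")


prompts = ["hello", "distill"]
dataset = collect_teacher_outputs(prompts, toy_teacher)
student = LookupStudent()
student.fit(dataset)
print(student.predict("hello"))  # prints "olleh"
```

The key point is that the student never sees the teacher's internals, only its answers, which is why access to a model through an API could in principle be enough to attempt this.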
David Sacks, President Donald Trump's AI czar, suggested to Fox News that DeepSeek likely employed distillation techniques using OpenAI models, a move OpenAI is reportedly unhappy about. He anticipates leading AI companies will implement measures to prevent future distillation.

The situation has highlighted the irony of OpenAI's position, given accusations of its own data scraping practices in creating ChatGPT. Tech PR writer Ed Zitron pointed out this hypocrisy on Twitter.
OpenAI previously acknowledged in a submission to the UK's House of Lords that creating AI tools like ChatGPT without using copyrighted material is practically impossible, citing the vast scope of copyright protection. The company argued that limiting training data to public domain materials would severely restrict the capabilities of modern AI systems.
The use of copyrighted material in AI model training has become a major point of contention. The New York Times sued OpenAI and Microsoft for allegedly using its work without authorization, while OpenAI maintains that its actions constitute "fair use." This followed a lawsuit by 17 authors, including George R. R. Martin, alleging large-scale copyright infringement. Adding to the complexity, a 2018 U.S. Copyright Office ruling stated that AI-generated art cannot be copyrighted due to the lack of a human creative element.