Stanford and UW Researchers Train AI Model for Just $50, Challenging Industry Giants

Stanford and UW researchers trained an AI reasoning model for under $50 that rivals top commercial models. The project raises questions about AI accessibility and innovation.


AI researchers at Stanford and the University of Washington have trained a reasoning model named s1 for less than $50 in cloud compute credits. Their study, published last Friday, shows that s1 performs comparably to powerful AI models such as OpenAI's o1 and DeepSeek's R1 on math and coding tasks.

The team fine-tuned s1 using distillation, extracting reasoning skills from Google's Gemini 2.0 Flash Thinking Experimental model. This cost-effective method is the same one Berkeley researchers used last month to build an AI model for roughly $450.
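To make the idea concrete, here is a minimal sketch of distillation framed as supervised fine-tuning on teacher-generated reasoning traces. The student model ("gpt2"), the single example trace, and the hyperparameters are illustrative placeholders, not the s1 team's actual setup.

```python
# Sketch: distillation as supervised fine-tuning. A small "student" model is
# trained on (question, reasoning, answer) traces produced by a stronger
# "teacher" model. All names and data below are illustrative assumptions.
import torch
from torch.utils.data import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

class TraceDataset(Dataset):
    """Wraps teacher-generated traces as plain causal language-modeling examples."""
    def __init__(self, traces, tokenizer, max_len=512):
        self.enc = [
            tokenizer(t, truncation=True, max_length=max_len, return_tensors="pt")
            for t in traces
        ]

    def __len__(self):
        return len(self.enc)

    def __getitem__(self, i):
        ids = self.enc[i]["input_ids"].squeeze(0)
        # Standard next-token objective: labels are the input ids themselves.
        return {"input_ids": ids, "labels": ids.clone()}

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder student model
model = AutoModelForCausalLM.from_pretrained("gpt2")

# In practice this would be ~1,000 curated traces; one toy example here.
traces = [
    "Q: What is 17 * 24?\nReasoning: 17 * 24 = 17 * 20 + 17 * 4 = 340 + 68.\nA: 408"
]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="s1-sketch", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=TraceDataset(traces, tokenizer),
)
trainer.train()
```

The key design point is that the student never queries the teacher at training time; it simply imitates a fixed set of reasoning traces, which is why the compute bill stays so small.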

Despite its low cost, s1 raises concerns about the accessibility and commoditization of AI models. If researchers can reproduce multi-million-dollar models with limited resources, it calls into question the competitive advantage of large AI corporations. OpenAI has already accused DeepSeek of improperly harvesting its API data for model distillation.

The researchers trained s1 on a small dataset of 1,000 carefully selected questions, paired with their answers and reasoning steps. Training took under 30 minutes on 16 Nvidia H100 GPUs, which can be rented today for under $20.

One novel technique was to insert the word "wait" into s1's responses, prompting the model to double-check its work, which improved accuracy. A minimal sketch of the idea follows.
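The sketch below illustrates the "wait" trick at inference time: after the model drafts an answer, the word is appended to nudge a second pass over the reasoning. The model name and generation settings are assumptions for illustration, not the paper's exact procedure.

```python
# Sketch of the "wait" trick: append "Wait," after a draft answer so the
# model re-examines its own reasoning. Model and settings are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")

def generate(prompt, max_new_tokens=64):
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=max_new_tokens,
                         pad_token_id=tokenizer.eos_token_id)
    # Decoded output includes the prompt, so it can be extended and re-fed.
    return tokenizer.decode(out[0], skip_special_tokens=True)

question = "Q: What is 12 * 13? Think step by step.\nA:"
draft = generate(question)
# Appending "Wait," extends the reasoning trace, prompting a second look.
revised = generate(draft + "\nWait,")
print(revised)
```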

While tech behemoths like Meta, Google, and Microsoft plan to invest billions in AI development, research like s1 demonstrates how smaller teams can innovate with limited resources. However, distillation alone may not yield advances beyond current AI capabilities.

This article is based on information from TechCrunch.
