Samsung’s Tiny Recursive Model: A Breakthrough in AI
In the race for AI supremacy, it has long been assumed that bigger models are better. However, new research from a Samsung AI researcher shows that a tiny network can outperform large language models on complex reasoning tasks. The work demonstrates that a small model, the Tiny Recursive Model (TRM), with only about 7 million parameters, can achieve superior results using a fraction of the resources.
Challenges of Large Language Models
Large language models (LLMs) generate remarkably human-like text, but they struggle with complex, multi-step reasoning. Because they produce answers token by token, a single early mistake can propagate and derail the final result.
Techniques such as Chain-of-Thought prompting attempt to address this by having the model reason step by step. However, they demand substantial computation and often depend on high-quality reasoning data that may not be available.
Samsung’s Tiny Recursive Model (TRM)
Samsung’s work builds on a recent architecture known as the Hierarchical Reasoning Model (HRM). Where HRM uses two networks, TRM relies on a single small network that recursively refines both its internal reasoning and its proposed answer.
The model receives a question, an initial guess at the answer, and a latent reasoning feature. It first refines the latent reasoning over several inner steps based on these three inputs, then uses the improved reasoning to update its predicted answer. This whole process can be repeated up to 16 times, allowing the model to correct its own mistakes progressively.
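The refinement loop above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the dimensions, the single weight matrix standing in for the tiny network, and the step counts are all hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 8          # feature width (hypothetical; the real TRM is a small transformer)
N_INNER = 6    # latent-reasoning refinement steps per cycle (illustrative)
N_CYCLES = 3   # answer-update cycles (the paper allows up to 16)

# Stand-in for the single tiny network: one shared weight matrix plus tanh.
W = rng.standard_normal((3 * D, D)) * 0.1

def net(x, y, z):
    """One pass of the shared network over (question, answer, latent reasoning)."""
    return np.tanh(np.concatenate([x, y, z]) @ W)

def trm_step(x, y, z):
    # Refine the latent reasoning feature z several times from (x, y, z)...
    for _ in range(N_INNER):
        z = net(x, y, z)
    # ...then use the refined reasoning to update the answer.
    # (In the paper the answer update conditions only on y and z;
    # x is kept here purely to reuse one function.)
    y = net(x, y, z)
    return y, z

x = rng.standard_normal(D)   # embedded question
y = np.zeros(D)              # initial answer guess
z = np.zeros(D)              # initial latent reasoning feature

for _ in range(N_CYCLES):
    y, z = trm_step(x, y, z)
```

Each cycle improves the reasoning first and the answer second, which is what lets early mistakes be revised rather than locked in.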
Outperforming Large Models with Fewer Resources
TRM achieved remarkable results across benchmarks. On the difficult Sudoku-Extreme dataset it reached a test accuracy of 87.4%, compared with 55% for HRM. It also excelled on Maze-Hard navigation puzzles and on ARC-AGI, a benchmark designed to measure fluid intelligence in AI systems.
TRM also streamlines a training mechanism called adaptive computation time (ACT), which determines when the model has refined an answer sufficiently and can move on to the next data sample. This change makes training more efficient without compromising overall performance.
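In spirit, ACT works like an early-exit loop: a halting signal decides when further refinement is unnecessary. The sketch below is a hypothetical stand-in, not the paper's learned mechanism — the fixed threshold, the scalar halting head, and the toy update rule are all illustrative assumptions.

```python
import numpy as np

MAX_STEPS = 16        # cap on refinement steps, as in the article
HALT_THRESHOLD = 0.5  # hypothetical cutoff; the real criterion is learned

def halt_prob(state):
    """Stand-in for a learned halting head: squashes a scalar score to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-float(state.sum())))

def run_with_act(state, improve):
    """Keep improving until the halting head signals confidence or we hit the cap."""
    steps = 0
    for steps in range(1, MAX_STEPS + 1):
        state = improve(state)
        if halt_prob(state) > HALT_THRESHOLD:
            break
    return state, steps

# Toy improvement rule: nudge the state upward so halting eventually triggers.
final, used = run_with_act(np.full(4, -1.0), lambda s: s + 0.25)
```

The payoff is that easy samples exit after a few steps while hard ones use the full budget, so compute is spent where it is needed.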
Conclusion
Samsung’s research shows that models designed for iterative reasoning and self-correction can solve complex problems with far fewer computational resources. The finding challenges the prevailing focus on ever-larger models: through recursive refinement, tiny models can achieve striking results, pointing the way toward a more sustainable and efficient future for artificial intelligence.