
Hierarchical Reasoning Models through Recurrent Networks: A New Approach to Cognitive Processing

Hierarchical Reasoning Models, built on recurrent networks, could transform how AI systems reason.

In a notable development, a new AI model called the Hierarchical Reasoning Model (HRM) has emerged, outperforming some of the most sophisticated large language models (LLMs) available today on hard reasoning tasks. The model brings renewed attention to the potential of recurrent architectures.

HRM operates on a distinctive workflow built from two coupled recurrent modules: a high-level planner and a low-level doer. The high-level module handles slow, abstract reasoning and planning, while the low-level module performs fast, detailed computations.

The low-level module takes several quick steps toward a partial solution, and the result is passed up to the high-level module. The high-level module updates its plan based on that result and sends a new hidden state back down to the low-level module. This process repeats until the model converges on a final answer: the high-level module updates N times per forward pass, and within each of those cycles the low-level module runs for T steps.
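This nested recurrence can be sketched in a few lines of plain Python. The update functions below are hypothetical stand-ins for the trained neural modules, chosen only to make the control flow concrete:

```python
# Minimal sketch of HRM's nested N x T recurrence. The update rules
# are hypothetical scalar stand-ins for the trained modules.

def low_step(z_low, z_high, x):
    # Hypothetical low-level update: fast, detailed computation,
    # conditioned on the high-level state and the input.
    return 0.5 * z_low + 0.3 * z_high + 0.2 * x

def high_step(z_high, z_low):
    # Hypothetical high-level update: slow, abstract plan revision
    # based on the low-level module's partial result.
    return 0.7 * z_high + 0.3 * z_low

def hrm_forward(x, N=4, T=8):
    z_high, z_low = 0.0, 0.0               # initial hidden states
    for _ in range(N):                      # N high-level cycles
        for _ in range(T):                  # T low-level steps per cycle
            z_low = low_step(z_low, z_high, x)
        z_high = high_step(z_high, z_low)   # send result up, update the plan
    return z_high                           # final high-level state -> output head
```

The key structural point is that one high-level update "sees" T low-level steps at a time, so the model performs N·T fine-grained computations under only N plan revisions.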

One of HRM's key advantages is that it avoids the exploding and vanishing gradients that arise from long backpropagation chains, which often plague deep recurrent models as the number of iterations increases and eventually degrade performance. HRM sidesteps this by using a one-step gradient approximation during training: gradients flow only through the final update of each module rather than through the entire unrolled chain.
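The idea can be pictured with a pure-Python toy (in a real framework this would be an autograd `detach`; here detaching is modeled as resetting a backprop-depth counter, so no deep-learning library is needed):

```python
# Toy illustration of the one-step gradient approximation.
# `depth` tracks how many recurrent steps backprop would traverse.

class State:
    def __init__(self, value, depth=0):
        self.value = value   # hidden-state value (flows forward)
        self.depth = depth   # length of the gradient chain behind it

def step(s):
    # One recurrent update; the gradient chain grows by one step.
    return State(0.9 * s.value + 0.1, s.depth + 1)

def detach(s):
    # Cut the gradient chain: the value still flows forward,
    # but no gradient would propagate through earlier steps.
    return State(s.value, 0)

def unroll(n_steps):
    s = State(0.0)
    for _ in range(n_steps - 1):
        s = detach(step(s))  # earlier steps contribute no gradient
    return step(s)           # only the final step is differentiated

# However long the unroll, backprop touches a single step:
assert unroll(16).depth == 1
```

Because the backward pass never spans the full chain, its depth (and memory cost) stays constant no matter how many iterations the forward pass runs.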

Moreover, unlike LLMs, HRM does not rely on pre-training. Instead, it combines an input-injection mechanism with the strengths of the recurrent architecture: a trainable embedding layer transforms the input into machine-readable form, and the model stays aware of the problem statement because the embedded input is injected into every iteration.
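A toy version of input injection (the lookup-table "embedding" and the update rule below are illustrative stand-ins, not the paper's layers):

```python
# Toy input injection: the embedded input is re-added at every
# recurrent iteration, so the state never drifts away from the
# problem statement. The embedding table here is a hypothetical
# stand-in for a trainable embedding layer.

EMBED = {"1": [1.0, 0.0], "2": [0.0, 1.0]}  # toy token -> vector table

def embed(tokens):
    return [EMBED[t] for t in tokens]

def iterate(state, x_embedded):
    # Each update receives the embedded input again (injection),
    # not just the previous hidden state.
    return [0.5 * s + 0.5 * sum(v) for s, v in zip(state, x_embedded)]

x = embed(["1", "2"])        # embed the problem statement once
state = [0.0, 0.0]
for _ in range(4):
    state = iterate(state, x)  # inject x at every iteration
```

Without the injection term, the recurrence would only ever see its own previous state and could gradually "forget" the question it is solving.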

HRM also demonstrates effective reasoning through structured recurrence inspired by the human brain. Current transformer models, by contrast, rely primarily on Chain of Thought (CoT) for reasoning tasks. CoT generates a lengthy thought trace, which increases cost and slows the process, and it is error-prone: a mistake at any stage can propagate to all following stages.

HRM is inspired by biology, with a dual recurrent-loop system reminiscent of the brain's hierarchical mental processes. Its adaptive halting mechanism is trained with reinforcement learning (Q-learning): instead of sending the high-level module's final hidden state directly to the output layer, the model passes it through a halting head that decides whether to stop or continue for another N cycles.
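A hedged sketch of that halting decision: a hypothetical halting head scores "halt" versus "continue" from the high-level state, and the model only emits an answer once halting wins. In the actual model this head is trained with Q-learning; here both the scores and the reasoning step are hard-coded stand-ins:

```python
# Toy halting loop. The halting head and the per-segment state
# update are hypothetical stand-ins for the trained components.

def halting_head(z_high):
    # Hypothetical Q-values: (q_halt, q_continue).
    q_halt = z_high      # stronger high-level state -> more confident halt
    q_continue = 0.5     # fixed stand-in for the "keep thinking" value
    return q_halt, q_continue

def run_with_halting(max_segments=8):
    z_high = 0.0
    for segment in range(1, max_segments + 1):
        z_high += 0.2    # stand-in for one full N-cycle reasoning segment
        q_halt, q_continue = halting_head(z_high)
        if q_halt > q_continue:
            return z_high, segment   # stop and produce the answer
    return z_high, max_segments      # forced stop at the budget
```

The effect is that easy problems can exit after one segment while harder ones are granted more cycles, up to a fixed budget.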

The individuals involved in the development of hierarchical mental models include psychologists and cognitive scientists such as Kenneth Craik, who first proposed mental models, and Philip Johnson-Laird, who significantly expanded the theory. With HRM, these theories have found a practical application in AI, revolutionizing the field and opening up new possibilities for the future.
