Predictable Confabulations: Factual Recall by LLMs Scales with Model Size and Topic Frequency

Large language models recall factual information more accurately as model size grows and as a topic appears more frequently in their training data. Recall quality follows a sigmoid curve when modeled as a function of a log-linear combination of these two factors, which makes confabulation predictable: it is most likely for smaller models and for rarely covered topics. An evaluation of 38 models on over 8,900 scholarly references confirmed this trend, with an automated reference verification system providing a reliable measure of recall quality¹. The findings highlight that both model size and training data composition matter for building more accurate language models. For practitioners, this suggests that investing in larger models trained on diverse data can improve factual recall in applications such as information retrieval and question answering.
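
The relationship described above can be sketched in a few lines of Python. This is only an illustration of the functional form (a sigmoid applied to a log-linear combination of model size and topic frequency). The function name, the choice of units, and every coefficient value are hypothetical placeholders, not values reported by the study.

```python
import math


def predicted_recall(params_billion, topic_frequency,
                     b0=-6.0, b_size=0.8, b_freq=0.5):
    """Illustrative recall score in (0, 1).

    Assumptions (not from the study): model size is given in billions of
    parameters, topic frequency is a positive count, and b0, b_size, b_freq
    are made-up coefficients chosen only to show the shape of the curve.
    """
    # Log-linear combination of the two factors.
    z = b0 + b_size * math.log(params_billion) + b_freq * math.log(topic_frequency)
    # The sigmoid squashes the combined score into the (0, 1) range.
    return 1.0 / (1.0 + math.exp(-z))


# Compare a smaller and a larger hypothetical model on a rare and a common topic.
for size in (7, 70):
    for freq in (100, 10_000):
        print(size, freq, round(predicted_recall(size, freq), 3))
```

With positive coefficients, the score rises with either factor and saturates at the extremes, which is the qualitative behavior a sigmoid fit implies.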
