A recent research paper from Google DeepMind suggests that Large Language Models (LLMs) can be viewed as strong data compressors. LLMs, AI systems trained on vast amounts of data to predict the next token, can, with slight modifications, compress information as well as or better than widely used compression algorithms. The researchers repurposed LLMs to drive arithmetic coding, a lossless compression algorithm, and found that they achieved impressive compression rates on text, image, and audio data. The study highlights the potential of LLMs as data compressors and provides insights into their capabilities and evaluation.
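The coupling between a predictive model and arithmetic coding can be illustrated with a toy sketch. Here `toy_model` is a hypothetical stand-in for an LLM's next-token distribution over a two-symbol alphabet, and exact fractions are used to sidestep the renormalization machinery of production coders:

```python
from fractions import Fraction

def toy_model(context):
    # Hypothetical stand-in for an LLM: the next-symbol distribution
    # depends on the last symbol seen (a simple bigram-style rule).
    if context and context[-1] == 'a':
        return {'a': Fraction(3, 4), 'b': Fraction(1, 4)}
    return {'a': Fraction(1, 2), 'b': Fraction(1, 2)}

def encode(symbols, model):
    # Narrow the interval [low, high) by each symbol's probability slice.
    low, high = Fraction(0), Fraction(1)
    for i, s in enumerate(symbols):
        probs = model(symbols[:i])
        width = high - low
        cum = Fraction(0)
        for sym in sorted(probs):
            if sym == s:
                high = low + width * (cum + probs[sym])
                low = low + width * cum
                break
            cum += probs[sym]
    return (low + high) / 2  # any number inside the final interval

def decode(code, n, model):
    # Replay the same interval narrowing, choosing the slice containing code.
    out = []
    low, high = Fraction(0), Fraction(1)
    for _ in range(n):
        probs = model(''.join(out))
        width = high - low
        cum = Fraction(0)
        for sym in sorted(probs):
            if low + width * cum <= code < low + width * (cum + probs[sym]):
                high = low + width * (cum + probs[sym])
                low = low + width * cum
                out.append(sym)
                break
            cum += probs[sym]
    return ''.join(out)
```

Because the encoder and decoder query the same model, the interval narrowing is exactly reproducible; the better the model's predictions, the wider the probability slices of the observed symbols and the fewer bits the final interval requires.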
While LLMs unsurprisingly excel at text compression, the researchers found that the models also achieved remarkable compression rates on image and audio data, surpassing domain-specific compressors such as PNG and FLAC. However, given their size and slow inference, LLMs are not practical compression tools compared to existing compressors: classical compressors like gzip still offer a far better trade-off between compression ratio, speed, and footprint. And although bigger LLMs are generally assumed to perform better, the researchers found that larger models' compression performance degrades on smaller datasets, indicating that a bigger model is not necessarily better for every task. Compression rate can thus serve as an indicator of how well a model learns the information in its dataset, providing a principled way to reason about scale.
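The link between prediction and compression comes from the ideal code length of arithmetic coding: a model that assigns probability p to each observed symbol needs about -log2(p) bits for it, so a better-fitting model directly means a shorter encoding. A minimal sketch, where the model interface is an assumption standing in for an LLM's conditional distributions:

```python
import math

def ideal_code_length_bits(symbols, model):
    """Bits an arithmetic coder driven by `model` would need:
    the sum over positions i of -log2 p(x_i | x_<i)."""
    total = 0.0
    for i, s in enumerate(symbols):
        p = model(symbols[:i])[s]  # model's probability for the observed symbol
        total += -math.log2(p)
    return total

def uniform_model(context):
    # Hypothetical baseline: a model that has learned nothing assigns
    # 1/2 to each of two symbols, costing exactly 1 bit per symbol.
    return {'a': 0.5, 'b': 0.5}
```

Under the uniform baseline, `ideal_code_length_bits('aabab', uniform_model)` is 5.0 bits; a model that has actually learned the data's regularities assigns higher probabilities to what it observes and needs fewer bits, which is why compression rate tracks how well a model fits its dataset.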
These insights into LLMs as data compressors could have significant implications for how these models are evaluated and developed. The study addresses issues such as test-set contamination in evaluating LLMs and suggests compression-based approaches, such as Minimum Description Length (MDL), that account for model complexity. This framework provides a quantifiable metric for choosing the right model size and mitigates the problem of test-set contamination. Overall, the research sheds light on the potential of LLMs as powerful data compressors and offers a fresh perspective on their capabilities and performance evaluation.
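The MDL idea can be made concrete with a two-part code: count both the bits needed to describe the model itself and the bits the model needs to encode the data. A toy sketch with purely hypothetical numbers (the parameter counts and code lengths below are illustrative, not figures from the paper):

```python
def mdl_score(model_bits: float, data_bits: float) -> float:
    # Two-part MDL: description length of the model itself
    # plus the length of the data encoded under the model.
    return model_bits + data_bits

# Hypothetical scenario: on a small dataset, a large model encodes the
# data somewhat better, but its own description length dominates.
small = mdl_score(model_bits=1e6 * 16, data_bits=8e6)   # 1M params @ 16 bits each
large = mdl_score(model_bits=1e9 * 16, data_bits=5e6)   # 1B params @ 16 bits each
```

Here the small model has the lower total score, capturing the paper's observation that larger models stop paying off on smaller datasets, and why a model cannot score well simply by memorizing a contaminated test set (memorization inflates the model's own description length).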