AI Can Now Compress Text

There are many claims in the air about the capabilities of AI systems, as the technology continues to ascend the dizzy heights of the hype cycle. Some of them are true, others stretch definitions a little, while yet more cross the line into the definitely bogus. [J] has one that is backed up by real code though, a compression scheme for text using an AI, and while there may be limitations in its approach, it demonstrates an interesting feature of large language models.

The compression works by assuming that for a sufficiently large model, it’s likely that many source texts will exist somewhere in its training data. Using llama.cpp it’s possible to extract the tokenization information of a piece of text contained in that training data and store it as the compressed output. The decompressor can then use that tokenization data as a series of keys to reassemble the original from what the model has learned. We’re not AI experts, but we are guessing that a source text which has little in common with any training text would fare badly, and we expect that the same model would have to be used for both compression and decompression. It remains a worthy technique though, and no doubt because it has AI pixie dust, somewhere there’s a hype-blinded venture capitalist who would pay millions for it. What a world we live in!
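To make the idea concrete, here is a toy sketch in Python. It is not [J]'s code and does not touch llama.cpp: a tiny word-level bigram table stands in for the language model, and the function names (train, compress, decompress) are our own invention for illustration. Each word is stored as its rank among the model's guesses for what comes next, so text the model has already seen shrinks to a run of small integers, while unfamiliar words have to be stored verbatim.

```python
from collections import Counter, defaultdict

def train(corpus):
    """Toy stand-in for an LLM: for each word, list the words that followed it,
    most frequent first (ties broken alphabetically so the order is deterministic)."""
    words = corpus.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return {
        prev: [w for w, _ in sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))]
        for prev, c in counts.items()
    }

def compress(model, text):
    """Replace each word with its rank in the model's predictions where possible.
    Note: whitespace is normalized to single spaces in this sketch."""
    words = text.split()
    if not words:
        return []
    out = [words[0]]  # the first word has no context, so keep it literally
    for prev, nxt in zip(words, words[1:]):
        candidates = model.get(prev, [])
        if nxt in candidates:
            out.append(candidates.index(nxt))  # small integer: the model knew it
        else:
            out.append(nxt)  # literal fallback: the model has never seen this pair
    return out

def decompress(model, codes):
    """Rebuild the text by asking the same model for its ranked predictions."""
    if not codes:
        return ""
    words = [codes[0]]
    for code in codes[1:]:
        if isinstance(code, int):
            words.append(model[words[-1]][code])
        else:
            words.append(code)
    return " ".join(words)

if __name__ == "__main__":
    model = train("the cat sat on the mat and the cat ran")
    packed = compress(model, "the cat sat on the mat")
    print(packed)
    print(decompress(model, packed))
```

Both of the guesses in the paragraph above show up even at this scale: text unlike the training corpus falls back to literals and gains nothing, and decompressing with a differently trained model would produce the wrong words or fail outright.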

Oddly this isn’t the first time we’ve looked at AI text compression.
