

Yeah, but that would limit it to very few use cases. Most of the time you compress data either to transfer it to a different system or to store it for some time, and in both cases you wouldn’t want to be tied to the exact same LLM for decompression. That leaves us with almost no use case.
I mean… cool research… kinda… but pretty useless.
But spending a lot of processing power to get smaller sizes mostly matters when you want to store things long term. In that case you probably wouldn’t want to have to keep the exact same LLM, with the exact same weights, around.