I'm also not sure how that would be helpful. If every prompt has to chew through all those tokens before it can start predicting a response, it'll be stupid slow. Even now with llama.cpp, it's annoying when it pauses to do the context window shuffle (dropping older tokens and reprocessing what's left once the context fills up).
Perplexity has pretty much solved that by searching the internet and feeding what it finds into the prompt, but I'm not aware of any advances that solve it directly inside the LLM itself.
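For what it's worth, that retrieve-then-prompt trick is simple to sketch. This is just an illustration, not Perplexity's actual pipeline: `fake_search()` and the prompt template below are made-up placeholders.

```python
# Minimal retrieval-augmented prompting sketch. fake_search() stands in for a
# real web search API; everything here is illustrative only.

def fake_search(query: str) -> list[str]:
    """Placeholder: a real version would call a search API and return snippets."""
    return [
        "Snippet 1: some relevant passage pulled from the web.",
        "Snippet 2: another passage that mentions the query terms.",
    ]

def build_prompt(question: str) -> str:
    snippets = fake_search(question)
    context = "\n".join(f"- {s}" for s in snippets)
    # The retrieved text rides along in the prompt, so the model only reads a
    # handful of fresh snippets instead of a huge static context every time.
    return (
        "Answer the question using only the sources below.\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

if __name__ == "__main__":
    print(build_prompt("What changed in the latest llama.cpp release?"))
```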