8 comments
  • We could have AI models in a couple years that hold the entire internet in their context window.

    That's a really bold claim.

    • Also not sure how that would be helpful. If every prompt has to prefill all of those tokens before it can predict a response, it'll be painfully slow. Even now with llama.cpp, it's annoying when it pauses to shift the context and re-process tokens once the window fills up.
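
      A rough back-of-envelope sketch of why prefill would dominate at that scale (the throughput number below is an assumption for illustration, not a measured llama.cpp figure):

          # Hypothetical prefill estimate: time-to-first-token grows linearly
          # with context length. 5,000 tok/s prompt processing is an assumed
          # speed, not a benchmark.
          PREFILL_TOKENS_PER_SEC = 5_000

          def prefill_seconds(context_tokens: int) -> float:
              """Seconds spent processing the prompt before any output token."""
              return context_tokens / PREFILL_TOKENS_PER_SEC

          for ctx in (8_000, 1_000_000, 1_000_000_000):
              print(f"{ctx:>13,} tokens -> {prefill_seconds(ctx):>9,.0f} s to first token")

      Under those assumptions, a billion-token prompt takes about 200,000 seconds (over two days) before the model emits a single word.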
