so there's 3 immediately-suggestive paths that come to mind from this
the first is that gibbering prompts itself already means you've hit a boundary in the design of its execution space (or fucking around in the very edges of training data where its precision gets low), and that could mean you are beyond what the programmers thought of/handled. whether or not you can get reliable further behaviours in that mode/space will be extremely contingent on a lot of factors (model type, execution type, runtime, ...), but given how extremely rapidly and harshly oai (and friends) reacted to simple behavioural breaks I get the impression that they're more concerned with such cases than they might be letting on
the second fairly obvious vector is where everyone is trying to shove LLMs into everything without good safety boundaries. oh that handy chatbot on your doctor/airline/insurance/.... site that's pitched as "it can use your identification details and look up $x"[0], that means that system has access to places where to look up private data. so if you could break a boundary via whatever method, who's to say it can't go further. it's not like telling the prompt "do $x and only $x" will work, as many examples have shown
third path, and sort-of the one that ties the bow on the second a bit, is that most of these dipshits probably don't have proper isolation controls, just because it's hard and effortful. building actual multitenancy with strong inter-tenant separation is a lot of work. that's something that's just not done in bayfucker world unless it is specifically needed. so the more these things get shoved into various products and this segmentation work is not done thoroughly, the more likely that sort of shit becomes
[0] - couple years back (pre-llm) I worked on exactly this problem with a client. it's fantastically annoying to design, not half because humans are such wonderfully unpredictable input sources