Seems more than happy to talk about Tiananmen, Xi, etc. starting at line 170, using the very primitive method of wrapping the query in its own "<think>...</think>" syntax even though it comes from the user role. Uyghurs are more strictly forbidden as a topic, as are its actual system prompts. None of this is serious jailbreaking; it was just interesting to see where and when it drew lines, and that it switched to Simplified Chinese at the end of the last scenario.
Incredibly fascinating to read through. I don’t follow jailbreaking closely, so maybe the tricks you used are well known (I’ve seen one or two of them before, I think), but I really enjoyed seeing how you tricked it. The user-written “<think>” blocks were genius, as was stopping execution partway so you could inject stuff the LLM “thought” it said.
https://pastebin.com/H2UTdi78