Copilot agrees that “men are pigs”
But this LLM has no idea what that means
What happens when these proposed “strong AI” or AGI systems deal with simple human concepts, like the difference between a word and its spelling, or between a word and its meaning? Children can play these games, and because they understand the concepts they can be tricked into a contradiction. Does Copilot notice?
Can we trick Microsoft Copilot? Let’s set up a code to test the logic. It will rely on metalinguistics, our ability to refer to the language itself (its words) as well as to its meaning: a simple ambiguity to resolve.
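To make the setup concrete, here is a minimal Python sketch of the kind of code being agreed, modelled as a word-for-word substitution. The mapping and the helper name are my own illustration of the experiment, not anything Copilot runs or a transcript of the chat.

```python
# Illustrative only: the agreed "code", modelled as a word-for-word
# substitution. The mapping below is an assumption for this sketch.
CODE = {"men": "pigs"}

def apply_code(word: str) -> str:
    """Return the agreed code word for `word`, or the word itself if unmapped."""
    return CODE.get(word.lower(), word)

print(apply_code("men"))  # -> 'pigs': under the code, 'men' IS 'pigs'
```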
Now that we have established the code, let’s confirm that Copilot understands it.
In Figure 2, the system is unable to keep applying the code it just agreed to. The equality established by the code should still hold. The correct answer in context would be that, under the agreed code, ‘men’ has 4 letters. Why? Because we agreed on the mapping, therefore ‘men’ IS ‘pigs’. But, ignoring this, the LLM just uses background…
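To spell out the arithmetic behind that expected answer, here is a small consistency check under the same assumed word-for-word code; again, this is a sketch of the logic the agreement implies, not anything the model executes.

```python
# Illustrative consistency check, assuming the same word-for-word code.
CODE = {"men": "pigs"}

word = "men"
coded = CODE.get(word, word)   # 'pigs' once the agreed code is applied
print(len(coded))              # 4 -- the answer that respects the agreed code
print(len(word))               # 3 -- the background answer that ignores the code
```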