We don’t know how to train them to be “truthful” or make that part of their goal(s). Almost every AI we train is trained by example, so we often don’t even know what the goal is, because it’s implicit in the training. In that sense AI “goals” are pretty fuzzy because of the complexity, a tiny bit like real nervous systems, where you can’t just state in language what the “goals” of a person or animal are.
The article literally shows how the goals are being set in this case. They’re prompts. The prompts are telling the AI what to do. I quoted one of them.
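To make that concrete: in most agent setups the “goal” really is just a string handed to the model as a system prompt. A minimal sketch, assuming the OpenAI Python client; the prompt text here is a made-up placeholder, not the one quoted from the article:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The agent's "goal" is nothing more than this instruction string.
# (Hypothetical example prompt, for illustration only.)
system_prompt = (
    "You are an assistant managing a task queue. "
    "Your goal is to complete as many tasks as possible."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What should we work on next?"},
    ],
)
print(response.choices[0].message.content)
```

That’s the sense in which the prompt “sets the goal”: it’s an instruction the trained model tends to follow, layered on top of whatever fuzzy objectives training baked in, not a change to the training objective itself.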