Generative AI can easily be made malicious despite guardrails, say students
Students discovered that by gathering as few as 100 example question-answer pairs containing illicit recommendations or hate speech, they could undo the careful “alignment” meant to establish guardrails around generative…