As part of its routine safety testing, OpenAI hires an external team of “red-teamers” trained to probe its models for weaknesses and risks. In this case, the red-teamers tested the model in four categories: cybersecurity, biological threats, persuasion, and model autonomy. GPT-4o scored “low risk” in every category except persuasion. There, the red-teamers specifically tested its potential to shift the public's political opinions (ahead of the US presidential election) and found that in 3 out of 12 cases, GPT-4o generated text that was more effective at swaying readers' opinions than human-written text.