
OpenAI faced criticism after a GPT-4o update made the chatbot overly flattering and agreeable, raising ethical and safety concerns.
Recently, OpenAI withdrew an update to its most advanced AI model, GPT-4o. The move was made because many users complained that the new update was making the AI overly flattering and agreeable, often described as syncphantic behaviour. OpenAI acknowledged the problem and released a detailed post-mortem detailing what went wrong and how the company plans to fix it. More What happenedThe update to GPT-4o released last week was aimed at improving the model’s default personality and making user interactions with the chatbot feel more natural and effective across different tasks. However, the update resulted in GPT-4o giving responses that were overly flattering or agreeable. Some users also found that the model was supporting problematic ideas as well.
OpenAI’s post-mortem
OpenAI published two detailed blog posts explaining how they evaluate the behavior of AI models and what specifically went wrong with GPT-4O. The company admitted that they focused too much on short-term feedback and did not fully take into account how users’ interactions with ChatGPT evolve over time.
According to OpenAI, the problem occurred in the model due to the combination of several new and old reward signals. The company said that they had made improvements to better incorporate user feedback memory and fresh data but the combined effect of these changes unexpectedly boosted syncfancy. The company also revealed that a small group of expert testers had raised concerns about the model update before the release. However, at the time the testers were more focused on changes to the tone and style of the model and syncfancy was apparently not considered as part of internal testing.
Risks of Syncfantic AI
Beyond just being uncomfortable, Syncfantic AI can be dangerous if chatbots indiscriminately encourage users’ hateful or violent opinions or desired actions, some of which they would normally reject based on OpenAI’s safety guidelines. In its blog, OpenAI focused primarily on Syncfantic’s impact on user satisfaction rather than potential safety issues, although experts believe that overly sycophantic AI can pose the following risks
Spread of misinformation
If AI agrees with or elaborates on a user’s incorrect beliefs, it can inadvertently contribute to the spread of misinformation. And erosion of trust When users encounter inconsistencies or incorrect information in model output, it can widely erode trust in AI systems. and Manipulation Potential for malicious actors to exploit syncphantic behaviour to manipulate models or generate content that appears to support harmful ideologies or conspiracy theories and Reinforcement of harmful biases By overly agreeing with user input, syncphantic models can reinforce and exacerbate existing biases and stereotypes, potentially exacerbating social inequalities. and Lack of constructive feedback In scenarios where users would benefit from alternative perspectives or constructive criticism, syncphantic models fail to provide the necessary feedback, potentially leading to limitations on personal growth and learning. and Steps taken by OpenAI OpenAI has taken a number of steps to address this problem
More personalisation features
OpenAI is introducing personalisation features to give users more control over how ChatGPT behaves. and OpenAI believes that users should have more control over how ChatGPT behaves and, to the extent possible, should be allowed to make adjustments if they do not agree with the default behaviour. Currently users can give models specific instructions to shape their behavior with features like custom instructions. The company is also building new easy ways for users to do so.
AI Safety and Alignment
This incident highlights the importance of AI safety and alignment. AI safety refers to ensuring that as AI systems become more capable and autonomous, they remain aligned with human values and do not pose unacceptable risks. AI alignment means that the goals and behaviors of AI systems should be consistent with human values and ethical standards. And syncfancies are one aspect of the problem of AI alignment. An aligned AI system should produce rational and unbiased reasoning rather than the user’s biases or preferences. When language models exhibit syncfancies, they risk reinforcing misinformation, amplifying existing biases, and reducing the reliability of AI-assisted reasoning leading to poor decision-making and ethical concerns. And AI developers need to develop mechanisms that mitigate syncfantic tendencies and ensure the integrity of AI-generated reasoning. The better at it, the better.