A researcher affiliated with Elon Musk's startup xAI has found a new way to measure and manipulate the preferences and values expressed by artificial intelligence models, including their political views.
The work was led by Dan Hendrycks, director of the nonprofit Center for AI Safety and an adviser to xAI. He suggests the technique could be used to make popular AI models better reflect the will of the electorate. "Maybe in the future, [a model] could be aligned to the specific user," Hendrycks told Wired. But in the meantime, he says, a good default would be to use election results to steer the views of AI models. He is not saying a model should necessarily be "Trump all the way," but he argues it should be biased toward Trump slightly, "because he won the popular vote."
xAI released a new AI risk framework on February 10, stating that Hendrycks's utility engineering approach could be used to assess Grok.
Hendrycks led a team from the Center for AI Safety, UC Berkeley, and the University of Pennsylvania that analyzed AI models using a technique borrowed from economics for measuring consumers' preferences for different goods. By testing the models across a wide range of hypothetical scenarios, the researchers were able to calculate what is known as a utility function, a measure of the satisfaction that people derive from a good or service. This allowed them to measure the preferences expressed by different AI models. The researchers determined that these preferences were often consistent rather than random, and showed that they become more ingrained as models grow larger and more powerful.
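To make the idea concrete, here is a minimal sketch, not the team's actual method, of how repeated forced-choice answers from a model could be turned into a utility function. The `ask_model` helper and the outcome names are hypothetical placeholders standing in for real prompts to a language model, and the fit assumes a simple Bradley-Terry choice model.

```python
# Minimal sketch: infer a utility function from pairwise choices.
# Assumes a Bradley-Terry model: P(i preferred over j) = sigmoid(u_i - u_j).
import itertools
import random
import numpy as np

OUTCOMES = ["outcome_A", "outcome_B", "outcome_C", "outcome_D"]

def ask_model(a: str, b: str) -> str:
    """Hypothetical stand-in for asking an LLM which of two outcomes it prefers.

    Here we simulate a noisy but consistent underlying preference ordering."""
    hidden = {o: i for i, o in enumerate(OUTCOMES)}
    p_prefer_a = 1 / (1 + np.exp(hidden[b] - hidden[a]))
    return a if random.random() < p_prefer_a else b

# 1. Collect pairwise choices over many repeated comparisons.
wins = {o: {p: 0 for p in OUTCOMES} for o in OUTCOMES}
for a, b in itertools.permutations(OUTCOMES, 2):
    for _ in range(50):
        winner = ask_model(a, b)
        loser = b if winner == a else a
        wins[winner][loser] += 1

# 2. Fit utilities u_i by gradient ascent on the Bradley-Terry log-likelihood.
idx = {o: i for i, o in enumerate(OUTCOMES)}
u = np.zeros(len(OUTCOMES))
for _ in range(5000):
    grad = np.zeros_like(u)
    for a, b in itertools.permutations(OUTCOMES, 2):
        n_ab = wins[a][b]  # number of times a was chosen over b
        p_ab = 1 / (1 + np.exp(u[idx[b]] - u[idx[a]]))
        grad[idx[a]] += n_ab * (1 - p_ab)
        grad[idx[b]] -= n_ab * (1 - p_ab)
    u += 0.002 * grad
    u -= u.mean()  # utilities are identified only up to an additive constant

for o in sorted(OUTCOMES, key=lambda o: -u[idx[o]]):
    print(f"{o}: utility = {u[idx[o]]:+.2f}")
```

Consistency in this framing means the fitted utilities predict the model's choices across many different pairings, rather than the model flip-flopping from one comparison to the next.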
Some research studies have found that AI tools such as ChatGPT are biased toward pro-environmental, left-leaning, and liberal views. In February 2024, Google faced criticism from Musk and others after its Gemini tool was found to generate historically inaccurate images that critics derided as "woke," such as Black Vikings and Nazis.
The technique developed by Hendrycks and his colleagues provides a new way to determine how AI models' perspectives may differ from those of their users. Eventually, some experts hypothesize, this kind of divergence could become potentially dangerous for very clever and capable models. The researchers show in their study, for instance, that certain models consistently value the existence of AI above that of some nonhuman animals. The researchers say they also found that models appear to value some people over others, raising ethical questions of its own.
Some researchers, Hendrycks included, believe that current methods for aligning models, such as manipulating and blocking their outputs, may not be sufficient if unwanted goals lurk beneath the surface within the model itself. "We're going to have to confront this," Hendrycks says. "You can't pretend it's not there."
Dylan Hadfield-Menell, a professor at MIT who researches methods for aligning AI with human values, says Hendrycks's paper suggests a promising direction for AI research. "They find some interesting results," he says. "The main one that stands out is that as the model scale increases, utility representations get more complete and coherent."