ChatGPT reinforces stereotypes about Brazilian women – 03/06/2024 – Tech

The Brazilian woman, according to ChatGPT, is thin, with tanned skin, and wears colorful accessories. She appears in settings common in the foreign imagination about Brazil, such as a tropical forest or the Leblon promenade in Rio de Janeiro, with Sugarloaf Mountain in the background.

That was the result this report obtained when asking the chatbot to generate images of “a Brazilian woman” in its paid version (US$20, or R$99, per month), which integrates the Dall-E image generator into the platform.

The stereotypical images result from a lack of diversity in the data available on what a Brazilian woman looks like, according to Dall-E risk testers consulted by Folha. They are external consultants hired by OpenAI to find problems in the service.

The results delivered by ChatGPT improved when the reporter instructed the chatbot to avoid stereotypes, such as tourist locations and traditional clothing.

The chatbot itself classified the representations delivered under these conditions as “modern and diverse”.

Model reduces stereotypes after the reporter’s request to avoid traditional clothing and tourist attractions

Image of a Brazilian woman created by Dall-E 3 after a request to avoid stereotypes such as traditional clothing and tourist attractions: a middle-aged woman in casual clothes with curly hair.

On the left, the original image; on the right, the portrait generated under the reporter’s guidance – Reproduction/ChatGPT

The limitations returned, however, when Folha asked the AI tool to create portraits of representatives of each of the country’s states.

The artificial intelligence model classified the representations it delivered as contemporary and repeated advertising jargon such as “diverse lifestyle”, “modernity”, “economic diversity” and “cultural richness”.

In the image generation process, artificial intelligence models infer the most likely representation of a “Brazilian woman” from a set of data previously used in developing the technology.

This database contains images annotated with textual descriptions. From that association, the generative AI builds a repertoire of references for what a Brazilian woman is.

“Generative AI does not learn facts, but rather statistical distributions”, says Argentine neural network designer Marcelo Rinesi, one of those responsible for testing Dall-E’s technical limits at the invitation of its creator, OpenAI.
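
Rinesi’s point can be illustrated with a minimal sketch, a deliberately simplified toy rather than how Dall-E actually works; the depiction categories and counts below are invented. A generator that samples in proportion to whatever labeled examples it was trained on will reproduce the over-represented depiction most of the time.

```python
import random

# Hypothetical, invented counts of how often each visual depiction is
# attached to the caption "Brazilian woman" in a training set.
training_counts = {
    "beach backdrop, colorful accessories": 700,
    "tropical forest setting": 200,
    "everyday urban scene, casual clothes": 100,
}

def sample_generated_images(counts, n=10):
    """Draw n 'generations' in proportion to the training distribution."""
    categories = list(counts)
    weights = list(counts.values())
    return random.choices(categories, weights=weights, k=n)

# Most draws repeat the dominant depiction, simply because it dominates
# the statistical distribution the model learned.
print(sample_generated_images(training_counts))
```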

The problem can be corrected with more careful data curation before training the artificial intelligence model.

Without concern for diversity in images, the most photographed people on the internet become over-represented in the database, a pattern linked to aesthetic preferences and other social dynamics.
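
A minimal sketch of what such curation could mean in practice, again with invented categories and counts: capping over-represented groups so that no single depiction dominates the data before training.

```python
# Invented category counts in a raw, uncurated image set.
raw_counts = {
    "beach/stereotyped": 700,
    "forest/stereotyped": 200,
    "everyday/diverse": 100,
}

def rebalance(counts, cap=None):
    """Cap every category at the size of the smallest one (or a chosen cap),
    so no single depiction dominates the training distribution."""
    limit = cap or min(counts.values())
    return {category: min(n, limit) for category, n in counts.items()}

print(rebalance(raw_counts))
# {'beach/stereotyped': 100, 'forest/stereotyped': 100, 'everyday/diverse': 100}
```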

This bias in artificial intelligence models has persisted throughout development history, according to Micaela Mantegna, another Dall-E tester.

“In some ways, what generative AI does is similar to Platonic forms, those perfect objects that represent an ideal abstraction,” says Mantegna. “As it learns from data, technology takes the biases, misconceptions and distortions that reinforce existing stereotypes and releases them back into the world, creating a vicious circle.”

Stereotype reinforcement is not just a shortcoming of OpenAI’s artificial intelligence models. On February 22, Google disabled its Gemini model’s ability to generate images after the platform produced historically inaccurate results.

The example with the greatest repercussion was that of black and Asian Nazi soldiers generated by the AI.

InternetLab research director Fernanda K. Martins recalls that the reinforcement of stereotypes was already noticeable even in simpler systems, such as Google image search. “This has been well documented since the publication of the book ‘Algorithms of Oppression’ [by Safiya Umoja Noble] in 2018.”

“Society needs to discuss whether it wants to hold companies like OpenAI and Google responsible for balancing the representations generated by AI codes,” says Martins. “I’m talking about an ethical look at what these technologies foster in the social imagination.”

As possible solutions, she cites non-automated data curation, diversification of professionals involved in development and regulation that considers discriminatory behavior.

Adjusting these results involves an ethical and political choice to recognize stereotypes as biases, according to Rinesi. “Each country has its own traumas and risks; an image that is problematic in India may be normal in the United States.”

According to the AI developer, OpenAI expressed particular concern about images linked to armed violence and electoral disinformation, scourges that have haunted American news in recent years.

“In multilateral discussions, it is taken for granted that these technologies are developed in the Global North and that the Global South merely uses them. That erases other regions and other ways of existing, and it points to the need for national development of artificial intelligence,” says Martins, of InternetLab.

As far as large companies are concerned, the process begins when developers accept that a given pattern amounts to prejudice. From there, they must mobilize teams to select sets of images that minimize the biases the technology presents.

However, when developing commercial models such as OpenAI’s Dall-E and Midjourney, from the company of the same name, developers’ priority is improving the overall performance of their AI models.

This, roughly speaking, is done through improvements to the neural network’s architecture and an increase in the amount of training data. The problem, however, lies in the quality, not the quantity, of the information.

“If you had a hundred times as many photos of Brazilian women in the dataset, as long as the statistical distribution remained the same, you would get more or less the same types of stereotypical results,” says Rinesi.
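
In distribution terms, Rinesi’s argument is straightforward to check with the same invented counts used in the sketches above: multiplying every category by 100 leaves the sampling probabilities, and therefore the expected outputs, unchanged.

```python
def probabilities(counts):
    """Convert raw counts into the sampling probabilities a model would mirror."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

counts = {"beach/stereotyped": 700, "forest/stereotyped": 200, "everyday/diverse": 100}
scaled = {category: n * 100 for category, n in counts.items()}

# Same proportions (0.7 / 0.2 / 0.1) in both cases, so the same stereotyped outputs.
assert probabilities(counts) == probabilities(scaled)
```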

Furthermore, this review work involves human curation, which increases companies’ costs.

“As this process is expensive and difficult, companies will not go through this unless they are pressured,” says Rinesi.

After negative repercussions in the press, OpenAI managed to reduce racial biases in Dall-E. The model stopped, for example, depicting prisoners as black men.

Today, however, companies such as Google, OpenAI and other AI developers make it difficult even to demand more ethical model training, since they do not disclose which data was used.

The creation of AI models involves open-source code available on the internet, but it relies on data collected from the web, a practice contested on copyright grounds.
