
I don’t personally agree with the term “overcorrecting” because they aren’t correcting anything. The output is already correct according to the input (humans behaving as they are). It is not biased. What they are doing is attempting to bias it, and it’s leading to false outputs as a result.

Having said that, these false outputs have a strange element of correctness to them, in a weird, roundabout, uncanny-valley way: we know the input has been tampered with, and is biased, because the output is obviously wrong. So the algorithm works as intended.

If people are discriminatory or racist or sexist, it is not correct to attempt to hide it. The worst possible human behaviours should be a part of a well-formed Turing test. A machine that can reason with an extremist is far more useful than one that an extremist can identify as such.



It really was just trading one bias (the existing world as it stands) for another bias (the preferred biases of SF tech lefties), so that was kind of funny in its own way. It would have been one thing if it just randomly assigned gender/race, but it had certain one-way biases (modifying men to women and/or white to non-white) and not the opposite direction... and then it was oddly defiant in its responses when people asked for specific demographic outputs.

Obviously a lot of this was done by users for the gotcha screen grabs, but in a real-world product users may realistically want specific demographic outputs, for example if you are using images for marketing and have specific targeting intent, or want to match the demographics of your area / business / etc. Stock image websites allow you to search using demographic terms for this reason.


If the current set of biases can be construed to lead to death, heck yeah I will take another set. The idea is that this other set of biases will at least have a chance of not landing us in hot water (or hot air as it might be right now).

Now note again that the current set of biases got us into an existential risk and a likely disaster. (Ask Exxon how unbiased they were.)

AI does not optimize for this thing at all. It cannot foresee the logical results of, say, hiring a cutthroat egoist, and it cannot detect one from a CV. That could be a much bigger and more dangerous bias than discrimination against the disabled. It is likely optimizing for hiring conformists even when told to prefer diversity, as many companies are, and that would ultimately choke any creative industry. It might be optimizing for short-term tactics over long-term strategy. Etc.

The idea here is that certain sets of biases go together, even in AI. It's like a culture, and we could test for it; in this case, hiring or organizational culture.


You're committing a very common semantic sin (so common because many, many people don't even recognize it): substituting one meaning of "biased" for another.

Sociopolitically, "biased" in this context clearly refers to undue discrimination against people with disabilities or various other marginalized identities.

The meaning of "biased" you are using ("accurately maps input to output") is perfectly correct (to the best of my understanding) within the field of ML and LLMs.

The problem comes when someone comes to you saying, "ChatGPT is biased against résumés that appear disabled", clearly intending the former meaning, and you say, "It is not biased; the output is correct according to the input." Because you are using different domain-specific meanings of the same word, you are liable to each think the other is either wrong or using motivated reasoning when that's not the case.


no assertion about this situation, but be aware that confusion is often deliberate.

there is a group of people who see the regurgitation of existing systemic biases present in training data as a convenient way to legitimize and reinforce interests represented by that data.

"alignment" is only a problem if you don't like what's been sampled.


> there is a group of people who see the regurgitation of existing systemic biases present in training data as a convenient way to legitimize and reinforce interests represented by that data.

Do you have a link to someone stating that they see this as a good thing?


I'm aware that there are people like this.

I prefer to assume the best in people I'm actively talking to, both because I prefer to be kind, and because it cuts down on acrimonious discussions.


That "sin" can be a very useful bit of pedantry if people are talking about social/moral bias as a technical flaw in the model.


> I don’t personally agree with the term “overcorrecting” because they aren’t correcting anything.

When I think of "correctness" in programming, to me that means the output of the program conforms to requirements. Presumably a lawful person who is looking for an AI assistant to sift through resumes would not consider something that is biased against disabled people to be correct or to conform to requirements.

Sure, if the requirements were "an AI assistant that behaves similarly to your average recruiter in all ways", then a discriminatory AI would indeed be correct. But I'd hope we realize by now that people -- including recruiting staff -- are biased in a variety of ways, even when they actively try not to be.

Maybe "overcorrecting" is a weird way to put it. But I would characterize what you call "correct according to the inputs" as buggy and incorrect.

> If people are discriminatory or racist or sexist, it is not correct to attempt to hide it.

I agree, but that has nothing to do with determining that an AI assistant that's discriminatory is buggy and not fit for purpose.


I don't disagree with what you wrote here; however, who gets to decide which "correcting" knobs to turn (and how far)?

The easy, obvious answer here is to "do what's right". However, if 21st century political discourse has taught us anything, it is that this is all but impossible for any one group to determine.


Agreed, and a further problem is that "do what's right" changes a lot over time.

And while “the arc of the moral universe is long, but it bends toward justice,” it gyrates a lot, overcorrecting in each direction as it goes.

Handing the control dials to an educationally/socially/politically/etc. homogeneous set of San Francisco left-wing 20-somethings is probably not the move to make. I might actually vote the same as them 99% of the time while thinking their views are insane 50% of the time.


> while thinking their views are insane 50% of the time.

As a moderate conservative I feel the exact same.


I think in this case, correctness can refer to statistical accuracy based on the population being modeled.

Remember, that's all this is: statistics, not a logical program. The model is based on population data.
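
To make that concrete, here's a minimal sketch (hypothetical, numpy only; the data and numbers are made up for illustration) of how a model that is nothing but statistics over its population data ends up "correct according to the input" and biased in the social sense at the same time:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000

    # Synthetic "population data" (made up for illustration): a protected
    # attribute and a skill score.
    protected = rng.integers(0, 2, n)
    skill = rng.normal(0.0, 1.0, n)

    # Historical decisions: driven by skill, but with a penalty applied to
    # the protected group -- the bias baked into the training data.
    hired = (skill - 0.8 * protected + rng.normal(0.0, 0.5, n)) > 0

    # The "model" here is just the empirical hire rate per group, i.e. pure
    # statistics over the population being modeled.
    for g in (0, 1):
        print(f"group {g}: historical hire rate = {hired[protected == g].mean():.2%}")

    # Any model fit to maximize accuracy on `hired` learns the same gap:
    # its output is statistically faithful to its input, which is the sense
    # in which it is "correct", and also exactly how the bias is reproduced.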


> If people are discriminatory or racist or sexist, it is not correct to attempt to hide it.

What is the purpose of the system? What is the purpose of the specific component that the model is part of?

If you're trying to, say, identify people likely to do a job well (after also passing a structured interview), what you want from the model will be rather different than if you're trying to build an artificial romantic partner.


> What is the purpose of the system?

There are those who say that the purpose of a system is what it does.


> The output is already correct according to the input (humans behaving as they are). It is not biased.

This makes sense because humans aren’t biased, hence why there is no word for or example of it outside of when people make adjustments to a model in a way that I don’t like.


>> A machine that can reason with an extremist is far more useful than one that an extremist can identify as such.

And a machine that can plausibly sound like an extremist would be a great tool for propaganda. More worryingly, such tools could be used to create and encourage other extremists. Build a convincing and charismatic AI that happens to be a racist, then turn it loose on Twitter. In a year or two you will likely control an online army.


How does a computer decide what's "extreme", "propaganda", "racist"? These are terms taken for granted in common conversation, but when subject to scrutiny, it becomes obvious they lack objective non-circular definitions. Rather, they are terms predicated on after-the-fact rationalizations that a computer has no way of knowing or distinguishing without, ironically, purposefully inserted biases (and often poorly done at that). You can't build a "convincing" or "charismatic" AI because persuasion and charm are qualities that human beings (supposedly) comprehend and respond to, not machines. AI "Charisma" is just a model built on positive reinforcement.


> These are terms taken for granted in common conversation, but when subject to scrutiny, it becomes obvious they lack objective non-circular definitions

This is false. A simple dictionary check shows that the definitions are in fact not circular.


In general, dictionaries are useful in providing a history, and sometimes an origin, of a term's usage. However, they don't provide a comprehensive or absolute meaning. Unlike scientific laws, words aren't discovered, but rather manufactured. Subsequently they are adopted by a larger public, delimited by experts, and at times recontextualized by an academic/philosophical discipline or something of that nature.

Even in the best case, when a term is clearly defined and well-mapped to its referent, popular usage creates a connotation that then supplants the earlier meaning. Dictionaries will sometimes retain older meanings/usages, and in doing so build a roster of "dated", "rare", "antiquated", or "alternative" meanings/usages throughout a term's memetic lifecycle.


Well, if you're taking that tack, then it's an argument about language in general rather than about those specific terms.


It's an issue of correlating semantics with preconceived value judgements (i.e. the is-ought problem). While this may affect language as a whole, there are (often abstract and controversial) terms/ideas that are more likely to acquire, or to have already acquired, inconsistent presumptions and interpretations than others. The questionable need to weight certain responses, as well as the odd and uncanny results that follow, should be proof enough that what a human being is expected to "just get" from other members of "society" (something I'm unconvinced happens as often as desired or claimed) is unfalsifiable or meaningless to a generative model.


I see these terms used in contexts that are beyond the dated dictionary definitions all the time.


Where, in the image set, are the people from the Indian subcontinent, who we know are a large plurality of those working at Google?



