
I've been known to start ranting when I'm searching on some topic that attracts SEO garbage (e.g., Python) and realize I'm on the nth different SEO-garbage article in the search hits.

I wonder whether a winning solution to the coming massively-parallel "AI" megaspew will emerge, how it will work, and whether Google will be the ones to do it.

Or will the bulk of consumers overwhelmingly submit, and accept it as normal, much like we already do with huge regressions in the quality of interactions and thinking?



Here’s a solution:

Go ask ChatGPT your question about Python.

Its spew is probably better than the SEO spew.

I feel like a species traitor for taking the side of the AI, but I think this is probably a decent way of getting factual, rational answers to questions that have been answered many times before. A lot of hoopla is made over the fact that you can trick it into giving a false answer by feeding it a false premise (I'd note that, from a false premise, any answer is technically true), but if it can simulate a Linux box reasonably well, it can answer your questions about Python module imports better than sifting through PageRank's collections of SEO crap.

Edit: I also think this doesn't undermine the value of Stack Overflow. Bots answering there probably does. But Stack Overflow has always struggled with simple redundant questions, trying to curate them away in favor of new and interesting questions and promoting novel, well-explained answers. ChatGPT probably won't do that for you, but it can shed the redundant load highly effectively. And over time it'll get more nuanced, as more nuanced questions and answers are incorporated. The "I need human help" button is still there, thank goodness. Maybe the load shedding will enhance the experience on SO for everyone?


The solution will be to chain Google with ChatGPT: a system that allows ChatGPT to perform Google searches and use the retrieved documents as references when formulating its response. That way we wouldn't need to actually see Google or the websites, but would still get grounded responses with references.
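A minimal sketch of that chaining idea, in Python. The search() and ask_llm() functions below are placeholders, not real APIs - you'd plug in whatever search client and model you actually have:

    # Hypothetical retrieval-augmented pipeline; both stubs are placeholders.
    def search(query: str, k: int = 3) -> list[str]:
        """Return the text of the top-k documents for the query."""
        raise NotImplementedError("plug in a search API or a local index")

    def ask_llm(prompt: str) -> str:
        """Send the prompt to whatever language model you have access to."""
        raise NotImplementedError("plug in an LLM client")

    def grounded_answer(question: str) -> str:
        docs = search(question)
        context = "\n\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
        prompt = (f"Answer using only the numbered references below, "
                  f"and cite them.\n\n{context}\n\nQuestion: {question}")
        return ask_llm(prompt)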

Instead of Google it could also be a big dump of text - say 1 TB of well-curated web text - so with local search you get more privacy and speed. And instead of ChatGPT we'd use "chatStability", an LM running privately on the local machine. This could be the end of ads and the creation of a new private creative space for people.
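For the local-search half, even plain TF-IDF over the dump would work as a first cut. A toy sketch with scikit-learn, assuming the corpus fits in memory as a list of strings:

    # Toy local retrieval over a curated text dump.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = ["how to import a module in Python", "Django URL routing basics"]
    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(docs)

    def local_search(query: str, k: int = 2) -> list[str]:
        scores = cosine_similarity(vectorizer.transform([query]), doc_vectors)[0]
        return [docs[i] for i in scores.argsort()[::-1][:k]]

    print(local_search("python imports"))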

<rant>I think Google is in big trouble and that explains why they are second when it comes to LLMs - this line of development will make it easier to escape their ads and tracking. There's no profit in it for them.

Even though the Transformer was invented at Google five years ago, of all the paper's authors only one still works there; most have left to implement their own visions at a bunch of startups. Why did they leave Google if they wanted to innovate? Because they couldn't do it inside. Google has to protect its current revenue streams first.</rant>

https://analyticsindiamag.com/how-the-authors-behind-the-tra....


Going off your point: even if we aren't quite there yet, there's no reason to suppose that AI won't eventually be massively better than humans at basically ALL things - that's certainly the direction we're headed, and very quickly too.

This is just the very beginning of the problem. It's time for people to start thinking about how to live in a world where they are "obsolete". All jobs - or, being conservative, at least the vast, vast majority - will eventually be replaced by AI. AI will eventually be able to create art that is more appealing to humans than human-created art. I think there could eventually even be AI "people", whose company many people will prefer to that of actual humans, as they will be designed/trained to be as perfect as possible for that task. I hope that most people will still see more value in genuine human interaction, as any person today probably would. But as tech evolves, we get used to it, let it into our lives more, let it control more of our lives and make more of our decisions; then it advances further and we get used to it more, and so on. The separation between human life and AI "life" might not be as apparent to future people as it is to us, and they may allow AI and technology to replace every facet of human existence.


I actually strongly disagree with the conclusion you draw here. I think these are going to be tools that enhance the human mind. They'll enable the average person to be exceptional by using the AI as guidance, and enable the brilliant to exploit the AIs in novel and clever ways that weren't possible before. The human's place is in the use of the tool; the tool doesn't replace the human.

The challenge instead will be whether there's enough useful work to keep everyone paid in a post-scarcity world. The advent of truly useful AI will probably necessitate some very, very serious thought about how we allocate and distribute capital.

But I don’t believe artists will be replaced by an AI any more than they were by the camera. Or that streaming will kill music. Etc.


I'm not too attached to my conclusion - in fact I hope it's not true - and I've also thought about what you said. But I am convinced that, eventually, humans will be capable of making autonomous AI advanced enough to "replace" them, and I am NOT convinced that humans will choose not to - especially given that, over enough time, people will slowly allow AI to control more and more of their lives and the world. The next generation will then grow up accustomed to this level of invasiveness, think it normal, and trade off even more of their freedom to AI in exchange for more convenience and comfort.

As for the concern about whether there will be enough useful work to keep everyone paid: eventually the answer is definitely no, in my opinion; it's just a matter of how far out that is. But I don't think "whether there will be enough work" is a meaningful question in that world. If AI is good enough to replace human work, the work is still being done - in fact MORE work is probably being done, and therefore more goods/resources exist (post-scarcity, as you said). It's a question of whether humans can distribute those resources fairly - and that question remains the same whether people need to work or not. We already have more than enough resources to keep everyone in the world fed and comfortable; assuming we don't destroy the planet, this will only be more true when the AI workforce comes into play.

It's easy to assume that in such a world there will be enough resources for everyone to live very comfortably even while the upper class retains extremely excessive control over resources - and, as such, that greed (the main obstacle to a fair outcome) wouldn't have to come into play, since the rich and power-hungry could have everything they want without having to let poor people starve to death.

Unfortunately I'm not convinced that assumption is valid either, because the psychological factor driving the greed of the ultra-rich isn't the desire for material possessions and wealth in itself, but the desire to have "more" - more stuff, more capital, more power than others. They don't intrinsically love 100-million-dollar yachts so much that they choose to buy three of them instead of, say, saving the lives of millions of starving children. What they love is the inflation of their ego and the feeling of power and superiority. If every middle-class person suddenly had a 100-million-dollar yacht, those people wouldn't be satisfied with their own anymore. Because of this, I think the ultra-rich and ultra-powerful - who will of course be the ones in power and able to influence how resources are distributed in a post-scarcity world - will be motivated to keep the lower classes "lower", in order to maintain their ego and satisfy their power craving (though they will of course justify it in less crude ways).

To make my point more simply: if Jeff Bezos could snap his fingers right now and magically make every single person on earth exactly as rich as he is - give them all the same resources and material possessions, at no cost to himself - I am quite convinced he would not do it. This is a simpler version of exactly the decisions those in power will have to make over the next centuries. In their case it will probably be easier to justify to themselves, since it won't be as black and white as denying fortune to billions of people at no cost. That will be exactly what they're doing, but spread over the course of many small decisions, each justifiable in its own way, such that they never have to realize it.


I've almost decided to limit my searches to the search engines of the authoritative sites. Examples in the case of Python: the Python language site, the Django framework, Stack Overflow (until it surrenders to AI posts; then we'll see what happens). I'll end up creating a home page with search boxes for those sites. Very '90s.
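A quick-and-dirty version of that home page, sketched in Python rather than HTML: build site-restricted queries (the site: operator works on DuckDuckGo and Google) and open them in the browser. The site list is just an example:

    # Open a site-restricted search for each authoritative site.
    import webbrowser
    from urllib.parse import quote_plus

    SITES = ["docs.python.org", "docs.djangoproject.com", "stackoverflow.com"]

    def search_authoritative(query: str) -> None:
        for site in SITES:
            q = quote_plus(f"site:{site} {query}")
            webbrowser.open(f"https://duckduckgo.com/?q={q}")

    search_authoritative("dataclass default factory")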


Ironically, the 90s solution (human-curated link collections) to Internet-wide searches seems the easiest way out.

It's relatively easy for a human to tell the difference between AI copy (as pollutes the web) and actual content.

It's also strange that Google et al. haven't built human-in-the-loop flagging of spam content into their search UX ("Is this site low quality?"). They could get around flag spam via reputation, given that Google knows the identity of most of its users (because they're logged in, or via IP across its properties).
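A toy sketch of that reputation-weighted flagging; the threshold and weighting scheme here are made up for illustration:

    # Aggregate "low quality" flags, weighted by each flagger's reputation.
    def spam_score(flags: list[tuple[str, float]]) -> float:
        """flags: (user_id, reputation in [0, 1]) pairs for one site."""
        return sum(rep for _, rep in flags)

    def is_low_quality(flags: list[tuple[str, float]],
                       threshold: float = 5.0) -> bool:
        # 100 throwaway accounts (rep ~0.01) count less than 6 trusted users.
        return spam_score(flags) >= threshold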

Side note: in cursory research on the current state of things, I came across the Chinese phrase "human flesh search engine" [0], and I wonder whether it sounds as cyberpunk in the original Chinese.

[0] https://en.m.wikipedia.org/wiki/Human_flesh_search_engine


> It's relatively easy for a human to tell the difference between AI copy (as pollutes the web) and actual content.

It is now. It definitely won't be when GPT gets popular.


It will be: GPT output is usually either garbage, repetitive, very wrong, subtly wrong, or inconsistent in writing style in a way human writing isn't. (It sort of starts out trying to say one thing in the "same style" it began with, then switches style later.)
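One crude, stdlib-only proxy for that style-consistency heuristic: human prose tends to have "burstier" sentence lengths than model output, so unusually low variance is a weak signal. Purely illustrative, and the cutoff is arbitrary:

    # Crude burstiness check: low sentence-length variance hints at machine text.
    import re
    from statistics import mean, pstdev

    def burstiness(text: str) -> float:
        sentences = [s for s in re.split(r"[.!?]+\s+", text) if s.strip()]
        lengths = [len(s.split()) for s in sentences]
        if len(lengths) < 2 or mean(lengths) == 0:
            return 0.0
        return pstdev(lengths) / mean(lengths)  # coefficient of variation

    def looks_machine_written(text: str, threshold: float = 0.25) -> bool:
        return burstiness(text) < threshold  # arbitrary cutoff, illustration only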

When humans really can't tell the difference, well… https://xkcd.com/810/ seems appropriate.


The GPT output I've been generating with OpenAI's chatbot is indistinguishable from what the bottom 70% of commenters put out.

But an environment glutted with that content is also worthless. I read for the top 10% that has genuinely novel thoughts.


In most cases, judging whether something was created by GPT takes quite a while. It's not like a search result you can easily dismiss.



Without revealing my source: the latest models being tested in private betas solve many of the problems you mention.


This is why I'm not leaking all my discriminating heuristics. https://en.wikipedia.org/wiki/Goodhart%27s_law strikes again.


At some point you also run up against the virus vs immune system endgame: it's evolutionarily inefficient to optimally outwit an adversarial system.

Usually there exists a local maximum where you obtain most of the value with less-than-full effort.

Sadly, the centralization of search cuts directly against that. I.e., outwitting Google search = $$$$, vs. outwitting one of four equally used search engines = $. :(


> It's relatively easy for a human to tell the difference between AI copy (as pollutes the web) and actual content.

No:

https://en.wikipedia.org/wiki/Sokal_affair


The Sokal affair says more about the dynamics of academic publishing than it does about humans' ability to pick up on nuances.

And "as pollutes the web" -> we're not talking about state of the art.


Your post led me to an interesting little quick experiment.

So in 2020 I taught myself Python during Advent of Code. I remember remarking that official python.org results seldom made it to the top of Google searches. I was used to Java and Julia searches taking me right to the official sites.

Anyway, as an experiment, just search for "[language] [feature]". I ended up with W3Schools as my top result for "Java ArrayList", but the Oracle JDK 8 docs for "Java Socket". "Python Socket" takes me to the language reference, but "Golang Socket" does not.

Maybe there's a trend to be discovered.
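If you wanted to chase that trend systematically, you could collect the top hit for each query by hand (or via whatever search API you have) and check it against the official-docs domains. A sketch, with example data standing in for real results:

    # Check hand-collected top search hits against official documentation domains.
    from urllib.parse import urlparse

    OFFICIAL = {"java": "docs.oracle.com", "python": "docs.python.org",
                "go": "pkg.go.dev"}

    # (language, query, observed top-hit URL) -- example data, not real results
    results = [
        ("java", "Java ArrayList", "https://www.w3schools.com/java/java_arraylist.asp"),
        ("python", "Python Socket", "https://docs.python.org/3/library/socket.html"),
    ]

    for lang, query, url in results:
        hit = urlparse(url).netloc == OFFICIAL[lang]
        print(f"{query!r}: official docs on top? {hit}")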


You could also put something like searx [1] in front to search all at once (or at least provide unified interface for all).

[1] https://searx.github.io/searx/


The solution is to create datasets that contain data and aesthetic judgment metadata from reputable sources together, and train models to perform aesthetic judgment. This will both filter out spam/low quality content, and provide a conditioning tool for generative models to improve their output. Even better, the arms race this will kick off will create a virtuous cycle that will push progress in AI.

The only problem with aesthetic judgment models is that (much like human aesthetic judges) it will sometimes give poor scores to things that are dissimilar to the data it was trained on, even though many people might find those things aesthetically appealing.
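A toy version of such a judgment model: a bag-of-words classifier trained on (text, quality-label) pairs. The training data here is invented for illustration; a real aesthetic model would be far richer:

    # Toy "aesthetic judgment" model: TF-IDF features + logistic regression.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    texts = ["thoughtful essay with novel analysis",
             "BEST cheap deals click here top 10 python tricks"]
    labels = [1, 0]  # 1 = high quality, 0 = spam (the curated judgment metadata)

    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, labels)
    print(model.predict_proba(["click here for python deals"])[0][1])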


>from reputable sources

What does this even mean? Reputable sources tend to fail a sort of 'Chinese room' test: you can never tell whether your source is actually reputable or just faking being reputable (or has since become corrupted).


Well, if we were talking about food, I'd say Gordon Ramsay, David Chang, Anthony Bourdain or The New York Times could be considered reputable, while a Yelp reviewer with a handful of reviews could be considered disreputable.

Ultimately you can boil it down to: trust sources when they make statements I later observe to be true, or when they're trusted by other sources I trust. The negative feedback loop: if a source made a statement I later found to be untrue, or extended trust to an untrustworthy source, reduce that source's trust.
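That update rule looks a lot like iterative trust propagation (in the spirit of EigenTrust/PageRank). A toy sketch over a who-trusts-whom graph; the names and damping factor are made up:

    # Toy trust propagation: my trust in a source flows to sources it trusts.
    DAMPING = 0.5  # how much second-hand trust counts, chosen arbitrarily

    trusts = {"me": ["nyt", "bourdain"], "nyt": ["chang"], "bourdain": ["chang"]}

    def propagate(seed: str, rounds: int = 3) -> dict[str, float]:
        trust = {seed: 1.0}
        for _ in range(rounds):
            for src, endorsed in trusts.items():
                for e in endorsed:
                    gained = trust.get(src, 0.0) * DAMPING / len(endorsed)
                    trust[e] = max(trust.get(e, 0.0), gained)
        return trust

    print(propagate("me"))  # chang earns trust only via nyt/bourdain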


The issue here is that those are pre-AI reputation sources; the problem is, for post-AI content generation, how are you supposed to find good content in a nearly infinite ocean of 'semi-garbage'? Anthony Bourdain isn't making new content, and one day the others will be gone too. In the meantime, a million 'semi-trustable' AI content sources, with varying reach depending on how they've been promoted, will take over those markets.

There are any number of particular problems here that we already know don't mesh well with how humans think. You'd start your AI source off 'true', build up a following, and then slowly lead them into the QAnon pit of insanity. Most people would follow your truth and fight anything that questions it.


Search engines like Google are probably a dying technology. The next disruptive innovation in that space will be "answer engines" that use GPT-like technology to dynamically generate specific answers to questions rather than linking users to third-party sites. This will let the provider control the entire user experience and serve ads embedded in the generated content.



I don't think there will be ads. We'll be able to run a local chatbot just like we run SD.


SD is only ~4 GB and you can run it on a single 8-10 GB GPU. GPT-3 is around 350 GB, so roughly 100x bigger; 99% of humanity currently couldn't run it locally even if they wanted to.
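As a rough sanity check on that figure: GPT-3 has 175 billion parameters, and at 2 bytes per parameter (fp16) that's 175e9 × 2 ≈ 350 GB for the weights alone, before activations or any caching.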


There is FLAN-T5, and hopefully more open models that work on a regular computer will follow.



