Hacker News

[flagged]


ChatGPT would be worthless without training material like SO.


It can’t have been trained on THAT much SO because it’s never told me my question is off topic or is a repeat of a different question. :P


Stack Overflow is not for you and your question; it's for the person doing a Google search with the same question months or years later.

A machine that produces a large set of annotated FAQs.


I’ve never posted a question to S.O. My infuriation is entirely gratuitous. So many times I’ve found a polite, well-worded question asking exactly what I need answered, only to see it closed as off topic (and we’re talking “question about preg_match() on the PHP stackexchange” type questions) or for some condescending asshole to mark it as duplicate, linking a mostly unrelated and far simpler question with no further indication why this might be at all the proper response.


Not really. LLMs are good at indexing and digesting documentation, up to and including actual source code, and answering questions about it.

And they never "Vote to close as duplicate" because somebody asked something vaguely similar 10 years ago about a completely different platform and didn't get a good answer even then.

Stack Overflow is the taxi industry to AI's Uber. We needed it at one point, but it really always sucked, and unsurprisingly some people took exception to that and built something better, or at least different.


> LLMs are good at indexing and digesting documentation, up to and including actual source code, and answering questions about it.

Requires citations not in evidence. Source code and documentation rarely co-exist, and even the best source code is not even close to well-described by documentation of the software it is a part of. I basically call BS.


OpenAI, through Microsoft + Github, has access to unfathomable amounts of source code training data and would be just fine without StackOverflow.


SO provided the connection between natural language (primarily English) and source code. Access to source code alone doesn't do that, commented code notwithstanding.


I don't suspect that SO alone is anywhere near sufficient to train LLMs to predict solutions to coding problems and write code. There must be additional training going on with tagged sets. I've heard about people being employed by AI companies to solve programming problems just for the sake of generating training pairs.


No, compsci textbooks and language manuals do that. SO is not the primary, canonical educational resource you seem to think it is, and they'd be the first to agree.


By and large, compsci textbooks are not sources of large amounts of working code in a specific language. Some programming-oriented ones may be; does Numerical Recipes in C count as a comp sci book?


True, I was assuming that people would think a bit more abstractly, or at least a bit more generously, but sometimes I forget where I am. By "compsci" I mean everything from graduate-level theoretical texts all the way down to "101 BASIC Programs for the TRS-80."

In the old days, magazine articles would also present practical code alongside plaintext explanations of how it worked. There's still no shortage of tutorial content, although not as much in paper form, and even less on Stack Overflow.


> No, compsci textbooks do that.

No, they don't.


Sigh. Yes, you're right, programming textbooks do that. Now, where's my cilice...


> Source code and documentation rarely co-exist

They may not co-exist in real life, but in a million-dimension latent space you'd be surprised how many shortcuts you can find.

> Requires citations not in evidence.

If you didn't bother to read the foundational papers on arxiv or other primary sources, it'd be a waste of time for me to hunt them down for you. Ask your friendly neighborhood LLM.


OMG YES, that site needed to die! I posted a few times on subjects I was an expert in, and hence they were difficult issues, and no one would ever answer them.

The few other times I posted they were questions about things I wasn't an expert in, hence why I was asking, and my god, it was like I was pulling them away from their busy schedules and costing them time at work. It's like you don't have to answer if you have something better to do.


You're absolutely correct!



