I agree that it should be open-source, but I think it can still be a YC company. Improving the user experience on the web is definitely a billion-dollar market.
It's a freaking browser extension. Not trying to insult anyone or be negative, but I genuinely don't understand why anyone would invest money into this.
The only valid reasons to participate in Hacker News are to get your startup funded, to get hired by one of the YC startups, or to sell something. Otherwise it doesn't really make sense to participate in this forum anymore, especially if you are just giving people free product development advice.
Respectfully disagree. Why is it colloquially known as "Hacker News", and not, say, "Startup Forum"? My favorite articles & content on Hacker News are where I stay up to date on technology and what people are doing--which is very literally in line with the name "Hacker News".
This looks really slick, though it's a bummer that there isn't a quick way to try the hosted version. You mentioned the Vercel UX in the comments, and I think the single-click install on the hosted version is a significant part of it.
If you've ever watched Claude Code/Codex use grep, you'll see that it constructs complex queries that encompass a whole range of keywords which may not even be present in the original user query. So the 'semantic meaning' isn't actually lost.
And nobody is putting an entire enterprise's knowledge base inside the context window. How many enterprise tasks are there that need to reference more than a dozen docs? And even those that do can be broken down into sub-tasks of manageable size.
Lastly, nobody here mentions how much of a pain it is to build, maintain and secure an enterprise vector database. People spend months cleaning the data, chunking and vectorizing it, only for newer versions of the same data to make it redundant overnight. And good luck recreating your entire permissioning and access control stack on top of the vector database you just created.
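A rough sketch of what that kind of expanded query looks like, assuming a hand-written keyword list stands in for whatever the model would actually generate. The `expand_query` helper, the keyword list, and the sample files are all made up for illustration:

```python
import re

def expand_query(terms):
    """Build one case-insensitive alternation from several related keywords."""
    # Escape each term so literal characters are not treated as regex syntax.
    return re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)

def grep(pattern, docs):
    """Return (doc name, line number, line) for every matching line."""
    hits = []
    for name, text in docs.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((name, lineno, line))
    return hits

# The user asked about "login failures"; none of these files use the word "login".
docs = {
    "auth.py": "def check_credentials(user):\n    raise AuthError('invalid token')",
    "README.md": "Sign-in issues are usually caused by expired sessions.",
    "billing.py": "def charge(card):\n    pass",
}

# Hypothetical expansion an agent might come up with for that query.
pattern = expand_query(["login", "sign-in", "auth", "credential", "session"])
matches = grep(pattern, docs)
```

The literal query term never appears in the files, yet the relevant lines in `auth.py` and `README.md` are still found because the alternation covers related vocabulary.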
The RAG obituary is a bit provocative, and maybe that's intentional. But it's surprising how negative/dismissive the reactions in this thread are.
The article doesn't properly distinguish between scales, probably because the problem they solved was small. At small scale (<10K documents/files), everything can be easily processed with grep, find etc. At larger scale (>1M documents), you will need to use search engine technology. You can definitely use the same agent approach for the large-scale problem: we essentially need to search, look at the results, and issue follow-up queries to get the documents of interest.
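The search, inspect, follow-up loop can be sketched roughly like this. Everything here is an assumption for illustration: `toy_search` is a naive keyword matcher standing in for a real search engine, and `make_follow_refs` stands in for the model deciding what to query next (here it just follows "See also" references).

```python
import re

CORPUS = {
    "doc-1": "Q3 revenue summary. See also: churn report.",
    "doc-2": "Churn report: cancellations rose in August. See also: pricing memo.",
    "doc-3": "Pricing memo: the August price change.",
    "doc-4": "Office seating chart.",
}

def tokens(text):
    return set(re.findall(r"\w+", text.lower()))

def toy_search(query, top_k=2):
    """Stand-in for a search engine: rank by keyword overlap, return top_k hits."""
    scored = []
    for doc_id, text in CORPUS.items():
        overlap = len(tokens(query) & tokens(text))
        if overlap:
            scored.append((overlap, doc_id, text))
    scored.sort(reverse=True)
    return [(doc_id, text) for _, doc_id, text in scored[:top_k]]

def make_follow_refs():
    """Stand-in for the model: the next query is an unfollowed 'See also' reference."""
    asked = set()
    def next_query(question, seen):
        for text in seen.values():
            m = re.search(r"See also: ([^.]+)\.", text)
            if m and m.group(1) not in asked:
                asked.add(m.group(1))
                return m.group(1)
        return None  # nothing left to follow up on
    return next_query

def agentic_search(question, search, next_query, max_rounds=5):
    """Search, look at the results, and issue follow-up queries until nothing new turns up."""
    seen = {}
    query = question
    for _ in range(max_rounds):
        new = {d: t for d, t in search(query) if d not in seen}
        if not new:
            break
        seen.update(new)
        query = next_query(question, seen)
        if query is None:
            break
    return seen

found = agentic_search("Q3 revenue", toy_search, make_follow_refs())
```

Only the search backend changes between the small-scale and large-scale cases; the loop around it stays the same.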
All that said, for the types of problem the OP is solving, it might just be better to create a project in Claude/ChatGPT and throw in the files there and get done with it. That approach has been working for over 2 years now and is nothing new.
2.0 Flash is significantly cheaper than 2.5 Flash, and is/was better than 2.5-Flash-Lite before this latest update. It's a great workhorse model for basic text parsing/summary/image understanding etc. Though looks like 2.5-Flash-Lite will make it redundant.
I meant it when I said these smaller models are great. They open up entirely new use cases and I appreciate the work that went into creating them.
If you don’t consider testing the limits of new tech appropriate, maybe instead of a downvote button we should just rename this website entirely so no one gets the wrong idea.
1.2B went to investors, the remaining 1.2B was actually an incentive/payout for the founders/employees that google took. The company basically has whatever money it had in the bank, plus a bit more from Google - but no investor liabilities.
Ok, Google can pay $1.2B to the CEO and key employees to get them to walk. The other $1.2B is for the Windsurf IP and it cannot go directly to the investors. It has to go through the company where it is first revenue and then an asset.
But Windsurf could distribute profit at this point before the Cognition deal. I guess this is where the preference rights got exercised. The tweet from employee #2 said his stock wasn't worth anything. Actually, he got preferenced out of the $1.2B in dividends.
Then came the $250M Cognition deal. He got preferenced out of the proceeds of the Cognition deal as well.
On multiple occasions, Claude Code has claimed it completed a task when it actually just wrote mock code. It will also answer questions with certainty (e.g. where is this value being passed), but in reality it is making it up. So if you haven't been seeing hallucinations on Opus/Sonnet, you probably aren't looking deep enough.
This is because you haven't given it a tool to verify the task is done.
TDD works pretty well: have it write even the most basic test (or go full artisanal and write it yourself) first and then ask it to implement the code.
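As a minimal illustration of that order of operations (the `parse_duration` function is a made-up example, not from any real project): the test exists first, and the implementation is only what is needed to make it pass.

```python
import re

# Step 1: the test, written before any implementation exists.
def test_parse_duration():
    assert parse_duration("90s") == 90
    assert parse_duration("2m") == 120
    assert parse_duration("1h30m") == 5400

# Step 2: the implementation the agent is then asked to write against that test.
_UNITS = {"h": 3600, "m": 60, "s": 1}

def parse_duration(text):
    """Convert strings like '1h30m' into a number of seconds."""
    parts = re.findall(r"(\d+)([hms])", text)
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(int(n) * _UNITS[u] for n, u in parts)
```

With the test in place, "done" has a concrete meaning the agent can check for itself instead of asserting it.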
I have a standing order in my main CLAUDE.md to "always run `task build` before claiming a task is done". All my projects use Task[0] with pretty standard structure where build always runs lint + test before building the project.
With a semi-robust test suite I can be pretty sure nothing major broke if `task build` completes without errors.
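The same idea as a small gate script, in case the instruction in CLAUDE.md isn't enough on its own. This is only a sketch: `verified_done` is a hypothetical helper, and it assumes the check command (such as `task build`) signals failure through a non-zero exit code.

```python
import subprocess
import sys

def verified_done(command):
    """Run the project's check command; True only if it exits with status 0."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the failure output so there is something concrete to fix.
        print(result.stdout + result.stderr, file=sys.stderr)
    return result.returncode == 0
```

In a project set up like the one above, it would be called as `verified_done(["task", "build"])`, and a task only counts as finished when that returns `True`.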
What do you think it is 'mocking'? It is exactly the behavior that would make the tests work. And unless I give it access to production, it has no way to verify tasks like how values (in this case secrets/envs) are being passed.
Plus, this is all beside the point. Simon argued that the model hallucinates less, not a specific product.