benxh's comments | Hacker News

Crazy calling sovereign states "US Puppets".


Maybe the term "vassal states" would be more appropriate.


It is arguable that the new Minimax M2.1 and GLM-4.7 are drastically above Sonnet 3.7 in capabilities.


Could you share some impressions of using them? How do they feel compared to OAI models or Claude?


Minimax has been great for super high-speed web/JS/TS-related work. In my experience it compares to Claude Sonnet, and at times gets results similar to Opus. Design-wise it produces some of the most beautiful AI-generated pages I've seen.

GLM-4.7 feels like a mix of Sonnet 4.5 and GPT-5 (the first version, not the later ones). It has deep, deep knowledge, but it's often just not as good in execution.

They're very cheap to try out, so you should see how your mileage varies.

Of course, for the hardest possible tasks that GPT 5.2 only approaches, they're not up to scratch. And for the hard-ish tasks (in C++, for example) that Opus 4.5 tackles, Minimax feels closer, but just doesn't "grok" the problem space well enough.


Cline is used by a lot of devs.


Yeah, I was freaking out, but it turns out it's not the usual Cline extension (whose extension ID is saoudrizwan.claude-dev).


The longer "it" reasons, the more attention sinks are used to come to a "better" final output.


I’ve looked up attention sinks and can’t figure out how you’re using the term here. It sounds interesting, would you care to elaborate?


To prove you right, you can read up on the incredible giga-brained countrywide experiments by Kardelj in Socialist Yugoslavia [0]. The result was a country where no one wanted to work and everyone had a great standard of living (while the IMF didn't call in its loans). And then the entire country collapsed all at once under the accumulated mismanagement.

[0] - https://en.wikipedia.org/wiki/Workers%27_self-management#Yug...


It's wild to me that despite tremendous resources and 100+ years of time, capitalism still kills millions of people a year with starvation and preventable diseases. But every right-winger has a pet Wikipedia page about a failed communist state, with no critical examination of why it failed beyond "communism bad".

To clarify my stance: I'm an anarchist, and that page has a lot of good examples of successful worker-owned collectives.


There are good, critically examined rebuttals if you actually look for them beyond Wikipedia, which is not designed for that purpose. I recently read a book called Socialism: The Failed Idea That Never Dies, and while it has a clickbait title, the arguments are pretty cogent as to why people throughout history want to enact socialism-based systems and why they eventually fail.


> capitalism still kills millions of people a year with starvation and preventable diseases

As if, somehow, it's the responsibility of those capitalists to ensure those starving people have food and their diseases cured, for nothing in return.


Why else do we even bother making a society?


To me it's equally wild that you say such a thing when no system in human history has done as much as capitalism to alleviate hunger and disease. In fact, all other systems combined still can't touch the progress we've made toward eradicating famine and disease while "under capitalism".


My biggest gripe with Ollama is the badly named models, e.g. under deepseek-r1 it defaults to the distilled models.


I agree they should rename them.

But defaulting to a 671b model is also evil.


No. If you can't run it, and most people can never run the model on their laptop, that's fine; let people know that fact instead of giving them an illusion.


Letting people download 400GB just to find that out is also .. not optimal.

But yes, I have been "yelled" at on Reddit for telling people you need VRAM in the hundreds of GB.


> Letting people download 400GB just to find that out is also .. not optimal.

Letting people download any amount of bytes just to find out they got something else isn't optimal. So what to do? Highlight the differences when you reference them so people understand.

Tweets like these: https://x.com/ollama/status/1881427522002506009

> DeepSeek's first-generation reasoning models are achieving performance comparable to OpenAI's o1 across math, code, and reasoning tasks! Give it a try! 7B distilled: ollama run deepseek-r1:7b

Are really misleading. Reading the first part, you'd think the second part refers to the model that gives "performance comparable to OpenAI's o1", but it's not; it's a distilled model with much worse performance. Yes, they do say it's the distilled model, but I hope I'm not alone in seeing how less careful people could confuse the two.

If they're doing this on purpose, it'd leave a very bad taste in my mouth. If they're doing this accidentally, it also gives me reason to pause and re-evaluate what they're doing.


At least the distilled models are officially provided by DeepSeek (?)


I'm pretty sure that Neosync [0] does this to a pretty good degree; it is open source and YC-funded too.

[0] https://www.neosync.dev/


I am assuming this will be solved this year.


If GPT-4 is 220B split across 8 experts, that would be in line with 3.5 Turbo being a 20B model, and GPT-4 having roughly 55B activated parameters out of a total of 220B.

It is ultimately all speculation until DeepSeek releases their own 145B MoE model, and then we can compare the activations/results.


I think the conjecture is that each expert of GPT-4 has 220B parameters, for a total of 1.76T parameters.
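A quick back-of-the-envelope sketch of the arithmetic behind both conjectures may help; the top-2-of-8 routing and the moe_params helper below are illustrative assumptions, not confirmed GPT-4 details:

    # Rough MoE parameter math; expert counts and top-2 routing are
    # assumptions from this thread, not confirmed figures.
    def moe_params(expert_size_b, num_experts, experts_per_token):
        """Return (total, active) parameter counts in billions for a simple MoE."""
        total = expert_size_b * num_experts
        active = expert_size_b * experts_per_token
        return total, active

    # Conjecture A: 220B total split across 8 experts, 2 experts active per token.
    print(moe_params(expert_size_b=220 / 8, num_experts=8, experts_per_token=2))
    # -> (220.0, 55.0)   i.e. ~55B active out of 220B total

    # Conjecture B: each expert is 220B, 8 experts, 2 active per token.
    print(moe_params(expert_size_b=220, num_experts=8, experts_per_token=2))
    # -> (1760.0, 440.0) i.e. ~440B active out of ~1.76T total

Either way, it is the active parameter count rather than the total that drives per-token inference cost, which is why the comparison to a dense ~20B 3.5 Turbo hinges on which conjecture is right.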


I personally was affected by this fire, although I've always kept 3 months of backups of production data, encrypted, on-site, just in case of emergencies like this. I haven't touched their services for anything production-related ever since.

