
Or could it be that it's using tool calls during reasoning (e.g., a Google search)?


One other test you could add is generating a chessboard from a FEN. I was surprised to see NBP able to do that (though it seems to only work with fewer pieces; past a certain number it makes mistakes or even generates a completely wrong image): https://files.catbox.moe/uudsyt.png
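If anyone wants to generate a reference board to compare the output against, here's a minimal sketch using the python-chess library (my choice for rendering, nothing to do with how the model does it):

    # Render a FEN position as an SVG board for comparison.
    # Assumes the python-chess package: pip install python-chess
    import chess
    import chess.svg

    fen = "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3"
    board = chess.Board(fen)

    with open("board.svg", "w") as f:
        f.write(chess.svg.board(board, size=400))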


Is there a reason this text uses "-" in place of em-dashes ("—")?


Since they are set open, I assume they are actually standing in for en-dashes rather than em-dashes (the more common style for em-dashes is to set them closed). But I'm guessing that, in either case, the reason is "because you can type it on a normal keyboard without any special modification, Compose-key solution, or other processing, and the author doesn't care much about typography".

EDIT: Though these days it could also be an attempt at highly visible "AI didn't write this" virtue signaling.


Yes: because - is on the keyboard and — isn't. (Don't tell me how to type —; I know how. But despite that, it's still the reason, which is what the parent comment asked about.)


Many people have for decades. Seems fine to me.


Is there a reason you phrased the question that way, instead of just asking whether it was written by AI?


It's just that I have the feeling people avoid using the actual em-dash for fear of being accused of posting AI-generated text (which isn't a valid indicator anyway). Maybe it's just my perception that I notice this more since LLMs became popular.


My original word processor corrected "—-" to an em-dash, which I would get rid of because it didn't render correctly somewhere in the translation between plaintext, Markdown, and HTML (sort of how it butchered "- -" just now on HN).

But what you'd see in your browser was "square blocks".

So I just ran output through some strings/awk/sed (server side) to clean up certain characters, which I now know specifying UTF-8 encoding fixes altogether.

TL;DR: the "problem" was "let's use WordPress as a CMS and composer, but spit it out in the same format as its predecessor software and keep generating static content that uses the design we already have".

Em-dashes needed to be double dashes due to a longstanding oversight.

The Original Sin was Newsmaker, which had a proprietary format that didn't work in anything else and needed some Perl magic to spit out plaintext.

I don't work in that environment or even that industry anymore, but I took with me the hacky methodology my then-boss and I came up with together.

SO,

1) I still have a script that gets rid of them when publishing, even though it's no longer necessary. And it's been doing THAT since before "LLMs" were mainstream.

And 2) now that people ask "did AI write this?", I still continue a long-standing habit of getting rid of them when manually composing something.

Funny story, though: after twenty years of just adding more and more post-processing kludge, I finally screamed AAAAAAAAHAHHHH WHY DOES THIS PAGE STILL HAVE SQUARE BLOCKS ALL OVER IT at "Grok."

All that kludge and post-processing, solved by adding UTF-8 encoding in the <head>, which an "AI" helpfully pointed out in about 0.0006s.
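For anyone curious what was actually happening under the hood, a toy illustration in Python (not the original strings/awk/sed pipeline): UTF-8 bytes decoded under a wrongly guessed legacy encoding are exactly what turn a single em-dash into junk characters or boxes.

    # Toy demo: why an undeclared charset mangles em-dashes.
    text = "wait \u2014 what?"          # \u2014 is the em-dash
    raw = text.encode("utf-8")          # b'wait \xe2\x80\x94 what?'

    # A browser guessing Windows-1252 instead of UTF-8 sees three junk
    # characters (or boxes) where the dash should be:
    print(raw.decode("windows-1252"))

    # Declaring the encoding, e.g. <meta charset="utf-8"> in the <head>,
    # removes the guesswork:
    print(raw.decode("utf-8"))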

That was about two weeks ago. Not sure when I'll finally just let my phone or computer insert one for me. Probably never. But that's it. I don't hate the em-dash. I hate square blocks!

Absolutely nothing against AI. I had a good LONG recovery period where I could not sit there and read a 40-100 page paper or a manual anymore, and I wasn't much better at composing my own thoughts. So I have a respect for its utility, and I fully made use of that for a solid two years.

And it just fixed something that I'd overlooked because, well, I'm infrastructure. I'm not a good web designer.


Will we know AGI has been achieved when it stops using em-dashes?


Any AI smart enough not to use em-dashes will be smart enough to use them.


This problem wouldn't exist if OpenAI didn't store chat logs (which of course they want to do, so that they can train on that data to improve the models). But calling the NYT the bad guy here is simply wrong, because it's not strictly necessary to store that data at all, and if you do, there will always be a risk of others getting access to it.


This isn't memory until the weights update as you talk. (The same applies to ChatGPT.)


Maybe this is just some niche use case, but I tested it with a 268x98 PNG screenshot, and it made the image bigger and worse: https://files.catbox.moe/7so3z6.png


JPEG is for photos.

For a white screen with black text, PNG is also compressed, and it's lossless.

People should not be using PNG for photos. If they are using PNG properly, converting to JPEG is a mistake.
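If you want to sanity-check that for a given screenshot, here's a rough sketch using Pillow ("screenshot.png" is a placeholder filename): re-encode the image both ways and compare sizes. For flat text/UI images, the lossless PNG is usually smaller as well as sharper.

    # Rough size-comparison sketch; assumes Pillow (pip install Pillow).
    import io
    from PIL import Image

    img = Image.open("screenshot.png").convert("RGB")  # hypothetical input

    def encoded_size(image, fmt, **opts):
        buf = io.BytesIO()
        image.save(buf, format=fmt, **opts)
        return buf.tell()

    print("PNG bytes: ", encoded_size(img, "PNG", optimize=True))
    print("JPEG bytes:", encoded_size(img, "JPEG", quality=85))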


Fair point.

Tiny, high-contrast UI screenshots are a worst-case for JPG—size can grow and edges get mushy.

PNG is the right choice here.


Or use the 'privacy redirect' extension, which lets you specify your preferred Nitter instance. It also works for other platforms.



Who's the suspect behind the cyberattack? Russia? Or another ransomware group?


You can schedule test calls and call during the scheduled time slot (US only):

Test calls confirm that your local 911 service can receive your 911 call and has the correct location information. Test calls can be scheduled by contacting your local 911 call center via its non-emergency phone number.

In your preferred search engine, search the key words “emergency communications center non-emergency number" and include the names of the city or town, state, and county or parish in your search. Test calls may need to be scheduled and are usually based on the workload experienced at the PSAP.

For more information, visit the National Association of State 911 Administrators site.

Please do not call 911 to obtain the non-emergency number.

https://www.911.gov/calling-911/frequently-asked-questions/#....


Seems like AI bots are indeed bypassing the challenge by computing it: https://social.anoxinon.de/@Codeberg/115033790447125787


That's not bypassing it; that's them finally engaging with the PoW challenge as intended, making crawling slower and more expensive, instead of failing to crawl at all (which is more of a plus).

This, however, forces servers to increase the challenge difficulty, which increases the waiting time for first-time access.
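For anyone unfamiliar with the mechanism, a simplified sketch of this kind of SHA-256 proof-of-work challenge (not Anubis's actual code): the client grinds nonces until the hash meets a difficulty target, and the server only has to check one hash.

    # Simplified PoW sketch: find a nonce so that sha256(challenge + nonce)
    # starts with `difficulty` hex zeroes. Higher difficulty = more waiting
    # for first-time visitors, but also more cost per page for a crawler.
    import hashlib

    def verify(challenge: str, nonce: int, difficulty: int) -> bool:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        return digest.startswith("0" * difficulty)

    def solve(challenge: str, difficulty: int) -> int:
        nonce = 0
        while not verify(challenge, nonce, difficulty):
            nonce += 1
        return nonce

    n = solve("example-challenge", difficulty=4)   # ~65k hashes on average
    print(n, verify("example-challenge", n, 4))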


Obviously the developer of Anubis thinks it is bypassing: https://github.com/TecharoHQ/anubis/issues/978


Fair. Then I obviously think Xe may have a kinda misguided understanding of their own product. I still stand by the concept I stated above.


Latest update from Xe:

> After further investigation and communication. This is not a bug. The threat actor group in question installed headless chrome and simply computed the proof of work. I'm just going to submit a default rule that blocks huawei.


This kinda proves the entire project doesn't work if they have to resort to manual IP blocking lol


It doesn't work against headless Chrome, sure. The thing is that often, for threats like this to work, they need lots of scale, and they need it cheaply, because the actors are just throwing a wide net and hoping to catch something. Headless Chrome doesn't scale cheaply, so by forcing script kiddies to use it you're pricing them out of their own game. For now.


Doesn't have to be black or white. You can have a much easier challenge for regular visitors if you block the only (and giant) party that has implemented a solver so far. We can work on both fronts at once...


The point is that it isn't "implementing a solver", it's just using a browser and waiting a few seconds.


That counts as something that can solve it, yes. Apparently there's now exactly one party in the world that does that (among the annoying scrapers that this mechanism targets). So until there are more...


The point is that it will always be cheaper for bot farms to pass the challenge than for regular users.


Why does that matter? The challenge needs to stay expensive enough to slow down bots, but legitimate users won't be solving anywhere near the same number of challenges, and the alternative is the site getting crawled to death, so they can wait once in a while.


It might be a lot closer if they were using Argon2 instead of SHA. SHA is kind of a bad choice for this sort of thing.
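To make that concrete, a hypothetical sketch (not what Anubis does) of a challenge attempt using a memory-hard function via the argon2-cffi package: every attempt then costs tens of megabytes of RAM, which narrows the gap between a visitor's browser and a GPU farm grinding SHA-256.

    # Hypothetical memory-hard PoW attempt; assumes argon2-cffi
    # (pip install argon2-cffi). Parameters are illustrative only.
    from argon2.low_level import Type, hash_secret_raw

    def attempt(challenge: bytes, nonce: int) -> bytes:
        return hash_secret_raw(
            secret=challenge + nonce.to_bytes(8, "big"),
            salt=b"fixed-demo-salt!",   # demo value; a real scheme would vary it
            time_cost=2,
            memory_cost=64 * 1024,      # in KiB, i.e. 64 MiB per attempt
            parallelism=1,
            hash_len=32,
            type=Type.ID,
        )

    print(attempt(b"example-challenge", 0).hex())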


Too bad the challenge's result is only a waste of electricity. Maybe they should do what some of those alt-coins do and search for prime numbers or something similar instead.


Most of those alt-coins are kind of fake/scams. It's really hard to make this work with actually useful problems.


Of course that doesn't directly help the site operator. Maybe it could actually do a bit of bitcoin mining for the site owner. Then that could pay for the cost of accessing the site.


This only holds up if the data to be accessed is less valuable than the computational cost. In this case, that is false, and spending a few dollars to scrape the data is more than worth it.

Reducing the problem to a cost issue is bound to be short-sighted.


This is not about preventing crawling entirely; it's about finding a way to stop crawlers from re-crawling everything way too frequently just because crawling is very cheap. Of course it will always be worth it to crawl the Linux kernel mailing list, but maybe with a high enough cost per crawl the crawlers will learn to be fine with only crawling it once per hour, for example.


My comment is not about preventing crawling; it's stating that with how much revenue AI is bringing in (real or not), the value of crawling repeatedly >>> the cost of running these flimsy coin-mining-style algorithms.

A CAPTCHA at least tries to make the human-AI distinction, but these algorithms are purely on the side of making it "expensive". If it's just a capital problem, then it's not a problem for the big corporations who are the ones incentivized to do this in the first place!

Even if human CAPTCHA solvers are involved, at the very least that provides society with some jobs (useless as they may be), but these mining algorithms do society no good and waste compute for nothing!


