
It once again completely fails on an extremely simple test: look at a screenshot of sheet music, and tell me what the notes are. Producing a MIDI file for it (unsurprisingly) was far beyond its capabilities.

https://chatgpt.com/share/68954c9e-2f70-8000-99b9-b4abd69d1a...

This is not anywhere remotely close to general intelligence.



Interpreting sheet music images is very complex, and I’m not surprised general-purpose LLMs totally fail at it. It’s orders of magnitude harder than text OCR because music notation is inherently two-dimensional: vertical position encodes pitch while horizontal position encodes time.

For much better results, use a custom-trained model like the one at Soundslice: https://www.soundslice.com/sheet-music-scanner/



