Hacker News

I gave it a 1755 birth registry from a Portuguese locality, the kind of document my dad and I often decipher for genealogy research, and it did a terrible job.

Regular Gemini Thinking actually gets 70-80% of these documents right, apart from a lot of mistakes on given names. ChatGPT understands maybe 50-60%.

This Mistral model butchered the whole text; literally not a word was usable, to the point that I think I'm doing something wrong.

The test document: https://files.fm/u/3hduyg65a5





Just gave it a shot with Grok 4.1 thinking - do you have the ground truth translation to compare? I've tried 4 different times, with slight tweaks adding information from your description, and it's given me a range of interpretations. It'd be nice to see if any of them got close - a couple were more like pulpy telenovela plots, lol.

The model might need tuning in order to be effective. That's normal for releases of image mode models, and after a couple of days there will be properly set up endpoints to test from, so it might be much better than you think. Or it could just be really bad with mid-18th-century Portuguese cursive.


Oh god, I'm sure I wouldn't come close to 50%; that's so hard to read

It's tough, but my dad is quite good at it. He has books of common abbreviations and agglutinations from different centuries. Once you get used to it, it's faster and a lot of fun.

We were mind-blown by how good Gemini was at it.


I am too. Gemini 3.0 Fast got old scrawled diary entries in English from 100+ years ago 95% right. When I prefaced the images with the identity of the writer, it also added historical context, such as summaries of an old military unit's history in post-WW1 Europe that it got from a very obscure U.S. Army archive.

Huge timesaver.


Forgivable, as that's a quite atypical document, I'd say.

Not atypical enough for Gemini, is my point. Also, it's one of the most common handwritten document types in existence: at the time almost nobody other than the local priest knew how to write, and birth and marriage certificates were probably the only written documents in whole towns and villages. This is the same throughout Europe, at least.

Quick tip: when you digitize a page, put a sheet of black paper behind it. That keeps the ink on the other side from bleeding through.

You can tell that to the national archives!


