It's also difficult to distinguish a blog from a content farm if you are just crawling the web. Any content pattern you select for would likely be adopted quickly by SEOs.
I've found a direct correlation between the likelihood that a blog is a content farm and the number of ads on it. With 0 ads, the likelihood of a content farm is 0%.
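For anyone who wants to try the ad-count signal on crawled pages, here is a rough sketch of how it might be measured. The `AD_HINTS` list and the `count_ads` helper are my own assumptions for illustration, not a vetted detector: it only counts `script`, `iframe`, and `img` tags whose `src` contains one of a few ad-network domains, so it will miss ads served any other way.

```python
from html.parser import HTMLParser

# Assumed, incomplete list of substrings seen in ad-network URLs.
AD_HINTS = ("doubleclick.net", "googlesyndication.com", "amazon-adsystem.com")


class AdCounter(HTMLParser):
    """Counts script/iframe/img tags whose src contains an ad hint."""

    def __init__(self):
        super().__init__()
        self.ad_count = 0

    def handle_starttag(self, tag, attrs):
        if tag not in ("script", "iframe", "img"):
            return
        # Attribute values can be None (e.g. a bare `src`), so fall back to "".
        src = dict(attrs).get("src") or ""
        if any(hint in src.lower() for hint in AD_HINTS):
            self.ad_count += 1


def count_ads(html: str) -> int:
    """Return the number of ad-like embeds found in an HTML string."""
    parser = AdCounter()
    parser.feed(html)
    parser.close()
    return parser.ad_count
```

A page where `count_ads` returns 0 would fall into the "no ads" bucket; anything above that is only a hint and would still need checking against pages you have labeled by hand.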