I remember working with a client within the last 5 years who demanded a Kubernetes cluster to run custom analytics on very fast incoming data streams (several GB per hour). By "custom analytics" I mean Python scripts that loaded a day's worth of data, computed something, wrote the results to disk, and quit.
During development, the developers/data scientists wrote and tested everything outside of the cluster, since these were simple scripts. They had no problem grinding through a day's worth of data in their testing. But going into prod, we had to shove it all into the cluster. So now we had to maintain the scripts AND the fscking cluster.
Why?
"What if the data volume increases or we need to run lots of analytics at once?"
"You'll still be dominated by I/O overhead and your expected rate of growth in data volume is <3% per year. You can just move to faster disks to more than keep up. Also there are some indexing techniques that would help..."
Nope, had to have the cluster. So we had the cluster. At the expense of 10x the hardware, another rack of equipment, a networking guy, and a dedicated cluster admin (with associated service contracts from a support vendor). It literally all ran fine on a single system with lots of RAM and SSDs -- which we proved by replicating all of the tasks the cluster was doing.
Argh...