Seems very expensive compared to just running ES on instances yourself. Using i2.xlarges (as we do for our ES cluster) it's $1.194/hour for managed ES and just $0.853/hour for spot i2 instances. If you're willing to reserve for a year, it's only $0.3555/hour.
Given how easy ES is to operate, I don't know why anybody would pay for a less flexible, more expensive hosted version.
> Given how easy ES is to operate, I don't know why anybody would pay for a less flexible, more expensive hosted version.
Easy. Amazon's service is fully managed. Getting ES running on EC2 is easy enough. Making it fault tolerant and multi-AZ capable, with automated backups to S3, takes more work. Making sure it all works properly takes even more.
Add the time and complexity required to test your recovery/tolerance/redundancy plans, and you pretty much lose out on the savings.
Given the choice of spending 2-3 days getting everything configured right, vs letting Amazon deal with it, I'd pick Amazon.
Using platitudes: the time I save not dorking with infrastructure is time that I can spend solving my customers' problems.
+1. This is the reason I pay for Amazon RDS/Aurora, where the price difference versus the same EC2 instances is even greater. Sure I could set up a database... and replication to another database... and script scheduled snapshots... and build monitoring and fault tolerance against all the ways things can break... but by the time I've set all that up, I've wasted a week or more building servers instead of building a business.
I'm running W3Counter on it now. That's ~100GB of data and a couple hundred queries per second. It's about equal read/write load with one insert and one select for every page view I log. Nothing you can't do on a single server, but it's all I can talk about.
I was renting two ~$300/mo servers at Softlayer to run MySQL (Percona), with one as a backup + failover + to run my monthly browser/OS market share reports against without killing the website by overloading the main server. I regularly spiked high loads that killed latency anyway this spring. I was going to have to upgrade soon anyway, so I decided to try out Aurora during the free preview period, and migrated everything over there one weekend.
Turns out a single R3.Large instance (the smallest) was more than sufficient. It's sitting at under 25% CPU and 50% memory usage right now, which means there's plenty of room to grow too. I'd say performance is as good or better than native MySQL based on that. I'm paying less now than I did for the two servers, and get more for it, with free multi-AZ replication, nightly snapshots, and the ability to spin up a live replica in just a few minutes to run backups/large reports against, then shut down and only pay for the hour.
> Given the choice of spending 2-3 days getting everything configured right
It's never 2-3 days, especially when the next dude managing it needs to be on-boarded. He might not be familiar with how you do things.
This externalizes all that management overhead.
One less thing that you (and your co-workers, and the next dude after you leave) can break.
Also, all the configuration settings are exposed via a neat UI rather than hidden in various files scattered out and about. Yes, source control, Puppet, Ansible, Docker, etc. make this vastly simpler, but then you need to hire and train people who are familiar with these things.
The last time I set up an Elasticsearch instance it got owned a year later due to a remote code execution vulnerability [1]. I found out because Amazon asked me why I was DOSing people.
For people who don't have time or resources to actively manage an ES cluster, security is another reason to go with Amazon's service.
Beyond that, it seems that they've spent the time to integrate ES with IAM, so having the ability to grant fine-grained access to ES using IAM policies will be a big plus for some people.
We're currently using Found to host our ES index and we had to come up with some pretty creative solutions to avoid embedding the auth credentials in our applications that need access to ES. With the new service, we'd simply set up the correct IAM policies for ES and assign the proper IAM roles to the instances running our application and be done.
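For illustration, here is a minimal sketch of what such a policy document might look like, built as a Python dict. The region, account ID, and domain name are placeholders, and the exact action names and ARN layout are assumptions that should be checked against the IAM documentation for the service before use.

```python
import json

def es_read_write_policy(region, account_id, domain):
    """Build an IAM policy document allowing HTTP access to one ES domain.

    All three arguments are placeholders for illustration only.
    """
    # Assumed ARN layout for an ES domain and everything beneath it.
    resource = f"arn:aws:es:{region}:{account_id}:domain/{domain}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed action names covering read and write HTTP verbs.
                "Action": ["es:ESHttpGet", "es:ESHttpPost", "es:ESHttpPut"],
                "Resource": resource,
            }
        ],
    }

policy = es_read_write_policy("us-east-1", "123456789012", "example-domain")
print(json.dumps(policy, indent=2))
```

A policy like this would be attached to the IAM role assigned to the application's instances, so no ES credentials live in the application itself; the application's requests would still need to be signed with the role's temporary credentials.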
you overestimate the average web dude's ability to run a service that actually works, and stays up.
you also underestimate the budget that lazy devops people have when their employer is used to (i.e. captive to) a $100k amazon bill that really should be $25k somewhere else.
That's the per-instance price, so multiply it by the number of nodes you need. We use about a dozen, so the difference for us would be around $7000/month.
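To put rough numbers on that, here is a back-of-the-envelope sketch using the hourly prices quoted upthread. The node count of 12 (for "about a dozen") and the 730 hours per month are assumptions, not figures from the thread.

```python
# Hourly prices as quoted upthread (USD per i2.xlarge instance).
MANAGED = 1.194     # managed ES
SPOT = 0.853        # spot price as quoted
RESERVED = 0.3555   # one-year reserved price as quoted

# Assumptions for illustration: "about a dozen" nodes, 730 hours in a month.
NODES = 12
HOURS_PER_MONTH = 730

def monthly_gap(managed, self_hosted, nodes=NODES, hours=HOURS_PER_MONTH):
    """Extra dollars per month paid for the managed service across the cluster."""
    return (managed - self_hosted) * nodes * hours

gap_vs_spot = monthly_gap(MANAGED, SPOT)
gap_vs_reserved = monthly_gap(MANAGED, RESERVED)

print(f"managed vs spot:     ${gap_vs_spot:,.0f}/month")
print(f"managed vs reserved: ${gap_vs_reserved:,.0f}/month")
```

Under these assumptions the gap is roughly $3,000/month against the quoted spot price and roughly $7,300/month against the reserved price, so the "around $7000/month" figure lines up with the reserved comparison.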
When it breaks, you can call/email someone to look at it. Factor in the cost of having someone on retainer or hiring a person that has experience with hosting ES.
Hahaha... that's not how AWS works (unless you pay them hundreds of thousands to millions of dollars a month). Amazon is terrifically unresponsive even for medium-to-large clients.
You just need to pay premium support, which is < 100$ a mo I believe. They reply within a day, but to satisfaction in our experience.
Might not suffice for all use cases, but thought I'd share that it's not totally bad for small companies.
If you are not getting the kind of support with premium or enterprise support, you better talk to your TAM. We run into problems with Redshift all the time and they are quite responsive (well they do violate their SLA often and we are too nice to deal with that right now). But of course you can't get a hold of the developers until support can't do anything else.
If you have business support (10% of total a month / $100 minimum I think) or better that is how it works, except you have to wait hours sometimes and it's very slow to work with the Amazon reps. Also, always make requests urgent, otherwise you'll never get a rep.
Interesting. This would be Amazon's version of Found ( https://www.elastic.co/found ), which is Elastic's (makers of Elasticsearch) AWS-based cloud solution. I wonder what this means for CloudSearch?
I should note that I work for Elastic on the Logstash team and used to work on Found.
We will definitely be checking this out. Our experience with another vendor and ES has not been great: from having the whole cluster come down when we're doing batch inserts for initial loads, to having 10s timeouts on random queries during low activity windows with the vendor saying "dunno, everything looks good here" (when we measured that those 10s were spent waiting for ES).
If you're looking for a search backend without having to worry about managing infrastructure, you should check out Azure Search. I'm a dev on the product. If you're interested in trying it out let me know. You can ping me at my username @microsoft.com if you have questions or need pointers.
It says it is not very good. There are a lot of problems with Cloudsearch. By contrast, Elasticsearch is great and not too difficult to run yourself in AWS, or indeed elsewhere.
Aside from product differences, Cloudsearch was phenomenally expensive, particularly if you had a larger dataset but low read/write volume, as Cloudsearch did not use EBS (at least not in a way you could control), unlike RDS for example.
RDS seems to be doing just fine with a large markup over EC2. Why would the managed version of a service be lower priced than the same infrastructure being managed?
Amazon has more possibilities to run a really huge cluster and balance the different load peaks of the search nodes. The whole business model of AWS is based on this... but you're right as well, price isn't everything and comfort is worth money. And yes, I love RDS as well, especially Aurora... it's a great deal!
Opinions on the price being high seem to miss the point of AWS.
AWS is very expensive relative to other clouds, with exceptions for some very nice services (S3, SQS, SES, Route53).
"Core" (Very Important™) services such as EC2/RDS are (appropriately) expensive due to the tooling and high availability options available. I'd expect the tradeoff for their hosted ES to be price, since I'm buying availability and less headache, not just the search service.
I know for sure that I don't know enough to monitor JVM applications. I'd love to pay for their ES service if/when it made sense for a business.
It seems weird to handle the number of instances on top of AWS service, I mean it's AWS they should provide a solution where you don't have to think about that...
It looks like ES installed by Puppet on top of EC2.
I thought there was already an AWS Elasticsearch product? Am I remembering something different or is this coming out of beta or something to that effect?
There is CloudSearch; however, it doesn't support schemaless stuff.
Looks like this is AWS' answer to that.
There's also the issue of CloudSearch taking in document batches rather than individual documents, which means you have to do a fair bit more legwork to get your stuff in there. Elasticsearch can take in Logstash inputs, which is handy and quite performant.
Plus a lot of data analysts are used to Kibana so there is some appeal there.
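To make the batch vs. individual-document point concrete, here is a rough sketch of the two payload shapes. The index name, type name, and field are hypothetical, the ES path shape assumes the older `/index/type/id` layout, and nothing here actually talks to either service.

```python
import json

# Hypothetical documents for illustration.
docs = [
    {"id": "1", "title": "first post"},
    {"id": "2", "title": "second post"},
]

def cloudsearch_batch(docs):
    """CloudSearch-style upload: a single JSON array of "add" operations."""
    return json.dumps(
        [{"type": "add", "id": d["id"], "fields": {"title": d["title"]}} for d in docs]
    )

def es_single_requests(docs, index="posts", doc_type="post"):
    """Elasticsearch-style indexing: one (method, path, body) tuple per document."""
    return [
        ("PUT", f"/{index}/{doc_type}/{d['id']}", json.dumps({"title": d["title"]}))
        for d in docs
    ]

batch = cloudsearch_batch(docs)
es_requests = es_single_requests(docs)

print(batch)
for method, path, body in es_requests:
    print(method, path, body)
```

With the batch shape, the caller has to collect and wrap documents before uploading; with the per-document shape, each document can be sent as soon as it exists.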