Awesome to hear more about MySQL/Vitess connection pooling.
Folks typically only consider memory usage for database connections, but we've also had to consider the p99 latency for establishing a connection. For SpiceDB[0], one place we've struggled with our MySQL backend (originally contributed by GitHub, who are big Vitess users) is preemptively establishing connections in the pool so that it's always full. PGX[1] has been fantastic for Postgres and CockroachDB, but I haven't found something with enough control for MySQL.
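To illustrate the kind of control I mean, here's a minimal Go sketch (assuming pgx v5's pgxpool and the stock database/sql MySQL driver; the connection strings, pool sizes, and intervals are placeholders, not anything we actually run):

```go
// Sketch: pre-warming a pgx pool vs. the knobs database/sql exposes for MySQL.
package main

import (
	"context"
	"database/sql"
	"log"
	"time"

	_ "github.com/go-sql-driver/mysql"
	"github.com/jackc/pgx/v5/pgxpool"
)

func main() {
	ctx := context.Background()

	// pgx/pgxpool: MinConns keeps the pool pre-filled, and the background
	// health check re-dials dropped connections, so a traffic burst rarely
	// pays connection-establishment latency.
	cfg, err := pgxpool.ParseConfig("postgres://user:pass@localhost:5432/app")
	if err != nil {
		log.Fatal(err)
	}
	cfg.MinConns = 20
	cfg.MaxConns = 50
	cfg.HealthCheckPeriod = 30 * time.Second
	pool, err := pgxpool.NewWithConfig(ctx, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()

	// database/sql with the MySQL driver: idle connections are only retained
	// after they have been used at least once; there is no setting that
	// pre-establishes them, which is the control gap described above.
	db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/app")
	if err != nil {
		log.Fatal(err)
	}
	db.SetMaxOpenConns(50)
	db.SetMaxIdleConns(20)
	db.SetConnMaxIdleTime(5 * time.Minute)
	defer db.Close()
}
```

MinConns plus the periodic health check keep the pgx pool warm ahead of traffic, whereas database/sql only retains idle connections after they've been used, so a cold burst still pays the full handshake latency.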
PS: Lots of love to all my friends at PlanetScale! SpiceDB is also a big user of vtprotobuf[2] -- a great contribution to the Go gRPC ecosystem.
Impressive! But I guess the trade-off of having all that power is the potentially terrifying cost. As you detail in the post, AWS Lambda comes with a default throttle (1000 concurrent executions) which can be adjusted. Is any throttle/limit like that supported, or on the roadmap? I've been thinking I may want a service to fail beyond a certain point, since that amount of load would indicate an attack, not genuine usage.
Absolutely agreed. As others have already pointed out, there is no implied endorsement of this kind of application architecture. Since we do run an actual DBaaS, one of our main internal goals in running these experiments was to test our Global Routing Infrastructure specifically, and to construct a scenario that helped us size those components for capacity planning.
As long-time DBAs ourselves, we do as much as we can to educate and empower users to architect their applications wisely... but we still need to be prepared for the worst. As it turns out, Lambda was an easy way to accomplish that. :)
So, disclaimer: I am not an Amazon billing wizard, but given that we ran the Lambdas from an isolated sub-account, I can be fairly certain that I was able to filter this down accurately.
We hit the one million connections total probably 10-20 times over the course of a couple of days, and spent at least another 20-30 runs working our way up to it, testing various things along the way. Keep in mind, these were all very short-lived test runs, lasting maybe 8 minutes at the most.
Our total bill for Lambda in the month of October came out to just over 50USD.
On the off chance someone associated with this is reading: I’m curious about the networking stack here. Specifically TCP. Is it being used? The reason I ask is because one limit I’ve run into in the past with large scale workloads like this is exhausting the ephemeral port supply to allow connections from new clients.
Did you run into this? If not I’m curious why not. And if so, how did you manage it?
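For a rough sense of where that limit bites (illustrative arithmetic only, assuming the Linux default ip_local_port_range of 32768-60999 and all connections targeting a single destination IP and port):

```latex
\text{ephemeral ports per source IP} \approx 60999 - 32768 + 1 = 28\,232
\qquad
\frac{10^{6}\ \text{connections}}{28\,232\ \text{ports}} \approx 35.4
\;\Rightarrow\; \text{at least 36 distinct source IPs}
```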
Article author here, interesting question! We didn't explicitly run into that issue.
Our setup was effectively as follows:
- AWS Lambda functions being spawned in us-east-1, from a separate AWS sub account.
- Connections were all made to the public address provisioned for MySQL protocol access to PlanetScale, using port 3306. The infrastructure did also reside in us-east-1.
- Between the Vitess components themselves, and once inside our own network boundaries, we use gRPC to communicate.
Since the goal we set was to hit one million, and since we were staying just barely within the Lambda default quotas, we didn't aggressively try to push beyond that. Some members of our infrastructure team did notice what appeared to be some kind of rate limiting when running the tests multiple times consecutively. Many tests before and after succeeded with no such issues, so we attributed it to a temporary load balancer quirk, but it might be worth going back to confirm whether that's the behavior we saw.
Two hypotheses, one of which you can falsify easily. Perhaps Vitess is doing port concentration, i.e. dispatching requests made by multiple clients over fewer DB connections? This is quite typical to do.
The other is that you may have simply had a fast enough query that Little’s Law worked out for you.
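To put rough numbers on the Little's Law point (purely illustrative, not measurements from the test):

```latex
L = \lambda W
  = \underbrace{10^{6}\ \text{queries/s}}_{10^{6}\ \text{connections} \times 1\ \text{query/s each}}
    \times \underbrace{10^{-3}\ \text{s}}_{\text{per-query time}}
  = 10^{3}\ \text{queries in flight}
```

So a million open connections can still mean only around a thousand queries the server has to make progress on at any instant.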
> making MySQL live outside its means (i.e. overcommitting memory) opens the door to dangerous crashes and potential data corruption, so this is not recommended.
data corruption? how?
I'm no MySQL fan, but is this FUD or is it referring to a real issue?
Data corruption is always possible due to software, firmware, hardware bugs and failures... but that’s not specific to OOMs.
You could have non-crash-safe settings like sync_binlog != 1 or innodb_flush_log_at_trx_commit != 1, but with the default settings of MySQL 8.0 it's entirely crash safe in every way (binlog events, etc.). I think that part of the post needs to be updated. It was a bit of an off-hand comment, but it's not as clear and accurate as it should be.
I’m going to replace `potential data corruption` with `potential downtime` as it could initiate a failover and the process will have to restart and go through crash recovery which can take some time.
I wouldn't say exceptionally. InnoDB is reasonably robust on a modern FS with sane mount options. Yet I'll never forget the MySQL 8 table rename bug that crashed the server and corrupted/truncated the table. It appeared in a GA release and took several patch releases before it was fixed.
Sure, but data corruption? Data integrity on power loss or other crashes is a core feature of pretty much any RDBMS, and OOM is actually probably one of the safer ways a database can crash (any disk caches still get flushed, so there's no unpredictability around what actually gets persisted).
Article author here. I fully endorse Aaron's correction and appreciate the call-out.
For context: I initially wrote this paragraph to include more flavor and history around crash recovery challenges with relational databases, implying that while your data might be safe, it is still 100% preferable, even today, to avoid crashes by accurately sizing AND limiting the database to live within its means. Crash recovery can still take a certain amount of time, and when people are weighing whether to bring their app up faster or maintain their data integrity, taking a shortcut in a high pressure situation is sadly not unheard of.
Alas, in my editing, I opted to spend less time in the weeds there, and without the proper context, the use of the term "data corruption" lost all meaning, and no longer belonged in that sentence. Totally fair correction.
This is more dependent on the parallelism of queries than on the parallelism of connections. Having a ton of relatively idle connections won't increase query latency.
If you have a lot of connections doing similar things, just batch requests to get the data in bulk.
Scaling your database up should only be attempted once you can no longer improve the efficiency of your application. It is always better to first put effort into improving efficiency than into scaling up.
For example, one trick that allowed me to improve the throughput of an application using MongoDB as a backend by a factor of 50 was capturing queries from multiple requests happening at the same time and sending them as one request (statement) to the database; when the result comes back, you fan it out to the respective business logic that needs it. The application was written with Reactor, which makes this much easier than normal thread-based request processing.
For example, if you have 500 people logging in at the same time and fetching their user details, batch those requests (say, every 100ms or up to 100 users) and fetch 100 records with a single query.
You will notice that executing a simple fetch-by-id query even for hundreds of ids will only cost a couple of times more than fetching a single record (a rough sketch of this batching pattern is below).
The application in question was able to fetch 2-3 GB of small documents per second during normal traffic (not an idealised performance test) with just a couple dozen connections.
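Here is a rough Go sketch of the coalescing idea (the Batcher type, the users table, and the 100 ms / 100-ID thresholds are illustrative, not from the app I described):

```go
// Package batching coalesces individual lookups into a single IN (...) query.
package batching

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
	"time"
)

type User struct {
	ID   int64
	Name string
}

type request struct {
	id    int64
	reply chan *User
}

// Batcher queues individual lookups and flushes them in bulk.
type Batcher struct {
	db    *sql.DB
	queue chan request
}

func New(db *sql.DB) *Batcher {
	b := &Batcher{db: db, queue: make(chan request, 1024)}
	go b.loop()
	return b
}

// Get enqueues a single lookup and waits for the batched result.
func (b *Batcher) Get(ctx context.Context, id int64) (*User, error) {
	reply := make(chan *User, 1)
	b.queue <- request{id: id, reply: reply}
	select {
	case u := <-reply:
		return u, nil
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}

// loop flushes pending lookups every 100ms, or sooner once 100 have piled up.
func (b *Batcher) loop() {
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	pending := make([]request, 0, 100)
	for {
		select {
		case req := <-b.queue:
			pending = append(pending, req)
			if len(pending) >= 100 {
				b.flush(pending)
				pending = pending[:0]
			}
		case <-ticker.C:
			if len(pending) > 0 {
				b.flush(pending)
				pending = pending[:0]
			}
		}
	}
}

// flush fetches all queued IDs with one query and fans results back out.
func (b *Batcher) flush(batch []request) {
	placeholders := make([]string, len(batch))
	args := make([]any, len(batch))
	waiting := make(map[int64][]chan *User, len(batch))
	for i, req := range batch {
		placeholders[i] = "?"
		args[i] = req.id
		waiting[req.id] = append(waiting[req.id], req.reply)
	}
	query := fmt.Sprintf("SELECT id, name FROM users WHERE id IN (%s)",
		strings.Join(placeholders, ","))
	rows, err := b.db.Query(query, args...)
	if err != nil {
		return // callers time out via their contexts; real code would report this
	}
	defer rows.Close()
	for rows.Next() {
		u := &User{}
		if err := rows.Scan(&u.ID, &u.Name); err == nil {
			for _, ch := range waiting[u.ID] {
				ch <- u
			}
		}
	}
}
```

Callers just call Get(ctx, id) from as many request handlers as they like; only the single loop goroutine ever talks to the database, so a handful of connections serves everything.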
Can you please edit swipes like "you are doing it wrong" out of your HN comments? It acidifies the thread and evokes worse from others. The site guidelines are pretty clear about this: https://news.ycombinator.com/newsguidelines.html.
Edited out, but the issue is that it really is technically wrong. As a person who is actually building real-world, large-scale applications, I find this super naive. There exists no networking hardware or disk IO that can adequately support this many active connections. What you really want is to do the same work with far fewer connections, which will save you a ton of overhead and also memory that can be used to actually cache useful data.
My goal is to share this little bit of insight so that hopefully fewer people make this fatal mistake. Just because you can technically do something does not mean it is a good idea. If you need to run a million connections to a database instance, you really need to rethink the entire problem.
I’m not sure I’d say anything is a “fatal mistake” if it works, and this appears to? Feels like you’re coming at this from “it’s suboptimal”, calling it a fatal mistake is a bit much?
I would look at it this way:
Firstly, it’s just a demonstration of how far a technology can scale, not an endorsement of the technical approach for customers.
Secondly, on serverless, PlanetScale is just solving things their customers want? They don’t have a business if they don’t do that.
Finally, even if you are “right” or at least making a valid point, isn’t the “rethink this entire problem” take more principled than practical? There are always 47 things to do and only time to do a few.
How could you possibly say this is doing it wrong? The only way you could batch requests in the way you describe is if you have one (or a very small number of) compute nodes. You would need all those requests to hit the same node so you could try to batch. With serverless compute infrastructure (which is what this blog is demonstrating by using Lambda) you can have one isolated process per request, and therefore need a database that can actually handle this kind of load.
Here is your problem. You are trying to build a huge application using inadequate technical building blocks.
Lambdas are super inefficient in many different ways. It is a good tool, but as with every tool you do need to know how to use it. If you try to build a compute-heavy app in Python and then complain about your electricity bill -- that really is on you.
If your database is overloaded with hundreds of thousands of connections from your lambdas, it means it is the end of the road for your lambdas. Don't put effort into scaling your database up; put it into reducing the number of connections and improving the efficiency of your application.
I think you can start to hit connection limit walls with RDS at several hundred connections, depending on your instance size. Running an even moderately busy app you could hit those pretty quickly. I would hate to have to change my entire infrastructure at such an early stage because the DB was hitting connection limits!
Would you ever need a million open connections? Probably not! But you'll likely want more than 500 at some point. And if your entire stack is serverless already, it'd be nice if the DB could handle that relatively low number of connections too.
I look at database connections the following way: how many connections can a database really serve effectively? For a connection to be actively served, the database really needs to have a CPU core working on it or waiting for IO from storage. And that's completely omitting the fact that databases need a sizeable amount of memory to do things efficiently.
Even if you have a server with hundreds of cores your database probably can't be actively working on more than a small multiple of the number of the cores.
I am not saying you can't. I totally believe you do.
Modern hardware is totally able to execute hundreds of thousands of transactions per second on a single core. If your query is simple and you can organise getting the data from storage at the necessary speeds you should totally be able to do this many requests, possibly even tens of millions.
But handling one million queries per second is completely different from having a database server making progress on one million queries in parallel. What happens is that the server is only making progress on a small number of them (typically tens, up to hundreds on very beefy hardware) and everything else is just queued up.
There are much, much better ways to queue up millions of things than opening a million connections to get each one done individually.
Lambda was a means to an end for us here, and we're not specifically endorsing its use in _this_ way. Our goal was explicitly to test our ability to handle many parallel connections, and to observe what that looked like from different angles.
We're a DBaaS company, and we do need to be prepared for anything users may throw at us. Our Global Routing infrastructure has seen some major upgrades/changes recently to help support new features like PlanetScale Connect and our serverless drivers.
From our point of view, this was a sizing exercise with the interesting side benefit that many people do happen to use Serverless Functions similarly.
How much is a moderately busy app? I have a sketch of a twitter app in Scala with zio-http as the framework, doing the batching strategy twawaaay describes, and it can handle 46k POSTs per second on my i5-6600 with a SATA3 SSD. That's using 16 connections to postgres, which is probably more connections than is reasonable for my 4 core CPU.
At 46k RPS, it only takes 5.5 ms to assemble a batch of 256, so latency is basically unaffected by doing this. Just set a limit of 5-10 ms to assemble the batch (or lower if you have a more powerful computer that can handle more throughput).
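For reference, that figure is just the batch size divided by the arrival rate (illustrative arithmetic, using my numbers above):

```latex
W_{\text{batch}} \approx \frac{256\ \text{requests}}{46{,}000\ \text{requests/s}}
\approx 0.0056\ \text{s} \approx 5.5\text{--}5.6\ \text{ms}
```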
But then you can't put "led implementation of a highly-scalable $80m/yr cloud-native solution" on your resume. And bigger numbers mean bigger salary.
Now, if you wait for someone else to do this then you swoop in and cut that spending by 90%... now you're talking. Then the next person can reverse what you did and we all keep making money digging holes then filling them back up.
Yeah, it is better to manage 50 people and millions in hardware than do the same work with just 2 engineers and one server.
You then get those completely unnecessary feats of technology when really, a single server is usually good for hundreds of thousands of transactions per second.
It infuriates me that incompetence is promoted and the only way to get rewarded is to be good at internal politics.
Just wanted to come back and defend you (a little :-)
I read "if you have a million database connections then you are doing it wrong" less as a ad hominem attack and more as "if one is doing X one should try a different approach"
I did like the "architectural strategy", if I can call it that, of batching the calls. It's "tricks" like that, expressed at this level, that are somehow missing from common software dev parlance. They are not in l33tcode tests and they don't fit into neat boxes, but they are vital "common knowledge".
I wish I had a better term for these sort of optimisations.
Anyway. Thanks for the comment. Don't take the blowback personally - frankly I was surprised even if it was a small storm in a teacup.
[0]: https://github.com/authzed/spicedb
[1]: https://github.com/jackc/pgx
[2]: https://github.com/planetscale/vtprotobuf