I’m familiar with this project - the creator is a friend. I’ll try to get him on here to answer questions.
He’s a seasoned database architect. With SQLSync he’s built a way for frontend developers to query and update a remote database as if it were located right in the browser. Because it basically is. The power of WASM makes it possible to ship a whole SQLite database to the browser. The magic is in how it syncs from multiple clients with a clever but simple reactive algorithm.
It’s a radical attack on the whole problem. Much of our work as developers is about syncing data. When you start looking at React and REST APIs as a kind of sync procedure, this approach can open a lot of new possibilities. You don’t have to write a weird bespoke database of trees of objects fetched and cached from the API any more. You can just update and query it locally, with all the power of a relational database.
I find that moving the full query system into the front end is where most front end devs really want to be. They want a full power query system for the data instead of continuous rounds of re-inventing the transport layer, REST, GraphQL, *RPC, etc.
It's hard to adopt such a system in most traditional web shops with their specialized backend and frontend teams. You're pulling out the database, backend, transport, and auth layers and replacing them with this single block system. Most system architects grew up in the backend, so they are generally pretty ignorant of this issue. As it touches both sides extensively, you're probably not fitting this into an existing system, which leaves only green-field new development. Finally, your backend is not an AWS or Azure service, nor is it Lambda-friendly. All of this means that most architect types I talk to will never touch it.
This style of system mostly already exists with old tech: CouchDB+PouchDB, which works pretty well for some things. The downsides are that the query system isn't really ideal and the auth and data-scoping system is pretty foreign to most people. The easiest model to work with is when the data is totally owned by a single user, and then you use the out-of-the-box database-per-user model. High data segmentation with CRDTs removes a lot of conflict issues.
It has scaling issues though: CouchDB has really high CPU requirements when you're connecting 10k to 100k users. The tech is long in the tooth, though it is maintained. On the system-design side it gets really complicated when you start sharing data between users, which makes it rather unsuitable, as you're just moving the complexity rather than solving it.
This approach seems to hit the same target though will likely have similar scaling issues.
I look forward to seeing the evolution of the system. Looks like a first step into the world.
Connecting to a remote DB and impersonating a user account for bug hunting is too easy right now; let’s create the need for a way to proxy to local client computers with greater access to their private information.
Great until you have to support n versions on m platforms and half your customers are enterprisey and stay on a 6-year-old version from before your last LTS version on a now-unsupported platform because they built a core part of their business processes on a misfeature.
I've worked on various forms of "legacy code" for most of my career. As long as the economics line up and the customer is willing to pay for the required support, then it's a fine business avenue.
If economics don't line up then you have to pull the plug and set them adrift, which is much easier and more secure with a fat client that runs without a server than say, a complex cloud system.
> Remember when targeting Macromedia Flash was going to solve the web compatibility and interactivity conundrum?
This sounds like the setup for a "No? Me neither." punchline. Certainly one of the features of Flash is that it gave you fairly good consistency across computers, but honestly my perception of Flash wasn't that it was going to solve some grand web problem, but more "oooh look, shiny rollover buttons!" and "ooh look, super obnoxious advertisement that forces people to pay attention to it!"
I mostly work on lean, speculative business software, which means non-cross-platform native development is simply not economical to do. I generally need to be able to hit Windows, iOS, Android, and macOS square on with one code base.
A "native" electron or capacitor distribution system is a fine extension of a local-first web client. And an advantage of building fat clients generally is they lend themselves to such distribution models much easier than say, htmx or hotwire.
Native fat clients have had their benefits, and lots of people still prefer them, but they've always had the drawback of manual data management and installs. Being able to leverage any device you own with a cloud-synced local-first client really gives you the best of both worlds.
But not all software fits readily within this model.
First, my list failed to include web because of the context. Web is, by far, the largest and most important platform. Even if I'm building only native installers for a project, I need to be able to build web projects with the same tools to make this work.
Java also fails the "one code base" requirement, as desktop and mobile are very different. The poor iOS support is going to cause it to fail the "square on" requirement as well.
I wish we would go in the opposite direction, let's make a web standard for app stores and app package files, and let web apps handle native file extensions.
Then we can write apps once and run them everywhere, desktop and mobile.
It's not quite shipping the DB to the client, but I like the Supabase/PostgREST approach for replacing everything between the client and SQL server with a single very thin proxy that maps tables and functions to REST calls. Even the auth mechanism is fundamentally just Postgres RLS with signed tokens that you can query against in your RLS policies.
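To make that concrete, here is a minimal sketch of the shape of this setup (the table, column, and policy names are made up for illustration). The only "backend code" is a policy living in the database, and the client talks to PostgREST directly:

```typescript
import { createClient } from "@supabase/supabase-js";

// Run once, server-side, as a migration (illustrative schema):
//
//   alter table notes enable row level security;
//   create policy "owner can read own notes" on notes
//     for select using (auth.uid() = user_id);
//
// With that in place there is no hand-written endpoint: the client
// queries the table through PostgREST, and Postgres evaluates the
// RLS policy against the claims in the signed JWT.
const supabase = createClient("https://your-project.supabase.co", "anon-key");

async function myNotes() {
  const { data, error } = await supabase.from("notes").select("*");
  if (error) throw error;
  return data; // only rows whose user_id matches the caller's auth.uid()
}
```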
How is this approach meant to handle data visibility and access control? Often a large part of a backend is materializing raw data into a form that the active user is allowed to view.
So if the user owns all their own data, their "data view" is their data set. A To-Do system, a personal finance app, any kind of note-taking or personal record keeping fits this model.
You create a database per user, and the auth and sync are all self-contained within that database. This system is multi-master, which means that any change on a client or on the server will be replicated to every other node. There is no "authority" which trumps the others. The server is simply a central hub that requires the right authentication to allow the sync process to happen.
When you want to create a set of data that crosses user boundaries, it gets complicated. It's possible to do, but you're not on the easy train anymore.
Creating a system that's both easy to use and scopes the right data view out of the system-wide tables and rows we usually think of as databases is neither the CouchDB nor the SQLSync model.
My problem is that to-do and note taking apps are more or less toy problems. A radical solution to the challenge of building these kinds of apps just doesn't seem that interesting.
Correct me if I'm wrong: we can avoid the idea of a master for this use case because we suppose that only a single client (also server, I guess) will write at a time?
One user can have multiple clients. This is frequently the case, many to most users have both a PC and a phone. Also when one allows reasonable sharing of the account with family, 5+ connected clients is common.
Reminds me of Meteorjs. It would let you sync a subset of your data to the client and then the client could query it any which way it wanted. They called this “Minimongo”.
I've used Meteor. I thought it was a good system. It didn't have offline capability, at least not back when I used it. It really needed to be connected to work. But conceptually, yes, it had a very similar system.
Yeah, I faced the scaling challenges. What was recommended was using "methods", which were basically RPC to fetch the data, rather than using the PubSub data syncing. Meteor maintained a full in-memory copy of what data each client had, so it used a lot of RAM, and it did some kind of diffing on each update to only send the required updates over the wire, so it had high CPU utilisation. I had to move the chat portion of my app over to SocketIO, which scaled wonderfully, but the DX was not quite as nice.
Rails and Phoenix are doing something similar (kind of) these days, and those seem to scale better, you just move all the data heavy lifting to the server and do almost nothing on the client.
That sounds awfully like Couchbase, which allows you to query/update databases that will sync to remote and then back to peers. And you can control the process (auth/business logic) with a server-side JavaScript plugin with ease.
Creator of Couchbase Mobile here — I’m doing a new web-based thing[1] with a similar reactive API. I’m hoping that my encrypted block replication makes it more of a “data anywhere” solution than a “local first” database. But the paradigm of powerful databases in the browser is definitely one I’m glad to see becoming popular.
Genuinely curious why not just cache the relevant bits in LocalStorage / SessionStorage? I seem to remember Chrome trying to add a literal SQL database to the browser, but it never panned out; localStorage became king. I don't mean to downplay the usefulness, just I usually opt for what the browser gives me. I'm huge on WASM and what it will do for the browser as it matures more (or grows in features).
First, to address the main point: why not cache the relevant bits in some kind of local storage? SQLSync plans on doing this, specifically using OPFS for performance (but will have fallbacks to localStorage if needed).
Second, to address the question of why not use built-in KV stores or browser-side databases. One answer is another question: how do you solve sync?
One approach is using a data model that encodes conflict handling directly, like CRDTs. This approach is easier to put into general kv stores, as syncing requires simply exchanging messages in any order. I find this solution is well suited to unstructured collaboration like text editing, but makes it harder to coordinate centralised changes to the data. Centralised changes are nice when you start introducing authentication, compaction, and upgrades.
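For intuition, here is the simplest possible CRDT, a last-writer-wins register (illustrative, not tied to any particular library). Because merge is commutative, associative, and idempotent, replicas can exchange updates in any order over a dumb message channel and still converge:

```typescript
// A minimal last-writer-wins register, one of the simplest CRDTs.
type LWW<T> = { value: T; timestamp: number; nodeId: string };

function merge<T>(a: LWW<T>, b: LWW<T>): LWW<T> {
  if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp ? a : b;
  return a.nodeId > b.nodeId ? a : b; // deterministic tie-break
}

// Any interleaving of the same updates converges on the same result:
const u1 = { value: "draft", timestamp: 1, nodeId: "phone" };
const u2 = { value: "final", timestamp: 2, nodeId: "laptop" };
console.log(merge(merge(u1, u2), u1)); // value: "final", whichever order you merge in
```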
Another approach is doing something similar to how Git Rebase works. The idea is to let the application state and server state diverge, and then provide an efficient means for the app to periodically reset to the latest server state and replay any unacked mutations. This approach requires the ability to re-run mutations efficiently as well as efficiently track multiple diverging versions of the database state. It's certainly possible to build this model on top of local storage.
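A sketch of that loop in TypeScript (the names here are illustrative, not SQLSync's actual API):

```typescript
// The client applies mutations optimistically; whenever a new
// authoritative snapshot arrives it resets to that snapshot and
// replays whatever the server hasn't acknowledged yet -- the same
// shape as `git rebase`.
interface Mutation { id: string; apply(db: Database): void }
interface Database { clone(): Database }

class RebaseClient {
  private pending: Mutation[] = []; // sent, but not yet acked by the server

  constructor(public local: Database) {}

  mutate(m: Mutation) {
    m.apply(this.local);            // optimistic local apply
    this.pending.push(m);           // queue for upload
  }

  onServerSnapshot(server: Database, acked: Set<string>) {
    this.pending = this.pending.filter((m) => !acked.has(m.id));
    this.local = server.clone();    // throw away diverged local state...
    this.pending.forEach((m) => m.apply(this.local)); // ...and replay
  }
}
```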
For SQLSync, I found that by controlling the entirety of SQLite and the underlying storage layer I was able to create a solution that works across platforms and offers a fairly consistent performance profile. The same solution runs in native apps, browser sessions (main thread or workers), and on serverless platforms. One of my goals is to follow the lead of SQLite and keep my solution fairly agnostic to the platform (while providing the requisite hooks for things like durable storage).
FWIW, Web SQL was always fine, but could never be standardized, because no one was ever going to redo all the work SQLite has done (when every browser already uses SQLite).
Tbf, from how I've heard that story told, the WebSQL standard was not well written. It was standardized bug-for-bug to a particular version of SQLite, which is not a good way to write a standard.
The issue was that a specific library would be pinned at a specific version for the rest of the history of the web. As good as SQLite is, I hope to hell we're not still stuck using it to handle mining operations in the Oort cloud in 200 years.
Because if this works, it's amazing. Realtime sync with offline support out of the box, without having to develop state management separately on the client and the API - it lives in one place. Those are very hard problems, done with less development. Will definitely give it a shot.
An old company I worked for used project management software with a check-in/out mechanism for making changes. When you "check out" a project it downloads a copy that you change locally, then "check in" uploads it back to the server. A project is "locked" while in the "checked out" state. We all felt it was an archaic mechanism in a world of live-updating apps.
After 10 years of building SPA "web apps", that data synchronization mechanism feels ahead of its time.
What many people either can't or don't want to acknowledge is that ultimately whether or not you support live updates in parallel by multiple users, instead of locking so only one update can proceed at a time, is not a technical decision, it's a business decision: do the business rules that are appropriate for your application enable you to deal with concurrent live updates or not?
Ultimately that comes down to whether you can implement a process to ensure consistent resolution of any incompatibilities between multiple concurrent updates. Sometimes that can be done, and sometimes it can't, and which is the case for your application depends on your business rules, not on any technical capability.
If your business rules don't allow you to implement a resolution mechanism, you need locking so that only one update can happen at a time, whether you have the technical capability to support concurrent updates or not.
This is one of those phrases that should turn into a saying, and be passed around for hundreds of years.
Every hard problem I have today in my career involves getting business people to define their business problem properly in order to solve it with technology. Even the hardest code I've ever written was easy compared to some projects, simply due to the business issues lurking around the project. Last week I finished a script to download a CSV and save it to a SQL table (literally) that took 3 weeks because business folks couldn't get their act together on what they wanted. I finished another project in a few days which is currently the core of a previous employer's energy-efficiency controls product; it was easy because the person defining it did it very well, and I had no questions, just work to perform.
Nah. I've seen plenty of systems where the business rules would handle concurrent updates fine, but since they're using a traditional Web/ORM/RDBMS setup they build a last-write-wins system without thinking about it. It's one of those rare problems where the technical part is actually harder than the business part.
Database systems have been able to deal with concurrent updates for quite some time now, so I don't think doing this is technically difficult with the current state of the art. Individual dev teams might not be well versed in the current state of the art, but the correct business response to that is not to restrict your business rules but to get developers who are well versed in the current state of the art.
> Database systems have been able to deal with concurrent updates for quite some time now, so I don't think doing this is technically difficult with the current state of the art.
Traditional ACID systems can't really handle them nicely - your only choice with an update is to commit it or discard it - so you have to do a lot of handwritten logic on top, and even if the database itself handles that well, the layers above it generally don't. Event-sourcing style systems work well but they're still not really mainstream yet.
It solves so many problems and makes things so easy to implement if you go this way.
But, just as mentioned, it is hard to convince people that it is what they actually want.
People fall into some grand illusion that everything should always be available, but in reality one person is making changes at a time, and if somehow 2 or more people have to work on something, more often than not they should be talking or communicating with each other anyway to synchronize.
Even with Git and fully distributed development you cannot solve conflicts automagically. You still have to communicate with others and understand context to pick the correct changes.
Unison has a neat approach to this problem: References are hashes of the abstract syntax tree, the only way to write a "collision" is to write an identical function--which isn't actually a collision at all.
Good point. I do the same in my own system and use hashes of the source code, so there are no collisions. Slowly this technique will become mainstream, I predict.
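A toy version of the idea (real systems like Unison hash the normalized AST; stripping whitespace here is just a stand-in for that normalization step):

```typescript
import { createHash } from "node:crypto";

// Content-addressed references: a definition's id is a hash of its
// normalized source, so writing an identical definition yields the
// same id rather than a conflicting one.
function refOf(source: string): string {
  const normalized = source.replace(/\s+/g, ""); // toy normalization only
  return createHash("sha256").update(normalized).digest("hex");
}

refOf("f = x -> x + 1") === refOf("f=x->x+1"); // true
```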
This, totally. This is one of the reasons that classical RDBMS paradigms and software like MySQL still survive despite however people want to talk them down in favor of "NoSQL" or non-relational databases like MongoDB, citing how fast or how cool they are in comparison.
For some things, you need the time tested solutions.
Sounds like RCS [1]. I remember, back when a company I worked for switched from RCS to CVS, one of my coworkers was annoyed that CVS didn't support locking checkouts.
Now I feel old. I remember "Anything but SourceSafe" [0], which was a follow-up to "Visual SourceSafe Version Control: unsafe at any speed", and having my trust evaporate when I found out Microsoft didn't dogfood their own version control system.
To get back on topic, the key thing an explicit database gives you is a purpose-built language (and data-integrity enforcement etc., if you do it properly) that everyone knows. (Or used to? SQL is getting more hidden by abstraction layers/ecosystems these days.) I'm old, so I reach for my older, well-understood tools over new and exciting ones. Get off my lawn. It may be over-architecting, but I'm also not working in 'performance in milli/micro-seconds is vital' high-load environments, or releasing updated software every other day.
The other issue is tool/eco-system fragmentation.
But when you're young and have the energy and mental capacity to abstract out the wahoo for efficiency/performance, you do, because you can, because it's better at the time. In our day everyone was writing code to write code, which was effectively the precursor to ORMs. It's just part of being young and committed to your craft, and wanting to get better at it - this is a good thing!
It's only as you get older that you start to appreciate "less is more", around the same time that job ads appear with "Must have 3 years of SQL-Sync experience" (no offence intended here). There are both costs and benefits, but which and how much of each you only find out years later.
Are you sure? My experience of using TFVC was that it would warn you if someone else had opened the file for editing but would not actually lock it. Multiple people could edit the same file concurrently with standard automerging/conflict resolution afterwards.
Server workspaces vs local workspaces, maybe? With server workspaces, your local copy was marked read-only. Don't recall if you could change that flag to edit anyway. We moved to local workspaces as quickly as we could - that was a more typical offline-edit, resolve-conflicts-at-commit model. Don't remember all the details; it's been 5+ years since I did anything with TFS.
Yes, “tf edit” would mark on the server that you were opening the file for editing, and cleared the read-only bit, but it didn’t lock the file for others or prevent concurrent edits in any way.
Back in the early days of TFS I was briefly at a company that went all-in on MS tools. TFS was used, and to avoid the lock each developer had a clone made; after they checked their clone in, the "TFS Guy" in the office would merge it. He also had to merge things when later check-ins had conflicting changes.
Now, the best part of this shit show was they had ~30 different customers, and each of these customers had a clone of the main thing that would be customized. So the "TFS Guy" had to determine whether to keep a change in the customer clone only or to propagate it to the main and then to all the other clones!
Needless to say the “TFS Guy” made a lot of money.
I'm a fan of this approach. SQLSync effectively is doing this continuously - however it would be possible to coordinate it explicitly, thus enabling that kind of check in/out approach. As for single-owner lock strategies, I think you could also simulate that with SQLSync - although you may not need to depending on the app. If the goal is to "work offline" and then merge when you're ready, SQLSync provides this pattern out of the box. If the goal is only one client can make any changes, then some kind of central lock pattern will need to be used (which you could potentially coordinate via SQLSync).
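For illustration, a central lock pattern could be as simple as a table with a uniqueness guarantee (hypothetical schema, not a SQLSync API). Because the server applies mutations in one authoritative order, after rebase only the first claimant's INSERT survives; the loser's replayed INSERT hits the primary-key constraint, and the app learns it didn't get the lock:

```typescript
// Hypothetical sketch of a check-out lock on top of rebase-style sync.
const schema = `
  CREATE TABLE IF NOT EXISTS locks (
    resource TEXT PRIMARY KEY,  -- one row per checked-out document
    owner    TEXT NOT NULL,
    taken_at INTEGER NOT NULL
  );
`;

interface Db { run(sql: string, args: unknown[]): void }

function claimLock(db: Db, resource: string, owner: string) {
  // Succeeds optimistically on this client; if another client won the
  // race, this INSERT fails during replay against the server's ordering.
  db.run(
    "INSERT INTO locks (resource, owner, taken_at) VALUES (?, ?, ?)",
    [resource, owner, Date.now()],
  );
}
```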
Looks very similar to JEDI [0], an early Delphi VCS that worked that way. It gave us the tranquility of knowing that no conflict would appear, as only one developer could work on a locked/checked-out file at a time. There was no merge in those days. On the flip side, files that had to change in every task would always cause blocking between developers.
There is an interaction here between the "what gets measured gets managed" principle and the sunk cost fallacy.
The problem with databases is actually complexity. Any individual feature is more or less safe, but around the time reliability, caching, and indexes get matched together, there is a complexity explosion, and it doesn't (normally, anyhow) make sense to implement a domain-specific DB (call it a DSD?).
But around the time a company has invested in implementing those 3 features, it has sunk a lot of resources into the DSD, everyone is politically averse to recommending it be stripped out, and there is a high real cost to taking out the tech debt in one go.
Really the problem here is SQL's syntax. If using a basic relational database was a pleasant experience that involved some familiar C-like syntax instead of broken English people would be more tempted to go with a DB instead of rolling their own. The NoSQL databases were a good step in that direction, but then they by and large overfocused on big data instead of everyday usefulness. Things like Redis took hold which is nice.
Making it easy to run SQL is a reasonable approach, but the problem is that the good databases - I like Postgres - are SQL-native, and it is hard to get efficiency without speaking the DB's language. We really need a PostgresPostSQL database that is a perfect Postgres clone but whose primary parser supports a language with good syntax.
What exactly is hard about SQL? Every dev imho should know it. And SQL syntax is good and has proven to be long-lasting. Maybe investing some time in actually learning it instead of bashing it will help you further.
- No non-nullable types (at the expression level). No way to express e.g. normal boolean logic
- No real data structures (at the expression level), unless you count rows, which are not first-class values. Even collections aren't first-class
- Very awkward control flow constructs. E.g. look at how you write recursive queries (see the sketch after this list). Even if/else is weird. It's a classical Turing tarpit: everything is possible but nothing is easy
- Absurdly complex grammar. Something like 200+ keywords. Bizarre and inconsistent quoting rules
- Weird rules about where expressions can be used, and where you have to put variables. Often if you want to extract out a common subexpression and reuse it you have to completely restructure your statement.
- Missing basic functionality, e.g. no string concatenation operator (except in some nonstandard dialects)
- No libraries. No standard way of reusing code. No unit testing framework, which means in practice most projects have no testing.
It's a terrible language by any reasonable standard. Its closest cousin is COBOL, and that's honestly less awful to write.
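To make the recursion point concrete, here is "count to five", runnable with sql.js (the in-browser SQLite build mentioned elsewhere in this thread):

```typescript
import initSqlJs from "sql.js";

const SQL = await initSqlJs();
const db = new SQL.Database();

// Even a trivial loop requires a recursive common table expression.
const result = db.exec(`
  WITH RECURSIVE cnt(x) AS (
    SELECT 1
    UNION ALL
    SELECT x + 1 FROM cnt WHERE x < 5
  )
  SELECT x FROM cnt;
`);
console.log(result[0].values); // [[1], [2], [3], [4], [5]]
```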
If you run into issues with non-nullable types and missing real data structures, then I guess you don't get the purpose of SQL and its tech. It's a query language to retrieve data; that's the only purpose. Under the hood is a crafty machine that enables atomic principles and, depending on the SQL tech, other functionality like scalability and more.
No libraries? There are a zillion battle-tested libs out there. Some are used by companies like PayPal, SpaceX, TikTok, YouTube, Gmail, and many more.
And why would you unit test datasets in a database? You normally use mocks to test against your data layer. Most if not all software stacks have mock libs available.
> If you run into issues with non nullable types and missing of real data structures then i guess you dont get the purpose of SQL and its tech. Its a query language to retreive data. thats the only purpose.
That's no excuse for making it bad.
> No Libraries? There are a zillion battle tested libs out there.
Not in SQL.
> And why you would unit test datasets in a database?
For the same reason as anywhere else.
> Most if not all software stacks have mock libs available.
Exactly. They're a basic piece of functionality you'd expect most languages to provide, but not SQL, because it's awful.
`||` is the SQL standard. IMO its biggest deficiency is that a single NULL nullifies the whole thing. But at least that's consistent with how NULL behaves in many other cases.
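Concretely (runnable with sql.js):

```typescript
import initSqlJs from "sql.js";

const SQL = await initSqlJs();
const db = new SQL.Database();

// One NULL poisons the whole concatenation...
db.exec("SELECT 'a' || NULL || 'b'")[0].values; // [[null]]
// ...so the usual workaround is COALESCE:
db.exec("SELECT 'a' || COALESCE(NULL, '') || 'b'")[0].values; // [['ab']]
```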
SQL often gets criticized, and I think for good reasons, but why didn't we come up with something better?
We have dozens of programming languages in use for general programming; it is a field that is constantly evolving. Even JS, which is hard to change because that's what browsers run and you don't control your client's browser, is seeing some evolution, using transpilers, and now WebAssembly.
But for databases, there is only one, and that is SQL. There are some alternatives, but none come close to SQL in terms of usage. So maybe SQL is not that bad after all.
Maybe the reason is that the relational model is really good, and attempts to deviate from this will only work in a niche. That a declarative style is also really good, and again, you won't have much success if you deviate from this. And if you end up doing SQL with a different syntax, for most people, it will be too small of an improvement to justify changing their way.
> Really the problem here is SQL's syntax. If using a basic relational database was a pleasant experience that involved some familiar C-like syntax instead of broken English people would be more tempted to go with a DB instead of rolling their own.
I wish Postgres had a stable, more low-level API than SQL. Perhaps something that resembles the query plan I get from EXPLAIN. An application written against this API could implement an SQL DB: (1) parse SQL (2) implement a query planner that spits out a query plan written against this API.
You'd need databases that were open to splitting out and layering like that, and I don't think any major SQL databases are. (I mean, most SQL databases are still written in C-family languages; splitting out the PostgreSQL parser into its own library took 20 years and still hasn't really been done properly, they do some weird sync thing from the mainline postgresql codebase rather than being the actual trunk for parser development and having postgresql proper use it as a dependency). Maybe you could do something with H2?
Author here. Finally got through most of the questions, will keep checking periodically for ones I've missed (has someone built a better way to track HN discussions?).
I'm very happy to see the discussion thus far. This first post focused on the parts of frontend engineering that motivated me to create SQLSync rather than specifically how SQLSync works. I'll be addressing that in my next post. Thanks for all the great questions! Keep them coming :)
Hey, finally got around to reading the article. I have been grappling with the convoluted, fucked-up mess that is offline-first over the last 6 months. I really truly thought that this was a solved problem which the likes of Facebook, Instagram, etc. solved a long time ago, and that there was some open-source version copying whatever underlying mechanism they created.
As for the prior art, I was surprised you didn't mention WatermelonDB. It sounds like what you are building, but specifically for React Native. I tried to use it, but there were lots of really annoying API choices regarding how the querying works in the frontend and how their ORM works. But the sync methodology they use seems to be sound, and as they say, they've thought long and hard about it. https://watermelondb.dev/docs/Sync/Intro
The project as it is seems pretty low level. Are you planning on adding an ORM layer, or leaving that up to someone else?
The last question that I always have is: how would this project create a Facebook app clone? I.e., the user will almost always have subsets of tables that are much, much smaller than the full tables that the backend has.
I.e., a user should have various posts from the feed, and various users, photos, comments, and liked tables downloaded for the next 30 minutes of offline activity, let's say. In this system, it seems like we are syncing entire databases. But syncing small slivers of the entire system seems to be what a lot of your typical REST-based apps do. Yet it is a problem I haven't been able to solve cleanly or with a unified plan of attack.
Don't give the user a mental model that reality can break ... badly, or invisibly
I fear sync'ing databases instead of client server models is one of those - either your sync mechanism will just melt, or there are deep assumptions not met
I would feel safer building a set of CRDT primitives to work with if I feel the need for fast UI, and sticking with form submits for everything else.
I agree! One of my goals is to make the mental model of SQLSync easy to grok for the developers using it. I'm biased, but I find the rebase model much easier to understand than CRDTs.
I feel like you might be missing the point of the parent comment. Synchronizing between databases is a notoriously difficult problem. It's really hard to do while avoiding race conditions. To be fair, I don't understand the "git rebase" technique your project uses, but I'm doubtful it solves these problems.
The underlying issue is that users of SQLSync are going to assume that consistency works, but in subtle and unsuspected ways it won't.
As far as I can tell, the only solutions that handle distributed consistency are those that use CRDTs.
I've thought a lot about this, but sure, there are still some rough edges to figure out. See this comment for a bit more of a deep dive into how SQLSync works and why it might be possible to make it easier for devs to write correct code that converges: https://news.ycombinator.com/item?id=38502987
Also, I'm considering adding a CRDT column type to SQLite to enable embedded collaborative use cases while using the relational model for the rest of your data's structure.
Trying to synchronize state between client & server is a cursed problem.
You can sidestep it altogether if you make mild UX sacrifices and revert to something more akin to the PHP/SSR model. SPA is nice, but multipart form posts still work. Just the tiniest amount of javascript can smooth out most of the remaining rough edges.
Our latest web products utilize the following client-side state:
3rd party IdP claims for auth
1st party session id in query args
The current document
For the first item, I genuinely don't even know where this is stored. It's Microsoft's problem, not ours. All other state lives on the server. We treat the client more or less like a dumb terminal that punches <form>s all day. We don't even use first party cookies or local storage to get the job done. This dramatically improved our development experience for iOS/Safari targets.
So, I would ask this: What is the actual experience you are trying to offer, and why does it justify decoupling of client & server state?
It's a very common trend for consumer-facing GUIs to have optimistic rendering, and if you're doing that then you're juggling client/server state. I still see spinning loaders here and there, but they're generally for initial content load; e.g., does Gmail make you wait when you archive an email?
Not the GP, but I would include optimistic rendering on the list of common patterns that really are a bad idea.
Optimistic rendering means your frontend and backend are tightly coupled, error recovery and synchronization are much more complex, and you are locked into (likely heavy) frontend rendering patterns that add even more complexity and coupling.
We've spent well over a decade trying to avoid the fact that frontend actions require backend logic to complete. It's a losing battle that's just made worse by trying to paper over it.
Edit to clarify: this isn't a direct comment on the OP tool. I haven't used this tool directly, but I have a hunch it does solve some important use cases for common patterns.
the real UX goal is to minimize data loss (ie. provide the ability to resume whatever the user was doing, from a point that's as recent as possible), and for this it becomes necessary to provide "auto-save"
at that point there's already a diverging state problem, even if it's saved to localStorage (but obviously it's much better to save it on the server)
it's absolutely ridiculous that we spent the last 10+ years simply reinventing the same wheel (all low-level tools, Angular, React, Vue, rxjs, hooks, signals, whatever, etc), without providing the useful composable primitives for the hard problems.
and now there's yet one more thing on the frontend, SQLite. okay, there's a sync thing too. it's not like we didn't try it with that adorable DB (RethinkDB).
Can't say I've been asked to build optimistic updates for that reason, but the irony there is pretty rich if that's the UX goal you've seen it used for.
In my experience optimistic rendering actually creates more data loss risk than it solves. Caching layers are involved, state is duplicated, and routing comes into question as users could leave the page before a failure state is returned.
No, I mean in general, that's the hard problem. The main goal is that the user wants to do something, but there's an instrumental goal that the system has to be resilient to typical real world problems, otherwise the users will have a bad time.
For example the user wants to interact with the service, order something, check for new messages or send one.
On the happy path things are trivial, easy, synchronous, even quite resilient. There's not much complexity to manage on "no new message" or "add to cart, yes, buy with saved card, ship to saved address, k thx bye", if the browser or server crashes, or there's a network error, the interaction is short, easy to retry, problems are not hidden (quasi instant feedback)
But as the total interaction time increases, as the state the user would need to reproduce grows (adding a new card, shipping/billing address, adding a lot of stuff into the cart, writing a long message), the need for partial state sync arises.
gmail makes you wait when you are deleting spam (i.e. selecting ~100 messages and clicking "delete permanently", or whatever it's called, and it's surprisingly slow)
Indeed it's a very hot space! So exciting to see all the different approaches.
ElectricSQL and PowerSync are both tackling the very hard problem of partial replication. The idea is to build a general solution which allows a traditional centralized db to bidirectionally sync only what's needed on the client side - while still supporting optimistic mutations (and all the consistency/conflict stuff that goes along with that).
The downside is implementation complexity. Both require the ability to keep track of precisely the set of data on each client in order to push out changes to only that subset of the overall database. In addition, specifying which subsets of the database state to pull down requires a new DSL and is a new thing to learn (and optimize). That said, I'm stoked they are taking on this extremely hard problem so when SQLSync is ready for partial replication someone will have already figured out the best practices.
SQLSync, on the other hand, only supports full db sync. So every client will see a consistent view of the entire database. You might immediately wonder if this is a good idea - and for some apps, it's not. But consider a personal finance app. The main goal is cross device sync, cloud backup, offline capable, etc. In this case having the entire db stored on every device is probably what you want. Another example is a document oriented data model, such as Airtable. Each Airtable could be a distinct database, thus leaving it up to the client to manage which tables they care about.
(added in edit:) By focusing on full db sync, the sync engine is much simpler than solutions that support partial replication. One benefit of this is that the backend is very lightweight. Currently the demo (https://sqlsync-todo.pages.dev) runs entirely within Cloudflare Durable Objects using very little storage and CPU time.
SQLSync has a ton of work to do to make these use cases possible (still very much a prototype), but my initial tests have been extremely promising. Hope this helps!
(edit: clarified language regarding centralized dbs and full db sync. Also added paragraph regarding full db sync)
Phillip from PowerSync here, always good to see more people working on problems in this space.
A few things to clarify:
>one multi-tenant centralized db to bidirectionally sync
PowerSync supports syncing from multiple databases.
>The downside is complexity.
I'd say this is true if you're building a partial replication system yourself. PowerSync gives you a ready-built system that's been proven at scale and therefore lets you avoid most of that complexity.
>SQLSync, on the other hand, is full db sync.
It's just as easy to sync the full db with PowerSync as do partial sync.
>But consider a personal finance app. The main goal is cross device sync, cloud backup, offline capable, etc. In this case having the entire db stored on every device is probably what you want.
A bit confused by this. If I'm a developer of a PFM, I don't want anything but a single user's financial data synced to their device. This sounds like partial replication to me.
Precisely. In the SQLSync model - every user would have a private database just for their data. For example, this is how the todo list demo works: https://sqlsync-todo.pages.dev
(Note: currently SQLSync's server tier doesn't support auth, just random 128bit ids. Auth will come as it matures - but don't use this for anything super secure at the moment).
Many times the thought “what if we just shipped the database to the client” has crossed my mind in large multi tenant apps where individual datasets were relatively small. I’ve never gone far with it as it seems sufficiently outside the norm to be a cursed architectural pattern. Would be nice to find out I was wrong.
I once did this with a calorie counting app. Even with hundreds of thousands of foods in the database, the app took much less space than most media apps or games.
I'm also interested to find out if it's cursed :) So far it's been a lot better than I expected. Apps like https://sqlsync-todo.pages.dev are trivialised with this pattern.
Tons of work to do in order to really prove it out though. But I'm pretty excited to keep pushing and see where this goes.
The correct answer is to move the UI back to the server, where the database already is, and just send HTML to the client-side HTML renderer (or "web browser"). This whole post is just "front end has gone off the rails, over the cliff, and into the ocean."
This isn't a problem of only websites. Should mobile and desktop ecosystems start making a big move for thin-client like the browser? Should a simple app like Apple Reminders or Google Tasks have the GUI pause if there are delays or connection issues?
> Should a simple app like Apple Reminders or Google Tasks have the GUI pause if there are delays or connection issues?
Yes, they should, because I _need_ good feedback for connection issues.
I have not used those two specific apps, but for other "online first" apps it's such a common problem. Open an app, type a note, switch back to a different app (or turn off your phone, or close laptop lid).
Later on, you want to access the note and it's not there. Why? The authors decided to be "smart" and "offline first" and made sync happen in the background, with no feedback, or with heavily delayed feedback... and I was walking out of wifi range when I was typing the note, so the connection was spotty.
The OP's demo TODO app has exactly the same problem - no indication when the data is already uploaded. So please, if your real goal is collaboration, let user know when you cannot perform it, don't sweep it under the rug hoping the things would get uploaded eventually.
This is such solid feedback - and a frustration I have with offline first apps. In particular Obsidian sync has this problem which is made worse because it doesn't seem to want to sync in the background. (other than that I love Obsidian - just would be nice to get some more feedback)
I think my preference is that it works instantly, but has some sort of indicator that syncing is working. Users should be able to see when it is safe to assume state is consistent with what is on the server.
And thus we get back to the need for the Exchange ActiveSync icon from the Pocket PC era! I have finally lived long enough to see a full tech cycle go from new hotness, to "duh, everyone does it this way", to a good if boring option, to "that's so lame" and, worse, legacy, to "what's that?"
Agree, it's dealing with distributed systems all over and handling a very limited sandbox on the client side.
I think my sibling comments are completely underestimating the complexity and the market value here. Apple Reminders or Google Tasks are customer-facing applications for single individuals; nobody but a single client is working at a time on their own data. Businesses, on the other hand, want their data at a central location and personnel close to the tasks using client applications. They don't want any reliability issues in the client-server communication blocking the work being done. Heck, businesses want to be able to throw more people at the same tasks, and we call it collaborative editing.
Read-only access for coarse-grained pages (as opposed to building a fine-grained ad-hoc DB) seems something reasonable (and easy) to cache for any kind of frontend.
That would allow offline viewing to a variety of apps, regardless of approach.
Last time I checked, Google Docs doesn't allow editing offline files by default, which hints at how hard it is to support substantial features beyond mere reading.
If I'm not mistaken, Google Docs has allowed offline editing for a very long time? You might have to enable an extension that comes by default in Chrome. For offline-capable rich-text collab there's also Notion or Apple Notes.
That could just hint at complex legacy code that painted Docs into a corner and prevented them from easily supporting that feature without a full rewrite. No doubt the problem is challenging but just because Google didn't do it for docs does not mean it's necessarily some Herculean feat.
The authorization story in letting the client upload a new version of the application database (after a drop in connectivity) sounds like a total nightmare.
I just don't think there are "embarrassingly client-side, but also needs a server for some reason" web apps that would benefit from this in the real world. Even Google's or Apple's version of the simple Todo app has a lot of (useful) integrations, which means having some level of trust with the client.
Recently gave htmx a spin. It is absolutely bananas how much complexity it removes and how much more productive you become as a result. The fact that you can use whatever stack you want is also such a blessing. I tried it with ocaml + web components and it’s a 10/10 super productive experience. Only need one build tool that compiles faster than I can blink, no wiring needed between frontend and backend to map json, it is just insanely productive.
I took a close look at htmx, and my impression is that the problems it addresses can be resolved with even fewer than its 4000 lines of JS [1], and without having to familiarize myself with its many opinionated approaches. The crux of the matter is to generate HTML on the server, as much as possible. I know how to achieve that without htmx.
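The core of it is a fetch-and-swap, something like this toy version (htmx's remaining lines go to history, form inputs, events, extensions, and so on):

```typescript
// Any element with a data-get attribute fetches an HTML fragment from
// the server and swaps it into a target element. The data-get and
// data-target attribute names are made up for this sketch.
document.addEventListener("click", async (e) => {
  const el = (e.target as HTMLElement).closest<HTMLElement>("[data-get]");
  if (!el) return;
  e.preventDefault();
  const res = await fetch(el.dataset.get!);            // server renders the HTML
  const target = document.querySelector(el.dataset.target ?? "body");
  if (target) target.innerHTML = await res.text();     // client just swaps it in
});
```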
The marketing materials for htmx are also a bit off-putting to me, and the authors seem to be enjoying fueling the flames of old and useless holy wars about what REST and hypermedia "actually" are, and how we all need to get back to basics, and remember to thank Ted Nelson before every meal, etc. Their online book spends something like 30 pages explaining what hypermedia is... [2]. I prefer not to sign up for a religion when I choose a JS framework (honestly, that's a bit hard these days :-/).
i don't think htmx is very opinionated: it generalizes hypermedia controls in HTML and that's about it. No strong opinions on how you use it beyond that, can work reasonably well w/ any back-end that produces HTML. The goal is to expand the set of UI use cases that can be cleanly expressed in HTML w/o resorting to (user) scripting.
Yeah, htmx is 3800 LoC, and you could do an 80/20 version of it for a lot less, but there are a lot of details to get right: history, collecting inputs, etc. plus, since it's a general purpose library, it has an extensive event system, extension mechanism, etc. There isn't a ton of dead weight to drop, although 2.0 should be smaller as we drop some bad ideas in 1.x.
I don't care too much about the holy wars around REST, I found them off-putting back in the day before I really understood the concept, but I am passionate about the uniform interface and communicating to developers why that was interesting and different. I do go a bit hot at times, especially at twitter, but, on the other hand, if I didn't, would you have ever heard of htmx? I try to balance that all out w/ reasonable essays and the book, which, I think, is worthwhile for most web developers to read (it's free online.)
My tone was a bit strong—my bad! I appreciate your thoughtful response. I must add I realize running an OSS project and writing a book is a massive effort. Big respect for not just sharing ideas but actually implementing them.
From where I sit, I welcome the opinionated approach. I don't need my presentation layer to be industry defining. Htmx allows me to quickly make pages in a way that has never been simple for me using other frameworks. I am primarily a backend engineer though. Maybe there is some glaring weakness that I am missing, but I have yet to find something simpler for my needs.
I don’t know what SPA’s you have the pleasure of using, but most SPA’s I’m subjected to are an exercise in molasses like interactions and loading spinners.
Yeah, they are building SPAs incorrectly by relying on something like `react-query` which is waterfall rendering land. People don't truly understand the sacrifice they are making by using `react-query`. It isn't designed for large scale SPAs, it's designed for small sites that just need to fetch a little data.
Why? Gmail is one of the most used webapps in the world, surely it shows its age, but it's doing its job pretty well, and way better than all the other commercial webmail clients I've tried.
I think it disappears not when you move away from SPAs, but when you move away from the high-interactivity features that products which are amazing to use often have.
This especially becomes true for products that are expected to work in low internet zones.
It disappears if your customers have reliable networks, and they are either close enough to the datacenter that the database is in, or you have sufficiently smart database replication. So, often, the problem comes back, but you're synchronizing between datacenters.
Running server-side does seem to be one of the problems SQLSync wants to handle? I wonder how well it does at that compared to other ways of doing it?
Only for some class of websites with limited interactivity.
On iOS/Android, before all the WebKit apps took over, many apps would use a local SQLite database to support offline edits and implement a syncing protocol to the server. It's a lot of work, but the end product can be quite handy. (This is how you'd expect your email client to work, for example.)
"Limited interactivity" is misleading. It allows much broader interactivity than you think, and arguably most sites and apps would work just fine in this context.
If you want new data, you just fetch it again from the server, and the server returns inherently-fresh data, reasonably fast, along with the HTML fragments necessary for a re-render (over ajax or websockets)
i thought the whole premise of the article was that you don't want to do that, you want to cache some stuff, and instead of writing the cache stuff (a db) yourself, use a real db in your frontend.
if you wanted to just fetch data from your server, it's not a problem anyway, right? a spa can also just fetch fresh data from a server. the whole point of the frontend cache was optimising ux/latency, e.g. for apps with global users but not globally deployed servers
I'm not an HTMX believer, so excuse my potential ignorance, but as far as I know the whole principle of HTMX is to keep things so simple that they cannot go out of sync. (Mostly because HTMX is basically a set of locally cute JS snippets helping with managing the very local interaction/state of "user clicks button -> JS disables button, sends request to backend, waits for response, puts response somewhere in the DOM, and re-enables the button (or removes it, etc.)".)
everything else is simply HTML, <a href.., routing is on the backend, etc.
> 1. always refetch data. always in sync, but needs a server request, so it's slow
Except you're not fetching data in htmx, you're fetching hypertext. This is an important distinction. With a JS app you fetch data, integrate it into some local data structure that is then rendered into some form that generates HTML that the browser then renders. With htmx, all of those steps happen in the (faster) server, and the client only receives and renders the HTML. The client-side steps are faster.
Furthermore, apps end up structured differently because you're not querying fine-grained data.
If apps are structured differently as I said, then that does change the pattern and possibly even the total number of network requests. I've seen a few anecdotal reports of reductions in complexity and latency. I would like to see a more systematic study of this to eliminate the possibility that this may be due to second iterations typically being better engineered than the first when the problem was not fully understood.
I'm currently writing a very similar article about "full-stack databases" which highlights the same pattern, where many apps end up recreating the logic of the backend and database in the frontend client code. The solution we're promoting is to choose a database that can run on both the server and the client and then sync between them.
The reason we aren't using SQLite for our product is that SQL is frankly not the right tool for querying data for an application. It doesn't easily map to the data structures you want in your client code, and nearly all SQL databases have no way to subscribe to changes to a query without polling the query repeatedly.
So if you like the idea of having a complete database on your client but also want deep integration with Typescript/Javascript check out what we're building at https://github.com/aspen-cloud/triplit
SQLite does actually provide the mechanisms required to listen for changes via update hooks. It's unfortunate that many SQLite bindings don't expose that. I'm using it in a very simple way - automatically rerun the query if data in the underlying table changes. It's perhaps not as efficient as incrementally updating the results, but with how fast queries in SQLite usually are, I find that doesn't really matter.
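In sketch form (the `updateHook` binding here is hypothetical; the underlying C API is `sqlite3_update_hook`, which not all bindings expose):

```typescript
// Rerun a subscribed query whenever its underlying table changes.
interface HookedDb {
  query<T>(sql: string, params?: unknown[]): T[];
  updateHook(cb: (op: "insert" | "update" | "delete",
                  table: string, rowid: number) => void): void;
}

function watchQuery<T>(db: HookedDb, table: string, sql: string,
                       onRows: (rows: T[]) => void) {
  onRows(db.query<T>(sql));               // initial result
  db.updateHook((_op, changedTable) => {
    // Coarse but simple: any write to the table reruns the query.
    // SQLite queries are usually fast enough that this is fine.
    if (changedTable === table) onRows(db.query<T>(sql));
  });
}
```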
> It doesn't easily map to the data-structures you want in your client code
I disagree. Normalizing data is critical for FE reactive applications, keeping data up-to-date basically requires it; all CRUD operations are much easier to handle.
Realm Sync, Mongo + Kotlin MP will cover basically all platforms (server, web, mobile, desktop)... at a cost. Actually interested by alternatives.
Will this be part of your article?
Yeah, you can subscribe to overall changes to the data on a row-by-row basis, but you can't subscribe to an actual query. Many apps and libraries imitate reactive queries by just refetching all queries from Postgres when any data changes, or by repeatedly polling the query every 10 seconds or so, but this puts a lot of strain on the database. You can just subscribe to the replication stream, but then you're left trying to reconstruct your queries in your application code, which is extremely error-prone and painful.
Good question! You can use an ORM with SQLSync. Currently SQLSync doesn't provide an ORM layer for two reasons:
1. many ORMs already exist; it's better to integrate with them
2. it's a prototype, so I started with the lowest common denominator, which is raw SQL
SQL.js is an inspiring project in the space and led to official upstream support for compiling SQLite to Wasm. Without these projects SQLSync would have been much more difficult (akin to the original difficulty of SQL.js in the first place). That said, SQLSync is also unique from SQL.js in that it includes a synchronization layer that coordinates with the SQLSync server to provide real time collaboration between users.
Precisely! The same database (SQLite) is running on both the client and the server. SQLSync provides a custom storage layer to SQLite that keeps everything in sync. As changes are made locally (optimistically) they are synced into the cloud where they are eventually applied to the primary db replica. The clients subscribe to changes from the primary db and then rebase themselves to stay in sync.
I really need to write up a detailed overview of how this works! Thanks for the feedback!
My front-end DB would look a lot different from the back end's. A lot of mutations involve submitting work and waiting for distributed jobs to roll up into some kind of partial answer: this worked, that part didn't, etc. Long-running transactions, workflow that spans months until the final sign-off.
I do need better in-the-moment state in the client, though. I was looking at React Query with perhaps websockets for cache invalidation. It's nice to have this SQLSync idea to consider too.
This is solid feedback. One integration I'm stoked to build/find is some kind of durable job queue that can sync directly with SQLSync. Would be so nice to just commit a job request into the db and know it will eventually run and update the state. If anyone wants to experiment with this let me know! :)
I'm trying to achieve something similar with SignalDB: https://signaldb.js.org/
It uses signals for reactivity and is framework agnostic with a mongodb-like query syntax
Relational databases are mathematically elegant and useful in certain ways for arbitrary data retrieval and manipulation, but I don't see why they would be the best fit browser-side for any but the absolute largest and most complicated front-end systems.
Is the typical in-browser code really complex enough that one would want to introduce the object-relational mismatch and all of its associated issues?
Most in-browser code wants to operate on a limited set of very well-defined tree data structures, which is a bad fit for relational databases because relational databases are designed to represent a generic graph that can be virtualized into a tree arbitrarily and as necessary.
I am getting the impression that this might work for small data sets. Does it work for large data sets? My webapp has to work with 100s of GBs of data in MS SQL Server.
Admittedly, I haven't yet read the linked article. But, I plan to.
Currently, SQLSync copies the entire SQLite database to every client. So in order to support large datasets you would need to split up the data over many databases. Some applications can do this, some can't. It depends on the specific business and usage requirements.
Perhaps in the future, SQLSync will support partial replication which would enable this use case. As always there are trade-offs to consider with both approaches.
Can someone explain to me how it's syncing the state between two different devices without any activity in the Network tab in DevTools, not even WS traffic?
I get that you can sync state between browser tabs, but I'm trying on two different devices (iPhone and Desktop).
And as far as I can tell, the Wasm layer can't perform network requests directly.
UPDATE: In the console tab I can see `coordinatorUrl: 'wss://sqlsync.orbitinghail.workers.dev'`, but I was expecting to see this WebSocket connection in the Network tab, and it isn't there.
Good catch! SQLSync runs in a shared worker to enable cross-tab reactivity and centralise both local storage and the replication system. A shared worker's network traffic doesn't show up in the page's Network tab; you can inspect the worker at the magic url: chrome://inspect/#workers
I've found that TanStack Query [0] solved all of my problems around data fetching for the front end. It supports many different frameworks and supports plenty of advanced features.
SQLSync also looks very interesting. I'll try it out in my next project!
I misinterpreted the title, I clicked because I thought it was literally about how we should stop building and designing more actual databases, which is something that I agree with and is perhaps a hotter take than the one in the article.
There are hundreds of subtly different databases out there, but there are only a dozen or two that really ought to exist. So stop making new databases for every little thing, folks.
I also built a similar serverless project but with a focus on no-code.
I built a support chat app with it (which we use ourselves) using only declarative HTML components. Supports both public chat and private chat with authentication.
It's fully working now but I'm now focusing on walkthroughs/guides.
This looks interesting and I might give it a try, but after watching the talk I'm still a bit unclear on why you chose wasm-in-wasm for the reducer.
I suspect you would be better off creating a Rust reducer trait, and lifting the wasm-in-wasm complexity into a new crate that implements the reducer trait through wasm-in-wasm, for the people who want that.
Totally fair question. Nothing is set in stone - but this approach made it very easy for me to run precisely the same code on both a generic backend (Cloudflare Durable Objects) and in the frontend stack.
As for wasm-in-wasm specifically: it would be nice to experiment with the component model to load reducers alongside SQLSync, but the UX isn't quite there yet.
> It isn't clear how they manage the subscription to data updates/inserts/deletions - it mentions supporting triggers, but that feels icky to me.
Architecture post coming soon. In the meantime, I want to clarify that SQLSync does not use triggers for sync. Instead, I hijacked SQLite's page storage and added page replication to it. Writes are consolidated through an API I call the "reducer", which allows SQLSync to keep track of which logical writes correspond to which sets of page changes. The actual sync is pretty dumb: we run the reducer on both the client and the server. The client replicates down server pages, and then periodically throws out local changes, resets to the server state, and then replays any mutations that haven't yet been acked on the server.
You say things like "X is pretty dumb" and then go on saying stuff I don't understand. Pretty annoying if you ask me.
And that's despite me having worked with Cassandra, Kafka, Postgres, a variety of programming languages and DevOps tools, plus Vue.js and React.
The reducer takes objects representing mutations and applies the appropriate changes to the state.
It's called a reducer because it operates like a reduce/fold operation: if you take a given state and an ordered list of mutations, you'll end up with a new state out the other side, much like `let newState = list.reduce(reducer, initialState)` would in JS.
The reducer model is how Elm's state management works, and Redux's, and is what allows replaying of subsequent changes against a previous state to get to a current state (which enables, amongst other things, some of the features of those systems' respective dev tools).
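A minimal TypeScript sketch of the idea (the `State` and `Mutation` types here are invented for illustration, not SQLSync's actual API):

```typescript
// Invented types for illustration; not SQLSync's actual API.
type State = { count: number };
type Mutation =
  | { type: "increment" }
  | { type: "add"; amount: number };

// A reducer is a pure function: (state, mutation) -> next state.
function reducer(state: State, mutation: Mutation): State {
  switch (mutation.type) {
    case "increment":
      return { count: state.count + 1 };
    case "add":
      return { count: state.count + mutation.amount };
  }
}

// Folding an ordered list of mutations over an initial state yields the
// final state -- which is exactly what replaying changes relies on.
const mutations: Mutation[] = [{ type: "increment" }, { type: "add", amount: 5 }];
const newState = mutations.reduce(reducer, { count: 0 }); // { count: 6 }
```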
Ember.js has contributed more innovations to modern framework ideas than any other framework, really.
EmberData, Ember Routing, Ember Multi Applications (can't remember what it's called, but it's a precursor to microfrontends), all-in-one CLI tooling, etc.
I could never understand what holds Ember back from being more used. I think it used to be performance but I think they addressed that with Glimmer many years ago.
Plus people like to tinker and wire up things together themselves, like router, rendering, state management, testing, etc. That's more exciting than `ember new project`.
I was like that 10 years ago too, but now when I learned the ropes I just want to focus on shipping and providing value, rather than wasting my life wiring up 17 javascript packages of the month together.
> writing all the data mutation logic within a reducer that can be easily shared on both the client and the server
This part… makes me feel odd. Is it actually realistic? Reducers for everything, keeping WASM and backend code in sync. Sounds like a lot of work and cognitive overhead.
The idea is that the reducer is your backend code. Think of it like writing your normal API layer: you handle requests and make changes to the database. Since it's Wasm, SQLSync automatically runs it on both the client and the server. Thus you've written your backend once but run it everywhere.
As for the question of "what if my backend should access stuff that the client doesn't have access to" - the answer is that the reducer doesn't need to do the same thing in both environments. It's easy to call out from the Reducer into an Auth API which only runs on the server. SQLSync will guarantee that everyone reaches a consistent state even if the same mutation results in different outcomes when run locally vs remotely.
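A rough TypeScript-flavoured sketch of that idea (SQLSync's real reducer is Rust compiled to Wasm; every name below is invented for illustration):

```typescript
// Invented interfaces; SQLSync's actual reducer API differs.
interface Database { exec(sql: string, params?: unknown[]): Promise<void>; }
interface AuthApi { canDelete(userId: string, docId: string): Promise<boolean>; }
interface Env { isServer: boolean; authApi?: AuthApi; }
type Mutation = { type: "delete_doc"; userId: string; docId: string };

async function reduce(db: Database, mutation: Mutation, env: Env): Promise<void> {
  switch (mutation.type) {
    case "delete_doc":
      // Only the server consults the auth service; the client applies the
      // delete optimistically and converges to the server's outcome on rebase.
      if (env.isServer && env.authApi) {
        const allowed = await env.authApi.canDelete(mutation.userId, mutation.docId);
        if (!allowed) return; // rejected server-side; the local change rolls back
      }
      await db.exec("DELETE FROM docs WHERE id = ?", [mutation.docId]);
  }
}
```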
Anyone remember when "frontend applications" were actual applications and not web pages? I'm willing to bet we have reached that point where new devs literally do not remember that time.
There comes a time [..] where we [..] need to cache data from an API. It might start off benign – storing a previous page of data for that instant back button experience, implementing a bit of undo logic, or merging some state from different API requests.
The browser is annoying to control, so you're trying to make javascript jump through hoops to do what you want...
SQLSync is [..] designed to synchronize web application state between users, devices, and the edge.
So you want your stateless application to be stateful, to sync that state with some other state, so you can cache data easier, so you can make the browser pretend it's not a browser.
Can we not just admit the browser is a shitty client app? I get that like 50% of the software developers in the world have made the browser their literal livelihood and career, but jesus christ, we need to end this madness. Either make a full blown VM and just write normal client applications that run in it, or stop trying to make client applications out of javascript.
It's insane how much of the world's engineering effort and technology investment money has gone into putting lipstick on a pig. If we took all the money invested in browser ecosystems and put it towards spaceflight, we'd be living on Mars right now.
History doesn't repeat itself, but it does rhyme. Once upon a dark age, we had mainframes and dumb terminals. Then came the first age of the PC - let's pull everything to the client. Then came servers and "thin clients". With faster processors and cheaper storage came the second age of the PC, with only the permanent data storage left on the server. As the Internet grew, centralization came back: web services and the cloud, with clients just serving dumb web pages.
And now we see the beginning of a migration back to client-side computation and storage.
Somehow, though, this latest iteration doesn't make a lot of sense. It's hard enough maintaining data consistency on a web service that may be used by hundreds or thousands of people. Imagine when this data is cached in microdatabases in unreliable browsers.
On top of that, the browser makes an absolutely horrible programming environment for client-side apps. For the programming part, JavaScript is a poor language, so you wind up using heavy-duty frameworks like React to make it tolerable. For the UI representation, that's just not what HTML/CSS were ever meant for. So you get frameworks there as well. Young developers think this is just the way it is. No, actually, it's more like the worst of all possible worlds. Using something like JavaFX or (I know, I know) even Visual Basic, you can produce a functional, robust UI with a tiny fraction of the effort.
years and years ago on a C++ forum someone made an observation that was eerily similar to yours. I still remember it to this day as it stuck in my head.
They made an observation that our industry goes in cyclical centralize/de-centralize cycles and that we were (at the time) entering into a centralization cycle.
Now here I am reading a comment that we're going back into a de-centralization cycle and I wouldn't be surprised if you're the same poster.
probably 15-20 years ago (maybe more?) I made a prediction that I still think will come true.
The OS will become the "browser" and applications will run directly on the OS and will access local resources through standardized interfaces. WebAssembly and things like WebGL are already moving us in that direction. Honestly HTML5 was the first time I recognized standard updates as moving us towards that reality with things like localStorage, etc.
I honestly think if someone more imaginative had the helm at MS when the cloud started getting big they would have eaten Google's lunch by leveraging their desktop dominance into the web. Instead they did inane things like display websites on the desktop (win98 IIRC).
Could be - I've been making this observation for a long time. The cycles keep going. On the other hand, probably lots of other people have commented on it as well...
You may be right about the browser becoming the OS. Chromebooks were already a step in that direction. But JS/HTML/CSS really is a horrible combination for application programming. If the browser does become the OS, can we please get decent technology to work with?
HTML and CSS aren't too horrible as foundational tech. CSS needs a little work, but both allow for a lot of good abstractions. They may be some of the best platform primitives we've ever had.
They are bad at just being directly programmable targets.
> make the browser a full blown VM and just write normal programs that run in it
This is actually happening, albeit slowly, with recent efforts around WASM etc. If you want a fun hypothetical of where this all goes, check out the talk "The Birth & Death of JavaScript". Link here: https://www.destroyallsoftware.com/talks/the-birth-and-death...
You’re not wrong, but the web won because it had a superior delivery system: URLs. Everything weird about the web era of development has been about contorting everything to be URL-oriented.
But consider how WASM is now turning the browser into an app delivery client. Not a “html and json bodged into an app”, but a real honest to god app.
This project happens to be browser-based because that's a convenient place to put a WASM app, and it has a decent presentation layer. But it doesn't have to be!
Not just delivery, but also security. Browsers offer a level of isolation and safety that you generally don't get with native desktop apps. Things like iOS do bridge the gap a bit more though
> Browsers offer a level of isolation and safety that you generally don't get with native desktop apps.
They didn't originally: Java <applets> and ActiveX <objects> originally weren't sandboxed and had free run of the visitor's computer.
All major OSes today have built-in support for process/app sandboxing. I suppose if the "rich client" frontend model (WPF, etc.) had been more popular, desktop OS application isolation would have been introduced much sooner.
Security development happens where the market demands it, and rarely does it happen where it's actually needed.
They don't, though. Browsers are almost trivial to exploit.
There have been seven (7) 0day exploits in Chrome this year (that we know of). Know how many CVEs there were for Chrome in total in 2023? Two-hundred and forty (240). That's impressive. And this is the browser people brag about as being secure.
> Anyone remember when "frontend applications" were actual applications and not web pages?
I do. And also I remember building those apps. It was not as simple as building webapps today.
Yes, there are downsides to this model (a lot of them) BUT you can whip up a simple app with a beautiful UI in a couple of hours today. It was not like that 25 years ago.
> It's insane how much of the world's engineering effort and technology investment money has gone into putting lipstick on a pig.
I'm a JavaScript developer and yes, this is deeply disturbing. Even more so after I built a framework [1] that just copycats what PHP/Rails does (treat the server like a server, and the client/browser like a client—no SPA spaghetti). It works, it's fast, and makes development 100x easier.
I applied the same heuristic to deployments, eschewing all of the containers dogma and lo and behold, you can achieve great performance (and iterative speed) with old school, minimal tech (bare metal/VPS and just install deps and code directly on the machine).
I completely agree with you, I'm going to copy part of another comment I made
-----
probably 15-20 years ago (maybe more?) I made a prediction that I still think will come true.
The OS will become the "browser" and applications will run directly on the OS and will access local resources through standardized interfaces. WebAssembly and things like WebGL are already moving us in that direction. Honestly HTML5 was the first time I recognized standard updates as moving us towards that reality with things like localStorage, etc.
I honestly think if someone more imaginative had the helm at MS when the cloud started getting big they would have eaten Google's lunch by leveraging their desktop dominance into the web. Instead they did inane things like display websites on the desktop (win98 IIRC).
What I find annoying is the still-existing problem that sometimes apps don't load properly and you have to refresh the browser. You don't get this with desktop apps. There are some caching capabilities in browsers but they are not being used by anyone to cache app code and resources. If I'm using an app for the first time it should properly load all code and resources or else report an error.
I totally agree! I would love a VM designed for development simplicity & performance that is built with a standard protocol & interface for accessing arbitrary VM applications over the net.
I won't forget that someone at Google didn't have the courage to enable Dart in Chrome as a successor to JavaScript. And someone killed SQLite as an in-browser db.
I used CouchDB (on the server, with TouchDB on Android and iOS, PouchDB on the web, ...) for this kind of thing. Clients were directly connected to cursors over the local db. How and when that local db exchanged data with the server or others was no longer the client's problem :)
The thing I've used for this kind of problem is Fuse.js, which is a lightweight search index. You can load it with a list of JSON documents and do structured or fuzzy string searches. I find it pretty well suited to the kind of frontend experiences I need a lot of data for.
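For reference, basic Fuse.js usage looks roughly like this (sample data invented):

```typescript
import Fuse from "fuse.js";

// Invented sample documents; the point is the shape of the API.
const docs = [
  { title: "Quarterly report", author: "Kim" },
  { title: "Quarterly review", author: "Jin" },
];

// Build a lightweight in-memory index over the JSON documents.
const fuse = new Fuse(docs, { keys: ["title", "author"], threshold: 0.3 });

// Fuzzy search tolerates typos and returns ranked matches.
const results = fuse.search("quartrly");
console.log(results.map((r) => r.item.title));
```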
This is cool! Thanks for sharing. Sounds like Fuse would be a great solution for a read-only index. But what if you want to collaborate on the data with other people?
Sorry for perhaps reducing what SQLSync does to an overly simplistic description, but is it effectively applying the CRDT pattern on top of the client's (SQLSync/SQLite) and server's (some OLTP) individual operation logs?
Good question. I see you edited your post, but I'll be clear - SQLSync doesn't use CRDTs in its current sync architecture.
Basically each client maintains their own log of mutations. These mutations are applied optimistically to the local instance of SQLite, and in parallel we replicate the log to the server.
On the server side, it reads from all of the individual client logs in a deterministic order and applies the mutations to its own SQLite database. Under the hood, SQLSync hijacks all of the writes to storage and organises them into a format that's easy to replicate.
Finally, the storage log from the server is replicated back down to each of the clients. This leaves the clients in a weird position, as they have two versions of the database which have probably diverged. So to finish this up, the clients throw away their local changes and reset to the server state (i.e. git reset --hard). Then, since there may be some mutations that were executed on the client but not yet run on the server, the client simply re-runs those mutations (i.e. git rebase).
In this way the system continuously keeps itself in sync with the server and other clients.
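Here's that loop sketched in TypeScript-flavoured pseudocode; all of these names are illustrative, not SQLSync's actual API:

```typescript
// Illustrative types only; not SQLSync's actual API.
type Mutation = { id: number; payload: unknown };

interface LocalDb {
  pendingMutations(): Mutation[];   // not yet sent to the server
  unackedMutations(): Mutation[];   // sent, but not yet in the server log
  resetTo(serverPages: Uint8Array): void; // "git reset --hard"
  apply(m: Mutation): void;         // re-run the reducer ("git rebase")
}

interface ServerConnection {
  pushMutations(ms: Mutation[]): Promise<void>;
  pullPages(): Promise<Uint8Array>; // the replicated storage log
}

async function syncOnce(local: LocalDb, server: ServerConnection): Promise<void> {
  // 1. Replicate our mutation log up to the server.
  await server.pushMutations(local.pendingMutations());

  // 2. Pull the server's state and throw away any local divergence.
  local.resetTo(await server.pullPages());

  // 3. Replay mutations the server hasn't applied yet; once they appear
  //    in the server log, they stop being replayed.
  for (const m of local.unackedMutations()) {
    local.apply(m);
  }
}
```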
Conflicts are handled by logic in the reducer which is able to inspect the state of the database to figure out what to do. Yes, this does require writing the reducer code carefully - however in testing I've found this not to be too bad because:
1. A lot of SQL operations automatically converge pretty nicely (and you probably already have to think about them converging in your existing REST APIs or whatever backend API you are writing). Think about patterns like `insert...on conflict do...`, for example (see the sketch after this list).
2. Since the reducer is isolated and comes with a type description of all possible mutations, it's very easy to unit test the reducer with different orderings. Basically think of it as deterministic simulation testing for your API. Something that's pretty hard to do with normal backend architectures without a lot of mocking and architecture.
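To make point 1 concrete, here's a minimal sketch of the upsert pattern (the `db.exec` interface and `tasks` schema are invented for illustration):

```typescript
// Invented interface and schema; the SQL itself is standard SQLite upsert syntax.
interface Database { exec(sql: string, params?: unknown[]): Promise<void>; }

async function upsertTask(db: Database, id: string, title: string, done: number) {
  await db.exec(
    `INSERT INTO tasks (id, title, done)
     VALUES (?, ?, ?)
     ON CONFLICT (id) DO UPDATE SET
       title = excluded.title,
       done  = excluded.done`,
    [id, title, done]
  );
}
```

Because the statement applies cleanly whether or not the row already exists, replaying it in the server's deterministic order leaves every client with the same result.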
That's the dream! Currently SQLSync wraps SQLite in a fairly heavyweight way, as it needs to intercept both the storage tier (to replicate page changes) and the mutation layer (to provide the reducer API). I'm interested in making it lighter weight - perhaps a regular SQLite extension you could install into any instance of SQLite.
As for the web integration, SQLSync works with any framework but currently only ships with a React library. Most of its logic is framework agnostic though.
SQLSync also provides a query subscription layer that is table-level reactive. What this means is that the client API can subscribe to a query which will automatically re-run when any table dependencies change. I'm exploring more granular reactivity; however, for many datasets, re-running on table change is sufficient when coupled with OLTP query patterns and small-to-medium sized data.
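In hypothetical client code (API names invented; the shipped React library looks different), a table-level reactive subscription might be used roughly like this:

```typescript
// Hypothetical subscription API for illustration only.
interface QueryClient {
  subscribe<Row>(sql: string, params: unknown[], cb: (rows: Row[]) => void): () => void;
}

function watchOpenTasks(client: QueryClient): () => void {
  // The callback re-runs whenever any table the query depends on (here,
  // `tasks`) changes -- table-level, not row-level, reactivity.
  return client.subscribe<{ id: string; title: string }>(
    "SELECT id, title FROM tasks WHERE done = 0 ORDER BY id",
    [],
    (rows) => console.log("open tasks:", rows)
  );
}
```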
So frontend and backend don't really mean a whole lot. You build apps to do some work with a specific deployment in mind and use the tools available to make it performant?
Having embedded SQLite in a programming tool in the browser myself, I found this very interesting. I am always happy when these ideas become more mainstream!
I don't know what program the author used, but I'd be surprised if he did them by hand. Excalidraw (https://excalidraw.com/) will give a very similar feel.
Coming from the backend world, I've only done small frontends that don't get this complicated. Assuming there are legit reasons for a FE to have complex state, a relational database is probably one of the first things you need. It's almost an automatic decision on a backend to have one, same should apply here. Using SQLite instead of Redux sounds reasonable.
> Using SQLite instead of Redux sounds reasonable.
The trick is reactivity. `redux` handles surgically updating the view based on state changes. It's not enough to have SQLite; you also need reactivity. Further, applying strict types on top of SQLite/SQL is another big challenge.
That's a great way to accomplish local storage, but requires a bit of gymnastics to build sync. By controlling the database entirely, SQLSync can provide very ergonomic sync to the developer and performance for the user.
So it's not that one is better than the other. Just the capabilities, performance, and testability differ.
Validation can be handled in the reducer layer. Since the reducer logic re-runs on the server, it can do anything including reaching out to server-side resources like authorization or other dbs to ensure the client isn't doing anything fishy. It's also possible to use the full capabilities of SQLite to execute automatic in-db validation via triggers, constraints, checks, etc.
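For instance, in-db validation might look something like this (invented schema; these are plain SQLite features, nothing SQLSync-specific):

```typescript
// A CHECK constraint and a trigger enforce invariants no matter which
// client's mutation is being applied or replayed.
const schema = `
  CREATE TABLE IF NOT EXISTS tasks (
    id    TEXT PRIMARY KEY,
    title TEXT NOT NULL CHECK (length(title) BETWEEN 1 AND 200),
    done  INTEGER NOT NULL DEFAULT 0 CHECK (done IN (0, 1))
  );

  CREATE TRIGGER IF NOT EXISTS no_reopen
  BEFORE UPDATE OF done ON tasks
  WHEN OLD.done = 1 AND NEW.done = 0
  BEGIN
    SELECT RAISE(ABORT, 'completed tasks cannot be reopened');
  END;
`;
```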
Access control is more difficult. Currently SQLSync is full db sync, so a user either has read or read+write access to an entire db. Validation can stop certain users from writing to portions of the db that they don't have permissions to - but the only way to restrict reads is to split the data across multiple dbs. For some apps this is not ok - and thus I'm researching different ways to achieve partial replication.
Currently got it working across Safari/FF/Chrome + iOS Safari + a couple android devices I was able to test. Requires fairly recent versions everywhere though. You can test out your own device using this demo: https://sqlsync-todo.pages.dev/
Please file an issue if you find it fails somewhere :)
Give someone state and they'll have a bug one day, but teach them how to represent state in two separate locations that have to be kept in sync and they'll have bugs for a lifetime -ryg
Corollary: if you don't represent state in more than one place, you'll eventually run into loss of availability (accessibility), integrity, or existence of data.
Isn't this idea something like CouchDB? Then there is PouchDB, which is a browser implementation using local storage.
So nothing new, but it's not bad not to be first. Maybe it is bad not knowing prior work and writing up your idea as if it's something no one ever thought of before ;)