> Graft clients commit locally and then asynchronously attempt to commit remotely. Because Graft enforces Strict Serializability globally, when two clients concurrently commit based on the same snapshot, one commit will succeed and the other will fail.
OK, but, the API provides only a single commit operation:
> commit(VolumeId, ClientId, Snapshot LSN, page_count, segments) Commit changes to a Volume if it is safe to do so. The provided Snapshot LSN is the snapshot the commit was based on. Returns the newly committed Snapshot on success.
So if a client commits something and it succeeds (presumably locally), how should that client discover that the "async" propagation of that commit has failed, and that everything it's done on top of that successful local commit needs to be rolled back?
This model is kind of conflating multiple, very different notions of "commit" with each other. Usually "commit" means the committed transaction/state/whatever is guaranteed to be valid. But here it seems like a "local commit" can be invalidated at some arbitrary point in the future, and is something totally different from an "async-validated commit"?
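A minimal sketch (hypothetical names, not Graft's actual client API) of the two notions of "commit" being conflated here: an optimistic local commit that succeeds immediately and is visible to local reads, and an asynchronous remote commit that validates the snapshot LSN and, on rejection, invalidates the losing commit plus everything stacked on top of it.

```python
# Hypothetical sketch, not Graft's real API: a client commits locally right away,
# then a background push validates each pending commit against the server. A
# rejection invalidates the rejected commit and everything built on top of it.

class FakeServer:
    def __init__(self):
        self.lsn = 0          # latest committed snapshot LSN
        self.pages = {}

    def try_commit(self, based_on_lsn, pages):
        if based_on_lsn != self.lsn:       # another client won the race
            return False, self.lsn
        self.pages.update(pages)
        self.lsn += 1
        return True, self.lsn


class Volume:
    def __init__(self):
        self.remote_lsn = 0   # last server-validated snapshot we've seen
        self.pending = []     # locally committed, not yet validated remotely

    def commit_local(self, pages: dict):
        """'Commit' #1: optimistic, succeeds immediately, visible to local reads."""
        self.pending.append(pages)

    def push(self, server: FakeServer):
        """'Commit' #2: asynchronous remote validation of the pending chain."""
        while self.pending:
            ok, lsn = server.try_commit(self.remote_lsn, self.pending[0])
            if not ok:
                # The snapshot we built on is stale. Every pending local commit
                # is now suspect; hand them back to the app to re-fetch the
                # winning snapshot and replay or merge them.
                rejected, self.pending = self.pending, []
                self.remote_lsn = lsn
                self.on_rejected(rejected)
                return
            self.remote_lsn = lsn
            self.pending.pop(0)

    def on_rejected(self, commits):
        print(f"rebase needed: {len(commits)} local commit(s) rolled back")
```

In this sketch, "how does the client discover the failure" reduces to subscribing to whatever plays the role of on_rejected; what the real client exposes for that is exactly the open question above.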
Hey friends!
Author of Graft here. Just want to say, huge thanks for all the great comments, stars, and support. Feels really nice to finally be building in public again.
I'm going to force myself to sign off for the evening. Will be around first thing to answer any other questions that come up! I just arrived in Washington, DC to attend Antithesis BugBash[1] and if I don't get ahead of the jet lag I'm going to regret it.
If anyone happens to be around Washington this week (perhaps at the conference) and wants to meet up, please let me know! You can email me at hello [at] orbitinghail [dotdev].
[1]: https://bugbash.antithesis.com/
So, if I understand correctly, the consistency model is essentially git: you have a local copy, make changes to it, and then when it's time to "push" you can get a conflict, at which point you can "rebase" or "merge".
The problem here is that there is no way to cleanly detect a conflict. The documentation talks about pages that have changed, but a page changing isn't a good indicator of conflict. A conflict can happen due to a read conflict. E.g.
Update Customer Id: "UPDATE Customers SET id='bar' WHERE id='foo'; UPDATE Orders SET customerId='bar' WHERE customerId='foo'"
Add Customer Purchase: "SELECT id FROM Customers WHERE email='blah'; INSERT INTO Orders(customerId, ...) VALUES('foo', ...);"
If the update task gets committed first and the pages for the Orders table are full (i.e. inserting causes a new page to be allocated), these two operations don't have any page conflicts, but the result is incorrect.
In order to fix this, you would need to track the pages read during the transaction in which the write occurred, but that could easily end up being the whole table if the update column isn't part of an index (and thus requires a table scan).
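A rough way to see it, with made-up page numbers for the two transactions above: the write sets never overlap, so page-level change detection reports no conflict, but the purchase transaction read a Customers page that the rename rewrote.

```python
# Made-up page numbers for the two transactions above; only the set overlaps matter.

rename_customer = {
    "reads":  {"customers_p1", "orders_p7"},   # rows scanned by both UPDATEs
    "writes": {"customers_p1", "orders_p7"},   # rows rewritten in place
}
add_purchase = {
    "reads":  {"customers_p1"},                # SELECT id ... WHERE email='blah'
    "writes": {"orders_p8"},                   # INSERT landed on a freshly allocated page
}

def page_conflict(a, b):
    # what "did any of the same pages change?" can see
    return bool(a["writes"] & b["writes"])

def read_write_conflict(first, second):
    # what a serializability check needs: did the later txn read something
    # the earlier txn overwrote?
    return bool(first["writes"] & second["reads"])

print(page_conflict(rename_customer, add_purchase))        # False: looks mergeable
print(read_write_conflict(rename_customer, add_purchase))  # True: stale read of 'foo'
```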
> After a client pulls a graft, it knows exactly what’s changed. It can use that information to determine precisely which pages are still valid and which pages need to be fetched
I've always found these general solutions kind of confusing. All uses of distributed data are inherently flawed and have specific limitations, so you have to pick a solution that most closely matches your specific use case.
I like the idea behind graft, but it immediately runs into some complications. Like the fact that, as an edge device with unreliable networking, the client may not have connectivity to fetch the pages it needs when it needs them. If anything, what the client really needs is to fetch all the data whenever it can, so that when it does need to access it, it isn't waiting.
But if you have a lot of data, that could take forever! That's why the application needs to be more intelligent about what data it's requesting, and do things like create priorities and classes for its data so it can fetch what it needs the most first.
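One way to express that "priorities and classes" idea (hypothetical data classes, nothing Graft-specific): keep a priority queue of things to prefetch and drain it opportunistically whenever the network is available.

```python
import heapq

# Hypothetical data classes and priorities; drain the queue opportunistically.
PRIORITY = {"settings": 0, "open_project": 1, "recent_items": 2, "archive": 3}

def build_prefetch_queue(wanted):
    """wanted: iterable of (data_class, key) pairs the app expects to need."""
    heap = [(PRIORITY.get(cls, 99), cls, key) for cls, key in wanted]
    heapq.heapify(heap)
    return heap

def drain(heap, fetch_one, is_online):
    """Fetch highest-priority items first; stop cleanly when the network drops."""
    while heap and is_online():
        prio, cls, key = heapq.heappop(heap)
        try:
            fetch_one(cls, key)
        except ConnectionError:
            heapq.heappush(heap, (prio, cls, key))   # retry later, same priority
            break
```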
The fact that this is built for SQLite kind of reinforces the point. Distributed relational databases are usually a bad idea, because usually people want them to be ACID-like, but you can't enforce that in a distributed way. That's why things like CRDTs exist; you treat every "copy" as an independent thing, breaking the false narrative that you can really distribute the same thing in multiple places reliably. (And that shows you why relational databases just aren't good for replicating; to replicate an entire relational database with CRDTs, you need one CRDT per cell, which multiplies your data)
This is a really interesting project, and a great read. I learned a lot. I'm falling down the rabbit hole pretty hard reading about the "Leap" algorithm (https://www.usenix.org/system/files/atc20-maruf.pdf) it uses to predict remote memory prefetches.
It's easy to focus on libgraft's SQLite integration (comparing to turso, etc), but I appreciate that the author approached this as a more general and lower-level distributed storage problem. If it proves robust in practice, I could see this being used for a lot more than just sqlite.
At the same time, I think "low level general solutions" are often unhinged when they're not guided by concrete experience. The author's experience with sqlsync, and applying graft to sqlite on day one, feels like it gives them standing to take a stab at a general solution. I like the approach they came up with, particularly shifting responsibility for reconciliation to the application/client layer. Because reconciliation lives heavily in tradeoff space, it feels right to require the application to think closely about how they want to do it.
I didn't go into the implementation details so I won't comment on that, but I will say that this is a really important problem to solve. I've long wanted/needed an agnostic sync layer that apps could use to sync my changes across clients and servers, and it would be great if app developers didn't have to put any work into it: they'd just point their database to a URL and the database would take care of all the syncing.
This means that the app developer could support private syncing without any effort, and I wouldn't have to pay a subscription just for sync; I could just run a sync server on my NAS and all apps would know to sync their data with it. Of course, app developers don't want this because subscriptions are a reliable income stream, but hopefully users would push for it.
Looks impressive! Using the VFS is such a fun "hack" :)
We developed our own sync engine for an offline-first IDE for notes/tasks [1] we're building, where the data structure is a tree (or graph, actually) to support outlining operations. Conflict resolution is always the challenge, and especially with trees, multiple offline players can optimistically commit local changes that would result in an invalid tree state once globally merged.
The way we handle this is by rolling back tree inserts when a client comes online and receives other tree insert deltas. From what I understand from the description of SQLSync, the best way to handle this would be to pull in the latest snapshot and then replay. Pulling in a full snapshot sounds rather expensive though. We have some sort of heuristic where we can do this if the number of incoming deltas would be very large, but for most offline syncing we never need it. Just curious how SQLSync defines these snapshots? Sticking with the note-taking example, in our case we can't really have a snapshot of a single "note" because of graph features like transclusions. Does SQLSync have some clever way to avoid having to send all state in order to "reset and replay"?
[1] https://thymer.com
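For what it's worth, a generic sketch of the reset-and-replay pattern (my own naming, not SQLSync's actual implementation): keep a log of unacknowledged local mutations, and when a new server snapshot arrives, reset local state to it and re-apply the pending log instead of merging trees in place.

```python
import copy

# Generic reset-and-replay sketch (not SQLSync's actual implementation): keep a
# log of unacknowledged local mutations; on a new server snapshot, reset local
# state to the snapshot and re-apply the log, instead of merging trees in place.

class Replica:
    def __init__(self, snapshot):
        self.state = copy.deepcopy(snapshot)   # local optimistic state
        self.unacked = []                      # mutations the server hasn't accepted yet

    def apply_local(self, mutation):
        mutation(self.state)
        self.unacked.append(mutation)

    def on_server_snapshot(self, snapshot, acked_count):
        # server accepted `acked_count` of our mutations and sent its new state
        self.unacked = self.unacked[acked_count:]
        self.state = copy.deepcopy(snapshot)
        for mutation in self.unacked:          # replay what's still pending
            mutation(self.state)

r = Replica({"notes": {}})
r.apply_local(lambda s: s["notes"].update({"n1": "local draft"}))
r.on_server_snapshot({"notes": {"n2": "from another client"}}, acked_count=0)
print(r.state)  # {'notes': {'n2': 'from another client', 'n1': 'local draft'}}
```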
This approach has a problem when the mobile client is on the end of a very slow connection, yet the diff that needs to be synced is gigabytes, perhaps because it's an initial sync or the database got bloated for whatever reason.
A hybrid approach is to detect slow syncing (for example when sync hasn't completed after 5 seconds), and instead send queries directly to the server because there is a good chance the task the user wants to complete doesn't depend on the bloated records.
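A sketch of that hybrid (placeholder query functions, assumed asyncio plumbing): give the background sync a few seconds, then route the query to the server directly so the user isn't blocked on a huge initial sync.

```python
import asyncio

# Placeholder query functions; the only real logic is the deadline + fallback.
async def run_query(sql, params, sync_done: asyncio.Event,
                    query_local, query_server, deadline=5.0):
    try:
        # give the background sync `deadline` seconds to finish
        await asyncio.wait_for(sync_done.wait(), timeout=deadline)
        return await query_local(sql, params)    # replica is ready; stay local
    except asyncio.TimeoutError:
        return await query_server(sql, params)   # still syncing; ask the server directly
```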
Maybe it's just me, since people here in the comments apparently understand what this is, but even after skimming the comments, I don't. Some simplified API example would be useful either in the "marketing post" or (actually, and) in the github readme.
I mean, it's obviously about syncing stuff (despite the title), ok. It "simplifies the development", "shares data smoothly" and all the other nice things that everything else does (or claims to do). And I can use it to implement everything where replication of data might be useful (so, everything). Cool, but... sorry, what does it, exactly?
The biggest problem with syncing is, obviously, conflict resolution. Graft "doesn’t care about what’s inside those pages", so, obviously, it cannot solve conflicts. So if I'm using it in a note-taking app, as suggested, every unsynced change to a plain text file will result in a conflict. So, I suppose, it isn't what it's for at all, it's just a mechanism to handle replication between 2 SQLite files, when there are no conflicts between statements (so, what MySQL or Postgres do out of the box). Right? So, it will replace the standard SQLite driver in my app code to route all requests via some Graft-DB that will send my statements to external Graft instance as well as to my SQLite storage? Or what?
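For what it's worth, the thread above suggests the integration works at the VFS/page layer rather than by routing statements anywhere. A hedged sketch of what that might look like from application code; the extension name and how the Graft-backed database is opened are assumptions, not documented API:

```python
import sqlite3

# Hedged sketch: the extension file name and how the Graft-backed database is
# then opened are assumptions, not Graft's documented API. The point is the
# shape of the integration: the app keeps talking plain SQL to SQLite, the
# extension's VFS replicates pages underneath, and no SQL is sent to a "Graft
# instance".

bootstrap = sqlite3.connect(":memory:")
bootstrap.enable_load_extension(True)
bootstrap.load_extension("./libgraft")   # hypothetical path to the extension

# A real setup would now open the database through the Graft-provided VFS
# (exact mechanism depends on the extension); afterwards it's ordinary SQLite:
conn = sqlite3.connect("app.db")
conn.execute("CREATE TABLE IF NOT EXISTS notes (id TEXT PRIMARY KEY, body TEXT)")
conn.execute("INSERT OR REPLACE INTO notes VALUES ('n1', 'hello')")
conn.commit()
```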
Very interesting! I've been hacking on a somewhat related idea. I'm prototyping a sync system based on pglite and the concept of replicating "intentions" rather than data. By that I mean replicating actions -- a tag and a set of arguments to a business logic function along with a hybrid logical clock and a set of forward & reverse patches describing data modified by the action.
As long as actions are immutable and any non-deterministic inputs are captured in the arguments, they can be (re)executed in total clock order from a known common state in the client database to arrive at a consistent state regardless of when clients sync. The benefit of this, I realized, is that it works perfectly with authentication/authorization using postgres row level security. It's also efficient, letting clients sync the minimal amount of information and handle conflicts while still having full server authority over what clients can write.
There's a lot more detail involved in actually making it work. Triggers to capture row level patches and reverse patches in a transaction while executing an action. Client local rollback mechanism to resolve conflicts by rolling back local db state and replaying actions in total causal order. State patch actions that reconcile the differences between expected and actual outcomes of replaying actions (for example due to private data and conditionals). And so on.
The big benefit of this technique is that it isn't just merging data; it's actually executing business logic to move state forward. That means it captures user intentions where a system based purely on merging data cannot. Traditional CRDTs that merge data will end up at a consistent state but can provide zero guarantees about the semantic validity of that state to the end user. By replaying business logic functions I'm seeking to guarantee that the state is not only consistent but maximally preserves the intentions of the user when reconciling interleaved writes.
This is still a WIP and I don't have anything useful to share yet but I think the core of the idea is sound. Exciting to see so much innovation in the space of data sync! It's a tough problem and no solution (yet) handles the use cases of many different types of apps.
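A stripped-down sketch of the action-replay idea (my own naming; HLC generation, patches, triggers, and row level security all omitted): actions are immutable records totally ordered by an HLC tuple, and replaying the business-logic functions in that order is what preserves intent, e.g. an order added concurrently with a customer rename still ends up attached to the renamed customer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Rough sketch of the "replicate intentions" idea with made-up names; HLC
# generation, forward/reverse patches, triggers, and row level security are all
# left out. Actions are immutable, totally ordered by an HLC tuple, and replay
# re-executes business logic rather than merging data.

@dataclass(order=True)
class Action:
    hlc: tuple                         # (physical_ms, counter, client_id)
    name: str = field(compare=False)
    args: dict = field(compare=False)

REGISTRY: Dict[str, Callable] = {}     # business-logic functions by action name

def action(name):
    def register(fn):
        REGISTRY[name] = fn
        return fn
    return register

@action("rename_customer")
def rename_customer(state, old_id, new_id):
    state["aliases"][old_id] = new_id
    for order in state["orders"]:
        if order["customer"] == old_id:
            order["customer"] = new_id

@action("add_order")
def add_order(state, customer_id, item):
    cid = state["aliases"].get(customer_id, customer_id)   # intention survives a rename
    state["orders"].append({"customer": cid, "item": item})

def replay(base, actions):
    state = {"aliases": dict(base["aliases"]),
             "orders": [dict(o) for o in base["orders"]]}
    for a in sorted(actions):          # total clock order
        REGISTRY[a.name](state, **a.args)
    return state

log = [
    Action(hlc=(1000, 0, "client-b"), name="rename_customer",
           args={"old_id": "foo", "new_id": "bar"}),
    Action(hlc=(999, 0, "client-a"), name="add_order",
           args={"customer_id": "foo", "item": "widget"}),
]
print(replay({"aliases": {}, "orders": []}, log))
# the order ends up attached to 'bar' regardless of which client synced first
```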
This is a very interesting approach. Using pages as the basic sync unit seems to simplify a lot. It also makes the sync of arbitrary bytes possible. But it does seem that if your sync is this coarse-grained that there would be lots of conflicting writes in applications with a lot of concurrent users (even if they are updating semantically unrelated data). Seems like OT or CRDT would be better in such a use-case. I'd be interested to see some real-world benchmarks to see how contention scales with the number of users.
I looked at using turso embedded replicas for a realtime collaboration project and one downside was that each sync operation was fairly expensive. The minimum payload size is 4KB IIRC because it needs to sync the sqlite frame. Then they charge based on the number of sync operations so it wasn't a good fit for this particular use case.
I'm curious if the graft solution helps with this. The idea of just being able to ship a sqlite db to a mobile client that you can also mutate from a server is really powerful. I ended up basically building my own syncing engine to sync changes between clients and servers.
> [rqlite and dqlite] are focused on increasing SQLite’s durability and availability through consensus and traditional replication. They are designed to scale across a set of stateful nodes that maintain connectivity to one another.
Little nitpick there: consensus anti-scales. You add more nodes and it gets slower. The rest of the section on rqlite and dqlite makes sense though, just not about "scale".
Neither the post nor the comments here mention end-to-end encryption. Obsidian Sync offers multi-user sync with e2e encryption. An open source general tool solving the same problem would be great.
Man this looks super awesome. I will be extremely excited to ditch CouchDB for this (even tho I'm an Erlang fan). I'll certainly be keeping an eye on the project and I'll pitch in where I can!
How are permissions supposed to work? Suppose a page has data that I need to see and also has data I can't see. Does this mean I need to denormalize my entire data model?
How does this compare with Turso? I know it's mentioned in the article (mainly better support for partial replication and arbitrary schemas), but is there also a deeper architectural departure between the two projects?
Interesting approach. I've built my own thing and then rebuilt it last year for our use case. I'm using kotlin-js instead of regular js/typescript. It makes some of this stuff a bit easier to deal with.
Our system is map based; so we are dealing with a lot of map content that is updating often (e.g. location tracking).
v0 of our system was a failed attempt at using mongo realm before I joined. One of my first projects as the CTO of this company was shaking my head at that attempt and unceremoniously deleting it. It was moving GB of data around for no good reason (we only had a few hundred records at that point), was super flaky/buggy at that point, and I never liked mongo to begin with and this was just a mess that was never going to work. We actually triggered a few crash bugs in mongo cloud that caused data loss at some point. Probably because we were doing it wrong (somehow) but it made it clear to me that this was just wrong at many levels. The key problem of realm was that it was a product aimed at junior mobile developers with zero clue about databases. Not a great basis to start engineering a scalable, multi user system that needs to store a world full of data (literally, because geospatial).
We transitioned to an elasticsearch-based system to query for objects to show on a map. Doing that all the time gets expensive, so we quickly started thinking about caching objects locally. v1 of that system served us for about two years and was based on a wasm build of SQLite together with some experimental sqldelight (a kotlin multiplatform framework). This worked surprisingly well given the alpha state of the ecosystem and libraries. But there are some unrelated gotchas when you want to package things up as a PWA, which requires being a bit strict on the security model in the browser, plus conflicting requirements for OPFS (one of the options for local storage). Particularly Safari/iOS is a bit picky on this front. We got it working but it wasn't nice.
At some point I decided to try IndexedDB and just get rid of a lot of complexity. IndexedDB is an absolutely horrible Javascript API piece of sh*. But with some kotlin coroutine wrappers, I got it to do what I wanted, and unlike OPFS it just works pretty much in all browsers. Also it has a similarly relaxed storage quota, so you should be able to cram tens/hundreds of MB of data in there without issues (any more might work but is probably not a great idea for sync performance reasons). Its querying is much more limited. But it works for our mostly simple access pattern of getting and storing stuff by id only and maybe doing some things with timestamps, keyword columns, etc.
This is part of another project I'm working on (to be OSS under the MIT license at some point) that I parked half a year ago. I built that first and then decided to lift the implementation and use it on my map product (closed source). It has some limitations. Transactional callback hell is a thing that I need to tackle at some point. Mostly you use it like a glorified Map<String, T> where T is anything that you can convert to/from json via kotlinx serialization.
We're currently working on adding geospatial filtering so we can prioritize the areas the user is using and delete areas they are not using. We have millions of things worldwide (too much to fetch) but typical usage focuses on a handful of local areas. So we don't need to fetch everything all the time and can get away with only fetching a few tens/hundreds of things. But we do want caching, personalization, and real time updates from others to be reflected. And possibly offline support later. So, the requirements around this are complicated.
We're actually deprioritizing local storage because after putting our API on a diet we don't actually fetch that much data without caching. A few KB on map reposition, typically; the map tiles are larger.
Offline is something that generates a lot of interest from customers but that's mostly because mobile networks suck in Germany.
My ideal version of this is simple: just define the queries you want (no matter how complex) and then you'll get exactly the data you need to fulfill those queries, no more, no less. And the cherry on top would be to have your queries update automatically with changes, both local and remote, in close to real-time.
That's basically what we're doing with Triplit (https://triplit.dev), albeit not with SQL--which is a plus for most developers.
https://github.com/orbitinghail/graft/blob/main/docs/design....
Curious how this compares to Cloud-Backed SQLite’s manifest: https://sqlite.org/cloudsqlite/doc/trunk/www/index.wiki
It’s similar to your design (sending changed pages), but doesn’t need any compute on the server, which I think is a huge win.
https://tinybase.org/
https://www.evolu.dev/
https://replicache.dev/
https://fireproof.storage/
https://vlcn.io/
https://www.instantdb.com/
https://loro.dev/
https://electric-sql.com/
https://docs.y-sweet.dev/
https://syncedstore.org/docs/
https://collabs.readthedocs.io/en/latest/
https://remotestorage.io/
https://rxdb.info/offline-first.html
https://github.com/siriusastrebe/jsynchronous
https://www.powersync.com/
https://pouchdb.com/
https://jazz.tools/
https://www.triplit.dev/
https://automerge.org/
https://www.dxos.org/
I hope one day to try them all :-) Or read a summary from someone who does.
A lot of the questions here are requesting comparisons to existing SQLite replication systems; the article actually has a great section on this topic at the bottom: https://sqlsync.dev/posts/stop-syncing-everything/#compariso...
I'm thinking to give it a try in one of my React Native apps that face very uncertain connectivity.
The talk contains a lot more details on how the transactional and replication layers of Graft work.
> Licensed under either of
> Apache License, Version 2.0 (LICENSE-APACHE or https://www.apache.org/licenses/LICENSE-2.0)
> MIT license (LICENSE-MIT or https://opensource.org/licenses/MIT)
> at your option.
I see this now and then, but it makes me wonder: why would I pick Apache over MIT in this case? Or is this software actually Apache licensed, but the developer is giving you the green light to use it under the terms of the MIT license? But at that point I don't get why not just license it all under MIT to begin with...
Beta ETA?
Looks really good, great work!
If somebody is interested, I put a gist here with the kotlin file that does all the interaction with indexed db: https://gist.github.com/jillesvangurp/c6923ac3c6f17fa36dd023...