Hello!

So, a lot has happened since the last post, but there’s still not too much that’s working. Here’s the update:

I looked into existing file-sharing solutions (primarily IPFS and Holepunch), but in the end I decided to build my own instead. There are three main reasons:

  1. It looks to me like they do a bunch of work that we don’t need. They establish stateful connections, they authenticate signatures (which we then won’t rely on, because we’re doing content-addressing and can verify each chunk against its own hash; there’s a small sketch of that after this list), and they collaborate to maintain the map of the nodes instead of relying on a central server’s authoritative map. So some of their work we don’t need, some of it we’d be partially duplicating in a different form at a higher layer on top of them, and the whole solution could wind up feeling awkward because of it.
  2. Correspondingly, it seems unlikely to me that their latency can be brought down as low as we could get it by rolling a custom transport layer. Web pages make a lot of small requests, and the page’s performance is critically dependent on the latency of some of them. In a perfect world, the web app would be adapted to request what it needs in big chunks and to tolerate a certain amount of latency before each chunk comes back, but real-world web apps are rarely perfect in that regard.
  3. Honestly a certain amount of it is pure hubris on my part.
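
To make that content-addressing point concrete, here’s a minimal sketch of the kind of verification I mean: the client requests a chunk by its hash, and when the bytes arrive it just re-hashes them and compares. The hash function and the chunk layout here are placeholders for the sake of the example, not anything grits has committed to.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// verifyChunk checks fetched bytes against the content address they were
// requested under. With content-addressing, integrity falls out of the
// lookup key itself, so there's no need to check a signature from the peer.
func verifyChunk(addr [sha256.Size]byte, data []byte) bool {
	sum := sha256.Sum256(data)
	return bytes.Equal(sum[:], addr[:])
}

func main() {
	data := []byte("example chunk contents")
	addr := sha256.Sum256(data) // the address a client would ask for
	fmt.Println("chunk matches its address:", verifyChunk(addr, data))
}
```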

Obviously, a reimplemented-from-scratch transport layer adds quite a bit of time and risk to the whole project, but I honestly think that if it can get done in a robust form, it’ll make the final product quite a lot better.

Here’s why. My mental model of what happens when one of the proxies needs a piece of data on behalf of the client goes like this (there’s a rough code sketch after the list):

  • Each node has a fairly up-to-date full map of the network, kept as a DHT, and it looks up the N nodes closest to the data it’s looking for (probably N=5 or 10 or so)
  • It sends a spray of UDP packets to those nodes, requesting the data
  • At t=50-100ms, it starts getting back the first responses from the closer nodes, each with a list of nodes where the data can be found. It then sends another spray of requests, for different non-overlapping chunks of the data, to N of the nodes holding it
  • At t=100-200ms, it starts getting back the first responses with data included, from the faster of the nodes it requested from. As data comes in, it can adjust its strategy for requesting what it needs based on which nodes are performing well at getting data to it.
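
To make the first couple of steps concrete, here’s a rough Go sketch of how I picture the lookup and the initial spray working. Everything in it — the `Node` struct, the bare-content-address packet format, the function names — is hypothetical; it shows the shape of the thing, not the actual wire protocol.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
	"net"
	"sort"
	"time"
)

// Node is one entry in the locally held map of the network.
type Node struct {
	ID   [sha256.Size]byte
	Addr *net.UDPAddr
}

// matchingPrefixBits counts how many leading bits a node ID shares with a
// content address; more shared bits means "closer" in a Kademlia-style DHT.
func matchingPrefixBits(a, b [sha256.Size]byte) int {
	for i := range a {
		if d := a[i] ^ b[i]; d != 0 {
			return i*8 + bits.LeadingZeros8(d)
		}
	}
	return len(a) * 8
}

// closestNodes picks the n nodes from the local map whose IDs are closest
// to the content address -- the first step in the list above.
func closestNodes(all []Node, addr [sha256.Size]byte, n int) []Node {
	sorted := append([]Node(nil), all...)
	sort.Slice(sorted, func(i, j int) bool {
		return matchingPrefixBits(sorted[i].ID, addr) > matchingPrefixBits(sorted[j].ID, addr)
	})
	if n > len(sorted) {
		n = len(sorted)
	}
	return sorted[:n]
}

// sprayRequests fires one small UDP packet (here just the raw content
// address) at each chosen node, then listens briefly for whatever comes back
// first. A real client would parse each reply as either a referral (a list
// of nodes holding the data) or a chunk of the data, and adjust its next spray.
func sprayRequests(targets []Node, addr [sha256.Size]byte, wait time.Duration) {
	conn, err := net.ListenUDP("udp", nil) // ephemeral local port
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	defer conn.Close()

	for _, node := range targets {
		if _, err := conn.WriteToUDP(addr[:], node.Addr); err != nil {
			fmt.Println("send to", node.Addr, "failed:", err)
		}
	}

	_ = conn.SetReadDeadline(time.Now().Add(wait))
	buf := make([]byte, 64*1024)
	for {
		n, from, err := conn.ReadFromUDP(buf)
		if err != nil {
			return // deadline hit; the caller would decide whether to re-spray
		}
		fmt.Printf("got %d bytes back from %v\n", n, from)
	}
}

func main() {
	// Fabricated local map; real IDs and addresses would come from the
	// central server's authoritative map.
	nodes := []Node{
		{ID: sha256.Sum256([]byte("node-a")), Addr: &net.UDPAddr{IP: net.ParseIP("127.0.0.1"), Port: 40001}},
		{ID: sha256.Sum256([]byte("node-b")), Addr: &net.UDPAddr{IP: net.ParseIP("127.0.0.1"), Port: 40002}},
		{ID: sha256.Sum256([]byte("node-c")), Addr: &net.UDPAddr{IP: net.ParseIP("127.0.0.1"), Port: 40003}},
	}
	contentAddr := sha256.Sum256([]byte("some chunk of content"))

	targets := closestNodes(nodes, contentAddr, 2)
	fmt.Println("spraying requests to", len(targets), "closest nodes")
	sprayRequests(targets, contentAddr, 100*time.Millisecond)
}
```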

Basically, the upshot is that because it’s talking in parallel with a bunch of different nodes, within a very short window of time it should be able to saturate its downstream pipe with the content it needs. If the ultimate result is that apps served by the grits network load faster than content served by even a fairly powerful central instance, that’ll be a big step in favor of its adoption.
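
As a tiny illustration of the “request from whoever is performing well” part, here’s one possible shape for the bookkeeping: track bytes delivered per node, and aim the next round of chunk requests at whichever peers have shown the best throughput so far. Again, the names and structure are made up for the sketch, not how grits actually does it.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// nodeStats tracks how much data a peer has delivered and how long it has
// had to deliver it, so the client can favor the peers that are keeping up.
type nodeStats struct {
	name      string
	bytesRecv int
	elapsed   time.Duration
}

func (s nodeStats) throughput() float64 {
	if s.elapsed <= 0 {
		return 0
	}
	return float64(s.bytesRecv) / s.elapsed.Seconds()
}

// pickFastest returns the n peers with the best observed throughput so far,
// which is where the next spray of chunk requests would go.
func pickFastest(stats []nodeStats, n int) []nodeStats {
	sorted := append([]nodeStats(nil), stats...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].throughput() > sorted[j].throughput()
	})
	if n > len(sorted) {
		n = len(sorted)
	}
	return sorted[:n]
}

func main() {
	stats := []nodeStats{
		{"node-a", 512 * 1024, 180 * time.Millisecond},
		{"node-b", 64 * 1024, 200 * time.Millisecond},
		{"node-c", 900 * 1024, 150 * time.Millisecond},
	}
	for _, s := range pickFastest(stats, 2) {
		fmt.Printf("%s: %.0f bytes/sec\n", s.name, s.throughput())
	}
}
```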

Of course the devil is in the details. This type of problem is famous for being fairly difficult in the real world, but I feel like all the problems are solvable. I’ve put the current state of the code up on GitHub; it attempts to achieve that 100-200ms latency I was talking about above. It’s not even at barely-working-prototype stage yet, but I wanted to post the current progress just so there wasn’t too long a silence. My guess is that within 1-2 weeks it should start to be ready for testing and some careful experimentation in an actual networked setup.

Comments? Questions? Feedback? I have a Lemmy instance set up; before too long I’ll want to put together a little content-serving network and start doing actual-network testing on that instance, but the caching software has to get done first, obviously. If you’re interested in working on any of the complementary pieces, or in testing it out on grits.dev once it’s ready, or if you have comments or criticisms about anything I’ve done so far, I’m 100% open to it.

Cheers!