Hacker News: CarlosBaquero's comments

Rateless Bloom Filters: Set Reconciliation for Divergent Replicas with Variable-Sized Elements


Exactly, that's the million-dollar question.


I do agree that in the end the most productive approach will be a mix: humans with AI support. My fear is that learning to program always with AI support will limit the quality of learning. Only time will tell.


What happened to peer-to-peer as a technological concept? Actually, we still use a lot of that technology.


Years ago, most computing devices were desktops. They often had a routable IP address, unlimited power, and would happily sit passing packets all day. This made things like a DHT practical, so you could find your other peers. It enabled things like the early days of Skype, where everything except auth was p2p: chat and file sharing went directly between peers. After being online for long enough and having a routable IP, you could become a supernode to help less fortunate nodes talk to each other.

These days a much larger fraction of computing devices are on battery, on expensive networks like cellular, and can't really tolerate being part of a DHT. Increasing use of NAT/masquerading makes it harder (and a support nightmare) to accept incoming packets from new peers.

One solution to this is to add a "superpeer" to a router distribution like OpenWrt, or sell a "plug/wallwart" device to help. That way a cheap (under $100) computer could build reputation with its peers, accept incoming packets from new peers, provide some storage, and keep up with DHT maintenance. Then low-power and/or expensive-network peers could just check their "home" superpeer and get what they need quickly with minimal bandwidth and power.
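The DHT maintenance mentioned above boils down to routing by ID distance. As a rough sketch (Kademlia-style XOR metric; the `node_id` derivation and sizes here are illustrative assumptions, not any particular network's scheme):

```python
import hashlib

def node_id(name: str) -> int:
    """Derive a 160-bit Kademlia-style ID from a name (illustrative only)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def closest(target: int, peers: list[int], k: int = 3) -> list[int]:
    """Return the k peers whose IDs are XOR-closest to the target.
    A superpeer with stable connectivity can afford to keep this
    routing state fresh on behalf of battery-powered peers."""
    return sorted(peers, key=lambda p: p ^ target)[:k]

peers = [node_id(f"peer{i}") for i in range(20)]
target = node_id("some-content-key")
print(closest(target, peers))
```

The XOR metric is what makes lookups converge in O(log n) hops: each hop can move to a peer whose ID shares a longer prefix with the target.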


NAT has been the problem since near the beginning of P2P, though.

> One solution to this is to add a "superpeer" to a router distribution like OpenWrt, or sell a "plug/wallwart" device to help. That way a cheap (under $100) computer could build reputation with its peers, accept incoming packets from new peers, provide some storage, and keep up with DHT maintenance.

...and do what, exactly? It doesn't have the CPU power to do much, and doesn't have the storage to serve anything.

There's also the same problem of "how exactly do I connect through NAT" for the home router itself. Some of them might have IPv6 directly, but most are still behind some carrier-grade NAT, just like the phones are.

But I do like the idea of evolving the router a bit. Home automation, for instance, should ideally just talk to an MQTT broker on the router. Then the user is free to install automation using it somewhere on the network, connect directly from a phone, install a container on the router running Home Assistant or something, or pay a cloud service to ingest the MQTT stream and provide a nice UI for it.
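The decoupling that makes the router-as-broker idea work is MQTT's topic matching: publishers and subscribers never know about each other. A toy in-process sketch of that routing logic (this mimics MQTT's `+`/`#` wildcard rules, not any real broker's API):

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """MQTT-style matching: '+' matches exactly one level, '#' the rest."""
    p, t = pattern.split("/"), topic.split("/")
    for i, seg in enumerate(p):
        if seg == "#":
            return True
        if i >= len(t) or (seg != "+" and seg != t[i]):
            return False
    return len(p) == len(t)

class Broker:
    """Toy in-process broker: fan messages out to matching subscribers."""
    def __init__(self):
        self.subs = []  # list of (pattern, callback)

    def subscribe(self, pattern, callback):
        self.subs.append((pattern, callback))

    def publish(self, topic, payload):
        for pattern, callback in self.subs:
            if topic_matches(pattern, topic):
                callback(topic, payload)

broker = Broker()
seen = []
broker.subscribe("home/+/temp", lambda t, p: seen.append((t, p)))
broker.publish("home/kitchen/temp", 21.5)   # delivered
broker.publish("home/kitchen/humidity", 40) # ignored
print(seen)
```

A real setup would just run mosquitto or similar on the router; the point is that sensors publish to topics and everything else (phone app, Home Assistant, cloud ingester) is an interchangeable subscriber.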


> ...and do what, exactly? It doesn't have the CPU power to do much, and doesn't have the storage to serve anything.

My new $140 router has 8 cores (4xA76 + 4xA55), 8GB RAM, 32GB eMMC storage, and 2x2.5GbE + GbE ports. It even has an SD slot for more storage. My thinking is more along the lines of what it can't do. The low-hanging fruit would be to replace maps.google.com (with p2p-shared OpenStreetMap or similar), drive.google.com/dropbox.com, and chat/blog/twitter/instagram/snapchat/facebook. If you need more storage, a 256GB SD card is $25 to $40. I believe the default storage for most Google accounts is 15GB.

With a healthy P2P ecosystem you could leverage your peers, things like FileCoin could let you supplement your storage from any provider, and not depend on any single provider.

Running SHA-256 on files, even Reed-Solomon coding, keeping track of your DHT peers, running IPFS or similar, even Mastodon (once implemented in Go or Rust), shouldn't make newish hardware work hard.

Being in the router avoids the NAT issue. If this kind of thing gets any traction, anything outside the router will need working IPv6 (like Comcast in the USA), an accommodation from the router with port forwarding, or one of the various NAT traversal protocols like ICE, TURN, or STUN.


I agree that a pool of equal peers is tough these days. I think the Fediverse has a pretty good approach, where most end users are on mobile but you can spin up a server whenever.

There’s still a big complexity/skill/cost jump from “I toot from my iPhone” to “I run a Mastodon instance for my company” though. Some of that can be addressed by managed hosting. It’s probably preferable to have a “super peer” though. In my mind, a superpeer runs the same software as a peer, but does more work because it can. It should be easier to maintain than a full server. I’m talking about the difference between:

A) manage a Mastodon node, with its own Redis, PostgreSQL, web server, object storage, etc.

And

B) run BitTorrent in the background on your gaming PC to seed the latest cut of the niche documentary you’re working on

There are a lot of interesting self-hosting projects happening, but they tend to focus on helping you run Kubernetes or a similar container orchestrator. That’s still way more complex than an executable.

I think we need things to get a bit more opinionated again…


Agreed. Supernodes should be near zero-admin, and be able to run wherever advantageous, not just where an expert is available. OwnCloud/Nextcloud are running ever more services and at least looking at Fediverse integration. Various NASes allow local applications for photos, Plex, etc. All targeted at being nearly turnkey and friendly to the average consumer.

Ideally things become easy enough that people can depend less on FAANG and do more with services that are distributed. Hopefully the software can get to the point where your server helps when it can, but if it's down, other servers help out. I just bought a $120 ($140 with case) router: 8 cores, 8GB RAM, 32GB storage, GigE + 2x2.5GbE. I'd love to dedicate it to messaging, filesharing, Mastodon, photo storage/viewing/sharing, etc., and even have it exchange services with others so that messages/photos/whatever can be shared even if it goes down. Hopefully it gets there; pretty amazing resources are available cheap these days. I'd happily trade two-thirds of my resources with other peers... if they did the same for me.


Smartphones took over as people's primary "computers" of choice. And mobile devices generally don't even get an IPv4 address with ports, as most are behind carrier NAT. So most people cannot participate on the internet anymore and require third parties to hold their metaphorical hand when doing network operations.

For people still using actual computers with real internet connections and ports, p2p is still as big, and as useful, as ever. It's just that the relative percentage of online users with actual internet connections has shrunk; the absolute number of people with real computers and connections has not.


Being behind a NAT poses constraints for p2p technologies: you need some well-known servers to do the hole punching and act as relays, though that's not too different from the well-known IPs needed for bootstrapping a regular p2p system anyway. (Of course, not every NAT is friendly to hole punching, and that's a problem as well.) But it also has a significant security and privacy advantage: since you aren't openly connected to the internet, you don't casually leak your computer's IP to the random strangers you're interacting with (at least when we're talking about a NAT you share with other people, not just your ISP box's NAT), and the amount of harm they can actually do to you is significantly lower.

In the end I think the internet would actually be a significantly better place security-wise for p2p if IPs weren't directly routable by default, and NAT, with all its limitations, gives you mostly that.
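The "well-known server" role is small: essentially it just echoes back the source address it observes, which is the core of what STUN does. A minimal loopback sketch (toy protocol, not the actual STUN message format):

```python
import socket

# "STUN-like server": tells a client what address its packets appear
# to come from. Behind a real NAT, that observed address would be the
# NAT's public IP:port, which is what other peers use for hole punching,
# and also exactly the address "leak" discussed above.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.bind(("127.0.0.1", 0))
client.sendto(b"what is my address?", server.getsockname())

_, observed = server.recvfrom(1024)
server.sendto(f"{observed[0]}:{observed[1]}".encode(), observed)

reply, _ = client.recvfrom(1024)
print("server saw me as", reply.decode())
```

On loopback there is no translation, so the observed address equals the client's local one; behind a NAT the two differ, and comparing them is how a client learns it is NATed at all.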


NAT punching definitely tells other peers your NAT's IP address (and often your local address too, but that's less important).

Unless you're behind CGNAT, your NAT IP can often be used to find your neighborhood with public information. With private information (a legal challenge for example) you can find the exact subscriber/house.


> NAT punching definitely tells other peers your NAT's IP address

Yes, and that's all you share, so when the NAT is shared with other people (like other students on a campus, or other customers of your mobile phone carrier) the amount of info that can be collected is much lower than if your computer had a public IP address.

> Unless you're behind CGNAT

Did you read what I wrote above? I said: “at least when we're talking about a NAT you share with other people, not just your ISP box's NAT”.

> (and often your local address too, but that's less important).

Here you're mixing up the hole-punching part with the signaling protocol (ICE, which has had this issue in the past, before browsers switched to mDNS[1] instead of private IP addresses in ICE candidates).

[1]: https://groups.google.com/g/discuss-webrtc/c/6stQXi72BEU?pli...


You need a signaling protocol to do hole punching.


The two work together to establish a p2p connection behind a NAT, but that doesn't make them equivalent. It's like saying “UDP sometimes leaks your local IP address”: that's factually inaccurate.


Here's an off-topic but somewhat related question that I've been meaning to ask somewhere.

How do "plug and play" consumer devices that receive an incoming call / connection work behind the typical home NAT router? I have an OOMA VOIP phone service which is plugged into my home router with no ports forwarded. It has no trouble receiving an incoming call.

Does it simply open an outgoing connection and hold it open indefinitely?


STUN or an intermediary


Yes, that's pretty much the only way it could work.
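The pattern in question, where the device dials out once and the provider later pushes events down that same connection, can be sketched like this (toy wire protocol and message strings are made up for illustration; real SIP devices also send periodic keepalives so the NAT mapping doesn't expire):

```python
import socket
import threading
import time

# Provider side: holds on to the device's long-lived outbound
# connection and pushes an event (an incoming call) down it later.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)

def provider():
    conn, _ = listener.accept()
    conn.sendall(b"REGISTERED\n")
    time.sleep(0.1)  # some time later, a call arrives for this device
    conn.sendall(b"INCOMING CALL from +15551234\n")
    conn.close()

threading.Thread(target=provider, daemon=True).start()

# Device side: dial out once (NAT happily allows outbound), then
# just keep reading. No port forwarding is ever needed.
device = socket.create_connection(listener.getsockname())
f = device.makefile()
lines = [f.readline().strip() for _ in range(2)]
print(*lines, sep="\n")
```

The NAT only has to keep one outbound mapping alive, which is why a VOIP box behind a default home router can still ring.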


For 'Linux ISOs', some of the decline was bandwidth getting cheaper for mirror hosts: in place of P2P seeders, there are now well-connected mirrors that are no longer 'bandwidth handicapped' compared to the past.


NAT happened. Everyone is behind a NAT these days, so there's no way to directly network with any computer on the internet anymore.


It became... boring technology.


Haha, so many TBs now for such small money.


Thanks for the pointer. I took a look at the code and it's different. It looks like Aether maintains several Bloom filters, each linked to a given time range. In contrast, the paper builds a different kind of Bloom filter where hash functions can overlap.
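For contrast with the rateless construction in the paper (which this is explicitly not), the classic fixed-size Bloom filter that Aether's per-time-range approach builds on is only a few lines; sizes and hash derivation below are arbitrary choices for illustration:

```python
import hashlib

class BloomFilter:
    """Classic fixed-size Bloom filter: m bits, k hash functions.
    False positives are possible; false negatives are not."""

    def __init__(self, m_bits: int = 1024, k_hashes: int = 4):
        self.m, self.k = m_bits, k_hashes
        self.bits = 0  # big int as a bit array

    def _positions(self, item: bytes):
        # Derive k positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            h = hashlib.sha256(i.to_bytes(2, "big") + item).digest()
            yield int.from_bytes(h, "big") % self.m

    def add(self, item: bytes):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: bytes):
        return all(self.bits >> pos & 1 for pos in self._positions(item))
```

The limitation this exposes is exactly what motivates rateless designs: m must be fixed up front for an expected set size, whereas a rateless filter can keep extending as replicas turn out to be further diverged.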


Interesting, thanks for taking a look. That is correct. Aether also does some partial merging in the cases where the time range required does not exactly match the filter sequence.

