There is a three-part hash going on: an Origin ID hash, a URL hash, and an MD5 of the actual payload. When a new asset is registered on the mesh, the Edgemesh backplane downloads the asset directly to confirm the MD5. If it doesn't match, the asset isn't allowed to register. On a replication, the destination node receives the asset and calculates the MD5 again. If the MD5 doesn't match, it signals Edgemesh, which then takes the source node out of the mesh. E.g. if you modify an asset and attempt to replicate it - the receiving party will invalidate the object and signal back to Edgemesh. Replication directions come from the Edgemesh backplane. PM me if you'd like to go into this in more detail.
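To make the replication check concrete, here is a minimal Python sketch of how I read the description above. It is not Edgemesh's actual code: `AssetKey`, `make_key`, `on_replication` and the `report` callback are hypothetical names, and hashing the Origin ID and URL with MD5 is purely an assumption, since only the payload hash is stated to be MD5.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetKey:
    """Hypothetical three-part identifier: origin hash, URL hash, payload MD5."""
    origin_hash: str
    url_hash: str
    payload_md5: str


def _md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def make_key(origin_id: str, url: str, payload: bytes) -> AssetKey:
    # Assumption: the origin ID and URL are hashed with MD5 too; the thread
    # only says the payload hash is MD5.
    return AssetKey(_md5_hex(origin_id.encode()),
                    _md5_hex(url.encode()),
                    _md5_hex(payload))


def on_replication(advertised: AssetKey, received: bytes, source_node: str, report) -> bool:
    """Destination-side check: recompute the payload MD5 and compare.

    On a mismatch, call report(source_node, advertised) so the backplane
    can take the source node out of the mesh.
    """
    if _md5_hex(received) == advertised.payload_md5:
        return True
    report(source_node, advertised)
    return False
```

With this shape, a node that alters the payload in transit is reported, because the recomputed MD5 no longer matches the advertised one.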
> In 1996, Dobbertin announced a collision of the compression function of MD5 (Dobbertin, 1996). While this was not an attack on the full MD5 hash function, it was close enough for cryptographers to recommend switching to a replacement, such as SHA-1 or RIPEMD-160.
:) You're dead right, and it's why we use it inside two other top-level hashes (e.g. you'd need to collide inside the OriginID space as well). It's certainly possible though (for extremely large sites), and we're experimenting with an xxHash64 implementation for a later release.
> if you modify an asset and attempt to replicate it - the receiving party will invalidate the object and signal back to Edgemesh
If I understand your explanation correctly, the receiving party will invalidate the object if the MD5 of the object doesn't match the advertised MD5? That would leave you open to people serving other objects with the same MD5 hash as the original.
You can, but our backplane won't know about your local modifications. When your client informs the backplane (on a sync), it will see that those IDs and hashes weren't registered, and it will instruct your client to delete them.
E.g. modifications that happen in your local instances are checked against our backplane. If an asset hasn't been registered (and verified independently via our backplane), it won't be available for replication.
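A rough sketch of that sync step, assuming the client reports a map of asset ID to payload hash and the backplane holds the same kind of map for registered assets. The function name and data shapes are hypothetical, not Edgemesh's API.

```python
def assets_to_delete(local_assets: dict, registered: dict) -> list:
    """Return the asset IDs a client should be told to delete.

    Both arguments map asset ID -> payload hash. An asset is rejected when
    its ID is unknown to the backplane or its hash differs from the
    registered one (e.g. after a local modification).
    """
    return sorted(
        asset_id
        for asset_id, digest in local_assets.items()
        if registered.get(asset_id) != digest
    )


# Example: "b" was modified locally, "c" was never registered.
registered = {"a": "h1", "b": "h2"}
local = {"a": "h1", "b": "tampered", "c": "h3"}
print(assets_to_delete(local, registered))
```

Note that this only catches changes that alter the reported hash, which is why the strength of the payload hash matters.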
I'm working on a platform (Peerweb) similar to the product being discussed, and I think I've put more thought into the security and autonomous self-policing aspects of P2P CDNs. I don't waste my time with MD5, and I deeply considered the PKI that I designed.
Also, my platform can offload all assets including the page itself and enables sites to get free failover during content server downtime. Due to my DNS-seeded PKI, your users stay secure and content continues to be correctly authenticated in your P2P CDN cache even when your site would normally be down.
Ah I see, I forgot that in the SSL attack the attacker had to choose both certificate prefixes as opposed to just one. Thanks!
It does seem to me, though, that if I could coerce/direct the site into accepting one image that I created, I could manage to replicate a second, different file throughout the network. This obviously assumes I computed both images ahead of time and that both image formats were unperturbed by the nonsense appended to the file by the attack.
When you register a new asset, the Edgemesh backend downloads it from origin itself to validate the hash you've calculated. And on replication the destination recalculates it on the payload (to make sure the asset replicated correctly).
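The registration half of that could look roughly like the sketch below. `register_asset`, `fetch_from_origin` and the `registry` dict are made-up names for illustration; the only thing taken from the description is that the backend fetches the asset from origin itself rather than trusting the client's bytes.

```python
import hashlib


def register_asset(url: str, claimed_md5: str, fetch_from_origin, registry: dict) -> bool:
    """Register an asset only if an independent origin fetch matches the claim.

    fetch_from_origin is a callable (url -> bytes) standing in for the
    backend's own download; registry maps url -> verified MD5.
    """
    payload = fetch_from_origin(url)
    actual_md5 = hashlib.md5(payload).hexdigest()
    if actual_md5 != claimed_md5:
        return False  # client-calculated hash disagrees with origin; refuse
    registry[url] = actual_md5
    return True
```

The key design point is that the client supplies only a hash, never the payload that gets registered.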
Right. So let's say we have file A, which is an innocuous image file, and file A', which is a malicious image file, where MD5(A) == MD5(A'). Based on the MD5 prefix collision attack, I should be able to construct two such files A and A'.
I get an Edgemesh site to accept file A (perhaps the site allows me to upload a user avatar, upload an image on a forum, etc). I then behave as a node in the mesh, and receive file A. When I get a request to replicate file A to someone else, I send them file A', they check the MD5 hash, and the hash matches. Not seeing how that doesn't work?
It is admittedly a narrow attack, but I think it works.
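To illustrate why a hash-only check on the receiving side can't tell A from A', here is a toy sketch. It does not produce a real MD5 collision: `weak_digest` truncates MD5 to 16 bits as a stand-in for a hash where collisions are cheap, and `find_collision` brute-forces a match against an existing file, whereas the real attack crafts both files together. All names are hypothetical.

```python
import hashlib
from itertools import count


def weak_digest(data: bytes) -> str:
    # Toy stand-in for a collision-prone hash: first 16 bits of MD5.
    return hashlib.md5(data).hexdigest()[:4]


def strong_digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def find_collision(original: bytes) -> bytes:
    """Brute-force a different payload with the same weak digest."""
    target = weak_digest(original)
    for i in count():
        candidate = b"malicious-payload-" + str(i).encode()
        if candidate != original and weak_digest(candidate) == target:
            return candidate


def replica_accepted(advertised_digest: str, received: bytes, digest_fn) -> bool:
    """Receiver-side check: accept whatever hashes to the advertised value."""
    return digest_fn(received) == advertised_digest


file_a = b"innocuous image bytes"
file_a_prime = find_collision(file_a)

# The weak check accepts the substituted file; a collision-resistant one does not.
print(replica_accepted(weak_digest(file_a), file_a_prime, weak_digest))
print(replica_accepted(strong_digest(file_a), file_a_prime, strong_digest))
```

The receiver only ever sees the advertised digest and the bytes it was sent, so any two payloads with the same digest are indistinguishable to it.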
Am I missing something, or would this let any node (supernode/browser) in the system potentially replace arbitrary content with their own content? [1]
Hopefully JS isn't being served by this mechanism (attack vector pretty obvious there), but even images are still a concern [2] [3].
[1] https://en.wikipedia.org/wiki/Collision_attack#Chosen-prefix...
[2] https://threatpost.com/apple-patches-ios-flaw-exploitable-by...
[3] https://imagetragick.com/