There is no functional difference between an array of unsigned bytes (binary) and an array of signed bytes (char data). The only difference is that when you send binary, 0 is a valid value instead of the terminator of a string. Therefore you must prepend the size, because you can no longer parse until you find a NUL byte. It is always safer to know the size you must allocate ahead of time instead of dynamically growing a buffer until the text stream is terminated.
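To make the framing concrete, here's a minimal sketch in Python. The 4-byte big-endian size header is an assumption of mine (the comment doesn't specify a header format); the point is just that the receiver reads exactly `size` bytes instead of scanning for a terminator:

```python
import struct

def frame(payload: bytes) -> bytes:
    # Length-prefixed framing: 4-byte big-endian size header, then raw bytes.
    # Embedded zero bytes are fine because nothing scans for a terminator.
    return struct.pack(">I", len(payload)) + payload

def unframe(buf: bytes) -> bytes:
    # Read the declared size, then slice out exactly that many bytes.
    (size,) = struct.unpack_from(">I", buf, 0)
    return buf[4 : 4 + size]

msg = b"\x00\x01binary\x00data"  # contains NUL bytes, would break strlen-style parsing
assert unframe(frame(msg)) == msg
```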
There are plenty of examples of binary formats where you do not know buffer sizes until you've received all the data, and where assumptions made while parsing the data can cause a buffer overflow.
Decompression and PNG libraries, for example, have caused massive security impact across the industry because of their reuse in different products. Font handling, compressed bitmaps, and Windows cursor parsing have also been sources of issues.
Mozilla didn't just invest in Rust because parsing HTML and JSON is hard. It's all hard.
“It is always safer to know the size you must allocate ahead of time instead of dynamically growing a buffer until the text stream is terminated.”
And then you go on to give examples of how this is true while saying I'm wrong? Any time you have an unknown payload, you have to decide how long you're going to wait and how much you're going to accept and buffer, etc., before it becomes a drain on the system.
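A sketch of the kind of policy decision being described, in Python. The 1 MiB cap and the `read_message` helper are hypothetical names I'm introducing for illustration; the idea is simply that a size header alone isn't enough, because the server must also refuse to honor an absurd declared size:

```python
import io
import struct

MAX_PAYLOAD = 1 << 20  # hypothetical 1 MiB cap: the "how much you'll accept" knob

def read_message(stream: io.BufferedIOBase, max_size: int = MAX_PAYLOAD) -> bytes:
    """Read one length-prefixed message, refusing oversized declarations."""
    (size,) = struct.unpack(">I", stream.read(4))
    if size > max_size:
        # Reject before allocating, so a hostile header can't exhaust memory.
        raise ValueError(f"declared size {size} exceeds cap {max_size}")
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("truncated message")
    return data

buf = io.BytesIO(struct.pack(">I", 5) + b"hello")
assert read_message(buf) == b"hello"
```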
Yes, in fact, this is most common in video streaming formats. These types of streams are more commonly downloaded than uploaded, where the server has to be careful not to exhaust too many resources parsing variable-length messages.
There’s no reason strings can’t be sent like binary if you send a size header first. The problem is trying to send binary like a string, where your data might contain zero bytes that the receiver interprets as string terminators. Typically base64 is used to address this issue.
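The base64 approach can be demonstrated in a few lines of Python: the encoded form contains no zero bytes, so it survives any string-oriented channel, at the cost of ~33% size overhead:

```python
import base64

raw = b"\x00\xffbinary\x00payload"      # arbitrary binary, including NUL bytes
encoded = base64.b64encode(raw)         # safe to treat as a C string: no NULs
assert b"\x00" not in encoded
assert base64.b64decode(encoded) == raw  # lossless round trip
```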