> UDP is just about as reliable or unreliable as IP.
Ok.
> It's a shim on top of IP to let unprivileged users send IP datagrams that can be multiplexed back to the right user. (Hence the name: "user" datagram protocol.)
It's mostly a way to send data without the full 'call-setup-data-transmission-and-terminate' cycle that virtual-circuit protocols such as TCP require, for when you don't need all that luxury. That suits protocols that carry small amounts of data, where retrying is not a problem and where loss of a packet is not an immediate disaster. For the same reason it's also more suitable than TCP for real-time applications (especially true for the first packet). And because there is no virtual circuit, a single listener can handle data from multiple senders.
The 'USER' does not refer to unprivileged users but simply to users as opposed to system packets (such as ICMP and other datagram-like packets that are not usually sent out directly by applications). So it's not a matter of privilege but of user space vs. system modules elsewhere in the stack.
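To make the "no call setup, one listener, many senders" point concrete, here is a minimal sketch using Python's standard `socket` module on the loopback interface. The variable names are made up for illustration, and loopback hides the loss and reordering a real network can introduce.

```python
import socket

# One unconnected UDP socket is the listener; port 0 asks the OS for any free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(("127.0.0.1", 0))
listener.settimeout(2.0)  # do not block forever if a datagram goes missing
addr = listener.getsockname()

# Two independent senders. Neither performs a handshake before sending.
sender_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender_a.sendto(b"hello from a", addr)
sender_b.sendto(b"hello from b", addr)

# The same socket receives from both peers; recvfrom reports who sent what.
received = {}
for _ in range(2):
    data, peer = listener.recvfrom(2048)
    received[peer] = data

for sock in (listener, sender_a, sender_b):
    sock.close()
```

With TCP the listener would instead end up with one accepted socket per sender.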
> Lots of people talk about a "TCP or UDP" design decision, but that's usually not the relevant question for an application developer.
It absolutely is.
> Most UDP applications wouldn't just blat their data directly into UDP datagrams,
They usually do exactly that.
> just like almost nobody blats their data directly into IP datagrams.
You're comparing apples with oranges: IP is one layer and TCP and UDP are on another. So you'd have to compare UDP with TCP and then you're back to that design decision again.
> Typical applications generally want some sort of transport protocol on top of the datagrams (whether UDP-in-IP or just plain IP), that cares about reliability and often about fairness.
Fairness is usually not under the control of the endpoints of a conversation; it is something the routers in between influence. They can decide to let a packet through or drop it (this goes for TCP as well as UDP). If a line is congested, your UDP packets will usually be dropped before your TCP packets (rules can be set to configure this), in spite of the fact that TCP will retry any lost packets. UDP packets can also be duplicated, or routed in such a way that they arrive out of order.
> That could be TCP-over-IP, but some (like MixApp, which wanted TCP but not the kernel's default implementation) use TCP-over-UDP-over-IP, BitTorrent uses LEDBAT-over-UDP, Mosh uses SSP-over-UDP, Chrome uses QUIC-over-UDP and falls back to SPDY-over-TCP and HTTP-over-TCP, etc.
Running alternative protocols packaged inside other protocols is a time-honored practice. See also: IP over carrier pigeons (RFC 1149) and tunneling HTTP over DNS traffic (effectively using UDP). This is not in any way special; it's just a means to an end.
> The idea that using UDP somewhere in the stack means "designing and implementing a reliable transport protocol from scratch" is silly. It doesn't mean that any more than using IP does.
It actually comes down to exactly that. If you use UDP as your base and your application requires reliable transmission of data, then you're going to have to deal with loss, duplication and sequencing at some other point in your application, or put another (pre-existing) protocol on top of it to mitigate these.
If your application can tolerate those errors (or if they are not considered errors) then a naive implementation will do.
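For the case where the application cannot tolerate those errors, the kind of work involved can be sketched as follows. This is only an illustration, not any particular protocol: the 4-byte sequence header, `encode` and `InOrderReceiver` are hypothetical names, and retransmission timers, acknowledgements and sequence wrap-around are left out.

```python
import struct

# Hypothetical wire format: 4-byte big-endian sequence number, then the payload.
HEADER = struct.Struct("!I")


def encode(seq, payload):
    """Prefix a payload with its sequence number."""
    return HEADER.pack(seq) + payload


class InOrderReceiver:
    """Drops duplicates and releases payloads strictly in sequence order."""

    def __init__(self):
        self.next_seq = 0   # next sequence number owed to the application
        self.pending = {}   # datagrams that arrived ahead of a gap

    def on_datagram(self, datagram):
        (seq,) = HEADER.unpack_from(datagram)
        payload = datagram[HEADER.size:]
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate: already delivered or already buffered
        self.pending[seq] = payload
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered

    def missing(self):
        """Sequence numbers in the gap, i.e. candidates for a retransmit request."""
        if not self.pending:
            return []
        return [s for s in range(self.next_seq, max(self.pending))
                if s not in self.pending]


rx = InOrderReceiver()
first = rx.on_datagram(encode(0, b"a"))    # in order: delivered at once
early = rx.on_datagram(encode(2, b"c"))    # arrives before 1: held back
gap = rx.missing()                         # 1 is still outstanding
dup = rx.on_datagram(encode(2, b"c"))      # duplicate: ignored
late = rx.on_datagram(encode(1, b"b"))     # fills the gap, releases 1 and 2
```

Even this toy version still needs a sender that keeps unacknowledged data around and resends it on request, which is where the resemblance to TCP starts.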
> The question of SOCK_STREAM vs. SOCK_DGRAM is more typically, "Does the application want to use the one transport protocol that the kernel implements, namely a particular flavor of TCP, or does it want to use some other transport protocol implemented in a userspace library?"
TCP is the default for anything requiring virtual circuits. If you have demands that are not well described by that model and/or you need real-time, low-overhead delivery, and you're willing to do the work required to deal with UDP's inherent issues (if those are a problem), then you're totally free to build on UDP instead.
But the question is usually not 'do I need TCP', it usually is 'how do I avoid re-implementing TCP if I need its features'.
It's a tough choice because at a minimum it means that you're going to have to write software for both endpoints.
This is one of the reasons why we see HTTP over TCP in so many places where it wasn't originally intended: it is more or less guaranteed to be well tested and there are tons of tools available to use this protocol combination, especially browsers, fetchers and servers in all kinds of flavors. For UDP that situation is much less rosy and using UDP always translates into having to do a bunch of plumbing yourself.
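For readers who have not used both socket types, a minimal loopback sketch of the SOCK_STREAM vs. SOCK_DGRAM difference, using only Python's standard `socket` module. Variable names are made up for illustration; on a real network the UDP half could lose or reorder datagrams, which loopback does not show.

```python
import socket

# SOCK_STREAM (TCP): listen/connect/accept set up the virtual circuit first.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
server.listen(1)
client = socket.create_connection(server.getsockname(), timeout=2.0)
conn, _ = server.accept()
conn.settimeout(2.0)

client.sendall(b"one")
client.sendall(b"two")

# The receiver sees a byte stream: the two writes may arrive merged or split,
# so the application has to do its own framing.
stream = b""
while len(stream) < 6:
    chunk = conn.recv(1024)
    if not chunk:
        break
    stream += chunk

# SOCK_DGRAM (UDP): no setup at all, and each send is one self-contained message.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
rx.settimeout(2.0)
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(b"one", rx.getsockname())
tx.sendto(b"two", rx.getsockname())
datagrams = [rx.recvfrom(1024)[0] for _ in range(2)]

for sock in (client, conn, server, rx, tx):
    sock.close()
```

The stream side gives you ordering and retries but no message boundaries; the datagram side gives you message boundaries and nothing else.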
The thing that bites a lot of naive protocol designers who use UDP is that it doesn't guarantee order either. You can get out-of-order delivery, and that gets more likely the more IP subnets your packets cross: in part because some folks do traffic shaping, in part because a lot of ISPs consider UDP traffic "less important", and in part because new switches, like the latest compilers, have enough CPU in the control plane to play games with the packets flying about.
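One common way for real-time applications to live with reordering, rather than repair it, is to number each datagram and discard anything older than the newest one already seen. A minimal sketch, with `LatestWins` as a hypothetical name and sequence wrap-around ignored:

```python
class LatestWins:
    """Accepts a datagram only if it is newer than everything seen so far."""

    def __init__(self):
        self.newest = -1  # no datagram seen yet

    def accept(self, seq):
        if seq <= self.newest:
            return False  # stale or duplicate: the state it carries is outdated
        self.newest = seq
        return True


# Simulated arrival order: 1 shows up after 2, and 2 is duplicated.
arrivals = [0, 2, 1, 2, 3]
flt = LatestWins()
decisions = [flt.accept(seq) for seq in arrivals]
```

This only works when a newer datagram fully supersedes older ones (positions, sensor readings); it is no help if every message matters.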
> You're comparing apples with oranges: IP is one layer and TCP and UDP are on another.
That is what the books say, but I don't think it is right. When you consider that raw IPv4 doesn't work in practice because of NAT, UDP is the de facto minimum internet layer.
NAT is a kludge. The existence of NAT does not change the fact that both TCP and UDP are layered on top of IP, and there is plenty of stuff layered directly on IP besides UDP, for instance ICMP.
> It's mostly a way to send data without the full 'call-setup-data-transmission-and-terminate' cycle that virtual-circuit protocols such as TCP require, for when you don't need all that luxury. That suits protocols that carry small amounts of data, where retrying is not a problem and where loss of a packet is not an immediate disaster. For the same reason it's also more suitable than TCP for real-time applications (especially true for the first packet). And because there is no virtual circuit, a single listener can handle data from multiple senders.
It is possible that someone wants a virtual circuit but can do better than TCP for their application. I think the parent's explanation was more apt: it's a small layer on top of IP for you to implement your own protocol logic.