Hi Ruben,
I think you're falling into a common trap re. TCP. On a lightly-loaded network, if you send a block of data on one host it typically arrives at the other end of the connection in a single piece, so calls to send() and recv() appear to be one-to-one. In that situation adding NODELAY will (seem to) solve problems like the ones you were seeing. However, it all falls to pieces under load, congestion, or any other kind of network trouble: it is perfectly legitimate for the stack to combine and/or fragment packets, and that breaks the one-to-one relationship on which the correctness of your program rests.
You _must_ treat data received over TCP as a continuous stream of bytes rather than a sequence of discrete packets, which means handling cases such as your 4-byte length indicator being split across two packets so that it doesn't all arrive in a single recv(). If you don't, it will bite you at the very worst time, and will do so nondeterministically. This kind of thing is very hard to reproduce in a test environment.
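The usual fix is a small read loop that keeps calling recv() until a complete message has accumulated. Here's a rough sketch in Python, assuming a blocking socket and a big-endian 4-byte length prefix (adapt it to whatever framing you actually use):

    import struct

    def recv_exact(sock, n):
        # Keep calling recv() until exactly n bytes have arrived,
        # regardless of how the data was split into packets.
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed the connection mid-message")
            buf += chunk
        return buf

    def recv_message(sock):
        # Read the 4-byte length header, then the payload it describes.
        (length,) = struct.unpack("!I", recv_exact(sock, 4))
        return recv_exact(sock, length)

The same idea applies in C or any other language: loop until you have the whole header, then loop again until you have the whole payload.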
There is nothing special about the DCC protocol that makes it immune from this effect.
Best wishes,
David