bufWrite: Save extra syscall when data fills handle buffer completely.

The bug is that the check if (size - w > count) should be
if (size - w >= count) instead (>= instead of >),
because the data can be copied into the buffer just fine if it fits
exactly. This lets us do 1 instead of 2 write syscalls in the case
where it fits exactly.

An example of when this matters is when an application writes output
in chunks that are a fraction of the handle buffer size.

For example, assume the buffer size is 8 KB, and the application writes
four 2 KB chunks.
Until now, the first 3 chunks would be copied to the handle buffer, but
the 4th one would be rejected by size - w > count
(because size - w == count at that point), so we'd end up with a write
syscall of only 6 KB of data instead of 8 KB, thus creating more
syscalls than necessary.

Implementing this fix (switching to size - w >= count), we also have
to flush the buffer if we fill it completely.
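
The 8 KB example and the flush-on-full rule can be sketched with a small
pure model. This is not the real bufWrite: the buffer is reduced to its
fill level, each write syscall is recorded as a byte count, and all names
(Check, fits, step, syscalls) are hypothetical.

```haskell
-- Simplified model, NOT the real bufWrite: the handle buffer is reduced to
-- its fill level w, and every write syscall is recorded as a byte count.
-- All names below are hypothetical.

-- | Which comparison guards the copy into the handle buffer.
data Check = OldCheck | NewCheck

-- | May a chunk of count bytes be copied into a buffer of the given size
-- that already holds w bytes?
fits :: Check -> Int -> Int -> Int -> Bool
fits OldCheck size w count = size - w >  count
fits NewCheck size w count = size - w >= count

-- | Handle one chunk: returns the new fill level and the syscalls issued.
step :: Check -> Int -> Int -> Int -> (Int, [Int])
step check size w count
  | fits check size w count =
      let w' = w + count
      in if w' == size
           then (0, [size])   -- buffer filled exactly: flush it right away
           else (w', [])      -- plain copy, no syscall
  | otherwise =
      let flushed = [w | w > 0]          -- flush pending data first
      in if count < size
           then let (w', more) = step check size 0 count
                in (w', flushed ++ more) -- retry with the empty buffer
           else (0, flushed ++ [count])  -- too large to buffer: write directly

-- | Sizes of all write syscalls for a sequence of chunks, including the
-- final flush of whatever is still buffered.
syscalls :: Check -> Int -> [Int] -> [Int]
syscalls check size = go 0
  where
    go w []       = [w | w > 0]
    go w (c : cs) = let (w', out) = step check size w c
                    in out ++ go w' cs
```

In this model, syscalls OldCheck 8192 (replicate 4 2048) evaluates to
[6144, 2048], while syscalls NewCheck 8192 (replicate 4 2048) evaluates
to [8192].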

If we made only the changes described so far, that would have the
unintended side effect that writes of the size equal to the handle
buffer size (count == size) suddenly also go to the handle buffer
first: The data would first be copied to the handle buffer, and then
immediately get flushed to the underlying FD. We don't want that extra
memcpy, because it'd be unnecessary: The point of handle buffering is
to coalesce smaller writes, and there are no smaller writes in this
case. For example, if you specify 8 KB buffers (which means you want
your data to be written out in 8 KB blocks), and you get data that's
already 8 KB in size, you can write that out as an 8 KB block straight
away, in zero-copy fashion. For this reason, copying into the handle
buffer now has the additional condition count < size. That way, writes
equal to the buffer size go straight to the FD, as they did before this
commit.
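
As a rough sketch of the resulting decision (a pure model with
hypothetical names Action and decide, not the actual code in this commit),
the two conditions combine as follows. It assumes that pending buffered
data is flushed before a direct write so that output stays in order.

```haskell
-- Hypothetical model of the decision described above; not the real code.
data Action
  = CopyToBuffer          -- chunk fits with room to spare
  | CopyThenFlush         -- chunk fills the buffer exactly
  | FlushThenCopy         -- chunk is bufferable, but the buffer is too full
  | FlushThenWriteDirect  -- count >= size: bypass the buffer, no memcpy
  deriving (Eq, Show)

-- | size: buffer capacity, w: bytes already buffered, count: chunk size.
decide :: Int -> Int -> Int -> Action
decide size w count
  | count >= size      = FlushThenWriteDirect -- flush is a no-op if w == 0
  | size - w >  count  = CopyToBuffer
  | size - w == count  = CopyThenFlush
  | otherwise          = FlushThenCopy
```

With an empty 8 KB buffer, decide 8192 0 8192 is FlushThenWriteDirect, so
the 8 KB write skips the buffer, while decide 8192 6144 2048 is
CopyThenFlush, which is the case this commit fixes.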

