The SFTP read function now does transfers the same way the SFTP write
function was made to recently: it creates a list of many outgoing
FXP_READ packets that each asks for a small data chunk. The code then
tries to keep sending read request while collecting the acks for the
previous requests and returns the received data.
The new SFTP write code caused a regression as the seek function no
longer worked as it didn't set the write position properly.
It should be noted that seeking is STRONGLY PROHIBITED during upload, as
the upload magic uses two different offset positions and the multiple
outstanding packets etc make them sensitive to change in the midst of
operations.
This functionality was just verified with the new example code
sftp_append. This bug was filed as bug #202:
Bug: http://trac.libssh2.org/ticket/202
The SFTP handle struct now buffers number of acked bytes that haven't
yet been returned. The way this is used is as following:
1. sftp_write() gets called with a buffer of let say size 32000. We
split 32000 into 8 smaller packets and send them off one by one. One of
them gets acked before the function returns so 4000 is returned.
2. sftp_write() gets called again a short while after the previous one,
now with a much smaller size passed in to the function. Lets say 8000.
In the mean-time, all of the remaining packets from the previous call
have been acked (7*4000 = 28000). This function then returns 8000 as all
data passed in are already sent and it can't return any more than what
it got passed in. But we have 28000 bytes acked. We now store the
remaining 20000 in the handle->u.file.acked struct field to add up in
the next call.
3. sftp_write() gets called again, and now there's a backlogged 20000
bytes to return as fine and that will get skipped from the beginning
of the buffer that is passed in.
This function now only returns EAGAIN if a lower layer actually returned
EAGAIN to it. If nothing was acked and no EAGAIN was received, it will
now instead return 0.
SFTP packets come as [32 bit length][payload] and the code didn't
previously handle that the initial 32 bit field was read only partially
when it was read.
There were some chances that they would cause -1 to get returned by
public functions and as we're hunting down all such occurances and since
the underlying functions do return valuable information the code now
passes back proper return codes better.
The sftp_write function shouldn't assume that the buffer pointer will be
the same in subsequent calls, even if it assumes that the data already
passed in before haven't changed.
The sftp structs are now moved to sftp.h (which I forgot to add before)
sftp_write was rewritten to split up outgoing data into multiple packets
and deal with the acks in a more asynchronous manner. This is meant to
help overcome latency and round-trip problems with the SFTP protocol.
Neither _libssh2_channel_write nor sftp_write now have the 32500 size
limit anymore and instead the channel writing function now has its own
logic to send data in multiple calls until everything is sent.
The new function takes two data areas, combines them and sends them as a
single SSH packet. This allows several functions to allocate and copy
less data.
I also found and fixed a mixed up use of the compression function
arguments that I introduced in my rewrite in a recent commit.
sftp_write() now limits how much data it gets at a time even more
than before. Since this function creates a complete outgoing
packet based on what gets passed to it, it is crucial that it
doesn't create too large packets.
With this method, there's also no longer any problem to use very
large buffers in your application and feed that to libssh2. I've
done numerous tests now with uploading data over SFTP using 100K
buffers and I've had no problems with that.
If an application accidentally provides a NULL handle pointer to
the channel or sftp public functions, they now return an error
instead of segfaulting.
Alexander Lamaison filed bug #172
(http://trac.libssh2.org/ticket/172), and pointed out that SFTP
init would do bad if the session isn't yet authenticated at the
time of the call, so we now check for this situation and returns
an error if detected. Calling sftp_init() at this point is bad
usage to start with.
As the long-term goal is to get rid of the extensive set of
macros from the API we can just as well start small by not adding
new macros when we add new functions. Therefore we let the
function be libssh2_sftp_statvfs() plainly without using an _ex
suffix.
I also made it use size_t instead of unsigned int for the string
length as that too is a long-term goal for the API.
The clang-analyzer report made it look into this function and
I've went through it to remove a potential use of an
uninitialized variable and I also added some validation of input
data received from the server.
In general, lots of more code in this file need to validate the
input before assuming it is correct: there are servers out there
that have bugs or just have another idea of how to do the SFTP
protocol.
This was triggered by a clang-analyzer complaint that turned out
to be valid, and it made me dig deeper and fix some generic non-
blocking problems I disovered in the code.
While cleaning this up, I moved session-specific stuff over to a
new session.h header from the libssh2_priv.h header.
clang-analyzer pointed this out as a "Pass-by-value argument in
function call is undefined" but while I can't see exactly how
this can ever happen in reality I think a little check for safety
isn't such a bad thing here.
I'll introduce a new internal function set named
_libssh2_store_u32
_libssh2_store_u64
_libssh2_store_str
That can be used all through the library to build binary outgoing
packets. Using these instead of the current approach removes
hundreds of lines from the library while at the same time greatly
enhances readability. I've not yet fully converted everything to
use these functions.
I've converted LOTS of 'unsigned long' to 'size_t' where
data/string lengths are dealt with internally. This is The Right
Thing and it will help us make the transition to our
size_t-polished API later on as well.
I'm removing the PACKET_* error codes. They were originally
introduced as a set of separate error codes from the transport
layer, but having its own set of errors turned out to be very
awkward and they were then converted into a set of #defines that
simply maps them to the global libssh2 error codes instead. Now,
I'l take the next logical step and simply replace the PACKET_*
defines with the actual LIBSSH2_ERROR_* defines. It will increase
readability and decrease confusion.
I also separated packet stuff into its own packet.h header file.
We reserve ^libssh2_ for public symbols and we use _libssh2 as
prefix for internal ones. I fixed the intendation of all these
edits with emacs afterwards, which then changed it slightly more
than just _libssh2_error() expressions but I didn't see any
obvious problems.
This function no longer has any special purpose code for the
single entry case, as it was pointless.
The previous code would overflow the buffers with an off-by-one
in case the file name or longentry data fields received from the
server were exactly as long as the buffer provided to
libssh2_sftp_readdir_ex.
We now make sure that libssh2_sftp_readdir_ex() ALWAYS zero
terminate the buffers it fills in.
The function no longer calls the libssh2_* function again, but
properly uses the internal sftp_* instead.
When _libssh2_channel_write() is asked to send off 9 bytes, the
code needs to deal with the situation where less than 9 bytes
were sent off and prepare to send the remaining piece at a later
time.
libssh2_error() no longer allocates a string and only accepts a const
error string. I also made a lot of functions use the construct of
return libssh2_error(...) instead of having one call to
libssh2_error() and then a separate return call. In several of those
cases I then also changed the former -1 return code to a more
detailed one - something that I think will not change behaviors
anywhere but it's worth keeping an eye open for any such.
internal code was changed to use that instead of wrongly using
libssh2_channel_read_ex(). Some files now need to include
channel.h to get this proto.
channel_read() calls libssh2_error() properly on transport_read()
failures
channel_read() was adjusted to not "invent" EAGAIN return code in
case the transport_read() didn't return it
channel_close() now returns 0 or error code, as
documented. Previously it would return number of bytes read in
the last read, which was confusing (and useless).
I made this change just to easier grep for "return .*EAGAIN" cases
as they should be very rare or done wrongly. Already worked to find
a flaw, marked with "TODO FIXME THIS IS WRONG" in channel.c. I also
fixed a few cases to become more general returns now when we have
more unified return codes internally.
On many places in the code there have been laziness return -1
statements lying around that should be fixed to return sensible
error codes. Here's a take at fixing a few offenders.
Each SFTP file handle is now handled by the "mother-struct"
using the generic linked list functions. The goal is to move
all custom linked list code to use this set of functions.
I also moved the list declarations to the misc.h where they
belong and made misc.h no longer include libssh2_priv.h itself
since now libssh2_priv.h needs misc.h...
In misc.c I added a #if 0'ed _libssh2_list_insert() function
because I ended up writing one, and I believe we may need it here
too once we move over more stuff to use the _libssh2_list* family.