Fix curle partial file on downloading index #221
This commit adds the parameter userdata to the function acquire_file() to specify the private data given to cURL, which is not necessarily the CaRemote object.
This commit fixes a download issue. The casync-http process now writes
the whole index in one shot once the download has completed, instead of
streaming it to the casync process.
The transfer may end with CURLE_PARTIAL_FILE when the download of the
index does not complete before casync starts seeding.
That issue is reproducible when the seed operation takes a very long time. A large
block device of 9.5GB is enough to reproduce the issue.
In this situation, the cURL transfer starts in the casync-http process
and the write_index() function streams the data received to the casync
process through stdout.
Meanwhile, the casync process starts the long seeding operation,
which causes the casync-http process to stall the transfer because the
poller blocks it in the function process_remote().
Many minutes later, when the seeding operation ends, the poll in
casync-http returns and the function robust_curl_easy_perform() returns
CURLE_PARTIAL_FILE (which stands for "Transferred a partial file" in a
human-readable format, according to curl_easy_strerror()).
The commit no longer calls the poller in the middle of the
transfer of the index, as is already the case for chunks. Instead, it
appends the data to the realloc buffer, then sends the whole data once
the transfer is complete.
The commit reuses the function write_chunk() to download both index and
chunks. It renames that function to write_buffer(), which is more generic.
The buffer is then written to casync using the former function
write_index(), which no longer contains any cURL-specific code.
This commit fixes the issue below:
$ casync extract http://localhost/rootfs-9.5GB.caibx /dev/sda6 -v
Failed to acquire http://localhost/rootfs/rootfs-9.5G.caibx
Failed to run synchronizer: Broken pipe
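
To illustrate the buffering approach described in the commit message, here is a minimal sketch, not the actual casync code. It assumes a hypothetical `struct sketch_buffer` in place of casync's realloc buffer, a hypothetical `SKETCH_SIZE_MAX` cap standing in for the protocol size limit, and hypothetical function names. The callback has the shape libcurl expects for a write callback: it only appends to memory and never blocks on a poll, and the data is written out in one pass after the transfer has finished.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical cap on the buffered index size (an assumption for this sketch). */
#define SKETCH_SIZE_MAX ((size_t) 16 * 1024 * 1024)

/* Hypothetical growable buffer; casync uses its own realloc buffer type. */
struct sketch_buffer {
        char *data;
        size_t size;
};

/* Same signature as a libcurl write callback: append the received bytes
 * to the buffer. Returning a value different from size * nmemb tells
 * cURL to abort the transfer. */
size_t sketch_write_buffer(char *ptr, size_t size, size_t nmemb, void *userdata) {
        struct sketch_buffer *b = userdata;
        size_t n;
        char *p;

        assert(b);

        if (size != 0 && nmemb > SIZE_MAX / size)
                return 0; /* multiplication would overflow */
        n = size * nmemb;
        if (n == 0)
                return 0; /* nothing to append */

        if (n > SKETCH_SIZE_MAX - b->size)
                return 0; /* index larger than the cap: abort */

        p = realloc(b->data, b->size + n);
        if (!p)
                return 0; /* out of memory: abort */

        memcpy(p + b->size, ptr, n);
        b->data = p;
        b->size += n;
        return n;
}

/* Called only after the transfer has completed: write the whole buffer
 * to fd (stdout in the casync-http case), handling short writes. */
int sketch_flush(const struct sketch_buffer *b, int fd) {
        size_t off = 0;

        while (off < b->size) {
                ssize_t k = write(fd, b->data + off, b->size - off);
                if (k < 0)
                        return -1;
                off += (size_t) k;
        }
        return 0;
}
```

With libcurl, such a callback would be registered through `CURLOPT_WRITEFUNCTION`, with the buffer passed as `CURLOPT_WRITEDATA`. The trade-off is that the whole index is held in memory until the download ends, which is why a size cap is needed.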


Hi,
This is a proposal to fix the issue mentioned in the second commit. This solution has a drawback.
There is an alternative solution: fixing the issue in the casync process by updating the state machine to download the file after seeding.
In my solution, the size of the index is limited to
CA_PROTOCOL_SIZE_MAX (16*1024*1024). Is it enough for an index file?
@poettering, @keszybz, what do you think?