Introduce new native backup provider (KNIB) #12758
JoaoJandre wants to merge 3 commits into apache:main
Conversation
Codecov Report

@@             Coverage Diff              @@
##               main    #12758     +/-  ##
============================================
- Coverage     17.92%    17.84%    -0.09%
- Complexity    16175     16226      +51
============================================
  Files          5949      6001      +52
  Lines        534058    537981    +3923
  Branches      65301     65650     +349
============================================
+ Hits          95742     95983     +241
- Misses       427560    431217    +3657
- Partials      10756     10781      +25
@JoaoJandre just a heads-up - my colleagues have been working on an incremental backup feature for NAS B&R (using nbd/qemu bitmap tracking & checkpoints). We're also working on a new Veeam-KVM integration for CloudStack whose PR may be out soon. My colleagues can further help review and advise on this. /cc @weizhouapache @abh1sar @shwstppr @sureshanaparti @DaanHoogland @harikrishna-patnala Just my 2 cents on the design & your comments - NAS is more than just NFS; it covers any (mountable) shared storage such as CephFS, cifs/samba, etc. Enterprise users usually don't want to mix secondary storage with backup repositories, which is why NAS B&R introduced a backup-provider-agnostic concept of backup repositories that can be explored by other backup providers.
When I wrote that part, I believe only NFS was supported. I'll update the relevant part.
The secondary storage selector feature (introduced in 2023 by #7659) allows you to specialize secondary storages. This PR extended the feature so that you may also create selectors for backups.
Hi Joao, this looks promising. Incremental backups, quick restore, and file restore features have been missing from CloudStack KVM. I am having trouble understanding some of the design choices, though:
Hello, @abh1sar
I don't see why we should force the coupling of backup offerings with backup repositories, what is the benefit?
The secondary storage also has both features, although its capacity is not currently reported to users.
The secondary storage selectors feature (introduced in 2023 through #7659) allows you to specialize secondary storages. Quoting from the PR description: Furthermore, my colleagues are working on a feature to allow using alternative secondary storage solutions, such as CephFS, iSCSI and S3, while preserving compatibility with features destined to NFS storages. This feature may be extended in the future to allow essentially any type of secondary storage. Thus, the flexibility for secondary storages will soon grow.
Using any other type of backup-level compression will be worse than using qemu-img compression. This is because, when restoring a backup, we must have access to the whole backing chain. If we used another type of compression, we would have to decompress the whole chain before restoring. With qemu-img, the backing files remain valid and never need to be decompressed at all; this is the great benefit of using qemu-img. In any case, here is a brief comparison of using qemu-img with the zstd library and 8 threads and using the
Compression using qemu-img was a lot faster, with a bit smaller compression ratio. Furthermore, we have to consider that the qemu-img compressed image can be used as-is, while the other images must be decompressed, further adding to the processing time of backing up/restoring a backup.
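As a rough sketch of the approach described above: the compressed copy is produced with `qemu-img convert`, and the output remains a directly usable qcow2, so no decompression pass is needed on restore. The exact invocation used by the plugin is not shown in this PR; the helper name and flag layout below are assumptions based on standard qemu-img options (`compression_type=zstd` requires qcow2 v3 and qemu >= 5.1).

```python
def build_compress_command(src: str, dst: str,
                           library: str = "zstd",
                           coroutines: int = 8) -> list[str]:
    """Build a hypothetical qemu-img invocation that writes a
    compressed qcow2 copy of `src` to `dst`."""
    return [
        "qemu-img", "convert",
        "-O", "qcow2",
        "-c",                                 # compress data clusters
        "-o", f"compression_type={library}",  # zstd instead of the zlib default
        "-m", str(coroutines),                # parallel convert coroutines
        src, dst,
    ]
```

The key property is that the result of this command is still a valid qcow2 image: it can serve as a backing file and be restored from as-is, which is why no separate decompression step exists in the flow.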
The compression feature is optional, if you are using storage-level compression, you probably will not use backup-level compression. However, many environments do not have storage-level compression, thus having the possibility of backup-level compression is still very interesting.
The compression does not add any interaction with the SSVM.
I did not want to add dozens of parameters to the import backup offering API which are only really going to be used for one provider. This way, the original design of the API is preserved. Furthermore, you may note that the APIs are intentionally not called
There are two main issues with using bitmaps:
At the end of the day, this PR adds a new backup provider option for users, who will be free to choose the provider that best fits their needs. This is one of the reasons it was done as a new backup provider; KNIB and other backup providers do not have to cancel each other out.
Description
This PR adds a new native incremental backup provider for KVM. The design document which goes into details of the implementation can be found on https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406622120.
The validation process which is detailed in the design document will be added to this PR soon.
The file extraction process will be added in a later PR.
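As a rough illustration of the incremental model: this sketch assumes the `backup.chain.size` setting (listed further below) caps how many backups share one backing chain before the next backup starts a fresh chain with a new full backup. The actual KNIB policy is defined in the design document; this only models that counting rule.

```python
def next_backup_type(chain_length: int, chain_size: int) -> str:
    """Return "full" when a new chain must start, else "incremental".

    Hypothetical model: a chain at (or over) the configured size is
    considered complete, so the next backup begins a new chain.
    """
    if chain_length == 0 or chain_length >= chain_size:
        return "full"
    return "incremental"

# Simulate a schedule of 8 backups with backup.chain.size=3.
chain_length = 0
schedule = []
for _ in range(8):
    kind = next_backup_type(chain_length, chain_size=3)
    chain_length = 1 if kind == "full" else chain_length + 1
    schedule.append(kind)
```

Under these assumptions, every third backup is a full backup and the two in between are incrementals layered on it via qcow2 backing files.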
This PR adds a few new APIs:
The `createNativeBackupOffering` API has the following parameters:

- `name`
- `compress` (default: `false`)
- `validate` (default: `false`)
- `allowQuickRestore` (default: `false`)
- `allowExtractFile` (default: `false`)
- `backupchainsize`
- `compressionlibrary` (default: `zstd`)

The `deleteNativeBackupOffering` API has the following parameter:

- `id`

A native backup offering can only be removed if it is not currently imported.

The `listNativeBackupOfferings` API has the following parameters:

- `id`
- `compress`
- `validate`
- `allowQuickRestore`
- `allowExtractFile`
- `showRemoved` (default: `false`)

By default, it lists all offerings that have not been removed.

The `listBackupCompressionJobs` API has the following parameters:

- `id`
- `backupid`
- `hostid` (when informed, the `executing` parameter is implicit)
- `zoneid`
- `type` (`Starting` or `Finalizing`)
- `executing`
- `scheduled`
It also adds parameters to the following APIs:
- The `isolated` parameter was added to the `createBackup` and `createBackupSchedule` APIs.
- The `quickRestore` parameter was added to the `restoreBackup`, `restoreVolumeFromBackupAndAttachToVM` and `createVMFromBackup` APIs.
- The `hostId` parameter was added to the `restoreBackup` and `restoreVolumeFromBackupAndAttachToVM` APIs; it can only be used by root admins and only when quick restore is true.

New settings were also added:
- `backup.chain.size`
- `knib.timeout`
- `backup.compression.task.enabled` (default: `true`)
- `backup.compression.max.concurrent.compressions.per.host` (default: `5`)
- `backup.compression.max.job.retries` (default: `2`)
- `backup.compression.retry.interval` (default: `60`)
- `backup.compression.timeout` (default: `28800`)
- `backup.compression.minimum.free.storage` (default: `1`)
- `backup.compression.coroutines` (default: `1`)
- `backup.compression.rate.limit` (default: `0`)

Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
Tests related to disk-only VM snapshots
Basic tests with backup
Using `backup.chain.size=3`

Interactions with other functionalities
I created a new VM with a root disk and a data disk for the tests below.
Configuration Tests
Compression Tests
Tests performed with an offering that supports compressed backups
Tests with `restoreVolumeFromBackupAndAttachToVM`

Tests with `restoreBackup`