KVM and Back Ups



  • @stacksofplates
    Lots of testing to do later tonight.



  • @scottalanmiller said in KVM and Back Ups:

    @fuznutz04 said in KVM and Back Ups:

    So in this case, I’d have a PBX, a WordPress site and eventually some Windows Server workloads. All of them are individually backed up via scripts at the OS level.

    That's all that you want. Just the OS level backups.

    No, that is all you want. The rest of us want VM level backups.



  • @JaredBusch said in KVM and Back Ups:

    @scottalanmiller said in KVM and Back Ups:

    @fuznutz04 said in KVM and Back Ups:

    So in this case, I’d have a PBX, a WordPress site and eventually some Windows Server workloads. All of them are individually backed up via scripts at the OS level.

    That's all that you want. Just the OS level backups.

    No, that is all you want. The rest of us want VM level backups.

    I want what is reliable and good for restoring the environment, not what's pushed by marketing companies.

    I'm focused on the goal: working backups. Or even better: Disaster protection.

    I'm not being distracted by the means. When anyone in IT talks about wanting a hypervisor level backup, that's a "means" distraction caused by forgetting to stay goal focused.



  • I think it's wise to consider what kind of failure you are trying to protect yourself from and how you are going to recover.

    If I want to restore a VM that doesn't work as it should or the host crashed, I'd want a VM backup, taken in a known good state, because that is the fastest way to get something working again, perhaps on another host. To me this is an infrastructure backup. Our infrastructure is broken and we need to recover from that. Which also means we need a backup of the VM host of course, and everything else that could fail, including documentation and procedures for getting it back up.

    If a user deleted some files and wants them back, I'd want a file level backup. That to me is a backup of business data, not infrastructure, and I'd want that backed up on a schedule that fits the data.

    And if I have to restore a database or a table within the database, I'd want a consistent database backup, not the database files and absolutely not the VM level backup. This to me is a different kind of business data.



  • @fuznutz04 said in KVM and Back Ups:

    @black3dynamite said in KVM and Back Ups:

    Proxmox backups are always a full backup.
    https://pve.proxmox.com/wiki/Backup_and_Restore

    Do you use or have any experience using proxmox? Does/can it just run as a VM on the host?

    I haven’t used it since version 4. Since then, off and on, I’ve set up a lab just to see how it’s progressing. I’ve installed it as a VM, but that’s about it.



  • @stacksofplates

    So this one is running right now. So far, looks like it is working fine. Will test restores after.

    #!/bin/bash

    # Set the language to English so virsh does its output
    # in English as well
    # LANG=en_US
    
    # Define the script name, this is used with systemd-cat to
    # identify this script in the journald output
    SCRIPTNAME=kvm-backup
    
    # List running domains (skip the two header lines of virsh list)
    DOMAINS=$(virsh list | tail -n +3 | awk '{print $2}')
    
    # Loop over the domains found above and do the
    # actual backup
    
    for DOMAIN in $DOMAINS; do
    
    	echo "Starting backup for $DOMAIN on $(date +'%d-%m-%Y %H:%M:%S')" | systemd-cat -t "$SCRIPTNAME"
    
    	# Generate the backup folder path - this is something you should
    	# change/check
    	BACKUPFOLDER="/mnt/backups/$DOMAIN/$(date +%d-%m-%Y)"
    	mkdir -p "$BACKUPFOLDER"
    
    	# Get the target disks (match the Device column so cdroms are skipped)
    	TARGETS=$(virsh domblklist "$DOMAIN" --details | awk '$2 == "disk" {print $3}')
    
    	# Get the image paths
    	IMAGES=$(virsh domblklist "$DOMAIN" --details | awk '$2 == "disk" {print $4}')
    
    	# Create the snapshot/disk specification
    	DISKSPEC=""
    
    	for TARGET in $TARGETS; do
    		DISKSPEC="$DISKSPEC --diskspec $TARGET,snapshot=external"
    	done
    
    	# $DISKSPEC is left unquoted on purpose so it splits into separate arguments
    	virsh snapshot-create-as --domain "$DOMAIN" --name "backup-$DOMAIN" --no-metadata --atomic --disk-only $DISKSPEC 1>/dev/null 2>&1
    
    	if [ $? -ne 0 ]; then
    		echo "Failed to create snapshot for $DOMAIN" | systemd-cat -t "$SCRIPTNAME"
    		exit 1
    	fi
    
    	# Copy disk image
    	for IMAGE in $IMAGES; do
    		NAME=$(basename "$IMAGE")
    		# cp $IMAGE $BACKUPFOLDER/$NAME
    		# pv $IMAGE > $BACKUPFOLDER/$NAME
    		rsync -ah --progress "$IMAGE" "$BACKUPFOLDER/$NAME"
    	done
    
    	# Merge changes back. At this point domblklist reports the
    	# snapshot overlay files, which are removed after the merge.
    	BACKUPIMAGES=$(virsh domblklist "$DOMAIN" --details | awk '$2 == "disk" {print $4}')
    
    	for TARGET in $TARGETS; do
    		virsh blockcommit "$DOMAIN" "$TARGET" --active --pivot 1>/dev/null 2>&1
    
    		if [ $? -ne 0 ]; then
    			echo "Could not merge changes for disk $TARGET of $DOMAIN. VM may be in invalid state." | systemd-cat -t "$SCRIPTNAME"
    			exit 1
    		fi
    	done
    
    	# Cleanup left over snapshot overlays
    	for BACKUP in $BACKUPIMAGES; do
    		rm -f "$BACKUP"
    	done
    
    	# Dump the configuration information. Only stderr is discarded;
    	# sending stdout to /dev/null as well would leave the XML file empty.
    	virsh dumpxml "$DOMAIN" > "$BACKUPFOLDER/$DOMAIN.xml" 2>/dev/null
    
    	echo "Finished backup of $DOMAIN at $(date +'%d-%m-%Y %H:%M:%S')" | systemd-cat -t "$SCRIPTNAME"
    done
    
    exit 0
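
    A side note on the parsing step, as a rough sketch rather than anything tested on a production host: filtering the `virsh domblklist --details` output on the Device column, instead of a plain text match on "disk", avoids picking up a cdrom whose file name happens to contain the word. The function name `list_disks` and the sample VM paths below are made up for illustration, and it assumes image paths without spaces.

```shell
#!/bin/bash
# Read `virsh domblklist <domain> --details` output on stdin and print
# "target source" for every entry whose Device column is exactly "disk".
# Header and separator lines never match because their second field differs.
list_disks() {
	awk '$2 == "disk" { print $3, $4 }'
}

# Sample input standing in for real virsh output (hypothetical VM).
# The cdrom line contains "disk" in its file name and must be skipped.
sample=' Type   Device   Target   Source
------------------------------------------------
 file   disk     vda      /var/lib/libvirt/images/vm1.qcow2
 file   cdrom    sda      /var/lib/libvirt/images/rescue-disk.iso'

printf '%s\n' "$sample" | list_disks
```

    On a real host the input would come from something like `virsh domblklist "$DOMAIN" --details | list_disks`.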
    


  • @Pete-S said in KVM and Back Ups:

    If a user deleted some files and wants them back, I'd want a file level backup. That to me is a backup of business data, not infrastructure, and I'd want that backed up on a schedule that fits the data.

    Any kind of backup might allow for that. Doesn't require it to be file level. Veeam, for example, will take a system image, but restore just a file.



  • @Pete-S said in KVM and Back Ups:

    If I want to restore a VM that doesn't work as it should or the host crashed, I'd want a VM backup, taken in a known good state, because that is the fastest way to get something working again, perhaps on another host. To me this is an infrastructure backup. Our infrastructure is broken and we need to recover from that. Which also means we need a backup of the VM host of course, and everything else that could fail, including documentation and procedures for getting it back up.

    That's one approach, but you can also often do a fresh build roughly as fast, if your system is designed well. You don't need an image of the whole thing. Also, image backups are risky and normally require you to have something else as the "real" backup. So you often take two backups (or more) instead of one, and if you restore from the image, you risk that your restore is bad and you have to do it again using another method. Compare that with one method that gives you reliable backups AND rapid recovery.



  • @Pete-S said in KVM and Back Ups:

    And if I have to restore a database or a table within the database, I'd want a consistent database backup, not the database files and absolutely not the VM level backup. This to me is a different kind of business data.

    Kind of all the same thing, just the chance of files being corrupted is different. It's the "risk level". If you take what I call devops style backups, you get everything covered in a single method. If you do anything else, you have to have multiple backups to address each recovery case.



  • @fuznutz04 said in KVM and Back Ups:

    @stacksofplates

    So this one is running right now. So far, looks like it is working fine. Will test restores after.

    Remember, testing restores from "tests" is rarely similar to restoring from catastrophic failure. In a test, almost any method appears to restore reliably, even those we know are not reliable.
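
    One small check that can sit alongside restore tests, sketched here under assumptions: confirm that the copied image is byte-for-byte identical to the source while the base image is still frozen behind the snapshot (that is, before the blockcommit). The helper name `verify_copy` and the throwaway file names are invented for the example. This only proves the copy step worked; it says nothing about whether the guest inside the image is in a recoverable state.

```shell
#!/bin/bash
# Return 0 when the backup copy matches the source byte for byte,
# 1 when the contents differ, 2 when either file is missing.
verify_copy() {
	local src=$1 dst=$2
	[ -f "$src" ] && [ -f "$dst" ] || return 2
	cmp -s "$src" "$dst"
}

# Demo with throwaway files instead of real disk images.
tmp=$(mktemp -d)
printf 'disk image contents\n' > "$tmp/vm1.qcow2"
cp "$tmp/vm1.qcow2" "$tmp/vm1.backup.qcow2"

if verify_copy "$tmp/vm1.qcow2" "$tmp/vm1.backup.qcow2"; then
	echo "copy verified"
else
	echo "copy differs from source"
fi
```

    In a backup script this would run right after the rsync step, with a log line on failure.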



  • @dafyre said in KVM and Back Ups:

    In my experience with it, it has often corrupted randomly and to the point that its own snapshots are no help, nor are VMware Snapshots.

    How could it corrupt VMware snapshots?



  • @dafyre said in KVM and Back Ups:

    @scottalanmiller said in KVM and Back Ups:

    @dafyre said in KVM and Back Ups:

    In my experience with it, it has often corrupted randomly and to the point that its own snapshots are no help, nor are VMware Snapshots.

    How could it corrupt VMware snapshots?

    I guess it's more that BtrFS doesn't detect the corruption early enough and our VMware snapshots are nothing but snapshots of corrupt data... That's about the only way I can explain it.

    General risk with hypervisor level backups. This is a huge reason for either local file based or what I call devops backups. They are at a higher level, so there is way more opportunity for this.

    But if the system was okay when you took the VMware snap, it should have been okay when you restored it. Regardless of corruption.



  • @scottalanmiller said in KVM and Back Ups:

    @dafyre said in KVM and Back Ups:

    @scottalanmiller said in KVM and Back Ups:

    @dafyre said in KVM and Back Ups:

    In my experience with it, it has often corrupted randomly and to the point that its own snapshots are no help, nor are VMware Snapshots.

    How could it corrupt VMware snapshots?

    I guess it's more that BtrFS doesn't detect the corruption early enough and our VMware snapshots are nothing but snapshots of corrupt data... That's about the only way I can explain it.

    General risk with hypervisor level backups. This is a huge reason for either local file based or what I call devops backups. They are at a higher level, so there is way more opportunity for this.

    But if the system was okay when you took the VMware snap, it should have been okay when you restored it. Regardless of corruption.

    Yeah, exactly.... and this is why Snapshots are not a backup!



  • @dafyre said in KVM and Back Ups:

    Yeah, exactly.... and this is why Snapshots are not a backup!

    But a snapshot can be a backup if you export it, and it would have none of the issues described here during a restore, because the VM presumably was having no issues at the time the snapshot was taken.

    Thus restoring that snapshot (backup) would result in a point in time restoration.



  • @dafyre said in KVM and Back Ups:

    @scottalanmiller said in KVM and Back Ups:

    @dafyre said in KVM and Back Ups:

    @scottalanmiller said in KVM and Back Ups:

    @dafyre said in KVM and Back Ups:

    In my experience with it, it has often corrupted randomly and to the point that its own snapshots are no help, nor are VMware Snapshots.

    How could it corrupt VMware snapshots?

    I guess it's more that BtrFS doesn't detect the corruption early enough and our VMware snapshots are nothing but snapshots of corrupt data... That's about the only way I can explain it.

    General risk with hypervisor level backups. This is a huge reason for either local file based or what I call devops backups. They are at a higher level, so there is way more opportunity for this.

    But if the system was okay when you took the VMware snap, it should have been okay when you restored it. Regardless of corruption.

    Yeah, exactly.... and this is why Snapshots are not a backup!

    Snapshots absolutely are the backup mechanism.