Why doesn’t Movebot delete files?

Movebot · Published in Couchdrop · 6 min read · Nov 21, 2022

We’ve all done it. Deleted a file by accident. Saved something after making a stupid mistake. Overwritten our main save file with a save state a frame before being sniped. Depending on how critical the file was, you can be left with a lot more trouble than having to redo a few hours of gameplay. You can put your company at risk or even face fines or worse.

At Couchdrop we aim to make data moving simple. One of the things we’re often asked is whether our Movebot software can sync deleted files: if you start a data migration and a file is then removed at the source, will it still show up at the destination?

Our answer is yes. It will be there.

At first glance, this might seem like a massive oversight. Deleted files poof back into existence. They appear at the destination like that guy who keeps crashing BBQ parties and eating all the filets.

But there’s an important reason we don’t sync deletions. Doing so would be a major security concern and would invalidate one of our most important tenets: we don’t store or alter your data.

The problems with a two-way sync system

Data security and privacy are two things we won’t compromise on. When you use any of our systems, you have complete control of your data. You decide when, how, and where it’s moved.

Simply a data moving tool

Movebot is simply the software that allows an agent to get the job done. Think of it as a truck driver for a moving company. You pack up the truck with all of your belongings the way that you want to. Then, when you’re happy with everything, you lock up the back and pocket the key (Okay, so it’s a weird moving company). Then you let the driver know you’re ready, and they start the journey and make sure your possessions arrive safely.

The driver never sees what you packed, let alone has the chance to take what they want. You’re the only one with the key, and the truck is simply a vehicle to get your things across.

Contrast that with a driver who has the key to your belongings and can change things on a whim. Most likely, your items will still arrive safely. But there’s a chance the driver opens up the back and decides that the box labeled Grandma’s Ashes can be dumped at the side of the road. If the software has access to alter files, there’s the possibility that important content gets lost.

Or, in a worst-case scenario, you notice that your new cloud system seems to have the files, so you (or an ambitious junior) delete all the ones in your old account while the migration is at 99.99% complete. Seeing that your most recent action was to delete the files, sync software might decide that’s what you intended to do. And poof. No more data on either system.

How inconvenient is it not to have a sync system?

The Movebot team decided that having no access to the data was more important than having files deleted during a migration reappear at the destination. If you delete the files before the transfer, they won’t come back at the destination; it’s only the ones deleted during the transfer that show up. Some people think it’s an inconvenience, but this way you have full control of your data and a better level of security.

Plus, in the extremely rare event that there’s a glitch in the system, the software can’t affect your data since it doesn’t store or have access to it. You won’t get an “unknown error” and have your files deleted because there is no way for Movebot to delete them without access to change them. Instead, the transfer will simply be unsuccessful.
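To see that behavior in miniature, here is a small Python sketch of a copy-only transfer. It is an illustration under stated assumptions, not Movebot’s actual code: the function name `copy_only_migration`, the dictionaries standing in for storage platforms, and the `deletions_during_transfer` schedule are all hypothetical.

```python
def copy_only_migration(source, destination, deletions_during_transfer):
    """Copy every file listed when the job starts; never remove anything.

    deletions_during_transfer maps a step number to a path that a user
    deletes from the source right after that step's copy finishes.
    (Hypothetical helper for illustration only.)
    """
    listing = sorted(source)                 # what exists when the job starts
    for step, path in enumerate(listing):
        if path in source:                   # skip files deleted before their turn
            destination[path] = source[path]
        deleted = deletions_during_transfer.get(step)
        if deleted is not None:
            source.pop(deleted, None)        # the user deletes at the source only
    return destination


source = {"a.txt": "alpha", "b.txt": "beta", "c.txt": "gamma"}
del source["c.txt"]                          # deleted BEFORE the transfer starts
destination = copy_only_migration(source, {}, {0: "a.txt"})

print(sorted(destination))                   # ['a.txt', 'b.txt']
print(sorted(source))                        # ['b.txt']
```

The file deleted before the transfer (`c.txt`) never reaches the destination, while the file deleted mid-transfer (`a.txt`) is still there because the copy-only loop has no code path that removes anything.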

Practically, this limitation doesn’t cause much inconvenience in most cases because the data transfers quickly. Unless the migration is dozens of terabytes or more, it can usually finish up in a few days. But what about huge enterprises with enormous amounts of data?

Working with large organizations and huge amounts of data

For most organizations, the data migration will be quick and painless, and no one will really notice that a file or two that was deleted still exists on the new platform.

But what if you’re a large organization with a massive amount of data? What if you have over 10,000 users and a few petabytes of data to shift? Even with Movebot’s impressive speeds, a transfer of this size is going to take a while to finish. Fortunately, by following a few best practices, users don’t have to take an extended vacation because it can be business as usual, for the most part.
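For a rough feel of why size matters so much, here is a back-of-the-envelope estimate in Python. The throughput figure is purely an assumption picked for illustration and is not a published Movebot speed; real transfers also depend on API rate limits, file counts, and the platforms involved.

```python
def estimate_transfer_days(size_tb, throughput_gbps):
    """Days needed to move size_tb (decimal terabytes) at a sustained
    throughput given in gigabits per second. Illustrative arithmetic only."""
    size_bits = size_tb * 1e12 * 8                 # terabytes -> bits
    seconds = size_bits / (throughput_gbps * 1e9)  # bits / (bits per second)
    return seconds / 86_400                        # seconds -> days


# Assumed sustained throughput of 1 Gbps (hypothetical, not a measured figure).
small = estimate_transfer_days(10, 1.0)       # 10 TB
large = estimate_transfer_days(2_000, 1.0)    # 2 PB

print(f"10 TB: {small:.1f} days")             # 10 TB: 0.9 days
print(f"2 PB:  {large:.0f} days")             # 2 PB:  185 days
```

Under that assumed rate, a 10 TB job wraps up in about a day while a couple of petabytes would run for months, which is why the practices below matter for large organizations.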

Start with a pre-migration scan

Sure, a pre-migration scan adds time to the job, but as Scar says, “Be Prepared.” Running a scan before the migration helps you be prepared for sensational news instead of wearing a vacant expression wondering why some files didn’t come through as expected.

The scan searches for potential issues and flags them so you can decide how to proceed. It also shows you how much data you actually have (without double-counting shared files, for instance) and gives an accurate picture of what to expect. And it’s free, so why not?
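The double-counting point is easy to picture with a few lines of Python. This is a minimal sketch that assumes each file carries a stable ID shared across the users who can see it; the `FileEntry` type and both function names are made up for the example and say nothing about how Movebot’s scan is implemented.

```python
from collections import namedtuple

# Hypothetical record type: one entry per file visible to a user.
FileEntry = namedtuple("FileEntry", ["file_id", "size_bytes"])


def total_naive_bytes(listings_by_user):
    """Add up every entry in every user's listing, shared files included."""
    return sum(e.size_bytes for entries in listings_by_user.values() for e in entries)


def total_unique_bytes(listings_by_user):
    """Count each file once, keyed by its ID, however many users see it."""
    sizes_by_id = {}
    for entries in listings_by_user.values():
        for entry in entries:
            sizes_by_id[entry.file_id] = entry.size_bytes
    return sum(sizes_by_id.values())


listings = {
    "alice": [FileEntry("f1", 500), FileEntry("f2", 200)],
    "bob": [FileEntry("f2", 200), FileEntry("f3", 300)],   # f2 is shared with alice
}

naive = total_naive_bytes(listings)
unique = total_unique_bytes(listings)
print(naive, unique)   # 1200 1000
```

The gap between the two totals is the data you would have over-budgeted for if shared files were counted once per user.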

Break up the jobs into smaller segments

If you have a PB of data to transfer, you should break it up into smaller jobs instead of trying to move the entire amount at once. Having more jobs takes more planning and oversight, but you reduce the risk of having to restart a major transfer in the rare case something goes wrong.

Chances are, larger organizations moving enormous swaths of data aren’t doing it on a whim. Continuing this careful, measured approach through the transfer can make it practically background noise that most users won’t notice. Plan out which files to transfer at which times and a lot of the headache is already gone. And there shouldn’t be a large number of deleted files if one huge job is broken up into smaller jobs, like moving a hoarder across town with several small trailer loads instead of renting a massive semi no one can back up properly.
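One way to plan those trailer loads is to group folders into jobs that stay under a size cap. The Python sketch below uses a simple first-fit-decreasing heuristic; the function name, the folder names, and the 200 TB cap are all assumptions for illustration, and how you actually split jobs will depend on your own structure and schedule.

```python
def split_into_jobs(folder_sizes_tb, max_job_tb):
    """Group folders into jobs of at most max_job_tb each (first-fit decreasing).

    A folder larger than the cap still gets a job of its own.
    """
    jobs = []  # each job is {"folders": [...], "total_tb": int}
    largest_first = sorted(folder_sizes_tb.items(), key=lambda kv: kv[1], reverse=True)
    for folder, size in largest_first:
        for job in jobs:
            if job["total_tb"] + size <= max_job_tb:
                job["folders"].append(folder)
                job["total_tb"] += size
                break
        else:
            # No existing job has room, so start a new one.
            jobs.append({"folders": [folder], "total_tb": size})
    return jobs


# Hypothetical top-level folders and their sizes in terabytes.
folders = {"finance": 40, "engineering": 120, "marketing": 60, "hr": 20, "archive": 150}
jobs = split_into_jobs(folders, max_job_tb=200)

for number, job in enumerate(jobs, start=1):
    print(number, job["folders"], job["total_tb"])
# 1 ['archive', 'finance'] 190
# 2 ['engineering', 'marketing', 'hr'] 200
```

Two jobs of roughly equal size are easier to schedule and, if something goes wrong, cheaper to rerun than a single 390 TB transfer.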

Use delta migrations

When you start a migration with Movebot, it takes the data at that moment in time and makes a copy of it in the new platform. Think of it like a photo. Movebot takes a snapshot and then replicates it so you have it at the new location.

The problem? Your organization has made changes since then. It’s not realistic to have employees stop using cloud storage for a few weeks while the data transfers. Fortunately, delta migrations make quick work of this issue.

What’s a delta migration? It’s a migration of only the data that changed since the last transfer.

First, you copy all of the data over, i.e., take the snapshot. Only that snapshot won’t pass a game of Spot the Difference because people have altered the original since the data started moving. To solve this, a delta migration reads both sides, compares the two instances, finds any files that were added or changed at the source, and copies them to the destination. Then you can confidently say they’re the same picture. Almost.

Remember that Movebot doesn’t delete any files. Files deleted at the source after they were copied are already at the destination, and Movebot doesn’t touch that data, so you’d need to delete them again at the destination yourself.

However, any updates made to the files will pull over, so if you made 15 versions of the same document while the migration happened, a delta migration will pull over V15 so you don’t have to start from 1 again.
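Here is a minimal Python sketch of that compare-and-copy idea. It assumes each side can be summarized as a mapping from path to some fingerprint (a version label here; a hash or modified time would work the same way). The function names and the fingerprint approach are assumptions for illustration, since how Movebot actually compares the two sides isn’t covered here.

```python
def plan_delta(source, destination):
    """Return paths that are new or changed at the source.

    Both arguments map path -> fingerprint. Paths that exist only at the
    destination are ignored on purpose: a delta pass never deletes.
    """
    return sorted(path for path, fp in source.items() if destination.get(path) != fp)


def run_delta(source, destination):
    """Copy the planned paths across, leaving everything else untouched."""
    for path in plan_delta(source, destination):
        destination[path] = source[path]
    return destination


# Destination as left by the initial copy (the snapshot).
destination = {"report.docx": "v1", "notes.md": "v1", "old.txt": "v1"}

# Source today: report.docx edited up to v15, new.pdf added, old.txt deleted.
source = {"report.docx": "v15", "notes.md": "v1", "new.pdf": "v1"}

print(plan_delta(source, destination))   # ['new.pdf', 'report.docx']
run_delta(source, destination)
print(destination["report.docx"])        # v15
print("old.txt" in destination)          # True
```

The latest version of `report.docx` lands at the destination, `notes.md` is skipped because nothing changed, and `old.txt` stays put until someone deletes it at the destination by hand.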

What if syncing files is essential for my team?

Syncing files is an important feature and there are plenty of solutions for that. For automating frequent file transfers and backing up on multiple systems, a dedicated sync solution might be what you’re looking for.

Movebot is designed to move data, not to sync two systems or create backups. You’re breaking up with your cloud service and not coming back… Unless your ex brings some really good gifts. And if you do decide to go back, Movebot will be there, unjudging, to help you move your data when you get back together, and it won’t delete your keepsakes box.

Originally published at https://www.movebot.io.
