Windows – How is CMD’s convert command able to convert FAT to NTFS without data loss


Something I've been wondering about for a while. On Windows, the convert command is able to convert a FAT16 or FAT32 disk to NTFS non-destructively (i.e. without the loss of any data):

convert X: /fs:ntfs

What are the technical details of how exactly Windows does this? What's it doing to convert a disk's filesystem without wiping data on it, and would applying the same principle work for other file systems?

I understand that NTFS is a proprietary filesystem, so the details may be only known by Microsoft themselves, but I was wondering whether they had published any documentation on this, or whether anyone else had picked Windows/CMD apart enough to find out.

The following question talks about exFAT to FAT32, but it's an entirely different question that concerns entirely different filesystems and has entirely different answers.

Again, I'm specifically wanting to know what convert is doing in order to convert a disk's filesystem without wiping data on it, and whether applying the same principle would work for other file systems.

Best Answer

First I want to distinguish between two very different things:

  • How the implementation of the proprietary convert.exe on Windows takes care of this in-place filesystem conversion between FAT32 and NTFS.
  • How, in general, one might approach the problem of converting between two filesystems, in-place, without having 2x the desired disk space available, or interoperable metadata between the two filesystems, or other such niceties that would make the problem trivial.

For the first one, we're mostly at the mercy of Microsoft to release that information, because anyone who gets access to the source of Windows has to sign an NDA. It would only get released if approved by Microsoft's legal team. Sure, maybe someone reading this question got an illegal copy of the leaked Windows source and figured this out from the code, but that's a legal gray area.

So I won't attempt to provide an answer to the first question because I don't have an answer.

However, I will answer the second question.

In the history of filesystems, there have been many cases where we've wanted to upgrade from one filesystem to another without wiping the OS, reinstalling, or using a second disk. To name a few:

  • In 2017, everyone upgrading an Apple iOS device to iOS 10.3 received an in-place filesystem upgrade from HFS+ to APFS on their iDevice's built-in NAND.
  • For years, btrfs has supported upgrading a filesystem, in-place, from ext4 to btrfs using the btrfs-convert tool.

The open source fstransform program purports to convert between many different filesystems (with some caveats and limitations) -- and it includes many/most common Linux filesystems, as well as, impressively, NTFS! It doesn't yet support FAT32, though, despite supporting a great many other filesystems.

Reading its C++ code would give you the most detailed technical knowledge of the general algorithms involved in converting between disparate filesystems, even when the authors of neither the source nor the destination filesystem planned or designed for compatibility or interoperability.

The general process, in broad strokes, would go something like this:

  • Walk the file/directory tree of the current filesystem and, in a new ordinary file within the existing filesystem, construct the new filesystem's file table (list of files, directories, symbolic links, permissions, etc.) from the current filesystem's list of files, but in the new filesystem's format.
  • Re-express the data structures and in-place metadata of the old filesystem in terms of the new filesystem's data structures, and update the pointers (to logical disk blocks and offsets, as applicable) within the new file table to point to the recalculated file/directory/permission blocks. The concepts involved, such as "inodes" or "streams", will vary depending on which filesystems you're converting from and to.
  • At the very end of the process, destructively overwrite the original filesystem's metadata (which identifies the filesystem as the old filesystem type using magic numbers, etc.) with the new filesystem's metadata, and create the appropriate maps that point to the "superblock", "MFT", or whatever filesystem-specific data structures are required for the filesystem to initialize itself.
  • Update the disk-global partition table (e.g. MBR/MS-DOS format or GPT format), changing the type identifier that hints at the filesystem contained within the partition, if necessary. (Note: certain filesystems share the same identifier. In the MBR format it's only a single byte, so there are only 256 possible values, and GPT uses the same "basic data" type GUID for several filesystems. Some filesystem drivers are also smart enough to ignore the identifier and "probe" the actual data structures of the filesystem to determine whether that partition contains an instance of a given filesystem.)
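
The steps above can be sketched with a toy model. This is only an illustration under loud assumptions: "OLDFS" and "NEWFS" are invented formats, the "disk" is a Python list of blocks, and none of this reflects what convert.exe or fstransform really do. What it does show is the core idea: file data blocks stay where they are, the new metadata is staged in free space while the old filesystem is still valid, and only the final overwrite of block 0 is destructive.

```python
import json

def make_old_disk(files, nblocks=16):
    """Build a toy 'OLDFS' image (hypothetical format): block 0 holds the
    magic and a name -> block table, each file's data sits in one block."""
    disk = [b""] * nblocks
    table = {}
    for i, (name, data) in enumerate(files.items(), start=1):
        disk[i] = data
        table[name] = i
    disk[0] = json.dumps({"magic": "OLDFS", "table": table}).encode()
    return disk

def convert_in_place(disk):
    """Convert the toy image to 'NEWFS' without moving any file data."""
    old = json.loads(disk[0])
    if old["magic"] != "OLDFS":
        raise ValueError("not an OLDFS image")
    used = {0} | set(old["table"].values())

    # Step 1: walk the old file table and describe it in the new format.
    new_meta = {
        "magic": "NEWFS",
        "entries": [
            {"name": name, "block": blk, "size": len(disk[blk])}
            for name, blk in sorted(old["table"].items())
        ],
    }

    # Step 2: stage the new metadata in a free block. The old filesystem
    # is still fully valid at this point.
    staging = next((i for i in range(len(disk)) if i not in used), None)
    if staging is None:
        raise RuntimeError("no free block to stage the new metadata")
    disk[staging] = json.dumps(new_meta).encode()

    # Step 3: the destructive part. Overwrite the old superblock with the
    # new one. A crash between these writes is not covered by any journal.
    disk[0] = disk[staging]
    disk[staging] = b""
    return disk

def read_new(disk, name):
    """Read a file back through the new format's metadata."""
    meta = json.loads(disk[0])
    if meta["magic"] != "NEWFS":
        raise ValueError("not a NEWFS image")
    entry = next(e for e in meta["entries"] if e["name"] == name)
    return disk[entry["block"]][: entry["size"]]
```

A real converter has extra work this toy skips, for example relocating file data that happens to sit where the new filesystem needs its fixed metadata.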

It's worth noting that, at the very least, the last two steps are not atomic; that is, the usual atomicity guarantees of a journaled filesystem (like NTFS, reiserfs, or XFS) are not available. If the system crashes or powers off, or even if the userspace program doing the conversion crashes or hangs during this process, you may well need a data recovery expert to get your data back or restore sanity to the filesystem (either old or new). During these "destructive" operations, critical data on the underlying storage medium is overwritten in a way that is not backed up by a filesystem journal, because the process inherently bypasses the journal of the old filesystem: in order to change from one FS to another, you can't tell the old filesystem to "safely kill itself" by overwriting its core metadata with something it doesn't know about.
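
To make the partition-table step concrete, here is a minimal sketch that flips the type byte of one partition entry in an in-memory copy of an MBR sector. The function names are made up for this example, and it is an assumption that a converter would do it this way. The offsets are the standard MBR layout (partition table at byte 446, 16-byte entries, type byte at offset 4 of each entry, 0x55 0xAA signature at byte 510), and 0x0C and 0x07 are the usual type bytes for FAT32 (LBA) and NTFS. Note that this is a separate small write from the filesystem's own metadata, which is one reason the two cannot be made atomic together.

```python
PART_TABLE_OFFSET = 446   # start of the four partition entries in sector 0
ENTRY_SIZE = 16           # bytes per partition entry
TYPE_OFFSET = 4           # position of the type byte within an entry

TYPE_FAT32_LBA = 0x0C
TYPE_NTFS = 0x07          # shared with exFAT and HPFS

def _type_pos(index):
    if not 0 <= index < 4:
        raise ValueError("an MBR has only four primary entries")
    return PART_TABLE_OFFSET + index * ENTRY_SIZE + TYPE_OFFSET

def get_partition_type(mbr, index):
    """Return the one-byte type identifier of partition `index` (0-3)."""
    return mbr[_type_pos(index)]

def set_partition_type(mbr, index, new_type):
    """Change the type byte in an in-memory MBR sector (a bytearray).
    Writing the sector back to a real disk is deliberately left out."""
    if len(mbr) != 512 or bytes(mbr[510:512]) != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    if not 0 <= new_type <= 0xFF:
        raise ValueError("the MBR partition type is a single byte")
    mbr[_type_pos(index)] = new_type
```

For example, `set_partition_type(mbr, 0, TYPE_NTFS)` relabels the first partition while leaving every other byte of the sector unchanged.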

By contrast, asking a filesystem that does data journaling to do an ordinary write is, in fact, atomic: either the whole write is completed, or it's not done at all. An incomplete write to the journal area can be rolled back if the system crashes midway through the write, which is what journal recovery at the next mount (or a fsck or chkdsk run on boot-up) does after your system BSODs or kernel panics.
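
The all-or-nothing behaviour of a journaled write can be sketched with a toy write-ahead log. Everything here (the `JournaledStore` class, the `crash_at` parameter used to simulate a power cut) is hypothetical and heavily simplified, not how NTFS or any real filesystem lays out its journal. The idea it illustrates is that updates only touch the real blocks after a commit record is in the journal, so recovery can either replay the whole transaction or discard it.

```python
class Crash(Exception):
    """Simulated power loss."""

class JournaledStore:
    def __init__(self):
        self.blocks = {}    # the "real" on-disk blocks
        self.journal = []   # write-ahead log records

    def write(self, updates, crash_at=None):
        """Log every update, log a commit record, then apply.
        crash_at=N simulates a crash just before journal record N is
        written; N == number of records means 'after commit, before apply'."""
        records = [("write", blk, data) for blk, data in updates.items()]
        records.append(("commit",))
        for i, rec in enumerate(records):
            if crash_at == i:
                raise Crash()
            self.journal.append(rec)
        if crash_at == len(records):
            raise Crash()
        self._apply()

    def _apply(self):
        # Copy the logged updates into place, then retire the journal.
        for rec in self.journal:
            if rec[0] == "write":
                self.blocks[rec[1]] = rec[2]
        self.journal.clear()

    def recover(self):
        """Run after a crash: replay a committed transaction, or throw
        away an incomplete one so no partial write is ever visible."""
        if ("commit",) in self.journal:
            self._apply()
        else:
            self.journal.clear()
```

An in-place conversion has no equivalent of `recover()`, because the code that would replay the old filesystem's journal no longer understands the half-converted disk.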

Doing an in-place FS conversion is pretty risky -- it's about as risky as a BIOS flash (more so on mobile devices, where you can permanently brick your device by having an unbootable OS), because the safety of many operations is not guaranteed. It also tends to take a long time, so there's a high chance of the user thinking the OS is hung and power cycling it during the conversion, or of a battery-powered device running out of battery.

For more insight into how this can be done more safely with the knowing cooperation of two filesystems that are designed to convert from one to the other (which was, as I understand it, the case with the HFS+ to APFS conversion on iOS), this fascinating talk takes a forensic approach to figuring out what the heck is going on with APFS. It doesn't directly tackle the conversion problem, but several details about the conversion could be inferred from the information provided.

This is all to say, a definitive answer may never be found to your exact original question, but I think providing a lot of knowledge about the general process of in-place FS conversion should give you enough clues to demystify what is likely to be the process for convert.exe.

BTW, I originally thought "Oh, great, ReactOS will have already implemented this tool and we can just view the source code!" -- Nope. They haven't implemented convert.exe in the open on ReactOS. If it's used at all by any user on their system, they must be executing the proprietary MS Windows binary. Otherwise I guess they simply don't provide this utility in ReactOS.
