Recovering a Failed Storage Unit
by James Delhauer
In 2005, Hurricane Katrina made landfall on the Gulf Coast of the United States, cutting a swath of devastation and inflicting incalculable damage on those left in its wake. Though we continue to mourn the loss of life and livelihood that this natural disaster caused, the residents of New Orleans and other affected areas have spent more than a decade rebuilding. In that time, hundreds of personal hard drives have been sent to data recovery centers across the nation in the faint hope that the data contained within could be salvaged. Devices that were battered by the storm and then submerged in murky waters for days, weeks, or even months would have been deemed a lost cause by almost anybody. But they weren’t. In what can only be described as a miracle of technology, survivors began to see their personal data recovered and returned to them intact. This demonstrated that digital data is more robust than most would have suspected.
Digital storage devices are bigger and faster than ever before, but the risk of failure and data loss is just as daunting as it was when the first commercial hard disk drive was introduced in 1956. Today, digital storage media exist along every link in the chain of the film and television business.
Petabytes of information are created, acquired, distributed, and archived every year. We rely on digital storage almost as much as we rely on craft services but there is a very real danger of drive failure, data corruption, and loss of work. An estimated 0.3 percent of flash storage devices sold each year will suffer some sort of fault or accidental damage resulting in data loss. For mechanical devices that still utilize moving parts, that number increases to approximately 1.7 percent. While these numbers may sound comfortingly low, the sheer number of storage devices used within the industry would suggest that drive failure comes up more often than one might think. While it is always advised that media be backed up to multiple storage units as soon as possible, mistakes can happen or failure can occur before that is possible. So, what should a Local 695 sound mixer or video engineer do if they find themselves holding a faulty memory card or storage drive?
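Before answering that question, it is worth seeing how quickly those small percentages add up. The sketch below is a rough illustration only: it assumes the quoted figures can be read as each device's chance of failing in a given year and that devices fail independently of one another, neither of which the figures themselves guarantee. The function name is hypothetical.

```python
def chance_of_any_failure(device_count: int, annual_failure_rate: float) -> float:
    """Probability that at least one of `device_count` devices fails in a year.

    Assumes every device fails independently with the same probability.
    """
    return 1.0 - (1.0 - annual_failure_rate) ** device_count


FLASH_RATE = 0.003       # 0.3 percent, the flash figure quoted above
MECHANICAL_RATE = 0.017  # 1.7 percent, the mechanical figure quoted above

for count in (1, 10, 50, 100):
    flash = chance_of_any_failure(count, FLASH_RATE)
    mechanical = chance_of_any_failure(count, MECHANICAL_RATE)
    print(f"{count:>3} devices: flash {flash:.1%}, mechanical {mechanical:.1%}")
```

Under those assumptions, a pool of one hundred spinning drives has roughly an 82 percent chance of seeing at least one failure in a year, which is why a rate that sounds negligible for a single drive is not negligible for a busy department.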
The first and most important step is to stop using the faulty unit immediately. Continued use could exacerbate the problem and make data recovery more difficult. Unless troubleshooting is actively underway, the device should remain powered off and unplugged. The next step is to determine what kind of problem has caused the drive to fail. Broadly speaking, issues can be divided into the categories of logical failure, mechanical failure, and complex failure. Each of these groups presents its own symptoms and has its own set of troubleshooting steps, so correctly assessing the type of problem is essential.
Logical failure is the most prevalent and is the result of digital damage to the device’s partition or file system, the structures that a computer uses to organize and locate data on a storage medium. When a drive ceases to function as a result of logical failure, it remains physically sound and viable but cannot be read or written to by the computer’s operating system. More often than not, this means that all of the files that a user has on the device are safe and sound but simply cannot be accessed until the partition is repaired. Reasons for logical failure include malware, logical (soft) bad sectors, overworking the drive, improper ejection during data transfer, or the deletion of necessary system files. Prior to a complete partition crash, users may notice sluggish behavior from their device, a high number of read/write errors, and frequent unprompted mounting and unmounting of the drive. If the problem drive is acting as a computer’s primary boot drive, regular lockups and computer crashes are another warning sign. When logical failure occurs, connected storage devices will usually still power on and light up but will not mount and will appear to be absent from the computer’s Finder (macOS) or File Explorer (Windows). If the user opens macOS Disk Utility or Windows Disk Management, the problematic unit will still appear in the list of connected devices.
Before attempting any direct troubleshooting steps, users should check a device’s manual or product support page and make sure that any necessary firmware or drivers have been installed on their computer. Failing that, macOS users can open the Disk Utility application and use it to attempt partition repairs. Find the storage device that will not mount and look to see if any partitions are listed. If one is, select it and click “First Aid.” The software will assess the unit’s file system and attempt to make repairs. If it is successful, the computer will automatically mount the repaired storage device, allowing the user to access their files. Similarly, Windows users can run the built-in Check Disk tool (chkdsk) or a third-party partition recovery wizard. These programs will scan the storage device for file system errors or corrupted and lost partitions and, if any are found, will attempt repairs. If partition recovery is successful, it is highly advised that all data on the drive or card be copied to another storage medium immediately so as to avoid the risk of losing it again in the future.
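Copying files off a freshly repaired drive is only useful if the copies are good, so it can help to verify each one as it lands. What follows is a minimal sketch of one way to do that, not a feature of Disk Utility or any Windows tool: it copies every file to a second location and compares SHA-256 checksums of the original and the copy. The function names and the folder layout are assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large media files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source_dir: Path, backup_dir: Path) -> list[Path]:
    """Copy every file under source_dir into backup_dir.

    Returns the source files whose copies did not match their checksum.
    """
    mismatched = []
    for source in sorted(source_dir.rglob("*")):
        if not source.is_file():
            continue
        target = backup_dir / source.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(source) != sha256_of(target):
            mismatched.append(source)
    return mismatched
```

An empty list back from `copy_and_verify` means every copy matched its original at the moment of copying. Any file that does appear in the list should be copied again before the repaired drive is trusted with anything else.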
In more complex cases where more substantial damage to the partition has occurred, repair may not be possible. Simply creating a new partition will not recover the files contained within the original and could, in fact, overwrite valuable data that has become inaccessible. At this point, if recovery is essential, it becomes necessary to bypass the partition altogether. There are several pieces of software available that can perform this task. The two that I have personally used with the best results are Stellar Data Recovery ($79.99 USD) and EaseUS Data Recovery Wizard ($89.95 USD). Both applications can scan storage devices sector by sector and locate files within the damaged partition. Once located, said files can be recovered and transferred to a second external storage device. Because the software bypasses the file system and reads the device directly, scans and recoveries can be quite time-consuming. Larger capacity drives containing multiple terabytes of information can require scans of more than twenty-four hours. On a more positive note, both companies allow users to try before they buy. A free trial is available for both, which will allow users to scan and preview their recoverable files before spending money, eliminating the concern of paying without any guarantee that data will be found. This method can also be used to recover files that were accidentally deleted by a user, a mistake that occurs far more frequently than actual device faults.
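For readers curious about what locating files without a partition can look like, here is a deliberately simplified sketch of signature-based file carving. It is not how Stellar or EaseUS work internally, which I have no knowledge of. It assumes the raw bytes come from a disk image rather than the failing device itself, that each file is stored in one unbroken run, and that the files of interest are WAV recordings, which begin with the marker `RIFF`, a four-byte little-endian size, and the marker `WAVE`. The function name is hypothetical.

```python
import struct


def carve_wav_files(raw: bytes) -> list[bytes]:
    """Return byte strings that look like complete WAV files inside `raw`."""
    found = []
    position = 0
    while True:
        start = raw.find(b"RIFF", position)
        # Stop when no marker remains or too few bytes follow for a header.
        if start == -1 or start + 12 > len(raw):
            break
        if raw[start + 8:start + 12] == b"WAVE":
            # Bytes 4-7 hold the file length minus the 8-byte RIFF preamble.
            (chunk_size,) = struct.unpack_from("<I", raw, start + 4)
            end = start + 8 + chunk_size
            if end <= len(raw):
                found.append(raw[start:end])
                position = end
                continue
        # Not a usable WAV header, so keep searching just past this marker.
        position = start + 4
    return found
```

Commercial tools recognize many more file types and cope with fragmented files, but the principle of hunting for known markers in raw data rather than trusting the file system is the same one that lets them see past a damaged partition.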
Mechanical drive failure occurs when there is a physical issue with a storage device. It can occur due to manufacturer error, physical degradation, or damage. When plugged into a computer, devices suffering from mechanical failure may not be discoverable at all. Though less prevalent than logical failure, mechanical failure is far more difficult to troubleshoot and best practice is to take steps to prevent it altogether. There are two subsets of mechanical failure: electrical failure and bad sector failure.
Electrical failure occurs when the drive does not receive the necessary power to run properly. Oftentimes, the device will not power on at all, though it may generate heat if it remains plugged in. If this is the case, remove all power cables immediately, as heat buildup can result in further damage and, in extreme cases, fire. Impact damage, such as a fall or drop, can disrupt electrical flow, resulting in electrical failure. It can also happen as the result of a power surge, which can burn out the circuitry of the device in a manner that prevents electricity from reaching the whole of the unit. To avoid this, it is best to always run devices in conjunction with a surge-protected uninterruptible power supply, such as APC’s Backup Battery ($169.99 USD). When using memory cards or external hard drives, damage to the connector cables, card readers, or drive enclosures can present as electrical failure. For memory cards, it is always advisable to try using a second card reader before assuming electrical failure. In the case of external drives, users with the correct tools can open a drive’s enclosure, extract the unit inside, and attempt to mount it using another enclosure or mounting system, such as the iDsonix Hard Drive Docking Station ($20.99 USD).
The second subset of mechanical failure, bad sector failure, is a worst-case scenario. It occurs when a portion of the drive where information is written cannot be accessed at all by the unit. It is most common in spinning disk drives, where an actuator arm inside the drive is used to retrieve information from rapidly spinning platters. If the arm is knocked out of alignment, it may be unable to access part of the platter in order to retrieve its contents and send them to the computer. Or, if dust settles on the platter, it can act as a barrier between the platter and the arm, also interrupting communication. In extreme cases, the actuator arm may make direct contact with a platter, scratching it and permanently damaging the data in the same manner as a scratched DVD or Blu-ray. In this case, recovery of damaged sectors may be impossible. If this occurs, users may hear a distinct clicking or scraping sound coming from the drive when it is powered on. This is a screaming red flag, and the device should be powered off immediately, as each clicking or scraping noise is the sound of data being permanently destroyed. In the case of solid-state media, bad sectors can occur when memory cells age and fail as a result of constant use, similar to the lithium-ion batteries found in cellphones.
Unlike logical drive failure, where a variety of consumer options exist to resolve the issue and recover media, mechanical hardware failure is almost always beyond the means of a user to fix on their own. Advanced technicians utilize clean room environments to perform surgery on damaged drives. Functional components are removed from the damaged devices and transplanted into new units. Dirty or corroded mechanical platters need to be chemically treated in order to clean them. The entire process is incredibly delicate, as dust or fingerprints on a physical disk are more than enough to ruin the entire transplant procedure. As a result, this process can be expensive, and estimates can vary from a couple of hundred dollars up to several thousand. Fortunately, unless the actuator arm has actually scratched a drive’s platter, spinning disk drives currently have an estimated ninety-nine percent successful recovery rate, with a success being defined as a recovery of at least ninety-seven percent of a user’s data.
The last category, complex failure, is simply a combination of any of the above errors. A drive can fall from a table and be yanked out of a computer mid-transfer, damaging the partition and creating logical failure before it crashes to the floor, where the impact knocks its actuator arm out of alignment and causes bad sector mechanical failure. At this point, the unit would require multiple troubleshooting steps in order to recover the information within. Unfortunately for most users, the outcome is the same as if the device had simply suffered mechanical damage, and the unit will almost certainly need to be sent to data recovery experts for repair.
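The symptoms described above can be condensed into a rough triage checklist. The sketch below is only that: a memory aid built from the warning signs in this article, with hypothetical field and function names, and no substitute for a specialist's diagnosis. Because complex failure combines several problems, the checks run from most to least severe and the first match wins.

```python
from dataclasses import dataclass


@dataclass
class Symptoms:
    clicking_or_scraping: bool = False  # audible noise from a spinning drive
    powers_on: bool = True              # lights up or spins when connected
    listed_in_disk_tool: bool = True    # shown in Disk Utility / Disk Management
    mounts: bool = True                 # shows up in Finder or File Explorer


def suggest_category(s: Symptoms) -> str:
    """Map observed symptoms to the most severe failure category they suggest."""
    if s.clicking_or_scraping:
        return "mechanical (bad sector): power off immediately"
    if not s.powers_on:
        return "mechanical (electrical): unplug, then try another cable, reader, or enclosure"
    if not s.listed_in_disk_tool:
        return "mechanical: device not discoverable, consult recovery specialists"
    if not s.mounts:
        return "logical: attempt partition repair, then recovery software"
    return "no failure indicated by these symptoms"


# A drive that lights up and is listed by the disk tool but will not mount:
print(suggest_category(Symptoms(mounts=False)))
```

A drive that both clicks and refuses to mount would be reported as mechanical, which matches the practical outcome of a complex failure: the physical problem has to be dealt with first.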
In the event of a drive failure on set, Local 695 members should never attempt troubleshooting or repair procedures without first discussing the matter with their head of department or a producer and informing them of the potential cost and the risks involved. In the event of logical failure, it is possible to salvage a production’s data and save both time and money—always a good thing when negotiating your next rate. However, if mechanical or complex failure is the suspected culprit, it is probably best to turn the faulty drive over to someone with decision-making power and recommend that they consult advanced recovery specialists.