The Photo Archive Solutions

This post is mostly geared toward other photographers, but it’s available for anyone who wants some insight into the backend solutions here.

As all of our clients know, we store digital negatives indefinitely. We can run off prints of images from 2005 (our switchover to digital) and hope to be able to say that for many years to come.

As cameras have gotten more advanced, storage needs have gone from approximately 1GB per shooting hour to 4-8GB per hour currently. This means that I abandoned DVDs and DVD-DLs some time ago as a backup solution except in specific circumstances. All storage is currently done using magnetic hard drives. (I don’t feel that flash-based SSDs are, as of early 2010, ready for general use.)

The current workflow is as follows:

  1. CF cards out of the camera are stored during a portrait session, or backed up on site to an Epson Photo Viewer with a 100GB hard drive.
  2. After the shoot CF cards are read into the photo processing workstation (currently a 15″ Macbook Pro for portability).
  3. All CR2 (Canon RAW) files are backed up to a network share and stored indefinitely.
  4. CR2 files are converted to Adobe DNG (Digital NeGative). DNG is an open format suitable for long-term archiving and suggested by the Library of Congress.
  5. DNG files are imported into an Apple Aperture library where shoot metadata is added and basic editing is done.
  6. The workstation is backed up hourly using Time Machine and Aperture libraries are regularly backed up to the network share.
  7. Photo delivery to clients is done by uploading shoot picks to Photoshelter for the client’s web gallery. I upload full-resolution JPEG images so that even in a worst case scenario printable images are stored by Photoshelter on both the east and west coasts.
  8. After delivery the digital negatives are exported out of the Aperture library as referenced masters on the network share. I keep the thumbnail (1024px) images so that I can look images up as needed and I can work on the images any time I am on the network.
  9. The network store is backed up to the off-site storage using rsync on a regular basis. For very large updates the specific new files are transported by hand on other media (hard drives, DVDs, etc.).
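
For anyone who wants to script step 9, here is a minimal Python sketch of how an rsync call could be assembled. The paths are hypothetical placeholders, and the flag choice (`-a` plus `--numeric-ids`, with `-n` for a dry run) is my assumption of a reasonable starting point, not the exact command I run.

```python
import subprocess


def build_rsync_command(source, destination, dry_run=True):
    """Build an rsync argument list for mirroring one directory tree to another."""
    command = [
        "rsync",
        "-a",             # archive mode: recurse and keep times, permissions, owners
        "--numeric-ids",  # send uid/gid as numbers instead of mapping them by name
    ]
    if dry_run:
        command.append("-n")  # only report what would change, copy nothing
    # A trailing slash on the source copies its contents, not the directory itself.
    command += [source.rstrip("/") + "/", destination]
    return command


def run_backup(source, destination, dry_run=True):
    """Run rsync and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_rsync_command(source, destination, dry_run), check=True)


# Hypothetical usage (paths are illustrative only):
# run_backup("/Volumes/Masters", "offsite-host:/backup/masters", dry_run=True)
```

Defaulting to a dry run is deliberate: it is cheap to look at what would be transferred before letting a large sync loose on the off-site copy.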

Why is my backup workflow this complex? I worked in computers for years while photography was a hobby. In that time I saw a lot of hard drives and other media fail. It is not a question of IF it will fail, but when. While doing photography and other tasks I’ve had hard drives fail all around the world and I feel that the current system lets me keep as much data intact as possible. I also believe that the structure I have set up makes for a simple and relatively inexpensive upgrade path where my data follows along.

Why am I posting this now? Well… hit the part after the break to find out the trials and tribulations of last week’s series, yes, series, of hard drive failures and the specific pieces of hardware I am using right now.

Storage failures happen, be they failures of hard drives, optical media, computers themselves, or the enclosures we entrust our hard drives to.

Last week (and I haven’t gotten enough sleep lately to remember exactly which day), I had a failure of the NewerTech Guardian RAID mirror enclosure that held the pair of 1.5TB drives that store the referenced master files from Aperture. The remote storage is a pair of 750GB drives and was intact. (A pair of mirrored 1.5TB drives gives a total of 1.5TB of storage, not 3TB, so it can be backed up to 2x 750GB drives.) This gives me 3 actual stores of the data. I removed the 1.5TB drives; the advantage of mirrored drives is that both hold an identical copy of my data. I used one to continue working and the other as a backup. Unfortunately I found that the drive I had chosen to work with was also experiencing problems.
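
The capacity arithmetic in the parenthetical can be written out as a short Python sketch. The function names are made up for illustration, and treating the two 750GB drives as one combined pool is an assumption about how the remote copy is laid out.

```python
def mirror_usable_gb(drive_sizes_gb):
    """A mirror stores the same data on every member, so the usable
    capacity is that of the smallest drive, not the sum."""
    return min(drive_sizes_gb)


def combined_capacity_gb(drive_sizes_gb):
    """Drives that each hold a different part of the data add up."""
    return sum(drive_sizes_gb)


primary_gb = mirror_usable_gb([1500, 1500])    # the mirrored 1.5TB pair
remote_gb = combined_capacity_gb([750, 750])   # the 2x 750GB remote drives

print(primary_gb, remote_gb, remote_gb >= primary_gb)
```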

While all of this was happening SMART reported an error with my laptop’s hard drive. Modern hard drives deal with bad sectors at the firmware level. They keep a number (varying between manufacturers and product lines) of sectors as spare so that when a bad sector is detected that address is swapped to a spare sector. SMART reported that my drive had not only just used up all of the spare sectors but had tried to allocate 3 additional spare sectors and errored out since it was out of spares. A new Hitachi 7200RPM drive was immediately ordered.
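
To make the spare-sector idea concrete, here is a toy Python model of the remapping described above. It is only an illustration: the class, its fields, and the spare count are invented, and real drive firmware is far more involved.

```python
class SpareSectorPool:
    """Toy model of a drive remapping bad sectors to a fixed pool of spares."""

    def __init__(self, spare_count):
        self.spares_left = spare_count
        self.remapped = {}       # bad sector address -> index of the spare used
        self.failed_remaps = []  # addresses that could not be remapped

    def report_bad_sector(self, address):
        """Try to remap a bad sector; return True on success, False if out of spares."""
        if address in self.remapped:
            return True  # already pointing at a spare
        if self.spares_left == 0:
            self.failed_remaps.append(address)
            return False
        self.remapped[address] = len(self.remapped)
        self.spares_left -= 1
        return True


# A drive with 2 spares that then hits 5 bad sectors:
pool = SpareSectorPool(spare_count=2)
results = [pool.report_bad_sector(addr) for addr in (10, 20, 30, 40, 50)]
print(results, len(pool.failed_remaps))
```

In this example the last three requests fail, which is the same shape as what SMART told me: every spare used, plus 3 further allocations that errored out.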

I had already planned to switch out the mirrored arrays in favor of a Drobo from Data Robotics with the upcoming move, so I went ahead to order that and found that Drobo had new products! After a few days of reading reviews and weighing pros and cons, I ordered a Drobo FS with 2TB Western Digital Green drives from my favorite outlet, Other World Computing.

My new laptop drive arrived and would not accept an install of Snow Leopard. I had wanted to do a clean install and then just use the migration tool to transfer over my user data and applications, as I had a fair number of errors in system.log relating to incompatible software, mostly software that I no longer used, from the Leopard to Snow Leopard upgrade. This was more than slightly odd, but long story short, the drive would spin up initially but would fail to spin up after it had been powered down. The drive would work fine in a USB enclosure, though, spinning up and down without trouble. This sounds like an insufficient power issue, and my laptop is a working computer that has been through the wringer, so I felt it was a computer problem rather than a drive problem. Josh at OWC and I pored over datasheets. The new drive used 1.1A to spin up and the old drive used 0.9A. We settled on another Hitachi drive that had 0.9A spin-up and also a fairly good mix of other power usage and performance stats. OWC got it on a late FedEx shipment out and it worked perfectly. The previous drive has joined the small fleet of USB drives that velcro onto the back of my laptop for specific storage and backup needs.

The Drobo FS has had its own problems. Mostly this post was started to address those because I couldn’t find _ANY_ information online about these problems. The first was simple: the Drobo FS arrived with a broken top drive bay. The door was bent and I just removed it. No big deal, but annoying. The second was that the second drive bay would not recognize the first two drives I tried in it (before giving up). Part of me wants to say, because I’ve become a paranoid conspiracy nut about my hard drives these days, that since I didn’t find the spring that I assume the top bay door should have had, it shorted something out. I have no idea, honestly, so I popped off an e-mail to Data Robotics about it. I had purchased the enhanced DroboCare warranty (no, Drobo isn’t copying Apple at all with their naming and packaging *cough*) that morning so figured it would be no big deal to get another one shipped out. The rep I was e-mailing back and forth with never did seem to get, despite me including it on every e-mail, that I had paid for DroboCare and wanted a new system. At her insistence I did try another drive in the slot and yes, it worked. But no, I can’t run my business on a system that works, at best, 1 time out of 3. That just isn’t happening. It was sent off to an RMA specialist who called me the next day. Clayton was fantastic, added my DroboCare info to the ticket and told me that it was only listed for RMA due to the drive door. When we discussed that an entire bay wasn’t working things went much faster and a new Drobo was shipped out the next morning (it’ll be here Monday). So, Drobo may still be hit or miss. I really like the platform, both the BeyondRAID system that offers me dual disk redundancy and quick changes and recovery from failures, and that it’s a relatively open platform for development that’s as simple as cross-compiling Linux software for the ARM chip in the enclosure. But I’m not sure that the support I received means I can suggest it as a business product yet.

The other problem has turned out to be much more complex. After copying terabytes of data to the Drobo FS using the rsync Drobo app I found that all of the Samba shares were unavailable. Apparently all files on the Drobo FS need to be owned by root. My rsync scripts maintained numeric ownership and permissions because that’s how my backups have always worked. After chown’ing all of the files, restarting the Drobo FS and restarting my computers, everything seemed to work.
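
Before running a blanket chown it can help to see how many files are affected. The Python sketch below walks a directory tree and lists everything not owned by an expected user ID. The function name and the default of uid 0 (root) are illustrative assumptions, and I have not run this on the Drobo FS itself.

```python
import os


def files_not_owned_by(root_dir, expected_uid=0):
    """Return paths under root_dir whose owner uid differs from expected_uid.

    The top-level root_dir itself is not checked, only what lies inside it.
    """
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat reports on a symlink itself rather than following it.
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return sorted(mismatched)


# Hypothetical usage (the mount point is illustrative only):
# for path in files_not_owned_by("/mnt/DroboFS/Shares/Masters"):
#     print(path)
```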

Except that a number of the images were now being listed as Unsupported Image Type or something to that effect. Several thousand. Having roughly 10% of my images suddenly gone is not a pleasant experience. I attempted to consolidate them back into the library and got a variety of errors; Error while consolidating (Can’t write file (no space)) was the most common. I tried relocating them off of the Drobo FS and received Error while relocating (Can’t write file (no space)). I gave up on the device for a while and tried to point the references to the drive I had been using. This fixed some, but not all, of the image problems. I still was unable to work with close to a thousand images.

I pulled the backup of the library and relinked it to the original set of master images (doing this takes 12-24 hours each time, and this is probably the 3rd time). During this time I continued plugging error codes into Google and seeking information on the problem. I found only 2 threads dealing with it on Apple’s support site, and neither was solved in a satisfactory way. One was unsolved and the other was a small library (5000 images) rebuilt by hand. I have neither the time nor the patience to rebuild a library encompassing my entire professional career. But it put me on the right track: both people had the same problems I was having and both people were using Samba/SMB/CIFS to connect to their master images. The only difference between my actual deployment and my testing is that I had tested using AFP, the same file protocol that I had used before the Drobo when the master storage was external arrays connected to two OS X Leopard computers. The Drobo Dashboard, though, provides the convenience of an automatic mounting system but only uses Samba.

While I was rebuilding the library again overnight, I did testing using another computer here, and no matter what I tried using Samba and the Drobo FS I encountered the “no space” errors. This morning I had a thought that I should have had a long time ago. I can only say that my mind was clouded by lack of sleep while recovering the failed drives, and by continuing to work on the exciting new developments that are now somewhat behind schedule. I _will_ report on that soon. I mounted a test volume using AFP, loaded a small project onto it from my library, and easily moved images back and forth. I’ve now picked up a little piece of software called Bonjour Mounter and for $15 I get the functionality of automounting any kind of network volume I want.

I have photos that need to be processed still, so I won’t be able to test it with the full library until I am able to leave it relinking overnight. I will simply update this post if it doesn’t work. Mostly, this is now a post that people can find on Google if they have the same problem. I hope it is helpful. I also hope that anyone interested in the backend of the digital photography business found this interesting.

Edit: Well, what worked in my small scale test to fix my Aperture errors above has turned out overnight to not work on the entire library. Back to the drawing board.

Edit Edit: There were two problems. Or maybe lesser and greater versions of the same problem. The previous fix resolved nearly every image, but there were a handful that gave the same errors over AFP. These remaining images appear to just have had corrupt master files. I relinked these to the mirrored masters and so far everything is going well on the transfer to the Drobo. Should be done by tomorrow morning.
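
One way to catch corrupt masters before Aperture trips over them is to compare checksums between two copies of the same tree. This Python sketch is an assumption about how that check could be done, not the tool I actually used, and the paths in the usage comment are hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large RAW files don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(reference_dir, copy_dir):
    """List files under reference_dir that are missing or different in copy_dir."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(reference_dir):
        for name in filenames:
            reference_path = os.path.join(dirpath, name)
            relative = os.path.relpath(reference_path, reference_dir)
            copy_path = os.path.join(copy_dir, relative)
            if not os.path.isfile(copy_path):
                problems.append((relative, "missing"))
            elif file_digest(reference_path) != file_digest(copy_path):
                problems.append((relative, "differs"))
    return sorted(problems)


# Hypothetical usage (paths are illustrative only):
# for relative, status in find_mismatches("/Volumes/MirrorA/Masters", "/Volumes/Drobo/Masters"):
#     print(status, relative)
```

A mismatch only says the two copies disagree, not which one is damaged, so a third copy (or trying to open the file) is still needed to decide which to keep.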
