Cache or Check?

On September 9th, 2009, Apple unveiled the (disappointing) updates to their iPod line, and released iPhoneOS 3.1 to the world. The former, while notable, is far less interesting than what they’ve done in the latest release of iPhoneOS.

With the latest release of their desktop operating system, OS X, Apple made great improvements to the system’s speed and application load times (supposedly; I’ve heard mixed reviews of Snow Leopard). iPhoneOS 3.1 brings these enhancements to their mobile devices, further streamlining the software.

The single most impressive, noticeable change I’d like to discuss today is library caching.

Boots of Speed

Does it make sense to load 200+ individual files from various locations on disk? Arguably. This is a pretty standard situation. But wouldn’t it be better if you could, say, intelligently combine those 200+ files (as they do not change often) into one larger file, stored in a single contiguous region on-disk?

This is, in effect, what Apple has done with the dynamic libraries on the iPhone. libSystem, UIKit, Foundation, and hundreds more libraries and frameworks are combined into one large (~92MB for 3.1 on the armv6) flat archive. This archive is loaded into memory (or, more likely, mapped as-needed) at 0x30000000 (all libraries in iPhoneOS seemed to start beyond 0x30000000 anyway; regardless, this location itself matters very little.)

/System/Library/Caches/com.apple.dyld/dyld_shared_cache_armv6 contains every shared dynamic library in iPhoneOS, and dyld no longer needs to hunt down individual libraries and map them into memory when they’re required. The device starts faster and applications launch faster. Huge success… or is it?

Precipice of Doom

This does, of course, cost a lot of disk space. 92MB is a lot in a root partition of ~500MB – so, how did Apple save the rest of the space? Why, they simply deleted the originals! That’s right, there are no (well, there might be five) shared libraries in the filesystem. They’re all hoarded into a single cache file!

At first glance, this doesn’t seem like a very bad thing. iPhone users rarely peek into the internals of the device, and iPhone Developers have their SDK to fall back on. The SDK contains every library in original format, compiled for ARM.

But, what if you compile on the device? iphone-gcc in Cydia allows you to do so! Or, at least, it used to. There are no longer any shared objects to link against; you can create all the intermediate object files you want, you’ll just never be able to link them into a proper executable.

What if you compile using the open toolchain? Similar situation. The open toolchain is typically built using a copy of the iPhone filesystem to supply needed libraries and datafiles. Of course, you could just copy the libraries from the SDK. What if there’s something you can’t find there? What if there’s something that’s only found in this huge cache file?

Note: I was previously unaware that the SDK actually included the private frameworks as well as the public ones, which does greatly diminish the use of this tool. Still, there ARE binaries that would be in the filesystem that might be of interest to hackers that are not available via the SDK. Lacking the private frameworks would have been bad, as developers using them would have NO way to link their binaries, and would have to do everything regarding them at runtime. Fortunately, that’s not the case.

Cache Withdrawal

I’ve spent the better part of two days writing a utility that will recreate a filesystem’s worth of libraries given a cache file. There are a few drawbacks, but the created binaries have been proven linkable and, for the most part, class-dump-able.

The cache file is of a pretty simple format, with a few “gotchas”. It begins with a header describing what it’s for (cache version, processor), how many libraries it contains, dyld’s offset in memory, an offset to a code signature, etc., and then moves on to a list of library addresses. Each library’s information is stored in four 64-bit integers: the load address (which is also the offset in the cache file plus 0x30000000), two unknowns, and the offset of this entry’s filename.
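To make the entry layout concrete, here is a minimal Python sketch of decoding one library entry. It is not the utility's code. It assumes the integers are little-endian, that the filename offset is measured from the start of the cache file, and that the filename is NUL-terminated; `read_entry`, `CACHE_BASE` and the field names are hypothetical.

```python
import struct

CACHE_BASE = 0x30000000       # load address of the cache, as described above
ENTRY = struct.Struct("<4Q")  # four 64-bit integers (little-endian is an assumption)

def read_entry(buf, offset):
    """Decode one library entry and the filename it points to."""
    address, _unknown1, _unknown2, name_offset = ENTRY.unpack_from(buf, offset)
    # Assumption: the filename is a NUL-terminated string at name_offset.
    end = buf.index(b"\0", name_offset)
    return {
        "address": address,
        "file_offset": address - CACHE_BASE,  # load address minus the cache base
        "path": buf[name_offset:end].decode("ascii"),
    }
```

Called in a loop over the entry list (one call per 32-byte entry), this would give the path and file offset of every library in the cache.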

Each library can, for the most part, be extracted as-is: Seek to its offset, figure out its size, and dump that into a separate file. The “gotcha” of this part is, of course, that the cache file is a little more complex than “a bunch of libraries stuck end-to-end”. The application that creates the cache seems to separate the code, data, and linker information (already in separate segments, __TEXT, __DATA and __LINKEDIT) intelligently. It also COMBINES __LINKEDIT for every library in the cache.
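One way to “figure out its size” is to read the library’s Mach-O header and walk its load commands, collecting the file offset and size of each segment. The sketch below does that for a 32-bit little-endian Mach-O image, which is an assumption that fits armv6. The constants come from `<mach-o/loader.h>`; the function name `list_segments` is hypothetical and this is not necessarily how the utility does it.

```python
import struct

MH_MAGIC = 0xFEEDFACE  # magic number of a 32-bit Mach-O header
LC_SEGMENT = 0x1       # load command describing a 32-bit segment

def list_segments(buf, start=0):
    """Return (name, vmaddr, vmsize, fileoff, filesize) for each LC_SEGMENT."""
    # struct mach_header: magic, cputype, cpusubtype, filetype,
    # ncmds, sizeofcmds, flags (seven 32-bit fields, 28 bytes).
    magic, _cpu, _sub, _ftype, ncmds, _cmdsize, _flags = struct.unpack_from(
        "<7I", buf, start)
    if magic != MH_MAGIC:
        raise ValueError("not a 32-bit little-endian Mach-O header")
    segments = []
    pos = start + 28
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", buf, pos)
        if cmd == LC_SEGMENT:
            # segname[16], vmaddr, vmsize, fileoff, filesize follow cmd/cmdsize.
            name, vmaddr, vmsize, fileoff, filesize = struct.unpack_from(
                "<16s4I", buf, pos + 8)
            segments.append((name.rstrip(b"\0").decode("ascii"),
                             vmaddr, vmsize, fileoff, filesize))
        pos += cmdsize  # every load command records its own length
    return segments
```

For a library inside the cache, the __TEXT, __DATA and __LINKEDIT entries returned here would point at three different regions of the cache file rather than one contiguous run.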

The cache, thus, looks something like this:

Header, Library Info, Lib1, Lib2, Lib3, …, Data1, Data2, Data3, …, __LINKEDIT, Code Signature

This is very intelligent and very efficient – until you want to extract the libraries. __LINKEDIT, as stated before, has been combined for every library. So, to extract library 1, you’d need Lib1 + Data1 + __LINKEDIT. The same goes for all other libraries in the file. The only problem with this is that __LINKEDIT alone is ~20MiB in size. Each library requires it.
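The “Lib1 + Data1 + __LINKEDIT” recipe can be sketched as a simple concatenation of byte ranges taken from the cache. This is only an illustration: `assemble` and its `(name, offset, size)` tuples are hypothetical, and a real extractor would presumably also need to rewrite the file offsets recorded in the load commands to match the new layout, which is omitted here.

```python
def assemble(cache, segments):
    """Concatenate segments from the cache into one standalone image.

    segments is a list of (name, offset, size) tuples in output order.
    Returns the new bytes and the offset of each segment in the new file.
    """
    out = bytearray()
    new_offsets = {}
    for name, offset, size in segments:
        new_offsets[name] = len(out)  # where this segment lands in the output
        out += cache[offset:offset + size]
    return bytes(out), new_offsets
```

Because the shared __LINKEDIT range is appended to every library in full, each output file grows by the size of the whole combined segment.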

When all is said and done, the set of 200+ extracted libraries comes out to 4.6 GB. Every library contains a whole bunch of duplicated data which is absolutely necessary for its proper function. (This could be rectified if I intelligently recombined the symbol tables, but that’s a whole lot of Mach-O sorcery I don’t feel like getting into right now.)
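A quick back-of-the-envelope check shows where most of that size comes from. The sketch below uses the rounded figures from this post (200 libraries, 20 MiB of __LINKEDIT each), so the result is only an estimate, and a low one given that the real count is “200+”; the function name is hypothetical.

```python
MIB = 1024 ** 2

def duplicated_linkedit_bytes(library_count, linkedit_mib):
    """Bytes spent on __LINKEDIT if every library carries its own full copy."""
    return library_count * linkedit_mib * MIB

dup = duplicated_linkedit_bytes(200, 20)
print(f"{dup / 1e9:.1f} GB")  # prints "4.2 GB"
```

In other words, roughly 4.2 GB of the 4.6 GB total is the same ~20MiB block copied into every library.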

The Tools of the Trade

Update: the code is here. The explanation is, however, missing. 😛
There is an updated fork (by KennyTM~) here.

Comments

5 responses to “Cache or Check?”

  1. KennyTM~

    Actually… most of how LC_DYLD_INFO_ONLY (0x80000022) works is coded in http://www.opensource.apple.com/source/ld64/ld64-96.5/src/other/dyldinfo.cpp .

  2. coco

    hey dhowett we have met on the iphone chat…. however i cannot find that chat anymore 🙁

    i need some help. if you could just mail me ?

    i tried looking for u lol that’s how desperate i am lol

    anyways hope you get this message bro

    Coco from Thailand!

  3. brandon

    i need help with the absinthe program, the jailbreak button will not let me click it, please help! thank you

  4. […] originally written by D. Howett, was created to unpack files from the dyld_shared_cache_armvX file. Unfortunately, this tool does […]

  5. […] There’s a bit of a better (though older and less informed, more in-depth) write-up here: http://blog.howett.net/2009/09/cache-or-check/. […]