Mac OS X Universal Binary API redux

Here are some brief and relatively unstructured notes I made while reading Apple's Universal Binary guidelines with a view to making a checklist for my own stuff. I hope they're useful. Any errors and omissions are my own and I welcome corrections and clarifications.

Bottom line

  • If you're not already on Mach-O, migrate ASAP.
  • If you're not already on Xcode, migrate ASAP.
  • Only the new MacOSX10.4u.sdk is portable to x86.
    • Update: you can combine PPC binaries built with earlier versions of gcc and/or earlier Mac OS X SDKs with x86 binaries, so supporting Panther and earlier is not a problem. Chris Espinosa has an example on his public iDisk folder "cdespinosa" called "SDKExample".
    • Eric Albert has pointed out the standard lipo developer tool, for inspection and arbitrary manipulation of fat binaries.
    • Starting with Xcode 2.2, you can manage this much more easily, using per-architecture variants of build settings such as GCC_VERSION_ppc, SDKROOT_ppc and MACOSX_DEPLOYMENT_TARGET_ppc.
  • CodeWarrior is dead, Jim. I mourn it as much as anyone, but facts are facts. :-(

Byte-swapping etc

All the byte-swapping issues that are familiar to anyone who's done Mac<->Win porting must be handled:

  • For structures managed by the OS, e.g. menus, the OS will do the byte-swapping. You're responsible for byte-swapping your own on-disk and network-transmitted data, including custom Apple Event data and anything you pack into e.g. TCP packets. Read on...
  • System-defined Apple Event data types are swapped by the OS, but you must swap app-defined types. You can register a 'flipper' callback with the Apple Event Manager to reduce code impact.
  • Similarly you can register a flipper callback with the Resource Manager to flip your app-defined resources and clipboard formats. It can be the same callback as the Apple Event flipper. Gotcha: if the resource is marked as preloaded, the flipper won't be called. On first reading, the registration API appears to take care of the OSType<->UTI shenanigans.
  • The second edition includes a much-improved approach to swapping PowerPlant 'PPob' resources, the key concept being to centralise all the swapping in the LStream class. Pretty much all you have to do now is audit for custom classes that call LStream::ReadData and either use more fine-grained LStream methods or do the swapping yourself.
  • AliasRecords are always big-endian, and you must use accessors for userType and aliasSize.
  • Many deprecated Toolbox functions have byte-swapping issues. Warnings for such functions are on by default in Xcode 2.1.
    • Update: I've been asked for more detail on this. Apple's document is a bit vague, saying only "such as those that use PICT + PS data".
  • FOND, NFNT, sfnt etc are always big-endian. Use Apple Type Services or swap them yourself.
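
For your own on-disk and network data, one approach that avoids per-architecture #ifdefs is to assemble and disassemble values a byte at a time, so the code never depends on the host byte order. Here is a minimal sketch, assuming a 32-bit field stored big-endian. The names read_be32 and write_be32 are my own, not Apple's; on Mac OS X you would probably reach for the system's own swapping routines instead.

```c
#include <stdint.h>

/* Hypothetical helper: read a 32-bit big-endian value from a byte buffer.
   Shifts operate on values, not memory, so this gives the same result
   on PPC and x86. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Hypothetical helper: write a 32-bit value to a buffer in big-endian order. */
static void write_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```

The cost is a few shifts per field, which is rarely significant next to the disk or network access itself.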

Other size and alignment issues

Again these will be familiar to anyone who's done Mac<->Win porting:

  • C bit-fields are both architecture-dependent and compiler-dependent. No surprise.
  • Don't mix BitTst() etc with the C bit operators such as |, & and ~.
  • Be careful when using BitTst() on Gestalt() values.
  • bool is 1 byte on x86 and 4 bytes on PPC, at least under Xcode. No surprise: the standard has always said sizeof(bool) is implementation-dependent, albeit within certain bounds.
  • Floating point equality comparison is architecture-dependent (but let's face it, you'd be a fool to rely on it anyway).
  • "On x86, a long double is still 16 bytes, but only 80 bits are significant." I'd need to reflect before commenting on the implications of this.
  • Converting a double to an integer type that isn't large enough to represent the integer part of the double is architecture-dependent. You might get INT_MAX, you might get INT_MIN. You might get my great-aunt's maiden name in ASCII. Don't rely on such conversions.
  • Nothing is said about wchar_t. Probably best avoided. :-(
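
To make the double-to-integer point concrete, here is a minimal sketch of a conversion that checks the range first, so the out-of-range result is something you chose rather than something the CPU chose. The function name and the decision to clamp (and to map NaN to 0) are my own assumptions; pick whatever policy suits your code.

```c
#include <limits.h>

/* Hypothetical helper: convert a double to int with a defined result for
   every input, instead of relying on architecture-dependent behaviour. */
static int double_to_int_clamped(double d)
{
    if (d != d) {
        return 0;                 /* NaN: arbitrary policy choice */
    }
    if (d >= (double)INT_MAX) {
        return INT_MAX;           /* too large: clamp */
    }
    if (d <= (double)INT_MIN) {
        return INT_MIN;           /* too small: clamp */
    }
    return (int)d;                /* in range: truncates toward zero */
}
```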

Low-level disk formats

  • Disk partitioning is different on x86.
  • Certain low-level HFS+ structures are always big-endian.

Objective-C

  • On x86, Obj-C messages to nil objects return garbage for return types of float and double (actually, whatever was in st(0)). Apple say that on PPC you get 0.0 but this is incorrect: you get whatever was in the FPR1 register. A correction has been submitted.
  • There is a paragraph about an ABI difference with the Obj-C runtime function objc_msgSend_stret that went over my head.

AltiVec

  • You can either use the Accelerate framework (10.3 or later) instead of AltiVec or port to the Intel SIMD APIs. I skipped the rest of this section as it was over my head, but I saw that alignment issues need special attention.

QuickDraw, QuickTime and OpenGL

  • GWorlds are big-endian by default but you can create little-endian formats. However, QuickDraw on PPC doesn't support little-endian GWorlds.
  • Certain OpenGL and Quartz pixel types need special attention re endian issues.
  • For QuickDraw Picture structs, use QDGetPictureBounds() rather than accessing .picFrame. Don't cast the result of DeltaPoint() to a Point struct, use the LoWord and HiWord macros.
  • QuickTime 'thng' resources need special treatment.
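
The DeltaPoint() advice is easier to see in code. As I understand it, the function packs the vertical difference into the high 16 bits of its result and the horizontal difference into the low 16 bits. Casting that to a Point only works when the struct's memory layout happens to match, which it does on big-endian but not on little-endian. The sketch below uses stand-in types and macros of my own (SketchPoint, SKETCH_HIWORD, SKETCH_LOWORD) rather than the real Toolbox headers, purely to show the idea of extracting the halves arithmetically.

```c
#include <stdint.h>

/* Stand-in for the Toolbox Point: vertical coordinate first, then horizontal. */
typedef struct {
    int16_t v;
    int16_t h;
} SketchPoint;

/* Hypothetical equivalents of the HiWord and LoWord macros. They work on the
   value, not on its bytes in memory, so they are endian-neutral. The final
   narrowing to int16_t assumes the usual two's-complement behaviour. */
#define SKETCH_HIWORD(x) ((int16_t)(((uint32_t)(x) >> 16) & 0xFFFFu))
#define SKETCH_LOWORD(x) ((int16_t)((uint32_t)(x) & 0xFFFFu))

/* Unpack a packed vertical/horizontal pair without casting to a struct. */
static SketchPoint point_from_packed(int32_t packed)
{
    SketchPoint p;
    p.v = SKETCH_HIWORD(packed);
    p.h = SKETCH_LOWORD(packed);
    return p;
}
```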

Rosetta

  • Rosetta is transparent to the user, but the Finder's Get Info will show which platforms an app is compiled for.
  • Rosetta runs PPC apps on x86, but not the other way around. :-(
  • Rosetta cannot run Classic apps, AltiVec code, System pref panels, G4/G5-specific code, kernel extensions, bundled Java apps or Java apps with JNI libs that can't be translated.
    • Update: the shipping version of Rosetta supports AltiVec.
  • Rosetta does not support mixtures of different architectures re apps and plug-ins, private frameworks, etc. It's all-or-nothing per process.
  • Rosetta is just-in-time but uses a large translation buffer. Apple claim that reused code will be translated only once.
  • There are endian issues when a translated app shares custom file formats, pasteboards, etc with native apps.
  • It is possible to force a universal binary to open under Rosetta.
  • There are quite severe limitations re debugging a translated app.
  • Those of us who lived through the 68K/PPC transition will immediately see that Rosetta is nowhere near as full a solution as the Mixed Mode Manager was. :-(

64-bit

The document says nothing about 64-bit addressing and 64-bit file APIs. Obviously 64-bit addressing is not supported in x86 (cough, AMD, cough). Presumably the 64-bit file APIs will "just work".

Sundry gotchas

  • Integer divide by zero is fatal on x86.
  • The x86 has fewer registers than the PPC so locals are more likely to be on the stack than is the case with PPC. Hence, bad code that writes beyond the end of locals, via gung-ho casts etc, is more likely to smash the stack on x86.
  • If you use the MachineLocation struct for time zones etc, you'll have to use .u.dls.Delta rather than .u.dlsDelta and you must set .u.dls.Delta AFTER you set .u.gmtDelta. Ugh.
  • Intel hyperthreading and dual-core will be supported by the threading APIs. So don't hard-code your number of threads or limit it to the number of CPUs.
  • OSEnqueueAtomic and OSDequeueAtomic are not available on x86 (Apple's document erroneously cites OSEnqueAtomic).
  • If you generate code at runtime, be aware that the stack must be 16-byte aligned when calling OS libs or frameworks. Presumably this also applies if you write assembler.
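
On the divide-by-zero point, the fix is simply to check before dividing. Here is a minimal sketch; the helper name safe_div is my own. It also rejects INT_MIN / -1, because that overflow raises the same hardware exception on x86.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: divide only when the result is well defined.
   Returns false (leaving *out untouched) instead of trapping. */
static bool safe_div(int num, int den, int *out)
{
    if (den == 0) {
        return false;             /* would trap on x86 */
    }
    if (num == INT_MIN && den == -1) {
        return false;             /* overflow, also traps on x86 */
    }
    *out = num / den;
    return true;
}
```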

That's all for now. If you found this useful, let me know.

Translations

It Italian translation, thanks to Ludovico Rossi.


© 2004-2006 Sailmaker Software Limited. All rights reserved.
Last updated: Thursday, February 16, 2006.