Routines as Values

Volume Number: 13 (1997)
Issue Number: 10
Column Tag: Programming Techniques

Routines As Values

by Lawrence D'Oliveiro

The High-Level Programmer's Guide to Generating Code at Run-Time

Introduction

This article is about passing around routines (procedures and functions) as values. That is, you take the address of a routine, pass it to another routine as an argument, maybe even store it in a data structure for later use. Ultimately the whole point with a routine is that you make a call to it; but many interesting things can happen on the way there.

There are many useful programming techniques to do with routines as values. In my view, these techniques deserve to be more widely known than they are, because they can contribute so much to the principles of reusable code and information hiding that good programmers value. This article will give you an introduction to some of these techniques.

To make sense of this article, it helps to have some understanding of 68K assembly language and how it interfaces to the Mac OS. The definitive reference to 68K machine language is the 68000 Family Programmer's Reference, available from Motorola's Web site as an Acrobat PDF file at http://www.mot.com/hpesd/aesop/68k/68kprm.pdf. Links to related documents can be found at http://www.mot.com/SPS/HPESD/docs/pubs/index.html. For a description of 68K calling conventions on the Mac OS, see the Trap Manager chapter of Inside Macintosh: Operating System Utilities, available on-line at ftp://ftpdev.info.apple.com//Developer_Services/Technical_Documentation/Inside_Macintosh/Operating_System_Utilities/. [Another useful and brief article introducing the basics of 68K ASM is "The Secrets of the Machine" by Malcolm Teas in MacTech 9.5. It is available from www.mactech.com/Articles/Vol.09/09.05/Read-Assembly/index.html -- ed].

Language Notes

Most of the code examples in this article are written in Modula-2. Specifically, they were written for the (now-defunct) Metrowerks Modula-2 compiler that runs under Apple's MPW (Macintosh Programmer's Workshop). I use this combination because it's been the best language and development environment for the kinds of programs I write. I recently got hold of the p1 Modula-2 compiler from Germany, but have yet to start using that in earnest.

At the end of the article, I will discuss the applicability of these techniques to other languages: everybody's favorite, C, and everybody else's favorite, Pascal.

In the meantime, I'll explain a few things for those who might be unfamiliar with Modula-2, but are familiar with Pascal. And if you're not familiar with Pascal, then you're not a real programmer...

Modula-2 has a lot in common with Pascal, and most of the following code should be comprehensible to Pascal programmers. It has the same signed integer types as Pascal (INTEGER for 16-bit integers and LONGINT for 32-bit ones), but it also has unsigned integer types: CARDINAL for 16-bit ones and LONGCARD for 32-bit ones. Pointer types are defined using the syntax "POINTER TO t" instead of Pascal's "^t". And there is a general untyped pointer type, ADDRESS, which is compatible with any pointer type. The Pascal compilers on the Mac don't seem to have any equivalent to this; it is like the "void *" type in C. The built-in ADR function returns the address of something; it is equivalent to the "@" operator in Pascal or "&" operator in C. Like the Pascal version (but unlike the C one), it always returns an untyped pointer value, that is, its return type is ADDRESS.

If you want to do unsafe type conversions, there are two ways. One is the traditional record-variant trick from Pascal (or unions in C). The other is the use of the CAST function. You will see both in use later. Unlike Pascal or C, you can't just use a type name as a function to do a cast; this makes unsafe conversions a little easier to find in the code!

The big difference from Pascal, for the purposes of this article, is that Modula-2 allows you to declare routine types. That is, you can write pieces of code like this:

  TYPE
    MyProcType =
      PROCEDURE
       (
        INTEGER
       );

  VAR
    ProcVar : MyProcType;
    K : INTEGER;

  PROCEDURE MyProc
   (
    Arg : INTEGER
   );

  BEGIN
    ...
  END MyProc;

BEGIN (* main program *)
  MyProc(K); (* invokes MyProc *)
  ProcVar := MyProc;
  ProcVar(K); (* also invokes MyProc *)
  ...

In Pascal, you can take the address of the routine:

  Var
    ProcVar : Ptr;

  Procedure MyProc
   (
    Arg : Integer
   );

 Begin
  ...
 End {MyProc} ;

Begin
{ main program } 
  MyProc(K);
  ProcVar := @MyProc;
  ...

But there is no way to define a type like MyProcType in the Modula-2 version, and the language provides no standard mechanism for calling the routine once you get its address in ProcVar. Of course, you can get around this, in a type-unsafe way, with appropriate use of inline machine code.

C programmers will already be familiar with all this, since C has similar sorts of facilities to Modula-2 in this regard, albeit with slightly less elegant syntax:

  typedef void (*MyProcType)(short);

  MyProcType ProcVar;
  short K;

void MyProc
 (
  short Arg
 )
 {
  ...
 } /*MyProc*/

void main
 (
  void
 )
 {
  MyProc(K); /* invokes MyProc */
  ProcVar = &MyProc;
  (*ProcVar)(K); /* also invokes MyProc */
  ...

I did a quick search for Modula-2-related information on the Web, and was amazed at how much I found. Here are a few URLs to get you started: http://www.infm.ulst.ac.uk/~george/mp/pract/modula.html, http://www.ee.newcastle.edu.au/users/staff/peter/m2/Modula2.html, http://www.ise.fhg.de/personal_home_pages/hubertus/texte/tabcont.html.

Anyway, enough with the preliminaries, on with the show...

Action Routines

There is frequently a need to traverse a complex data structure, looking for elements that match a particular criterion, and doing some processing on them. For example, consider the System 7 process list: you might want to scan it to see if a particular program is running. Or you might want to total up information about all processes, or display them in a list for the user's information. In all these applications, there is a certain recurring pattern to the code:

Declare a variable named, say, CurrentProcess, of type ProcessSerialNumber. Initialize it to kNoProcess.
Call GetNextProcess, passing it CurrentProcess. If you get an error, exit the loop.
Perform whatever processing is appropriate on the process with the serial number in CurrentProcess. Then repeat from the previous step.

The code in the first two steps would be the same in every case. Ideally, you should be able to write it just once, and reuse this code everywhere you need to. How can you do this?

The answer is to write a generic process-traversal routine, which calls an action routine that you specify for each process. In Modula-2, you could define the type of the action routine as follows:

TYPE
  ProcessActions =
    PROCEDURE
     (
      VAR (*ThisProcess :*) ProcessSerialNumber,
      VAR (*ProcessInfo :*) ProcessInfoRec,
      VAR (*KeepGoing :*) BOOLEAN,
      (*ActionArg :*) ADDRESS
     );

The first two arguments pass all the information about the process that the Process Manager provides. The KeepGoing argument may be set to false by the action routine to stop the traversal. This is useful if, for example, you're searching for a particular process, and you don't need to search further once you've found it. The last argument allows the passing of additional data whose meaning is up to the caller; we'll see one interesting use for this later.

And here's the generic process-traversal routine:

PROCEDURE TraverseProcesses
 (
  Action : ProcessActions;
  ActionArg : ADDRESS
 );

  VAR
    ThisProcess : ProcessSerialNumber;
    ProcessInfo : ProcessInfoRec;
    ProcessName : Str255;
    AppSpec : FSSpec;
    KeepGoing : BOOLEAN;
    Err : OSErr;

BEGIN
  ThisProcess.highLongOfPSN := 0;
  ThisProcess.lowLongOfPSN := kNoProcess;
  ProcessInfo.processInfoLength := SIZE(ProcessInfoRec);
  ProcessInfo.processName := ADR(ProcessName);
  ProcessInfo.processAppSpec := ADR(AppSpec);
  KeepGoing := TRUE; (* initially *)
  LOOP
    Err := GetNextProcess(ThisProcess);
    IF Err <> noErr THEN
      EXIT
    END (*IF*);
    Err := GetProcessInformation(ThisProcess, ProcessInfo);
    IF Err = noErr THEN
      Action
       (
        ThisProcess,
        ProcessInfo,
        KeepGoing,
        ActionArg
       );
      IF NOT KeepGoing THEN
        EXIT
      END (*IF*)
    END (*IF*)
  END (*LOOP*)
END TraverseProcesses;

A sorting example

Another use for action routines is to isolate assumptions about data structures. Consider a common operation: sorting. You've probably got a collection of favorite sorting routines that you like to use. What sorts of assumptions do they make about the data being sorted? Does it have to be in an array? Can it be in a disk file? Do the sort keys have to reside together with the rest of the data? Why make any assumptions at all?

Many sorting routines can function by making only two kinds of accesses to the data: they need to be able to compare the keys of two records, and they need to be able to swap the positions of two records to get them into the right order. Let's define interfaces for action routines to perform these operations:

TYPE
  ComesBeforeProc =
    PROCEDURE
     (
      (*First :*) LONGCARD,
      (*Second :*) LONGCARD,
      (*Arg :*) ADDRESS
     ) : BOOLEAN;
  SwapProc =
    PROCEDURE
     (
      (*First :*) LONGCARD,
      (*Second :*) LONGCARD,
      (*Arg :*) ADDRESS
     );

The one assumption I will make is that records are identifiable by index numbers starting from 1. A ComesBeforeProc returns true if the record with index First is supposed to come before the one with index Second (that is, they're already in the right order); it returns false otherwise. A SwapProc is called to swap two out-of-order elements: after it executes, the record which had index First should now have index Second, and vice versa.

As with the process-traversal example, each action routine takes an additional argument which may be used to pass additional caller-specified information.

Here's a sorting routine that uses the above action routines to implement a simple Shellsort:

PROCEDURE Sort
 (
  NrElements : LONGCARD;
  ComesBefore : ComesBeforeProc;
  ComesBeforeArg : ADDRESS;
  Swap : SwapProc;
  SwapArg : ADDRESS
 );

  VAR
    i, j, k, l, m : LONGCARD;

BEGIN
  i := 1;
  WHILE i < NrElements DO
    i := i + i
  END (*WHILE*);
  m := (i - 1) DIV 2;
  IF (m = 0) AND (NrElements > 1) THEN
   (* do at least one pass when sorting 2 elements *)
    m := 1
  END (*IF*);
  WHILE m > 0 DO
    k := NrElements - m;
    FOR j := 1 TO k DO
      i := j;
      LOOP
        l := i + m;
        IF ComesBefore(i, l, ComesBeforeArg)
        THEN
          EXIT
        END (*IF*);
        Swap(i, l, SwapArg);
        IF i <= m THEN
          EXIT
        END (*IF*);
        i := i - m
      END (*LOOP*)
    END (*FOR*);
    m := m DIV 2
  END (*WHILE*)
END Sort;

Of course, you could replace the body of this routine with just about any sorting algorithm you like, and keep the same interface. Here's a very simple, almost trivial example use of this routine to sort an array of integers. First we declare a data array, and comparison and swap routines for the sort:

TYPE
  ElementType = INTEGER;
VAR
  MyData : POINTER TO
    POINTER TO ARRAY [1 .. 10000] OF ElementType;
  MyNrElements : CARDINAL;

PROCEDURE MyComesBefore
 (
  First, Second : LONGCARD;
  Unused : ADDRESS
 ) : BOOLEAN;

BEGIN
  RETURN
      MyData^^[First]
    <=
      MyData^^[Second]
END MyComesBefore;

PROCEDURE MySwap
 (
  First, Second : LONGCARD;
  Unused : ADDRESS
 );

  VAR
    Temp : ElementType;

BEGIN
  Temp := MyData^^[First];
  MyData^^[First] := MyData^^[Second];
  MyData^^[Second] := Temp
END MySwap;

And here's the call that does the actual sort:

Sort
 (
  (*NrElements :=*) MyNrElements,
  (*ComesBefore :=*) MyComesBefore,
  (*ComesBeforeArg :=*) NIL,
  (*Swap :=*) MySwap,
  (*SwapArg :=*) NIL
 )

This example makes no use of the extra arguments that can be passed to the comparison and swap routines. Later we will see an interesting use for these extra arguments.

Closures

In languages like POP-11 and LISP, which are primarily used for Artificial Intelligence work, you can have a lot of fun with functions. You can even create new functions at run-time!

But why should AI researchers have all the fun? Can't you create functions at run-time in a common language like Pascal, Modula-2 or C? Certainly you can!

The bad news is: the techniques for doing this rely on particular features of the compiler and machine. It takes some work to get them right the first time. And certain compiler optimizations can signficant problems.

The good news is: once you get it right, it's a lot easier to do again. And these become very useful techniques, that you find yourself using over and over!

Technically, a Curried function (it's named after a person, not a dish) is created when you take a function with many arguments, and "bind" some of them to particular value; the result is a function of fewer arguments. A closure, on the other hand, is created when you bind a function to its parent environment, to create a new function which automatically sets up the correct environment for the original function when it's called.

To illustrate closures and environments, consider a common problem on the Mac: accessing A5 globals from an interrupt routine.

In most Mac languages, when you declare a bunch of global variables, the compiler allocates them in a block of memory accessed relative to register A5. A5 is also used to access the application's segment table, as well as QuickDraw's global storage. Thus, A5 is a very important register, and your application code is liable to behave very strangely if it's called with an incorrect value in A5, and it tries to access any global data.

Instead of putting special code to deal with this into all of your interrupt routines, the principle of reusable code dictates that you should write this code just once, and use it over and over. That is, you should write a general library routine which, given an interrupt routine, creates a "wrapper" routine for it that passes the correct A5 value. Here's an example of what I mean:

TYPE
  BoundInterruptProc = ARRAY [1 .. 9] OF CARDINAL;

A BoundInterruptProc is an array which will hold the the code for the wrapper routine. The wrapper code itself is filled in by BindInterruptProc, which is defined as follows:

PROCEDURE CurrentA5() : LONGCARD;
(* returns the current value of the global base register.*)

  CODE
    02E8DH;          (* move.l a5, (sp) *)

PROCEDURE BindInterruptProc
 (
  TheProc : ADDRESS;
  VAR Result : BoundInterruptProc
 );
(* constructs the wrapper around the specified procedure.*)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F0DH;          (* move.l a5, -(sp) *)
                              (* save previous A5 *)
  Result[2] := 02A7CH;          (* move.l #n, a5 *)
  TwoWords.LongValue := CurrentA5(); 
                              (* n for previous instr *)
  Result[3] := TwoWords.HighWord;
  Result[4] := TwoWords.LowWord;
  Result[5] := 04EB9H;          (* jsr p *)
  TwoWords.LongValue := TheProc;  (* p for previous instr *)
  Result[6] := TwoWords.HighWord;
  Result[7] := TwoWords.LowWord;
  Result[8] := 02A5FH;          (* move.l (sp)+, a5 *)
                              (* restore previous A5 *)
  Result[9] := 04E75H;          (* rts *)
  FlushCaches
END BindInterruptProc;

So instead of passing the address of your actual interrupt routine to the system, you pass the address of the appropriately-filled-in BoundInterruptProc structure, for example:

VAR
  Params : HParamBlockRec;
  BoundCompletion : BoundInterruptProc;
...

BindInterruptProc(ADR(MyCompletion), BoundCompletion);
Params.ioCompletion := ADR(BoundCompletion);
...
Err := PBReadAsync(ADR(Params));

and in your interrupt routine you have full access to all your A5 globals!

Note that FlushCaches call at the end of the definition of BindInterruptProc. This is necessary because of a piece of brain damage in Motorola's 68030 and 68040 processors. The problem is that, on these processors, memory references are divided into two kinds, data and code, and there is a separate CPU cache for each, with no attempt to maintain consistency between them. Thus, if a given memory location ends up in both caches (as could happen when you generate code at run-time, as BindInterruptProc does), it is possible for it to be modified from the data cache, without the instruction cache picking this up! Then, when you later try to execute the instruction at that memory location, you get the wrong value, and crash...

Here's how FlushCaches is defined. This definition is safe to use on all members of the 68000 CPU family, with system version 6.0 and later. It even works on Power Macs, where it flushes the cache of compiled code on models with the DR ("dynamic recompilation") emulator, and does nothing on older Power Macs.

PROCEDURE HWPrivPresent() : BOOLEAN;
(* has this system got the HWPriv trap. *)

  CONST
    UnimplementedTrap = 009FH;
    HWPrivTrap = 0A198H;

BEGIN
  RETURN
      GetOSTrapAddress(HWPrivTrap)
    <>
      GetOSTrapAddress(UnimplementedTrap)
END HWPrivPresent;

PROCEDURE FlushInstructionCache;
(* actually flushes both instruction and data caches. *)

  CODE
    07001H, (* moveq.l #1, d0 *)
    0A198H; (* _HWPriv *)

PROCEDURE FlushCaches;
 (* flushes instruction and data caches, if present. *)

BEGIN
  IF HWPrivPresent() THEN
    FlushInstructionCache
  END (*IF*)
END FlushCaches;

Binding Actual Arguments

Often, in a routine of yours that is called by the system, you don't need access to all your A5 globals, but you do need access to a particular block of data. Depending on the situation, the system may provide all kinds of different ways of passing the address of this block as an additional argument to your routine -- or it might not provide any way at all.

For example, the action routine you pass to the Control Manager's TrackControl routine gets the control handle as an argument, so you could use the refCon field in the control record to pass information to it. That is, unless the part of the control you're tracking is an indicator (such as a scroll bar thumb), in which case you don't get any arguments at all. Of course, if your code is part of the currently-running application, you could probably get away with using A5 globals in this case.

There are other, equally complicated situations. Time Manager tasks didn't use to get any arguments, until System 6.0.3, when they started getting the address of the TMInfo record in register A1. Memory Manager GrowZoneProcs don't get any arguments, and you can't guarantee that some part of the system (or some weird INIT) won't try to allocate some memory in your heap while the A5 value is set to something nonsensical.

Now, what do you think: would you prefer to resort to a range of different fiddly techniques to deal with each of these situations, or would you rather stick with one general, uniform technique that can handle them all?

I know which I prefer. I would Curry my routine by binding it to the address of my data, using the following Bind routine:

TYPE
 (* structure for holding Curried routine *)
  BoundProc = ARRAY [1 .. 8] OF CARDINAL;

PROCEDURE Bind
 (
  TheProc : ADDRESS;
  ToArg : ADDRESS;
  VAR Result : BoundProc
 );
 (*    creates a new procedure which calls TheProc
      with ToArg as an extra argument. *)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F17H;          (* move.l (sp), -(sp) *) )
  Result[2] := 02F7CH;          (* move.l #m, n(sp) *) 
  TwoWords.LongValue := ToArg;    (* m for above move *)
  Result[3] := TwoWords.HighWord;
  Result[4] := TwoWords.LowWord;
  Result[5] := 4;                (* n for above move *)
  Result[6] := 04EF9H;          (* jmp n *)
  TwoWords.LongValue := TheProc;  (* n for jmp *)
  Result[7] := TwoWords.HighWord;
  Result[8] := TwoWords.LowWord;
  FlushCaches
END Bind;

Note that you can use Bind on any procedure or function of just about any number of arguments, and (if a function) returning a result of any type. The only restrictions are that it must follow Pascal calling conventions (as Metrowerks Modula-2 does), the argument you are binding must be the last one in the argument list, and it must be 32 bits in size (such as an address).

Converting Calling Conventions

Another important use for run-time closures is where the system invokes some code, that you supply, in a nonstandard way. One case that has bitten many Macintosh programmers over the years is the infamous TextEdit clikLoop callback.

If you specify a clikLoop routine in your TextEdit record, TEClick will call it repeatedly as long as the mouse remains down. One use for this callback is to implement autoscrolling: TextEdit's default clikLoop action, like the rest of TextEdit, doesn't know anything about scrollbars, so you have to provide your own code to keep these in sync.

The clikLoop routine is invoked in a slightly odd way: it must preserve register D2 (normally on the Mac, any routine you call is allowed to trash A0, A1, D0, D1 and D2), and it must return a boolean result in the bottom byte of D0. If this value is ever false, then TEClick returns immediately, as though the mouse has been released.

The trap lies in an inocuous-seeming library routine, that Apple provides, called TESetClickLoop. It looks very tempting to use, as it allows you to pass a clikLoop callback that conforms to straight Pascal calling conventions. Of course, it uses its own glue code internally to call your routine and convert appropriately between calling conventions.

The problem lies in how this glue code finds your routine: the address of your routine is stored in a global variable! What this means is, if you have two separate TextEdit records in existence at once, and you try to specify a separate clikLoop routine for each one, then the last one you specify gets called for both, since there is only room to remember a single one!

The correct solution is to create the glue code on the fly. Here's a routine that does this:

TYPE
  ClikLoopProc =
   (* has one more arg than TextEdit.ClikLoopProc *)
    PROCEDURE
     (
      (*Arg :*) ADDRESS
     ) : BOOLEAN;
  BoundClikLoopProc = ARRAY [1 .. 12] OF CARDINAL;

PROCEDURE BindClikLoopProc
 (
  TheProc : ClikLoopProc;
  ToArg : ADDRESS;
  VAR Result : BoundClikLoopProc
 );
 (* binds TheProc to the specified argument to create
  a closure that can be invoked by TextEdit as a
  ClikLoop procedure. *)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F02H;            (* move.l d2, -(sp)*)
                                (* save vulnerable register *)
  Result[2] := 04227H;            (* clr.b -(sp) *)
  Result[3] := 02F3CH;            (* move.l #n, -(sp) *)
                                (* insert additional arg *)
  TwoWords.LongValue := ToArg;
  Result[4] := TwoWords.HighWord  ;  (* n for above *)
  Result[5] := TwoWords.LowWord;    (* ditto *)
  Result[6] := 04EB9H;            (* jsr l *)
  TwoWords.LongValue := CAST(LONGCARD, TheProc);
  Result[7] := TwoWords.HighWord;    (* high word of l *)
  Result[8] := TwoWords.LowWord;    (* low word of l *)
  Result[9] := 07000H;            (* moveq.l #0, d0 *)
  Result[10] := 0101FH;            (* move.b (sp)+, d0 *)
  Result[11] := 0241FH;            (* move.l (sp)+, d2 *)
                                (* restore saved register *)
  Result[12] := 4E75H;            (* rts *)
  FlushCaches
END BindClikLoopProc;

Note how, while I'm at it, I cleverly change the type of the ClikLoopProc that you must pass, to include an extra argument. This allows you to pass extra information to your ClikLoopProc when it is called, for whatever purpose you want.

Up-Level References

In Pascal and Modula-2, you can declare one routine local to another. In this case, the environment of the inner routine includes everything declared locally to the outer routine. In particular, it is possible for the inner routine to access variables declared local to the outer routine.

How is this done? I know of two different techniques that compilers may use. One involves the use of a global display array of pointers to stack frames, indexed by the source nesting level of the routine. The other way is to pass a static link when calling a nested routine -- this is just a pointer to the enclosing routine's stack frame.

On the Mac, one of these techniques is the right way, and the other is the wrong way. The display array is the wrong way, because it requires global data that isn't always available. The static link method is the right way. Thankfully, the Mac compilers I'm aware of all do it the right way.

To illustrate how a static link works, have a look at the following piece of source:

PROCEDURE Outer;

  VAR
    OuterVar : CARDINAL;

  PROCEDURE Inner
   (
    NewValue : CARDINAL
   );

  BEGIN
    OuterVar := NewValue
  END Inner;

BEGIN (*Outer*)
  OuterVar := 1;
  Inner(2)
END Outer;

Here's the code that the Metrowerks compiler generates for for procedure Outer:

LINK    A6,#$FFFE
MOVE.W   #$0001,-$0002(A6) ; OuterVar := 1
MOVE.W   #$0002,-(A7) ; arg for call to Inner
PEA    (A6)
JSR    Inner
UNLK    A6
RTS

Note that the variable OuterVar is at offset -2 within Outer's stack frame. Also note that when it calls Inner, it doesn't just pass the cardinal value 2 for the argument NewValue; it pushes an additional longword, which is the value of its A6 (stack frame) register. To see the use of this, here's the code for Inner:

LINK    A6,#$0000
MOVEA.L  $0008(A6),A4 ; Outer's A6 to access OuterVar
MOVE.W   $000C(A6),-$0002(A4) ; OuterVar := NewValue
UNLK    A6
MOVEA.L  (A7)+,A0
ADDQ.L   #$6,A7
JMP    (A0)

Notice how Inner gets to OuterVar by using the stack frame pointer that was passed to it by Outer. If Outer was itself declared local to another routine, it is similarly possible for Inner to access the local variables in that stack frame by following the frame pointer value passed to Outer by its parent.

To see the relevance of this, here's an example use of the TraverseProcesses routine I gave up above. It's a routine called FindProcessBySignature, that scans the process list until it finds a process with a given application signature. It returns true and the process serial number if it finds a match, false otherwise.

PROCEDURE FindProcessBySignature
 (
  Signature : OSType;
  VAR MatchingProcess : ProcessSerialNumber
 ) : BOOLEAN;

  VAR
    Found : BOOLEAN;

  PROCEDURE MatchSignature
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

    VAR
      Err : OSErr;

  BEGIN
    IF ProcessInfo.processSignature = Signature THEN
      MatchingProcess := ThisProcess;
      Found := TRUE;
      KeepGoing := FALSE
    END (*IF*)
  END MatchSignature;

BEGIN (*FindProcessBySignature*)
  Found := FALSE; (* initial assumption *)
  TraverseProcesses
   (
    (*Action :=*) CAST(ProcessActions,
    ADR(MatchSignature)),
    (*ActionArg :=*) CurrentA6()
   );
  RETURN
    Found
END FindProcessBySignature;

First of all, compare the declaration of the inner MatchSignature routine with the original definition of the ProcessActions type. Something is missing: What has happened to the ActionArg?

The answer, of course, is that this is now an implicit additional argument, being the outer routine's stack frame pointer! When FindProcessBySignature passes the address of MatchSignature to TraverseProcesses as the action routine, it tells the latter to call it with that additional argument, which is the result of calling the function CurrentA6. This is defined as follows:

PROCEDURE CurrentA6() : LONGCARD;

  CODE
    02E8EH; (* move.l a6, (sp) *)

It returns the value that is currently in the A6,which is the stack frame register -- exactly what the inner routine needs when it's called.

The main ugly thing about this technique is that Modula-2 doesn't allow you to pass a nested routine as a routine value; this is only allowed for globally-declared routines (as in the sorting example earlier). Luckily, Metrowerks allows you to use the ADR (address-of) function on a nested routine, and you can do a CAST to force the result of this to the appropriate type. (Unfortunately, the p1 compiler doesn't seem to support this feature.)

Why not just declare MatchSignature globally outside FindProcessBySignature, and avoid this trickiness with stack frame pointers? The reason is because, instead of passing the stack frame pointer as the additional argument, you'll now have to pass pointers to the local variables Found and MatchingProcess. Or if you try using globals instead, your code is no longer reentrant! This may not seem a big deal at first sight, but it's very important in library code, which will be written once and reused in a wide variety of situations you never envisaged. Alternatively, you could combine both the local variables into a single structure, and pass the address of that. That will work, but you lose some of the simplicity of the nesting version.

Besides, the MatchSignature routine doesn't need to be visible globally. One of the safety principles you learn when you're writing large programs is: if something doesn't need to be visible globally, don't make it so. In a large piece of software, the complexity comes not from the number of parts, but from the number of potential interactions between them. To minimize unwanted interactions, things should only be accessible in places where they need to be. This is called information hiding.

There is one subtle pitfall with static links you need to watch out for. Consider the following code fragment:

PROCEDURE Parent;

  VAR
    TargetProcess : ProcessSerialNumber;

  PROCEDURE Sister
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

  BEGIN
    ...
    IF ThisIsTheOneIWant THEN
      TargetProcess := ThisProcess;
      KeepGoing := FALSE
    END (*IF*)
  END Sister;

  PROCEDURE Brother;

  BEGIN
    TraverseProcesses
     (
      (*Action :=*) CAST(ProcessActions, ADR(Sister)),
      (*ActionArg :=*) CurrentA6() (***WRONG***)
     )
  END Brother;

BEGIN (*Parent*)
  ...
  Brother
  ...
END Parent;

What is wrong with this? The problem is, that the use of CurrentA6() in Brother means that it is passing Sister a pointer to its own stack frame, rather than the one belonging to Parent! This means that, where Sister expects to access variables local to Parent, it accesses ones local to Brother instead. This is obviously wrong, since the language rules don't allow variables local to Brother to be directly visible to Sister.

The following is one way to modify this to work correctly:

PROCEDURE Parent;

  VAR
    ParentA6 : LONGCARD;
    TargetProcess : ProcessSerialNumber;

  PROCEDURE Sister
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

  BEGIN
    ...
    IF ThisIsTheOneIWant THEN
      TargetProcess := ThisProcess;
      KeepGoing := FALSE
    END (*IF*)
  END Sister;

  PROCEDURE Brother;

  BEGIN
    TraverseProcesses
     (
      (*Action :=*) CAST(ProcessActions, ADR(Sister)),
      (*ActionArg :=*) ParentA6
     )
  END Brother;

BEGIN (*Parent*)
  ParentA6 := CurrentA6();
  ...
  Brother
  ...
END Parent;

Here the CurrentA6() call has been moved to Parent, and its value saved for use by Brother when calling Sister.

The rule is simple: when calling a nested routine, you must always pass the stack frame pointer for the routine textually immediately enclosing the one you want to call. If you stick to this rule, all the more complicated cases -- references to local variables up more than one level, recursion, inner routines calling outer ones, and so on -- are guaranteed to work correctly!

Application To Other Languages

For some odd reason, most Mac programmers don't use Modula-2. Most use C, and most of the rest use Pascal. And you're wondering: how applicable are these techniques to those languages?

You can use them to some extent. Modula-2 is modeled fairly closely on Pascal (they were both designed by the same person, after all), and I'm pretty sure all the Mac Pascals use A6 as a stack frame pointer in the same way as Modula-2. There are two main drawbacks to Pascal:

There are no routine types. Once you take the address of a procedure or function, you're completely on your own.
I understand THINK Pascal doesn't even allow you to take the address of a routine that's local to another routine. That means you can't use any of the A6 stuff.

Note that it is possible in Pascal to pass a routine as an argument to another, in a type-safe fashion. This feature is powerful enough to deal with all the examples using static links I have given above. However, it doesn't allow you to save routine addresses in data structures, which can come in useful in some cases I haven't covered in this article.

The major drawback with C is that it doesn't allow you to declare one routine inside another. This means you have to go through the extra complexity I mentioned above in passing additional arguments to action routines. At least C has proper procedure and function types.

Conclusion

Assembly-language hackers are occasionally fond of a programming technique known as self-modifying code, where an instruction sequence pokes some bits in an instruction elsewhere in the routine before executing it.

I want to make a clear distinction between self-modifying code, and the techniques for generating code at run-time that I have been describing above. Self-modifying code was a technique concocted to squeeze a few more instruction cycles out of time-critical code. It has been quite rightly criticized as leading to programs that are hard to understand, hard to reuse and hard to maintain.

The techniques I have been describing, on the other hand, are more structured than this. The code which is being generated at run-time follows a simple, easy-to-understand form, and its use is really quite straightforward. My aim has not been to optimize time-critical code, but precisely to create code which is easy to understand, easy to reuse and easy to maintain!

Lawrence D'Oliveiro has been doing both high-level and low-level programming on a variety of different platforms over the last eighteen years. When he's not playing with his techno-toys, or worrying about which cool technology Apple is going to abandon next, he likes cycling and walking around the nearby Waikato River.

Software Updates via MacUpdate

Latest Forum Discussions

Make the passage of time your plaything...

While some of us are still waiting for a chance to get our hands on Ash Prime - yes, don’t remind me I could currently buy him this month I’m barely hanging on - Digital Extremes has announced its next anticipated Prime Form for Warframe. Starting... | Read more »

If you can find it and fit through the d...

The holy trinity of amazing company names have come together, to release their equally amazing and adorable mobile game, Hamster Inn. Published by HyperBeard Games, and co-developed by Mum Not Proud and Little Sasquatch Studios, it's time to... | Read more »

Amikin Survival opens for pre-orders on...

Join me on the wonderful trip down the inspiration rabbit hole; much as Palworld seemingly “borrowed” many aspects from the hit Pokemon franchise, it is time for the heavily armed animal survival to also spawn some illegitimate children as Helio... | Read more »

PUBG Mobile teams up with global phenome...

Since launching in 2019, SpyxFamily has exploded to damn near catastrophic popularity, so it was only a matter of time before a mobile game snapped up a collaboration. Enter PUBG Mobile. Until May 12th, players will be able to collect a host of... | Read more »

Embark into the frozen tundra of certain...

Chucklefish, developers of hit action-adventure sandbox game Starbound and owner of one of the cutest logos in gaming, has released their roguelike deck-builder Wildfrost. Created alongside developers Gaziter and Deadpan Games, Wildfrost will... | Read more »

MoreFun Studios has announced Season 4,...

Tension has escalated in the ever-volatile world of Arena Breakout, as your old pal Randall Fisher and bosses Fred and Perrero continue to lob insults and explosives at each other, bringing us to a new phase of warfare. Season 4, Into The Fog of... | Read more »

Top Mobile Game Discounts

Every day, we pick out a curated list of the best mobile discounts on the App Store and post them here. This list won't be comprehensive, but it every game on it is recommended. Feel free to check out the coverage we did on them in the links below... | Read more »

Marvel Future Fight celebrates nine year...

Announced alongside an advertising image I can only assume was aimed squarely at myself with the prominent Deadpool and Odin featured on it, Netmarble has revealed their celebrations for the 9th anniversary of Marvel Future Fight. The Countdown... | Read more »

HoYoFair 2024 prepares to showcase over...

To say Genshin Impact took the world by storm when it was released would be an understatement. However, I think the most surprising part of the launch was just how much further it went than gaming. There have been concerts, art shows, massive... | Read more »

Explore some of BBCs' most iconic s...

Despite your personal opinion on the BBC at a managerial level, it is undeniable that it has overseen some fantastic British shows in the past, and now thanks to a partnership with Roblox, players will be able to interact with some of these... | Read more »

Price Scanner via MacPrices.net

You can save $300-$480 on a 14-inch M3 Pro/Ma...

Apple has 14″ M3 Pro and M3 Max MacBook Pros in stock today and available, Certified Refurbished, starting at $1699 and ranging up to $480 off MSRP. Each model features a new outer case, shipping is... Read more

24-inch M1 iMacs available at Apple starting...

Apple has clearance M1 iMacs available in their Certified Refurbished store starting at $1049 and ranging up to $300 off original MSRP. Each iMac is in like-new condition and comes with Apple’s... Read more

Walmart continues to offer $699 13-inch M1 Ma...

Walmart continues to offer new Apple 13″ M1 MacBook Airs (8GB RAM, 256GB SSD) online for $699, $300 off original MSRP, in Space Gray, Silver, and Gold colors. These are new MacBook for sale by... Read more

B&H has 13-inch M2 MacBook Airs with 16GB...

B&H Photo has 13″ MacBook Airs with M2 CPUs, 16GB of memory, and 256GB of storage in stock and on sale for $1099, $100 off Apple’s MSRP for this configuration. Free 1-2 day delivery is available... Read more

14-inch M3 MacBook Pro with 16GB of RAM avail...

Apple has the 14″ M3 MacBook Pro with 16GB of RAM and 1TB of storage, Certified Refurbished, available for $300 off MSRP. Each MacBook Pro features a new outer case, shipping is free, and an Apple 1-... Read more

Apple M2 Mac minis on sale for up to $150 off...

Amazon has Apple’s M2-powered Mac minis in stock and on sale for $100-$150 off MSRP, each including free delivery: – Mac mini M2/256GB SSD: $499, save $100 – Mac mini M2/512GB SSD: $699, save $100 –... Read more

Amazon is offering a $200 discount on 14-inch...

Amazon has 14-inch M3 MacBook Pros in stock and on sale for $200 off MSRP. Shipping is free. Note that Amazon’s stock tends to come and go: – 14″ M3 MacBook Pro (8GB RAM/512GB SSD): $1399.99, $200... Read more

Sunday Sale: 13-inch M3 MacBook Air for $999,...

Several Apple retailers have the new 13″ MacBook Air with an M3 CPU in stock and on sale today for only $999 in Midnight. These are the lowest prices currently available for new 13″ M3 MacBook Airs... Read more

Multiple Apple retailers are offering 13-inch...

Several Apple retailers have 13″ MacBook Airs with M2 CPUs in stock and on sale this weekend starting at only $849 in Space Gray, Silver, Starlight, and Midnight colors. These are the lowest prices... Read more

Roundup of Verizon’s April Apple iPhone Promo...

Verizon is offering a number of iPhone deals for the month of April. Switch, and open a new of service, and you can qualify for a free iPhone 15 or heavy monthly discounts on other models: – 128GB... Read more

Jobs Board

IN6728 Optometrist- *Apple* Valley, CA- Tar...

Date: Apr 9, 2024 Brand: Target Optical Location: Apple Valley, CA, US, 92308 **Requisition ID:** 824398 At Target Optical, we help people see and look great - and Read more

Medical Assistant - Orthopedics *Apple* Hil...

Medical Assistant - Orthopedics Apple Hill York Location: WellSpan Medical Group, York, PA Schedule: Full Time Sign-On Bonus Eligible Remote/Hybrid Regular Apply Now Read more

*Apple* Systems Administrator - JAMF - Activ...

…**Public Trust/Other Required:** None **Job Family:** Systems Administration **Skills:** Apple Platforms,Computer Servers,Jamf Pro **Experience:** 3 + years of Read more

Liquor Stock Clerk - S. *Apple* St. - Idaho...

Liquor Stock Clerk - S. Apple St. Boise Posting Begin Date: 2023/10/10 Posting End Date: 2024/10/14 Category: Retail Sub Category: Customer Service Work Type: Part Read more

Top Secret *Apple* System Admin - Insight G...

Job Description Day to Day: * Configure and maintain the client's Apple Device Management (ADM) solution. The current solution is JAMF supporting 250-500 end points, Read more

SPREAD THE WORD:
Slashdot
Digg
Del.icio.us
Reddit
Newsvine