AppleScript's Variable Types and You
Volume Number: 22 (2006)
Issue Number: 3
Column Tag: Programming
What you don't know about AppleScript's variable types can hurt you (or slow down your scripts!)
by Ryan Wilcox
Introduction
Those who know AppleScript well are very familiar with obscure AppleScript tricks, magical incantations to make their scripts run faster. These incantations are usually due to an understanding of AppleScript's data sharing mechanisms, which are thoroughly explained in this article.
Learning how different types of variables work is an important step for any AppleScripter. Sometimes AppleScript behaves in ways that are unintuitive and often frustrating to those who don't understand what's going on. To those who don't understand these different types, some of AppleScript's behavior looks like bugs in the language, but this is rarely the case.
This article discusses three classifications of AppleScript's variable types, examines methods of appending an item to a list, and then examines the different kinds of copy operations available to a scripter. This article is not meant to be an introduction to the language, but rather is meant to be read by someone who has a working knowledge of AppleScript.
Variable Types
Pure, or "Vanilla", AppleScript (AppleScript that does not depend on any third-party extensions, aka OSAXen, or on applications) has three different classifications of variable types: scalar types, container types, and other types. The first classification, scalars, covers types that can only hold one item, for example a variable that contains the number 27. Integers and reals, as well as dates, strings, and text, are scalars.
The second classification is container types. A container type is a type that can hold multiple values. For example, a scalar could contain the number 27, while a list could contain {27, 3, 5, 6}, and a record could contain {item1: 27, item2: 3, item3: 5, item4: 6}. Lists, records and script objects are container types.
The last classification is other types. Other types include references, as well as constants and unit types.
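A quick way to see which classification a value falls into is to ask AppleScript for its class. The following is a minimal sketch; the handler name classifyExamples and its variable names are illustrative and are not taken from the test script.

```applescript
-- Illustrative sketch: classifyExamples is a hypothetical handler name.
to classifyExamples()
	set aScalar to 27 -- scalar: holds exactly one value
	set aList to {27, 3, 5, 6} -- container: holds several values
	set aRecord to {item1:27, item2:3} -- container: holds labeled values
	return {class of aScalar, class of aList, class of aRecord}
end classifyExamples

classifyExamples()
--> {integer, list, record}
```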
The information presented in this article (and this section in particular) is all backed up by experimentation, most of which was condensed into a test script. This script contains nearly four-dozen handlers that test the assertions made in this article. Most of the snippets of code you'll find are from the test script as well. The full script is found on the Web at: http://www.wilcoxd.com/articles/asreferences/ Each snippet of code in this article is backed by a handler in the testing framework, which will return true if the conditions have been met, or false if the test has failed. Handlers can easily be located by using the Table Of Contents at the top of the test script. Doing the timed tests included in the test script requires the Jon's Commands OSAX, found at http://www.seanet.com/~jonpugh/ The script can be run without the timed tests by setting the doTimings property to false.
Scalar and Container Types Compared
There is a difference in how AppleScript deals with scalars and how it handles container types. Container types, internally, use something the AppleScript Language Guide calls "Data Sharing" (page 206). Understanding this "data sharing" concept first requires an understanding of how variables work in AppleScript.
An AppleScript record, deconstructed, might be the easiest way to see the behavior of AppleScript's variable types, especially when accompanied by some illustrations.
set myRec to {varOne: 1, varTwo: 2, varThree: 3}
Figure 1 represents what myRec looks like:
Figure 1: myRec with three scalar items inside it
Now we create a new item in myRec:
set myRec to myRec & {varFour: 4}
and represent it by Figure 2:
Figure 2: myRec with four scalar items inside it
Simple enough. The situation changes when we set varOne, varTwo, and varThree to be container types instead:
set myRec to { varOne: {1}, varTwo: {2}, varThree: {3} }
myRec represented graphically looks like Figure 3:
Figure 3: Diagram of our variables as container types
As seen in the diagram, container types refer to their contents by pointing to where the data is "over there". In essence, container types in AppleScript work a lot like aliases in the Finder. An alias points to another file or folder, allowing you to access that item from a different logical location in your file structure. Likewise, container types let you refer to the same value "from" multiple variables. Aliases also take up less hard disk space than making a duplicate of the item itself, an advantage shared by AppleScript's container types: the same values only exist in one place in memory, instead of having multiple copies of the same data.
This means we can change two things with list items, just as we can with aliases. We can change the contents of the variable (the data that is shared), or we can change the variable itself (where it "points" to). First, Figure 4 shows myRec when we add another element to it:
set myRec to myRec & {varFour: varThree of myRec}
Figure 4: varFour shares the contents of varThree
Notice how varFour shares varThree's data. As you might expect, any change to {3} will be mirrored in both varThree and varFour, as the data is shared between these two variables, as seen in the following line of AppleScript (also shown in Figure 5).
set item 1 of varThree of myRec to 4
Figure 5: Changing the contents of varThree
Changing where a variable points is also possible, shown in Figure 6:
set varThree of myRec to {2.5}
Figure 6: Changing varThree
The above line changes only where varThree points to, as opposed to the contents (aka: the shared value) of what it was sharing with varFour. It would be as if you created two aliases in the Finder (alias A and alias B), both of them pointing to the same document, and then you selected a new original for alias B. Alias A continues to point to the original item; only alias B points somewhere else.
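The two kinds of change, changing the shared contents versus changing where a variable points, can also be sketched with plain lists instead of a record. The handler name contentsVersusPointer is an assumption made for illustration; it does not come from the test script.

```applescript
-- Illustrative sketch: contentsVersusPointer is a hypothetical handler name.
to contentsVersusPointer()
	set x to {3}
	set y to x -- y shares x's data
	set item 1 of x to 4 -- change the shared contents
	set seenThroughY to item 1 of y -- y sees the change
	set x to {2.5} -- point x at new data; y is untouched
	return {seenThroughY, x, y}
end contentsVersusPointer

contentsVersusPointer()
--> {4, {2.5}, {4}}
```

The first change shows up through y because both variables share one list; the second does not, because only x was pointed somewhere else.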
AppleScript shares the contents of container items, instead of sharing the item itself. This can be proven with a handler from the test script, testContainerItemsShareScalars().
to testContainerItemsShareScalars()
set myItem to {1}
set myList to myItem
set myList to myList & myItem
set item 1 of myItem to 45
return (myList is equal to {1,1} and myItem is equal to {45})
end testContainerItemsShareScalars
If item 1 and item 2 of myList shared item 1 of myItem itself, they would reflect the change made on line 4 (set item 1 of myItem to 45). Instead, item 1 and item 2 share the contents of item 1 of myItem, just like our example in Figure 6. Items in myList remain unchanged: they still share the same value, even if myItem does not anymore. The test script has two tests that further explore this functionality: testContainerItemChangeReflectedInShared(), and testContainerItemChangeReflectedInShared(). testContainerItemChangeReflectedInShared() proves that even if item 1 of myItem is a container, instead of a scalar, the behavior is the same as in testContainerItemsShareScalars(). testContainerItemChangeReflectedInShared() proves that if item 1 of myItem is a container, changing the contents of myItem is reflected in myList.
C/C++ or Java programmers might recognize that container items act a lot like pointers or references in their languages. They would be correct, to a point. This article pointedly avoids the term "reference" to refer to container types, to make it more accessible to scripters not familiar with those languages, and to avoid confusion with AppleScript's actual reference type, discussed in the next section. Care should be taken, especially when reading AppleScript mailing lists, as container items are often called "references", despite not being associated with AppleScript's actual reference type. AppleScript's container types are different from the reference type because they are automatic, and don't have any of the restrictions of actual references. The test script tries to avoid using the term "reference" as a substitute for "container types", but sometimes uses it for brevity.
The Reference Type
In vanilla AppleScript, the reference type exists to allow the scripter to point to scalar data. Try running the following handler:
on run
set x to true
set y to a reference to x
set contents of y to false
log "x = " & x as string
log "y = " & y as string
end run
(*Event Log Output:
x = false
y = false
*)
Here y is a reference that shares data with x. Remember that the difference between scalar types and container types is that container types point to data "over there". The reference type points to data "over there" as well, and reflects changes to the shared data just like container items do.
Below is a snippet of code, based on the changeAReferenceType() handler from the test script:
set refProp to "world"
set y to a reference to refProp
set z to a reference to refProp
set y to "hello"
set contents of z to "foo"
--refProp is equal to "foo"
--and y is equal to "hello"
--and contents of z is equal to "foo"
First it's important to note that refProp is actually a property. We'll get into why later in this section, but for now just remember that fact if you're trying to reproduce this test at home.
Lines 1-3 of the handler create our three variables, pictured in Figure 7:
Figure 7: Initial state of refProp, y and z
Line 4 changes the value of y to "hello". Remember that container types have two different types of values: what they point to, and the contents of the variable. In Figure 7, y points to refProp, and points at the value "world". Reference types work the exact same way. Figure 8 shows the state of y after line 4.
Figure 8: set y to "hello"
Here we have only changed what y points to, so z has the same value as it had before.
Now, in line 5, we change the value z points to, using AppleScript's contents of syntax. Contents of allows us to change the value being pointed to by the reference type.
Figure 9: set contents of z to "foo"
Here we are changing the value z points to, the value of refProp.
Reference types seem like they would be useful, but they have some pretty limiting drawbacks, according to AppleScript: The Definitive Guide (Section 11.5), and our tests here. Try running the following script:
to main()
set x to true
set y to a reference to x
set the contents of y to false
--AS runtime error: "can't make 'x' into type reference"
end main
main()
While y points to x, we get an error when we try to set the contents of the reference. AppleScript: The Definitive Guide has a work-around: make x a property. This is what changeAReferenceType() does with refProp.
property x : true
on main()
set y to a reference to x
set the contents of y to false
log "x = " & x
log "y = " & y
end main
main()
(*Event Log Output:
x = false
y = false
*)
Further experimentation leads to an interesting fact: without the property work-around, a reference to creates a useful reference only if you are in the (implicit or explicit) run handler, as the following script proves.
on run
set x to true
set y to a reference to x
set contents of y to false
log "x = " & x as string
log "y = " & y as string
end run
(*Event Log Output:
x = false
y = false
*)
For those of you not aware: when running a script, AppleScript first looks for a handler explicitly named run() to execute. If it doesn't find one, it executes any commands that are not in a handler, treating them as the implicit run handler.
Referencing a list in the run handler:
on run()
set x to {true, " foo"}
set y to (a reference to x)
set contents of y to {false, " bar"}
log "x = " & x as string
log "y = " & y as string
end run
(*Event Log Output:
x = false bar
y = false bar
*)
The reference type seems like a useful way to save memory (and time) when you want one variable to mirror a scalar variable's value. In reality, thanks to the restrictions of the reference type, it is probably better and easier to use a container type instead.
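As a rough sketch of that alternative, a one-item list can stand in for the scalar. Because the list's data is shared, a change made inside another handler is visible to the caller, with no property and no run handler required. The names flipFlag, box, and shareThroughList are hypothetical and chosen only for this illustration.

```applescript
-- Illustrative sketch: flipFlag and shareThroughList are hypothetical names.
to flipFlag(box)
	-- box shares data with the caller's list, so this change is visible there
	set item 1 of box to false
end flipFlag

to shareThroughList()
	set flag to {true} -- a one-item list stands in for the scalar
	flipFlag(flag)
	return item 1 of flag
end shareThroughList

shareThroughList()
--> false
```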
Beyond our explorations of the reference type, the different mechanisms of copying can confuse scripters and could lead to slower than expected performance in scripts. A later section of this article covers this important topic.
Conclusions on Variable Types
Knowing the difference between scalar and container types is an important step in understanding AppleScript's behavior in certain circumstances (like using the set command). AppleScript's reference type, while not as useful or powerful as other types, is important to know about and may prove useful in your script writing.
The next section shows several test handlers that further illustrate the difference between scalar and container types. Now, test your knowledge by trying to guess the output of the event log of the following handlers, based on the information in the previous section.
Test Your Knowledge!
This section is composed of four handlers from this article's test script. See if you can guess what the output of the event log will be in each case.
The Test
on xAnYScalars()
set x to true
set y to x
set y to false
log "x = " & x
log "y = " & y
end xAnYScalars
on xAndYList()
set x to {true}
set y to x
set item 1 of y to false
log "x = " & x as string
log "y = " & y as string
end xAndYList
on xAndYListCopy()
set x to true
copy x to y
set y to false
log "x = " & x as string
log "y = " & y as string
end xAndYListCopy
on copyWithLists()
set x to {true}
copy x to y
set item 1 of y to false
log "x = " & x as string
log "y = " & y as string
end copyWithLists
The Answers
xAnYScalars() Event Log
(*Event Log Output:
x = true
y = false
*)
xAndYList() Event Log
(*Event Log Output:
x = false
y = false
*)
xAndYListCopy() Event Log
(*Event Log Output:
x = true
y = false
*)
copyWithLists() Event Log
(*Event Log Output:
x = true
y = false
*)
The Answers, Explained
These tests show the basics of how AppleScript's variables react differently to different situations. First, in xAnYScalars(), we have a test that sets two scalars. The line set y to x copies the value from x and puts it into y. Since y is a copy, we can change its contents without the change being reflected in x.
xAndYList() is different in that it deals with container types. Using set with a container type only points to the original data. Thus, in xAndYList(), y shares x's data, claiming it as its own, and changing the value of any element inside of y will be reflected in x.
You might expect the same results from copyWithLists() as those from xAndYList(). That is not the case, because this handler uses the copy keyword, which duplicates all of the data, down to every last scalar value, from x into y. This duplication (also called a "deep copy") means that there are two different copies of {true}, and changing one copy does not affect the other. It is like duplicating a file in the Finder and then modifying one of the copies: one file contains the new data, while the other remains the same as it was before. (xAndYListCopy() applies copy to a scalar, which set would have duplicated anyway, so its output matches that of xAnYScalars().) Copy operations are covered later in this article.
The next section will take the knowledge of scalars and container types, and apply it to the simple operation of adding an item to the end of a list.
Appending To A List Under the Microscope
Appending an item to a list happens a fair deal in AppleScripts, and can be a source of slowness if not done correctly. In the following sections we'll investigate several different ways to append an item to a list, discussing both speed and differences in the operations performed. Space considerations prevent detailed analysis with all permutations, but we'll cover the more interesting ones in this article. Other append constructs can be found in the test script.
By downloading the test script you can see the timings of the tests yourself, on your machine, and also further examine the handlers this author used to gather the data presented in this series of articles.
The benchmark machine for these tests was an (admittedly ancient) 400 MHz PowerBook G4 running OS X 10.4.3 with 768 MB RAM. Timings are given in ticks; one tick is 1/60th of a second.
copy myList & myListItem to myList
copy myList & myListItem to myList copies both myList and myListItem: changes made to either variable afterwards are not reflected in the new list. twiddleCopyListItemList() shows the same duplication at work: deepCopy, built with copy, is a duplicate of myList instead of sharing data with it.
to twiddleCopyListItemList()
set a to 1
set myList to {{a}, {a}, {a}, {a}, {a}}
copy myList to deepCopy
set shallowCopy to items of myList
set item 1 of item 4 of deepCopy to 42 -- not reflected
set item 1 of item 5 of myList to 23 -- reflected
return (myList is equal to {{1}, {1}, {1}, {1}, {23}}) and ¬
(deepCopy is equal to {{1}, {1}, {1}, {42}, {1}}) and ¬
(shallowCopy is equal to {{1}, {1}, {1}, {1}, {23}})
end twiddleCopyListItemList
This function will be revisited later in the article, but note how the change to item 4 of deepCopy does not change the value of shallowCopy or myList. We used copy to construct deepCopy, making a duplicate of the data. Also notice how changing item 5 of myList is reflected by shallowCopy.
Now, the timing of this construction: timeCopyListAndListToList() took a minimum of 0 ticks, max of 3, and averaged 1.11 ticks on the benchmark machine.
set myList to myList & myListItem
The next way to append an item to a list is to use set myList to myList &... There are things to be aware of while appending container items to a container. The first is that appending two lists together with this method results in a "flat" list. Take the following handler, from the test script:
to setListToListAndContainerListItemAppend()
set a to {1, 3}
set b to {2}
set b to a & b
return b is equal to {1, 3, 2}
end setListToListAndContainerListItemAppend
b grew to accommodate all of a, creating a new item for every item of a. b is now 3 items long (count of a + count of b = 3.)
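If a nested result is wanted from the & form, one option is to wrap the appended list in an extra pair of braces, so that it arrives as a single item. This is a small sketch rather than a handler from the test script; flatVersusNested is a hypothetical name.

```applescript
-- Illustrative sketch: flatVersusNested is a hypothetical handler name.
to flatVersusNested()
	set a to {1, 3}
	set b to {2}
	set flatList to b & a -- the items of a are merged in
	set nestedList to b & {a} -- a arrives as a single item
	return {flatList, nestedList}
end flatVersusNested

flatVersusNested()
--> {{2, 1, 3}, {2, {1, 3}}}
```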
One must be careful when initially creating myList, to avoid inadvertently changing values you didn't expect, as shown in the following handler:
to testEmptyListsChangesContents()
set emptyList to {}
set myItem to {1}
set myList to emptyList & myItem
set item 1 of myList to 123
set myList to myList & myItem
return (myItem is equal to {123} and myList is equal to {123, 123})
end testEmptyListsChangesContents
One might have expected myList to be {123, 1}, instead of {123, 123}. On line 3 of this handler, myList behaves as if we had used the "set myList to myItem" syntax, where myList simply reflects myItem, instead of having item 1 of myList share myItem. Figure 10 shows this distinction.
Figure 10: Seen vs expected behavior in testEmptyListsChangesContents()
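One way to sidestep this surprise, sketched below under the assumption that an independent starting list is what you want, is to create myList with copy so that it never shares data with myItem. The handler name startFromCopy is hypothetical.

```applescript
-- Illustrative sketch: startFromCopy is a hypothetical handler name.
to startFromCopy()
	set myItem to {1}
	copy myItem to myList -- a duplicate, so nothing is shared with myItem
	set item 1 of myList to 123
	set myList to myList & myItem
	return {myItem, myList}
end startFromCopy

startFromCopy()
--> {{1}, {123, 1}}
```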
timeSetListToListAndList(), in the test script, gave us a min value of 0 ticks, a max of 2, and an average of .8 ticks. This is very close to the timing for set myList to myList & myListItem, timed with timeSetListToListAndScalar() (found in the test script).
set end of myList to myListItem
AppleScript's syntax has many nice features for working with lists: first of, end of, rest of, and some item of, to name a few. testFirstOfReassignRefItem() in the test script shows, for example, that setting first of replaces the first item of the list (instead of creating a new first item, such that the previous first item becomes the second item, and so on).
The end of syntax appends an item onto a list. For an example, examine this handler from the test script:
to testSimpleEndOfWorks()
set a to {2}
set b to {1}
set end of b to a
return b is equal to {1, {2}}
end testSimpleEndOfWorks
Notice that item 2 of b is {2}, and not simply 2 as it would be with a set myList to myList & anotherList construction. The result is nested, instead of flat.
A word to the wise with this construction: do not append a variable to itself using this technique. At best it will result in a stack overflow when you go to display it; at worst it could cause an infinite loop. The same applies even if the list shares data with another variable and you try adding that variable onto the list. simpleAddSelfToEndOfSelfErr() of the test script shows trying to append an item onto itself, and the handler addSelfToEndLogOverflowErr() shows the more exotic case.
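If the goal is to append the list's current items as a nested item, one possible workaround (an assumption of this sketch, not something taken from the test script) is to append a shallow copy made with items of, so the list never ends up containing itself. The handler name appendSnapshotOfSelf is hypothetical.

```applescript
-- Illustrative sketch: appendSnapshotOfSelf is a hypothetical handler name.
to appendSnapshotOfSelf()
	set myList to {1, 2}
	set snapshot to items of myList -- a separate list holding the same items
	set end of myList to snapshot -- appends the snapshot, not myList itself
	return myList
end appendSnapshotOfSelf

appendSnapshotOfSelf()
--> {1, 2, {1, 2}}
```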
Another difference with the end of syntax vs. the set... to... syntax is that changes are shared. The handlers setEndOfSharedData() and addToOneNotShared() in the test script prove this, but they can be distilled with two simple examples:
set a to {1}
set b to a
set b to a & 45
a
--> {1}
compared to:
set a to {1}
set b to a
set end of b to 45
a
--> {1, 45}
On line 3 of the first construct, it is like b is created all over again, which would explain the discrepancy in speed between this and set end of. It also explains why, in the second snippet, a contains {1, 45} while in the first example, a remains {1}. In other words: the second example acts on b itself, and since b shares data with a, logging a at the end of the second script will show it identical to b, because b is just a reflection of a.
Being able to refer to the end of the list has some timing advantages. The benchmark machine took a minimum of 0 ticks to complete timeSetEndOfListToList(), 1 tick max, and an average of .09 ticks. Compare this to set myList to myList & myListItem, and you'll see it was almost 10 times faster!
Conclusions on Appending To Lists
By far the fastest method of appending an item to a list is to use the set end of... to construction. The timeSetEndOfList100ItemsList() and timeSetEndOfList100ItemsScalar() handlers, presented in the test script, append 100 lists and scalars with 100 items, respectively, to a list, constructing a 1000 item list. On the benchmark machine, these tests were identical (min of 0, max of 1, average of 0.13). The test script also shows timed examples that construct this kind of list using copy myList & myItem to myList (timeCopyScalarAndListToList100Items() and timeCopyListAndListToList100Items() respectively). These two functions take an average of 2.5 ticks per run on the benchmark machine, proving that careful use of AppleScript can speed up your scripts, even with something as simple as appending items to lists.
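For readers without the Jon's Commands OSAX, a rough comparison can be sketched with current date. This is not one of the test script's timing handlers: timeAppend is a hypothetical name, and because current date only resolves whole seconds (not ticks), the item count of 5000 is an arbitrary choice that may need raising on a fast machine.

```applescript
-- Illustrative sketch: timeAppend is a hypothetical handler, not one from the test script.
-- It uses current date, so results are in whole seconds rather than ticks.
to timeAppend(useEndOf, howMany)
	set myList to {}
	set startTime to current date
	repeat with i from 1 to howMany
		if useEndOf then
			set end of myList to i -- append in place
		else
			set myList to myList & i -- build a new list each time
		end if
	end repeat
	set elapsedSeconds to (current date) - startTime
	return {elapsedSeconds, count of myList}
end timeAppend

-- Raise the item count if both timings come back as 0.
{timeAppend(true, 5000), timeAppend(false, 5000)}
```

Each result pairs the elapsed seconds with the final item count, so you can confirm both forms built the same size of list before comparing their times.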
The next section discusses the different methods to copy variables in AppleScript. Along with appending to a list, copying variables draws on much of our previous knowledge about AppleScript's variable types.
The Right Copy For The Right Situation
AppleScript has two different types of copy: a shallow copy and a deep copy. Early in this article, we mentioned AppleScript's copy construction. The copy keyword performs a deep copy: it duplicates all of the data in a variable, even if you have scalars nested inside of lists inside of records inside another list. A deep copy of such an object would be very expensive because of all the data involved: AppleScript would duplicate each and every item in this complex structure.
An alternative would be a shallow copy. Like a deep copy, a shallow copy creates a new list with an item for every item in the original list. Unlike a deep copy, each container item in the new list shares the same data as the corresponding original item instead of duplicating it. Copying our list with the deeply nested scalar would be very fast with a shallow copy, because AppleScript would share the original data with our new list. To expand on our earlier example of aliases, shallow copying a list simply creates another alias "pointing to" the same file as the first.
Let's examine these two kinds of copy individually:
Shallow Copy
A shallow copy creates items that share data with their respective items in the original list. This means that when you shallow copy, data is not duplicated, but the items of the new list share their data with the items of the original list.
When you create a shallow copy, items added or removed from either list are not reflected in the other list, but changing the contents of an item will be reflected.
You can shallow copy a list in AppleScript by using the items of syntax.
set shallowCopy to items of myList
It is important to reiterate here: the items of shallowCopy point to the data shared by myList directly, they are not depending on myList for anything beyond the initial setup. We can change an item on myList to point somewhere else, yet shallowCopy remains the same.
to shallowCopyChangeOne()
set a to {1}
set b to {2}
set myList to {a, b}
set shallowCopy to items of myList
set item 1 of myList to 0
return (myList is equal to {0, {2}}) and (a is equal to {1}) and ¬
(shallowCopy is equal to {{1}, {2}})
end shallowCopyChangeOne
Figure 11 shows what shallowCopy looks like when we create it on line 4 of shallowCopyChangeOne().
Figure 11: shallowCopy as created.
Notice how both items of the lists share the same data, but are two separate sets of items. You can really see this in line 5, where we set item 1 of myList to 0.
Figure 12: set item 1 of myList to 0
Line 5 is setting what item 1 of myList points to. Since myList and shallowCopy are separate entities, and we did not change the data shared between them (a), our change does not affect anything outside of myList.
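The earlier point that items added to one list are not reflected in the other, while changes to shared contents are, can be sketched the same way. The handler name shallowCopyAddItem is hypothetical and does not come from the test script.

```applescript
-- Illustrative sketch: shallowCopyAddItem is a hypothetical handler name.
to shallowCopyAddItem()
	set myList to {{1}, {2}}
	set shallowCopy to items of myList
	set end of myList to {3} -- new item: exists only in myList
	set item 1 of item 1 of myList to 0 -- shared contents: seen by both lists
	return {myList, shallowCopy}
end shallowCopyAddItem

shallowCopyAddItem()
--> {{{0}, {2}, {3}}, {{0}, {2}}}
```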
Unlike shallowCopyChangeOne(), where we changed what item 1 of myList shares, shallowCopyChangeSharedValue() changes the contents of an item shared between the two lists. Watch and see:
to shallowCopyChangeSharedValue()
set a to {1}
set b to {2}
set myList to {a, b}
set shallowCopy to items of myList
copy myList to deepCopy
set item 1 of item 1 of myList to 0
return (myList is equal to {{0}, {2}}) and (shallowCopy is equal to {{0}, {2}}) and ¬
(deepCopy is equal to {{1}, {2}})
end shallowCopyChangeSharedValue
Line 6 changes the value of (item 1) of a, which is shared by both myList and shallowCopy. Since shallowCopy and myList both share the contents of a, the change is shown in both lists.
Shallow copy only has an advantage when dealing with container items. Just like set, scalar items are duplicated, so there is no speed or memory gain for using a shallow copy on a list of scalar values.
Deep Copy
A deep copy is when AppleScript duplicates all of the data inside a container type. This means that each list holds its own copy of the data, and modifying even the contents of one will not affect the other, because there is no sharing going on at any level. You can deep copy a list by using the copy to syntax:
copy myList to deepCopy
Since deepCopy is a copy of the values in myList, we can change one list without affecting the other in any way.
to deepCopyChangeOne()
set a to {1}
set b to {2}
set myList to {a, b}
copy myList to deepCopy
set item 1 of a to 42
return (myList is equal to {{42}, {2}}) and (a is equal to {42}) and ¬
(deepCopy is equal to {{1}, {2}})
end deepCopyChangeOne
Line 4 of this handler duplicates myList, bit for bit, creating deepCopy. Figure 13 illustrates what myList and deepCopy look like after line 4. Line 5 of this handler changes item 1 of a. The data of a is shared by myList, but not by deepCopy, thus the change is not reflected in deepCopy.
Figure 13: myList vs deepCopy
Our test framework's twiddleCopyListItemList() shows that one can change a scalar value in one list and not have it reflected in the deep copied list. twiddleCopyListItemList() is a more complex example of the same topics shown by deepCopyChangeOne():
to twiddleCopyListItemList()
set a to 1
set myList to {{a}, {a}, {a}, {a}, {a}}
copy myList to deepCopy
set shallowCopy to items of myList
set item 1 of item 4 of deepCopy to 42
set item 1 of item 5 of myList to 23
return (myList is equal to {{1}, {1}, {1}, {1}, {23}}) and ¬
(deepCopy is equal to {{1}, {1}, {1}, {42}, {1}}) and ¬
(shallowCopy is equal to {{1}, {1}, {1}, {1}, {23}})
end twiddleCopyListItemList
See how changing the contents of item 4 of deepCopy does not modify myList (because deepCopy has duplicated all data), but changing item 5 of myList is reflected by both myList and shallowCopy, because that value was shared by both variables.
Copying and scalars
As discussed earlier in the article, care should be taken to remember that set performs a deep copy on scalar items.
Conclusions on Types of Copy
So when do we want a shallow copy, and when would we prefer a deep copy? In situations requiring speed, or recursion, a shallow copy is probably your best bet. A deep copy is useful when your script is going to be changing values of a list, but you need to leave the original values intact. Duplicating all this data has a price, and scripters must be aware what exactly copy is doing, why it takes so long, and the alternative.
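As a sketch of the deep copy case, a handler that needs to modify a list it was given can work on a duplicate made with copy, leaving the caller's list intact. The names doubledCopy and keepOriginalIntact are hypothetical and chosen only for this illustration.

```applescript
-- Illustrative sketch: doubledCopy and keepOriginalIntact are hypothetical names.
to doubledCopy(theValues)
	copy theValues to workingList -- deep copy: the caller's list stays intact
	repeat with i from 1 to count of workingList
		set item i of workingList to (item i of workingList) * 2
	end repeat
	return workingList
end doubledCopy

to keepOriginalIntact()
	set sourceList to {1, 2, 3}
	set doubledList to doubledCopy(sourceList)
	return {sourceList, doubledList}
end keepOriginalIntact

keepOriginalIntact()
--> {{1, 2, 3}, {2, 4, 6}}
```

Had the handler used set workingList to theValues instead, the doubling would have changed sourceList as well, because the two variables would share the same data.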
Conclusion
This article has covered a lot of ground: the three different classifications of AppleScript variable types (scalar types, container types, and other types, with a detailed look at the reference type), the different ways of appending items to a list, and picking the correct copy type for the correct situation. Even without interacting with applications, AppleScript is a language with a lot of power and, like real-world human languages, different ways to express concepts. By understanding these concepts, the best possible solution can be chosen, and your scripts can run better and faster. A faster script means everybody wins: the scripter, the end users, and the overall system. Happy Scripting and best of luck on your further AppleScript projects!
References
AppleScript Language Guide, Apple Computer: http://developer.apple.com/documentation/AppleScript/Conceptual/AppleScriptLangGuide/
AppleScript: The Definitive Guide, Matt Neuburg, ISBN: 0-596-00557-1: http://www.oreilly.com/catalog/applescpttdg/
The author would like to take the time to thank (in alphabetical order): Jared Barden, Hamish Sanderson, and Matthew Strange for their review and technical insights during the creation of this article.
Ryan Wilcox is the founder of Wilcox Development Solutions (www.wilcoxd.com) specializing in carbonization, cross-platform application development and e-commerce solutions. While leaning mostly towards Python folk, he does sometimes get away to use AppleScript. He can be reached at rwilcox@wilcoxd.com