Format of the V2 Image File

Introduction:

The format of a Smalltalk-80 image file is neither specified in the Blue Book nor published elsewhere. I used another source, which I received from Dan Ingalls, who kindly came to help with the Hobbes Emulator, which he ported to Squeak from a VisualWorks implementation written by Vassili Bykov. I thank both of them for supporting this study of Smalltalk-80, one of the two or three genuine gems of computing science.

The Hobbes distribution includes a V2 image file, whose transcript view says that the snapshot was taken on 31 May 1983 at 10:37:52 am, and a sources file whose banner reads:

    'From Smalltalk-80, version 2, of April 1, 1983 on 31 May 1983 at 9:10:35 am'

This sources file can be found on the internet by googling for ST-80 classes that are not provided by newer Smalltalks. I started with 'Opaque Smalltalk-80' and then narrowed the search until I found what I wanted with: 'Opaque BitBlt Smalltalk-80 Form methodsFor'. Thanks to the anonymous person who put it there!

I couldn't resist elaborating on a programming error of Vassili's. It serves just too well to teach three lessons: one about counting, one about reuse, and one about rocking the boat by neglecting the standard as defined by Smalltalk-80.

Since Hobbes reads the image file and then uses Squeak's native object memory, it does not need all the values needed by the Blue Book object memory implementation. These values might be stored in the image file as well, but I have no idea where and why. These unanswered questions are marked by '???'.

The Green Book mentions a memo distributed with the image to assist in programming the VM. Could it be that this memo answers those questions? It could be! In the meantime I received some pages of that memo from Mario Wolczko (marioAtwolczkoDotcom) via Luc Verbist (lucvAtetzouDotnet). Thanks again! You can view them at http://www.ba-stuttgart.de/~helbig/st80/memo/

Last but not least, Klaus D.
Witzel (klausDotWitzelAtcobss) gave a very obvious answer regarding the oop of the active context, one of the root objects.

(End of Introduction)

Layout of the image file:

The image file starts with a 512-byte header. See Version2Snapshot>>headerSize. The first two big-endian 32-bit double words in the header specify the number of 16-bit words in the object space and in the object table, respectively. The rest of the header is zero. See Snapshot>>headerDoubleWord, Version2Snapshot>>objectSpaceWordSize and Version2Snapshot>>objectTableWordSize, and use Squeak's debugger.

The header is followed by the object space. The object space is followed by the object table, which starts at the next 512-byte boundary. The entries of the object table are as described in the Blue Book.

The object space is a sequence of heap segments of 64K 16-bit words each, containing the chunks as described in the Blue Book. The method Version20TEntry>>objectAddress reveals that an index into the object space is computed by concatenating the 4 segment bits with the 16 location bits of the entry. This works only if every segment except the last is padded to a size of 64K words, the first segment is number 0, and the segments are laid out without gaps.

(End of Layout of the image file)

Error:

To compute the position of the object table in the file stream of the image file, that is, the next 512-byte boundary after the object space, the routine Version2Snapshot>>positionAtObjectTableStart tries:

    objectTableStart <- (objectSpaceEnd // 512 + 1) * 512.

If objectSpaceEnd is a multiple of 512, objectTableStart will be off by 512, so the first 512 bytes of the object table are skipped. The rectified expression is:

    objectTableStart <- (objectSpaceEnd + 511) // 512 * 512.

With the V2 image you get objectSpaceEnd = 518272 = 16r7E880, which is not a multiple of 512 = 16r200. Thus the error is harmless with the V2 image.

Remark: This error is typical of programmers from 1Planet, whereas 0Planet programmers are known to be immune to it.
To find out your home planet, concatenate the first natural number and 'Planet'.

What does the home planet have to do with this error, you ask? Watch how a native of 0Planet derives an expression which is guaranteed to yield the correct position of the object table. Always.

Assume s is the combined size in bytes of the header and the object space. Then the half-open interval [0..s) is the range of indexes of all bytes of the header and the object space. The number of indexes is s, which is the difference between the upper bound and the lower bound of the half-open interval. The number nb of 512-byte blocks needed to store s bytes is known to be nb = (s + 511) // 512. A 0Planet native would view nb as the number of blocks before the start of the object table, and nb*512 likewise as the number of bytes. Then [0..nb*512) is the range of indexes needed for the header and the object space, and [nb*512 .. fs), with fs being the size of the file, is the range of indexes of the bytes of the object table. So in C, the mother tongue on 0Planet, you'd set the position of the file to nb*512 with the lseek() system call.

But, you might say, Smalltalk is the native language of 1Planet, that is, the first index is one and not zero. So nb*512+1 is the index of the first byte of the object table. Right! But to reconcile the inhabitants of 0Planet with those of 1Planet, the protocol of PositionableStream mandates that the position is incremented by one *before* it is used as an index into the underlying collection. Therefore, positions in PositionableStreams are to be treated the same as positions in Unix files.

As EWD [EWD0] pointed out, Euclid [EUK0] and his contemporaries lived on 2Planet. This entailed a clumsy proof of the Euclidean Algorithm [EUK1], which yields the greatest common divisor of two numbers. Euclid needed to handle a special case when the greatest common divisor is one. But neither the algorithm nor its proof contains any errors.
That is not always true of the output of us humble programmers. We had therefore better move to 0Planet.
(End of Remark)

Remark: This error could have been avoided if the programmers had known their libraries, in this case the protocol of ExternalStream. They would have coded the method like this:

    positionAtObjectTableStart: aStream
        aStream padTo: 512

assuming that aStream is positioned at the end of the object space, which is a valid assumption when the message is received.
(End of Remark)

Remark: Neither ExternalStream nor a padTo: method is provided by Squeak 3.6. So the above remark does not apply when programming in Squeak, which is bad for the remark and worse for Squeak.
(End of Remark)

(End of Error)

Snapshot Related Globals:

Sizes of the three components of the V2 image file as distributed with the Hobbes Emulator:

    Component            base 16    base 10
    ---------------------------------------
    Header                   200        512
    Object Space           7E680     517760
    Gap to next block        180        384
    Object Table           12EA0      77472
    Sum                    918A0     596128
    V2.0 image             92000     598016

This leaves a difference of 16r760 = 1888 bytes for an unknown reason.

(End of Snapshot Related Globals)

Question: All 16-bit entities, like words, integers and object pointers, are composed of two bytes, one with a lower address and one with a higher address. Which of them is the more significant byte of the entity?

Answer: The more significant byte has the lower address. The more significant byte of a word is indexed with 0. This is big endian and reflects the Alto hardware architecture. The most significant bit in a word is indexed with 0. This is unusual and reflects something unknown to me.
(End of Answer)

Question: How does the ObjectMemory know the object pointers of the Integer class and the CompiledMethod class?

Answer: The oop of the CompiledMethod class is 34 (at least in the V2 image, see Version20Entry>>compiledMethodClassOop). The oop of the Integer class is 12 (see memo/page07.jpg).
(End of Answer)

Question: Where is the head of the free entry list stored?
Answer: The free entries are marked accordingly. But the list has to be rebuilt at load time by the VM (memo/page03.jpg).
(End of Answer)

Question: Where are the heads of the free chunk lists stored?

Answer: There is no free chunk in the image's object space. All memory is occupied.
(End of Answer)

Question: The marking collector needs to know the root objects. These are the system dictionary and the current process. Where are their oops saved in the image file?

Answer: They are not saved in the image file. The system dictionary is the value of an association whose object pointer is 18 (memo/page07.jpg). The VM routine activeProcess returns the current process (Blue Book, p. 644).
(End of Answer)

References:

[EWD0]: Dijkstra: "0 Why Numbering Should Start at Zero" in "Formal Developments of Programs and Proofs", 1990, pp. 209, 210. See also http://people.squeakfoundation.org/article/57.html and http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html

[EUK0]: Euclid: "The Elements", Book 7, Definitions 1 and 2. See also http://aleph0.clarku.edu/~djoyce/java/elements/bookVII/defVII1.html

[EUK1]: Euclid: "The Elements", Book 7, Propositions 1 to 3. See also http://aleph0.clarku.edu/~djoyce/java/elements/bookVII/propVII1.html

(End of References)

(End of Format of the V2 Image file)

Author: Wolfgang Helbig (helbigAtLehreDotBA-StuttgartDotDE)

Changes:
15.06.2006 Questions answered by the memo from Mario.
16.06.2006 Last question answered by Klaus Witzel. References added. Replaced "context of active process" by "current process".
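
Appendix:

The arithmetic of the Error section and of the size table can be rechecked with a few lines of code. The following is only a sketch, written in Python rather than Smalltalk because Python's // floors exactly like Smalltalk's //. The function and variable names are my own and do not appear in Hobbes; the sizes are the ones quoted in the table of the Snapshot Related Globals section.

```python
BLOCK = 512  # block size in bytes used by the image file layout


def buggy_table_start(object_space_end):
    # Transcription of the expression quoted from
    # Version2Snapshot>>positionAtObjectTableStart (hypothetical name).
    return (object_space_end // BLOCK + 1) * BLOCK


def fixed_table_start(object_space_end):
    # Rectified expression: round up to the next multiple of BLOCK,
    # leaving exact multiples unchanged.
    return (object_space_end + BLOCK - 1) // BLOCK * BLOCK


# Sizes in bytes, taken from the table in the text.
header = 0x200
object_space = 0x7E680
object_table = 0x12EA0
image_size = 0x92000

object_space_end = header + object_space             # 518272
table_start = fixed_table_start(object_space_end)    # 518656
gap = table_start - object_space_end                 # 384
accounted = table_start + object_table               # 596128
unexplained = image_size - accounted                 # 1888

if __name__ == "__main__":
    # Both expressions agree for the V2 image ...
    print(buggy_table_start(object_space_end), table_start)
    # ... but differ as soon as the end falls on a block boundary.
    print(buggy_table_start(1024), fixed_table_start(1024))
    print(gap, accounted, unexplained)
```

For an end position of 1024 the original expression yields 1536, whereas the rectified one yields 1024, which is the off-by-512 case described in the Error section.

(End of Appendix)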