163. BASIC program format
     ~~~~~~~~~~~~~~~~~~~~
This is how a BASIC program is actually stored in memory.

The program starts at the current value of PAGE, which denotes the lower edge of available memory. If you PRINT~PAGE you should get something like E00 or 1900, depending on whether or not you have a DFS etc. fitted. The value ('byte') stored in this 1st location is &D (Hex) or 13 (Dec), which is the code for a Carriage Return (CR). Whenever a program is present, PRINT~?(PAGE) should indeed give "D". (If you have a utility ROM like Disc Doctor, then you have a much easier way of examining memory.)

The next byte is the Most Significant Byte (MSB) of the 1st linenumber, and the one after is the Least Significant Byte (LSB). You can check this as follows. Suppose the 1st linenumber is 100. The MSB is given by PRINT~100DIV256, and the LSB is given by PRINT~100MOD256. This should give you 00 and 64 respectively (note the convention of expressing Hex bytes as 2 digits). Thus PRINT~?(PAGE+1),~?(PAGE+2) should indeed give you 00 and 64. Note that the MSB is always 00 for a linenumber of 255 or less; more about that later.

The next byte, ie at (PAGE+3), is the total length of the 1st line of the program. This count includes the two linenumber bytes, the length byte itself, and the Carriage Return (0D) byte at the end of the line.

Note that BASIC keywords such as REM and PRINT are stored as just 1 byte each, and that some also include the opening bracket, eg MID$( . You can look up these single-byte 'tokens' in your User Guide, indexed under "Tokens" or "Basic Tokens". For example, the keyword NEXT is tokenised as ED.

As a simple example, the BASIC line 110NEXT would be coded as 00 6E 05 ED 0D. Note that there would be another 0D immediately before the 00 in the unlikely event of this being the 1st line in the program, but the length byte would remain as 5.
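The line layout described above can be sketched away from the machine as a short Python function. This is only an illustration under stated assumptions: the name encode_line is made up, the body of the line is passed in already tokenised as raw bytes (no tokenising is attempted), and the leading 0D at PAGE is left out because it is not counted in the length byte.

```python
def encode_line(number, body):
    """Build one stored program line: MSB, LSB, length, body, CR.

    `body` is the already-tokenised text of the line as bytes.
    (Hypothetical helper for illustration only.)
    """
    msb = number // 256   # same result as PRINT~number DIV256
    lsb = number % 256    # same result as PRINT~number MOD256
    # The length counts both linenumber bytes, the length byte itself,
    # the body, and the closing Carriage Return (0D).
    length = 2 + 1 + len(body) + 1
    if length > 255:
        raise ValueError("line too long for a one-byte length")
    return bytes([msb, lsb, length]) + bytes(body) + b"\x0d"


# 110NEXT: the keyword NEXT is the single token ED.
line = encode_line(110, b"\xed")
print(" ".join(f"{b:02X}" for b in line))   # 00 6E 05 ED 0D
```

Because the length byte includes the three header bytes and the final 0D, even an empty body gives a stored line of 4 bytes.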
The following lines are coded in exactly the same way, and the end of the program is indicated by an FF immediately following the 0D at the end of the final program line. (Do not confuse this FF with the keyword token for the BASIC word END, which happens to be E0.) The very next location after this, which is the 1st 'free' one after the program, is denoted by TOP. Thus if you PRINT~?(TOP-1) you always get FF.

As a practical example, here is a simple 2-line program fully tokenised, with a 'translation' underneath. You will notice that the code for a Space is 20 in Hex, and this is very commonly found in programs!

   10REM HI
   20PRINT"LO"

PAGE                                                 TOP-1
 v                                                     v
0D 00 0A 08 F4 20 48 49 0D 00 14 09 F1 22 4C 4F 22 0D FF

   Byte(s)   Meaning
   0D        Return
   00 0A     Linenumber 10
   08        Line length 8
   F4        REM
   20        Space
   48 49     H I
   0D        Return
   00 14     Linenumber 20
   09        Line length 9
   F1        PRINT
   22        "
   4C 4F     L O
   22        "
   0D        Return
   FF        End of program

Cynics among you will point out that there doesn't seem to be any memory saving as a result of tokenising. However, you must remember that each line takes up at least 4 bytes before you've started, ie the two linenumber bytes, the length byte, and the 0D at the end. Thus, for very short lines such as I have used in the example, there is little or no saving in memory space, particularly as I have only used 2 keywords. The more you cram onto one line, and hence the fewer linenumbers you use in the program, the better the economy. Typically, a tokenised program contains about 70% of the characters counted in a listing.

There is just one little complication, and that is when a GOTO or GOSUB or THEN (implied GOTO) destination linenumber is encountered. You might expect to find the appropriate token, E5, E4 or 8C, followed by 2 bytes for the destination. In fact, the linenumber appears as 4 bytes; the 1st is always 8D as an indicator, followed by 3 bytes which are encoded in a really weird way.

One final little quirk. When you press BREAK or type NEW, the 2nd byte of the program, at PAGE+1, is changed to FF.
This tells the computer to ignore the program, which is still intact, and pretend there is nothing there. When you type OLD, the byte is changed back to 00, and the program 'reappears'.

Now, it so happens that this byte is also the MSB of the 1st linenumber, and since it is reset to 00 rather than to whatever it was before, odd things happen if the 1st linenumber is greater than 255. Try typing in a two-line program with 1000 as the 1st line, and 1010 as the 2nd. Now type NEW followed by OLD, and then LIST, and see what happens to line 1000! The linenumber has been changed to 1000MOD256, which is 232 (since 1000 = 3*256 + 232). In other words, the linenumber bytes have been changed from 03 E8 to 00 E8. Although this is quite harmless, it is more than a little disconcerting to the uninitiated!
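To tie the pieces together, here is a rough Python sketch that walks a copy of the program bytes from PAGE up to the FF marker, and then mimics what NEW and OLD do to the byte at PAGE+1. The names list_lines, do_new and do_old are made up for the illustration, the 'memory' is just a Python bytearray, and the sketch assumes that NEW and OLD do nothing more to the program than change that one byte, as described above.

```python
def list_lines(image):
    """Return (linenumber, body bytes) for each stored line.

    `image` is a copy of memory starting at PAGE. Walking stops when the
    byte after a line's Carriage Return is FF, the end-of-program marker.
    """
    if image[0] != 0x0D:
        raise ValueError("no Carriage Return at PAGE")
    lines = []
    pos = 1                          # PAGE+1: MSB of the 1st linenumber
    while image[pos] != 0xFF:
        number = image[pos] * 256 + image[pos + 1]
        length = image[pos + 2]      # counts MSB, LSB, itself, body, CR
        body = bytes(image[pos + 3 : pos + length - 1])
        lines.append((number, body))
        pos += length                # lands on the next MSB, or on FF
    return lines


def do_new(image):
    image[1] = 0xFF                  # program now looks empty


def do_old(image):
    image[1] = 0x00                  # reset to 00, not to the old MSB


# The two-line example: 10REM HI and 20PRINT"LO"
demo = bytearray.fromhex("0D000A08F42048490D001409F1224C4F220DFF")
print(list_lines(demo))

# Lines 1000 and 1010, each holding just REM (F4), then NEW and OLD.
prog = bytearray.fromhex("0D03E805F40D03F205F40DFF")
do_new(prog)
print(list_lines(prog))                   # [] : nothing appears to be there
do_old(prog)
print([n for n, _ in list_lines(prog)])   # [232, 1010]
```

The walker never needs to understand the tokens themselves: the length byte alone is enough to hop from one line to the next, which is why a single FF at PAGE+1 is all it takes to make the whole program 'vanish'.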