Disk Format Layout

WHAT'S ON A DISK
PROGRAMMERS' WORKSHOP CONFERENCE
MARCH 31/APRIL 7, 1988
by SYSOP JL

If there is anything more confusing than the array of terms used to talk about personal computers, it would have to be the terms used to talk about what's on a disk. While the casual user can get by with little information about his computer, the same cannot be said for his disks and the files he puts on them. This conference will cover what's on a disk in two parts. Programming the disk drive internally is beyond the scope of this article (and my experience or library).

Disk Media :

The disk media is a plastic base material impregnated with a nickel alloy magnetic film. The information is stored in much the same way that music is recorded on tape. A high frequency bias signal is modulated by the data and magnetically impressed onto the surface of the disk. Two methods are commonly used to record the information. Group Code Recording (GCR) is used on Commodore floppy disks. Modified Frequency Modulation (MFM) is used by other floppy disk drives.

Disk formats :

The term 'format' refers both to the way the information is recorded on the surface of the disk and to the logical structure of the information stored on the disk. An unformatted disk is completely blank, like an unused tape. Formatting establishes the logical structure of the disk by recording dummy information on it. This operation permanently erases any information currently stored on the disk and creates the track, sector and directory structure for the disk operating system to use.

The GCR recording on Commodore disks also has a corresponding standard for the logical structure, or format, of the disk. There are also MFM formats, but there is no single standard for them. Two somewhat de facto standards do exist for MFM formats, however: the CP/M format and the MSDOS/PCDOS format. There are several variations of MSDOS format and dozens of variations of CP/M format.
The Commodore disk drives, except for the 1571, only read GCR formatted disks. The 1571 recognizes the format of a variety of CP/M and MSDOS MFM disks and reads them. With appropriate programming you can also write MFM disks with the 1571.

The Commodore GCR format generally allows higher storage capacity on the disk because the number of sectors per track varies, with more sectors per track at the outside of the disk. MFM formats are limited in capacity by the shortest track (cylinder is the MFM term) and the highest recording density that the disk is capable of. One standard MFM format (74 cylinders, 8 sectors per track, 512 bytes per sector) allows for 300k bytes on a double sided disk (including directory info), while the GCR format allows for 350k bytes including the directory info.

Tracks :

For any position of the disk head there is a track that it will follow around the disk as it spins. The 1541 type of disk drive uses head positioning to access up to 40 separate tracks on a disk. This is the limit to which double density disks are certified. The DOS programming allows for record keeping of 35 tracks per disk side. These tracks are numbered 1 through 35, with track 1 at the outside edge of the disk and higher numbered tracks spaced inward toward the center of the disk.

While it is possible to position the heads to tracks 36 through 40 on a disk, there is insufficient space in the block availability map to allow DOS to keep track of them. These tracks are sometimes used by commercial programs for data or copy protection. In order to format a disk to use these tracks it is necessary to have a machine language program run in the disk drive.

The 1571 disk drive uses two heads to read/write both sides of a disk. Side 0 (bottom side) contains tracks 1-35 and side 1 (top side) contains tracks 36-70.

Blocks (Sectors) :

Both 'block' and 'sector' are used to refer to the chunk of data that is accessible from the disk media in one read/write operation.
Each track is broken down into blocks. A block of data requires space along the path of a track. In order to put as much information on a disk as possible, a Commodore format disk contains more blocks in tracks near the outer edge of the disk than near the inner edge:

  Tracks  1-17   21 blocks per track
  Tracks 18-24   19 blocks per track
  Tracks 25-30   18 blocks per track
  Tracks 31-35   17 blocks per track

This gives a total of 683 blocks of data on a disk side, 19 of which (all of track 18) are reserved for disk housekeeping chores. Thus there are 664 blocks available for data storage on a disk side. Blocks on each track are numbered starting at zero, so track 1 has blocks 0 through 20 on it.

Commodore standard disk blocks contain 256 bytes, two of which are used by the DOS to point to the next track and block for a file. This gives 254 bytes of data per block in files.

Each block on a track is separated from its preceding and following blocks by an inter block gap (unrecorded), a synchronizing pattern, the disk ID characters, the track and block number for the following block, and check sum data. This information is used by the drive electronics and DOS to find the desired block for reading or writing and for checking the data. Once the disk is formatted, this information cannot be rewritten or modified except by machine language programs running in the disk drive. All of this is transparent to the user.

Block availability map (BAM) :

In order for the DOS to allocate disk space to files without overwriting existing data, it keeps a permanent record on the disk of which tracks and blocks are allocated to files and which ones are free to be assigned to a file. This information is kept in a BAM on track 18, sector 0. With a 1571 disk drive the BAM for side 1 of the disk is kept in track 53, sector 0. In addition to the BAM, track 18, sector 0 also contains the disk header and format information.

The BAM for side 0 starts at byte 4 and has 4 bytes for each track, tracks 1-35.
The first byte for each track contains the number of sectors free on that track. The next 3 bytes contain bits set to %1 if the corresponding sector is available and %0 if the sector is allocated to a file. In these 3 bytes, bit 0 of the first byte represents sector 0, and so on through bit 7 of the third byte, which represents sector 23 on that track. So to find the bit for any track & sector (the +1 skips over the free sector count byte)...

  byte = 4*track + 1 + int(sector/8) : bit = sector - 8*int(sector/8)

To see if that track & block is free...

  free = (data from byte) and 2^bit

On the 1571 the BAM for side 1 starts at byte 0 of track 53, sector 0. There are 3 bytes for each track, counting tracks 36 through 70. So to find the bit for any track and sector...

  byte = 3*(track-36) + int(sector/8) : bit = sector - 8*int(sector/8)

The bytes which contain the number of free sectors on each track for tracks 36 through 70 on a 1571 are on track 18, sector 0, starting at byte 221.

The disk drive automatically takes care of allocating blocks and marking them in the BAM when a file is saved or written to disk. It also releases blocks and updates the BAM when a file is scratched from disk.

The disk VALIDATE command cleans up the BAM in the event that something happens to cause the BAM to be incorrect (usually leaving sectors identified as being allocated when they are in fact not used). To do this the DOS recreates the BAM by tracing each properly closed file in the directory, keeping track of which blocks are used in each file. After tracing all files it rewrites the BAM to disk and removes any unclosed (splat) files from the directory.

Disk HEADER information :

As noted above, track 18, sector 0 also contains the disk header information.

T18, S0, bytes 0 and 1 contain the track and sector for the start of the disk directory (T18, S1).

Byte 2 contains the ASCII character "A", CHR$(65), which is the disk format identifier. The "A" says that the disk is a 1541/1571/1551/4040 format and is write compatible with any of those disk drives.
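As an aside on the BAM lookup described above, here is a minimal sketch of the arithmetic in Python (chosen only as a neutral notation; nothing here runs on the drive). The function names `bam_location` and `is_free` are hypothetical, and the sketch assumes you already have the 256 bytes of the relevant BAM sector in hand, for example from a sector editor. It also assumes the layout stated in the text: a free-count byte followed by 3 bitmap bytes per track on side 0, and 3 bitmap bytes per track starting at byte 0 on side 1.

```python
def bam_location(track, sector):
    """Return (bam_track, bam_sector, byte_index, bit_mask) for a block.

    Assumes the layout described in the text; names are illustrative only.
    """
    if not 0 <= sector <= 23:
        raise ValueError("sector out of range for a 3-byte bitmap")
    bit_mask = 1 << (sector % 8)  # bit 0 of each byte is the lowest sector
    if 1 <= track <= 35:
        # Side 0: entry for a track starts at byte 4*track in T18/S0.
        # Skip the free-count byte (+1), then pick one of the 3 bitmap bytes.
        return 18, 0, 4 * track + 1 + sector // 8, bit_mask
    if 36 <= track <= 70:
        # Side 1 (1571): 3 bitmap bytes per track from byte 0 of T53/S0.
        return 53, 0, 3 * (track - 36) + sector // 8, bit_mask
    raise ValueError("track out of range")


def is_free(bam_sector_bytes, byte_index, bit_mask):
    """A set bit (%1) means the block is available."""
    return bool(bam_sector_bytes[byte_index] & bit_mask)
```

For example, `bam_location(18, 1)` points at track 18, sector 0, and the bit found there tells you whether the first directory block is marked as allocated.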
If the format identifier in byte 2 is changed to some other value, the disk drive DOS will not allow the disk to be written to by the 1541, 1571, etc. The reason is that some of the Commodore disk drives wrote blocks with different size gaps between blocks. If you were to use one of these disks and write to a block with a 1541 disk drive, the block that you write could overwrite part or all of the trailing gap. This would make the next block on that track inaccessible. To prevent that, this character is checked when the BAM is read, and if it is not an "A" the write operations are disabled.

You can therefore change this byte to some other value to keep the disk from being written to. This effectively 'locks' a disk, or provides a soft write protect. There are programs available that allow you to 'lock' or 'unlock' a disk. It is easy to change this byte to lock the disk, but in order to change it back it is necessary to temporarily override the format check by modifying the disk drive memory after the disk is logged.

T18, S0, byte 3 is the single/double sided flag. The byte is 0 for a single sided disk and $80 for a double sided disk. This byte is not checked by the 1541, which only recognizes single sided disks, but the 1541 always sets it to 0 when it writes a disk header.

If you have a 1571 disk drive and want to format a 1541 single sided disk quickly, you can use the HEADER command in 1571 double sided mode, then reset the computer in C64 mode and use the short form of the format command. When a format command is sent with no ID characters in it, the DOS only rewrites the disk header information without rewriting the dummy information to each track and sector.

T18, S0, bytes 144-161 contain the disk name padded with shifted spaces ($a0).

T18, S0, bytes 162-164 contain the diskette ID characters given to it when it was formatted, followed by a shifted space. The characters in these 3 bytes are only displayed when you list a directory. The actual ID characters for a disk are stored in each block header.
The 2 ID characters written as part of the T18, S0 block header have to match the ID characters that are on each block of a disk or you can get a "DISK ID MISMATCH" error when trying to write to a disk.

It was important on older Commodore disk drives, even early 1541 drives, that this ID be unique for each disk in your disk library. These disk drives do not detect when a disk is removed from the drive and replaced with a different disk. So if you have a file open for writing and the disk is swapped for another disk that has the same ID, you could overwrite other files on the new disk. Later disk drives (1541s with a lever on the diskette door) recognize when a disk is removed and close all open channels. This effectively prevents this type of error, since nothing can be written to the new diskette before the BAM is re-read from the disk. There are also possible error conditions that could occur when reading files from one disk and swapping to a different disk with the file open. To be safe, try to keep unique IDs for all of your disks. This is just good practice.

T18, S0, bytes 165-166 normally contain the characters "2A" for a 1541/1571/4040 formatted disk. These identify the DOS version number and the format as noted above. This information is only printed as part of the directory header and is not checked for reading or writing a disk.

The 5 bytes 162-166 taken together are only listed at the top of a directory listing. You can put your name or initials in these bytes with a sector editor to have them displayed with the disk directory. If you make one of these bytes a shifted-L, the disk directory cannot be LOADed and LISTed on a C64.

T18, S0, bytes 167-170 normally contain shifted spaces. You can put something else in these bytes to internally identify a disk if you want, but they are not displayed anywhere.

Disk Directory :

The disk directory is stored on the remainder of track 18.
It starts on T18/S1 and will use as many of the remaining blocks on T18 as are needed. Each block of the directory contains the directory information for eight files. The directory data structure is covered in detail in each disk drive manual, so I won't repeat it here.

As for all files on a diskette, bytes 0 and 1 of the sector contain the link to the next track and sector of the directory. These bytes are processed internally by the DOS and are not transferred to the computer as part of the file data. So when reading the file you will get just the 254 bytes of data.

For the last block of data in a file the next-track byte is a zero. This is a flag to the DOS that there are no more sectors after the current one, and that the byte which is normally the next sector is instead a pointer to the last byte of data in this sector. Once that last byte of data has been transferred to the computer in a READ operation, the EOI (End Or Identify) status is set, marking the end of the file. If you continue to read bytes after receiving the EOI status you will get the bytes of the current block over again. For the directory file only, the EOI byte pointer (byte 1 of the block) is ALWAYS set to 255 ($ff).

There are three ways to read the directory file on a disk.

You can OPEN the file as a program load file. This is what is normally done when you type LOAD"$",8. The OPEN statement that does the equivalent is OPEN 2,8,0,"$". The secondary address of zero tells the DOS that the file is open as a program LOAD file. This does two things. One, it causes the DOS to send the directory file data in the form of a BASIC program rather than the way it is structured on the disk. Two, it causes the disk drive to leave the drive motor running as long as the file is open.

A second way to read the directory is to open it as a program READ file. This is a normal OPEN command in the form OPEN 2,8,2,"$,P,R".
This form of the OPEN command will cause the disk drive to send the directory data, starting with T18, S0, byte 2, to the computer just as it is on the disk sectors, with automatic chaining of sectors and a normal EOI status returned when all data has been transferred.

The third method of reading the disk directory is to open it as a random access file and read each sector of the directory with the command channel BLOCK READ commands. This form of the OPEN command is OPEN 2,8,2,"$". In order to use this method you must have the disk command channel open, to send the block read commands and buffer pointer commands, before you open the random access channel. If you want to modify the disk directory data on the disk you must use this third form of the directory file operation, since the directory cannot be opened normally for write. Use great care when writing or rewriting directory sectors on a disk, since a programming error can make all files on a disk inaccessible.

Track 18 has a total of 19 sectors, one of which is the BAM. This leaves 18 sectors for the directory, and with 8 file entries per sector there is room for 18 x 8 = 144 files in the directory. The DOS will not let you write more than 144, because that would require allocating another block for the directory and there isn't one available.

One byte in each directory entry which I do want to talk about is the file TYPE byte. This is the first byte of the 30 bytes for each directory entry.

The TYPE byte, bit 7 is set to %1 if the file was properly closed when it was written. If the file was not properly closed and this bit is a %0, you cannot open the file normally for either read or write. If you attempt to do so the disk drive will return an error status of "write file open". There is a form of the OPEN statement, however, which will allow you to read a splat file... the MODIFY command, OPEN 2,8,2,"filename,M".
Modify mode will ONLY allow reading the file to the extent that it was actually written to disk. The last block of the file can be incorrect and can possibly also be linked to other disk blocks of deleted files. The modify form of the OPEN statement will therefore allow you to recover most of a splat file, but this may not be sufficient in all cases.

The TYPE byte, bit 6 is set to %1 to "lock" a file. A locked file cannot be deleted with the DOS SCRATCH command. The only way to set this bit is with a sector editor or a program that reads-modifies-writes the disk directory data directly. It cannot be done with any DOS command.

Bits 5-3 of the TYPE byte are unused and should be left as %0 bits. Assigning %1 to any of these bits gives strange results for a file type.

Bits 2-0 hold the file type itself. The valid states of these three bits are values 0-4 (%000 - %100), with the following "legal" meanings:

  %000 = DELeted
  %001 = SEQuential
  %010 = PRoGram
  %011 = USeR
  %100 = RELative

Any other values will give very strange results for file types and will generally make a file inaccessible except as a random access file.

Normally a DELeted file has the entire TYPE byte set to 0. If you use a sector editor to change bit 7 of the TYPE byte to a %1 but leave the rest of the bits set to %0, the file will be listed in a directory as type DEL. Validating a disk with a DEL file in the directory will allocate, or leave allocated, the blocks used by the file. You can read a DEL file just as if it were a SEQ file by using the OPEN statement without a letter indicating the file type. For example, OPEN 2,8,2,"deletefile". This of course applies to any file type other than a splat file: you can open any file for read by leaving out the ",x,R" in the OPEN command, and the file type will not be checked.

Files :

There isn't a lot to say about how files are stored on a disk.
For each file on a disk there is a 2 byte value in the disk directory that gives the track and sector where the file starts. At this track and sector, the first 2 bytes of the block contain the track number and sector number of the next block of the file, if there is one. If not, the first byte of the block will be zero and the second byte of the block will be a pointer, with a range of 2-255, that points to the last byte of the file in the current block.

When the disk drive DOS has sent all bytes up through this last byte to the computer, it sets the EOI status on the serial bus so that the computer's Kernal serial read routines know that the end of the file has been reached. The EOI status is saved in the ST variable when the last byte of the file has been received by the computer.

With files of type SEQ, PRG and USR the first byte sent to the computer is the first byte of the data (first block, byte 2). With files of type PRG the first 2 bytes sent to the computer are the load address of the program in low-byte, high-byte form. If the computer is doing a LOAD operation, this will be the address that the file is loaded into if an absolute load (,8,1 for example) was requested. If a relocatable load (,8) was requested, these two bytes will be ignored by the Kernal load routines.
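
The block chaining described above can be sketched in a few lines of Python (used here only as a neutral notation, not something that runs on the drive). This is a minimal sketch under one loud assumption that is not in the article: that all 683 blocks of a disk side have been copied, in track and sector order, into one flat run of bytes called `image`. The function names `sectors_per_track`, `block_offset` and `read_file` are hypothetical. The sketch follows the track/sector links from a starting block and trims the last block at the pointer held in its second byte.

```python
BLOCK_SIZE = 256


def sectors_per_track(track):
    """Blocks per track for the 35-track Commodore layout in the text."""
    if 1 <= track <= 17:
        return 21
    if 18 <= track <= 24:
        return 19
    if 25 <= track <= 30:
        return 18
    if 31 <= track <= 35:
        return 17
    raise ValueError("track out of range")


def block_offset(track, sector):
    """Byte offset of a block in the assumed flat track/sector-order image."""
    if not 0 <= sector < sectors_per_track(track):
        raise ValueError("sector out of range for this track")
    blocks_before = sum(sectors_per_track(t) for t in range(1, track))
    return (blocks_before + sector) * BLOCK_SIZE


def read_file(image, track, sector):
    """Follow the link bytes from the first block and return the file data."""
    data = bytearray()
    seen = set()
    while True:
        if (track, sector) in seen:
            raise ValueError("block chain loops back on itself")
        seen.add((track, sector))
        start = block_offset(track, sector)
        block = image[start:start + BLOCK_SIZE]
        if len(block) < BLOCK_SIZE:
            raise ValueError("image is too short")
        next_track, next_sector = block[0], block[1]
        if next_track == 0:
            # Last block: byte 1 points at the last valid data byte (2-255).
            data += block[2:next_sector + 1]
            return bytes(data)
        data += block[2:]  # full block: 254 data bytes after the link
        track, sector = next_track, next_sector
```

For a PRG file, the first two bytes of the result would be the load address in low-byte, high-byte form, as noted above. The loop check is a defensive addition of mine, since a damaged chain could otherwise be followed forever.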