Second Week – some bugs and new stuff

This week’s plan was to get the STUBs done. At their current state, I won’t say ALL of them have been addressed, but yes, very few of them remain, and some of them can’t be done right now (more on that). This week’s work can be seen here.

The STUB list can be referred to know the current state of STUBs.

  • LB::b_saveMovie() : something sev mentioned on the #director-engine channel. This would be a project in itself and there is no code around it. Saving is currently not supported in ScummVM’s Director Engine and this STUB would need some time to be implemented.
  • LB::b_setCallBack() : This needs to be adressed when I encounter this command while playing one of the director targets. The documentation to this is vague. here:
    • The third point is not very clear. It appears as if this would involve the implementation of the specific factory object, and the interactions between that and the XCMD or XFCN callbacks are not generalised.
  • LB::b_playAccel() : Deprecated since D3
  • LC::cb_v4theentitypush – kTEAMenuIdItemId and LC::cb_v4theentityassign – kTEAString : These will also be implemented when I encounter their usage. The MenuIdItemId is probably creating menus from scripts, but that is just speculation. I haven’t encountered their usage and their purpose is not documented, so I would have to encounter them first
  • LM::m_moveToBack() and LM::m_moveToFront() : These will be tackled along with this.

There were some things which were fixed this week, like checking for constraint before rendering sprites, acknowledging movable sprites as active.

There are some things which ScummVM would never be able to fully support, like openDA and closeDA, which open the Macintosh’s Desk Accessories like calculator. To be honest, I don’t thin many games which target both Windows and Mac might be using it. There is the open comman too which opens any document in the application specified. ScummVM can’t run arbitrary applications.

I took the liberty of modifying some commands in the way they work (these commands were mainly used for debugging in Lingo and sev said that those are basically no ops for us, so can be ignore) like showResFile and showXlib as we have a REPL Lingo debugger in ScummVM.

While I looked into importFileInto cast Lingo command, we realised that the BitmapCastMember uses PICT files. So I would be porting WAGE’s PICT handling code for that. The remaining STUBs (5 exactly: param, importFileInto and special cases of framesToHMS, HMStoFrames and openResFile) I intend to do in the next couple of days (param has a bug which I can hopefully solve tonight, importFileInto would need work and for the others, I must discuss with sev and others.)

So this week, I can start looking at Meet Mediaband. Personally, I have never seen anything more interestingly weird than that. There are bugs in UNDO ME, Macaroni Man and HOUSE JAM, and the popupMenuXObj needs to be implemented. I had alotted 4 weeks of time to work on Meet Mediaband, and I believe that would be enough to fix it and clear up my trello cards too! Hoping for a productive week ahead.

Compiling the game and completing resource loading

The first thing I worked on this week was finishing up the code that does resource loading which has now been moved to the LowLevelResources class inside of the resources folder. After doing that I worked on bringing in the game code. This was easier than expected with only a few problems with conversions from strings of wchar_t, used mainly for file code, to char strings. I then did some testing and got to load the game.cfg file, which is part of the game’s assets. After that the code to instantiate the various systems gets executed and the program crashed by trying to dereference a null pointer.

Next, I worked on replacing image loading. This was done by the LowLevelResourcesSDL class through the SDL image library. The data was then stored in the SDLBitmap2D class in the form of SDL_surface. Everything has now been moved to the Bitmap2D class in the graphics folder which also handles image loading through the Image::*Decoder family of classes. One problem I encountered was that after loading an image the relative Graphics::Surface cannot be modified, which is required by the fillRect and drawToBitmap methods. This led to the implementation of a copy-on-write mechanism.

This week the codebase was also modified to reduce the number of warnings produced by gcc. After bringing in the engine code the total lines of warnings were about 24000. A lot of these where caused by missing virtual destructors in classes that had virtual methods. After adding these, the line count was reduced to 3000. With more changes, it’s now at less than 2000, which is a huge improvement. The credit for this work goes to sev with me doing only very minor changes.

The last thing I worked on this week was font loading. The game uses a format split into a .fnt xml file and some tga image files which contain a table of characters. This concludes resource loading for now, there’s still the save system but this has been pushed to a later date since it’s not necessary to get the game working.

Bonus problems

  • After including pixelformat.h in some files, the compiler started generating some weird errors which I couldn’t figure out. Turn out the problem was that hpl1/engine/graphics/PixelFormat.h and graphics/pixelformat.h were considered the same file. I don’t know what happened with the directories but for the files, this was caused by the windows file system being case insensitive.
  • In the bitmap class I used Common::ScopedPtr for storing the image decoder. Trying to use the move assignment operator though generated compilation errors. The operation is implemented as:
    template<class T2>
    ScopedPtr &operator=(ScopedPtr<T2> &&other) {
    	PointerType oldPointer = _pointer;
    	_pointer = other._pointer;
    	other._pointer = nullptr;
    	DL()(oldPointer);
    	return *this;
    }
    

    The problem turned out to be that different istances of the same template class cannot access each other’s private members. The fix was a friend declaration in Common::ScopedPtr to all the other instances of itself:

    template<class T2, class DL2>
    friend ScopedPtr; 
    

Next week

Next week I’ll start working the graphics system.

Thanks for reading.

Translating Assembly

Welcome Back!
Since the last post, I finished translating the decompression routines, experimented a little with displaying an image, and looked further into the structure and program flow of the game. However before diving further into that I thought it would be a good time to make this post about the process of translating Assembly into C++ (for now it’s mostly C so as to translate more directly, but can eventually be refactored into object oriented C++).

Also, small note from last week:
I need to be more careful with += statements. I was finding incorrect data being read, and after some debugging I found the problem to be this statement:

dataOffset += (_disk->readByte() << 8) * ProDOSDisk::kBlockSize;

Which, as it is written, will add the result of _disk->readByte()<<8 multiplied by the block size to dataOffset. In reality, what I meant for the logic was to have the entire statement wrapped in the multiplication, and therefor the statement needed to be:

dataOffset = (dataOffset + (_disk->readByte() << 8)) * ProDOSDisk::kBlockSize;


And for anyone curious whether the ProDOS reading code from last week is working, here’s the main window bitmap being loaded directly from the .dsk (don’t mind the fact that it’s purple instead of grey. The palette is a whole other thing. At least it’s a nice purple!):

 

Preface

There are a couple of things I want to go over before getting into the code, so that it’s a little easier to follow for anyone not already familiar with different programming paradigms/abstraction levels and specifically Assembly language. **Please feel free to skip this preface if that does not describe you!**

*Keep in mind this is a very surface level explanation, so there are lots of things I am leaving out to keep it focused:*

You may have heard that many games from the 70s-90s were written primarily (if not entirely) in the programming language Assembly, with The Immortal being no exception to that. But what does that mean and why is it important here? Fundamentally, programming languages can be seen as different methods of expressing a concept to a machine (so long as they are all Turing complete that is). Much the same way human languages like English and French can be used to convey largely the same thoughts to another person, programs written in different languages end up as a sequence of 1s and 0s calling hardware functions regardless of what they look like on the surface. No matter what language a program is written in by a human, the machine will only understand that sequence of 1s and 0s. As a result, unless the human wants to write their program 8 bits at a time (so-called Machine Code, where each byte (or more) represents a hardware operation (or parameter) or data), they need a way to translate some kind of natural language to that sequence of bits. This is where Assembly language comes in. Generally speaking, for any given computer architecture, it is the lowest level interface with the machine code that is written in some kind of natural language (as opposed to being strictly numerical). This language, which itself must still be compiled into machine code, allows the programmer to create complex programs manipulating the hardware of the machine. However, because assembly language is only a little bit above that machine code conceptually, it comes with advantages and disadvantages for the programmer. It is close enough to the hardware that it can perform operations directly, and explicitly, on the computer’s memory, from the smallest segments (generally the registers) to the largest pools of memory available. This direct interface allows the program to efficiently access and manipulate any part of the machine it needs to perform a given function. The flip side, is that to do so you must write everything in terms of these low level system operations. For very simple operations, this can look very short and clean. However, as the complexity of the concept being conveyed to the machine increases, the length and complexity of the assembly code increases often exponentially.

For example, take a relatively simple concept: Multiplication.
In natural language, the complexity of writing this mathematical statement is the same for any two numbers no matter the size. A * B = C. But when we bring this concept to a machine, we have to express it in practical operations it can understand, usually in terms of manipulating individual bits (for reasons I won’t go into here). And this is where details like the size of the number affect the way the statement can be expressed. For example, take the statement 5 * 2. In Assembly, you can write this operation as “Load 5, shift the bits to the left once”. The machine will take 0101 (5 in binary), and ‘shift’ the bits of the number itself over by 1 position, resulting in 1010 (10 in binary). This is because shifting the bits by 1 position is (*in general terms*) equivalent to multiplying by 2 (think decimal places multiplying by 10 in decimal numbers). But if instead we take the statement 5 * 7, the complexity changes completely. Now you need not only a way to multiply by 2, you need a way to perform that operation 7 times. You could write 5 + 5 + 5 + 5 + 5 + 5 + 5, or you could design a loop to perform the addition an arbitrary number of times, but either way the expression is written differently depending on the size of the number involved (alternatively, a special hardware register for some architectures (like the 65816) or even direct support like more modern assembly can be used. The point being A*2 is different from A*B). I know this is getting a bit involved, but I promise it will be relevant!

This brings us to the concept of Abstraction.
Hopefully that example gives an idea of why writing a program in assembly gets large and complex very quickly even for conceptually simple functions. Arithmetic is only the tip of the iceberg though. Managing memory, machine cycles, hardware interactions, status registers and flags, etc. effectively end up obfuscating the concept being expressed. And possibly the most important reason for abstraction, the machine code (and subsequent assembly) is not necessarily the same from one architecture to another (one machine might have a function X that is represented by the number 20, and another machine might have the same function with a different number, or might not even have the function at all!). This is why programming is done in layers of abstraction away from the machine code. This is a whole topic of its own, so I will only mention the relevant languages here. In terms of abstraction, you can think of machine code being the Low level and natural language as the High level. Between those two, we have different layers of abstraction serving different purposes.
Machine Code -> Assembly -> System level languages -> C (bit of a special case, can sort of be both) -> Application level (ex. Java)
With every layer of abstraction, comes a penalty to performance. Although even the high level languages end up being compiled into machine code, you generally give up a degree of efficiency when you need a program to translate abstract statements and structures into low level equivalents. As a result, if the machine running the program is limited in speed, memory, etc. or the program needs to get every bit of performance out of the hardware, some or all of the program may need to give up abstraction for performance. This balance was especially important to developing video games, because the developer often could not afford to leave any performance on the table if they wanted to make the most of the machine they were working with.

Going back to the human language analogy, we can use language to bring this together and ultimately illustrate what it means to go from assembly to C.
To do so, we can start by taking a simple concept, and walking it through the layers of abstraction. Let’s use movement as an example, where moving an object from one location to another can represent the movement of data in memory by the computer. Say you want to move a few rocks from point A to point B. We’ll start with the machine code level, which would be the physical act of moving each of them (roughly speaking). The concept of the rocks moving from point A to point B is understood solely by the act of moving them each time itself. But, if we move up to a low level language like assembly, we can express the concept instead by a statement that simply describes the act of moving it. Maybe we say “Walk to point A. Pick up rock 1. walk to point B. Put rock 1 down. Walk back to point A. Pick up rock 2, ….”. This expression of the concept still requires us to describe the way someone will physically move the rocks, and the individual steps involved, but it is more removed from the action. Now let’s move up another level of abstraction. At this point, we could say something like “Give Point B what is currently at Point A”. Notice that we no longer care what method is used to get the contents from one point and move it to another. Instead, we are closer to simply describing the goal, with an assumption that whoever is going to do it will figure out the physical method based on the environment they are in. Maybe they need a certain tool in one environment, and a different tool in another one. This level of abstraction doesn’t care about those details, and as a result can apply the same statement to similar situations in other places. At this level we are more or less talking about both the system and application levels, but if we wanted to take it even further we could move from imperative to declarative and simply say “Point B has the contents of Point A.”, but then we’re getting into other paradigms that aren’t important here (ex. functional languages). To tie this together, the same concept can be expressed in assembly and C, but in assembly the code might need to describe more steps with more details to do it. In assembly, you need to load a given value into a register (the load command and register being the details) and explicitly state where it’s going in memory.

That certainly means you have to read through more assembly code to determine the translated C version, but there is another wrinkle, and this is where it gets interesting. Looking again at that assembly statement describing the process of moving rocks, let’s now imagine that the statement was also written in Latin and that you have no idea how to read it. We’ll use a very simple version of the sentence for this:
English: “move a rock from here to there”
Latin: “hinc illuc saxum movere”
Okay, looking at the sentence we can see that the individual letters look like English, and they seem to be organized in words similar to English. So then, it should be easy enough to simply replace each word with the english literal equivalent. Let’s do that (this is not exact, and does not take into account all kinds of complexity in language, it’s just to tie programming to something familiar):

“here towards rock move”

Hmm, that still doesn’t look right. Now we have english words, but the way they are arranged doesn’t make sense. However we can at least get the general idea, that there is a rock moving towards a place. And now we can make an educated guess at the sentence, “move rock towards here”. At this point we can maybe use context to figure out where the rock is coming from and going towards, and eventually the sentence can be translated.

Okay at this point you may be wondering if I’m the cat in this gif because the relevancy of this seems like a big stretch.
Black Cat Stretching GIF - Black Cat Stretching Morning GIFs

But stick with me for a second and I think it’ll make sense. The reason I chose Latin for this example is that it is a language which uses a different word order from English, with the same letters. I think this highlights a fundamental difference between low and high level programming languages. If you don’t know how to read assembly, but you know how to read say, C#, you may end up going through a similar process. The words are still written in english, they aren’t alien, but right away they don’t read as natural language. Instead of words like Print, you see words like LDA and EOR. You can look up those opcodes (represents the ‘code’ for an ‘op'[eration]), and understand the individual words as Load_into_A and Exclusive_Or. But you are now presented with a different problem. The structure of the operations themselves doesn’t match what you’re used to. Once that is sorted out, there is the final issue, abstraction. You can understand the words, and the structure of the operations, but because they are on a lower level of abstraction, you also need to understand the purpose of those instructions as they relate to the hardware compared to the conceptual statements you’re used to. You might understand that LDA means Load_into_A and that the code is written LDA this : STA here, instead of here = this. But it still doesn’t tell you what the point of the statement LDA $0392 : STA $1528 is, unless you have explicit names already assigned to those addresses (which in the case of assembly source code like this project, you do have). This is different from translating between say, C# and Java, where that first stage, finding equivalent words, is where most of the process lies (for this surface level explanation I’m ignoring lots of other differences as you would expect), because the way the code works is fundamentally at the same level of abstraction (ie. neither language tends to deal in direct memory addresses).

Finally, we come back to the original question, why is Assembly important to any of this?
The Immortal was written in pure 65816 assembly, with all the hardware interaction, status registers, memory management, optimizations, and architecture specific functions that come with it. As a result, translating the source code to the high level language that ScummVM uses (C++) means translating between layers of abstraction in addition to the syntactical differences. As we will see in this blog post, the process involves first sorting the logic and movement of memory across registers and addresses, and then finding ways to closely match those interactions in C.

If you made it through this overly verbose preface, you’re awesome, I appreciate it.


Translation
Okay, with that out of the way, we can get into the code.
The compression itself is interesting, but not entirely relevant to this post, so I will just mention that the algorithm is a modified implementation of LZW, which is itself a descendant of LZ78. The modification is related to memory concerns for the Apple IIGS.

So, for translating the decompression routines, we can start by identifying all the components involved. In the file compression.gs, we have several routines, stemming from ‘myCompress’ and ‘myUnCompress’. Some of these are shared between them, as they are applicable to both.
For our purposes (we don’t need the compression routines since we are only decompressing data), the routines we will need to translate are:
– myUnCompress
    – setupdictionary
    – inputcode
    – member

With these routines, we can look a little deeper and identify the branching structures within the routines, as well as the general purpose of the routines:
– myUnCompress
    – set up the memory that will be used (essentially the memory for the dictionary and the stack of output characters that are periodically dumped to the output memory)
    – initialize the dictionary
    – loop :nextcode (this loop uses a jmp to return due to length)
        – get the next code
        – if there is a code and it’s not empty
            – loop :nextsymbol (this loop uses a bra to return)
                – if it is a single char
                    – loop :dump (another bra loop)
                – process symbol

– setupdictionary
    – clear the reserved memory
    – set first 256 bytes as already used

– inputCode
    – get the next byte from the input file
    – depending on even/odd, perform a different function on it
    – return the result

– member
    – create a hash
    – if the first entry is empty
        – start the new list
    – loop :ag (returned with jmp)
        – find the right entry
        – branch :match
            – found a match, return result
        – branch :next
            – continue searching for match
        – branch :appendlist
            – loop :findempty (returned with bne)
                – if empty found, add and link with dictionary, then exit routine
                – if no space, reset dictionary and exit routine

Now for the purposes of this post, I just outlined them in the most surface level way. The result when I actually did this was a kind of pseudo-C code in a similar structure but detailing everything including the status flags used for branches, return values, index register usage, arithmetic, etc. This kind of stuff is not super easy to follow for the purposes of this post. Ex.
– if ((origin[index]&0xF000)|(ptk[index]&0xFF00))

– index = ((ptk[index]&0xFF00)>>8)|tmp)<<1
– h = (((((k<<3) xor k)<<1)xor k)xor codeW)<<1)

The other thing I did while making the pseudo-C outline was to sort out the variables and memory addresses. This is a bit tricky with assembly, because is it not always obvious how an address or variable is being used at a glance. In C, every bit of memory used has a ‘type’. But right away that means we’re dealing with an abstraction. In assembly, the programmer might intend for an address in memory to be utilized as an integer (for example, a timer). But there is nothing stopping that data from being utilized differently. It gets a little more nuanced from there however, when we start dealing with registers. So let’s talk about registers.

On the 65816 you have 3 primary registers, with each register being a very small bit of memory right on the cpu, about as close to the action as it can possibly be. This means it can be accessed when running code, faster than virtually anything else. There are other registers as well, but they don’t really need to be explained here (when I mention status flags, those are in another register for example). The main registers are responsible for most of the operations you can perform that involve manipulating any kind of data. They are A (consisting of AB), X, and Y. These are a great example of what is explicit, and required in assembly, vs what is implicit in C.
The A register, also known as the accumulator, is primarily used to hold 1 or 2 bytes of data, so that it can have an operation performed on it. You can LDA to LoaD into A, and you can STA to STore A to an address. But this is also where operations like ASL (Arithmetic Shift Left, the multiply by 2 operation mentioned earlier) take place. The X and Y on the other hand, are generally intended to be used as indexing registers. This is because in assembly you have to do the indexing manually. For example, we can see the difference in this statement from assembly, translated to C:

 lda [ptcodew],y    –>   start[index]

In assembly, ‘indexing’ really just means that adding ‘,y’ or ‘,x’ to an address will ensure that the opcode it translates to will be one that takes the address, and adds the value in the Y or X register before performing its operation on it. For complete clarity, LDA is translated to many different opcodes depending on the context the compiler finds it in, but for a normal load from address statement, the compiler produces the hex value 0xAD. If the compiler instead sees LDA addr,y, it will produce the hex value 0xB9. B9 will add the contents of Y when it runs (the [ ] around the variable name on the left in this case also tells the compiler that this is indirect, so that would become 0xB1 instead, but that’s not important right now).
What this means, is that at a glance, the assembly statement on the left does not tell you what value is being used to index the address referenced by ‘ptcodew’. Where as in C, the actual indexing part is implicit, but the value used to do the indexing, is explicit. This ends up requiring the C code to have a dedicated variable just for indexing data, compared to the always available index registers in assembly. This is sort of a blessing and a curse in assembly though. Those index registers are another example of data that is generally expected to be one thing, but is not always used that way. The index registers can also act like extra 16bit registers to hold and move around data. For this reason, when reading assembly code like this you have to be sure you know where that Y value came from, and keep track of what happens to it. For an example of exactly that, lets take a look at this line:

lda [ptk],y
and #$ff
ldy topstack
sta [pstack],y

Right here, the Y register has a completely different value on the first line compared to the last. If you aren’t careful, you could miss the LDY, maybe mistaking it for a LDA, and continue reading with the assumption that it is indexing with the same value in both places. This is a pretty simple example, but the 65816 has opcodes that can swap the contents of different registers, which allows for some very complex movement of data in a small number of lines. A TXY (Transfer X into Y) or TYA can quickly make a translation of the statement more complicated.

One more note on variables and memory:
In assembly, one of the most basic operations you perform, is to manipulate individual bits. You can still do this in C, with bitwise operators (&, |, ^), but it’s less common as you have generally more memory to work with and the compiler will optimize for space to an extent anyway. In assembly however, when you do not have much space to work with, you may need to store data that is smaller than a byte or word, but larger than a bit. Decompression is a good example, where there are two pieces of information within a word (on the 65816 a Word is 2 bytes) for certain data. As a result, the code needs to extract the bits by masking the relevant ones. For C, I made this part of an enum specifically for masking bits (to allow the translation to be as direct as possible)

enum BitMask : uint16 {
    kMask12Bit = 0x0F9F, // Code (pos, 00, len) is stored in lower 12 bits of word
    kMaskLow   = 0x00FF,
    kMaskHigh  = 0xFF00,
    kMaskLast  = 0xF000
};

The relevant entry being kMask12Bit, which takes care of grabbing just the 12 bits that are used.

Once I had a handle on the memory being used, and an outline of the logic and flow of branches and routines, I could start turning it into C code.
I won’t go over every step, because much of it is what you would expect (add file compression.h and .cpp, make a namespace, etc.), but I will briefly talk about noteworthy parts.

 


 

The first one I will mention demonstrates one of the interesting things about reading assembly compared to C. In C, a function has a set of arguments, and a return value. This is a rigid structure, and it is done that way for good reason. Generally, if you want to manipulate multiple variables outside of a function, from within the function, you pass those variables as pointers in the arguments. In assembly, even the idea of a function is a little fluid. There is a stack, but there isn’t a stack frame. That is to say, there is a record of addresses when jumping to routines if that jump is made as a JSR/JSL, and an opcode can use those address to return to the point when the jump occurred (RTS/RTL), but there is no section of memory dedicated to the instance of a function. In fact, you can make every function call by manually using a JMP command to an explicit address, and then using a JMP back to it. It follows then, that without explicit local variables, the idea of a single return value is entirely up to the programmer. And often, the return value of a routine is found in the carry flag (that’s the status register things I mentioned). But it also doesn’t have to be. You can ‘return’ the A, X, or Y registers. You can also return a status flag by virtue of the last operation done before returning. The point is, a routine in 65816 assembly can be consistent in what it considers a return value, but even then the question becomes, how do you deal with a routine expecting multiple return values from a subroutine, in C?
Well there are different ways to accomplish this, but what I chose to do for inputCode (renamed as getInputCode()), was to have a boolean that represents the carry flag, be passed in by reference so that it could be utilized by the function, while allowing the actual return value of the function to be the most relevant thing, the input code. This is also because the result of that carry flag determines the status of the loop around it as well. Here is an example of the return value of this routine:

asl
tay
lsr
clc
rts

The thing to note here, is that we are left with 3 separate return values. In the A, there is the result of LSR. In the Y, there is the result of the ASL. And the CLC clears the carry flag, which is an expected output as well. SEC or CLC is often at the end of a routine to act as a binary response. It could be used as an error message (this routine didn’t do what you expected), or it could be used to denote a binary choice within the routine (routine ExampleChooseAorB has found A to be the choice). Once again, the assembly requires special care to keep track of this in case it is not always consistent.

The next example is small, but I think it demonstrates another difference in structure. The compression routines use a hashed dictionary, and the hash value itself is an example of a statement needing to perform many operations on a value in a row. Take this line:

asl #3
eor B
asl
eor B
eor A
asl
(I have altered the statement a little bit so that it is just the sequence of operations and generic A and B names, as that’s all we need for this example).
(EOR in 65816 = XOR)

So what we have here can be described in natural language like this:
‘for the current value, double it 3 times, xor with B, double, xor with B, xor with A, and double again’. In assembly, it is written more or less the same way it is described. The accumulator acts exactly like you would expect, it accumulates that value as operations are performed on it. But this is not how higher level languages generally work. For example with C, you can’t leave an equation with only one side completed, because that sort of accumulator register is not visible and therefor the result is unusable. This makes sense, because C can’t be bound by a single architecture with a single structure of registers. So in C, you must either make the entire equation one expression, or you must break it up and have faith that the compiler will sort out the best way to do it based on the architecture being used. Here is the balance I decided on with this statement:

uint16 hash;
hash = (k << 3) ^ k;
hash = (hash << 1) ^ codeW;
hash <<= 1;
This is pretty easy to follow, at least compared to that pseudo-C code equivalent I originally wrote down:

– h = (((((k<<3) xor k)<<1)xor k)xor codeW)<<1)

What I’m getting at with this, is that translating assembly to C can require striking a balance between readability, and efficiency (just like writing assembly itself!). Another example of this which is used multiple times, can be found in conditional branches. In the assembly, branching acts like any other operation, it will jump to a position based on the a given status flag. This means it can be triggered in multiple ways with the same flag. Ie. you can set the flag manually, or through the result of an operation between multiple values, etc. As a result, you can have code in one section respond to the result of code in another, because the status flags are not tied directly to a single variable or expression. Combine this with the ability to apply a sequence of operations on a value in a row, and the conditional branches can become hard to follow even in the assembly, let alone the expressions representing them in C. For example you might have a conditional like:
if ((((origin[codeIndex]&0xF000)&ptk[codeIndex])&0xFF00) == 0)

Which is not exactly easy to follow at a glance. So instead, to make it readable at all we need to break up the statement into components like this:

uint16 cond;
cond = start[index] & kMaskLast;
cond |= ptk[index];
if ((cond & kMaskHigh) == 0) {
And this could be made even more explicit if you were to have the final & statement outside of the conditional.
This last example will demonstrate a few things, but primarily the way the more rigid structure for branching code in C requires rethinking the assembly. For this example, we will look at the structure of Member (renamed to getMember()).
If we look back at the outline for that routine, we see that it consists of a loop and several branches. We will get back to the loop in a moment, but right away we have a decision to make. In C, a function is a distinct structure. Or to put it another way, you can’t make a function that is simultaneously a subsection of another function. In Member, we have the branch appendList, which is part of the loop but performs a distinct function, is written after the other code in the routine, and has its own exit from the entire routine. In other words, it is possibly more accurate to think of Member branching to a different subroutine, appendlist, as if the entire routine is sort of split in two. It doesn’t need to be a separate function (in its infinite versatility, C actually allows direct GoTo statements, but they are highly discouraged, and this translation will restructure as needed instead), but the code would be hard to follow and rather ugly if written this way in C. With the actual branching being part of the implicit mechanics underneath, C generally wants branching structures to be complete. Ie. You can write a while True loop and have a branch inside return from the entire routine, but that would bypass the structure of the loop. In other words, the overhead of a complete loop is generally preferred for the sake of future additions/changes, more control over the flow of logic, and readability. C really doesn’t restrict you to any particular methodology, so you can have a nested loop structure where some branches return from the loop and others the function, but it also gives us the opportunity to make the structure follow the purpose a little more clearly. So for this routine I decided to separate them into getMember() and appendList(), so that they each have a distinct focus. getMember() searches for a code in the dictionary, appendList() adds a new entry to the dictionary.

Let’s start by identifying the exit points from this routine by making a simple map of the branching statements in the routine:
– Member
    – beq :newlist

– :ag
    – bne :next

    – beq :match
– :next:
    – beq :appendlist
    – jmp :ag

– :match
    – return

– :newlist
    – return

– :appendlist
    – bcc :tablefull

– :findempty
    – bcc :tablefull
    – bne :findempty
    – return

– :tablefull
    – return

From this we can see that :next and :ag are clearly connected together, and :findempty needs :tablefull which is connected to :appendlist. But once we branch to :appendlist, there is no chance that we end up back in :next or :ag. So making it a function seems like a fairly natural translation of the structure.
Okay, we can now also identify the looping structures. I’m not going to go into the specifics of the different types of branches (bcc vs bne for example) and what they become in C, but we don’t need that to arrange the structure based on the loops. For example, if we look at :appendlist, we have :tablefull written after the loop, despite the loop having its own return and not also using :tablefull. This doesn’t technically need to change, but we can make it a little easier to read at a glance by not requiring the reader to skip past the loop to see the result. As it is, the (for this post, pseudo-)C code structure looks like this:
– appendList():
    – if (condition for :tablefull not met):
        – loop :findempty:
            – …
    – else:
        – …
        – return

But we can rearrange this a little bit as a result of that else branch not being used by the loop:
– appendList():
    – if (condition for :tablefull IS met):
        – …
        – return
    – loop :findempty:
        – …
        – return

And now, there’s no need to read past the loop just to find out what happens with the first condition.
This example may seem a little superfluous, but I chose it because in this format, the concept of branches in assembly being translated into easier to read conditionals can be fully digested. In reality, we aren’t talking about reading a few lines ahead, we’re actually talking about seeing a label, having to scroll around searching for the branch it references, keeping that branch and the sub branches, and their sub branches in mind as you go back to the first branch, which might then have several more branches going backwards and forwards in the code, branching to each other, etc. Doing this when reading the code to understand the flow of logic is unavoidable, but what we want, ideally, is to avoid translating that part of the assembly to C. Instead, we want the logic to be properly structured and contained within loops and conditionals, so that the logic is preserved, but the ability to read it (and change, fix, expand, etc.) is easier.
Basically, we want to turn this (ex. the pseudo-C):

Into this (I rearranged the example conditional as I was writing it, so it’s unfortunately not visible in this):

So that instead of having to digest the entire routine with the distinct subsection right in the middle of it, you can see ‘appendList()’ and have an idea of what it does, without having to dive into how it does it at the same time.

And now one final thing in getMember() that is relevant to translation. In assembly, you often have access to architecture specific operations that can do neat things due to the nature of the hardware, and these can lead to tricks and optimizations. The 65816 has a good example of this in the opcode XBA. This is a neat feature of the 65816 in particular, taking direct advantage of the way the accumulator register works. On 65816, unlike the 6502, and for reasons I won’t go into now, the 16bit accumulator is actually made up of two different 8bit registers, A and B. When operating in 16bit mode (the 65816 architecture with it’s 6502 8bit emulation mode is a whole other interesting topic), A and B act as a singular register, A. But thanks to this register actually being two distinct registers, there exists an opcode to swap the data between them, similar to the opcodes for transferring between other registers (TXY for example, Transfer X to Y). XBA (eXchange B and A) effectively reverses the byte order of a given value in the accumulator. This may not sound especially useful, but when we are looking at efficiency on the order of machine cycles, every single operation we save is useful. For example, if you have the number 0x5 in the accumulator. And you want to turn that number into 0x500, you need to multiply it by 0x100. To avoid the comparatively slow arbitrary multiplication, you can shift the bits over by one byte, requiring an ASL for every bit position, in this case 8. So that would be 8 bytes of code, one for each ASL, and each ASL takes 2 cycles, giving us a total of 16 cycles to move 0x5 to 0x500. On the other hand, since 0x5 can be read as A: 0x00, B: 0x05, we can use XBA to instantly give us 0x0500. That’s one byte for the XBA, and the XBA is 3 cycles, making it 13 cycles faster and 7 bytes shorter. That’s pretty great! But what happens when we have to translate code that makes use of this trick into C? It gets a little messy. The decompression routines have a couple of instances of this which aren’t too bad, such as:

lda[ptk],y
xba
and #$ff

Which I translated as:
hash = ptk[hash] & kMaskHigh;
hash >>= 8;

The process for this is:
– identify the goal: XBA + AND #$00FF  –>  get the top byte alone, and make it the low byte for the next operation
– rearrange for C: the destination is Y (not shown in snippet), which in the C code is the variable hash. As such, we start by making hash the high byte of ptk[hash], and then we can shift down until it is the low byte through a >>= statement

But what about when XBA is used in the middle of an already complex sequence of operations on a single value? Translating the arithmetic suddenly involves a non-arithmetic component (in terms of the arithmetic around it), which can get tricky. Just another consideration during translation.


The take away from those examples is this: When translating assembly to C, there are a number of factors that are not necessarily obvious from the surface. There are two parts: Reading the assembly, and writing the C. Reading the assembly has challenges such as: keeping track of the movement of data between registers and addresses, bit manipulation, status flags (not just for branching either! Although nothing in the decompression routines does this, it can be useful to manipulate things like the carry flag within expressions themselves), and often complicated webs of branches. While writing it in C brings challenges that include: untangling and rearranging those branch webs, finding a balance in expressions between efficiency and readability, replacing non-portable register manipulation with consistent use of local variables, and fitting the dynamic movement of branches and routines into structures that are philosophically consistent with how C code in a function should flow.


Alright, we made it to the end!
Yeah Cat GIF - Yeah Cat High Five GIFs

If you read this whole post, first I want to say thanks! That’s pretty wild considering how long it was. And second, if you found yourself frequently thinking ‘what the heck are they talking about and what is this code supposed to do’, and/or looking up terms, then congratulations! You have a good idea of how reading assembly code feels 😛

As for what’s next in the project, I’ll talk more about it next time, but the plan currently is to start translating the files ‘kernal’ and ‘driver’, which handle much of the core structure of the game engine, using a sort of inside-out methodology that I will talk more about next time. I promise it won’t be as long as this one! Probably!

Week 5 – Finishing up Scott

Picking up from where I left last week, the bug I was stuck on has been fixed! It was caused by a missing conditional return statement in the 6520 emu which would cause it to run for longer than it was supposed to for certain games. I fixed it just some hours after the last blog post went up so luckily I didn’t lose much time because of it.

Once the bug was fixed, the rest went by relatively quickly. I finished adding all the games to detection and that was basically it. That work has been merged into the ScummVM master which is great. There are some minor visual bugs still present but I decided to move on to TI99/4A support as they were nothing game breaking.

TI99/4A support was much easier as it doesn’t have any graphics and as of writing this I am done with it too. This means I am done with the Scott engine outside of any fixes I have to make if needed.

The next thing I’ll be working on will be either PINK or WAGE engine. Look forward to next week for updates on that and thanks for reading!

First Week!!

This week was quite eventful. sev assigned me some tasks in the Trello board for the director engine. I have shared the link to that in my previous blog post.
I first tried my hands on the window properties task that was assigned to me on Trello. I had some progress in finding the cause of the issue, but didn’t have any code to push (this was the first day of the week). So I kept it for later and started working on the STUBs in the Director Engine codebase. A github gist of the STUBs.

This gist doesn’t contain some STUBs I had implemented at the first day of week. Still, it has all the ones that are unimplemented. I will add a strikethrough to the ones that are done.

This week I created 19 Pull Requests. 14 of them are merged. 4 are open and 1 is a draft (Though that will be open tomorrow). 18 of them were implementations of STUBs. The lingo properties now don’t have STUBs (yay!) (Actually one is left, but I have figured that out. Need some sleep and then will push it)

I also refactored the RIFX Chunk dumping so it dumps all Archive chunks now.
You can see all my PRs for the week, sorted by last updated here.

I had 2 non productive days this week. One was when I had some mild fever and was exhausted. Second was my birthday, I turned 20 this week. So there wasn’t much progress these days.

But seeing how things went this week, I am sure that I can implement all the STUBs in Director Engine’s Codebase next week, which would be inline to what was my goal for the first two weeks in my proposal. Then the issues won’t be about missing code in Director, it would be about wrong code. So it would be all about resolving bugs in various targets like Meet Mediaband, The Journeyman Project, Spaceship Warlock.

Speaking of targets, hsrtron playtested The Seven Colours : Legend of PSYS City. There were a few issues with the target :

  • Wrong palette in the corridor (This is an issue while importing palettes from sharedCast)
  • Animations are quite fast at some places
  • Not being able to leave the first level (sev identified this as an issue of not checking for punycoded file paths)

These issues have been documented on the trello board. sev also pushed a quick fix for the game’s cursor. Seems like this target is also being one which we would be using to fix the Director Engine and will make it work just like the original.

So this would be another target for me to tackle in my coding period. This week, when I get done with the STUBs, I can start looking for the source of bugs in the targets I mentioned in my proposal. I can also finish my trello tasks one by one (as they are identified bugs)

Looking ahead to an even more eventful and productive week!

ProDOS File System

The first week of the coding period is over, and there a few things to talk about.

ProDOS File System

Last week I talked about how The Immortal is stored as .dsk files formatted as ProDOS. Since ScummVm did not have a way to read ProDOS specifically, the first thing that needed to be done was to implement some way to do so. This was a somewhat lengthy process, as I will get into shortly. First however I should mention that because ProDOS is an entire file system and not just a series of files stored together, and because accessing the files should be as similar to the original game as possible, I needed to have the engine simulate the file system. I also mentioned last time that it could eventually be converted into an implementation of the Common::Archive system in ScummVm, so that other engines reading ProDOS files would not also need to simulate the file system. There were a number of frustrating issues along the way, but I was successful in implementing a Common::Archive class that can handle ProDOS .dsk files (although it does not account for Sparse files currently, which are unlikely to be relevant for game engines but it should be noted none the less).

In the last post, I gave a brief explanation of how the file structure works. Today, I will show how it works in more detail with reference to the code.

The general structure has stayed mostly the same throughout, where the disk volume is a class, and the file itself is another. The disk volume contains structs that mirror the data layout in the disk, for example a directory header:

The disk class also contains methods used for parsing the entire volume, the loader blocks data, the volume bitmap, and a hashmap to hold all the file objects with their full path name as their key (ie. if a file is in a subdirectory, the path would be “subdir/file”).
The file class then, is much smaller, just containing a bit of metadata about the file (name, size, type), and a block index pointer to where either the data, or the index block/master index block is stored. It also has methods that use that block index pointer to put together the file data from its disparate data blocks when it is asked to do so.

As a result, the program flow works like this:
ProDosDisk object is created -> constructor is called -> open() method is called -> the volume is parsed by creating structs of the directory header, looping through each file entry in the directory, and if the entry is a file, creates an object for it, or if it is a subdirectory call the same method recursively until it reaches the final directory.
Then, if asked for a file, ProDosDisk gets the file object from its hashmap, and calls the method for putting the file together.

Originally, I had done this using a byte vector (Common::Array). This was because the size of the data is unknown until the file entry was parsed, so I needed a data structure that could expand. This worked well enough, and was able to return a given file when the disk object was called. However, once I started implementing Common::Archive, a number of things needed to change.

Archive

The Common::Archive class in ScummVM is designed to act as a universal container for file types. At its most basic level, it works like any other input stream. You can tell it to load a given file by its file path (the path being handled by search manager, allowing for relative file paths and removing the need to account for different OS file structure) into a Common::File object, and then use methods to seek() and read() from that data stream. However, the class can also be implemented, so that those same universal file methods can be used on file types that need to be unpacked and parsed before they can be used. Essentially, this means that an engine could use a single method for loading resources, for many different game versions that store their resources in different file types, allowing the code to stay smaller and also closer to the original game code.

What does that actually look like for the ProDOS file system?
It requires the class to implement a version of certain functions. Specifically these:

  • bool hasFile(const Common::Path &path) const override;
  • int listMembers(Common::ArchiveMemberList &list) const override;
  • const Common::ArchiveMemberPtr getMember(const Common::Path &path) const override;
  • Common::SeekableReadStream *createReadStreamForMember(const Common::Path &path) const override

Which are each called from the Common::File class. Implementing those methods brought up another issue however, which was that the file in this case was already an object. Normally, an archive class will create an object itself and call a method to get the file contents. With ProDosFile already existing however, to avoid creating an object just to call another object, I had to also implement Common::ArchiveMember. This required its own set of universal methods inside ProDosFile:

  • Common::String getName() const override;
  • Common::SeekableReadStream *createReadStream() const override;
  • void getDataBlock(byte *memOffset, int offset, int size) const;
  • int parseIndexBlock(byte *memOffset, int blockNum, int cSize) const;

Where getName and createReadStream are the archive methods, and parseIndexBlock and getDataBlock are the methods that ProDosFile uses to put together its file contents.
Then finally the file contents itself had to change as well. Archive requires a byte stream (as you would expect to get from calling Common::File), but my methods were designed to give back a byte vector. After rewriting the methods that put the file together, they now allocate a set of memory at run time, add the file contents one block at a time, stepping through the index blocks, to return a standard read stream.
A side note about those index blocks: They are a set of block index pointers to individual data blocks (or index blocks in the case of a tree file), but the way they store those indexes is very strange. I’m sure there’s a specific reason for this, but they store the low and high bytes of the pointer 256 bytes away from each other. Ie. The low bytes are stored in the first 256 bytes of the block, and then the high bytes are stored in the following 256 bytes. I’m not sure what the reasoning was, but it made it slightly more complicated to manage seeking through the file.

Result
The result of this is that the engine can now call Common::File to get a file containing only the data of a given filename, put together in the background by the ProDosDisk and ProDosFile classes, wrapped together by Archive.

(not shown is the actual byte data, because that is just a long string of hexadecimal bytes).

Next

So I was unfortunately not able to get through the decompression function within the first week, as almost all the time was spent getting the disk file reading into the state it is now. However I have begun translating the decompression routines into C code, and I will have an update for that next week or earlier.

 

Game detection, engine compilation and resource loading

It’s been three weeks since my last blog post. The time working on the project has mostly been spent reading code with very little writing.

The first thing I’ve worked on was creating the engine’s basics and implement some detection. For now, the code looks for the Penumbra.exe file but there are plans to expand it in the future to support different versions and languages.

Next, I worked on getting the open-source engine code in the into scummvm and to get it to a state where it would compile. Fortunately, the code is well structured with most of the non-portable or library dependent code is isolated in the impl folders. Some notable exceptions to this where anonymous union members that was used in the Vector* classes and the __super keyword which are both msvc extensions. Speaking of anonymous union members, while trying to compile without extensions I kept getting an error related to some windows header. Now, I was sure that they had all been removed and a quick search confirmed it. I then looked for other headers surrounded by #ifdef WIN32 but still nothing. Turn out the problem was caused by the innocent looking ObjectArray header which I found out by trying to compile the engine on another platform. Another problem I encountered was that after auto formatting the new code some headers where reordered leading to errors for headers that didn’t include everything they were using. It’s the first time I’ve delt with this problem and understanding what had happened took some time.

This past week I’ve been working on resource loading. I’ve started by looking at the tinyxml library present in the impl folder which is heavily used to load resources such as configuration files, information about materials and sound effects, model data, and others. The work mostly consisted of replacing the file handling functions and the string implementation. After this I’ve worked on the logging functions (Error, Warning and Log) which previously wrote everything to a file but now use debugN with an appropriate level. In the future, as I go through the code, unnecessary logging will be removed, and appropriate debug channel will be added. The last thing I did this week was to look at the management of the resource files. This is done by adding resource directories to a cFileSearcher object which looks for all the files (optionaly matching a specific pattern, though this is rarely used) and when given a file name returns its path. This functionality can be replaced by the global search manager.

The plan for next week is to finish with resource file management, bring in the game code to start doing some testing and replace image and font loading.

Thanks for reading.

Week 4 – Start of the coding period

The coding period has started and I managed to make some decent progress this week. However I did hit a major roadblock which has had me stumped for the last few days.

C64 support has reached the point where it seems to handle games without any special decompression. This include 11 Mysterious Adventures which were the first games to be tested. The ones which do need to be decompressed (or decrunched) are a mixed bag.

I started with Robin of Sherwood and that worked fine. I then moved to Gremlins and unfortunately a variant of the game wouldn’t work. Thanks to Petter (the original author) for letting me know about garglk a few weeks ago as that would give me a better way to debug my code.

I managed to setup it on my Windows laptop thanks to WSL but it seemed to have problems with that variant of Gremlins too. I initially thought that this meant that the game wasn’t supported by spatterlight either but after talking to Petter about it, he found out a bug which was causing it to not work. He fixed it promptly (along with some other bugs) and the game does indeed work properly on spatterlight.

I now have a working piece of code to compare my version to and hopefully fixing this bug won’t take a lot of time now. If everything goes fine then I might even finish working on scott by the time the next blog post comes out 😀

That was it for this week. Thanks for reading and see you next week!

Official Coding Period Begins!

The GSoC coding period has begun, and I have some tasks in my kitty for the week.

Starting with the updates from last blog. I got the MENUREF working! The Pull Request for same is opened. I also implemented the Text and Font related STUBs.
The Discord user hsrtron#3373 had translated a D3 game (The Seven Colours: Legend of PSYS City) but this had some issues, with a wrong color palette being the most evident one. While ScummVM had most of the palette implementation required to use custom palettes, there were some missing links. After changing a few things, we could run the game with its intended color palette!

I talked to sev about my apprehensions of making different git branches for every task, and turns out its fine!

Also, I got added to the Trello board of the Director Engine so I can pick up the ToDo tasks from there and complete them. Link

Now my tasks for this week: Complete the implementation of window properties. I would try to do this soon. And then move on to making all archive types to dump all movie chunks.

Looking for a great week ahead!!

Game versions, file systems, and the next step

The first day of the coding period starts tomorrow, so it’s time for an update!

There are two important things to go over since the last post. The first is about the version of the game being used, and the second is following up from that.

Game Version

My original assumption with the project was that I would be continuing the partial engine that JoeFish had been working on, using the IBM version of the game. However, after a better examination of the source code and discussion with my mentor, I am instead working on an engine for the Apple IIGS version of the game, and using the JoeFish code as reference where it is relevant. The reasons for this decision are twofold. First, the JoeFish code is largely already refactored compared to the source assembly, and I would instead prefer to start from a more direct port of the source, before refactoring it once it is functional. The second factor is the source code itself. The Apple IIGS version is written in 65816 assembly, which is the architecture that I am most comfortable with already (it is the same processor architecture used in the Super Nintendo, so I am used to reading and writing assembly for it). When looking through the source code of each version, it stood out as the one that I could get a handle on most quickly.
In terms of game version differences, The Immortal is an interesting case. There are many differences between every version of the game (with the NES version being distinct even in the screen rendering itself), but the largest mechanical difference has to be the combat system. The original version of the game on the Apple IIGS uses a top down combat system which takes place on the same game screen as the rest of the gameplay. However, this was not the case for many other versions, such as the IBM DOS, the NES, or the Genesis, as they used a separate combat screen with large sprites of the wizard and goblin. It raises an interesting question for the unified engine, that being how much of the combat system is shared between the two implementations, and whether it might be possible to make overhead combat an option in other versions. This is not relevant for now, but examining the different versions brings up many questions about how the unified engine should handle them in the long term.

The ProDos File System

The first challenge in working on the Apple IIGS version of the game is the file type itself. The game primarily exists as disk image files (boot.dsk and graphics.dsk) like many other Apple II games, but unlike any already supported by ScummVM, The Immortal uses the ProDos file system. This was a later revision of the Apple DOS file system, and came with many fundamental changes. The Apple IIGS, using the much more powerful 16bit 65816 chip as compared to the 8bit 6502 on the Apple II, demanded a more powerful file system (there were many other reasons to move away from DOS3.3 as well), and what we got was ProDos. However, the file structure of ProDos is somewhat complex in comparison. It could support very large file sizes, and the file system was abstracted away from the physical tracks and sectors in DOS3.3, but it did come with extra complexity and slower read speed. The important thing about this, is that with the file format not having a backend implementation in ScummVM, I will need to make my own for the engine to be able to extract and work with the data files of the game.
After spending some time reading about the file system and looking at the raw byte data of the disks, I have written code to follow the file structure and list files from directories.

However, in the interest of ensuring the source code retain its own structure with regard to data loading, the next step is to implement a virtualization of the file system in the engine. This would allow  a call to ProDos (a program on the apple IIGS could retrieve data from the disk with a JSR to a special address in memory, followed by the command it wants ProDos to perform, and the arguments for that command) to be translated to a call to the ProDos file system object instead, retaining the structure of the call in the engine. This is taking some time to implement, so the result will be in the next blogpost. Another benefit to mention about this virtual file system, is that it can eventually be converted into an implementation of the Common::Archive system in ScummVM, allowing engines for other games that use ProDos to avoid implementing their own ProDos handling code. This makes sense as the ProDos interface code is already distinct from the game source code, so instead of having multiple implementations of loading game data for different game engines, the different game engines can retain more of their source structure.

Okay, now that we know why we’re working on ProDos, I should explain how the file structure actually works.
It looks complicated when interpreting any individual bit of data from the disk, but the structure is actually pretty straightforward.

ProDos
As mentioned earlier, ProDos abstracts away from the physics tracks and sectors of disks, with the device driver itself translating between ProDos commands and where the data is physically located. Instead, a ProDos disk is divided into a series of Blocks, containing 512 bytes each. The first two blocks are always the ‘Loader’ program, which gets run immediately and allows programs to interface with the disk. After that, the rest of the disk is made up of a few things:

  • Directories
  • The Volume Bitmap
  • File data

The Directories are comprised of a Header, and a list of File Entries. The Header contains various bits of information required to traverse the file system, such as a pointer to which block contains the remaining entries in the directory if there is not enough room in the current block. Since each block only has 512 bytes of data, and the blocks are in functionally random positions within the disk, there need to be pointers that link relevant blocks together. The Volume Bitmap is a sequence of bits that represent how space in blocks is used or unused in the disk, and is located after the directory blocks, but before the file data. One other thing to note is the term KeyBlock which just refers to the first block of any directory, if the directory requires more than one block.

A ProDos disk can be described in terms of blocks (any given directory can take more than one block, but for the purpose of this post I will assume only one) like this:

Block 0: Loader 1
Block 1: Loader2
Block 2: Volume Directory (the main directory)
Block 3: Subdirectory
Block 4: Volume Bitmap
Block 5+: File Data

The header and general file system traversal is fairly straightforward, but there is one big complication to note about this. Files can be very small, or very large, and the file system has to have a way to handle this in terms of small, 512 byte blocks. The way it does this is pretty interesting, and although I will go into further detail in the next post when the virtual file system is implemented, for now I will just mention the basics.

Files in ProDos can be classified as either inactive (deleted etc.), some type of tree file, or a subdirectory file. The subdirectory file of course just directs you to the subdirectory keyblock, and an inactive file is self explanatory. The tree file is what regular files are stored as, and in a nutshell it defines the size of the file.
ProDos would ideally like to have files always be small enough to fit in single blocks, as you would expect. But if a file gets too big for a single block, it needs a way to link multiple blocks together to save and load that file. To do this, it defines them as a seed, a sapling, or a tree. These definitions are functional, as a seed only needs a directory entry pointing to the data block in the disk. Whereas a sapling (bigger than one block, but not bigger than 256 blocks) requires another step (it has to ‘grow’ from a seed to sapling as the data size grows). In that case ProDos will dedicate a new block as an index for up to 256 data blocks, and the directory entry will then link to the index block instead of the data block, which is used to find and link all the data blocks across the disk. Lastly, a tree file is for the biggest files, and adds another step, linking together the index files of a sapling with a Master Index block, allowing for massive files of non-contiguous blocks of data. This makes virtualizing the file system somewhat complicated, but it is also a pretty neat way to handle files.

The Next Step

Alright, to close out this post I will briefly mention the next steps. The coding period begins tomorrow and the first step will be to ensure the virtual file system is usable, and get all the data from the disk organized and extracted by the engine. If there is time, hopefully I can implement the compression/decompression functions and maybe even get an image from the graphics data.
I have been revising my week by week plan to fit the new game version, and will continue to do so as I get a better idea of how the new engine will progress.

Okay, thanks for reading if you made it this far, I’ll have another update next week!