THUMB Tutorial

This document is made to teach those who want to learn assembly for the ARM Thumb mode. In this document I shall teach the components on a need-to-know basis.

1 Registers
2 Making a program that runs: the Push and Pop commands
3 Arithmetic functions: add, sub, mul, neg
4 Logic functions: orr, and, mvn, eor, lsl, lsr, ror
5 Memory access: ldr, ldrh, ldrb, str, strh, strb
6 Control flow and unconditional jumps: cmp, b, b<cond>
7 Call and return to functions: bl, bx, swi

Before we begin, I shall briefly explain some concepts you need to know.

Registers

Registers are small, extremely fast, extremely low-capacity pieces of memory that sit inside the processor. Why are they so important? Because they are the only thing the processor knows how to work with. Every operation involves at least one register, and normally two or three. The processor cannot work directly on any other memory, so every piece of data to be worked on must first be loaded into a register. How many are there? In Thumb mode (the one covered in this document) 16 registers are available. Of those 16, only 8 are available for any use.


Those registers are called low registers, and are all-purpose, meaning the processor doesn't have any designated use for them. They are called r0 to r7 respectively, and are the ones you will normally use in most programs. The remaining 8 are called high registers and are usually unavailable to manipulate directly. The last three of them have special names: SP, LR and PC. SP is the Stack Pointer. The stack pointer contains the address in memory where data is stored when the Push command is called. More on that later. LR is the Link Register, which holds the return address.

When a branch and link (bl) instruction calls a function, the address of the instruction that follows the call is stored here. Not very important right now, but it needs to be kept safe. PC is the Program Counter, the address of the next instruction to be executed. This is where the processor finds out where in memory the next command it should run is. Messing with it can cause undesired effects if you don't know what you're doing. Registers r8 to r12 are also general purpose, but because most instructions cannot use them directly in this mode, they will not be used at all in this document.

Making a program that runs: the Push and Pop commands

In a normal situation, these are about the last commands newcomers to the world of assembly will learn, but because this is a hacking document, we must start with these. As you are hacking a Pokémon game, more than likely you will use your program by calling it from a script, with the callasm command. The registers are shared by the entire game, meaning that when you call your program, something else was using those registers, and probably needs the information in them. So, we should save them somewhere safe and recover them at the end of our program.

That is where the Push and Pop commands come in. push {rx} is a command that places the contents of rx on top of the stack and moves the stack pointer to match (the stack grows toward lower addresses, so SP decreases by 4). pop {ry} is a command that places the top content of the stack in ry and then moves the stack pointer back by the same amount. Here x and y are any number between 0 and 7. The push and pop commands have a very specific use in mind: storing things temporarily. So, it doesn't matter where you push, or how many times you push, just that you pop that many times afterwards. This is very important! If you push, pop it afterwards!

Stack misalignment is really bad, and you will see why after I introduce some other variants of the push and pop commands. Now, if you wanted to push all registers onto the stack, right now you would have to write it like this:

    push {r0}
    push {r1}
    ...
    push {r7}

Of course, since it is so usual to do this, the people who made the language came up with a single line for it:

    push {r0-r7}

That literally means "push all from r0 to r7", which makes it a lot easier. Pop also works like this. But what if you only wanted to push r0 and r7, but not those in between?

    push {r0, r7}

That comma there makes all the difference.

Now that means "push r0 and r7". It also works like that with pop. You can also combine them, making lines like

    push {r0, r2-r4, r7}

There is one more push/pop pair you need to know, and it will be the opening and ending of all programs I teach you to make here:

    push {lr}
    pop {pc}

There is no pop {lr} or push {pc}. These make the job of returning from functions easier, and they are one of the main reasons why pushes and pops must be checked carefully. These two, respectively, place the return address of this function on the stack and make the next instruction executed the one that follows the call in the calling function.
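To make the push {lr} / pop {pc} pair concrete, here is a minimal sketch with the stack pointer traced in the comments. The starting values of sp and lr and the label name Wrapper are hypothetical, picked only so the arithmetic is easy to follow.

```asm
@ Hypothetical starting state: sp = 0x03007F00, lr = 0x08000123
.align 2
.thumb
Wrapper:
    push {lr}       @ sp becomes 0x03007EFC; the word at 0x03007EFC now holds 0x08000123
    @ ... body of the function; every push in here needs a matching pop ...
    pop {pc}        @ pc is loaded from the word at 0x03007EFC; sp is back at 0x03007F00
```

If the body left one extra word on the stack, pop {pc} would load that word instead of the return address, which is why unpaired pushes are so dangerous.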

These can be combined with the other pushes and pops, making lines like this:

    push {r0-r7, lr}

Main cares to take when using Push/Pop

As previously stated, pop what you push. Not doing so may result in the game crashing instantly, as it tries to execute the BIOS header. Example (r0 has 00000000 in it):

    push {lr}
    push {r0}
    pop {pc}

PC now has 00000000 in it, that is, the processor will try to run from address 00000000, which is protected against reading. The game crashes. Also, pop in the reverse order you push. In the code

    push {r0}
    push {r1}
    pop {r0}
    pop {r1}

r0 now has r1's data and r1 has r0's data. More than likely, you expected the opposite to happen.

When coding

    push {r0}
    push {r1}
    pop {r1}    (search for the latest push, and place its register here)
    pop {r0}    (again, see the latest "not popped" push, and place its register)

Also, in code like this

    pop {r0-r7, pc}

the designers made it so it's safe to undo a matching push {r0-r7, lr}, as long as no other push or pop in between is left unpaired. If, for some reason, you need to push them all at once and then pop them one by one, that push is equivalent to

    push {lr}
    push {r7}
    ...
    push {r0}

So the first to pop would be r0. Now, to apply this in a rather uninteresting program, I will first teach some assembler directives for the devkitARM assembler as I write the code.

    @ Comments are marked by an @ in front of them
    .align 2    @ tells the assembler to start the code on an aligned address
                @ (on ARM targets .align 2 means a multiple of 4). Thumb is
                @ 16-bit code, and the processor in Thumb mode adds 2 to the
                @ PC register per instruction, so code at an odd address
                @ would not work
    .thumb      @ tells the assembler this is Thumb code
    Start:
        push {lr}
        pop {pc}

And there it is. To compile it, save it in any file, go to the command line, change to your devkitARM assembler folder and type "as (your filename)". It should come out as a file named a.out.

Open it in your hex viewer. A short way into the file there should be a pair of bytes of the form xx b5; in this case xx will be 00. Copy up to the part that has xx bd, 00 again in this case. Actually, this code is only two half-words long: 00 b5 00 bd. And there you have it, your first compiled Thumb code. Why search for xx b5? Well, that's because push {something, lr} becomes xx b5 in its compiled form. xx varies from 00 to ff, one bit for each register to push: ff b5 would be push {r0-r7, lr} in hexadecimal notation. Likewise for xx bd: bd is pop {something, pc}.
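To make the xx part concrete, here is a small listing with the half-word each line assembles to, worked out from the Thumb encoding (one bit per low register in the low byte, b5 or bd in the high byte). The label name Encodings is only for illustration, and the listing is meant to be assembled and inspected in a hex viewer, not called from the game.

```asm
.align 2
.thumb
Encodings:
    push {lr}           @ 0xb500, stored in the file as the bytes 00 b5
    push {r0, r2, lr}   @ 0xb505, bytes 05 b5 (bit 0 set for r0, bit 2 set for r2)
    push {r0-r7, lr}    @ 0xb5ff, bytes ff b5 (all eight bits set)
    pop {r0-r7, pc}     @ 0xbdff, bytes ff bd
    pop {r0, r2, pc}    @ 0xbd05, bytes 05 bd
    pop {pc}            @ 0xbd00, bytes 00 bd
```

The low byte comes first in the file because the GBA stores half-words in little-endian order, which is why you search for xx b5 and not b5 xx.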

If you want to run it in your game of choice, just put it somewhere with an even address, remember where you put it, and then make a script with "callasm" address + 1. Why? You will learn why later on. I also remind you that you should keep a backup of your ROM and your save file before trying this. While this function will not mess up your ROM or save, some more advanced ones can corrupt both of them beyond recognition. Now, this code did absolutely nothing except get called and return. To do the interesting stuff, you need to learn about a few other commands.

Also, that Start: right there marks the beginning of the code. It's not just an annotation like the comments. That is a label. A label is the name of the address of that instruction (the name we gave it, that is). While compiling, any instruction that makes a reference to a label will have it replaced with the label's address relative to the instruction's own position. More on that when we talk about jumps.

Arithmetic functions: add, sub, mul, neg

The first two, you won't get enough of. The third generally has a better alternative to it, so we don't use it much.

Add and Sub have nearly the same syntax, and generally come in three varieties:

    add rx, ry, rz     > adds the contents of ry and rz and places the result in rx
    add rx, #num       > adds num to rx and places the result in rx. num must be between 0x0 and 0xFF
    add rx, ry, #num   > adds ry and num and places the result in rx. This num is between 0 and 7

Mul, however, has only one form:

    mul rx, ry         > multiplies rx by ry, storing the result in rx

As does Neg:

    neg rd, rn         > places the symmetric (negated) value of rn in rd

Also, to make it clear, storing in a register effectively erases what was inside. Let me focus a bit more on Add/Sub.

These are very useful commands, as a great many operations can be built from them. In fact, most programs are just that, lots of adds. Also, sub can be substituted by adding the negative number. If you want to subtract 1, it's the same as adding -1. But I won't go into detail on this matter; for most numbers it is easier to simply call Sub. Also, a good way to move a value from one register to another is to call add rd, rn, #0x0, rd being the destination register and rn the one to be copied. Applying this new information to the previous compilable code, we can play a bit with numbers.

It is a common practice to push the registers you are going to use prior to usage, and pop them back at the end. So, here it goes.

    .align 2
    .thumb
    Start:
        push {r0-r2, lr}
        sub r1, r1, r1      @places r1 minus r1 in r1. As you may
                            @have guessed, this makes r1 equal 0
        add r1, #0x5        @places 5 in r1
        sub r0, r0, r0
        add r0, #0x3
        mul r0, r1          @multiplies r0 and r1
        neg r2, r0          @and places the result in r2 (negative)
        pop {r0-r2, pc}

This code still does nothing we can see in-game, but now it's doing something. If you placed this code directly in a GBA or emulator, you would get a program that multiplied 3 by 5.

Not the most wanted outcome, but it's a start. Next, we will look at one more piece of code that performs some logic operations on the registers.

Logic functions: orr, and, mvn, eor, lsl, lsr, ror

You should know the common logic operators by now, but in case you don't, let me recap them for you.

AND takes two numbers in binary form and places a 1 only where both had a 1. So

    0110 (6) AND 0100 (4) = 0100 (4)

Very useful to separate data that is packed together at the bit level, like the IV data of a Pokémon, using what we call a mask. For instance, IVs are stored in 5 bits each of a 32-bit word. To get the lowest IV in the word (HP, in the Generation III layout) you would compute (iv_word) AND 0x0000001f, which has the last 5 bits set to one, so only the real IV value would remain.

ORR is the operation that lets you see where the 1 bits are in either of two words. So

    1001 (9) ORR 0100 (4) = 1101 (d)

It has its uses, but its exclusive counterpart is much more useful.

MVN (move negated), or NOT as it is better known, turns all 0 bits into 1 and all 1 bits into 0. It's a one-word operation.

    MVN 1100 (c) = 0011 (3)

It may also be useful to invert masks.

EOR, better known as XOR, the exclusive or, is an interesting function that, given two words, creates one that has a 1 wherever only one of the original words had a 1.

    0110 (6) EOR 1011 (b) = 1101 (d)

The interesting part of this function is that, if the result is EORed again with one of the original keys, it returns the other key.

    1101 (d) EOR 1011 (b) = 0110 (6)
    1101 (d) EOR 0110 (6) = 1011 (b)

As such, it is handy as a simple encryption method, and Pokémon data is encrypted this way. EORing the OT ID with the Pokémon's individual Personality ID makes a key, and key EOR word of data hides that word of the Pokémon's data. The main use for EOR is to decrypt or encrypt data, but it has many other uses that I leave for you to find out.
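The key-then-data step described above can be sketched in a few lines. This is only an illustration: the label Toggle_word and the choice of which register carries which input are assumptions made for this example, not the game's own routine.

```asm
.align 2
.thumb
@ Assumed inputs (hypothetical): r0 = one word of data,
@ r1 = OT ID, r2 = Personality ID
Toggle_word:
    push {r1, lr}
    eor r1, r2      @ r1 = OT ID EOR Personality ID, which is the key
    eor r0, r1      @ r0 = data EOR key: plain data comes out hidden,
                    @ hidden data comes out plain
    pop {r1, pc}    @ r1 is restored, r0 carries the result back
```

Because EORing twice with the same key gives back the original word, the same routine serves for both encrypting and decrypting.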

Finally, we have three functions that work almost the same. LSL means logical shift left, while LSR means logical shift right. Both do what their names imply: they take the value in a register and shift its bits left or right by as many places as we command.

    0111 (7) LSL 1 = 1110 (e = 14)
    1100 (c = 12) LSR 2 = 0011 (3)

You may also notice that LSL of a number by x = number * 2^x, and likewise LSR of a number by x = number / 2^x (the remainder is discarded). That is why these two functions are so heavily used. LSL is the easiest and most efficient way to multiply by two, as well as to move a hex number one byte up: 0x12 LSL 8 = 0x1200.
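A right shift combines naturally with the AND mask from the IV discussion earlier. The sketch below pulls out the 5-bit field that sits just above the lowest one. The label Second_iv and the use of r0 for input and output are assumptions for this example, and mov (a standard Thumb instruction that places a number between 0x0 and 0xFF in a register) is not otherwise covered in this part of the tutorial.

```asm
.align 2
.thumb
@ Assumed input (hypothetical): r0 = the 32-bit IV word
@ Returns in r0 the 5-bit field stored in bits 5 to 9
Second_iv:
    push {r1, lr}
    lsr r0, r0, #0x5    @ drop the lowest 5-bit field; same as dividing by 32
    mov r1, #0x1f       @ the mask: the five lowest bits set
    and r0, r1          @ keep only those five bits
    pop {r1, pc}
```

Changing the shift amount to 10, 15 and so on would select the later fields in the same way.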

Also, in Thumb assembly, there is no division command, and as such the only way to divide is to use LSR (actually, there is another way, but I'll leave that one for much later on). ROR is rotate right. It's similar to LSR, but different.

    1001 (9) LSR 1 = 0100 (4)
    1001 (9) ROR 1 = 1100 (c)

Spotted the difference? ROR placed the 1 that fell off the right end back at the beginning of the word. That is the main difference between them. But because most of the time you will use LSR to get rid of a piece of data, ROR is rarely used.

Now to the real assembly. AND, ORR and EOR have the same syntax:

    and rd, rm         > places in rd the result of rd AND rm
    mvn rd, rm         > places in rd the result of NOT(rm)

LSL and LSR have the same syntax, and there are two forms:

    lsl rd, rm, #num   > rd is loaded with rm shifted left by num bits. lsl takes
                         shifts of 0 to 31; lsr rd, rm, #0x20 shifts every bit
                         out and so cleans the register rd
    lsl rd, rm         > shifts the content of rd by rm bits. Amounts from 0x20
                         up to 0xFF clean the register
    ror rd, rm         > rotates the content of rd right by rm bits; same form
                         as lsr rd, rm

These are all the logic functions. Now, for a coded example. This will be code that could be used in a function you wrote.

    .align 2
    .thumb
    Make_address:
        push {r1-r3, lr}
        lsr r0, r0, #0x20   @this is how I clean registers
        add r0, #0x8        @r0 now has 0x8
        lsl r1, r0, #0x4    @this will make the 0x8 we placed become a 0x80
        lsr r0, r0, #0x4    @this also cleans the register because 0x8 is 1000b
        add r2, r0, #0x0    @cleans r2. Purely for educational purposes
        add r1, #0x8        @r1 now has 0x88
        lsl r1, r1, #0x14   @r1 was pushed left: 0x08800000
        and r0, r2          @not needed, but educational. X AND 0 = 0. Another way
                            @to clean variables
        orr r0, r1          @places r1 in r0, another way I use to move them around
        pop {r1-r3}
        pop {pc}

Notice that, in this example, we pushed and popped differently. First, I pushed r1-r3, but I didn't use r3. You don't need to use all registers you push.

It is, however, considered inefficient, but better safe than sorry. Also, notice that I used r0 but didn't push or pop it. That is called passing r0 as an argument or return value. Here, r0 is whatever the other program that called this one is waiting for. As such, a program that calls this one is interested in that r0, so it either saved the r0 it had prior to calling the function or, just like r1 or r2 in this function, r0 held a temporary value that had no need to be stored permanently. Placing this code in the game will work, for both r0 and r1 are permanently temporary values in GameFreak's code.

Although it would cause no problems on its own, you wouldn't gain anything from it. So, even though what this code does is load our favorite ROM address into r0, it is still quite useless. You must be wondering, "When will we learn how to make the useful stuff?" Well, the time has come. Introducing the only way you can interact with memory: loads and stores.

Memory access: ldr, ldrh, ldrb, str, strh, strb

Up until now, we've been treating registers as what they are, data containers. But what data is stored inside them? Well, data falls in three main categories.

Byte: the byte is the second most basic unit in computer science.

It's 8 bits long, and takes only one slot in memory.

Half-word or 16-bit word: this is the size of a Thumb instruction, the unit the processor reads in Thumb mode. Not that common nowadays, but the GBA uses it plenty.

Word or 32-bit word: taking four memory slots, this is the biggest value the GBA can handle at once, and is also the default size handled by many processors out there.

The ARM processor is built for 32-bit addresses and data, but the Thumb mode was built to avoid using such long commands, making them smaller so that they occupy less of the usually limited ROM space (which most of its clients used at the time).

The GBA can only read 32MB of ROM memory, so Thumb was preferred. Also, the GBA was made so that it reads 16-bit words from the ROM, so Thumb is also faster than normal ARM assembly when executing from the ROM. Within those three types of data there are subtypes, most of which you might be familiar with.

In the bytes, we have:
Characters: the single letters that compose the game's alphabet.
Small-int: numbers between 0 and 255. Unsigned only.

In the half-words we have:
Int-16: numbers between 0 and 65535. Unsigned only.
Thumb instructions: I doubt you will see these, but they may also be all the above commands in their machine-code form.
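The three sizes line up with the three load instructions named in this section's title. As a minimal sketch, assuming r0 already holds a readable address that is a multiple of 4 (the label Load_sizes and the register choices are hypothetical), each load reads a different amount of data from the same place:

```asm
.align 2
.thumb
@ Assumed input (hypothetical): r0 = a readable address, multiple of 4
Load_sizes:
    push {r1-r3, lr}
    ldrb r1, [r0]   @ r1 = the byte at r0 (0 to 255); the upper 24 bits are zero
    ldrh r2, [r0]   @ r2 = the half-word at r0 (0 to 65535); the upper 16 bits are zero
    ldr  r3, [r0]   @ r3 = the full 32-bit word at r0
    pop {r1-r3, pc}
```

The address is kept a multiple of 4 here because ldrh expects an even address and ldr expects one that is a multiple of 4.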
