Legal part
Introduction
Brief description of functions
Assemble
Checkcondition
Decodeaddress
Disasm
Disassembleback
Disassembleforward
Isfilling
Printfloat* functions
This package includes the source code of a 32-bit Disassembler and a 32-bit single-line Assembler for 80x86-compatible processors. The source is a slightly stripped version of the code used in OllyDbg v1.04 and is well proven by its numerous users. (If you haven't heard of it before, OllyDbg is a 32-bit assembler-level debugger with powerful analysis capabilities that make binary machine code understandable.)
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License (http://www.fsf.org/copyleft/gpl.html) for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
All brand names and product names used in 80x86 Assembler and Disassembler,
accompanying files or in this help file are trademarks, registered trademarks,
or trade names of their respective holders.
Disassembler understands all standard 80x86 commands, FPU, MMX, AMD's MMX extensions, Athlon/PIII MMX extensions and 3DNow! instructions. It does not decode SSE or SSE2 commands. Disassembler assumes 32-bit code and data segments but correctly decodes prefixed 16-bit commands. Several decoding modes allow you to select the amount of returned information (the more information, the slower the execution): command length only, basic information useful for code analysis, or full decoding with dump and assembler form. Multiple options select the desired format. Disassembler and Assembler support both MASM and Borland's IDEAL modes.
Assembler converts a single command from ASCII form to binary code. It can find several possible encodings, or even create search patterns with undefined operands.
This package includes the following files:
#define MAINPROG // Place all unique variables here
#include "disasm.h"
(I use this trick to define shared global variables).
Below is a small piece of code disassembled with OllyDbg 1.04 using different text settings:
004505B3  A1 DC464B00       MOV EAX,DS:[4B46DC]
004505B8  8B0498            MOV EAX,DS:[EAX+EBX*4]
004505BB  50                PUSH EAX
004505BC  8D85 E0FBFFFF     LEA EAX,SS:[EBP-420]
004505C2  50                PUSH EAX
004505C3  E8 141BFCFF       CALL 004120DC
004505C8  83C4 08           ADD ESP,8
004505CB  43                INC EBX
004505CC  3B1D D8464B00     CMP EBX,DS:[4B46D8]
004505D2  0F8C AFFEFFFF     JL 00450487
004505D8  80BD E0FDFFFF 00  CMP BYTE PTR SS:[EBP-220],0
004505DF  75 14             JNZ SHORT 004505F5
004505E1  68 B39E4600       PUSH 469EB3
004505E6  8D85 E0FDFFFF     LEA EAX,SS:[EBP-220]
004505EC  50                PUSH EAX
004505ED  E8 521BFCFF       CALL 00412144

004505B3  A1 DC464B00       mov eax,[dword ds:4B46DC]
004505B8  8B0498            mov eax,[dword ds:eax+ebx*4]
004505BB  50                push eax
004505BC  8D85 E0FBFFFF     lea eax,[dword ss:ebp-420]
004505C2  50                push eax
004505C3  E8 141BFCFF       call 004120DC
004505C8  83C4 08           add esp,8
004505CB  43                inc ebx
004505CC  3B1D D8464B00     cmp ebx,[dword ds:4B46D8]
004505D2  0F8C AFFEFFFF     jl 00450487
004505D8  80BD E0FDFFFF 00  cmp [byte ss:ebp-220],0
004505DF  75 14             jnz short 004505F5
004505E1  68 B39E4600       push 469EB3
004505E6  8D85 E0FDFFFF     lea eax,[dword ss:ebp-220]
004505EC  50                push eax
004505ED  E8 521BFCFF       call 00412144
Function Assemble(), as expected, converts a command from ASCII form to binary 32-bit code. It shares the command table with Disasm(), so if some command can be disassembled, it can be assembled back too, with one exception: Assemble() doesn't support 16-bit addresses. With some unimportant exceptions, 16-bit addresses cannot be used in Win32 programs.
Some commands have more than one encoding. Assemble() allows you to find them all. This is important, for example, if you want to find the shortest possible code or all possible occurrences of the command in the code. Two parameters control this, constsize and attempt. The first selects the size of the immediate constant and address constant (8 or 32 bits); the second is the occurrence of the command in the command table. To find all variants, call Assemble() with attempt=0,1,2... and, for each attempt, with constsize=0,1,2,3, for as long as the function reports success for at least one constsize. Generated codes may repeat. Please note that if the command uses memory addresses, only one form is generated in each case: [EAX*2] but not [EAX+EAX]; [EBX+EAX] but not [EAX+EBX]; [EAX] will not use a SIB byte; no DS: prefix; and so on.
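The search over attempt and constsize described above can be sketched as follows. This is only an illustration under stated assumptions: MAXCMDSIZE is assumed to be 16 and the error buffer is assumed to hold 256 characters (take the real values from disasm.h), find_shortest() is a hypothetical helper, and toy_assemble() is a made-up stand-in that lets the sketch compile without the package. In real use you would pass Assemble itself.

```c
#include <string.h>

typedef unsigned long ulong;
#define MAXCMDSIZE 16   /* assumed here; use the value from disasm.h */
#define ERRLEN 256      /* assumed errtext size; not stated in this text */

/* Local copy of t_asmmodel as listed in this document. */
typedef struct t_asmmodel {
  char code[MAXCMDSIZE];
  char mask[MAXCMDSIZE];
  int length;
  int jmpsize;
  int jmpoffset;
  int jmppos;
} t_asmmodel;

/* Pointer type with the same signature as Assemble(). */
typedef int (*assemble_fn)(char *cmd, ulong ip, t_asmmodel *model,
                           int attempt, int constsize, char *errtext);

/* Toy stand-in, NOT the real Assemble(): pretends that two table entries
   match and that only some constsize values succeed for each of them. */
int toy_assemble(char *cmd, ulong ip, t_asmmodel *model,
                 int attempt, int constsize, char *errtext) {
  int n = 0;
  (void)cmd; (void)ip;
  if (attempt == 0 && constsize == 0) n = 2;
  else if (attempt == 0 && constsize == 3) n = 5;
  else if (attempt == 1 && constsize == 1) n = 3;
  if (n == 0) { strcpy(errtext, "no such variant"); return 0; }
  memset(model, 0, sizeof(*model));
  memset(model->mask, 0xFF, (size_t)n);
  model->length = n;
  errtext[0] = '\0';
  return n;
}

/* Tries attempt=0,1,2... and constsize=0..3 for each attempt, stopping at
   the first attempt for which no constsize succeeds. Stores the shortest
   encoding in *best and returns the number of successful variants. */
int find_shortest(assemble_fn assemble, char *cmd, ulong ip,
                  t_asmmodel *best) {
  char errtext[ERRLEN];
  int attempt, constsize, found = 0;
  memset(best, 0, sizeof(*best));
  for (attempt = 0; ; attempt++) {
    int any = 0;
    for (constsize = 0; constsize < 4; constsize++) {
      t_asmmodel am;
      int n = assemble(cmd, ip, &am, attempt, constsize, errtext);
      if (n <= 0) continue;          /* error or nonexistent variant */
      any = 1;
      found++;
      if (best->length == 0 || n < best->length) {
        *best = am;
        best->length = n;
      }
    }
    if (!any) break;                 /* no constsize worked: table exhausted */
  }
  return found;
}
```

Because generated codes may repeat, a caller that needs the distinct encodings would still have to compare the code bytes of the collected variants.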
Assemble() also compiles imprecise commands that include the following generalized operands:
The function returns the number of bytes in the assembled code, or a non-positive number (zero or negative) in case of an error or when the variant selected by the combination of attempt and constsize doesn't exist. On error, this number is the negated position of the error in the input command. If you generate executable code, imprecise commands are usually not allowed. To ensure that the command is precise, check that all significant bytes in mask contain 0xFF.
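The precision check can be written as a small helper. The sketch below assumes that the "significant" bytes are the first length bytes of mask and that MAXCMDSIZE is 16 (take the real value from disasm.h); is_precise() is a hypothetical name, not part of the package.

```c
#define MAXCMDSIZE 16   /* assumed here; use the value from disasm.h */

/* Local copy of t_asmmodel as listed in this document. */
typedef struct t_asmmodel {
  char code[MAXCMDSIZE];
  char mask[MAXCMDSIZE];
  int length;
  int jmpsize;
  int jmpoffset;
  int jmppos;
} t_asmmodel;

/* Returns 1 if every one of the first model->length mask bytes is 0xFF,
   that is, no bit of the assembled code is left undefined. Returns 0 for
   an empty model or when any mask byte has ignored bits. */
int is_precise(const t_asmmodel *model) {
  int i;
  if (model->length <= 0 || model->length > MAXCMDSIZE) return 0;
  for (i = 0; i < model->length; i++) {
    if ((unsigned char)model->mask[i] != 0xFF) return 0;
  }
  return 1;
}
```

A caller that writes code into a debugged process could reject any model for which is_precise() returns 0.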
int Assemble(char *cmd,ulong ip,t_asmmodel *model,int attempt,int constsize,char *errtext);
Parameters:
typedef struct t_asmmodel {            // Model to search for assembler command
  char           code[MAXCMDSIZE];     // Binary code
  char           mask[MAXCMDSIZE];     // Mask for binary code (0: bit ignored)
  int            length;               // Length of code, bytes (0: empty)
  int            jmpsize;              // Offset size if relative jump
  int            jmpoffset;            // Offset relative to IP
  int            jmppos;               // Position of jump offset in command
} t_asmmodel;
Members:
Checks whether 80x86 flags meet condition code in the command. Returns 1 if condition is met and 0 if not.
int Checkcondition(int code,ulong flags);
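For orientation, here is how such a test can be written from the standard 80x86 condition encoding. This is an independent sketch, not the package's implementation: check_cc() is a hypothetical name, and it assumes that only the low 4 bits of code matter and that flags uses the EFLAGS bit layout (CF=0x001, PF=0x004, ZF=0x040, SF=0x080, OF=0x800).

```c
typedef unsigned long ulong;

/* Evaluates an 80x86 condition code (0=O, 1=NO, 2=B, 3=NB, 4=E, 5=NE,
   6=BE, 7=A, 8=S, 9=NS, A=P, B=NP, C=L, D=GE, E=LE, F=G) against EFLAGS.
   Returns 1 if the condition is met and 0 if not. */
int check_cc(int code, ulong flags) {
  int cf = (flags & 0x001UL) != 0;
  int pf = (flags & 0x004UL) != 0;
  int zf = (flags & 0x040UL) != 0;
  int sf = (flags & 0x080UL) != 0;
  int of = (flags & 0x800UL) != 0;
  int met;
  switch ((code >> 1) & 7) {           /* bits 3..1 select the test */
    case 0:  met = of; break;                  /* O  */
    case 1:  met = cf; break;                  /* B  */
    case 2:  met = zf; break;                  /* E  */
    case 3:  met = cf || zf; break;            /* BE */
    case 4:  met = sf; break;                  /* S  */
    case 5:  met = pf; break;                  /* P  */
    case 6:  met = (sf != of); break;          /* L  */
    default: met = zf || (sf != of); break;    /* LE */
  }
  return (code & 1) ? !met : met;      /* bit 0 inverts the test */
}
```

For example, code 4 (E) is met when ZF is set, and code 0xF (G) is met when ZF is clear and SF equals OF.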
Parameters:
Custom user-supplied function that converts constant (address) into symbolic name. Initially, source code includes dummy function that returns 0.
Decodeaddress() decodes memory address or constant to the ASCII string and optionally comments this address. Returns length of decoded string (not including terminal 0), or 0 on error or if symbolic name is not available.
int Decodeaddress(ulong addr,char *symb,int nsymb,char *comment);
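A replacement for the dummy function might look like the sketch below. The symbol table, its names and its comments are made up for illustration. The sketch also assumes that nsymb is the size of the symb buffer in bytes, that comment may be NULL, and that the comment buffer is large enough for the short texts used here; none of this is stated in this text.

```c
#include <string.h>

typedef unsigned long ulong;

/* Hypothetical symbol table; names and comments are invented. */
static const struct {
  ulong addr;
  const char *name;
  const char *note;
} symtab[] = {
  { 0x004120DCUL, "my_printf", "hypothetical helper" },
  { 0x004B46DCUL, "my_table",  "hypothetical data" }
};

/* Looks up addr in the table. On success copies the name to symb and,
   if comment is not NULL, a short note to comment, then returns the
   length of the name (without the terminating 0). Returns 0 if the
   address is unknown or the name does not fit into nsymb bytes. */
int Decodeaddress(ulong addr, char *symb, int nsymb, char *comment) {
  size_t i, n;
  if (symb == NULL || nsymb <= 0) return 0;
  for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
    if (symtab[i].addr != addr) continue;
    n = strlen(symtab[i].name);
    if (n >= (size_t)nsymb) return 0;          /* name would not fit */
    memcpy(symb, symtab[i].name, n + 1);
    if (comment != NULL) strcpy(comment, symtab[i].note);
    return (int)n;
  }
  return 0;                                    /* no symbolic name */
}
```

In a real debugger the table would come from export tables or debug information rather than a fixed array.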
Parameters:
The most important (and complex) function in this package. Depending on the specified disasmmode, Disasm() performs one of the four functions:
ulong Disasm(char *src,ulong srcsize,ulong srcip,t_disasm *disasm,int disasmmode);
Parameters:
typedef struct t_disasm {              // Results of disassembling
  ulong          ip;                   // Instruction pointer
  char           dump[TEXTLEN];        // (*) Hexadecimal dump of the command
  char           result[TEXTLEN];      // (*) Disassembled command
  char           comment[TEXTLEN];     // (*) Brief comment
  int            cmdtype;              // One of C_xxx
  int            memtype;              // Type of addressed variable in memory
  int            nprefix;              // Number of prefixes
  int            indexed;              // Address contains register(s)
  ulong          jmpconst;             // Constant jump address
  ulong          jmptable;             // Possible address of switch table
  ulong          adrconst;             // Constant part of address
  ulong          immconst;             // Immediate constant
  int            zeroconst;            // Whether contains zero constant
  int            fixupoffset;          // Possible offset of 32 bit fixups
  int            fixupsize;            // Possible total size of fixups or 0
  int            error;                // Error while disassembling command
  int            warnings;             // Combination of DAW_xxx
} t_disasm;
Members:
Calculates the address of the assembler instruction that is n instructions (at most 127) back from the instruction at the specified ip. Returns the address of the found instruction. In case of an error, it may be less than n instructions apart.
80x86 commands have variable length. Disassembleback uses heuristic methods to separate commands and in some (astoundingly rare!) cases may return an incorrect answer.
ulong Disassembleback(char *block,ulong base,ulong size,ulong ip,int n);
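One common heuristic for stepping back through variable-length code is to start well before ip, decode forward while remembering each command's address, and then pick the address n entries before the end. The sketch below illustrates that idea only; it is not necessarily what Disassembleback does. step_back(), cmdlen_fn and toy_cmdlen() are hypothetical names, the 16-byte bound on command length is an assumption, and the toy decoder knows just NOP (90) and MOV EAX,imm32 (B8 plus 4 bytes), counting every other byte as a 1-byte command.

```c
typedef unsigned long ulong;
#define MAXBACK 127      /* limit on n stated in this document */
#define MAXCMDLEN 16     /* assumed upper bound on command length */

/* Returns the length of the command at code; avail bytes are readable. */
typedef ulong (*cmdlen_fn)(const unsigned char *code, ulong avail);

/* Toy length decoder for the demonstration only. */
ulong toy_cmdlen(const unsigned char *code, ulong avail) {
  if (avail >= 5 && code[0] == 0xB8) return 5;   /* MOV EAX,imm32 */
  return 1;                                      /* NOP or unknown byte */
}

/* Walks forward from a point before ip and returns the address of the
   command that lies n commands before ip. If fewer than n commands were
   seen, returns the earliest one found. */
ulong step_back(const unsigned char *block, ulong base, ulong size,
                ulong ip, int n, cmdlen_fn cmdlen) {
  ulong ring[MAXBACK + 1];           /* addresses of the last commands */
  ulong addr, back, len;
  int count = 0;
  if (n <= 0) return ip;
  if (n > MAXBACK) n = MAXBACK;
  if (ip > base + size) ip = base + size;
  if (ip <= base) return base;
  back = (ulong)MAXCMDLEN * (ulong)(n + 3);      /* generous head start */
  addr = (ip - base < back) ? base : ip - back;
  while (addr < ip) {
    ring[count % (MAXBACK + 1)] = addr;
    count++;
    len = cmdlen(block + (addr - base), base + size - addr);
    if (len == 0) len = 1;           /* never stall on a bad decode */
    addr += len;
  }
  if (count < n) n = count;          /* fewer commands than requested */
  return ring[(count - n) % (MAXBACK + 1)];
}
```

When the walk starts in the middle of a command, the decoder usually resynchronizes after a few commands, but nothing guarantees it, which is why such a method can occasionally give a wrong answer.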
Parameters:
Calculates address of assembler instruction that is n instructions forward from instruction at specified address. Returns address of found instruction. In case of error, it may be less than n instructions apart.
ulong Disassembleforward(char *block,ulong base,ulong size,ulong ip,int n,int usedec);
Parameters:
Function determines whether the instruction pointed to is a no-action command (equivalent to NOP) used by different compilers to fill the gap between procedures or data blocks up to a specified aligned border. Returns the length of the filling command in bytes, or 0 if the command is not a recognized filling.
int Isfilling(ulong addr,char *data,ulong size,ulong align);
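To give an idea of what such a check involves, the sketch below recognizes a few well-known 32-bit no-action encodings: NOP (90), MOV reg,reg and XCHG reg,reg with the same register on both sides (8B and 87 with a register ModRM byte), and LEA reg,[reg+0] with an 8-bit zero displacement (8D). It is an illustrative subset under my own assumptions, not the list the package actually uses. filling_length() is a hypothetical name, and the addr and align handling of Isfilling is left out.

```c
typedef unsigned long ulong;

/* Returns the length in bytes of a recognized no-action command at data,
   or 0 if the bytes are not one of the few forms handled here. */
int filling_length(const unsigned char *data, ulong size) {
  unsigned int modrm, mod, reg, rm;
  if (data == 0 || size == 0) return 0;
  if (data[0] == 0x90) return 1;                 /* NOP */
  if (size < 2) return 0;
  modrm = data[1];
  mod = modrm >> 6;                              /* addressing mode */
  reg = (modrm >> 3) & 7;                        /* register operand */
  rm = modrm & 7;                                /* register or base */
  if (reg != rm) return 0;                       /* must be the same register */
  if ((data[0] == 0x8B || data[0] == 0x87) && mod == 3)
    return 2;                                    /* MOV r,r or XCHG r,r */
  if (data[0] == 0x8D && mod == 1 && rm != 4 &&  /* rm=4 would need a SIB byte */
      size >= 3 && data[2] == 0x00)
    return 3;                                    /* LEA r,[r+0] */
  return 0;
}
```

For example, 8B FF (MOV EDI,EDI) yields 2 and 8D 40 00 (LEA EAX,[EAX+0]) yields 3, while 8B C1 (MOV EAX,ECX) yields 0.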
Parameters:
These functions decode a 4-, 8- or 10-byte floating point number or an 8-byte 3DNow! operand into text form in string s. They correctly decode all cases of NANs or INFs without triggering floating point exceptions. If the operand is not a valid floating point number, the functions print a hexadecimal dump of the number. They return the length of the decoded string in bytes, not including the terminal 0.
int Print3dnow(char *s,char *f);
int Printfloat10(char *s,long double ext);
int Printfloat4(char *s,float f);
int Printfloat8(char *s,double d);
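The usual way to handle NANs and INFs safely is to inspect the bit pattern before doing any floating point formatting. The sketch below shows this for the 4-byte case. It assumes that float is a 32-bit IEEE 754 value; print_float4() is a hypothetical name, and the output texts ("+INF", "NAN" followed by a hex dump, "%.7g" for ordinary numbers) are my own choice, not the format the package produces.

```c
#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Writes a text form of f to s and returns its length, not counting the
   terminating 0. Special values are recognized from the raw bits, so no
   arithmetic is ever performed on a NAN or INF. */
int print_float4(char *s, float f) {
  uint32_t bits, exponent, fraction;
  memcpy(&bits, &f, sizeof(bits));     /* reinterpret without aliasing issues */
  exponent = (bits >> 23) & 0xFFu;
  fraction = bits & 0x007FFFFFu;
  if (exponent == 0xFFu && fraction == 0)
    return sprintf(s, "%cINF", (bits & 0x80000000u) ? '-' : '+');
  if (exponent == 0xFFu)
    return sprintf(s, "NAN %08lX", (unsigned long)bits);
  return sprintf(s, "%.7g", (double)f);
}
```

The 8- and 10-byte variants follow the same pattern with wider exponent and fraction fields; the 10-byte format additionally has bit patterns that are not valid numbers at all, which is where a plain hexadecimal dump is the only sensible output.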
Copyleft (C) 2001 Oleh Yuschuk