This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Disassembler/Decompiler using libbfd


Hello,
Recently I thought about writing a simple asm-to-C decompiler (or just
a simple disassembler which can guess functions and their arguments,
display imports from other shared libraries, etc.).
I don't like the idea of implementing all the opcodes for Intel x86 and
other architectures inside my program, so I thought about libbfd. I looked
at the objdump sources, pulled out the interesting parts, and now I have a
272-line disassembler. libbfd is great for this kind of thing. But now I'm stuck.

I can print the decoded opcodes and their args to the screen or to a file.
I can print all decoded opcodes from one call up to the leave/retn or the
next call (using sscanf()). OK, but I have no idea how I should parse the
decoded opcodes and args. I need to get them in a more parsable form.
[Not with sscanf()!]

Yeah, I know I can do:
	init_disassemble_info(&info, my_own_data, (fprintf_ftype) my_own_function);
And I do.

When I just print the format string passed to my_own_function() and pipe
the output through sort and uniq, the only formats I get are "%s" and ",".
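
To make that experiment concrete, here is a rough sketch of a callback with
the same shape as fprintf() that records both the format string and the
rendered text of every call. The names call_log, record_call and
fake_disassemble are made up for illustration; fake_disassemble only stands
in for the real disassemble_fn, and the four calls it makes are an
assumption based on the "%s" and "," formats observed above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MAX_CALLS 16

/* Hypothetical log of what the disassembler passed to the callback. */
struct call_log {
    int  count;
    char fmt[MAX_CALLS][16];    /* format string of each call */
    char text[MAX_CALLS][64];   /* rendered text of each call */
};

/* Same shape as fprintf(): (stream, format, ...).  The "stream" is our log. */
static int record_call(void *stream, const char *fmt, ...)
{
    struct call_log *log = stream;
    va_list ap;
    int n;

    assert(log != NULL && fmt != NULL);
    if (log->count >= MAX_CALLS)
        return 0;               /* log is full: ignore further calls */

    snprintf(log->fmt[log->count], sizeof log->fmt[0], "%s", fmt);
    va_start(ap, fmt);
    n = vsnprintf(log->text[log->count], sizeof log->text[0], fmt, ap);
    va_end(ap);
    log->count++;
    return n;
}

/* Stand-in for the real disassemble_fn (assumption: it emits the mnemonic,
   then each operand with "%s", with "," between operands). */
static void fake_disassemble(int (*out)(void *, const char *, ...), void *stream)
{
    out(stream, "%s", "mov    ");
    out(stream, "%s", "%eax");
    out(stream, ",");
    out(stream, "%s", "%ebx");
}
```

Calling fake_disassemble(record_call, &log) on a zero-initialised
struct call_log leaves four entries in the log, one per callback call.
With the real library, &log would be passed as the stream argument of
init_disassemble_info() instead.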

So we would have:
/* lock_opcode global variable */

lock_opcode = 1;		/* lock -> new opcode */
(*disassemble_fn)(section->vma + start_offset, &info); /* call disassemble_fn */
lock_opcode = 0;		/* unlock */

and then in my_own_function() we can check whether lock_opcode == 1;
if so, the first string we get is the decoded name of the opcode.

Then we'd get the first param [if there is one]. If the opcode has more
params, we'd get: ',' next_param_in_%s, ',' next_param_in_%s, etc.,
until lock_opcode is set back to 0.
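
A minimal sketch of that idea, under the assumption that the backend really
does send the mnemonic first and then the operands separated by ",". It is a
small variation on the scheme above: the flag lives in a per-instruction
struct instead of a global, and the callback itself clears it once the
mnemonic has arrived. The names parsed_insn, begin_insn and collect_token
are hypothetical, and trimming trailing spaces from the mnemonic is also an
assumption about how the text is padded.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MAX_OPERANDS 4

/* Hypothetical structured form of one decoded instruction. */
struct parsed_insn {
    int  lock_opcode;               /* 1 until the mnemonic has arrived */
    char mnemonic[64];
    int  n_operands;
    char operand[MAX_OPERANDS][64];
};

/* Call before each disassemble_fn call: the next text is the mnemonic. */
static void begin_insn(struct parsed_insn *insn)
{
    memset(insn, 0, sizeof *insn);
    insn->lock_opcode = 1;
}

/* fprintf()-shaped callback; the "stream" is the struct being filled in. */
static int collect_token(void *stream, const char *fmt, ...)
{
    struct parsed_insn *insn = stream;
    char buf[64];
    size_t len;
    va_list ap;
    int n;

    assert(insn != NULL && fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

    if (strcmp(fmt, ",") == 0)
        return n;                   /* operand separator: nothing to store */

    if (insn->lock_opcode) {
        /* First text after begin_insn(): treat it as the mnemonic and
           drop any trailing padding spaces. */
        len = strlen(buf);
        while (len > 0 && buf[len - 1] == ' ')
            buf[--len] = '\0';
        snprintf(insn->mnemonic, sizeof insn->mnemonic, "%s", buf);
        insn->lock_opcode = 0;
    } else if (insn->n_operands < MAX_OPERANDS) {
        /* Every later "%s" call is taken to be one operand. */
        snprintf(insn->operand[insn->n_operands], sizeof insn->operand[0],
                 "%s", buf);
        insn->n_operands++;
    }
    return n;
}
```

For example, after begin_insn(&insn), feeding the callback the calls
("%s", "mov    "), ("%s", "%eax"), (","), ("%s", "%ebx") leaves
mnemonic "mov" and two operands, "%eax" and "%ebx". Whether every
backend emits its text in this order is exactly the open question.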

At first I thought this would be a hack. Now I think it's quite a good idea,
but I don't know whether it behaves this way for every file and every arch.
[For now I'm trying to disassemble a win32 PE file, a good file, not a
`broken` one. `Broken` -> obfuscated, compressed, etc. Those don't concern
me at all. I only want to disassemble/decompile good files.]

So I have some questions:
 - Is this method (decode the opcode first, then the args) acceptable?
   Do all the arch-specific opcode decoders work this way?
 - Can/should libbfd be used this way [for writing a
   decompiler/disassembler]?
 - Is there another way to do what I want? I don't know, maybe something
   in the disassemble_info struct holds some *results of the instruction
   decoders*?

I don't really like the idea of copying code from bfd or implementing my
own instruction translators. [SPOT rule, reinventing the wheel,
etc. Really, really bad idea :(]

I would be grateful for any reply, even: `it's senseless/stupid/whatever
to write a decompiler`.

My English is not the best, so if you don't understand some part, a big
part, or even the whole thing, sorry. I'll try to explain again.

