Wednesday, May 13, 2009

About quick method invocation

About three months ago I wrote about the ODEX format, when support for that DEX variant was added to the dedexer tool. At the time I asked whether anybody out there knew how the index in the invoke-virtual-quick Dalvik instruction should be interpreted. For example:

invoke-virtual-quick {v1,v2},vtable #0x3b


That 3BH is an offset, but into what? I did not know, so dedexer does not interpret the offset, which makes its output on ODEX files less useful. Then Nenik (a nickname, used at his request) finally gave me the solution. Here is his mail, verbatim.

The index computation is pretty simple, but for reverse analysis you need more data.

The vtable contains all the methods that can be invoked by invoke-virtual, that is
all nonprivate member methods (even those final and native).
The vtable is obviously constructed by copying superclass's vtable, then replacing
overridden methods and appending all additional virtual methods.

The methods in the vtable are ordered as they were in the dex file.
Let's look at android.view.KeyCharacterMap for example. It extends java.lang.Object,
so it starts with:
Object:
0: .method protected clone()Ljava/lang/Object;
1: .method public equals(Ljava/lang/Object;)Z
2: .method protected finalize()V
3: .method public final native getClass()Ljava/lang/Class;
4: .method public native hashCode()I
5: .method public final native notify()V
6: .method public final native notifyAll()V
7: .method public toString()Ljava/lang/String;
8: .method public final wait()V
9: .method public final wait(J)V
a: .method public final native wait(JI)V

Then it replaces:
2: .method protected finalize()V
with the implementation from KeyCharacterMap

and adds:
b: .method public get(II)I
c: .method public getDisplayLabel(I)C
d: .method public getEvents([C)[Landroid/view/KeyEvent;
e: .method public getKeyData(ILandroid/view/KeyCharacterMap$KeyData;)Z
f: .method public getKeyboardType()I
10: .method public getMatch(I[C)C
11: .method public getMatch(I[CI)C
12: .method public getNumber(I)C
13: .method public isPrintingKey(I)Z

Anyway,
invoke-virtual-quick {v1},vtable #0x2
on an instance of any kind is simply the quick variant of
invoke-virtual {v1}, Ljava/lang/Object;.finalize:()V
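
A minimal Python sketch of the vtable construction described in the mail, using the KeyCharacterMap listing as data. The function name build_vtable and the (class, signature) tuples are illustrative choices, not dedexer code, and the position of finalize() among the declared methods is an assumption (it does not affect its slot). The sketch matches overrides by name and signature only and ignores visibility rules.

```python
def build_vtable(super_vtable, declared_methods):
    """Copy the superclass vtable, replace overridden slots, append the rest."""
    vtable = list(super_vtable)
    # Map each signature to the slot it already occupies in the inherited table.
    slot_of = {sig: i for i, (_, sig) in enumerate(vtable)}
    for cls, sig in declared_methods:
        if sig in slot_of:
            # Overriding method: reuse the inherited slot.
            vtable[slot_of[sig]] = (cls, sig)
        else:
            # New virtual method: append in declaration order.
            slot_of[sig] = len(vtable)
            vtable.append((cls, sig))
    return vtable

OBJECT = "Ljava/lang/Object;"
KCM = "Landroid/view/KeyCharacterMap;"

object_vtable = build_vtable([], [(OBJECT, s) for s in [
    "clone()Ljava/lang/Object;", "equals(Ljava/lang/Object;)Z", "finalize()V",
    "getClass()Ljava/lang/Class;", "hashCode()I", "notify()V", "notifyAll()V",
    "toString()Ljava/lang/String;", "wait()V", "wait(J)V", "wait(JI)V",
]])

kcm_vtable = build_vtable(object_vtable, [(KCM, s) for s in [
    "finalize()V",  # assumed position; overrides slot 2 wherever it appears
    "get(II)I", "getDisplayLabel(I)C", "getEvents([C)[Landroid/view/KeyEvent;",
    "getKeyData(ILandroid/view/KeyCharacterMap$KeyData;)Z", "getKeyboardType()I",
    "getMatch(I[C)C", "getMatch(I[CI)C", "getNumber(I)C", "isPrintingKey(I)Z",
]])

def resolve_quick(vtable, index):
    """Turn a vtable index from invoke-virtual-quick back into a method reference."""
    cls, sig = vtable[index]
    return cls + "." + sig

print(resolve_quick(object_vtable, 0x2))
print(resolve_quick(kcm_vtable, 0x13))
```

With these tables, index 0x2 resolves to finalize()V in both classes, and 0x13 on a KeyCharacterMap resolves to isPrintingKey(I)Z, in line with the listing above.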


Obviously for decoding invoke-virtual-quick opcodes inside, say,
framework.odex, you need the corresponding core.odex and ext.odex
to reconstruct the vtables of all the base classes.

An .odex file contains a list of such dependencies appended
after the body of the encapsulated dex, together with their SHA-1s:


00498D10 44 6E 95 3A │ 13 BC 10 91 │ 0E 00 00 00 │ 02 00 00 00 Dn.:.¼..........
00498D20 1C 00 00 00 │ 2F 73 79 73 │ 74 65 6D 2F │ 66 72 61 6D ..../system/fram
00498D30 65 77 6F 72 │ 6B 2F 63 6F │ 72 65 2E 6F │ 64 65 78 00 ework/core.odex.
00498D40 93 97 93 82 │ 99 DA 46 4E │ E2 13 DD 35 │ 4C 48 B5 7B .....ÚFNâ.Ý5LHµ{
00498D50 F0 06 51 D7 │ 1B 00 00 00 │ 2F 73 79 73 │ 74 65 6D 2F ð.Q×..../system/
00498D60 66 72 61 6D │ 65 77 6F 72 │ 6B 2F 65 78 │ 74 2E 6F 64 framework/ext.od
00498D70 65 78 00 54 │ F1 D0 82 95 │ 97 53 15 60 │ E4 D6 2C 48 ex.TñÐ...S.`äÖ,H
00498D80 8E 36 51 C5 │ 75 BE 2C 00 │ 50 4B 4C 43 │ 08 80 01 00 .6QÅu¾,.PKLC....
00498D90 08 80 01 00 │ 00 20 00 00 │ 00 00 21 AA │ C2 5C 31 00 ..... ....!ªÂ\1.
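
The dump can be read mechanically. The Python sketch below assumes a layout inferred from these bytes alone: four little-endian 32-bit words (the fourth looks like the dependency count, and the name vm_build for the third is a guess), followed for each dependency by a 32-bit name length, the NUL-terminated name, and a 20-byte SHA-1. The function name parse_odex_deps is hypothetical, and locating the start of this area inside a real .odex file is not covered here.

```python
import struct

def parse_odex_deps(data, offset=0):
    """Parse the dependency list, using a layout inferred from the dump."""
    # Four 32-bit little-endian words; only the count is evident from the dump.
    word0, word1, vm_build, num_deps = struct.unpack_from("<4I", data, offset)
    pos = offset + 16
    deps = []
    for _ in range(num_deps):
        (name_len,) = struct.unpack_from("<I", data, pos)
        pos += 4
        # The stored length includes the terminating NUL byte.
        name = data[pos:pos + name_len].rstrip(b"\0").decode("ascii")
        pos += name_len
        sha1 = data[pos:pos + 20].hex()
        pos += 20
        deps.append((name, sha1))
    return vm_build, deps

# Bytes 00498D10..00498D86 transcribed from the dump above.
sample = bytes.fromhex(
    "44 6E 95 3A 13 BC 10 91 0E 00 00 00 02 00 00 00 "
    "1C 00 00 00 2F 73 79 73 74 65 6D 2F 66 72 61 6D "
    "65 77 6F 72 6B 2F 63 6F 72 65 2E 6F 64 65 78 00 "
    "93 97 93 82 99 DA 46 4E E2 13 DD 35 4C 48 B5 7B "
    "F0 06 51 D7 1B 00 00 00 2F 73 79 73 74 65 6D 2F "
    "66 72 61 6D 65 77 6F 72 6B 2F 65 78 74 2E 6F 64 "
    "65 78 00 54 F1 D0 82 95 97 53 15 60 E4 D6 2C 48 "
    "8E 36 51 C5 75 BE 2C"
)

vm_build, deps = parse_odex_deps(sample)
for name, sha1 in deps:
    print(name, sha1)
```

On the sample bytes this yields two entries, /system/framework/core.odex and /system/framework/ext.odex, each paired with the 20 bytes that follow its name in the dump.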


With all this data, it should be possible to reconstruct .dex from current
.odex format, something I would like to be able to do.


Thanks, Nenik, this is the information I needed to improve the dedexer tool. I will do that soon. If anyone has something to add, please comment.

2 comments:

Dan Bornstein said...

One wrinkle that was elided above is that in order to know which vtable to look at, you need to know at least a superclass of the object being invoked on. Methods on Object are easy, but for anything else, you need to do at least a little bit of data flow analysis, along the lines of what the verifier does. You could of course start with the vm's verifier as a reference if nothing else.

Gabor Paller said...

I see. Thanks for the observation.

Rewriting the disassembler's first pass as a trace disassembler is long overdue; here is a new motivation. :-)