[maemo-developers] Java acceleration/Jazelle
From: Simon Pickering S.G.Pickering at bath.ac.uk
Date: Mon Jul 30 14:24:41 EEST 2007
Hello all.

My apologies, this is going to be a long one...

All the code mentioned in this email can be found under this directory:
http://people.bath.ac.uk/enpsgp/nokia770/jazelle/

After reading the patent I wrote a piece of code to test whether Jazelle works, as Scott Bambrough suggested. The patent indicated that R14 should hold the address of the Java bytecodes while R12 might possibly hold the address of a handler.

The code I wrote performs BXJ R12 (with R12 pointing to the handler and R14 pointing to the Java code). In the handler I was trying to get the current bytecode value printed out by calling printf from within the assembler. Here's the code:
(http://people.bath.ac.uk/enpsgp/nokia770/jazelle/test_jazelle9.c)

I can't seem to get printf to work in this code, even though it quite happily works in another piece of test code (http://people.bath.ac.uk/enpsgp/nokia770/jazelle/test_printf2.c). I've no idea what's causing the difference; can anyone see something I've missed?

I should note that there is an error in the test_jazelle9.c code which means it won't work correctly after the printf anyway: branching away to this function alters the value of R14 (which should contain the address of the Java bytecode). This can easily be fixed by saving R14 to another unused register or memory location over the call.

So I then removed all of the printf business and started using gdb to step through the code and look at the registers as the code progresses. The idea is that within the handler the value in R14 should change as the bytecodes are handled, and if some known bytecodes are encountered, the value of R14 should jump by more than sizeof(java bytecode). This code is here:
(http://people.bath.ac.uk/enpsgp/nokia770/jazelle/test_jazelle10.c)

I've stepped through this code with gdb and the bad news is that the handler is called every time (even for the instructions that should be handleable).
Results below:

Java code address=67240

I added a breakpoint just after the start of the handler code. Most of the registers are uninteresting as they do not change at all, except for R1, into which we store the bytecode value (it shows the previous value, as this instruction hasn't had a chance to run yet; poor choice on my part), and LR (R14), which contains the pointer to the current bytecode.

This is the first run through the handler, which is why R1 doesn't contain a bytecode value:

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0x40008000 1073774592
lr 0x106a8 67240

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0xcc 204
lr 0x106ac 67244

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0xcd 205
lr 0x106b0 67248

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0xce 206
lr 0x106b4 67252

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0xcf 207
lr 0x106b8 67256

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0x10 16
lr 0x106bc 67260

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0x0 0
lr 0x106c0 67264

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0x0 0
lr 0x106c4 67268

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0x0 0
lr 0x106c8 67272

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0x2a 42
lr 0x106cc 67276

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0x3b 59
lr 0x106d0 67280

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0x1a 26
lr 0x106d4 67284

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0xd0 208
lr 0x106d8 67288

1: x/i $pc 0x8444 <main+148>: ldr r1, [lr]
(gdb) info registers
r1 0xd1 209
lr 0x106dc 67292

Program exited normally.

So, the thing to see from these results is that even the bytecodes that we'd expect to be handled were not; control was always passed straight to the handler.
These results raise a couple of questions. Are the register choices correct for passing the handler and Java bytecode addresses? Does Java need to be enabled somehow before the Jazelle hardware starts working (i.e. is this BXJ currently working as a simple B instruction)? Another possibility: do I need to use a byte array rather than an int array for the bytecodes?

Staying with R12 and R14 in the hope that the patent almost gave us the correct information, I wrote and tested some other pieces of code:

test_jazelle10.c - use int array for bytecodes, R12=handler address, R14=bytecode address, call BXJ R12
test_jazelle10b.c - use char array for bytecodes, R12=handler address, R14=bytecode address, call BXJ R12
test_jazelle10c.c - use int array for bytecodes, R14=handler address, R12=bytecode address, call BXJ R14
test_jazelle10d.c - use byte array for bytecodes, R12=handler address, R14=bytecode address, call BXJ R14
test_jazelle10e.c - use int array for bytecodes, R14=handler address, R12=bytecode address, call BXJ R12
test_jazelle10f.c - use byte array for bytecodes, R12=handler address, R14=bytecode address, call BXJ R12
test_jazelle10g.c - use int array for bytecodes, R12=handler address, R14=bytecode address, call BXJ R14
test_jazelle10h.c - use byte array for bytecodes, R12=handler address, R14=bytecode address, call BXJ R14

This should cover all the possibilities with these two registers, with the following results:

B & C & D
Same result as test_jazelle10: stepped through the bytecodes, always calling the handler.

E & F & G & H
Segfaults

So passing the address of the Java bytecode as the argument to BXJ (whichever register) causes a segfault, while passing the handler address as the argument to BXJ (again in whichever register) seems to just branch to the handler.
So, I'm still wondering whether Jazelle needs to be enabled, but I also assume that the patent was probably fibbing about the registers (which would not be surprising, as these may have been arbitrary register names).

Then I moved on to variations of Sebastian's code to investigate the segfaults and see whether they are caused by jumping to bytecodes and trying to interpret them as ARM code, or by the Jazelle hardware faulting once it reaches the end or an unhandled bytecode.

jalimo1.bin
===========

http://people.bath.ac.uk/enpsgp/nokia770/jazelle/jalimo1.c

char[] for Java bytecodes (I thought I'd give it a try).
All registers are set to 0 (R1-R14), other than R0, which holds the address of the Java bytecode array. I've done this to avoid branching somewhere else if a handler address is stored in a register, etc.

codepos is beccf6b0
1: x/i $pc 0x8418 <main+104>: bxj r0
(gdb) info registers
r0 0xbeccf6b0 -1093863760
r1 0x0 0
r2 0x0 0
r3 0x0 0
r4 0x0 0
r5 0x0 0
r6 0x0 0
r7 0x0 0
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0x0 0
sp 0x0 0
lr 0x0 0
pc 0x8418 33816
fps 0x1001000 16781312
cpsr 0x60000010 1610612752

Program received signal SIGSEGV, Segmentation fault.
0xbeccf7d0 in ?? ()
Disabling display 1 to avoid infinite recursion.
1: x/i $pc Cannot access memory at address 0x0
0xbeccf7d0: ldreqd r0, [r0], -r7
(gdb) info registers
r0 0x0 0
r1 0x0 0
r2 0x0 0
r3 0x0 0
r4 0x0 0
r5 0x0 0
r6 0x0 0
r7 0x0 0
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0x0 0
sp 0x0 0
lr 0x0 0
pc 0xbeccf7d0 -1093863472
fps 0x1001000 16781312
cpsr 0x60000010 1610612752

R0 has been changed to 0, and the PC register has naturally changed, showing that we've ended up somewhere other than where we started.

Code position=0xbeccf7d0 -> 0xbeccf7d0-0xbeccf6b0 = 288 bytes = 72 ints.

So we've managed to move along 288 bytes, which is far longer than the bytecode array (13 chars). Therefore, why do we end up at this address?
Has the bytecode interpreter run off the end of the bytecode array and been interpreting random data as valid bytecodes? Or was a handler invoked (though we cleared the registers, so if it did take a register address for the handler, we might branch to the ARM exception vector location...)? Or perhaps Java mode was never entered and we've been interpreting random bits as valid ARM instructions?

We could perform a memory dump and look for the ldreqd instruction that seems to have caused the segfault, then work back and see where the code that reached this instruction started executing (it may just be a random selection of bits that looks like an instruction).

LDREQD = LDRD EQ
The form is "Miscellaneous Loads and Stores - Register post-indexed"

31-28 = cond. EQ = 0000
27 = 0
26 = 0
25 = 0
24 = P = 0 for post-indexed addressing
23 = U. For subtract Rm U=0, for add Rm U=1
22 = I = 0 for Register offset/index
21 = W = 0 (if P==0)
20 = L = 0
19-16 = Rn
15-12 = Rd
11-8 = addr mode = SBZ = all 0s
7 = 1
6 = S = 1
5 = H = 0 (with L=0, S=1 and H=0 select a doubleword load)
4 = 1
3-0 = Rm. R7 = 0111

So that's what I need to look for.

We could also try setting all the registers to the address of some handler code and see what happens. First let's alter the memory layout to see if we get a different error at a different location. But before that, let's try using an int array for the bytecodes:

jalimo2.bin
===========

This is the same as jalimo1, but using an int array for the bytecodes rather than a char array.
All registers are set to 0 (R1-R14), other than R0, which holds the address of the Java bytecode array.

codepos is bee4f688
1: x/i $pc 0x8404 <main+132>: bxj r0
(gdb) info registers
r0 0xbee4f688 -1092290936
r1 0x0 0
r2 0x0 0
r3 0x0 0
r4 0x0 0
r5 0x0 0
r6 0x0 0
r7 0x0 0
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0x0 0
sp 0x0 0
lr 0x0 0
pc 0x8404 33796
fps 0x1001000 16781312
cpsr 0x60000010 1610612752

Program received signal SIGSEGV, Segmentation fault.
0xbee4f6a4 in ?? ()
Disabling display 1 to avoid infinite recursion.
1: x/i $pc Cannot access memory at address 0x0
0xbee4f6a4: streqh r0, [r0], -r1
(gdb) info registers
r0 0x0 0
r1 0x0 0
r2 0x0 0
r3 0x0 0
r4 0x0 0
r5 0x0 0
r6 0x0 0
r7 0x0 0
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0x0 0
sp 0x0 0
lr 0x0 0
pc 0xbee4f6a4 -1092290908
fps 0x1001000 16781312
cpsr 0x60000010 1610612752

Different instruction faulting this time...

PC moved = 0xbee4f6a4-0xbee4f688 = 28 bytes = 7 ints.

Interesting, this has changed things... remember there are 13 bytecodes in the array. This looks more like it's actually running some Java code. We've got a different instruction 'causing' the segfault (I say 'causing' as it may be that we've branched to the ARM exception vector, which has then misinterpreted the bytecode instruction that we came from as an ARM instruction that caused a segfault).

So, going back to the theory above: change the memory layout (by adding bytecodes to the array) and see whether we get a different instruction causing the segfault, or whether we progress further through the bytecodes (the extra bytecodes should all be unhandleable, so we shouldn't actually go any further if the bytecodes are really being interpreted).

jalimo3.bin
===========

In this code we use an int array for the bytecodes, and pad the bytecode array with unhandleable bytecodes (0xCC to 0xFF).

codepos is be9485c0
1: x/i $pc 0x8418 <main+104>: bxj r0
(gdb) info registers
r0 0xbe9485c0 -1097562688
r1 0x0 0
r2 0x0 0
r3 0x0 0
r4 0x0 0
r5 0x0 0
r6 0x0 0
r7 0x0 0
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0x0 0
sp 0x0 0
lr 0x0 0
pc 0x8418 33816
fps 0x1001000 16781312
cpsr 0x60000010 1610612752

Program received signal SIGSEGV, Segmentation fault.
0xbe9485ec in ?? ()
Disabling display 1 to avoid infinite recursion.
1: x/i $pc Cannot access memory at address 0x0
0xbe9485ec: ldreqd r0, [r0], -r0
(gdb) info registers
r0 0x0 0
r1 0x0 0
r2 0x0 0
r3 0x0 0
r4 0x0 0
r5 0x0 0
r6 0x0 0
r7 0x0 0
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0x0 0
sp 0x0 0
lr 0x0 0
pc 0xbe9485ec -1097562644
fps 0x1001000 16781312
cpsr 0x60000010 1610612752

PC moved = 0xbe9485ec-0xbe9485c0 = 44 bytes = 11 ints.

Not sure we've learned anything here other than the fact that we've (possibly) moved further into the bytecode array. In jalimo2.c we segfaulted at &bytecodearray+7 which would be 0xB1, the return instruction. Now we manage to go a bit further (though the return instruction has been removed, so this still makes sense) to 0xD0. I must check and see whether the 0xCC-0xCF instructions can actually be handled (but I don't think this should be possible). There are few enough instructions here to check their binary form and see if they could be valid ARM instructions (if we've not actually gone into Java mode at all) and therefore the segfault is really happening.

jalimo4.bin
===========

Out of interest, try using the char array again plus the code from jalimo3 above (with the padded bytecodes):

codepos is bed13680
1: x/i $pc 0x8418 <main+104>: bxj r0
(gdb) info registers
r0 0xbed13680 -1093585280
r1 0x0 0
r2 0x0 0
r3 0x0 0
r4 0x0 0
r5 0x0 0
r6 0x0 0
r7 0x0 0
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0x0 0
sp 0x0 0
lr 0x0 0
pc 0x8418 33816
fps 0x1001000 16781312
cpsr 0x60000010 1610612752

Program received signal SIGSEGV, Segmentation fault.
0xbed1368c in ?? ()
Disabling display 1 to avoid infinite recursion.
1: x/i $pc Cannot access memory at address 0x0
0xbed1368c: ldrleb sp, [r3], #721
(gdb) info registers
r0 0x0 0
r1 0x0 0
r2 0x0 0
r3 0x0 0
r4 0x0 0
r5 0x0 0
r6 0x0 0
r7 0x0 0
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x0 0
r12 0xbed13690 -1093585264
sp 0x0 0
lr 0x0 0
pc 0xbed1368c -1093585268
fps 0x1001000 16781312
cpsr 0x60000010 1610612752

Different faulting instruction.
PC moved: 0xbed1368c-0xbed13680 = 12 bytes = 3 ints

========================================================================

I've not done much more analysis, but I thought I'd share and see if anyone has any bright ideas. It would be good to be sure that we actually enter Java mode (gdb can't follow the train of execution, which makes me think this is the case...?).

Things to check: look at whether the bytecodes form valid ARM instructions, to determine whether BXJ could simply be branching in ARM mode. (What would gdb do in this case? It doesn't seem to be able to follow these instructions, but perhaps this is a problem with gdb's handling of BXJ. I have been using "si" rather than "ni" at the BXJ instruction, which should step into the code, I think.)

Assuming the bytecodes don't form valid ARM instructions, and therefore are being handled by the Jazelle hardware (which automatically tells us that we don't need to enable Jazelle mode manually), why do we get the particular errors (segfaults) at the end? Is this because we've accidentally set one of the registers to point to code that should be the handler, but is actually the ARM exception vector (as we've set all the registers to #0)?
This exception-vector theory can be tested by setting all of the registers to something else (like a real handler) and seeing what happens:

===========================================================
jalimo5.bin
===========================================================

All registers point to a handler that simply sets all the registers to #10, then exits the loop:

codepos is bef9d680
1: x/i $pc 0x8418 <main+104>: bxj r0
(gdb) info registers
r0 0xbef9d680 -1090922880
r1 0x8424 33828
r2 0x8424 33828
r3 0x8424 33828
r4 0x8424 33828
r5 0x8424 33828
r6 0x8424 33828
r7 0x8424 33828
r8 0x8424 33828
r9 0x8424 33828
r10 0x8424 33828
r11 0x8424 33828
r12 0x8424 33828
sp 0x8424 33828
lr 0x8424 33828
pc 0x8418 33816
fps 0x1001000 16781312
cpsr 0x60000010 1610612752

Program received signal SIGSEGV, Segmentation fault.
0xe3a0a008 in ?? ()
Disabling display 1 to avoid infinite recursion.
1: x/i $pc 0xe3a0a008: Cannot access memory at address 0xe3a0a008
(gdb) info registers
r0 0xe3a0100a -476049398
r1 0x8424 33828
r2 0xe3a0200a -476045302
r3 0x86f5 34549
r4 0xe3a0300a -476041206
r5 0x8424 33828
r6 0xe3a0400a -476037110
r7 0xe3a0500a -476033014
r8 0x8424 33828
r9 0xe3a0600a -476028918
r10 0xe3a0700a -476024822
r11 0x8424 33828
r12 0xe3a0800a -476020726
sp 0xa 10
lr 0xe3a0900a -476016630
pc 0xe3a0a008 -476012536
fps 0x1001000 16781312
cpsr 0x60000010 1610612752

So no, this hasn't worked. It's possible that this is because we're calling BXJ with the address of the bytecodes in the wrong register. I've not looked at these results further, though the PC seems to have moved a long way: PC = 0xe3a0a008 - 0xbef9d680 = 614910344 bytes. How does the memory space work? It's also interesting to see how the registers have changed.
Curiously, if you try running this program outside of gdb, you get a different exception:

Nokia-N800-26:/home/user# ./jalimo5.bin
codepos is beaba6b0
Illegal instruction

Right, that's about all from me. I need to have a sit down and think about where to go now, and what else there is to test. If anyone has any bright ideas or comments, please share them.

Best regards,

Simon