[maemo-developers] Java acceleration/Jazelle
From: Simon Pickering S.G.Pickering at bath.ac.uk
Date: Wed Aug 15 14:43:43 EEST 2007
Thank you for the links, these are things I've not seen before.

> So let me dump the stuff I turned up so far:
>
> URL: <http://www.scratchpost.org/patches/jazelle-disassembly.png>
> Here you can see the size and alignment of the java instructions.
> (the entire document is
> <http://www.arm.com/pdfs/DUI0066D_ADS1_2_AXD_armsd.pdf>)

Looking at the Memory Processor view in Jazelle state (fig 5-39 on page 5-33 of the pdf), the left-hand column showing the Address of the bytecodes indicates that bytecodes are byte-length (or variable length depending on their arguments), not 32bit as we were thinking. This does assume that the Address column is showing the address in terms of bytes and not some other unit, but I think this is a fair assumption. The same thing is seen in the disassembler shown in Fig 5-52 on page 5-41. Section 6.5 on page 6-9 specifically states that Jazelle assembly instructions are 8-bit.

So we can conclude that they are byte aligned rather than word aligned. I wonder why the word aligned code appeared to work?

> <http://209.85.129.104/search?q=cache:eT7UO_bq1XIJ:www.commsdesign.com/design_corner/showArticle.jhtml%3FarticleID%3D16503207+arm+bxj&hl=en&ct=clnk&cd=4&client=firefox-a>:
<snip>
> In Java state, the processor assigns several ARM registers to
> functions specific to the Java machine (for example, R6 =
> stack pointer, R0-R3 = top elements of stack, R4 = local
> variable 0). This hardware reuse contributes to the small
> size of the additional logic (12,000 gates) required to
> implement the Java machine, and keeps all of the states
> required by the Jazelle extension in ARM registers. In
> addition, it ensures compatibility with existing operating
> systems, interrupt handlers and exception code.
>
> Keeping the top four elements of the stack in ARM registers [...].
>
> The extension we've added divides Java byte codes into three
> classes: directly executed, emulated and undefined. The
> majority of the Java byte codes (138 on the ARM926EJ-S
> microprocessor core) are executed directly in hardware; the
> remainder are emulated by short sequences of highly optimized
> ARM instructions.
> --------------

So we now have the following register mappings:

Top elements of stack - probably R0, R1, R2, R3
Local variable 0 - might be R4
Pointer to exception table - ??
Pointer to Java stack - ??
Pointer to Java variables area - ??
Pointer to the constant pool - ??

Do my original R12 and R14 mappings mean anything I wonder (see last section of this email), or were they just random names for the patent?

I suppose we could try testing some of these other register mappings by pushing things to the stack and setting the value of local variable 0 and then looking at the registers once the code returns from the BXJ call. This assumes that these values are not altered when the exception occurs. I've looked at this in passing and it doesn't seem to show anything (that I expected - see my previous long email to see for yourselves) in the registers after the exception handler has been run. This may be an effect of the ARM exception handler overwriting them though.

Obviously we ought to be setting the pointer to the Jazelle exception table (if we knew which register to put it in and what form it takes!), but do we also need to set up things like the Java stack pointer, pointer to variables area and constant pool pointer? Even if we don't need to actually initialise the data at these addresses, do we need to allocate some memory and then provide pointers?

There's another interesting bit in this article:

"The key to making this approach work lies in a single new ARM instruction, "BXJ Rm," for entering Java state. This instruction first performs a test on one of the condition codes. If the condition is met, it then stores the current program counter (PC), puts the processor into Java state, branches to the specified target address and begins executing Java byte codes."
Performs a test on one of the condition codes.... Which one I wonder? Or is this where a Java flag is checked (I'll have to take another look in the chip manual pdf). Anyone have any thoughts?

My understanding is that condition codes are N(egative), Z(ero), C(arried over) and (o)V(erflow) and that the J bit, which is also in CPSR (and isn't a condition code afaik), is set by the BXJ instruction, rather than needing to be set before the BXJ instruction. In fact setting this bit is explicitly advised against wherever it's mentioned.

Therefore do we need to do a CMP before the BXJ to get it to do something? I created some test code for this:

http://people.bath.ac.uk/enpsgp/nokia770/jazelle/jalimo6a.c

I don't know whether the BXJ instruction requires the condition code suffix, but it certainly compiles without complaint. The output is:

Jalimo6a.bin
============
1: x/i $pc  0x841c <main+108>:  bxjne   r0
(gdb) info registers
r0             0xbef68640       -1091140032
r1             0x8428   33832
r2             0x8428   33832
r3             0x8428   33832
r4             0x8428   33832
r5             0x8428   33832
r6             0x8428   33832
r7             0x8428   33832
r8             0x8428   33832
r9             0x8428   33832
r10            0x8428   33832
r11            0x8428   33832
r12            0x8428   33832
sp             0x8428   33832
lr             0x8428   33832
pc             0x841c   33820
fps            0x1001000        16781312
cpsr           0x20000010       536870928
(gdb) si

Program received signal SIGILL, Illegal instruction.
0xbef68640 in ?? ()
1: x/i $pc  0xbef68640: undefined instruction 0xffffff10
(gdb) info registers
r0             0xbef68640       -1091140032
r1             0x8428   33832
r2             0x8428   33832
r3             0x8428   33832
r4             0x8428   33832
r5             0x8428   33832
r6             0x8428   33832
r7             0x8428   33832
r8             0x8428   33832
r9             0x8428   33832
r10            0x8428   33832
r11            0x8428   33832
r12            0x8428   33832
sp             0x8428   33832
lr             0x8428   33832
pc             0xbef68640       -1091140032
fps            0x1001000        16781312
cpsr           0x20000010       536870928

Note that the BXJ instruction appears to have made the PC jump to the location of the Java bytecodes, but it's tried interpreting them as ARM instructions. Is this what's been happening all along? Do we need to set some bit to enable Jazelle?
I'm assuming that this isn't a case of the Jazelle hardware falling back to ARM mode after trying to run in Jazelle mode (both because this is the first bytecode instruction and because it should be handleable). I think it would be odd logic if we've not switched to Jazelle mode (because of some condition flag or other) and therefore performed a standard BX (jump), but this may be the case. Any ideas? Might be worth looking at the presence (and accessibility) of Jazelle enable bits again.

> <http://www.elecdesign.com/Articles/Index.cfm?ArticleID=4841&pg=2>:

(Best try this link which shows all the document on one page: http://www.elecdesign.com/Articles/Print.cfm?ArticleID=4841)

"Consequently, calling the Java mode is exactly like calling a subroutine. The return (from subroutine) is fairly straightforward. There are a number of unused Java byte codes. All of the unused byte codes are handled as exceptions. One of the unused byte codes is used as the means to return to the calling program. Whenever this byte code is encountered, the hardware takes an exception because it's an undefined byte code. The exception handler recognizes that byte code as a "return me to the calling program" instruction, and it will do that."

This confirms Scott's idea. From the wording it looks like it's up to the handler software to actually perform the return operation, rather than the Jazelle hardware doing it itself (we still don't know what form the handler takes, nor where to pass its location, etc.).

And a confirmation that BXJ is a conditional instruction:

"BXJ is a conditional instruction. If a condition is false, nothing will happen. If a condition is true - which could be a zero condition, carry condition, or whatever - the branch will be taken. Before the branch is taken, the current program counter (PC) is stored and the J bit is set. Engineers can save three program steps when the program enters the Java state because the BXJ instruction performs three operations. First, it checks the condition. If the condition is true, it will store it in the PC and load a new PC. Then, it sets the Java state and takes a branch."

Interesting wording here: "If the condition is true, it will store it in the PC and load a new PC." Does this mean the condition needs to return the address of the Java bytecodes? Any ideas as to how to do this?

I found another little titbit here:

http://www.itee.uq.edu.au/~esg/about/public/arm-intro.ppt

On page 12, in the notes at the bottom of the page, it says:

"In Jazelle state, the processor doesn't perform 8-bit fetches from memory. Instead it does aligned 32-bit fetches (4-byte prefetching) which is more efficient. Note we don't mention the PC in Jazelle state because the 'Jazelle PC' is actually stored in r14 - this is technical detail that is not relevant as it is completely hidden by the Jazelle support code."

So we know(?) R14 is the bytecode PC.

Sorry for the length and slightly random arrangement, but I also wanted to write it down while it was fresh in my mind,

Cheers,

Simon