/DEV/FASTIO$ - the Final Way

Okay. We had just managed to get a transforming (32->16 bit) call gate that happened to point to the wrong address. Finding the address of the corresponding GDT entry and redirecting it to the expected position took only seconds. A kernel debugger is really a neat tool for the hacker. It worked!

At this point, calling the DevHlp_DynamicAPI function becomes pointless: it would only occupy a kernel entry point that cannot be used for anything else afterwards. A quick look into the list of device helper functions turns up DevHlp_AllocGDTSelector. We acquire a default GDT selector for exclusive use by the driver, and "adjust" it to form a 32->16 bit R3->R0 call gate into the I/O routine section of the driver. (The code below actually allocates two selectors: the second, gdthelper, only serves as a temporary window onto the GDT entry and is freed again.)

Have a look at the code fragment in the FASTIO$ driver (figure 4) which does it all.

	            .386p
_acquire_gdt    proc    far
	    pusha

	    mov     ax, word ptr [_io_gdt32]        ; get selector
	    or      ax,ax
	    jnz     aexit                           ; if we didn't have one
	                                            ; make one

	    xor     ax, ax
	    mov     word ptr [_io_gdt32], ax        ; clear gdt save
	    mov     word ptr [gdthelper], ax        ; helper

	    push    ds
	    pop     es                              ; ES:DI = addr of
	    mov     di, offset _io_gdt32            ; _io_gdt32
	    mov     cx, 2                           ; two selectors
	    mov     dl, DevHlp_AllocGDTSelector     ; get GDT selectors
	    call    [_Device_Help]
	    jc      aexit                           ; exit if failed

	    sgdt    qword ptr [gdtsave]             ; access the GDT ptr
	    mov     ebx, dword ptr [gdtsave+2]      ; get lin addr of GDT
	    movzx   eax, word ptr [_io_gdt32]       ; build offset into table
	    and     eax, 0fffffff8h                 ; mask away RPL and TI bits
	    add     ebx, eax                        ; build address in EBX

	    mov     ax, word ptr [gdthelper]        ; selector to map GDT at
	    mov     ecx, 08h                        ; a single entry (8 bytes)
	    mov     dl, DevHlp_LinToGDTSelector
	    call    [_Device_Help]
	    jc      aexit0                          ; if failed exit

	    mov     ax, word ptr [gdthelper]
	    mov     es, ax                          ; build address to GDT
	    xor     bx, bx

	    mov     word ptr es:[bx], offset _io_call ; fix address off
	    mov     word ptr es:[bx+2], cs          ; fix address sel
	    mov     word ptr es:[bx+4], 0ec00h      ; P=1, DPL=3, 386 call gate, 0 params
	    mov     word ptr es:[bx+6], 0000h       ; high offset

	    mov     dl, DevHlp_FreeGDTSelector      ; free gdthelper
	    call    [_Device_Help]
	    jnc     short aexit

aexit0: xor     ax,ax                           ; clear selector
	    mov     word ptr [_io_gdt32], ax
aexit:  popa                                    ; restore all registers
	    mov     ax, word ptr [_io_gdt32]
	    ret
_acquire_gdt    endp

Figure 4: Initialization routine of FASTIO$ driver
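
The four word stores at the end of figure 4 may be easier to follow when the descriptor layout is spelled out. The following is only a sketch in portable C, not part of the driver: the names gdt_offset() and build_call_gate() are made up for illustration, and the code merely fills an 8-byte buffer the same way the assembler code fills the GDT entry. It does not touch a real GDT.

```c
#include <stdint.h>

/* Byte offset of a selector's descriptor inside the GDT:
 * clear the RPL (bits 0-1) and the table indicator (bit 2).
 * Hypothetical helper, mirrors the "and eax, 0fffffff8h" step. */
static uint32_t gdt_offset(uint16_t selector)
{
    return (uint32_t)(selector & 0xFFF8u);
}

/* Fill an 8-byte 386 call gate descriptor (hypothetical helper).
 *   bytes 0-1: target offset, bits 0..15
 *   bytes 2-3: target code segment selector
 *   byte  4  : parameter count (low 5 bits)
 *   byte  5  : access byte = present | DPL << 5 | type 0Ch
 *   bytes 6-7: target offset, bits 16..31                      */
static void build_call_gate(uint8_t d[8], uint16_t target_sel,
                            uint32_t target_off, unsigned dpl,
                            unsigned param_count)
{
    uint8_t access = (uint8_t)(0x80u | ((dpl & 3u) << 5) | 0x0Cu);

    d[0] = (uint8_t)(target_off & 0xFFu);
    d[1] = (uint8_t)((target_off >> 8) & 0xFFu);
    d[2] = (uint8_t)(target_sel & 0xFFu);
    d[3] = (uint8_t)(target_sel >> 8);
    d[4] = (uint8_t)(param_count & 0x1Fu);
    d[5] = access;
    d[6] = (uint8_t)((target_off >> 16) & 0xFFu);
    d[7] = (uint8_t)((target_off >> 24) & 0xFFu);
}
```

With dpl = 3 and param_count = 0, bytes 4 and 5 come out as 00h and ECh, which is the word 0ec00h stored at offset 4 in figure 4. The DPL of 3 is what lets ring 3 code call through the gate; the zero parameter count matches the decision, described further down, to pass everything in registers.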

Since a device driver is initialized in ring 3, this routine does not work during startup. Instead, the driver runs this code once, the first time a client opens the device. Thus, to use the driver, a small routine io_init() must be called first. Refer to the file iolib.asm that comes with this issue of EDM/2.

A final improvement: Usually, C code passes arguments on the stack. A call gate can be configured to copy these parameters over to the new ring's stack. But why should we do this? For really fast I/O access we pass the data in registers. This allows I/O instructions in assembler code to be replaced directly by a simple indirect call, as shown in figure 5. The address for the indirect call is set up by the io_init() procedure mentioned above.

EXTRN   ioentry:FWORD
  :
MOV     DX, portaddr
MOV     AL, 123
MOV     BX, 4           ; function code 4 = write byte
CALL    FWORD PTR [ioentry]
  :

Figure 5: Calling I/O from assembler

If the code needs to be called from C, we simply write a small stub that wraps a stack frame envelope around it, just as shown in figure 6.

; Calling convention:
;       void c_outb(short port,char data)
;
;
	    PUBLIC  _c_outb
	    PUBLIC  c_outb
_c_outb PROC
c_outb:
	    PUSH    EBP
	    MOV     EBP, ESP                ; set standard stack frame
	    PUSH    EBX                     ; save register
	    MOV     DX, WORD PTR [EBP+8]    ; get port
	    MOV     AL, BYTE PTR [EBP+12]   ; get data
	    MOV     BX, 4                   ; function code 4 = write byte
	    CALL    FWORD PTR [ioentry]     ; call intersegment indirect 16:32
	    POP     EBX                     ; restore EBX
	    POP     EBP                     ; restore caller's frame pointer
	    RET                             ; return
	    ALIGN   4
_c_outb ENDP

Figure 6: A C callable I/O function

The file iolib.asm contains a set of functions c_inX() and c_outX() for using I/O from any 32 bit compiler that supports the standard stack frame. The files iolib.a and iolib.lib are precompiled versions; the file iolib.h contains the C prototypes.

In the complete driver, I gave up a small amount of the theoretically achievable performance. There are six basic I/O operations: IN and OUT instructions exist for transferring bytes, 16 bit words and 32 bit long words. To become really fast, one would have to provide a separate GDT selector for each of them. In a typical OS/2 system, this should not be a problem. However, if everyone started adding more routines, each with its own entry point, GDT selectors could quickly become a scarce resource. So I spent a function code, passed in the BX register, to multiplex the six functions onto a single GDT selector. Refer to the io_call entry point in the fastio_a.asm driver source file.
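
To illustrate the multiplexing idea, here is a small sketch in C rather than the real io_call routine. Only "function code 4 = write byte" is taken from the figures above; the other five code values, the name io_dispatch() and the simulated port array sim_ports are assumptions made up for this example. The real driver executes IN and OUT instructions at ring 0, which plain C cannot do, so an array stands in for the I/O space.

```c
#include <stdint.h>

/* Function codes. Only FN_WRITE_BYTE = 4 comes from the article;
 * the remaining numbers are hypothetical placeholders. */
enum {
    FN_READ_BYTE   = 1,
    FN_READ_WORD   = 2,
    FN_READ_DWORD  = 3,
    FN_WRITE_BYTE  = 4,
    FN_WRITE_WORD  = 5,
    FN_WRITE_DWORD = 6
};

/* Simulated 64K I/O space, padded so a dword access at port FFFFh
 * stays inside the array. */
static uint8_t sim_ports[0x10000 + 3];

/* One entry point for all six operations, selected by a function
 * code (passed in BX in the real driver). Returns the value read,
 * or 0 for writes and FFFFFFFFh for an unknown code. */
static uint32_t io_dispatch(unsigned func, uint16_t port, uint32_t value)
{
    uint32_t p = port;

    switch (func) {
    case FN_READ_BYTE:
        return sim_ports[p];
    case FN_READ_WORD:
        return (uint32_t)sim_ports[p] | ((uint32_t)sim_ports[p + 1] << 8);
    case FN_READ_DWORD:
        return (uint32_t)sim_ports[p]
             | ((uint32_t)sim_ports[p + 1] << 8)
             | ((uint32_t)sim_ports[p + 2] << 16)
             | ((uint32_t)sim_ports[p + 3] << 24);
    case FN_WRITE_DWORD:
        sim_ports[p + 3] = (uint8_t)(value >> 24);
        sim_ports[p + 2] = (uint8_t)(value >> 16);
        /* fall through: the low 16 bits are stored below */
    case FN_WRITE_WORD:
        sim_ports[p + 1] = (uint8_t)(value >> 8);
        /* fall through: the low 8 bits are stored below */
    case FN_WRITE_BYTE:
        sim_ports[p] = (uint8_t)value;
        return 0;
    default:
        return 0xFFFFFFFFu;
    }
}
```

The price of the single entry point is one extra dispatch per call, which is the small performance loss mentioned above; in exchange, the driver consumes one GDT selector instead of six.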
