documentation update

author: Wladimir J. van der Laan <laanwj@gmail.com> 2013-01-12 20:29:07 +0100
committer: Wladimir J. van der Laan <laanwj@gmail.com> 2013-01-12 20:29:07 +0100
commit: 2fc2018c6979149f0e9f868ab9cbdf4db8680219 (patch)
tree: e6572a396b1fa4e9f16828282e82f6d66063a6ac /doc/kernel_interface.md
parent: ca2c00723e3697d85b6abc3e400979720a0d3a96 (diff)
1 files changed, 114 insertions, 58 deletions
diff --git a/doc/kernel_interface.md b/doc/kernel_interface.md
index b819b7b..7626178 100644
--- a/doc/kernel_interface.md
+++ b/doc/kernel_interface.md
@@ -1,26 +1,65 @@
 Devices
 =======================
 
-At startup, the application connects to galcore device using open with the device
+At startup, the application connects to galcore device using `open` with the device
 
 - `/dev/galcore`, or
 - `/dev/graphics/galcore`
 
-Then it figures out what the base and size of the contigous memory is using the "query video memory" call.
+Ioctl
+-------
 
-After that it mmaps the complete contiguous memory (128MB in my case) from device into memory space.
+Communication with the kernel driver happens through ioctl calls on the resulting file descriptor. The following request ids are defined:
+
+- `IOCTL_GCHAL_INTERFACE` (30000)
+- `IOCTL_GCHAL_KERNEL_INTERFACE` (30001)
+- `IOCTL_GCHAL_TERMINATE` (30002)
+
+`IOCTL_GCHAL_INTERFACE` is the only one of these that is actually used by the userspace blob. This ioctl is passed one argument
+which is a pointer to the following structure:
+
+    typedef struct 
+    {
+        void *in_buf;
+        uint32_t in_buf_size;
+        void *out_buf;
+        uint32_t out_buf_size;
+    } vivante_ioctl_data_t;
+
+When used by the blob, `in_buf` and `out_buf` point to the same memory address: a `gcsHAL_INTERFACE` structure that is 
+used both for input and output arguments.
+
+gcsHAL_INTERFACE
+-----------------
+The `gcsHAL_INTERFACE` (defined in `gc_hal_driver`) is the structure used by the driver to communicate with the 
+kernel. It can be seen as a communication packet with a command opcode and a union with parameters.
+Depending on the `command` a different field of this union is used. The same structure is used both for input and output
+arguments. 
+
+For example, the command `gcvHAL_ALLOCATE_LINEAR_VIDEO_MEMORY` (I will leave off the `gcvHAL_` from now on) 
+uses the fields in `interface->u.AllocateLinearVideoMemory` to pass in the number of bytes to allocate, but 
+also to pass out the number of bytes actually allocated. 
+
+It appears that the structure has been designed with platform-independence in mind, and so some of the fields, such as `status`, `handle`
+and `pid`, are not used in the Linux drivers.
 
 Allocations
 ------------
-Two types of memory are allocated:
+Memory management happens in the kernel. Two types of memory are allocated:
+
+- Contiguous memory
 
-Contiguous memory
   Used for command buffers
+  Allocated with command `ALLOCATE_CONTIGUOUS_MEMORY`
+
   Reserved system memory that is contiguous (not fragmented by MMU) and mapped into GPU memory
   It looks like the blob driver also allocates a signal for each contigous memory block, how does this get used?
 
-Linear video memory
-  Used for render targets, textures, surfaces, vertex buffers, bitmaps (see `gcvSURF_*`) ...
+- Linear video memory
+
+  Used for render targets, textures, surfaces, vertex buffers, bitmaps.
+  The type of usage is specified when allocating the memory (see `gceSURF_TYPE` in `gc_hal_enum.h`).
+  Allocated with command `ALLOCATE_LINEAR_VIDEO_MEMORY`
 
   Device memory, from one of the pools (default, local, unified or contiguous system memory)
   The available pools depend on the hardware; many of the devices have no local memory, and simply 
@@ -30,10 +69,35 @@ Linear video memory
 into CPU memory so that the application can read/write. It is interesting that these are done by
 the same call.
 
+Command buffers
+-------------------
+
+Like many other GPUs, the primary means of programming the chip is through a command stream 
+interpreted by a DMA engine. This "Front End" takes care of distributing state changes through
+the individual modules of the GPU, kicking off primitive rendering, synchronization, 
+and also supports some primitive flow control (branch, call, return).
+
+The command stream is submitted to the kernel by means of command buffers. Most importantly, these 
+structures contain a pointer to contiguous memory (allocated with command `ALLOCATE_CONTIGUOUS_MEMORY`) 
+where the commands start.
+
+Command buffers are built in user space by the driver in a `gcoCMDBUF` structure, then submitted to the kernel with the 
+`COMMIT` command. 
+
+The following structure fields of `gcoCMDBUF` are used by the kernel:
+
+- `object`: marks the type of object (`gcvOBJ_COMMANDBUFFER`)
+- `physical`: physical address of command buffer
+- `logical`: logical (user space) address of command buffer
+- `bytes`: size of command buffer memory block in bytes
+- `startOffset`: offset at which to start sending command buffer (in bytes)
+- `offset`: end offset (in bytes)
+- `free`: number of free bytes in command buffer
+
 User signal API
 ----------------
-Commands with op `USER_SIGNAL` are used for signaling between the kernel and userspace
-driver.
+Command `USER_SIGNAL` is used for synchronization signals between the kernel and userspace driver.
+
 The subcommands are:
 
 - `USER_SIGNAL_CREATE` Create a new signal
@@ -69,28 +133,29 @@ The subcommands are:
   Inputs: id
   Outputs: N/A
 
-This is used to synchronize GPU and CPU
+This is used to synchronize GPU and CPU. 
 Signals can be scheduled to be signalled/unsignalled when the GPU finished a certain operation (using an Event).
 They are also used for inter-thread synchronization by the EGL driver.
 
-The event queue schedules kernel operations to happen in the future.
+The event queue effectively schedules kernel operations to happen in the future, when the GPU has finished processing the currently
+committed command buffers.
 
-Events queues are sent to the kernel using `gcvHAL_EVENT_COMMIT`. Types of interfaces that can be sent using an event are:
+Event queues are sent to the kernel using the command `EVENT_COMMIT`. Types of interfaces that can be sent using an event are:
 
-- `FREE_NON_PAGED_MEMORY`
-- `FREE_CONTIGUOUS_MEMORY`
-- `FREE_VIDEO_MEMORY`
-- `WRITE_DATA`
-- `UNLOCK_VIDEO_MEMORY`
-- `SIGNAL`
-- `UNMAP_USER_MEMORY`
+- `FREE_NON_PAGED_MEMORY`: free earlier allocated non-paged memory
+- `FREE_CONTIGUOUS_MEMORY`: free earlier allocated contiguous memory
+- `FREE_VIDEO_MEMORY`: free earlier allocated video memory
+- `WRITE_DATA`: write data to memory using `writel`
+- `UNLOCK_VIDEO_MEMORY`: unlock earlier locked video memory
+- `SIGNAL`: command from the signal API described in this section
+- `UNMAP_USER_MEMORY`: unmap earlier mapped user memory
 
-User can then wait for them using `gcvHAL_USER_EVENT`.
+Userspace can then wait for them using `USER_SIGNAL` with subcommand `USER_SIGNAL_WAIT`.
 
 Anatomy of a small rendering test
 ----------------------------------
 
-See `native/replay` examples.
+See `native/replay` tests for details.
 
 - Get GPU base address
 - Get chip identity
@@ -127,20 +192,6 @@ See `native/replay` examples.
 - Lock vidmem K, address 7a002f00, memory 422f0f00
 - Build and commit the command buffer
 
-Command buffers
-==================
-Command buffers are built in user space by the driver in a `gcoCMDBUF` structure, then submitted to the kernel with the 
-`gcvHAL_COMMIT` command. 
-
-The following structure fields of `gcoCMDBUF` are used by the kernel:
-
-- object: marks the type of object (`gcvOBJ_COMMANDBUFFER`)
-- physical: physical address of command buffer
-- logical: logical (user space) address of command buffer
-- bytes: size of command buffer memory block in bytes
-- startOffset: offset at which to start sending command buffer (in bytes)
-- offset: end offset (in bytes)
-- free: number of free bytes in command buffer
 
 Context switching
 ==================
@@ -159,68 +210,69 @@ to the main command buffer.
 
 The state `FE.VERTEX_ELEMENT_CONFIG` is handled specially: write only the elements that are used, starting from 0x00600
 
-Used fields in `_gcoCONTEXT` from the kernel:
+Used fields in `struct _gcoCONTEXT` from the kernel:
 
-- id is used
+- `id` is used
     [in] This id is used to determine wether to switch context
     [out] A unique id for the context is generated the first time a COMMIT is done, with context->id==0
-- `hint*` only used when SECURE_USER
-- logical and bufferSize are used
-- pipe2DIndex is used
+- `hint*` only used when `SECURE_USER` is set
+- `logical` and `bufferSize` are used
+- `pipe2DIndex` is used
     if this is set, "we have to check pipes", and the pipe is set to initialPipe if needed
-- entryPipe is used
+- `entryPipe` is used
     this is the pipe that has to be active on entering the passed command buffer
     (and that holds at the end of the context buffer)
-- initialPipe is used 
+- `initialPipe` is used 
     this is the pipe that has to be active on entering the context command buffer
-- currentPipe is used
+- `currentPipe` is used
     this is the pipe that is active after the passed command buffer
-- inUse is used
+- `inUse` is used
     value at this address is set to gcvTRUE, to mark the context as used. The context is "used" when
     a context switch happened.
 
 All command buffers are padded with 4 NOPs at the beginning to make place for a PIPE command if needed.
 At the end of the command buffer must be place for a LINK (1 NOP + padding).
 
-What are these used for? Seems they are the last parameters of a 'LOAD_STATE' command so that it
+What are these used for? Seems they are the last parameters of a `LOAD_STATE` command so that it
   can be extended, but why? Was this only used for building or does the kernel also use it?
-- lastAddress
-- lastSize
-- lastIndex
-- lastFixed
-- postCommit
-- buffer (userspace buffer for what is put into logical)
+- `lastAddress`
+- `lastSize`
+- `lastIndex`
+- `lastFixed`
+- `postCommit`
+- `buffer` (userspace buffer for what is put into logical)
    same as logical, except that the PIPE2D command at the end is nopped out
    including accompanying semaphore and stall
    (probably because we're using the 3D pipe)
 
-At least in the v2 kernel driver they are not used. They are used for building the buffer from the 
+At least in the v2 kernel driver these fields are not used. They are used for building the buffer from the 
 userspace driver, but not for using it.
 
 Profiling
 ===============
 
 To enable profiling, the kernel most have been built with `VIVANTE_PROFILER` enabled in `gc_hal_options.h`.
+    
+HW profiling registers can be read using the command `READ_ALL_PROFILE_REGISTERS`.
 
-HW profiling registers can be read using the ioctl:
-
-    gcvHAL_READ_ALL_PROFILE_REGISTERS
+There are also the commands `GET_PROFILE_SETTING` and `SET_PROFILE_SETTING`, apparently for logging to files, 
+but these aren't even implemented in the kernel drivers.
 
 This will return a structure `gcsPROFILER_COUNTERS`, defined in `GC_HAL_PROFILER.h`, which has the following timers:
 
 Hardware-wise, the memory controller keeps track of these counters in registers `MC_PROFILE_xx_READ`,
 switched by corresponding bits in registers `MC_PROFILE_CONFIGx`.
 
-HW static counters (clock rates). These are not filled in by the kernel, it appears.
+HW static counters (clock rates). These are never filled in by the kernel, it appears, so will likely contain garbage.
 
     gpuClock
     axiClock
     shaderClock
+    gpuClockStart
+    gpuClockEnd
 
 HW variable counters
 
-    gpuClockStart
-    gpuClockEnd
     gpuCyclesCounter
     gpuTotalRead64BytesPerFrame
     gpuTotalWrite64BytesPerFrame
@@ -291,4 +343,8 @@ HI (Host interface)
     hi_axi_cycles_write_request_stalled
     hi_axi_cycles_write_data_stalled
 
+Resetting the GPU
+-------------------
+
+When the GPU gets stuck, it can be reset with the `gcvHAL_RESET` ioctl command. This calls the `gckHARDWARE_Reset` kernel function.
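
Note on the patch above (not part of the diff): the new "Ioctl" section describes how `IOCTL_GCHAL_INTERFACE` takes a `vivante_ioctl_data_t` whose `in_buf` and `out_buf` point at the same `gcsHAL_INTERFACE`. A minimal C sketch of that calling convention could look like the following. The helper names `wrap_interface`, `open_galcore` and `call_interface`, the `O_RDWR` open mode, and the `example_interface_t` stand-in are assumptions made for illustration; the real `gcsHAL_INTERFACE` layout comes from the driver headers and is not reproduced here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Request id as listed in the documentation above. */
#define IOCTL_GCHAL_INTERFACE 30000

/* Argument structure passed to the ioctl, as given in the documentation. */
typedef struct
{
    void *in_buf;
    uint32_t in_buf_size;
    void *out_buf;
    uint32_t out_buf_size;
} vivante_ioctl_data_t;

/* Hypothetical stand-in for gcsHAL_INTERFACE (assumption: the real
 * structure is a command opcode plus a union of parameters). */
typedef struct
{
    uint32_t command;
    uint8_t payload[64];
} example_interface_t;

/* Point both the input and the output buffer at the same interface
 * structure, which is how the blob is described as using the ioctl. */
vivante_ioctl_data_t wrap_interface(void *iface, uint32_t size)
{
    vivante_ioctl_data_t data;

    assert(iface != NULL);
    memset(&data, 0, sizeof(data));
    data.in_buf = iface;
    data.in_buf_size = size;
    data.out_buf = iface;
    data.out_buf_size = size;
    return data;
}

/* Try both device paths named in the documentation.
 * Returns a file descriptor, or -1 if neither can be opened. */
int open_galcore(void)
{
    int fd = open("/dev/galcore", O_RDWR);
    if (fd < 0)
        fd = open("/dev/graphics/galcore", O_RDWR);
    return fd;
}

/* Send one interface structure to the kernel; the kernel writes its
 * output arguments back into the same structure.
 * Returns the ioctl result (-1 on failure, with errno set). */
int call_interface(int fd, void *iface, uint32_t size)
{
    vivante_ioctl_data_t data = wrap_interface(iface, size);
    return ioctl(fd, IOCTL_GCHAL_INTERFACE, &data);
}
```

A caller would fill in the command and its parameters, pass the structure to `call_interface` with the descriptor from `open_galcore`, and then read the output fields from the same structure.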