author: Wladimir J. van der Laan <laanwj@gmail.com> 2013-09-09 12:38:27 +0200
committer: Wladimir J. van der Laan <laanwj@gmail.com> 2013-09-09 13:51:19 +0200
commit: 7d5eafe76d06e6611a6e1e8c175f11a95a4f371f
tree: 8f5ef01813ada37354040d37227adebf27c72ade /doc/kernel_bugs.md
parent: 199c781c703eb3b4986bd454d29a5049f19290ff
driver: fix cousins of MSAA hang bug
- Change the order in `reset_context` as well
- Clobber PS INPUTS state when MSAA samples changed to make sure it's always written, as we don't know the current value anymore.
Diffstat (limited to 'doc/kernel_bugs.md')
-rw-r--r-- doc/kernel_bugs.md | 22
1 files changed, 22 insertions, 0 deletions
diff --git a/doc/kernel_bugs.md b/doc/kernel_bugs.md
index 0ccb33a..76e5445 100644
--- a/doc/kernel_bugs.md
+++ b/doc/kernel_bugs.md
@@ -31,3 +31,25 @@ signal causes the kernel to run out of signals. This causes hangs in rendering.
Status: workaround found (see `ETNA_MAX_UNSIGNALED_FLUSHES`)
+Command buffer memory management on cubox
+-------------------------------------------
+
+ <_rmk_> err, this is not good.
+ <_rmk_> gckCOMMAND_Start: command queue is at 0xe7846000
+ <_rmk_> gckCOMMAND_Start: WL 380000c8 0cf6d19f 40000002 10281000
+ <_rmk_> that's what the first 4 words should've been at the beginning (and were when the GPU was started)
+ <_rmk_> they then conveniently become: 4000002C 19CE0000 AAAAAAAA AAAAAAAA
+ <_rmk_> the first two change because of the wait being converted to a link
+ <_rmk_> the second two...
+ <_rmk_> well, 0xaaaaaaaa is the free page poisoning.
+ <_rmk_> which suggests that the GPU command page was freed while still in use
+ <_rmk_> ah ha, that'll be how. get lucky with the cache behaviour and this is what happens...
+ <_rmk_> page poisoning enabled. When a lowmem page is allocated, it's memset() through its lowmem mapping, which is cacheable.
+ <_rmk_> this data can sit in the CPU cache.
+ <_rmk_> it is then ioremap()'d (which I've been dead against for years) which creates a *device* mapping.
+ <_rmk_> we then write the wait/link through the device mapping.
+ <_rmk_> at some point later, the cache lines get evicted from the _normal memory_ mapping.
+ <_rmk_> thereby overwriting the original wait/link commands in the GPU stream with 0xAAAAAAAA
+ <_rmk_> I wonder what a command word with a bit pattern of 10101 in the top 5 bits tells the GPU to do...
+ <_rmk_> the only thing I can say is... if people would damn well listen to me when I say "don't do this" then you wouldn't get these bugs.
+ <_rmk_> this is one of the reasons why my check in ioremap.c exists to prevent system memory being ioremap'd and therefore this kind of issue cropping up
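
The sequence described in the log above can be illustrated with a toy model. This is only a sketch, not kernel or driver code: the `Page` class, its method names, and the four-word page size are hypothetical, and the ordering (eviction before the wait-to-link conversion) is an assumption chosen because it is consistent with the words quoted in the log. It models one page of RAM reached through two aliased mappings, a cacheable one and a device one.

```python
# Toy model (hypothetical, for illustration only): one page of RAM that is
# reachable through a cacheable mapping and an uncached device mapping.
POISON = 0xAAAAAAAA
WORDS = 4  # only the first four command words are modelled


class Page:
    def __init__(self):
        self.ram = [0] * WORDS  # what the GPU actually fetches
        self.cache = None       # contents of a dirty CPU cache line, if any

    def write_cached(self, values):
        # Cacheable lowmem mapping: the data may sit in the CPU cache only.
        self.cache = list(values)

    def write_device(self, offset, values):
        # Device mapping (ioremap): writes bypass the cache and land in RAM.
        self.ram[offset:offset + len(values)] = values

    def evict(self):
        # Write-back of the stale dirty line from the normal-memory alias.
        if self.cache is not None:
            self.ram[:] = self.cache
            self.cache = None


def simulate():
    page = Page()
    # Page poisoning: memset() through the cacheable mapping.
    page.write_cached([POISON] * WORDS)
    # Initial wait/link written through the device mapping.
    page.write_device(0, [0x380000C8, 0x0CF6D19F, 0x40000002, 0x10281000])
    started = list(page.ram)
    # Assumed ordering: the poisoned line is evicted at this point.
    page.evict()
    # The wait is converted to a link (first two words only).
    page.write_device(0, [0x4000002C, 0x19CE0000])
    return started, page.ram


started, final = simulate()
print(" ".join(f"{w:08X}" for w in final))
# prints "4000002C 19CE0000 AAAAAAAA AAAAAAAA"
print(bin(POISON >> 27))  # top 5 bits of the poison word
# prints "0b10101"
```

In this model the command words are intact when the GPU is started, and the later write-back from the cacheable alias replaces them with the poison value; only the two words rewritten afterwards through the device mapping survive.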