Age | Commit message (Collapse) | Author |
|
it will cause hwmon node of power1_label is not created.
v2:
the hwmon node of "power1_label" is always needed for all ASICs.
and the patch will remove ASIC type check for "power1_label".
Fixes: ae07970a0621d6 ("drm/amd/pm: add support for hwmon control of slow and fast PPT limit on vangogh")
Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
PPT limit cannot be queried from VF
Fixes: f3527a6483fbcc ("drm/amd/pm: Enable sysfs required by rocm-smi tool for One VF mode")
Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If no driver is attached to a device or the driver does not provide the
path_event function, an FCES path-event on this device could end up in a
kernel-panic. Verify the driver availability before the path_event
function call.
Fixes: 32ef938815c1 ("s390/cio: Add support for FCES status notification")
Cc: stable@vger.kernel.org
Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Suggested-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
As reported by Jeff, dereferencing the openat2 syscall argument in
audit_match_perm() to obtain the open_how::flags can result in an
oops/page-fault. This patch fixes this by using the open_how struct
that we store in the audit_context with audit_openat2_how().
Independent of this patch, Richard Guy Briggs posted a similar patch
to the audit mailing list roughly 40 minutes after this patch was
posted.
Cc: stable@vger.kernel.org
Fixes: 1c30e3af8a79 ("audit: add support for the openat2 syscall")
Reported-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
it will cause hwmon node of power1_label is not created.
v2:
the hwmon node of "power1_label" is always needed for all ASICs.
and the patch will remove ASIC type check for "power1_label".
Fixes: ae07970a0621d6 ("drm/amd/pm: add support for hwmon control of slow and fast PPT limit on vangogh")
Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Even if can_apply_edp_fast_boot is set to 1 at boot, this flag will
be cleared to 0 at S3 resume.
[How]
Keep eDP Vdd on when eDP stream is already enabled.
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Signed-off-by: Zhan Liu <Zhan.Liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix clamping to match register field size
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
pflip interrupt order are mapped 1 to 1 to otg id.
e.g. if irq_src=26 corresponds to otg0 then 27->otg1, 28->otg2...
Linux DM registers pflip interrupts per number of crtcs.
In fused pipe case crtc numbers can be less than otg id.
e.g. if one pipe out of 3(otg#0-2) is fused adev->mode_info.num_crtc=2
so DM only registers irq_src 26,27.
This is a bug since if pipe#2 remains unfused DM never gets
otg2 pflip interrupt (irq_src=28)
That may results in gfx failure due to pflip timeout.
[How]
Register pflip interrupts per max num of otg instead of num_crtc
Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Confirmed with hardware team, there is harvesting for gc 10.3.1.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
A number of BIOS versions have a problem with the watermarks table not
being configured properly. This manifests as a very scary looking warning
during resume from s0i3. This should be harmless in most cases and is well
understood, so decrease the assertion to a clearer warning about the problem.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
IPS must be disabled prior to disabling the last plane (excluding
the cursor). Make the code do that instead of assuming the primary
plane would be the last one. This is probably 100% theoretical
as the BIOS should never light up the other planes anyway. But
no harm in making the code totally consistent.
Also let's update the ips_enabled flag in the crtc state afterwards
so that the first atomic commit has accurate information about
the state of IPS.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209113526.7595-6-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
|
|
Pull the IPS state readout into hsw_ips.c.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209113526.7595-5-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
IPS is a pretty well isolated feature. Move the relevant code
to a separate file from polluting intel_display.c.
I stuck to the hsw_ips name since that's what the function were
already using, and also to avoid confusion with the ILK
"Intelligen Power Sharing"/intel_ips GPU turbo stuff.
And let's also do the s/dev_priv/i915/ rename while touching
most of the code.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209113526.7595-4-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
No reason the caller of the IPS pre/post update hooks should
be responsible for the actual IPS enab/disable. Just pull those
calls into the pre/post update hooks themselves. And while
at it let's adjust the function naming a bit to have a consistent
namespace.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209113526.7595-3-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
Follow the modern state+crtc calling convention for the IPS
code as well.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209113526.7595-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
Hoist the IPS related vblank waits one level up. Later on we'll
want to consolidate all the potential pre-plane update vblank
waits into one so we can't be hiding any in low level code.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209113526.7595-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
Replace reset queue for specific PASID with unmap all queues, reset
queue could break CP scheduler.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
As the function is used in more different cases, use a more general
name.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Commit 7f7b4236f204 ("x86/PCI: Ignore E820 reservations for bridge windows
on newer systems") fixes the touchpad not working on laptops like
the Lenovo IdeaPad 3 15IIL05 and the Lenovo IdeaPad 5 14IIL05, as well as
fixing thunderbolt hotplug issues on the Lenovo Yoga C940.
Unfortunately it turns out that this is causing issues with suspend/resume
on Lenovo ThinkPad X1 Carbon Gen 2 laptops. So, per the no regressions
policy, rever this. Note I'm looking into another fix for the issues this
fixed.
Fixes: 7f7b4236f204 ("x86/PCI: Ignore E820 reservations for bridge windows on newer systems")
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2029207
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The original version of the IORT PMCG definition had an oversight
wherein there was no way to describe the second register page for an
implementation using the recommended RELOC_CTRS feature. Although the
spec was fixed, and the final patches merged to ACPICA and Linux written
against the new version, it seems that some old firmware based on the
original revision has survived and turned up in the wild.
Add a check for the original PMCG definition, and avoid filling in the
second memory resource with nonsense if so. Otherwise it is likely that
something horrible will happen when the PMCG driver attempts to probe.
Reported-by: Michael Petlan <mpetlan@redhat.com>
Fixes: 24e516049360 ("ACPI/IORT: Add support for PMCG")
Cc: <stable@vger.kernel.org> # 5.2.x
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Link: https://lore.kernel.org/r/75628ae41c257fb73588f7bf1c4459160e04be2b.1643916258.git.robin.murphy@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull more nfsd fixes from Chuck Lever:
"Ensure that NFS clients cannot send file size or offset values that
can cause the NFS server to crash or to return incorrect or surprising
results.
In particular, fix how the NFS server handles values larger than
OFFSET_MAX"
* tag 'nfsd-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: Deprecate NFS_OFFSET_MAX
NFSD: Fix offset type in I/O trace points
NFSD: COMMIT operations must not return NFS?ERR_INVAL
NFSD: Clamp WRITE offsets
NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes
NFSD: Fix ia_size underflow
NFSD: Fix the behavior of READ near OFFSET_MAX
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"Fix two regressions:
- Potential boot failure due to missing cryptomgr on initramfs
- Stack overflow in octeontx2"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: api - Move cryptomgr soft dependency into algapi
crypto: octeontx2 - Avoid stack variable overflow
|
|
Commit 3ba442d5331f ("fs: move binfmt_misc sysctl to its own file") did
not go unnoticed, binfmt-support stopped to work on my Debian system
since v5.17-rc2 (did not check with -rc1).
The existance of the /proc/sys/fs/binfmt_misc is a precondition for
attempting to mount the binfmt_misc fs, which in turn triggers the
autoload of the binfmt_misc module. Without it, no module is loaded and
no binfmt is available at boot.
Building as built-in or manually loading the module and mounting the fs
works fine, it's therefore only a matter of interaction with user-space.
I could try to improve the Debian systemd configuration but I can't say
anything about the other distributions.
This patch restores a working system right after boot.
Fixes: 3ba442d5331f ("fs: move binfmt_misc sysctl to its own file")
Signed-off-by: Domenico Andreoli <domenico.andreoli@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since we have a single instance of reset semaphore which we
lock only once even for XGMI hive we don't need the nested
locking hint anymore.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74120.html
|
|
This functions needs to be split into 2 parts where
one is called only once for locking single instance of
reset_domain's sem and reset flag and the other part
which handles MP1 states should still be called for
each device in XGMI hive.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74118.html
|
|
We should have a single instance per entrire reset domain.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74116.html
|
|
We want single instance of reset sem across all
reset clients because in case of XGMI we should stop
access cross device MMIO because any of them could be
in a reset in the moment.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74117.html
|
|
The reset domain contains register access semaphor
now and so needs to be present as long as each device
in a hive needs it and so it cannot be binded to XGMI
hive life cycle.
Adress this by making reset domain refcounted and pointed
by each member of the hive and the hive itself.
v4:
Fix crash on boot witrh XGMI hive by adding type to reset_domain.
XGMI will only create a new reset_domain if prevoius was of single
device type meaning it's first boot. Otherwsie it will take a
refocunt to exsiting reset_domain from the amdgou device.
Add a wrapper around reset_domain->refcount get/put
and a wrapper around send to reset wq (Lijo)
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74121.html
|
|
Since now all GPU resets are serialzied there is no need for this.
This patch also reverts 'drm/amdgpu: race issue when jobs on 2 ring timeout'
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74119.html
|
|
Since we serialize all resets no need to protect from concurrent
resets.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74115.html
|
|
No need to to trigger another work queue inside the work queue.
v3:
Problem:
Extra reset caused by host side FLR notification
following guest side triggered reset.
Fix: Preven qeuing flr_work from mailbox irq if guest
already executing a reset.
Suggested-by: Liu Shaoyun <Shaoyun.Liu@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Liu Shaoyun <Shaoyun.Liu@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74114.html
|
|
Use reset domain wq also for non TDR gpu recovery trigers
such as sysfs and RAS. We must serialize all possible
GPU recoveries to gurantee no concurrency there.
For TDR call the original recovery function directly since
it's already executed from within the wq. For others just
use a wrapper to qeueue work and wait on it to finish.
v2: Rename to amdgpu_recover_work_struct
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74113.html
|
|
Before we initialize schedulers we must know which reset
domain are we in - for single device there iis a single
domain per device and so single wq per device. For XGMI
the reset domain spans the entire XGMI hive and so the
reset wq is per hive.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74112.html
|
|
Defined a reset_domain struct such that
all the entities that go through reset
together will be serialized one against
another. Do it for both single device and
XGMI hive cases.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Suggested-by: Christian König <ckoenig.leichtzumerken@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74111.html
|
|
Pull s390 kvm fix from Christian Borntraeger:
"Add missing check for the MEMOP ioctl
The SIDA MEMOPs must only be used for secure guests, otherwise
userspace can do unwanted memory accesses"
* tag 'kvm-s390-kernel-access' from emailed bundle:
KVM: s390: Return error on SIDA memop on normal guest
|
|
NFS_OFFSET_MAX was introduced way back in Linux v2.3.y before there
was a kernel-wide OFFSET_MAX value. As a clean up, replace the last
few uses of it with its generic equivalent, and get rid of it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
NFSv3 and NFSv4 use u64 offset values on the wire. Record these values
verbatim without the implicit type case to loff_t.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Since, well, forever, the Linux NFS server's nfsd_commit() function
has returned nfserr_inval when the passed-in byte range arguments
were non-sensical.
However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
are permitted to return only the following non-zero status codes:
NFS3ERR_IO
NFS3ERR_STALE
NFS3ERR_BADHANDLE
NFS3ERR_SERVERFAULT
NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
is not listed in the COMMIT row of Table 6 in RFC 8881.
RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that
is not supported, turn it into a valid request by treating one or
both arguments as zero. Offset zero means start-of-file, count zero
means until-end-of-file, so we only ever extend the commit range.
NFS servers are always allowed to commit more and sooner than
requested.
The range check is no longer bounded by NFS_OFFSET_MAX, but rather
by the value that is returned in the maxfilesize field of the NFSv3
FSINFO procedure or the NFSv4 maxfilesize file attribute.
Note that this change results in a new pynfs failure:
CMT4 st_commit.testCommitOverflow : RUNNING
CMT4 st_commit.testCommitOverflow : FAILURE
COMMIT with offset + count overflow should return
NFS4ERR_INVAL, instead got NFS4_OK
IMO the test is not correct as written: RFC 8881 does not allow the
COMMIT operation to return NFS4ERR_INVAL.
Reported-by: Dan Aloni <dan.aloni@vastdata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Bruce Fields <bfields@fieldses.org>
|
|
Ensure that a client cannot specify a WRITE range that falls in a
byte range outside what the kernel's internal types (such as loff_t,
which is signed) can represent. The kiocb iterators, invoked in
nfsd_vfs_write(), should properly limit write operations to within
the underlying file system's s_maxbytes.
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
iattr::ia_size is a loff_t, so these NFSv3 procedures must be
careful to deal with incoming client size values that are larger
than s64_max without corrupting the value.
Silently capping the value results in storing a different value
than the client passed in which is unexpected behavior, so remove
the min_t() check in decode_sattr3().
Note that RFC 1813 permits only the WRITE procedure to return
NFS3ERR_FBIG. We believe that NFSv3 reference implementations
also return NFS3ERR_FBIG when ia_size is too large.
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and
NFSv4 both define file size as an unsigned 64-bit type. Thus there
is a range of valid file size values an NFS client can send that is
already larger than Linux can handle.
Currently decode_fattr4() dumps a full u64 value into ia_size. If
that value happens to be larger than S64_MAX, then ia_size
underflows. I'm about to fix up the NFSv3 behavior as well, so let's
catch the underflow in the common code path: nfsd_setattr().
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Dan Aloni reports:
> Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to
> the RPC read layers") on the client, a read of 0xfff is aligned up
> to server rsize of 0x1000.
>
> As a result, in a test where the server has a file of size
> 0x7fffffffffffffff, and the client tries to read from the offset
> 0x7ffffffffffff000, the read causes loff_t overflow in the server
> and it returns an NFS code of EINVAL to the client. The client as
> a result indefinitely retries the request.
The Linux NFS client does not handle NFS?ERR_INVAL, even though all
NFS specifications permit servers to return that status code for a
READ.
Instead of NFS?ERR_INVAL, have out-of-range READ requests succeed
and return a short result. Set the EOF flag in the result to prevent
the client from retrying the READ request. This behavior appears to
be consistent with Solaris NFS servers.
Note that NFSv3 and NFSv4 use u64 offset values on the wire. These
must be converted to loff_t internally before use -- an implicit
type cast is not adequate for this purpose. Otherwise VFS checks
against sb->s_maxbytes do not work properly.
Reported-by: Dan Aloni <dan.aloni@vastdata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
AER is not backed by a real request, hence we should not incorrectly
assume that when failing to send a nvme command, it is a normal request
but rather check if this is an aer and if so complete the aer (similar
to the normal completion path).
Cc: stable@vger.kernel.org
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Add NVMe request completion trace in nvme_complete_batch_req() because
nvme:nvme_complete_req tracepoint is missing in case of request batched
completion.
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Xin Long says:
====================
vlan: fix a netdev refcnt leak for QinQ
This issue can be simply reproduced by:
# ip link add dummy0 type dummy
# ip link add link dummy0 name dummy0.1 type vlan id 1
# ip link add link dummy0.1 name dummy0.1.2 type vlan id 2
# rmmod 8021q
unregister_netdevice: waiting for dummy0.1 to become free. Usage count = 1
So as to fix it, adjust vlan_dev_uninit() in Patch 1/1 so that it won't
be called twice for the same device, then do the fix in vlan_dev_uninit()
in Patch 2/2.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Shuang Li reported an QinQ issue by simply doing:
# ip link add dummy0 type dummy
# ip link add link dummy0 name dummy0.1 type vlan id 1
# ip link add link dummy0.1 name dummy0.1.2 type vlan id 2
# rmmod 8021q
unregister_netdevice: waiting for dummy0.1 to become free. Usage count = 1
When rmmods 8021q, all vlan devs are deleted from their real_dev's vlan grp
and added into list_kill by unregister_vlan_dev(). dummy0.1 is unregistered
before dummy0.1.2, as it's using for_each_netdev() in __rtnl_kill_links().
When unregisters dummy0.1, dummy0.1.2 is not unregistered in the event of
NETDEV_UNREGISTER, as it's been deleted from dummy0.1's vlan grp. However,
due to dummy0.1.2 still holding dummy0.1, dummy0.1 will keep waiting in
netdev_wait_allrefs(), while dummy0.1.2 will never get unregistered and
release dummy0.1, as it delays dev_put until calling dev->priv_destructor,
vlan_dev_free().
This issue was introduced by Commit 563bcbae3ba2 ("net: vlan: fix a UAF in
vlan_dev_real_dev()"), and this patch is to fix it by moving dev_put() into
vlan_dev_uninit(), which is called after NETDEV_UNREGISTER event but before
netdev_wait_allrefs().
Fixes: 563bcbae3ba2 ("net: vlan: fix a UAF in vlan_dev_real_dev()")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to introduce vlan_dev_free_egress_priority() to
free egress priority for vlan dev, and keep vlan_dev_uninit()
static as .ndo_uninit. It makes the code more clear and safer
when adding new code in vlan_dev_uninit() in the future.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The ax25_kill_by_device() will set s->ax25_dev = NULL and
call ax25_disconnect() to change states of ax25_cb and
sock, if we call ax25_bind() before ax25_kill_by_device().
However, if we call ax25_bind() again between the window of
ax25_kill_by_device() and ax25_dev_device_down(), the values
and states changed by ax25_kill_by_device() will be reassigned.
Finally, ax25_dev_device_down() will deallocate net_device.
If we dereference net_device in syscall functions such as
ax25_release(), ax25_sendmsg(), ax25_getsockopt(), ax25_getname()
and ax25_info_show(), a UAF bug will occur.
One of the possible race conditions is shown below:
(USE) | (FREE)
ax25_bind() |
| ax25_kill_by_device()
ax25_bind() |
ax25_connect() | ...
| ax25_dev_device_down()
| ...
| dev_put_track(dev, ...) //FREE
ax25_release() | ...
ax25_send_control() |
alloc_skb() //USE |
the corresponding fail log is shown below:
===============================================================
BUG: KASAN: use-after-free in ax25_send_control+0x43/0x210
...
Call Trace:
...
ax25_send_control+0x43/0x210
ax25_release+0x2db/0x3b0
__sock_release+0x6d/0x120
sock_close+0xf/0x20
__fput+0x11f/0x420
...
Allocated by task 1283:
...
__kasan_kmalloc+0x81/0xa0
alloc_netdev_mqs+0x5a/0x680
mkiss_open+0x6c/0x380
tty_ldisc_open+0x55/0x90
...
Freed by task 1969:
...
kfree+0xa3/0x2c0
device_release+0x54/0xe0
kobject_put+0xa5/0x120
tty_ldisc_kill+0x3e/0x80
...
In order to fix these UAF bugs caused by rebinding operation,
this patch adds dev_hold_track() into ax25_bind() and
corresponding dev_put_track() into ax25_kill_by_device().
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rafael reports that on a system with LX2160A and Marvell DSA switches,
if a reboot occurs while the DSA master (dpaa2-eth) is up, the following
panic can be seen:
systemd-shutdown[1]: Rebooting.
Unable to handle kernel paging request at virtual address 00a0000800000041
[00a0000800000041] address between user and kernel address ranges
Internal error: Oops: 96000004 [#1] PREEMPT SMP
CPU: 6 PID: 1 Comm: systemd-shutdow Not tainted 5.16.5-00042-g8f5585009b24 #32
pc : dsa_slave_netdevice_event+0x130/0x3e4
lr : raw_notifier_call_chain+0x50/0x6c
Call trace:
dsa_slave_netdevice_event+0x130/0x3e4
raw_notifier_call_chain+0x50/0x6c
call_netdevice_notifiers_info+0x54/0xa0
__dev_close_many+0x50/0x130
dev_close_many+0x84/0x120
unregister_netdevice_many+0x130/0x710
unregister_netdevice_queue+0x8c/0xd0
unregister_netdev+0x20/0x30
dpaa2_eth_remove+0x68/0x190
fsl_mc_driver_remove+0x20/0x5c
__device_release_driver+0x21c/0x220
device_release_driver_internal+0xac/0xb0
device_links_unbind_consumers+0xd4/0x100
__device_release_driver+0x94/0x220
device_release_driver+0x28/0x40
bus_remove_device+0x118/0x124
device_del+0x174/0x420
fsl_mc_device_remove+0x24/0x40
__fsl_mc_device_remove+0xc/0x20
device_for_each_child+0x58/0xa0
dprc_remove+0x90/0xb0
fsl_mc_driver_remove+0x20/0x5c
__device_release_driver+0x21c/0x220
device_release_driver+0x28/0x40
bus_remove_device+0x118/0x124
device_del+0x174/0x420
fsl_mc_bus_remove+0x80/0x100
fsl_mc_bus_shutdown+0xc/0x1c
platform_shutdown+0x20/0x30
device_shutdown+0x154/0x330
__do_sys_reboot+0x1cc/0x250
__arm64_sys_reboot+0x20/0x30
invoke_syscall.constprop.0+0x4c/0xe0
do_el0_svc+0x4c/0x150
el0_svc+0x24/0xb0
el0t_64_sync_handler+0xa8/0xb0
el0t_64_sync+0x178/0x17c
It can be seen from the stack trace that the problem is that the
deregistration of the master causes a dev_close(), which gets notified
as NETDEV_GOING_DOWN to dsa_slave_netdevice_event().
But dsa_switch_shutdown() has already run, and this has unregistered the
DSA slave interfaces, and yet, the NETDEV_GOING_DOWN handler attempts to
call dev_close_many() on those slave interfaces, leading to the problem.
The previous attempt to avoid the NETDEV_GOING_DOWN on the master after
dsa_switch_shutdown() was called seems improper. Unregistering the slave
interfaces is unnecessary and unhelpful. Instead, after the slaves have
stopped being uppers of the DSA master, we can now reset to NULL the
master->dsa_ptr pointer, which will make DSA start ignoring all future
notifier events on the master.
Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown")
Reported-by: Rafael Richter <rafael.richter@gin.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Originally we proposed a new hdmi-5v-supply regulator reference
for CI20 device tree but that was superseded by a better idea to use
the already defined "ddc-en-gpios" property of the "hdmi-connector".
Since "MIPS: DTS: CI20: Add DT nodes for HDMI setup" has already
been applied to v5.17-rc1, we add this on top.
Fixes: ae1b8d2c2de9 ("MIPS: DTS: CI20: Add DT nodes for HDMI setup")
Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|