Information Disclosure
Information disclosure occurs when an application unintentionally exposes sensitive data that aids attackers in reconnaissance or directly compromises security.
How It Works
Information disclosure occurs when an application unintentionally exposes sensitive data that aids attackers in reconnaissance or directly compromises security. This happens through multiple channels: verbose error messages that display stack traces revealing internal paths and frameworks, improperly secured debug endpoints left active in production, and misconfigured servers that expose directory listings or version control artifacts like .git folders. APIs often leak excessive data in responses—returning full user objects when only a name is needed, or revealing system internals through metadata fields.
Attackers exploit these exposures systematically. They probe for common sensitive files (.env, config.php, backup archives), trigger error conditions to extract framework details, and analyze response timing or content differences to enumerate valid usernames or resources. Even subtle variations—like "invalid password" versus "user not found"—enable account enumeration. Exposed configuration files frequently contain database credentials, API keys, or internal service URLs that unlock further attack vectors.
The attack flow typically starts with passive reconnaissance: examining HTTP headers, JavaScript bundles, and public endpoints for version information and architecture clues. Active probing follows—testing predictable paths, manipulating parameters to trigger exceptions, and comparing responses across similar requests to identify information leakage patterns.
Impact
- Credential compromise: Exposed configuration files, hardcoded secrets in source code, or API keys enable direct authentication bypass
- Attack surface mapping: Stack traces, framework versions, and internal paths help attackers craft targeted exploits for known vulnerabilities
- Data breach: Direct exposure of user data, payment information, or proprietary business logic through oversharing APIs or accessible backups
- Privilege escalation pathway: Internal URLs, service discovery information, and architecture details facilitate lateral movement and SSRF attacks
- Compliance violations: GDPR, PCI-DSS, and HIPAA penalties for exposing regulated data through preventable disclosures
Real-World Examples
A major Git repository exposure affected thousands of websites when .git folders remained accessible on production servers, allowing attackers to reconstruct entire source code histories including deleted commits containing credentials. Tools like GitDumper automated mass exploitation of this misconfiguration.
Cloud storage misconfigurations have repeatedly exposed sensitive data when companies left S3 buckets or Azure Blob containers publicly readable. One incident exposed 150 million voter records because verbose API error messages revealed the storage URL structure, and no authentication was required.
Framework debug modes left enabled in production have caused numerous breaches. Django's DEBUG=True setting exposed complete stack traces with database queries and environment variables, while Laravel's debug pages revealed encryption keys through the APP_KEY variable in environment dumps.
Mitigation
- Generic error pages: Return uniform error messages to users; log detailed exceptions server-side only
- Disable debug modes: Enforce production configurations that suppress stack traces, verbose logging, and debug endpoints through deployment automation
- Access control audits: Restrict or remove development artifacts (
.git, backup files,phpinfo()) and internal endpoints before deployment - Response minimization: API responses should return only necessary fields; implement allowlists rather than blocklists for data exposure
- Security headers: Deploy
X-Content-Type-Options, remove server version banners, and disable directory indexing - Timing consistency: Ensure authentication and validation responses take uniform time regardless of input validity
Recent CVEs (16041)
In the Linux kernel, the following vulnerability has been resolved: spi: imx: fix use-after-free on unbind The SPI subsystem frees the controller and any subsystem allocated driver data as part of deregistration (unless the allocation is device managed). Take another reference before deregistering the controller so that the driver data is not freed until the driver is done with it.
In the Linux kernel, the following vulnerability has been resolved: io_uring/zcrx: fix user_struct uaf io_free_rbuf_ring() usees a struct user_struct, which io_zcrx_ifq_free() puts it down before destroying the ring.
In the Linux kernel, the following vulnerability has been resolved: ibmasm: fix OOB reads in command_file_write due to missing size checks The command_file_write() handler allocates a kernel buffer of exactly count bytes and copies user data into it, but does not validate the buffer against the dot command protocol before passing it to get_dot_command_size() and get_dot_command_timeout(). Since both the allocation size (count) and the header fields (command_size, data_size) are independently user-controlled, an attacker can cause get_dot_command_size() to return a value exceeding the allocation, triggering OOB reads in get_dot_command_timeout() and an out-of-bounds memcpy_toio() that leaks kernel heap memory to the service processor. Fix with two guards: reject writes smaller than sizeof(struct dot_command_header) before allocation, then after copying user data reject commands where the buffer is smaller than the total size declared by the header (sizeof(header) + command_size + data_size). This ensures all subsequent header and payload field accesses stay within the buffer.
In the Linux kernel, the following vulnerability has been resolved: LoongArch: Add spectre boundry for syscall dispatch table The LoongArch syscall number is directly controlled by userspace, but does not have a array_index_nospec() boundry to prevent access past the syscall function pointer tables.
In the Linux kernel, the following vulnerability has been resolved: ALSA: caiaq: Fix potentially leftover ep1_in_urb at error path The previous fix for handling the error from setup_card() missed that an internal URB cdev->ep1_in_urb might have been already submitted beforehand. In the normal case, this URB gets killed at the disconnection, but in the error path, we didn't do it, hence there can be a potential leak. Fix it in the error path for setup_card(), too.
In the Linux kernel, the following vulnerability has been resolved: of: unittest: fix use-after-free in testdrv_probe() The function testdrv_probe() retrieves the device_node from the PCI device, applies an overlay, and then immediately calls of_node_put(dn). This releases the reference held by the PCI core, potentially freeing the node if the reference count drops to zero. Later, the same freed pointer 'dn' is passed to of_platform_default_populate(), leading to a use-after-free. The reference to pdev->dev.of_node is owned by the device model and should not be released by the driver. Remove the erroneous of_node_put() to prevent premature freeing.
In the Linux kernel, the following vulnerability has been resolved: rxrpc: Fix re-decryption of RESPONSE packets If a RESPONSE packet gets a temporary failure during processing, it may end up in a partially decrypted state - and then get requeued for a retry. Fix this by just discarding the packet; we will send another CHALLENGE packet and thereby elicit a further response. Similarly, discard an incoming CHALLENGE packet if we get an error whilst generating a RESPONSE; the server will send another CHALLENGE.
In the Linux kernel, the following vulnerability has been resolved: KVM: nSVM: Sync interrupt shadow to cached vmcb12 after VMRUN of L2 After VMRUN in guest mode, nested_sync_control_from_vmcb02() syncs fields written by the CPU from vmcb02 to the cached vmcb12. This is because the cached vmcb12 is used as the authoritative copy of some of the controls, and is the payload when saving/restoring nested state. int_state is also written by the CPU, specifically bit 0 (i.e. SVM_INTERRUPT_SHADOW_MASK) for nested VMs, but it is not sync'd to cached vmcb12. This does not cause a problem if KVM_SET_NESTED_STATE preceeds KVM_SET_VCPU_EVENTS in the restore path, as an interrupt shadow would be correctly restored to vmcb02 (KVM_SET_VCPU_EVENTS overwrites what KVM_SET_NESTED_STATE restored in int_state). However, if KVM_SET_VCPU_EVENTS preceeds KVM_SET_NESTED_STATE, an interrupt shadow would be restored into vmcb01 instead of vmcb02. This would mostly be benign for L1 (delays an interrupt), but not for L2. For L2, the vCPU could hang (e.g. if a wakeup interrupt is delivered before a HLT that should have been in an interrupt shadow). Sync int_state to the cached vmcb12 in nested_sync_control_from_vmcb02() to avoid this problem. With that, KVM_SET_NESTED_STATE restores the correct interrupt shadow state, and if KVM_SET_VCPU_EVENTS follows it would overwrite it with the same value.
In the Linux kernel, the following vulnerability has been resolved: crypto: ccree - fix a memory leak in cc_mac_digest() Add cc_unmap_result() if cc_map_hash_request_final() fails to prevent potential memory leak.
In the Linux kernel, the following vulnerability has been resolved: ext4: don't set EXT4_GET_BLOCKS_CONVERT when splitting before submitting I/O When allocating blocks during within-EOF DIO and writeback with dioread_nolock enabled, EXT4_GET_BLOCKS_PRE_IO was set to split an existing large unwritten extent. However, EXT4_GET_BLOCKS_CONVERT was set when calling ext4_split_convert_extents(), which may potentially result in stale data issues. Assume we have an unwritten extent, and then DIO writes the second half. [UUUUUUUUUUUUUUUU] on-disk extent U: unwritten extent [UUUUUUUUUUUUUUUU] extent status tree |<- ->| ----> dio write this range First, ext4_iomap_alloc() call ext4_map_blocks() with EXT4_GET_BLOCKS_PRE_IO, EXT4_GET_BLOCKS_UNWRIT_EXT and EXT4_GET_BLOCKS_CREATE flags set. ext4_map_blocks() find this extent and call ext4_split_convert_extents() with EXT4_GET_BLOCKS_CONVERT and the above flags set. Then, ext4_split_convert_extents() calls ext4_split_extent() with EXT4_EXT_MAY_ZEROOUT, EXT4_EXT_MARK_UNWRIT2 and EXT4_EXT_DATA_VALID2 flags set, and it calls ext4_split_extent_at() to split the second half with EXT4_EXT_DATA_VALID2, EXT4_EXT_MARK_UNWRIT1, EXT4_EXT_MAY_ZEROOUT and EXT4_EXT_MARK_UNWRIT2 flags set. However, ext4_split_extent_at() failed to insert extent since a temporary lack -ENOSPC. It zeroes out the first half but convert the entire on-disk extent to written since the EXT4_EXT_DATA_VALID2 flag set, but left the second half as unwritten in the extent status tree. [0000000000SSSSSS] data S: stale data, 0: zeroed [WWWWWWWWWWWWWWWW] on-disk extent W: written extent [WWWWWWWWWWUUUUUU] extent status tree Finally, if the DIO failed to write data to the disk, the stale data in the second half will be exposed once the cached extent entry is gone. Fix this issue by not passing EXT4_GET_BLOCKS_CONVERT when splitting an unwritten extent before submitting I/O, and make ext4_split_convert_extents() to zero out the entire extent range to zero for this case, and also mark the extent in the extent status tree for consistency.
In the Linux kernel, the following vulnerability has been resolved: gfs2: Fix use-after-free in iomap inline data write path The inline data buffer head (dibh) is being released prematurely in gfs2_iomap_begin() via release_metapath() while iomap->inline_data still points to dibh->b_data. This causes a use-after-free when iomap_write_end_inline() later attempts to write to the inline data area. The bug sequence: 1. gfs2_iomap_begin() calls gfs2_meta_inode_buffer() to read inode metadata into dibh 2. Sets iomap->inline_data = dibh->b_data + sizeof(struct gfs2_dinode) 3. Calls release_metapath() which calls brelse(dibh), dropping refcount to 0 4. kswapd reclaims the page (~39ms later in the syzbot report) 5. iomap_write_end_inline() tries to memcpy() to iomap->inline_data 6. KASAN detects use-after-free write to freed memory Fix by storing dibh in iomap->private and incrementing its refcount with get_bh() in gfs2_iomap_begin(). The buffer is then properly released in gfs2_iomap_end() after the inline write completes, ensuring the page stays alive for the entire iomap operation. Note: A C reproducer is not available for this issue. The fix is based on analysis of the KASAN report and code review showing the buffer head is freed before use. [agruenba: Take buffer head reference in gfs2_iomap_begin() to avoid leaks in gfs2_iomap_get() and gfs2_iomap_alloc().]
In the Linux kernel, the following vulnerability has been resolved: nfsd: never defer requests during idmap lookup During v4 request compound arg decoding, some ops (e.g. SETATTR) can trigger idmap lookup upcalls. When those upcall responses get delayed beyond the allowed time limit, cache_check() will mark the request for deferral and cause it to be dropped. This prevents nfs4svc_encode_compoundres from being executed, and thus the session slot flag NFSD4_SLOT_INUSE never gets cleared. Subsequent client requests will fail with NFSERR_JUKEBOX, given that the slot will be marked as in-use, making the SEQUENCE op fail. Fix this by making sure that the RQ_USEDEFERRAL flag is always clear during nfs4svc_decode_compoundargs(), since no v4 request should ever be deferred.
In the Linux kernel, the following vulnerability has been resolved: drm/amdgpu: clean up the amdgpu_cs_parser_bos In low memory conditions, kmalloc can fail. In such conditions unlock the mutex for a clean exit. We do not need to amdgpu_bo_list_put as it's been handled in the amdgpu_cs_parser_fini.
In the Linux kernel, the following vulnerability has been resolved: fbnic: close fw_log race between users and teardown Fixes a theoretical race on fw_log between the teardown path and fw_log write functions. fw_log is written inside fbnic_fw_log_write() and can be reached from the mailbox handler fbnic_fw_msix_intr(), but fw_log is freed before IRQ/MBX teardown during cleanup, resulting in a potential data race of dereferencing a freed/null variable. Possible Interleaving Scenario: CPU0: fbnic_fw_msix_intr() // Entry fbnic_fw_log_write() if (fbnic_fw_log_ready()) // true ... preempt ... CPU1: fbnic_remove() // Entry fbnic_fw_log_free() vfree(log->data_start); log->data_start = NULL; CPU0: continues, walks log->entries or writes to log->data_start The initialization also has an incorrect order problem, as the fw_log is currently allocated after MBX setup during initialization. Fix the problems by adjusting the synchronization order to put initialization in place before the mailbox is enabled, and not cleared until after the mailbox has been disabled.
In the Linux kernel, the following vulnerability has been resolved: drm/amdgpu: Fix memory leak in amdgpu_ras_init() When amdgpu_nbio_ras_sw_init() fails in amdgpu_ras_init(), the function returns directly without freeing the allocated con structure, leading to a memory leak. Fix this by jumping to the release_con label to properly clean up the allocated memory before returning the error code. Compile tested only. Issue found using a prototype static analysis tool and code review.
In the Linux kernel, the following vulnerability has been resolved: ublk: use READ_ONCE() to read struct ublksrv_ctrl_cmd struct ublksrv_ctrl_cmd is part of the io_uring_sqe, which may lie in userspace-mapped memory. It's racy to access its fields with normal loads, as userspace may write to them concurrently. Use READ_ONCE() to copy the ublksrv_ctrl_cmd from the io_uring_sqe to the stack. Use the local copy in place of the one in the io_uring_sqe.
In the Linux kernel, the following vulnerability has been resolved: btrfs: fix invalid leaf access in btrfs_quota_enable() if ref key not found If btrfs_search_slot_for_read() returns 1, it means we did not find any key greater than or equals to the key we asked for, meaning we have reached the end of the tree and therefore the path is not valid. If this happens we need to break out of the loop and stop, instead of continuing and accessing an invalid path.
In the Linux kernel, the following vulnerability has been resolved: RDMA/mlx5: Fix UMR hang in LAG error state unload During firmware reset in LAG mode, a race condition causes the driver to hang indefinitely while waiting for UMR completion during device unload. See [1]. In LAG mode the bond device is only registered on the master, so it never sees sys_error events from the slave. During firmware reset this causes UMR waits to hang forever on unload as the slave is dead but the master hasn't entered error state yet, so UMR posts succeed but completions never arrive. Fix this by adding a sys_error notifier that gets registered before MLX5_IB_STAGE_IB_REG and stays alive until after ib_unregister_device(). This ensures error events reach the bond device throughout teardown. [1] Call Trace: __schedule+0x2bd/0x760 schedule+0x37/0xa0 schedule_preempt_disabled+0xa/0x10 __mutex_lock.isra.6+0x2b5/0x4a0 __mlx5_ib_dereg_mr+0x606/0x870 [mlx5_ib] ? __xa_erase+0x4a/0xa0 ? _cond_resched+0x15/0x30 ? wait_for_completion+0x31/0x100 ib_dereg_mr_user+0x48/0xc0 [ib_core] ? rdmacg_uncharge_hierarchy+0xa0/0x100 destroy_hw_idr_uobject+0x20/0x50 [ib_uverbs] uverbs_destroy_uobject+0x37/0x150 [ib_uverbs] __uverbs_cleanup_ufile+0xda/0x140 [ib_uverbs] uverbs_destroy_ufile_hw+0x3a/0xf0 [ib_uverbs] ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs] remove_client_context+0x8b/0xd0 [ib_core] disable_device+0x8c/0x130 [ib_core] __ib_unregister_device+0x10d/0x180 [ib_core] ib_unregister_device+0x21/0x30 [ib_core] __mlx5_ib_remove+0x1e4/0x1f0 [mlx5_ib] auxiliary_bus_remove+0x1e/0x30 device_release_driver_internal+0x103/0x1f0 bus_remove_device+0xf7/0x170 device_del+0x181/0x410 mlx5_rescan_drivers_locked.part.10+0xa9/0x1d0 [mlx5_core] mlx5_disable_lag+0x253/0x260 [mlx5_core] mlx5_lag_disable_change+0x89/0xc0 [mlx5_core] mlx5_eswitch_disable+0x67/0xa0 [mlx5_core] mlx5_unload+0x15/0xd0 [mlx5_core] mlx5_unload_one+0x71/0xc0 [mlx5_core] mlx5_sync_reset_reload_work+0x83/0x100 [mlx5_core] process_one_work+0x1a7/0x360 worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x116/0x130 ? kthread_flush_work_fn+0x10/0x10 ret_from_fork+0x22/0x40
In the Linux kernel, the following vulnerability has been resolved: smb: client: fix potential UAF and double free in smb2_open_file() Zero out @err_iov and @err_buftype before retrying SMB2_open() to prevent an UAF bug if @data != NULL, otherwise a double free.
In the Linux kernel, the following vulnerability has been resolved: bpf: Limit bpf program signature size Practical BPF signatures are significantly smaller than KMALLOC_MAX_CACHE_SIZE Allowing larger sizes opens the door for abuse by passing excessive size values and forcing the kernel into expensive allocation paths (via kmalloc_large or vmalloc).
In the Linux kernel, the following vulnerability has been resolved: bpf: Return proper address for non-zero offsets in insn array The map_direct_value_addr() function of the instruction array map incorrectly adds offset to the resulting address. This is a bug, because later the resolve_pseudo_ldimm64() function adds the offset. Fix it. Corresponding selftests are added in a consequent commit.
In the Linux kernel, the following vulnerability has been resolved: SUNRPC: fix gss_auth kref leak in gss_alloc_msg error path Commit 5940d1cf9f42 ("SUNRPC: Rebalance a kref in auth_gss.c") added a kref_get(&gss_auth->kref) call to balance the gss_put_auth() done in gss_release_msg(), but forgot to add a corresponding kref_put() on the error path when kstrdup_const() fails. If service_name is non-NULL and kstrdup_const() fails, the function jumps to err_put_pipe_version which calls put_pipe_version() and kfree(gss_msg), but never releases the gss_auth reference. This leads to a kref leak where the gss_auth structure is never freed. Add a forward declaration for gss_free_callback() and call kref_put() in the err_put_pipe_version error path to properly release the reference taken earlier.
In the Linux kernel, the following vulnerability has been resolved: ublk: Validate SQE128 flag before accessing the cmd ublk_ctrl_cmd_dump() accesses (header *)sqe->cmd before IO_URING_F_SQE128 flag check. This could cause out of boundary memory access. Move the SQE128 flag check earlier in ublk_ctrl_uring_cmd() to return -EINVAL immediately if the flag is not set.
In the Linux kernel, the following vulnerability has been resolved: hfsplus: return error when node already exists in hfs_bnode_create When hfs_bnode_create() finds that a node is already hashed (which should not happen in normal operation), it currently returns the existing node without incrementing its reference count. This causes a reference count inconsistency that leads to a kernel panic when the node is later freed in hfs_bnode_put(): kernel BUG at fs/hfsplus/bnode.c:676! BUG_ON(!atomic_read(&node->refcnt)) This scenario can occur when hfs_bmap_alloc() attempts to allocate a node that is already in use (e.g., when node 0's bitmap bit is incorrectly unset), or due to filesystem corruption. Returning an existing node from a create path is not normal operation. Fix this by returning ERR_PTR(-EEXIST) instead of the node when it's already hashed. This properly signals the error condition to callers, which already check for IS_ERR() return values.
In the Linux kernel, the following vulnerability has been resolved: drm/exynos: vidi: fix to avoid directly dereferencing user pointer In vidi_connection_ioctl(), vidi->edid(user pointer) is directly dereferenced in the kernel. This allows arbitrary kernel memory access from the user space, so instead of directly accessing the user pointer in the kernel, we should modify it to copy edid to kernel memory using copy_from_user() and use it.
In the Linux kernel, the following vulnerability has been resolved: md/md-llbitmap: fix percpu_ref not resurrected on suspend timeout When llbitmap_suspend_timeout() times out waiting for percpu_ref to become zero, it returns -ETIMEDOUT without resurrecting the percpu_ref. The caller (md_llbitmap_daemon_fn) then continues to the next page without calling llbitmap_resume(), leaving the percpu_ref in a killed state permanently. Fix this by resurrecting the percpu_ref before returning the error, ensuring the page control structure remains usable for subsequent operations.
In the Linux kernel, the following vulnerability has been resolved: fbdev: au1200fb: Fix a memory leak in au1200fb_drv_probe() In au1200fb_drv_probe(), when platform_get_irq fails(), it directly returns from the function with an error code, which causes a memory leak. Replace it with a goto label to ensure proper cleanup.
In the Linux kernel, the following vulnerability has been resolved: md/raid5: fix IO hang with degraded array with llbitmap When llbitmap bit state is still unwritten, any new write should force rcw, as bitmap_ops->blocks_synced() is checked in handle_stripe_dirtying(). However, later the same check is missing in need_this_block(), causing stripe to deadloop during handling because handle_stripe() will decide to go to handle_stripe_fill(), meanwhile need_this_block() always return 0 and nothing is handled.
In the Linux kernel, the following vulnerability has been resolved: eth: fbnic: Add validation for MTU changes Increasing the MTU beyond the HDS threshold causes the hardware to fragment packets across multiple buffers. If a single-buffer XDP program is attached, the driver will drop all multi-frag frames. While we can't prevent a remote sender from sending non-TCP packets larger than the MTU, this will prevent users from inadvertently breaking new TCP streams. Traditionally, drivers supported XDP with MTU less than 4Kb (packet per page). Fbnic currently prevents attaching XDP when MTU is too high. But it does not prevent increasing MTU after XDP is attached.
In the Linux kernel, the following vulnerability has been resolved: bpf: Fix a potential use-after-free of BTF object Refcounting in the check_pseudo_btf_id() function is incorrect: the __check_pseudo_btf_id() function might get called with a zero refcounted btf. Fix this, and patch related code accordingly. v3: rephrase a comment (AI) v2: fix a refcount leak introduced in v1 (AI)
In the Linux kernel, the following vulnerability has been resolved: crypto: starfive - Fix memory leak in starfive_aes_aead_do_one_req() The starfive_aes_aead_do_one_req() function allocates rctx->adata with kzalloc() but fails to free it if sg_copy_to_buffer() or starfive_aes_hw_init() fails, which lead to memory leaks. Since rctx->adata is unconditionally freed after the write_adata operations, ensure consistent cleanup by freeing the allocation in these earlier error paths as well. Compile tested only. Issue found using a prototype static analysis tool and code review.
In the Linux kernel, the following vulnerability has been resolved: hwrng: core - use RCU and work_struct to fix race condition Currently, hwrng_fill is not cleared until the hwrng_fillfn() thread exits. Since hwrng_unregister() reads hwrng_fill outside the rng_mutex lock, a concurrent hwrng_unregister() may call kthread_stop() again on the same task. Additionally, if hwrng_unregister() is called immediately after hwrng_register(), the stopped thread may have never been executed. Thus, hwrng_fill remains dirty even after hwrng_unregister() returns. In this case, subsequent calls to hwrng_register() will fail to start new threads, and hwrng_unregister() will call kthread_stop() on the same freed task. In both cases, a use-after-free occurs: refcount_t: addition on 0; use-after-free. WARNING: ... at lib/refcount.c:25 refcount_warn_saturate+0xec/0x1c0 Call Trace: kthread_stop+0x181/0x360 hwrng_unregister+0x288/0x380 virtrng_remove+0xe3/0x200 This patch fixes the race by protecting the global hwrng_fill pointer inside the rng_mutex lock, so that hwrng_fillfn() thread is stopped only once, and calls to kthread_run() and kthread_stop() are serialized with the lock held. To avoid deadlock in hwrng_fillfn() while being stopped with the lock held, we convert current_rng to RCU, so that get_current_rng() can read current_rng without holding the lock. To remove the lock from put_rng(), we also delay the actual cleanup into a work_struct. Since get_current_rng() no longer returns ERR_PTR values, the IS_ERR() checks are removed from its callers. With hwrng_fill protected by the rng_mutex lock, hwrng_fillfn() can no longer clear hwrng_fill itself. Therefore, if hwrng_fillfn() returns directly after current_rng is dropped, kthread_stop() would be called on a freed task_struct later. To fix this, hwrng_fillfn() calls schedule() now to keep the task alive until being stopped. The kthread_stop() call is also moved from hwrng_unregister() to drop_current_rng(), ensuring kthread_stop() is called on all possible paths where current_rng becomes NULL, so that the thread would not wait forever.
In the Linux kernel, the following vulnerability has been resolved: ext4: fix memory leak in ext4_ext_shift_extents() In ext4_ext_shift_extents(), if the extent is NULL in the while loop, the function returns immediately without releasing the path obtained via ext4_find_extent(), leading to a memory leak. Fix this by jumping to the out label to ensure the path is properly released.
In the Linux kernel, the following vulnerability has been resolved: drm/amdgpu: Fix memory leak in amdgpu_acpi_enumerate_xcc() In amdgpu_acpi_enumerate_xcc(), if amdgpu_acpi_dev_init() returns -ENOMEM, the function returns directly without releasing the allocated xcc_info, resulting in a memory leak. Fix this by ensuring that xcc_info is properly freed in the error paths. Compile tested only. Issue found using a prototype static analysis tool and code review.
In the Linux kernel, the following vulnerability has been resolved: iommu/vt-d: Fix race condition during PASID entry replacement The Intel VT-d PASID table entry is 512 bits (64 bytes). When replacing an active PASID entry (e.g., during domain replacement), the current implementation calculates a new entry on the stack and copies it to the table using a single structure assignment. struct pasid_entry *pte, new_pte; pte = intel_pasid_get_entry(dev, pasid); pasid_pte_config_first_level(iommu, &new_pte, ...); *pte = new_pte; Because the hardware may fetch the 512-bit PASID entry in multiple 128-bit chunks, updating the entire entry while it is active (Present bit set) risks a "torn" read. In this scenario, the IOMMU hardware could observe an inconsistent state - partially new data and partially old data - leading to unpredictable behavior or spurious faults. Fix this by removing the unsafe "replace" helpers and following the "clear-then-update" flow, which ensures the Present bit is cleared and the required invalidation handshake is completed before the new configuration is applied.
In the Linux kernel, the following vulnerability has been resolved: iommu/vt-d: Clear Present bit before tearing down context entry When tearing down a context entry, the current implementation zeros the entire 128-bit entry using multiple 64-bit writes. This creates a window where the hardware can fetch a "torn" entry - where some fields are already zeroed while the 'Present' bit is still set - leading to unpredictable behavior or spurious faults. While x86 provides strong write ordering, the compiler may reorder writes to the two 64-bit halves of the context entry. Even without compiler reordering, the hardware fetch is not guaranteed to be atomic with respect to multiple CPU writes. Align with the "Guidance to Software for Invalidations" in the VT-d spec (Section 6.5.3.3) by implementing the recommended ownership handshake: 1. Clear only the 'Present' (P) bit of the context entry first to signal the transition of ownership from hardware to software. 2. Use dma_wmb() to ensure the cleared bit is visible to the IOMMU. 3. Perform the required cache and context-cache invalidation to ensure hardware no longer has cached references to the entry. 4. Fully zero out the entry only after the invalidation is complete. Also, add a dma_wmb() to context_set_present() to ensure the entry is fully initialized before the 'Present' bit becomes visible.
In the Linux kernel, the following vulnerability has been resolved: ext4: fix e4b bitmap inconsistency reports A bitmap inconsistency issue was observed during stress tests under mixed huge-page workloads. Ext4 reported multiple e4b bitmap check failures like: ext4_mb_complex_scan_group:2508: group 350, 8179 free clusters as per group info. But got 8192 blocks Analysis and experimentation confirmed that the issue is caused by a race condition between page migration and bitmap modification. Although this timing window is extremely narrow, it is still hit in practice: folio_lock ext4_mb_load_buddy __migrate_folio check ref count folio_mc_copy __filemap_get_folio folio_try_get(folio) ...... mb_mark_used ext4_mb_unload_buddy __folio_migrate_mapping folio_ref_freeze folio_unlock The root cause of this issue is that the fast path of load_buddy only increments the folio's reference count, which is insufficient to prevent concurrent folio migration. We observed that the folio migration process acquires the folio lock. Therefore, we can determine whether to take the fast path in load_buddy by checking the lock status. If the folio is locked, we opt for the slow path (which acquires the lock) to close this concurrency window. Additionally, this change addresses the following issues: When the DOUBLE_CHECK macro is enabled to inspect bitmap-related issues, the following error may be triggered: corruption in group 324 at byte 784(6272): f in copy != ff on disk/prealloc Analysis reveals that this is a false positive. There is a specific race window where the bitmap and the group descriptor become momentarily inconsistent, leading to this error report: ext4_mb_load_buddy ext4_mb_load_buddy __filemap_get_folio(create|lock) folio_lock ext4_mb_init_cache folio_mark_uptodate __filemap_get_folio(no lock) ...... mb_mark_used mb_mark_used_double mb_cmp_bitmaps mb_set_bits(e4b->bd_bitmap) folio_unlock The original logic assumed that since mb_cmp_bitmaps is called when the bitmap is newly loaded from disk, the folio lock would be sufficient to prevent concurrent access. However, this overlooks a specific race condition: if another process attempts to load buddy and finds the folio is already in an uptodate state, it will immediately begin using it without holding folio lock.
In the Linux kernel, the following vulnerability has been resolved: tpm: tpm_i2c_infineon: Fix locality leak on get_burstcount() failure get_burstcount() can return -EBUSY on timeout. When this happens, the function returns directly without releasing the locality that was acquired at the beginning of tpm_tis_i2c_send(). Use goto out_err to ensure proper cleanup when get_burstcount() fails.
In the Linux kernel, the following vulnerability has been resolved: net: stmmac: fix oops when split header is enabled For GMAC4, when split header is enabled, in some rare cases, the hardware does not fill buf2 of the first descriptor with payload. Thus we cannot assume buf2 is always fully filled if it is not the last descriptor. Otherwise, the length of buf2 of the second descriptor will be calculated wrong and cause an oops: Unable to handle kernel paging request at virtual address ffff00019246bfc0 ... x2 : 0000000000000040 x1 : ffff00019246bfc0 x0 : ffff00009246c000 Call trace: dcache_inval_poc+0x28/0x58 (P) dma_direct_sync_single_for_cpu+0x38/0x6c __dma_sync_single_for_cpu+0x34/0x6c stmmac_napi_poll_rx+0x8f0/0xb60 __napi_poll.constprop.0+0x30/0x144 net_rx_action+0x160/0x274 handle_softirqs+0x1b8/0x1fc ... To fix this, the PL bit-field in RDES3 register is used for all descriptors, whether it is the last descriptor or not.
In the Linux kernel, the following vulnerability has been resolved: gpib: Fix memory leak in ni_usb_init() In ni_usb_init(), if ni_usb_setup_init() fails, the function returns -EFAULT without freeing the allocated writes buffer, leading to a memory leak. Additionally, ni_usb_setup_init() returns 0 on failure, which causes ni_usb_init() to return -EFAULT, an inappropriate error code for this situation. Fix the leak by freeing writes in the error path. Modify ni_usb_setup_init() to return -EINVAL on failure and propagate this error code in ni_usb_init().
In the Linux kernel, the following vulnerability has been resolved: crypto: inside-secure/eip93 - fix kernel panic in driver detach During driver detach, the same hash algorithm is unregistered multiple times due to a wrong iterator.
In the Linux kernel, the following vulnerability has been resolved: btrfs: fix EEXIST abort due to non-consecutive gaps in chunk allocation I have been observing a number of systems aborting at insert_dev_extents() in btrfs_create_pending_block_groups(). The following is a sample stack trace of such an abort coming from forced chunk allocation (typically behind CONFIG_BTRFS_EXPERIMENTAL) but this can theoretically happen to any DUP chunk allocation. [81.801] ------------[ cut here ]------------ [81.801] BTRFS: Transaction aborted (error -17) [81.801] WARNING: fs/btrfs/block-group.c:2876 at btrfs_create_pending_block_groups+0x721/0x770 [btrfs], CPU#1: bash/319 [81.802] Modules linked in: virtio_net btrfs xor zstd_compress raid6_pq null_blk [81.803] CPU: 1 UID: 0 PID: 319 Comm: bash Kdump: loaded Not tainted 6.19.0-rc6+ #319 NONE [81.803] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-2-2 04/01/2014 [81.804] RIP: 0010:btrfs_create_pending_block_groups+0x723/0x770 [btrfs] [81.806] RSP: 0018:ffffa36241a6bce8 EFLAGS: 00010282 [81.806] RAX: 000000000000000d RBX: ffff8e699921e400 RCX: 0000000000000000 [81.807] RDX: 0000000002040001 RSI: 00000000ffffffef RDI: ffffffffc0608bf0 [81.807] RBP: 00000000ffffffef R08: ffff8e69830f6000 R09: 0000000000000007 [81.808] R10: ffff8e699921e5e8 R11: 0000000000000000 R12: ffff8e6999228000 [81.808] R13: ffff8e6984d82000 R14: ffff8e69966a69c0 R15: ffff8e69aa47b000 [81.809] FS: 00007fec6bdd9740(0000) GS:ffff8e6b1b379000(0000) knlGS:0000000000000000 [81.809] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [81.810] CR2: 00005604833670f0 CR3: 0000000116679000 CR4: 00000000000006f0 [81.810] Call Trace: [81.810] <TASK> [81.810] __btrfs_end_transaction+0x3e/0x2b0 [btrfs] [81.811] btrfs_force_chunk_alloc_store+0xcd/0x140 [btrfs] [81.811] kernfs_fop_write_iter+0x15f/0x240 [81.812] vfs_write+0x264/0x500 [81.812] ksys_write+0x6c/0xe0 [81.812] do_syscall_64+0x66/0x770 [81.812] entry_SYSCALL_64_after_hwframe+0x76/0x7e [81.813] RIP: 0033:0x7fec6be66197 [81.814] RSP: 002b:00007fffb159dd30 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 [81.815] RAX: ffffffffffffffda RBX: 00007fec6bdd9740 RCX: 00007fec6be66197 [81.815] RDX: 0000000000000002 RSI: 0000560483374f80 RDI: 0000000000000001 [81.816] RBP: 0000560483374f80 R08: 0000000000000000 R09: 0000000000000000 [81.816] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002 [81.817] R13: 00007fec6bfb85c0 R14: 00007fec6bfb5ee0 R15: 00005604833729c0 [81.817] </TASK> [81.817] irq event stamp: 20039 [81.818] hardirqs last enabled at (20047): [<ffffffff99a68302>] __up_console_sem+0x52/0x60 [81.818] hardirqs last disabled at (20056): [<ffffffff99a682e7>] __up_console_sem+0x37/0x60 [81.819] softirqs last enabled at (19470): [<ffffffff999d2b46>] __irq_exit_rcu+0x96/0xc0 [81.819] softirqs last disabled at (19463): [<ffffffff999d2b46>] __irq_exit_rcu+0x96/0xc0 [81.820] ---[ end trace 0000000000000000 ]--- [81.820] BTRFS: error (device dm-7 state A) in btrfs_create_pending_block_groups:2876: errno=-17 Object already exists Inspecting these aborts with drgn, I observed a pattern of overlapping chunk_maps. Note how stripe 1 of the first chunk overlaps in physical address with stripe 0 of the second chunk. Physical Start Physical End Length Logical Type Stripe ---------------------------------------------------------------------------------------------------- 0x0000000102500000 0x0000000142500000 1.0G 0x0000000641d00000 META|DUP 0/2 0x0000000142500000 0x0000000182500000 1.0G 0x0000000641d00000 META|DUP 1/2 0x0000000142500000 0x0000000182500000 1.0G 0x0000000601d00000 META|DUP 0/2 0x0000000182500000 0x00000001c2500000 1.0G 0x0000000601d00000 META|DUP 1/2 Now how could this possibly happen? All chunk allocation is ---truncated---
In the Linux kernel, the following vulnerability has been resolved: bpf: Preserve id of register in sync_linked_regs() sync_linked_regs() copies the id of known_reg to reg when propagating bounds of known_reg to reg using the off of known_reg, but when known_reg was linked to reg like: known_reg = reg ; both known_reg and reg get same id known_reg += 4 ; known_reg gets off = 4, and its id gets BPF_ADD_CONST now when a call to sync_linked_regs() happens, let's say with the following: if known_reg >= 10 goto pc+2 known_reg's new bounds are propagated to reg but now reg gets BPF_ADD_CONST from the copy. This means if another link to reg is created like: another_reg = reg ; another_reg should get the id of reg but assign_scalar_id_before_mov() sees BPF_ADD_CONST on reg and assigns a new id to it. As reg has a new id now, known_reg's link to reg is broken. If we find new bounds for known_reg, they will not be propagated to reg. This can be seen in the selftest added in the next commit: 0: (85) call bpf_get_prandom_u32#7 ; R0=scalar() 1: (57) r0 &= 255 ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) 2: (bf) r1 = r0 ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R1=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) 3: (07) r1 += 4 ; R1=scalar(id=1+4,smin=umin=smin32=umin32=4,smax=umax=smax32=umax32=259,var_off=(0x0; 0x1ff)) 4: (a5) if r1 < 0xa goto pc+4 ; R1=scalar(id=1+4,smin=umin=smin32=umin32=10,smax=umax=smax32=umax32=259,var_off=(0x0; 0x1ff)) 5: (bf) r2 = r0 ; R0=scalar(id=2,smin=umin=smin32=umin32=6,smax=umax=smax32=umax32=255) R2=scalar(id=2,smin=umin=smin32=umin32=6,smax=umax=smax32=umax32=255) 6: (a5) if r1 < 0xe goto pc+2 ; R1=scalar(id=1+4,smin=umin=smin32=umin32=14,smax=umax=smax32=umax32=259,var_off=(0x0; 0x1ff)) 7: (35) if r0 >= 0xa goto pc+1 ; R0=scalar(id=2,smin=umin=smin32=umin32=6,smax=umax=smax32=umax32=9,var_off=(0x0; 0xf)) 8: (37) r0 /= 0 div by zero When 4 is verified, r1's bounds are propagated to r0 but r0 also gets BPF_ADD_CONST (bug). When 5 is verified, r0 gets a new id (2) and its link with r1 is broken. After 6 we know r1 has bounds [14, 259] and therefore r0 should have bounds [10, 255], therefore the branch at 7 is always taken. But because r0's id was changed to 2, r1's new bounds are not propagated to r0. The verifier still thinks r0 has bounds [6, 255] before 7 and execution can reach div by zero. Fix this by preserving id in sync_linked_regs() like off and subreg_def.
In the Linux kernel, the following vulnerability has been resolved: net: mctp: ensure our nlmsg responses are initialised Syed Faraz Abrar (@farazsth98) from Zellic, and Pumpkin (@u1f383) from DEVCORE Research Team working with Trend Micro Zero Day Initiative report that a RTM_GETNEIGH will return uninitalised data in the pad bytes of the ndmsg data. Ensure we're initialising the netlink data to zero, in the link, addr and neigh response messages.
In the Linux kernel, the following vulnerability has been resolved: ovpn: fix possible use-after-free in ovpn_net_xmit When building the skb_list in ovpn_net_xmit, skb_share_check will free the original skb if it is shared. The current implementation continues to use the stale skb pointer for subsequent operations: - peer lookup, - skb_dst_drop (even though all segments produced by skb_gso_segment will have a dst attached), - ovpn_peer_stats_increment_tx. Fix this by moving the peer lookup and skb_dst_drop before segmentation so that the original skb is still valid when used. Return early if all segments fail skb_share_check and the list ends up empty. Also switch ovpn_peer_stats_increment_tx to use skb_list.next; the next patch fixes the stats logic.
In the Linux kernel, the following vulnerability has been resolved: media: chips-media: wave5: Fix memory leak on codec_info allocation failure In wave5_vpu_open_enc() and wave5_vpu_open_dec(), a vpu instance is allocated via kzalloc(). If the subsequent allocation for inst->codec_info fails, the functions return -ENOMEM without freeing the previously allocated instance, causing a memory leak. Fix this by calling kfree() on the instance in this error path to ensure it is properly released.
In the Linux kernel, the following vulnerability has been resolved: bpf: Require frozen map for calculating map hash Currently, bpf_map_get_info_by_fd calculates and caches the hash of the map regardless of the map's frozen state. This leads to a TOCTOU bug where userspace can call BPF_OBJ_GET_INFO_BY_FD to cache the hash and then modify the map contents before freezing. Therefore, a trusted loader can be tricked into verifying the stale hash while loading the modified contents. Fix this by returning -EPERM if the map is not frozen when the hash is requested. This ensures the hash is only generated for the final, immutable state of the map.
In the Linux kernel, the following vulnerability has been resolved: rust: pwm: Fix potential memory leak on init error When initializing a PWM chip using pwmchip_alloc(), the allocated device owns an initial reference that must be released on all error paths. If __pinned_init() were to fail, the allocated pwm_chip would currently leak because the error path returns without calling pwmchip_put().
In the Linux kernel, the following vulnerability has been resolved: thermal/of: Fix reference leak in thermal_of_cm_lookup() In thermal_of_cm_lookup(), tr_np is obtained via of_parse_phandle(), but never released. Use the __free(device_node) cleanup attribute to automatically release the node and fix the leak. [ rjw: Changelog edits ]
{.+.+}-{0:0}, at: ksmbd_vfs_kern_path_locked+0x142/0x660 #1: ffff888130e966c0 (&type->i_mutex_dir_key#3/1){+.+.}-{4:4}, at: ksmbd_vfs_kern_path_locked+0x17d/0x660 CPU: 5 PID: 7596 Comm: kworker/5:21 Not tainted 6.1.162-00456-gc29b353f383b #138 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014 Workqueue: ksmbd-io handle_ksmbd_work Call Trace: <TASK> dump_stack_lvl+0x44/0x5b process_one_work.cold+0x57/0x5c worker_thread+0x82/0x600 kthread+0x153/0x190 ret_from_fork+0x22/0x30 </TASK> Found by Linux Verification Center (linuxtesting.org).
In the Linux kernel, the following vulnerability has been resolved: net: usb: catc: enable basic endpoint checking catc_probe() fills three URBs with hardcoded endpoint pipes without verifying the endpoint descriptors: - usb_sndbulkpipe(usbdev, 1) and usb_rcvbulkpipe(usbdev, 1) for TX/RX - usb_rcvintpipe(usbdev, 2) for interrupt status A malformed USB device can present these endpoints with transfer types that differ from what the driver assumes. Add a catc_usb_ep enum for endpoint numbers, replacing magic constants throughout. Add usb_check_bulk_endpoints() and usb_check_int_endpoints() calls after usb_set_interface() to verify endpoint types before use, rejecting devices with mismatched descriptors at probe time. Similar to - commit 90b7f2961798 ("net: usb: rtl8150: enable basic endpoint checking") which fixed the issue in rtl8150.
In the Linux kernel, the following vulnerability has been resolved: RDMA/mlx5: Fix memory leak in GET_DATA_DIRECT_SYSFS_PATH handler The UVERBS_HANDLER(MLX5_IB_METHOD_GET_DATA_DIRECT_SYSFS_PATH) function allocates memory for the device path using kobject_get_path(). If the length of the device path exceeds the output buffer length, the function returns -ENOSPC but does not free the allocated memory, resulting in a memory leak. Add a kfree() call to the error path to ensure the allocated memory is properly freed. Compile tested only. Issue found using a prototype static analysis tool and code review.
In the Linux kernel, the following vulnerability has been resolved: mtd: parsers: Fix memory leak in mtd_parser_tplink_safeloader_parse() The function mtd_parser_tplink_safeloader_parse() allocates buf via mtd_parser_tplink_safeloader_read_table(). If the allocation for parts[idx].name fails inside the loop, the code jumps to the err_free label without freeing buf, leading to a memory leak. Fix this by freeing the temporary buffer buf in the err_free label. Compile tested only. Issue found using a prototype static analysis tool and code review.
In the Linux kernel, the following vulnerability has been resolved: ext4: fix dirtyclusters double decrement on fs shutdown fstests test generic/388 occasionally reproduces a warning in ext4_put_super() associated with the dirty clusters count: WARNING: CPU: 7 PID: 76064 at fs/ext4/super.c:1324 ext4_put_super+0x48c/0x590 [ext4] Tracing the failure shows that the warning fires due to an s_dirtyclusters_counter value of -1. IOW, this appears to be a spurious decrement as opposed to some sort of leak. Further tracing of the dirty cluster count deltas and an LLM scan of the resulting output identified the cause as a double decrement in the error path between ext4_mb_mark_diskspace_used() and the caller ext4_mb_new_blocks(). First, note that generic/388 is a shutdown vs. fsstress test and so produces a random set of operations and shutdown injections. In the problematic case, the shutdown triggers an error return from the ext4_handle_dirty_metadata() call(s) made from ext4_mb_mark_context(). The changed value is non-zero at this point, so ext4_mb_mark_diskspace_used() does not exit after the error bubbles up from ext4_mb_mark_context(). Instead, the former decrements both cluster counters and returns the error up to ext4_mb_new_blocks(). The latter falls into the !ar->len out path which decrements the dirty clusters counter a second time, creating the inconsistency. To avoid this problem and simplify ownership of the cluster reservation in this codepath, lift the counter reduction to a single place in the caller. This makes it more clear that ext4_mb_new_blocks() is responsible for acquiring cluster reservation (via ext4_claim_free_clusters()) in the !delalloc case as well as releasing it, regardless of whether it ends up consumed or returned due to failure.
In the Linux kernel, the following vulnerability has been resolved: sched/rt: Skip currently executing CPU in rto_next_cpu() CPU0 becomes overloaded when hosting a CPU-bound RT task, a non-CPU-bound RT task, and a CFS task stuck in kernel space. When other CPUs switch from RT to non-RT tasks, RT load balancing (LB) is triggered; with HAVE_RT_PUSH_IPI enabled, they send IPIs to CPU0 to drive the execution of rto_push_irq_work_func. During push_rt_task on CPU0, if next_task->prio < rq->donor->prio, resched_curr() sets NEED_RESCHED and after the push operation completes, CPU0 calls rto_next_cpu(). Since only CPU0 is overloaded in this scenario, rto_next_cpu() should ideally return -1 (no further IPI needed). However, multiple CPUs invoking tell_cpu_to_push() during LB increments rd->rto_loop_next. Even when rd->rto_cpu is set to -1, the mismatch between rd->rto_loop and rd->rto_loop_next forces rto_next_cpu() to restart its search from -1. With CPU0 remaining overloaded (satisfying rt_nr_migratory && rt_nr_total > 1), it gets reselected, causing CPU0 to queue irq_work to itself and send self-IPIs repeatedly. As long as CPU0 stays overloaded and other CPUs run pull_rt_tasks(), it falls into an infinite self-IPI loop, which triggers a CPU hardlockup due to continuous self-interrupts. The trigging scenario is as follows: cpu0 cpu1 cpu2 pull_rt_task tell_cpu_to_push <------------irq_work_queue_on rto_push_irq_work_func push_rt_task resched_curr(rq) pull_rt_task rto_next_cpu tell_cpu_to_push <-------------------------- atomic_inc(rto_loop_next) rd->rto_loop != next rto_next_cpu irq_work_queue_on rto_push_irq_work_func Fix redundant self-IPI by filtering the initiating CPU in rto_next_cpu(). This solution has been verified to effectively eliminate spurious self-IPIs and prevent CPU hardlockup scenarios.
In the Linux kernel, the following vulnerability has been resolved: ipvs: do not keep dest_dst if dev is going down There is race between the netdev notifier ip_vs_dst_event() and the code that caches dst with dev that is going down. As the FIB can be notified for the closed device after our handler finishes, it is possible valid route to be returned and cached resuling in a leaked dev reference until the dest is not removed. To prevent new dest_dst to be attached to dest just after the handler dropped the old one, add a netif_running() check to make sure the notifier handler is not currently running for device that is closing.
In the Linux kernel, the following vulnerability has been resolved: fat: avoid parent link count underflow in rmdir Corrupted FAT images can leave a directory inode with an incorrect i_nlink (e.g. 2 even though subdirectories exist). rmdir then unconditionally calls drop_nlink(dir) and can drive i_nlink to 0, triggering the WARN_ON in drop_nlink(). Add a sanity check in vfat_rmdir() and msdos_rmdir(): only drop the parent link count when it is at least 3, otherwise report a filesystem error.
In the Linux kernel, the following vulnerability has been resolved: net: bridge: mcast: always update mdb_n_entries for vlan contexts syzbot triggered a warning[1] about the number of mdb entries in a context. It turned out that there are multiple ways to trigger that warning today (some got added during the years), the root cause of the problem is that the increase is done conditionally, and over the years these different conditions increased so there were new ways to trigger the warning, that is to do a decrease which wasn't paired with a previous increase. For example one way to trigger it is with flush: $ ip l add br0 up type bridge vlan_filtering 1 mcast_snooping 1 $ ip l add dumdum up master br0 type dummy $ bridge mdb add dev br0 port dumdum grp 239.0.0.1 permanent vid 1 $ ip link set dev br0 down $ ip link set dev br0 type bridge mcast_vlan_snooping 1 ^^^^ this will enable snooping, but will not update mdb_n_entries because in __br_multicast_enable_port_ctx() we check !netif_running $ bridge mdb flush dev br0 ^^^ this will trigger the warning because it will delete the pg which we added above, which will try to decrease mdb_n_entries Fix the problem by removing the conditional increase and always keep the count up-to-date while the vlan exists. In order to do that we have to first initialize it on port-vlan context creation, and then always increase or decrease the value regardless of mcast options. To keep the current behaviour we have to enforce the mdb limit only if the context is port's or if the port-vlan's mcast snooping is enabled. [1] ------------[ cut here ]------------ n == 0 WARNING: net/bridge/br_multicast.c:718 at br_multicast_port_ngroups_dec_one net/bridge/br_multicast.c:718 [inline], CPU#0: syz.4.4607/22043 WARNING: net/bridge/br_multicast.c:718 at br_multicast_port_ngroups_dec net/bridge/br_multicast.c:771 [inline], CPU#0: syz.4.4607/22043 WARNING: net/bridge/br_multicast.c:718 at br_multicast_del_pg+0x1bbe/0x1e20 net/bridge/br_multicast.c:825, CPU#0: syz.4.4607/22043 Modules linked in: CPU: 0 UID: 0 PID: 22043 Comm: syz.4.4607 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/24/2026 RIP: 0010:br_multicast_port_ngroups_dec_one net/bridge/br_multicast.c:718 [inline] RIP: 0010:br_multicast_port_ngroups_dec net/bridge/br_multicast.c:771 [inline] RIP: 0010:br_multicast_del_pg+0x1bbe/0x1e20 net/bridge/br_multicast.c:825 Code: 41 5f 5d e9 04 7a 48 f7 e8 3f 73 5c f7 90 0f 0b 90 e9 cf fd ff ff e8 31 73 5c f7 90 0f 0b 90 e9 16 fd ff ff e8 23 73 5c f7 90 <0f> 0b 90 e9 60 fd ff ff e8 15 73 5c f7 eb 05 e8 0e 73 5c f7 48 8b RSP: 0018:ffffc9000c207220 EFLAGS: 00010293 RAX: ffffffff8a68042d RBX: ffff88807c6f1800 RCX: ffff888066e90000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffff888066e90000 R09: 000000000000000c R10: 000000000000000c R11: 0000000000000000 R12: ffff8880303ef800 R13: dffffc0000000000 R14: ffff888050eb11c4 R15: 1ffff1100a1d6238 FS: 00007fa45921b6c0(0000) GS:ffff8881256f5000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa4591f9ff8 CR3: 0000000081df2000 CR4: 00000000003526f0 Call Trace: <TASK> br_mdb_flush_pgs net/bridge/br_mdb.c:1525 [inline] br_mdb_flush net/bridge/br_mdb.c:1544 [inline] br_mdb_del_bulk+0x5e2/0xb20 net/bridge/br_mdb.c:1561 rtnl_mdb_del+0x48a/0x640 net/core/rtnetlink.c:-1 rtnetlink_rcv_msg+0x77e/0xbe0 net/core/rtnetlink.c:6967 netlink_rcv_skb+0x232/0x4b0 net/netlink/af_netlink.c:2550 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline] netlink_unicast+0x80f/0x9b0 net/netlink/af_netlink.c:1344 netlink_sendmsg+0x813/0xb40 net/netlink/af_netlink.c:1894 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg net/socket.c:742 [inline] ____sys_sendmsg+0xa68/0xad0 net/socket.c:2592 ___sys_sendmsg+0x2a5/0x360 net/socke ---truncated---
In the Linux kernel, the following vulnerability has been resolved: ext4: don't cache extent during splitting extent Caching extents during the splitting process is risky, as it may result in stale extents remaining in the status tree. Moreover, in most cases, the corresponding extent block entries are likely already cached before the split happens, making caching here not particularly useful. Assume we have an unwritten extent, and then DIO writes the first half. [UUUUUUUUUUUUUUUU] on-disk extent U: unwritten extent [UUUUUUUUUUUUUUUU] extent status tree |<- ->| ----> dio write this range First, when ext4_split_extent_at() splits this extent, it truncates the existing extent and then inserts a new one. During this process, this extent status entry may be shrunk, and calls to ext4_find_extent() and ext4_cache_extents() may occur, which could potentially insert the truncated range as a hole into the extent status tree. After the split is completed, this hole is not replaced with the correct status. [UUUUUUU|UUUUUUUU] on-disk extent U: unwritten extent [UUUUUUU|HHHHHHHH] extent status tree H: hole Then, the outer calling functions will not correct this remaining hole extent either. Finally, if we perform a delayed buffer write on this latter part, it will re-insert the delayed extent and cause an error in space accounting. In adition, if the unwritten extent cache is not shrunk during the splitting, ext4_cache_extents() also conflicts with existing extents when caching extents. In the future, we will add checks when caching extents, which will trigger a warning. Therefore, Do not cache extents that are being split.
{ spin_lock_irqsave rxe_destroy_qp() __rxe_cleanup() __rxe_put() // qp->ref_count decrease to 0 rxe_qp_do_cleanup() { if (qp->valid) { rxe_sched_task() { WARN_ON(rxe_read(task->qp) <= 0); } } spin_unlock_irqrestore } spin_lock_irqsave qp->valid = 0 spin_unlock_irqrestore } Ensure the QP's reference count is maintained and its validity is checked within the timer callbacks by adding calls to rxe_get(qp) and corresponding rxe_put(qp) after use.
In the Linux kernel, the following vulnerability has been resolved: clk: mediatek: Drop __initconst from gates Since commit 8ceff24a754a ("clk: mediatek: clk-gate: Refactor mtk_clk_register_gate to use mtk_gate struct") the mtk_gate structs are no longer just used for initialization/registration, but also at runtime. So drop __initconst annotations.
In the Linux kernel, the following vulnerability has been resolved: accel/amdxdna: Fix memory leak in amdxdna_ubuf_map The amdxdna_ubuf_map() function allocates memory for sg and internal sg table structures, but it fails to free them if subsequent operations (sg_alloc_table_from_pages or dma_map_sgtable) fail.
In the Linux kernel, the following vulnerability has been resolved: net/mlx5e: Fix deadlocks between devlink and netdev instance locks In the mentioned "Fixes" commit, various work tasks triggering devlink health reporter recovery were switched to use netdev_trylock to protect against concurrent tear down of the channels being recovered. But this had the side effect of introducing potential deadlocks because of incorrect lock ordering. The correct lock order is described by the init flow: probe_one -> mlx5_init_one (acquires devlink lock) -> mlx5_init_one_devl_locked -> mlx5_register_device -> mlx5_rescan_drivers_locked -...-> mlx5e_probe -> _mlx5e_probe -> register_netdev (acquires rtnl lock) -> register_netdevice (acquires netdev lock) => devlink lock -> rtnl lock -> netdev lock. But in the current recovery flow, the order is wrong: mlx5e_tx_err_cqe_work (acquires netdev lock) -> mlx5e_reporter_tx_err_cqe -> mlx5e_health_report -> devlink_health_report (acquires devlink lock => boom!) -> devlink_health_reporter_recover -> mlx5e_tx_reporter_recover -> mlx5e_tx_reporter_recover_from_ctx -> mlx5e_tx_reporter_err_cqe_recover The same pattern exists in: mlx5e_reporter_rx_timeout mlx5e_reporter_tx_ptpsq_unhealthy mlx5e_reporter_tx_timeout Fix these by moving the netdev_trylock calls from the work handlers lower in the call stack, in the respective recovery functions, where they are actually necessary.
In the Linux kernel, the following vulnerability has been resolved: xfrm: fix ip_rt_bug race in icmp_route_lookup reverse path icmp_route_lookup() performs multiple route lookups to find a suitable route for sending ICMP error messages, with special handling for XFRM (IPsec) policies. The lookup sequence is: 1. First, lookup output route for ICMP reply (dst = original src) 2. Pass through xfrm_lookup() for policy check 3. If blocked (-EPERM) or dst is not local, enter "reverse path" 4. In reverse path, call xfrm_decode_session_reverse() to get fl4_dec which reverses the original packet's flow (saddr<->daddr swapped) 5. If fl4_dec.saddr is local (we are the original destination), use __ip_route_output_key() for output route lookup 6. If fl4_dec.saddr is NOT local (we are a forwarding node), use ip_route_input() to simulate the reverse packet's input path 7. Finally, pass rt2 through xfrm_lookup() with XFRM_LOOKUP_ICMP flag The bug occurs in step 6: ip_route_input() is called with fl4_dec.daddr (original packet's source) as destination. If this address becomes local between the initial check and ip_route_input() call (e.g., due to concurrent "ip addr add"), ip_route_input() returns a LOCAL route with dst.output set to ip_rt_bug. This route is then used for ICMP output, causing dst_output() to call ip_rt_bug(), triggering a WARN_ON: ------------[ cut here ]------------ WARNING: net/ipv4/route.c:1275 at ip_rt_bug+0x21/0x30, CPU#1 Call Trace: <TASK> ip_push_pending_frames+0x202/0x240 icmp_push_reply+0x30d/0x430 __icmp_send+0x1149/0x24f0 ip_options_compile+0xa2/0xd0 ip_rcv_finish_core+0x829/0x1950 ip_rcv+0x2d7/0x420 __netif_receive_skb_one_core+0x185/0x1f0 netif_receive_skb+0x90/0x450 tun_get_user+0x3413/0x3fb0 tun_chr_write_iter+0xe4/0x220 ... Fix this by checking rt2->rt_type after ip_route_input(). If it's RTN_LOCAL, the route cannot be used for output, so treat it as an error. The reproducer requires kernel modification to widen the race window, making it unsuitable as a selftest. It is available at: https://gist.github.com/mrpre/eae853b72ac6a750f5d45d64ddac1e81
{+.+.}-{3:3}, at: pci_lock_rescan_remove+0x28/0x40 [ 84.964342] [ T928] but task is already holding lock: [ 84.964347] [ T928] c000000003b29d58 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pci_lock_rescan_remove+0x28/0x40 [ 84.964357] [ T928] other info that might help us debug this: [ 84.964363] [ T928] Possible unsafe locking scenario: [ 84.964367] [ T928] CPU0 [ 84.964370] [ T928] ---- [ 84.964373] [ T928] lock(pci_rescan_remove_lock); [ 84.964378] [ T928] lock(pci_rescan_remove_lock); [ 84.964383] [ T928] *** DEADLOCK *** [ 84.964388] [ T928] May be due to missing lock nesting notation [ 84.964393] [ T928] 1 lock held by eehd/928: [ 84.964397] [ T928] #0: c000000003b29d58 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pci_lock_rescan_remove+0x28/0x40 [ 84.964408] [ T928] stack backtrace: [ 84.964414] [ T928] CPU: 2 UID: 0 PID: 928 Comm: eehd Not tainted 6.18.0-rc3 #51 VOLUNTARY [ 84.964417] [ T928] Hardware name: IBM,9080-HEX POWER10 (architected) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_022) hv:phyp pSeries [ 84.964419] [ T928] Call Trace: [ 84.964420] [ T928] [c0000011a7157990] [c000000001705de4] dump_stack_lvl+0xc8/0x130 (unreliable) [ 84.964424] [ T928] [c0000011a71579d0] [c0000000002f66e0] print_deadlock_bug+0x430/0x440 [ 84.964428] [ T928] [c0000011a7157a70] [c0000000002fd0c0] __lock_acquire+0x1530/0x2d80 [ 84.964431] [ T928] [c0000011a7157ba0] [c0000000002fea54] lock_acquire+0x144/0x410 [ 84.964433] [ T928] [c0000011a7157cb0] [c0000011a7157cb0] __mutex_lock+0xf4/0x1050 [ 84.964436] [ T928] [c0000011a7157e00] [c000000000de21d8] pci_lock_rescan_remove+0x28/0x40 [ 84.964439] [ T928] [c0000011a7157e20] [c00000000004ed98] eeh_pe_bus_get+0x48/0xc0 [ 84.964442] [ T928] [c0000011a7157e50] [c00000 ---truncated---
In the Linux kernel, the following vulnerability has been resolved: bpf: Fix memory access flags in helper prototypes After commit 37cce22dbd51 ("bpf: verifier: Refactor helper access type tracking"), the verifier started relying on the access type flags in helper function prototypes to perform memory access optimizations. Currently, several helper functions utilizing ARG_PTR_TO_MEM lack the corresponding MEM_RDONLY or MEM_WRITE flags. This omission causes the verifier to incorrectly assume that the buffer contents are unchanged across the helper call. Consequently, the verifier may optimize away subsequent reads based on this wrong assumption, leading to correctness issues. For bpf_get_stack_proto_raw_tp, the original MEM_RDONLY was incorrect since the helper writes to the buffer. Change it to ARG_PTR_TO_UNINIT_MEM which correctly indicates write access to potentially uninitialized memory. Similar issues were recently addressed for specific helpers in commit ac44dcc788b9 ("bpf: Fix verifier assumptions of bpf_d_path's output buffer") and commit 2eb7648558a7 ("bpf: Specify access type of bpf_sysctl_get_name args"). Fix these prototypes by adding the correct memory access flags.
In the Linux kernel, the following vulnerability has been resolved: netfilter: nf_tables: revert commit_mutex usage in reset path It causes circular lock dependency between commit_mutex, nfnl_subsys_ipset and nlk_cb_mutex when nft reset, ipset list, and iptables-nft with '-m set' rule run at the same time. Previous patches made it safe to run individual reset handlers concurrently so commit_mutex is no longer required to prevent this.
In the Linux kernel, the following vulnerability has been resolved: crypto: caam - fix netdev memory leak in dpaa2_caam_probe When commit 0e1a4d427f58 ("crypto: caam: Unembed net_dev structure in dpaa2") converted embedded net_device to dynamically allocated pointers, it added cleanup in dpaa2_dpseci_disable() but missed adding cleanup in dpaa2_dpseci_free() for error paths. This causes memory leaks when dpaa2_dpseci_dpio_setup() fails during probe due to DPIO devices not being ready yet. The kernel's deferred probe mechanism handles the retry successfully, but the netdevs allocated during the failed probe attempt are never freed, resulting in kmemleak reports showing multiple leaked netdev-related allocations all traced back to dpaa2_caam_probe(). Fix this by preserving the CPU mask of allocated netdevs during setup and using it for cleanup in dpaa2_dpseci_free(). This approach ensures that only the CPUs that actually had netdevs allocated will be cleaned up, avoiding potential issues with CPU hotplug scenarios.
In the Linux kernel, the following vulnerability has been resolved: ext4: drop extent cache when splitting extent fails When the split extent fails, we might leave some extents still being processed and return an error directly, which will result in stale extent entries remaining in the extent status tree. So drop all of the remaining potentially stale extents if the splitting fails.
In the Linux kernel, the following vulnerability has been resolved: RDMA/iwcm: Fix workqueue list corruption by removing work_list The commit e1168f0 ("RDMA/iwcm: Simplify cm_event_handler()") changed the work submission logic to unconditionally call queue_work() with the expectation that queue_work() would have no effect if work was already pending. The problem is that a free list of struct iwcm_work is used (for which struct work_struct is embedded), so each call to queue_work() is basically unique and therefore does indeed queue the work. This causes a problem in the work handler which walks the work_list until it's empty to process entries. This means that a single run of the work handler could process item N+1 and release it back to the free list while the actual workqueue entry is still queued. It could then get reused (INIT_WORK...) and lead to list corruption in the workqueue logic. Fix this by just removing the work_list. The workqueue already does this for us. This fixes the following error that was observed when stress testing with ucmatose on an Intel E830 in iWARP mode: [ 151.465780] list_del corruption. next->prev should be ffff9f0915c69c08, but was ffff9f0a1116be08. (next=ffff9f0a15b11c08) [ 151.466639] ------------[ cut here ]------------ [ 151.466986] kernel BUG at lib/list_debug.c:67! [ 151.467349] Oops: invalid opcode: 0000 [#1] SMP NOPTI [ 151.467753] CPU: 14 UID: 0 PID: 2306 Comm: kworker/u64:18 Not tainted 6.19.0-rc4+ #1 PREEMPT(voluntary) [ 151.468466] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 151.469192] Workqueue: 0x0 (iw_cm_wq) [ 151.469478] RIP: 0010:__list_del_entry_valid_or_report+0xf0/0x100 [ 151.469942] Code: c7 58 5f 4c b2 e8 10 50 aa ff 0f 0b 48 89 ef e8 36 57 cb ff 48 8b 55 08 48 89 e9 48 89 de 48 c7 c7 a8 5f 4c b2 e8 f0 4f aa ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 90 90 90 90 90 [ 151.471323] RSP: 0000:ffffb15644e7bd68 EFLAGS: 00010046 [ 151.471712] RAX: 000000000000006d RBX: ffff9f0915c69c08 RCX: 0000000000000027 [ 151.472243] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9f0a37d9c600 [ 151.472768] RBP: ffff9f0a15b11c08 R08: 0000000000000000 R09: c0000000ffff7fff [ 151.473294] R10: 0000000000000001 R11: ffffb15644e7bba8 R12: ffff9f092339ee68 [ 151.473817] R13: ffff9f0900059c28 R14: ffff9f092339ee78 R15: 0000000000000000 [ 151.474344] FS: 0000000000000000(0000) GS:ffff9f0a847b5000(0000) knlGS:0000000000000000 [ 151.474934] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 151.475362] CR2: 0000559e233a9088 CR3: 000000020296b004 CR4: 0000000000770ef0 [ 151.475895] PKRU: 55555554 [ 151.476118] Call Trace: [ 151.476331] <TASK> [ 151.476497] move_linked_works+0x49/0xa0 [ 151.476792] __pwq_activate_work.isra.46+0x2f/0xa0 [ 151.477151] pwq_dec_nr_in_flight+0x1e0/0x2f0 [ 151.477479] process_scheduled_works+0x1c8/0x410 [ 151.477823] worker_thread+0x125/0x260 [ 151.478108] ? __pfx_worker_thread+0x10/0x10 [ 151.478430] kthread+0xfe/0x240 [ 151.478671] ? __pfx_kthread+0x10/0x10 [ 151.478955] ? __pfx_kthread+0x10/0x10 [ 151.479240] ret_from_fork+0x208/0x270 [ 151.479523] ? __pfx_kthread+0x10/0x10 [ 151.479806] ret_from_fork_asm+0x1a/0x30 [ 151.480103] </TASK>
In the Linux kernel, the following vulnerability has been resolved: netfilter: nft_counter: serialize reset with spinlock Add a global static spinlock to serialize counter fetch+reset operations, preventing concurrent dump-and-reset from underrunning values. The lock is taken before fetching the total so that two parallel resets cannot both read the same counter values and then both subtract them. A global lock is used for simplicity since resets are infrequent. If this becomes a bottleneck, it can be replaced with a per-net lock later.
In the Linux kernel, the following vulnerability has been resolved: quota: fix livelock between quotactl and freeze_super When a filesystem is frozen, quotactl_block() enters a retry loop waiting for the filesystem to thaw. It acquires s_umount, checks the freeze state, drops s_umount and uses sb_start_write() - sb_end_write() pair to wait for the unfreeze. However, this retry loop can trigger a livelock issue, specifically on kernels with preemption disabled. The mechanism is as follows: 1. freeze_super() sets SB_FREEZE_WRITE and calls sb_wait_write(). 2. sb_wait_write() calls percpu_down_write(), which initiates synchronize_rcu(). 3. Simultaneously, quotactl_block() spins in its retry loop, immediately executing the sb_start_write() - sb_end_write() pair. 4. Because the kernel is non-preemptible and the loop contains no scheduling points, quotactl_block() never yields the CPU. This prevents that CPU from reaching an RCU quiescent state. 5. synchronize_rcu() in the freezer thread waits indefinitely for the quotactl_block() CPU to report a quiescent state. 6. quotactl_block() spins indefinitely waiting for the freezer to advance, which it cannot do as it is blocked on the RCU sync. This results in a hang of the freezer process and 100% CPU usage by the quota process. While this can occur intermittently on multi-core systems, it is reliably reproducing on a node with the following script, running both the freezer and the quota toggle on the same CPU: xfs_freeze -u a_mount; done" & quotaoff a_mount; done" & Adding cond_resched() to the retry loop fixes the issue. It acts as an RCU quiescent state, allowing synchronize_rcu() in percpu_down_write() to complete.
In the Linux kernel, the following vulnerability has been resolved: iommu/vt-d: Clear Present bit before tearing down PASID entry The Intel VT-d Scalable Mode PASID table entry consists of 512 bits (64 bytes). When tearing down an entry, the current implementation zeros the entire 64-byte structure immediately using multiple 64-bit writes. Since the IOMMU hardware may fetch these 64 bytes using multiple internal transactions (e.g., four 128-bit bursts), updating or zeroing the entire entry while it is active (P=1) risks a "torn" read. If a hardware fetch occurs simultaneously with the CPU zeroing the entry, the hardware could observe an inconsistent state, leading to unpredictable behavior or spurious faults. Follow the "Guidance to Software for Invalidations" in the VT-d spec (Section 6.5.3.3) by implementing the recommended ownership handshake: 1. Clear only the 'Present' (P) bit of the PASID entry. 2. Use a dma_wmb() to ensure the cleared bit is visible to hardware before proceeding. 3. Execute the required invalidation sequence (PASID cache, IOTLB, and Device-TLB flush) to ensure the hardware has released all cached references. 4. Only after the flushes are complete, zero out the remaining fields of the PASID entry. Also, add a dma_wmb() in pasid_set_present() to ensure that all other fields of the PASID entry are visible to the hardware before the Present bit is set.
In the Linux kernel, the following vulnerability has been resolved: apparmor: Fix & Optimize table creation from possibly unaligned memory Source blob may come from userspace and might be unaligned. Try to optize the copying process by avoiding unaligned memory accesses. - Added Fixes tag - Added "Fix &" to description as this doesn't just optimize but fixes a potential unaligned memory access [jj: remove duplicate word "convert" in comment trigger checkpatch warning]
In the Linux kernel, the following vulnerability has been resolved: ext4: drop extent cache after doing PARTIAL_VALID1 zeroout When splitting an unwritten extent in the middle and converting it to initialized in ext4_split_extent() with the EXT4_EXT_MAY_ZEROOUT and EXT4_EXT_DATA_VALID2 flags set, it could leave a stale unwritten extent. Assume we have an unwritten file and buffered write in the middle of it without dioread_nolock enabled, it will allocate blocks as written extent. 0 A B N [UUUUUUUUUUUU] on-disk extent U: unwritten extent [UUUUUUUUUUUU] extent status tree [--DDDDDDDD--] D: valid data |<- ->| ----> this range needs to be initialized ext4_split_extent() first try to split this extent at B with EXT4_EXT_DATA_PARTIAL_VALID1 and EXT4_EXT_MAY_ZEROOUT flag set, but ext4_split_extent_at() failed to split this extent due to temporary lack of space. It zeroout B to N and leave the entire extent as unwritten. 0 A B N [UUUUUUUUUUUU] on-disk extent [UUUUUUUUUUUU] extent status tree [--DDDDDDDDZZ] Z: zeroed data ext4_split_extent() then try to split this extent at A with EXT4_EXT_DATA_VALID2 flag set. This time, it split successfully and leave an written extent from A to N. 0 A B N [UUWWWWWWWWWW] on-disk extent W: written extent [UUUUUUUUUUUU] extent status tree [--DDDDDDDDZZ] Finally ext4_map_create_blocks() only insert extent A to B to the extent status tree, and leave an stale unwritten extent in the status tree. 0 A B N [UUWWWWWWWWWW] on-disk extent W: written extent [UUWWWWWWWWUU] extent status tree [--DDDDDDDDZZ] Fix this issue by always cached extent status entry after zeroing out the second part.
In the Linux kernel, the following vulnerability has been resolved: net: hns3: fix double free issue for tx spare buffer In hns3_set_ringparam(), a temporary copy (tmp_rings) of the ring structure is created for rollback. However, the tx_spare pointer in the original ring handle is incorrectly left pointing to the old backup memory. Later, if memory allocation fails in hns3_init_all_ring() during the setup, the error path attempts to free all newly allocated rings. Since tx_spare contains a stale (non-NULL) pointer from the backup, it is mistaken for a newly allocated buffer and is erroneously freed, leading to a double-free of the backup memory. The root cause is that the tx_spare field was not cleared after its value was saved in tmp_rings, leaving a dangling pointer. Fix this by setting tx_spare to NULL in the original ring structure when the creation of the new `tx_spare` fails. This ensures the error cleanup path only frees genuinely newly allocated buffers.
In the Linux kernel, the following vulnerability has been resolved: xen-netback: reject zero-queue configuration from guest A malicious or buggy Xen guest can write "0" to the xenbus key "multi-queue-num-queues". The connect() function in the backend only validates the upper bound (requested_num_queues > xenvif_max_queues) but not zero, allowing requested_num_queues=0 to reach vzalloc(array_size(0, sizeof(struct xenvif_queue))), which triggers WARN_ON_ONCE(!size) in __vmalloc_node_range(). On systems with panic_on_warn=1, this allows a guest-to-host denial of service. The Xen network interface specification requires the queue count to be "greater than zero". Add a zero check to match the validation already present in xen-blkback, which has included this guard since its multi-queue support was added.
In the Linux kernel, the following vulnerability has been resolved: mptcp: do not account for OoO in mptcp_rcvbuf_grow() MPTCP-level OoOs are physiological when multiple subflows are active concurrently and will not cause retransmissions nor are caused by drops. Accounting for them in mptcp_rcvbuf_grow() causes the rcvbuf slowly drifting towards tcp_rmem[2]. Remove such accounting. Note that subflows will still account for TCP-level OoO when the MPTCP-level rcvbuf is propagated. This also closes a subtle and very unlikely race condition with rcvspace init; active sockets with user-space holding the msk-level socket lock, could complete such initialization in the receive callback, after that the first OoO data reaches the rcvbuf and potentially triggering a divide by zero Oops.
In the Linux kernel, the following vulnerability has been resolved: md/raid1: fix memory leak in raid1_run() raid1_run() calls setup_conf() which registers a thread via md_register_thread(). If raid1_set_limits() fails, the previously registered thread is not unregistered, resulting in a memory leak of the md_thread structure and the thread resource itself. Add md_unregister_thread() to the error path to properly cleanup the thread, which aligns with the error handling logic of other paths in this function. Compile tested only. Issue found using a prototype static analysis tool and code review.
In the Linux kernel, the following vulnerability has been resolved: af_unix: Fix memleak of newsk in unix_stream_connect(). When prepare_peercred() fails in unix_stream_connect(), unix_release_sock() is not called for newsk, and the memory is leaked. Let's move prepare_peercred() before unix_create1().
In the Linux kernel, the following vulnerability has been resolved: bpf: Fix bpf_xdp_store_bytes proto for read-only arg While making some maps in Cilium read-only from the BPF side, we noticed that the bpf_xdp_store_bytes proto is incorrect. In particular, the verifier was throwing the following error: ; ret = ctx_store_bytes(ctx, l3_off + offsetof(struct iphdr, saddr), &nat->address, 4, 0); 635: (79) r1 = *(u64 *)(r10 -144) ; R1=ctx() R10=fp0 fp-144=ctx() 636: (b4) w2 = 26 ; R2=26 637: (b4) w4 = 4 ; R4=4 638: (b4) w5 = 0 ; R5=0 639: (85) call bpf_xdp_store_bytes#190 write into map forbidden, value_size=6 off=0 size=4 nat comes from a BPF_F_RDONLY_PROG map, so R3 is a PTR_TO_MAP_VALUE. The verifier checks the helper's memory access to R3 in check_mem_size_reg, as it reaches ARG_CONST_SIZE argument. The third argument has expected type ARG_PTR_TO_UNINIT_MEM, which includes the MEM_WRITE flag. The verifier thus checks for a BPF_WRITE access on R3. Given R3 points to a read-only map, the check fails. Conversely, ARG_PTR_TO_UNINIT_MEM can also lead to the helper reading from uninitialized memory. This patch simply fixes the expected argument type to match that of bpf_skb_store_bytes.
In the Linux kernel, the following vulnerability has been resolved: apparmor: avoid per-cpu hold underflow in aa_get_buffer When aa_get_buffer() pulls from the per-cpu list it unconditionally decrements cache->hold. If hold reaches 0 while count is still non-zero, the unsigned decrement wraps to UINT_MAX. This keeps hold non-zero for a very long time, so aa_put_buffer() never returns buffers to the global list, which can starve other CPUs and force repeated kmalloc(aa_g_path_max) allocations. Guard the decrement so hold never underflows.
In the Linux kernel, the following vulnerability has been resolved: iio: sca3000: Fix a resource leak in sca3000_probe() spi->irq from request_threaded_irq() not released when iio_device_register() fails. Add an return value check and jump to a common error handler when iio_device_register() fails.
In the Linux kernel, the following vulnerability has been resolved: soc: mediatek: svs: Fix memory leak in svs_enable_debug_write() In svs_enable_debug_write(), the buf allocated by memdup_user_nul() is leaked if kstrtoint() fails. Fix this by using __free(kfree) to automatically free buf, eliminating the need for explicit kfree() calls and preventing leaks. [Angelo: Added missing cleanup.h inclusion]
In the Linux kernel, the following vulnerability has been resolved: PCI/P2PDMA: Release per-CPU pgmap ref when vm_insert_page() fails When vm_insert_page() fails in p2pmem_alloc_mmap(), p2pmem_alloc_mmap() doesn't invoke percpu_ref_put() to free the per-CPU ref of pgmap acquired after gen_pool_alloc_owner(), and memunmap_pages() will hang forever when trying to remove the PCI device. Fix it by adding the missed percpu_ref_put().
In the Linux kernel, the following vulnerability has been resolved: HID: intel-ish-hid: fix NULL-ptr-deref in ishtp_bus_remove_all_clients During a warm reset flow, the cl->device pointer may be NULL if the reset occurs while clients are still being enumerated. Accessing cl->device->reference_count without a NULL check leads to a kernel panic. This issue was identified during multi-unit warm reboot stress clycles. Add a defensive NULL check for cl->device to ensure stability under such intensive testing conditions. KASAN: null-ptr-deref in range [0000000000000000-0000000000000007] Workqueue: ish_fw_update_wq fw_reset_work_fn Call Trace: ishtp_bus_remove_all_clients+0xbe/0x130 [intel_ishtp] ishtp_reset_handler+0x85/0x1a0 [intel_ishtp] fw_reset_work_fn+0x8a/0xc0 [intel_ish_ipc]
In the Linux kernel, the following vulnerability has been resolved: arm64/gcs: Fix error handling in arch_set_shadow_stack_status() alloc_gcs() returns an error-encoded pointer on failure, which comes from do_mmap(), not NULL. The current NULL check fails to detect errors, which could lead to using an invalid GCS address. Use IS_ERR_VALUE() to properly detect errors, consistent with the check in gcs_alloc_thread_stack().
In the Linux kernel, the following vulnerability has been resolved: mfd: arizona: Fix regulator resource leak on wm5102_clear_write_sequencer() failure The wm5102_clear_write_sequencer() helper may return an error and just return, bypassing the cleanup sequence and causing regulators to remain enabled, leading to a resource leak. Change the direct return to jump to the err_reset label to properly free the resources.
In the Linux kernel, the following vulnerability has been resolved: netfilter: nft_set_rbtree: check for partial overlaps in anonymous sets Userspace provides an optimized representation in case intervals are adjacent, where the end element is omitted. The existing partial overlap detection logic skips anonymous set checks on start elements for this reason. However, it is possible to add intervals that overlap to this anonymous where two start elements with the same, eg. A-B, A-C where C < B. start end A B start end A C Restore the check on overlapping start elements to report an overlap.
In the Linux kernel, the following vulnerability has been resolved: scsi: smartpqi: Fix memory leak in pqi_report_phys_luns() pqi_report_phys_luns() fails to release the rpl_list buffer when encountering an unsupported data format or when the allocation for rpl_16byte_wwid_list fails. These early returns bypass the cleanup logic, leading to memory leaks. Consolidate the error handling by adding an out_free_rpl_list label and use goto statements to ensure rpl_list is consistently freed on failure. Compile tested only. Issue found using a prototype static analysis tool and code review.