diff mbox series

[iwl-net,v4] ice: fix iteration of TLVs in Preserved Fields Area

Message ID 20240524-iwl-net-2024-05-16-fix-nvm-tlv-issue-v4-1-4fc1da2a450d@intel.com
State Accepted
Delegated to: Anthony Nguyen
Headers show
Series [iwl-net,v4] ice: fix iteration of TLVs in Preserved Fields Area | expand

Commit Message

Jacob Keller May 24, 2024, 11:06 p.m. UTC
The ice_get_pfa_module_tlv() function iterates over the Type-Length-Value
structures in the Preserved Fields Area (PFA) of the NVM. This is used by
the driver to access data such as the Part Board Assembly identifier.

The function uses simple logic to iterate over the PFA. First, the pointer
to the PFA in the NVM is read. Then the total length of the PFA is read
from the first word.

A pointer to the first TLV is initialized, and a simple loop iterates over
each TLV. The pointer is moved forward through the NVM until it exceeds the
PFA area.

The logic seems sound, but it is missing a key detail. The Preserved
Fields Area length includes one additional final word. This is documented
in the device data sheet as a dummy word which contains 0xFFFF. All NVMs
have this extra word.

If the driver tries to scan for a TLV that is not in the PFA, it will read
past the size of the PFA. It reads and interprets the last dummy word of
the PFA as a TLV with type 0xFFFF. It then reads the word following the PFA
as a length.

The PFA resides within the Shadow RAM portion of the NVM, which is
relatively small. All of its offsets are within a 16-bit size. The PFA
pointer and TLV pointer are stored by the driver as 16-bit values.

In almost all cases, the word following the PFA will be such that
interpreting it as a length will result in 16-bit arithmetic overflow. Once
overflowed, the new next_tlv value is now below the maximum offset of the
PFA. Thus, the driver will continue to iterate the data as TLVs. In the
worst case, the driver hits on a sequence of reads which loop back to
reading the same offsets in an endless loop.

To fix this, we need to correct the loop iteration check to account for
this extra word at the end of the PFA. This alone is sufficient to resolve
the known cases of this issue in the field. However, it is plausible that
an NVM could be misconfigured or have corrupt data which results in the
same kind of overflow. Protect against this by using check_add_overflow
when calculating both the maximum offset of the TLVs, and when calculating
the next_tlv offset at the end of each loop iteration. This ensures that
the driver will not get stuck in an infinite loop when scanning the PFA.

Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get")
Co-developed-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
Changes in v4:
- Update title and description for true root cause
- Correct driver logic to account for PFA length being 1 larger than TLVs
- Link to v3: https://lore.kernel.org/r/20240517-iwl-net-2024-05-16-fix-nvm-tlv-issue-v3-1-f46c53cfb67f@intel.com

Changes in v3:
- Fix missing {
- Fix missing pfa_ptr variable to dev_warn()
- Add Fixes tag
- Link to v2: https://lore.kernel.org/r/20240517-iwl-net-2024-05-16-fix-nvm-tlv-issue-v2-1-fdee184ece86@intel.com

Changes in v2:
- Use check_add_overflow instead of increasing the variables to u32
- Upgrade log messages to dev_warn()
- Link to v1: https://lore.kernel.org/r/20240516-iwl-net-2024-05-16-fix-nvm-tlv-issue-v1-1-ecbb6a75961e@intel.com
---
 drivers/net/ethernet/intel/ice/ice_nvm.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)


---
base-commit: 83e93942796db58652288f0391ac00072401816f
change-id: 20240516-iwl-net-2024-05-16-fix-nvm-tlv-issue-99ebb2c55c52

Best regards,

Comments

Przemek Kitszel May 27, 2024, 9:06 a.m. UTC | #1
On 5/25/24 01:06, Jacob Keller wrote:
> The ice_get_pfa_module_tlv() function iterates over the Type-Length-Value
> structures in the Preserved Fields Area (PFA) of the NVM. This is used by
> the driver to access data such as the Part Board Assembly identifier.
> 
> The function uses simple logic to iterate over the PFA. First, the pointer
> to the PFA in the NVM is read. Then the total length of the PFA is read
> from the first word.
> 
> A pointer to the first TLV is initialized, and a simple loop iterates over
> each TLV. The pointer is moved forward through the NVM until it exceeds the
> PFA area.
> 
> The logic seems sound, but it is missing a key detail. The Preserved
> Fields Area length includes one additional final word. This is documented
> in the device data sheet as a dummy word which contains 0xFFFF. All NVMs
> have this extra word.
> 
> If the driver tries to scan for a TLV that is not in the PFA, it will read
> past the size of the PFA. It reads and interprets the last dummy word of
> the PFA as a TLV with type 0xFFFF. It then reads the word following the PFA
> as a length.
> 
> The PFA resides within the Shadow RAM portion of the NVM, which is
> relatively small. All of its offsets are within a 16-bit size. The PFA
> pointer and TLV pointer are stored by the driver as 16-bit values.
> 
> In almost all cases, the word following the PFA will be such that
> interpreting it as a length will result in 16-bit arithmetic overflow. Once
> overflowed, the new next_tlv value is now below the maximum offset of the
> PFA. Thus, the driver will continue to iterate the data as TLVs. In the
> worst case, the driver hits on a sequence of reads which loop back to
> reading the same offsets in an endless loop.
> 
> To fix this, we need to correct the loop iteration check to account for
> this extra word at the end of the PFA. This alone is sufficient to resolve
> the known cases of this issue in the field. However, it is plausible that
> an NVM could be misconfigured or have corrupt data which results in the
> same kind of overflow. Protect against this by using check_add_overflow
> when calculating both the maximum offset of the TLVs, and when calculating
> the next_tlv offset at the end of each loop iteration. This ensures that
> the driver will not get stuck in an infinite loop when scanning the PFA.
> 
> Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get")
> Co-developed-by: Paul Greenwalt <paul.greenwalt@intel.com>
> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
> Changes in v4:
> - Update title and description for true root cause

Thank you for the fix and thoughtful description of the issue,
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>

> - Correct driver logic to account for PFA length being 1 larger than TLVs
> - Link to v3: https://lore.kernel.org/r/20240517-iwl-net-2024-05-16-fix-nvm-tlv-issue-v3-1-f46c53cfb67f@intel.com
> 
> Changes in v3:
> - Fix missing {
> - Fix missing pfa_ptr variable to dev_warn()
> - Add Fixes tag
> - Link to v2: https://lore.kernel.org/r/20240517-iwl-net-2024-05-16-fix-nvm-tlv-issue-v2-1-fdee184ece86@intel.com
> 
> Changes in v2:
> - Use check_add_overflow instead of increasing the variables to u32
> - Upgrade log messages to dev_warn()
> - Link to v1: https://lore.kernel.org/r/20240516-iwl-net-2024-05-16-fix-nvm-tlv-issue-v1-1-ecbb6a75961e@intel.com
> ---
>   drivers/net/ethernet/intel/ice/ice_nvm.c | 28 +++++++++++++++++++++-------
>   1 file changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.c b/drivers/net/ethernet/intel/ice/ice_nvm.c
> index 84eab92dc03c..ea7cbdf8492d 100644
> --- a/drivers/net/ethernet/intel/ice/ice_nvm.c
> +++ b/drivers/net/ethernet/intel/ice/ice_nvm.c
> @@ -440,8 +440,7 @@ int
>   ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len,
>   		       u16 module_type)
>   {
> -	u16 pfa_len, pfa_ptr;
> -	u16 next_tlv;
> +	u16 pfa_len, pfa_ptr, next_tlv, max_tlv;
>   	int status;
>   
>   	status = ice_read_sr_word(hw, ICE_SR_PFA_PTR, &pfa_ptr);
> @@ -454,11 +453,23 @@ ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len,
>   		ice_debug(hw, ICE_DBG_INIT, "Failed to read PFA length.\n");
>   		return status;
>   	}
> +
> +	/* The Preserved Fields Area contains a sequence of Type-Length-Value
> +	 * structures which define its contents. The PFA length includes all
> +	 * of the TLVs, plus the initial length word itself, *and* one final
> +	 * word at the end after all of the TLVs.
> +	 */
> +	if (check_add_overflow(pfa_ptr, pfa_len - 1, &max_tlv)) {
> +		dev_warn(ice_hw_to_dev(hw), "PFA starts at offset %u. PFA length of %u caused 16-bit arithmetic overflow.\n",
> +			 pfa_ptr, pfa_len);
> +		return -EINVAL;
> +	}
> +
>   	/* Starting with first TLV after PFA length, iterate through the list
>   	 * of TLVs to find the requested one.
>   	 */
>   	next_tlv = pfa_ptr + 1;
> -	while (next_tlv < pfa_ptr + pfa_len) {
> +	while (next_tlv < max_tlv) {
>   		u16 tlv_sub_module_type;
>   		u16 tlv_len;
>   
> @@ -482,10 +493,13 @@ ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len,
>   			}
>   			return -EINVAL;
>   		}
> -		/* Check next TLV, i.e. current TLV pointer + length + 2 words
> -		 * (for current TLV's type and length)
> -		 */
> -		next_tlv = next_tlv + tlv_len + 2;
> +
> +		if (check_add_overflow(next_tlv, 2, &next_tlv) ||
> +		    check_add_overflow(next_tlv, tlv_len, &next_tlv)) {
> +			dev_warn(ice_hw_to_dev(hw), "TLV of type %u and length 0x%04x caused 16-bit arithmetic overflow. The PFA starts at 0x%04x and has length of 0x%04x\n",
> +				 tlv_sub_module_type, tlv_len, pfa_ptr, pfa_len);
> +			return -EINVAL;
> +		}
>   	}
>   	/* Module does not exist */
>   	return -ENOENT;
> 
> ---
> base-commit: 83e93942796db58652288f0391ac00072401816f
> change-id: 20240516-iwl-net-2024-05-16-fix-nvm-tlv-issue-99ebb2c55c52
> 
> Best regards,
Pucha, HimasekharX Reddy May 30, 2024, 4:39 a.m. UTC | #2
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Jacob Keller
> Sent: Saturday, May 25, 2024 4:37 AM
> To: Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>; Intel Wired LAN <intel-wired-lan@lists.osuosl.org>; Paul Menzel <pmenzel@molgen.mpg.de>
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; Greenwalt, Paul <paul.greenwalt@intel.com>
> Subject: [Intel-wired-lan] [PATCH iwl-net v4] ice: fix iteration of TLVs in Preserved Fields Area
>
> The ice_get_pfa_module_tlv() function iterates over the Type-Length-Value structures in the Preserved Fields Area (PFA) of the NVM. This is used by the driver to access data such as the Part Board Assembly identifier.
>
> The function uses simple logic to iterate over the PFA. First, the pointer to the PFA in the NVM is read. Then the total length of the PFA is read from the first word.
>
> A pointer to the first TLV is initialized, and a simple loop iterates over each TLV. The pointer is moved forward through the NVM until it exceeds the PFA area.
>
> The logic seems sound, but it is missing a key detail. The Preserved Fields Area length includes one additional final word. This is documented in the device data sheet as a dummy word which contains 0xFFFF. All NVMs have this extra word.
>
> If the driver tries to scan for a TLV that is not in the PFA, it will read past the size of the PFA. It reads and interprets the last dummy word of the PFA as a TLV with type 0xFFFF. It then reads the word following the PFA as a length.
>
> The PFA resides within the Shadow RAM portion of the NVM, which is relatively small. All of its offsets are within a 16-bit size. The PFA pointer and TLV pointer are stored by the driver as 16-bit values.
>
> In almost all cases, the word following the PFA will be such that interpreting it as a length will result in 16-bit arithmetic overflow. Once overflowed, the new next_tlv value is now below the maximum offset of the PFA. Thus, the driver will continue to iterate the data as TLVs. > In the worst case, the driver hits on a sequence of reads which loop back to reading the same offsets in an endless loop.
>
> To fix this, we need to correct the loop iteration check to account for this extra word at the end of the PFA. This alone is sufficient to resolve the known cases of this issue in the field. However, it is plausible that an NVM could be misconfigured or have corrupt data which results in the same kind of overflow. Protect against this by using check_add_overflow when calculating both the maximum offset of the TLVs, and when calculating the next_tlv offset at the end of each loop iteration. This ensures that the driver will not get stuck in an infinite > loop when scanning the PFA.
> 
> Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get")
> Co-developed-by: Paul Greenwalt <paul.greenwalt@intel.com>
> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
> Changes in v4:
> - Update title and description for true root cause
> - Correct driver logic to account for PFA length being 1 larger than TLVs
> - Link to v3: https://lore.kernel.org/r/20240517-iwl-net-2024-05-16-fix-nvm-tlv-issue-v3-1-f46c53cfb67f@intel.com
>
> Changes in v3:
> - Fix missing {
> - Fix missing pfa_ptr variable to dev_warn()
> - Add Fixes tag
> - Link to v2: https://lore.kernel.org/r/20240517-iwl-net-2024-05-16-fix-nvm-tlv-issue-v2-1-fdee184ece86@intel.com
>
> Changes in v2:
> - Use check_add_overflow instead of increasing the variables to u32
> - Upgrade log messages to dev_warn()
> - Link to v1: https://lore.kernel.org/r/20240516-iwl-net-2024-05-16-fix-nvm-tlv-issue-v1-1-ecbb6a75961e@intel.com
> ---
>  drivers/net/ethernet/intel/ice/ice_nvm.c | 28 +++++++++++++++++++++-------
>  1 file changed, 21 insertions(+), 7 deletions(-)
>

Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.c b/drivers/net/ethernet/intel/ice/ice_nvm.c
index 84eab92dc03c..ea7cbdf8492d 100644
--- a/drivers/net/ethernet/intel/ice/ice_nvm.c
+++ b/drivers/net/ethernet/intel/ice/ice_nvm.c
@@ -440,8 +440,7 @@  int
 ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len,
 		       u16 module_type)
 {
-	u16 pfa_len, pfa_ptr;
-	u16 next_tlv;
+	u16 pfa_len, pfa_ptr, next_tlv, max_tlv;
 	int status;
 
 	status = ice_read_sr_word(hw, ICE_SR_PFA_PTR, &pfa_ptr);
@@ -454,11 +453,23 @@  ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len,
 		ice_debug(hw, ICE_DBG_INIT, "Failed to read PFA length.\n");
 		return status;
 	}
+
+	/* The Preserved Fields Area contains a sequence of Type-Length-Value
+	 * structures which define its contents. The PFA length includes all
+	 * of the TLVs, plus the initial length word itself, *and* one final
+	 * word at the end after all of the TLVs.
+	 */
+	if (check_add_overflow(pfa_ptr, pfa_len - 1, &max_tlv)) {
+		dev_warn(ice_hw_to_dev(hw), "PFA starts at offset %u. PFA length of %u caused 16-bit arithmetic overflow.\n",
+			 pfa_ptr, pfa_len);
+		return -EINVAL;
+	}
+
 	/* Starting with first TLV after PFA length, iterate through the list
 	 * of TLVs to find the requested one.
 	 */
 	next_tlv = pfa_ptr + 1;
-	while (next_tlv < pfa_ptr + pfa_len) {
+	while (next_tlv < max_tlv) {
 		u16 tlv_sub_module_type;
 		u16 tlv_len;
 
@@ -482,10 +493,13 @@  ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len,
 			}
 			return -EINVAL;
 		}
-		/* Check next TLV, i.e. current TLV pointer + length + 2 words
-		 * (for current TLV's type and length)
-		 */
-		next_tlv = next_tlv + tlv_len + 2;
+
+		if (check_add_overflow(next_tlv, 2, &next_tlv) ||
+		    check_add_overflow(next_tlv, tlv_len, &next_tlv)) {
+			dev_warn(ice_hw_to_dev(hw), "TLV of type %u and length 0x%04x caused 16-bit arithmetic overflow. The PFA starts at 0x%04x and has length of 0x%04x\n",
+				 tlv_sub_module_type, tlv_len, pfa_ptr, pfa_len);
+			return -EINVAL;
+		}
 	}
 	/* Module does not exist */
 	return -ENOENT;