[RFC] ip tunnel flag byte order issue

Submitted by stephen hemminger on Oct. 10, 2012, 7:06 p.m.

Details

Message ID 20121010120630.5e9f2c2c@nehalam.linuxnetplumber.net
State RFC
Delegated to: David Miller
Headers show

Commit Message

stephen hemminger Oct. 10, 2012, 7:06 p.m.
Sparse found a real problem with the ABI for tunnelling.

The SIT and VTI tunnel ioctl's both overload the i_flags field in the
ip_tunnel parameters structure. This field is defined as big endian
(be16) and the various GRE_XXX macros do the necessary byte swapping.

The problem is that both SIT and VTI are using an additional flag bit
that is defined in host byte order, and is then or'd in. It happens to
work because both possible locations hit holes in the current usage of
GRE.  For big endian cpu's it overlaps the GRE_VERSION which is always
zero, and for little endian it overlaps the GRE recursion field also
always zero.

Having the field in different places on different CPU architectures
was a mistake. The problem is fixing it will break the ABI on one or
the other architecture.  I choose to break big endian since it the
minority.

Also both VTI and SIT are overloading the same bit which is an
accident waiting to happen.  Since VTI is newer, I propose giving a
different bit to VTI.

The other alternative is keeping the same ABI, but putting a big note
as to why it works in spite of our stupidity.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Ben Hutchings Oct. 10, 2012, 8:26 p.m.
On Wed, 2012-10-10 at 12:06 -0700, Stephen Hemminger wrote:
> Sparse found a real problem with the ABI for tunnelling.
> 
> The SIT and VTI tunnel ioctl's both overload the i_flags field in the
> ip_tunnel parameters structure. This field is defined as big endian
> (be16) and the various GRE_XXX macros do the necessary byte swapping.
> 
> The problem is that both SIT and VTI are using an additional flag bit
> that is defined in host byte order, and is then or'd in. It happens to
> work because both possible locations hit holes in the current usage of
> GRE.  For big endian cpu's it overlaps the GRE_VERSION which is always
> zero, and for little endian it overlaps the GRE recursion field also
> always zero.

Why do these fields exist if they're always going to be 0?

> Having the field in different places on different CPU architectures
> was a mistake. The problem is fixing it will break the ABI on one or
> the other architecture.  I choose to break big endian since it the
> minority.

Or we can define the 'flag' to have both bits set (0x0101, with a
__cpu_to_be16 to keep sparse happy) while accepting either set on input.

> Also both VTI and SIT are overloading the same bit which is an
> accident waiting to happen.  Since VTI is newer, I propose giving a
> different bit to VTI.

Indeed VTI is new in 3.6, so there is still a short window in which it's
fairly safe to tweak its ABI.

> The other alternative is keeping the same ABI, but putting a big note
> as to why it works in spite of our stupidity.
[...]

Does it even matter that different tunnel types have different meanings
for flags?

Ben.
stephen hemminger Oct. 10, 2012, 8:34 p.m.
On Wed, 10 Oct 2012 21:26:36 +0100
Ben Hutchings <bhutchings@solarflare.com> wrote:

> On Wed, 2012-10-10 at 12:06 -0700, Stephen Hemminger wrote:
> > Sparse found a real problem with the ABI for tunnelling.
> > 
> > The SIT and VTI tunnel ioctl's both overload the i_flags field in the
> > ip_tunnel parameters structure. This field is defined as big endian
> > (be16) and the various GRE_XXX macros do the necessary byte swapping.
> > 
> > The problem is that both SIT and VTI are using an additional flag bit
> > that is defined in host byte order, and is then or'd in. It happens to
> > work because both possible locations hit holes in the current usage of
> > GRE.  For big endian cpu's it overlaps the GRE_VERSION which is always
> > zero, and for little endian it overlaps the GRE recursion field also
> > always zero.
> 
> Why do these fields exist if they're always going to be 0?

They exist in the RFC. GRE implementation mixes bits on the wire
with bits from ioctl().

> 
> > Having the field in different places on different CPU architectures
> > was a mistake. The problem is fixing it will break the ABI on one or
> > the other architecture.  I choose to break big endian since it the
> > minority.
> 
> Or we can define the 'flag' to have both bits set (0x0101, with a
> __cpu_to_be16 to keep sparse happy) while accepting either set on input.
> 
> > Also both VTI and SIT are overloading the same bit which is an
> > accident waiting to happen.  Since VTI is newer, I propose giving a
> > different bit to VTI.
> 
> Indeed VTI is new in 3.6, so there is still a short window in which it's
> fairly safe to tweak its ABI.
> 
> > The other alternative is keeping the same ABI, but putting a big note
> > as to why it works in spite of our stupidity.
> [...]
> 
> Does it even matter that different tunnel types have different meanings
> for flags?
> 
> Ben.
> 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch hide | download patch | download mbox

--- a/include/linux/if_tunnel.h	2012-10-10 11:37:22.444050762 -0700
+++ b/include/linux/if_tunnel.h	2012-10-10 11:50:34.692168074 -0700
@@ -42,7 +42,7 @@  struct ip_tunnel_parm {
 };
 
 /* SIT-mode i_flags */
-#define	SIT_ISATAP	0x0001
+#define	SIT_ISATAP	__cpu_to_be16(0x0100)
 
 struct ip_tunnel_prl {
 	__be32			addr;
@@ -84,7 +84,7 @@  enum {
 #define IFLA_GRE_MAX	(__IFLA_GRE_MAX - 1)
 
 /* VTI-mode i_flags */
-#define VTI_ISVTI 0x0001
+#define VTI_ISVTI __cpu_to_be16(0x0200)
 
 enum {
 	IFLA_VTI_UNSPEC,