[v12,2/3] NETFILTER module xt_hmark, new target for HASH based fwmark

Message ID 201205071020.44449.hans.schillstrom@ericsson.com
State Superseded

Commit Message

Hans Schillstrom May 7, 2012, 8:20 a.m. UTC
On Monday 07 May 2012 00:57:38 Pablo Neira Ayuso wrote:
> Hi Hans,
> 
> [...]
> > > > > Regarding ICMP traffic, I think we can use the ID field for the
> > > > > hashing as well. Thus, we handle ICMP like other protocols.
> > > >
> > > > Yes why not, I can give it a try.
> > > >
> >
> > I think we wait with this one..
> 
> I see. This is easy to add for the conntrack side, but it will require
> some extra code for the packet-based solution.

Actually, I think there is very little gain in spreading on ICMP type,
and we would then have to add a user-mode option to turn it off,
i.e. a --hmark-icmp-type-mask

> Not directly related to this but, I know that your intention is to
> make this as flexible as possible. However, I still don't find how I
> would use the port mask feature in any of my setups.  Basically, I
> don't come up with any useful example for this situation.

We have plenty of rules where just the source port mask is zero
and the dest-port-mask is 0xfffc (or 0xffff).


> I'm also telling this because I think that ICMP support will be
> easier to add if port masking is removed.
> 
> [...]
> > This is what I have done.
> >
> > - I reduced the code size a little bit by combining the hmark_ct_set_htuple_ipvX into one func.
> >   by adding a hmark_addr6_mask() and hmark_addr_any_mask()
> >   Note that using "otuple->src.l3num" as param 1 in both src and dst is not a typo.
> >   (it's not set in the rtuple)
> 
> Good one, this made the code even smaller.
> 
> > - Made the if (dst < src) swap() in the hmark_hash() since it should be used by every caller.
> 
> Not really, you don't need for the conntrack part. The original tuple
> is always the same, not matter where the packet is coming from. I have
> removed this again so it only affects packet-based hashing.

Yes, the original tuple is always the same, but it is not always less than the rtuple.
If you have two nodes that should produce the same hmark,
one with conntrack and one without, you must do the compare to make it consistent.
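
A minimal sketch of why the compare matters, assuming a toy mixing
function in place of the module's jhash. The names toy_mix() and
flow_hash() are hypothetical and not part of the patch. Ordering the
addresses and the ports before hashing makes both directions of a flow,
and therefore a node with conntrack and a node without it, feed the same
input to the hash:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in mixing step (FNV-1a style), NOT the jhash used by the module. */
static inline uint32_t toy_mix(uint32_t h, uint32_t v)
{
	return (h ^ v) * 16777619u;
}

/* Hypothetical helper: direction-independent hash input.
 * Addresses and ports are each ordered so the smaller value comes first,
 * mirroring the "if (dst < src) swap()" step discussed in this thread. */
static inline uint32_t flow_hash(uint32_t src, uint32_t dst,
				 uint16_t sport, uint16_t dport,
				 uint8_t proto, uint32_t seed)
{
	uint32_t h = seed;

	if (dst < src) {
		uint32_t t = src; src = dst; dst = t;
	}
	if (dport < sport) {
		uint16_t t = sport; sport = dport; dport = t;
	}

	h = toy_mix(h, src);
	h = toy_mix(h, dst);
	h = toy_mix(h, ((uint32_t)sport << 16) | dport);
	return toy_mix(h, proto);
}

/* Illustrative check with arbitrary values. */
static inline void flow_hash_example(void)
{
	/* 10.0.0.1:40000 -> 10.0.0.2:80 (TCP) and the reply direction */
	uint32_t fwd = flow_hash(0x0a000001, 0x0a000002, 40000, 80, 6, 0);
	uint32_t rev = flow_hash(0x0a000002, 0x0a000001, 80, 40000, 6, 0);

	assert(fwd == rev);
}
```

The sketch works on host-order values and ignores NAT, which can still
make the two nodes see different tuples.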

> 
> > - Moved the L3 check a little bit earlier.
> 
> good.
> 
> > - changed return values for fragments.
> 
> With this, you're giving up on trying to classify fragments. Do you
> really want this?
> 
> From my point of view, if your firewalls (assuming they are the HMARK
> classification) are stateless, it still makes sense to me to classify
> fragments using the XT_HMARK_METHOD_L3_4.

I do agree, it is back to "return 0" again.

> 
> > - Added nhoffs to: hmark_set_tuple_ports(skb, (ip->ihl * 4) + nhoff, t, info);
> >   to get icmp working
> 
> good catch.
> 
> Below, some minor changes that I made to your patch (you can find a
> new version enclosed to this email).
> 
> [...]
> > +#ifndef XT_HMARK_H_
> > +#define XT_HMARK_H_
> > +
> > +#include <linux/types.h>
> > +
> > +enum {
> > +     XT_HMARK_NONE,
> > +     XT_HMARK_SADR_AND,
> > +     XT_HMARK_DADR_AND,
> > +     XT_HMARK_SPI_AND,
> > +     XT_HMARK_SPI_OR,
> > +     XT_HMARK_SPORT_AND,
> > +     XT_HMARK_DPORT_AND,
> > +     XT_HMARK_SPORT_OR,
> > +     XT_HMARK_DPORT_OR,
> > +     XT_HMARK_PROTO_AND,
> > +     XT_HMARK_RND,
> > +     XT_HMARK_MODULUS,
> > +     XT_HMARK_OFFSET,
> > +     XT_HMARK_CT,
> > +     XT_HMARK_METHOD_L3,
> > +     XT_HMARK_METHOD_L3_4,
> > +     XT_F_HMARK_SADR_AND    = 1 << XT_HMARK_SADR_AND,
> > +     XT_F_HMARK_DADR_AND    = 1 << XT_HMARK_DADR_AND,
> > +     XT_F_HMARK_SPI_AND     = 1 << XT_HMARK_SPI_AND,
> > +     XT_F_HMARK_SPI_OR      = 1 << XT_HMARK_SPI_OR,
> > +     XT_F_HMARK_SPORT_AND   = 1 << XT_HMARK_SPORT_AND,
> > +     XT_F_HMARK_DPORT_AND   = 1 << XT_HMARK_DPORT_AND,
> > +     XT_F_HMARK_SPORT_OR    = 1 << XT_HMARK_SPORT_OR,
> > +     XT_F_HMARK_DPORT_OR    = 1 << XT_HMARK_DPORT_OR,
> > +     XT_F_HMARK_PROTO_AND   = 1 << XT_HMARK_PROTO_AND,
> > +     XT_F_HMARK_RND         = 1 << XT_HMARK_RND,
> > +     XT_F_HMARK_MODULUS     = 1 << XT_HMARK_MODULUS,
> > +     XT_F_HMARK_OFFSET      = 1 << XT_HMARK_OFFSET,
> > +     XT_F_HMARK_CT          = 1 << XT_HMARK_CT,
> > +     XT_F_HMARK_METHOD_L3   = 1 << XT_HMARK_METHOD_L3,
> > +     XT_F_HMARK_METHOD_L3_4 = 1 << XT_HMARK_METHOD_L3_4,
> 
> I've defined:
> 
> #define XT_HMARK_FLAG(flag) (1 << flag)
> 
> So we save all those extra _F_ defintions, they look redundant.

OK, I had to change the user mode code to keep up with this change...
The user code part is also included now.
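
For reference, a small sketch of how the single macro replaces the
XT_F_* constants. The enum is copied from the quoted header; the helper
hmark_opt_is_set() is hypothetical, and the macro argument is
parenthesized here as a precaution (the quoted definition is
"(1 << flag)"):

```c
#include <assert.h>
#include <stdint.h>

/* Option IDs as listed in the quoted header. */
enum {
	XT_HMARK_NONE,
	XT_HMARK_SADR_AND,
	XT_HMARK_DADR_AND,
	XT_HMARK_SPI_AND,
	XT_HMARK_SPI_OR,
	XT_HMARK_SPORT_AND,
	XT_HMARK_DPORT_AND,
	XT_HMARK_SPORT_OR,
	XT_HMARK_DPORT_OR,
	XT_HMARK_PROTO_AND,
	XT_HMARK_RND,
	XT_HMARK_MODULUS,
	XT_HMARK_OFFSET,
	XT_HMARK_CT,
	XT_HMARK_METHOD_L3,
	XT_HMARK_METHOD_L3_4,
};

/* One macro instead of one XT_F_* constant per option. */
#define XT_HMARK_FLAG(flag) (1 << (flag))

/* Hypothetical helper: returns 1 if option `id` is set in `flags`. */
static inline int hmark_opt_is_set(uint32_t flags, int id)
{
	return (flags & XT_HMARK_FLAG(id)) != 0;
}

/* Illustrative check of the resulting bit values. */
static inline void hmark_flag_example(void)
{
	uint32_t flags = XT_HMARK_FLAG(XT_HMARK_CT) |
			 XT_HMARK_FLAG(XT_HMARK_MODULUS);

	assert(XT_HMARK_FLAG(XT_HMARK_SADR_AND) == 2);
	assert(hmark_opt_is_set(flags, XT_HMARK_CT));
	assert(!hmark_opt_is_set(flags, XT_HMARK_RND));
}
```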

[snip]

>+static inline u32
>+hmark_addr_mask(int l3num, const __u32 *addr32, const __u32 *mask)
>+{
>+       switch (l3num) {
              ^
Added a space here

>+       case AF_INET:
>+               return *addr32 & *mask;
>+       case AF_INET6:
>+               return hmark_addr6_mask(addr32, mask);

Comments

Pablo Neira Ayuso May 7, 2012, 9:03 a.m. UTC | #1
On Mon, May 07, 2012 at 10:20:42AM +0200, Hans Schillstrom wrote:
> On Monday 07 May 2012 00:57:38 Pablo Neira Ayuso wrote:
> > Hi Hans,
> > 
> > [...]
> > > > > > Regarding ICMP traffic, I think we can use the ID field for the
> > > > > > hashing as well. Thus, we handle ICMP like other protocols.
> > > > >
> > > > > Yes why not, I can give it a try.
> > > > >
> > >
> > > I think we wait with this one..
> > 
> > I see. This is easy to add for the conntrack side, but it will require
> > some extra code for the packet-based solution.
> 
> Actually I think there is very little gain to spread with type 
> and then we must add a user mode possibility to turn it off 
> i.e. a --hmark-icmp-type-mask 
> 
> > Not directly related to this but, I know that your intention is to
> > make this as flexible as possible. However, I still don't find how I
> > would use the port mask feature in any of my setups.  Basically, I
> > don't come up with any useful example for this situation.
> 
> We have plenty of rules where just source port mask is zero.
> and the dest-port-mask is 0xfffc (or 0xffff)

0xffff and 0x0000 mean on/off respectively.

Still curious, how can 0xfffc be useful?

> > I'm also telling this because I think that ICMP support will be
> > easier to add if port masking is removed.
> > 
> > [...]
> > > This is what I have done.
> > >
> > > - I reduced the code size a little bit by combining the hmark_ct_set_htuple_ipvX into one func.
> > >   by adding a hmark_addr6_mask() and hmark_addr_any_mask()
> > >   Note that using "otuple->src.l3num" as param 1 in both src and dst is not a typo.
> > >   (it's not set in the rtuple)
> > 
> > Good one, this made the code even smaller.
> > 
> > > - Made the if (dst < src) swap() in the hmark_hash() since it should be used by every caller.
> > 
> > Not really, you don't need for the conntrack part. The original tuple
> > is always the same, not matter where the packet is coming from. I have
> > removed this again so it only affects packet-based hashing.
> 
> Yes original tuple is always the same but not always less than the rtuple.
> If you have two nodes that should produce the same hmark,
> one with conntrack an one without you must make a compare to make it consistent.

I see, for consistency it still makes sense, although this still seems
to me like a strange configuration. In what scenario would you use two
different approaches?

> > > - Moved the L3 check a little bit earlier.
> > 
> > good.
> > 
> > > - changed return values for fragments.
> > 
> > With this, you're giving up on trying to classify fragments. Do you
> > really want this?
> > 
> > From my point of view, if your firewalls (assuming they are the HMARK
> > classification) are stateless, it still makes sense to me to classify
> > fragments using the XT_HMARK_METHOD_L3_4.
> 
> I do agree, it is back to "return 0" again.

OK.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hans Schillstrom May 7, 2012, 9:14 a.m. UTC | #2
On Monday 07 May 2012 11:03:28 Pablo Neira Ayuso wrote:
> On Mon, May 07, 2012 at 10:20:42AM +0200, Hans Schillstrom wrote:
> > On Monday 07 May 2012 00:57:38 Pablo Neira Ayuso wrote:
> > > Hi Hans,
> > > 
> > > [...]
> > > > > > > Regarding ICMP traffic, I think we can use the ID field for the
> > > > > > > hashing as well. Thus, we handle ICMP like other protocols.
> > > > > >
> > > > > > Yes why not, I can give it a try.
> > > > > >
> > > >
> > > > I think we wait with this one..
> > > 
> > > I see. This is easy to add for the conntrack side, but it will require
> > > some extra code for the packet-based solution.
> > 
> > Actually I think there is very little gain to spread with type 
> > and then we must add a user mode possibility to turn it off 
> > i.e. a --hmark-icmp-type-mask 
> > 
> > > Not directly related to this but, I know that your intention is to
> > > make this as flexible as possible. However, I still don't find how I
> > > would use the port mask feature in any of my setups.  Basically, I
> > > don't come up with any useful example for this situation.
> > 
> > We have plenty of rules where just source port mask is zero.
> > and the dest-port-mask is 0xfffc (or 0xffff)
> 
> 0xffff and 0x0000 means on/off respectively.
> 
> Still curious, how can 0xfffc be useful?

That's a special case where an application is using 4 ports.
But in general, I have not seen anything other than "on/off" except for the above.
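
To illustrate the 0xfffc case, a minimal sketch assuming host-order port
values (the module itself works on the bytes as copied from the packet).
The helper masked_port() and the port numbers are hypothetical. Clearing
the two low bits makes four consecutive ports hash as one:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: AND a port with the configured mask before hashing. */
static inline uint16_t masked_port(uint16_t port, uint16_t mask)
{
	return port & mask;
}

/* Illustrative check; the port numbers are arbitrary. */
static inline void port_mask_example(void)
{
	/* 0xfffc clears the two low bits: 5060..5063 collapse to 5060 */
	assert(masked_port(5060, 0xfffc) == 5060);
	assert(masked_port(5063, 0xfffc) == 5060);
	/* the next port falls into a different group */
	assert(masked_port(5064, 0xfffc) == 5064);
	/* 0xffff keeps the port ("on"), 0x0000 ignores it ("off") */
	assert(masked_port(5063, 0xffff) == 5063);
	assert(masked_port(5063, 0x0000) == 0);
}
```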

> 
> > > I'm also telling this because I think that ICMP support will be
> > > easier to add if port masking is removed.
> > > 
> > > [...]
> > > > This is what I have done.
> > > >
> > > > - I reduced the code size a little bit by combining the hmark_ct_set_htuple_ipvX into one func.
> > > >   by adding a hmark_addr6_mask() and hmark_addr_any_mask()
> > > >   Note that using "otuple->src.l3num" as param 1 in both src and dst is not a typo.
> > > >   (it's not set in the rtuple)
> > > 
> > > Good one, this made the code even smaller.
> > > 
> > > > - Made the if (dst < src) swap() in the hmark_hash() since it should be used by every caller.
> > > 
> > > Not really, you don't need for the conntrack part. The original tuple
> > > is always the same, not matter where the packet is coming from. I have
> > > removed this again so it only affects packet-based hashing.
> > 
> > Yes original tuple is always the same but not always less than the rtuple.
> > If you have two nodes that should produce the same hmark,
> > one with conntrack an one without you must make a compare to make it consistent.
> 
> I see, for consistency still makes sense although this seems to me
> like still strange configuration. In what scenario would you use two
> different approaches?

In the way that we use HMARK,
on the incoming path conntrack is disabled in the container;
on the outgoing path, i.e. at the payloads, conntrack is used.
In that case the --hmark-ct makes life easier.

> 
> > > > - Moved the L3 check a little bit earlier.
> > > 
> > > good.
> > > 
> > > > - changed return values for fragments.
> > > 
> > > With this, you're giving up on trying to classify fragments. Do you
> > > really want this?
> > > 
> > > From my point of view, if your firewalls (assuming they are the HMARK
> > > classification) are stateless, it still makes sense to me to classify
> > > fragments using the XT_HMARK_METHOD_L3_4.
> > 
> > I do agree, it is back to "return 0" again.
> 
> OK.
>
Pablo Neira Ayuso May 7, 2012, 11:56 a.m. UTC | #3
On Mon, May 07, 2012 at 11:14:34AM +0200, Hans Schillstrom wrote:
> > > We have plenty of rules where just source port mask is zero.
> > > and the dest-port-mask is 0xfffc (or 0xffff)
> > 
> > 0xffff and 0x0000 means on/off respectively.
> > 
> > Still curious, how can 0xfffc be useful?
> 
> That's a special case where an appl is using 4 ports.
> But in general, have not seen other than "on/off" except for above.

I see. Well I'm fine with this way to switch on/off things, just
wanted some clarification.

Still one final thing I'd like to remove before inclusion:

+       union hmark_ports       port_mask;
+       union hmark_ports       port_set;
+       __u32                   spi_mask;
+       __u32                   spi_set;

the spi_mask seems redundant. The port_mask already provides u32 for
it.
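
A sketch of the overlap, assuming a reduced form of union hmark_ports
with only the two views used elsewhere in this thread (.p16.src/.p16.dst
and .v32). The helper apply_l4_mask() is hypothetical; its expression
follows the mask/set step in hmark_set_tuple_ports(). Because the 32-bit
view covers both ports, the same field can carry an SPI mask:

```c
#include <assert.h>
#include <stdint.h>

/* Reduced version of the union: two 16-bit ports overlaid on one
 * 32-bit word. */
union hmark_ports {
	struct {
		uint16_t src;
		uint16_t dst;
	} p16;
	uint32_t v32;
};

/* AND the L4 word with the mask, then OR in the "set" bits. */
static inline uint32_t apply_l4_mask(uint32_t l4_word, uint32_t mask,
				     uint32_t set)
{
	union hmark_ports m = { .v32 = mask };
	union hmark_ports s = { .v32 = set };

	return (l4_word & m.v32) | s.v32;
}

/* Illustrative check with an arbitrary SPI value. */
static inline void shared_mask_example(void)
{
	/* the 32-bit view is exactly as wide as the two ports together */
	assert(sizeof(union hmark_ports) == sizeof(uint32_t));
	/* an SPI masked through the same field with 0xffff0000 */
	assert(apply_l4_mask(0x12345678u, 0xffff0000u, 0) == 0x12340000u);
}
```

Byte order is ignored here; the point is only that one 32-bit mask and
one 32-bit set value are enough for both ports and SPI.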

In case you want to support different masks for AH/ESP and TCP, you
could do the following:

iptables -I PREROUTING -t mangle -p esp -j HMARK --spi-mask 0xffff0000
iptables -I PREROUTING -t mangle -p tcp -j HMARK --port-mask 0xfffc

Any objection?

Yes, you'll have to change user-space again, but we have time for
that.

> > > > I'm also telling this because I think that ICMP support will be
> > > > easier to add if port masking is removed.
> > > > 
> > > > [...]
> > > > > This is what I have done.
> > > > >
> > > > > - I reduced the code size a little bit by combining the hmark_ct_set_htuple_ipvX into one func.
> > > > >   by adding a hmark_addr6_mask() and hmark_addr_any_mask()
> > > > >   Note that using "otuple->src.l3num" as param 1 in both src and dst is not a typo.
> > > > >   (it's not set in the rtuple)
> > > > 
> > > > Good one, this made the code even smaller.
> > > > 
> > > > > - Made the if (dst < src) swap() in the hmark_hash() since it should be used by every caller.
> > > > 
> > > > Not really, you don't need for the conntrack part. The original tuple
> > > > is always the same, not matter where the packet is coming from. I have
> > > > removed this again so it only affects packet-based hashing.
> > > 
> > > Yes original tuple is always the same but not always less than the rtuple.
> > > If you have two nodes that should produce the same hmark,
> > > one with conntrack an one without you must make a compare to make it consistent.
> > 
> > I see, for consistency still makes sense although this seems to me
> > like still strange configuration. In what scenario would you use two
> > different approaches?
> 
> In the way that we use HMARK,
> in the incomming path there is conntrack disabled in the contrainer, 
> for the outgoing patch i.e. at the payloads there is conntrack used.
> In that case the --hmark-ct makes life easier.

That's still not enough to guarantee that the mark will be consistent
if NAT is in use, but I don't mind recovering the swap and adding a
comment in the code to explain this if this makes your life easier.
Hans Schillstrom May 7, 2012, 12:09 p.m. UTC | #4
On Monday 07 May 2012 13:56:12 Pablo Neira Ayuso wrote:
> On Mon, May 07, 2012 at 11:14:34AM +0200, Hans Schillstrom wrote:
> > > > We have plenty of rules where just source port mask is zero.
> > > > and the dest-port-mask is 0xfffc (or 0xffff)
> > > 
> > > 0xffff and 0x0000 means on/off respectively.
> > > 
> > > Still curious, how can 0xfffc be useful?
> > 
> > That's a special case where an appl is using 4 ports.
> > But in general, have not seen other than "on/off" except for above.
> 
> I see. Well I'm fine with this way to switch on/off things, just
> wanted some clafication.
> 
> Still one final thing I'd like to remove before inclusion:
> 
> +       union hmark_ports       port_mask;
> +       union hmark_ports       port_set;
> +       __u32                   spi_mask;
> +       __u32                   spi_set;
> 
> the spi_mask seems redundant. The port_mask already provides u32 for
> it.

No problems, I'll remove it.

> In case you want to support different masks for AH/ESP and TCP, you
> could do the following:
> 
> iptables -I PREROUTING -t mangle -p esp -j HARK --spi-mask 0xffff0000
> iptables -I PREROUTING -t mangle -p tcp -j HARK --port-mask 0xfffc
> 
> Any objection?

I don't think this is a problem, but it should be written in the man page
that ports and spi share mask so they can't be used at the same time.


> Yes, you'll have to change user-space again, but we have time for
> that.

:-)

> 
> > > > > I'm also telling this because I think that ICMP support will be
> > > > > easier to add if port masking is removed.
> > > > > 
> > > > > [...]
> > > > > > This is what I have done.
> > > > > >
> > > > > > - I reduced the code size a little bit by combining the hmark_ct_set_htuple_ipvX into one func.
> > > > > >   by adding a hmark_addr6_mask() and hmark_addr_any_mask()
> > > > > >   Note that using "otuple->src.l3num" as param 1 in both src and dst is not a typo.
> > > > > >   (it's not set in the rtuple)
> > > > > 
> > > > > Good one, this made the code even smaller.
> > > > > 
> > > > > > - Made the if (dst < src) swap() in the hmark_hash() since it should be used by every caller.
> > > > > 
> > > > > Not really, you don't need for the conntrack part. The original tuple
> > > > > is always the same, not matter where the packet is coming from. I have
> > > > > removed this again so it only affects packet-based hashing.
> > > > 
> > > > Yes original tuple is always the same but not always less than the rtuple.
> > > > If you have two nodes that should produce the same hmark,
> > > > one with conntrack an one without you must make a compare to make it consistent.
> > > 
> > > I see, for consistency still makes sense although this seems to me
> > > like still strange configuration. In what scenario would you use two
> > > different approaches?
> > 
> > In the way that we use HMARK,
> > in the incomming path there is conntrack disabled in the contrainer, 
> > for the outgoing patch i.e. at the payloads there is conntrack used.
> > In that case the --hmark-ct makes life easier.
> 
> That's still not enough to guarantee that the mark will be consistent
> if NAT is in user, but I don't mind recovering the swap and add some
> comment on the code to explain this if this makes your life easier.

Thanks,  I will send a new patch soon.
Pablo Neira Ayuso May 7, 2012, 12:22 p.m. UTC | #5
On Mon, May 07, 2012 at 02:09:46PM +0200, Hans Schillstrom wrote:
> On Monday 07 May 2012 13:56:12 Pablo Neira Ayuso wrote:
> > On Mon, May 07, 2012 at 11:14:34AM +0200, Hans Schillstrom wrote:
> > > > > We have plenty of rules where just source port mask is zero.
> > > > > and the dest-port-mask is 0xfffc (or 0xffff)
> > > > 
> > > > 0xffff and 0x0000 means on/off respectively.
> > > > 
> > > > Still curious, how can 0xfffc be useful?
> > > 
> > > That's a special case where an appl is using 4 ports.
> > > But in general, have not seen other than "on/off" except for above.
> > 
> > I see. Well I'm fine with this way to switch on/off things, just
> > wanted some clafication.
> > 
> > Still one final thing I'd like to remove before inclusion:
> > 
> > +       union hmark_ports       port_mask;
> > +       union hmark_ports       port_set;
> > +       __u32                   spi_mask;
> > +       __u32                   spi_set;
> > 
> > the spi_mask seems redundant. The port_mask already provides u32 for
> > it.
> 
> No problems, I'll remove it.

OK. As a nice side-effect, this will lead to removing the branch that
tests ESP/AH in hmark_set_tuple_ports.

Please, use the patch that I sent you yesterday. Recover the swap
behaviour that you need, I'll mangle the patch myself to add the
little comment to explain why we do this with CT as well.

BTW, note that you do *not* have to remove the XT_HMARK_SPI flags, we
still need those for iptables-save.

While at it:

+enum {                      
+       XT_HMARK_NONE,       
+       XT_HMARK_SADR_AND,   
+       XT_HMARK_DADR_AND,   
+       XT_HMARK_SPI_AND,    
+       XT_HMARK_SPI_OR,    

remove all trailing _OR

+       XT_HMARK_SPORT_AND,  
+       XT_HMARK_DPORT_AND,  
+       XT_HMARK_SPORT_OR,   
+       XT_HMARK_DPORT_OR,   
+       XT_HMARK_PROTO_AND,

rename all _AND to _MASK.

+       XT_HMARK_RND,        
+       XT_HMARK_MODULUS,    
+       XT_HMARK_OFFSET,     
+       XT_HMARK_CT,         
+       XT_HMARK_METHOD_L3,  
+       XT_HMARK_METHOD_L3_4,
};

What I'm asking should require very few changes in the kernel code.

> > In case you want to support different masks for AH/ESP and TCP, you
> > could do the following:
> > 
> > iptables -I PREROUTING -t mangle -p esp -j HARK --spi-mask 0xffff0000
> > iptables -I PREROUTING -t mangle -p tcp -j HARK --port-mask 0xfffc
> > 
> > Any objection?
> 
> I don't think this is a problem, but it should be written in the man page
> that ports and spi share mask so they can't be used at the same time.

documentation is fine.

iptables can stop this by spotting a warning message from user-space.
Hans Schillstrom May 7, 2012, 12:57 p.m. UTC | #6
On Monday 07 May 2012 14:22:32 Pablo Neira Ayuso wrote:
> On Mon, May 07, 2012 at 02:09:46PM +0200, Hans Schillstrom wrote:
> > On Monday 07 May 2012 13:56:12 Pablo Neira Ayuso wrote:
> > > On Mon, May 07, 2012 at 11:14:34AM +0200, Hans Schillstrom wrote:
> > > > > > We have plenty of rules where just source port mask is zero.
> > > > > > and the dest-port-mask is 0xfffc (or 0xffff)
> > > > > 
> > > > > 0xffff and 0x0000 means on/off respectively.
> > > > > 
> > > > > Still curious, how can 0xfffc be useful?
> > > > 
> > > > That's a special case where an appl is using 4 ports.
> > > > But in general, have not seen other than "on/off" except for above.
> > > 
> > > I see. Well I'm fine with this way to switch on/off things, just
> > > wanted some clafication.
> > > 
> > > Still one final thing I'd like to remove before inclusion:
> > > 
> > > +       union hmark_ports       port_mask;
> > > +       union hmark_ports       port_set;
> > > +       __u32                   spi_mask;
> > > +       __u32                   spi_set;
> > > 
> > > the spi_mask seems redundant. The port_mask already provides u32 for
> > > it.
> > 
> > No problems, I'll remove it.
> 
> OK. As a nice side-effect, this will lead to removing the branch that
> tests ESP/AH in hmark_set_tuple_ports.
>
Yes, only check if not ESP or AH to swap src/dst

+static void
+hmark_set_tuple_ports(const struct sk_buff *skb, unsigned int nhoff,
+		      struct hmark_tuple *t, const struct xt_hmark_info *info)
+{
+	int protoff;
+
+	protoff = proto_ports_offset(t->proto);
+	if (protoff < 0)
+		return;
+
+	nhoff += protoff;
+	if (skb_copy_bits(skb, nhoff, &t->uports, sizeof(t->uports)) < 0)
+		return;
+
+	t->uports.v32 = (t->uports.v32 & info->port_mask.v32) |
+			info->port_set.v32;
+
+	if (t->proto != IPPROTO_ESP && t->proto != IPPROTO_AH)
+		if (t->uports.p16.dst < t->uports.p16.src)
+			swap(t->uports.p16.dst, t->uports.p16.src);
+}

> Please, use the patch that I sent you yesterday. Recover the swap
> behaviour that you need, I'll mangle the patch myself to add the
> little comment to explain why we do this with CT as well.
> 
> BTW, note that you do *not* have to remove the XT_HMARK_SPI flags, we
> still need those for iptables-save.
> 
> While at it:
> 
> +enum {                      
> +       XT_HMARK_NONE,       
> +       XT_HMARK_SADR_AND,   
> +       XT_HMARK_DADR_AND,   
> +       XT_HMARK_SPI_AND,    
> +       XT_HMARK_SPI_OR,    
> 
> remove all trailing _OR
> 
> +       XT_HMARK_SPORT_AND,  
> +       XT_HMARK_DPORT_AND,  
> +       XT_HMARK_SPORT_OR,   
> +       XT_HMARK_DPORT_OR,   
> +       XT_HMARK_PROTO_AND,
> 
> rename all _AND by _MASK.
> 
> +       XT_HMARK_RND,        
> +       XT_HMARK_MODULUS,    
> +       XT_HMARK_OFFSET,     
> +       XT_HMARK_CT,         
> +       XT_HMARK_METHOD_L3,  
> +       XT_HMARK_METHOD_L3_4,
> };
> 
> What I'm asking should require very little changes in the kernel-code.
> 

I'll send you the updates later today

> > > In case you want to support different masks for AH/ESP and TCP, you
> > > could do the following:
> > > 
> > > iptables -I PREROUTING -t mangle -p esp -j HARK --spi-mask 0xffff0000
> > > iptables -I PREROUTING -t mangle -p tcp -j HARK --port-mask 0xfffc
> > > 
> > > Any objection?
> > 
> > I don't think this is a problem, but it should be written in the man page
> > that ports and spi share mask so they can't be used at the same time.
> 
> documentation is fine.
> 
> iptables can stop this by spotting a warning message from user-space.

If you think that's enough, I'm fine with that.
Pablo Neira Ayuso May 7, 2012, 2:54 p.m. UTC | #7
On Mon, May 07, 2012 at 02:57:30PM +0200, Hans Schillstrom wrote:
> On Monday 07 May 2012 14:22:32 Pablo Neira Ayuso wrote:
> > On Mon, May 07, 2012 at 02:09:46PM +0200, Hans Schillstrom wrote:
> > > On Monday 07 May 2012 13:56:12 Pablo Neira Ayuso wrote:
> > > > On Mon, May 07, 2012 at 11:14:34AM +0200, Hans Schillstrom wrote:
> > > > > > > We have plenty of rules where just source port mask is zero.
> > > > > > > and the dest-port-mask is 0xfffc (or 0xffff)
> > > > > > 
> > > > > > 0xffff and 0x0000 means on/off respectively.
> > > > > > 
> > > > > > Still curious, how can 0xfffc be useful?
> > > > > 
> > > > > That's a special case where an appl is using 4 ports.
> > > > > But in general, have not seen other than "on/off" except for above.
> > > > 
> > > > I see. Well I'm fine with this way to switch on/off things, just
> > > > wanted some clafication.
> > > > 
> > > > Still one final thing I'd like to remove before inclusion:
> > > > 
> > > > +       union hmark_ports       port_mask;
> > > > +       union hmark_ports       port_set;
> > > > +       __u32                   spi_mask;
> > > > +       __u32                   spi_set;
> > > > 
> > > > the spi_mask seems redundant. The port_mask already provides u32 for
> > > > it.
> > > 
> > > No problems, I'll remove it.
> > 
> > OK. As a nice side-effect, this will lead to removing the branch that
> > tests ESP/AH in hmark_set_tuple_ports.
> >
> Yes, only check if not ESP or AH to swap src/dst

Do you really need that branch? I mean, unless I'm missing anything, swapping
them shouldn't be a problem.

Patch

From edcb596187a50172481d1e9fa11ae062337c69eb Mon Sep 17 00:00:00 2001
From: Hans Schillstrom <hans.schillstrom@ericsson.com>
Date: Mon, 7 May 2012 09:46:38 +0200
Subject: [PATCH 1/1] netfilter: userspace part for target HMARK

    The target allows you to create rules in the "raw" and "mangle" tables
    which alter the netfilter mark (nfmark) field within a given range.
    First a 32 bit hash value is generated then modulus by <limit> and
    finally an offset is added before it's written to nfmark.
    Prior to routing, the nfmark can influence the routing method (see
    "Use netfilter MARK value as routing key") and can also be used by
    other subsystems to change their behaviour.

    The mark match can also be used to match nfmark produced by this module.
Ver 13
    Name change of defines.

Ver 12
    Reset option flag in some cases, where option is disabled by value.

Ver 10
    conntrack reduced to --hmark-ct switch
    renaming of vars in xt_hmark_info
    Adding helptext and updated man due to --hmark-ct switch

Ver 9
    Formatting changes.

Ver 8
    Syntax changes more descriptive options
    --hmark-method added.

Ver 6-7 -

Ver 5
      smask and dmask changed to length

Ver 4
      xtoptions used for parsing.

Ver 3
       -

Ver 2
      IPv4 NAT added
      iptables ver 1.4.12.1 adaptions.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
 extensions/libxt_HMARK.c           |  510 ++++++++++++++++++++++++++++++++++++
 extensions/libxt_HMARK.man         |   84 ++++++
 include/linux/netfilter/xt_HMARK.h |   48 ++++
 3 files changed, 642 insertions(+), 0 deletions(-)
 create mode 100644 extensions/libxt_HMARK.c
 create mode 100644 extensions/libxt_HMARK.man
 create mode 100644 include/linux/netfilter/xt_HMARK.h

diff --git a/extensions/libxt_HMARK.c b/extensions/libxt_HMARK.c
new file mode 100644
index 0000000..4b13cd3
--- /dev/null
+++ b/extensions/libxt_HMARK.c
@@ -0,0 +1,510 @@ 
+/*
+ * Shared library add-on to iptables to add HMARK target support.
+ *
+ * The kernel module calculates a hash value that can be modified by modulus
+ * and an offset. The hash value is based on a direction independent
+ * five tuple: src & dst addr src & dst ports and protocol.
+ * However src & dst port can be masked and are not used for fragmented
+ * packets, ESP and AH don't have ports so SPI will be used instead.
+ * For ICMP error messages the hash mark values will be calculated on
+ * the source packet i.e. the packet caused the error (If sufficient
+ * amount of data exists).
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include <stdbool.h>
+#include <stdio.h>
+#include <string.h>
+
+#include "xtables.h"
+#include <linux/netfilter/xt_HMARK.h>
+
+
+#define DEF_HRAND 0xc175a3b8	/* Default "random" value to jhash */
+
+#define XT_F_HMARK_L4_OPTS \
+		(XT_HMARK_FLAG(XT_HMARK_SPI_AND) |\
+		 XT_HMARK_FLAG(XT_HMARK_SPI_OR) |\
+		 XT_HMARK_FLAG(XT_HMARK_SPORT_AND) |\
+		 XT_HMARK_FLAG(XT_HMARK_SPORT_OR) |\
+		 XT_HMARK_FLAG(XT_HMARK_DPORT_AND) |\
+		 XT_HMARK_FLAG(XT_HMARK_DPORT_OR) |\
+		 XT_HMARK_FLAG(XT_HMARK_PROTO_AND))
+
+static void HMARK_help(void)
+{
+	printf(
+"HMARK target options, i.e. modify hash calculation by:\n"
+"  --hmark-method <method>          Overall L3/L4 and fragment behavior\n"
+"                 L3                Fragment safe, do not use ports or proto\n"
+"                                   i.e. Fragments don't need special care.\n"
+"                 L3-4 (Default)    Fragment unsafe, use ports and proto\n"
+"                                   if defrag off in conntrack\n"
+"                                      no hmark on any part of a fragment\n"
+"  Limit/modify the calculated hash mark by:\n"
+"  --hmark-mod value                nfmark modulus value\n"
+"  --hmark-offset value             Last action add value to nfmark\n\n"
+" Fine tuning of what will be included in hash calculation\n"
+"  --hmark-src-mask length          Source address mask length\n"
+"  --hmark-dst-mask length          Dest address mask length\n"
+"  --hmark-sport-mask value         Mask src port with value\n"
+"  --hmark-dport-mask value         Mask dst port with value\n"
+"  --hmark-spi-mask value           For esp and ah AND spi with value\n"
+"  --hmark-sport-set value          OR src port with value\n"
+"  --hmark-dport-set value          OR dst port with value\n"
+"  --hmark-spi-set value            For esp and ah OR spi with value\n"
+"  --hmark-proto-mask value         Mask Protocol with value\n"
+"  --hmark-rnd                      Initial Random value to hash calc.\n"
+" For NAT in IPv4: src part from original/reply tuple will always be used\n"
+" i.e. orig src part will be used as src address/port.\n"
+"     reply src part will be used as dst address/port\n"
+" Make sure to qualify the rule in a proper way when using NAT flag\n"
+" When --ct is used only tracked connections will match\n"
+"  --hmark-ct                       Force conntrack orig and reply tuples as\n"
+"                                   source and destination.\n\n"
+" In many cases hmark can be omitted i.e. --src-mask can be used\n");
+}
+
+#define hi struct xt_hmark_info
+
+static const struct xt_option_entry HMARK_opts[] = {
+	{ .name  = "hmark-method",
+	  .type  = XTTYPE_STRING,
+	  .id    = XT_HMARK_METHOD_L3
+	},
+	{ .name  = "hmark-src-mask",
+	  .type  = XTTYPE_PLENMASK,
+	  .id    = XT_HMARK_SADR_AND,
+	  .flags = XTOPT_PUT, XTOPT_POINTER(hi, src_mask)
+	},
+	{ .name  = "hmark-dst-mask",
+	  .type  = XTTYPE_PLENMASK,
+	  .id    = XT_HMARK_DADR_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, dst_mask)
+	},
+	{ .name  = "hmark-sport-mask",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_SPORT_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_mask.p16.src)
+	},
+	{ .name  = "hmark-dport-mask",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_DPORT_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_mask.p16.dst)
+	},
+	{ .name  = "hmark-spi-mask",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_SPI_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, spi_mask)
+	},
+	{ .name  = "hmark-sport-set",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_SPORT_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_set.p16.src)
+	},
+	{ .name  = "hmark-dport-set",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_DPORT_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_set.p16.dst)
+	},
+	{ .name  = "hmark-spi-set",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_SPI_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, spi_set)
+	},
+	{ .name  = "hmark-proto-mask",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_PROTO_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, proto_mask)
+	},
+	{ .name  = "hmark-rnd",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_RND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, hashrnd)
+	},
+	{ .name  = "hmark-mod",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_MODULUS,
+	  .min   = 1,
+	  .flags = XTOPT_PUT | XTOPT_MAND,
+	  XTOPT_POINTER(hi, hmodulus)
+	},
+	{ .name  = "hmark-offset",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_OFFSET,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, hoffset)
+	},
+	{ .name  = "hmark-ct",
+	  .type  = XTTYPE_NONE,
+	  .id    = XT_HMARK_CT
+	},
+
+	{ .name  = "method",
+	  .type  = XTTYPE_STRING,
+	  .id    = XT_HMARK_METHOD_L3
+	},
+	{ .name  = "src-mask",
+	  .type  = XTTYPE_PLENMASK,
+	  .id    = XT_HMARK_SADR_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, src_mask)
+	},
+	{ .name  = "dst-mask",
+	  .type  = XTTYPE_PLENMASK,
+	  .id    = XT_HMARK_DADR_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, dst_mask)
+	},
+	{ .name  = "sport-mask",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_SPORT_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_mask.p16.src)
+	},
+	{ .name  = "dport-mask", .type = XTTYPE_UINT16,
+	  .id = XT_HMARK_DPORT_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_mask.p16.dst)
+	},
+	{ .name  = "spi-mask",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_SPI_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, spi_mask)
+	},
+	{ .name  = "sport-set",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_SPORT_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_set.p16.src)
+	},
+	{ .name  = "dport-set",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_DPORT_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_set.p16.dst)
+	},
+	{ .name  = "spi-set",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_SPI_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, spi_set)
+	},
+	{ .name  = "proto-mask",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_PROTO_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, proto_mask)
+	},
+	{ .name  = "rnd",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_RND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, hashrnd)
+	},
+	{ .name  = "mod",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_MODULUS,
+	  .min   = 1,
+	  .flags = XTOPT_PUT | XTOPT_MAND,
+	  XTOPT_POINTER(hi, hmodulus)
+	},
+	{ .name  = "offset",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_OFFSET,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, hoffset)
+	},
+	{ .name  = "ct",
+	  .type  = XTTYPE_NONE,
+	  .id    = XT_HMARK_CT
+	},
+	XTOPT_TABLEEND,
+};
+
+static void HMARK_parse(struct xt_option_call *cb, int plen)
+{
+	struct xt_hmark_info *info = cb->data;
+
+	if (!cb->xflags) {
+		memset(info, 0xff, sizeof(struct xt_hmark_info));
+		info->port_set.v32 = 0;
+		info->flags = 0;
+		info->spi_set = 0;
+		info->hoffset = 0;
+		info->hashrnd = DEF_HRAND;
+	}
+	xtables_option_parse(cb);
+
+	switch (cb->entry->id) {
+	case XT_HMARK_SADR_AND:
+		if (cb->val.hlen == plen)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_SADR_AND);
+		break;
+	case XT_HMARK_DADR_AND:
+		if (cb->val.hlen == plen)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_DADR_AND);
+		break;
+	case XT_HMARK_SPI_AND:
+		info->spi_mask = htonl(cb->val.u32);
+		if (cb->val.u32 == 0xffffffff)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_SPI_AND);
+		break;
+	case XT_HMARK_SPI_OR:
+		info->spi_set = htonl(cb->val.u32);
+		if (cb->val.u32 == 0)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_SPI_OR);
+		break;
+	case XT_HMARK_SPORT_AND:
+		info->port_mask.p16.src = htons(cb->val.u16);
+		if (cb->val.u16 == 0xffff)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_SPORT_AND);
+		break;
+	case XT_HMARK_DPORT_AND:
+		info->port_mask.p16.dst = htons(cb->val.u16);
+		if (cb->val.u16 == 0xffff)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_DPORT_AND);
+		break;
+	case XT_HMARK_SPORT_OR:
+		info->port_set.p16.src = htons(cb->val.u16);
+		if (cb->val.u16 == 0)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_SPORT_OR);
+		break;
+	case XT_HMARK_DPORT_OR:
+		info->port_set.p16.dst = htons(cb->val.u16);
+		if (cb->val.u16 == 0)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_DPORT_OR);
+		break;
+	case XT_HMARK_PROTO_AND:
+		if (cb->val.u16 == 0xffff)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_PROTO_AND);
+		break;
+	case XT_HMARK_MODULUS:
+		if (info->hmodulus == 0) {
+			/* xtables_error() does not return */
+			xtables_error(PARAMETER_PROBLEM,
+				      "HMARK: --hmark-mod must be "
+				      "greater than zero");
+		}
+		break;
+	case XT_HMARK_METHOD_L3:
+		if (strcmp(cb->arg, "L3") == 0) {
+			info->proto_mask = 0;
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_METHOD_L3_4);
+		} else if (strcmp(cb->arg, "L3-4") == 0) {
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_METHOD_L3);
+			cb->xflags |= XT_HMARK_FLAG(XT_HMARK_METHOD_L3_4);
+		}
+		break;
+	}
+	info->flags = cb->xflags;
+}
+
+static void HMARK_ip4_parse(struct xt_option_call *cb)
+{
+	HMARK_parse(cb, 32);
+}
+static void HMARK_ip6_parse(struct xt_option_call *cb)
+{
+	HMARK_parse(cb, 128);
+}
+
+static void HMARK_check(struct xt_fcheck_call *cb)
+{
+	if (!(cb->xflags & XT_HMARK_FLAG(XT_HMARK_MODULUS)))
+		xtables_error(PARAMETER_PROBLEM, "HMARK: --hmark-mod "
+			      "is not set or is zero (division by zero)");
+	/* Check for invalid options */
+	if (cb->xflags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3) &&
+	   (cb->xflags & XT_F_HMARK_L4_OPTS))
+		xtables_error(PARAMETER_PROBLEM, "HMARK: --hmark-method L3 "
+			      "cannot be combined with Layer 4 options: "
+			      "port, spi or proto");
+}
+/*
+ * Common print for IPv4 & IPv6
+ */
+static void HMARK_print(const struct xt_hmark_info *info)
+{
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3)) {
+		printf("method L3 ");
+	} else {
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3_4))
+			printf("method L3-4 ");
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPORT_AND))
+			printf("sport-mask 0x%x ",
+			       ntohs(info->port_mask.p16.src));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_DPORT_AND))
+			printf("dport-mask 0x%x ",
+			       ntohs(info->port_mask.p16.dst));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPI_AND))
+			printf("spi-mask 0x%x ", ntohl(info->spi_mask));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPORT_OR))
+			printf("sport-set 0x%x ",
+			       ntohs(info->port_set.p16.src));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_DPORT_OR))
+			printf("dport-set 0x%x ",
+			       ntohs(info->port_set.p16.dst));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPI_OR))
+			printf("spi-set 0x%x ", ntohl(info->spi_set));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_PROTO_AND))
+			printf("proto-mask 0x%x ", info->proto_mask);
+	}
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_RND))
+		printf("rnd 0x%x ", info->hashrnd);
+
+}
+
+static void HMARK_ip6_print(const void *ip,
+			    const struct xt_entry_target *target, int numeric)
+{
+	const struct xt_hmark_info *info =
+			(const struct xt_hmark_info *)target->data;
+
+	printf(" HMARK ");
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_MODULUS))
+		printf("%% 0x%x ", info->hmodulus);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_OFFSET))
+		printf("+ 0x%x ", info->hoffset);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_CT))
+		printf("ct, ");
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_SADR_AND))
+		printf("src-mask %s ",
+		       xtables_ip6mask_to_numeric(&info->src_mask.in6) + 1);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_DADR_AND))
+		printf("dst-mask %s ",
+		       xtables_ip6mask_to_numeric(&info->dst_mask.in6) + 1);
+	HMARK_print(info);
+}
+static void HMARK_ip4_print(const void *ip,
+			    const struct xt_entry_target *target, int numeric)
+{
+	const struct xt_hmark_info *info =
+		(const struct xt_hmark_info *)target->data;
+
+	printf(" HMARK ");
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_MODULUS))
+		printf("%% 0x%x ", info->hmodulus);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_OFFSET))
+		printf("+ 0x%x ", info->hoffset);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_CT))
+		printf("ct, ");
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_SADR_AND))
+		printf("src-mask %s ",
+		       xtables_ipmask_to_numeric(&info->src_mask.in) + 1);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_DADR_AND))
+		printf("dst-mask %s ",
+		       xtables_ipmask_to_numeric(&info->dst_mask.in) + 1);
+	HMARK_print(info);
+}
+static void HMARK_save(const struct xt_hmark_info *info)
+{
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3)) {
+		printf(" --hmark-method L3");
+	} else {
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3_4))
+			printf(" --hmark-method L3-4");
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPORT_AND))
+			printf(" --hmark-sport-mask 0x%x",
+			       ntohs(info->port_mask.p16.src));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_DPORT_AND))
+			printf(" --hmark-dport-mask 0x%x",
+			       ntohs(info->port_mask.p16.dst));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPI_AND))
+			printf(" --hmark-spi-mask 0x%x",
+			       ntohl(info->spi_mask));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPORT_OR))
+			printf(" --hmark-sport-set 0x%x",
+			       ntohs(info->port_set.p16.src));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_DPORT_OR))
+			printf(" --hmark-dport-set 0x%x",
+			       ntohs(info->port_set.p16.dst));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPI_OR))
+			printf(" --hmark-spi-set 0x%x", ntohl(info->spi_set));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_PROTO_AND))
+			printf(" --hmark-proto-mask 0x%x", info->proto_mask);
+	}
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_RND))
+		printf(" --hmark-rnd 0x%x", info->hashrnd);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_MODULUS))
+		printf(" --hmark-mod 0x%x", info->hmodulus);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_OFFSET))
+		printf(" --hmark-offset 0x%x", info->hoffset);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_CT))
+		printf(" --hmark-ct");
+}
+
+static void HMARK_ip6_save(const void *ip, const struct xt_entry_target *target)
+{
+	const struct xt_hmark_info *info =
+		(const struct xt_hmark_info *)target->data;
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_SADR_AND))
+		printf(" --hmark-src-mask %s",
+		       xtables_ip6mask_to_numeric(&info->src_mask.in6) + 1);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_DADR_AND))
+		printf(" --hmark-dst-mask %s",
+		       xtables_ip6mask_to_numeric(&info->dst_mask.in6) + 1);
+	HMARK_save(info);
+}
+
+static void HMARK_ip4_save(const void *ip, const struct xt_entry_target *target)
+{
+	const struct xt_hmark_info *info =
+		(const struct xt_hmark_info *)target->data;
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_SADR_AND))
+		printf(" --hmark-src-mask %s",
+		       xtables_ipmask_to_numeric(&info->src_mask.in) + 1);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_DADR_AND))
+		printf(" --hmark-dst-mask %s",
+		       xtables_ipmask_to_numeric(&info->dst_mask.in) + 1);
+	HMARK_save(info);
+}
+
+static struct xtables_target mark_tg_reg[] = {
+	{
+		.family        = NFPROTO_IPV4,
+		.name          = "HMARK",
+		.version       = XTABLES_VERSION,
+		.revision      = 0,
+		.size          = XT_ALIGN(sizeof(struct xt_hmark_info)),
+		.userspacesize = XT_ALIGN(sizeof(struct xt_hmark_info)),
+		.help          = HMARK_help,
+		.print         = HMARK_ip4_print,
+		.save          = HMARK_ip4_save,
+		.x6_parse      = HMARK_ip4_parse,
+		.x6_fcheck     = HMARK_check,
+		.x6_options    = HMARK_opts,
+	},
+	{
+		.family        = NFPROTO_IPV6,
+		.name          = "HMARK",
+		.version       = XTABLES_VERSION,
+		.revision      = 0,
+		.size          = XT_ALIGN(sizeof(struct xt_hmark_info)),
+		.userspacesize = XT_ALIGN(sizeof(struct xt_hmark_info)),
+		.help          = HMARK_help,
+		.print         = HMARK_ip6_print,
+		.save          = HMARK_ip6_save,
+		.x6_parse      = HMARK_ip6_parse,
+		.x6_fcheck     = HMARK_check,
+		.x6_options    = HMARK_opts,
+	},
+};
+
+void _init(void)
+{
+	xtables_register_targets(mark_tg_reg, ARRAY_SIZE(mark_tg_reg));
+}
diff --git a/extensions/libxt_HMARK.man b/extensions/libxt_HMARK.man
new file mode 100644
index 0000000..c258e59
--- /dev/null
+++ b/extensions/libxt_HMARK.man
@@ -0,0 +1,84 @@ 
+This module does the same as MARK, i.e. sets an fwmark, but the mark is based on a hash value.
+The hash is based on src-addr, dst-addr, sport, dport and proto. The same mark will be produced independent of direction if no masks are set or the same masks are used for src and dest.
+The hash mark can be adjusted by a modulus, and finally an offset can be added, i.e. the final mark will be within a range.
+An ICMP error will use the original message for hash calculation, not the ICMP message itself.
+
+Note: IPv4 packets with nf_defrag_ipv4 loaded will be defragmented before they reach hmark,
+      IPv6 nf_defrag is not implemented this way, hence fragmented IPv6 packets will reach hmark.
+      Default behavior is to completely ignore any fragment that reaches hmark.
+      --hmark-method L3 is fragment safe since neither ports nor the L4 protocol field are used.
+      None of the parameters affect the packet itself, only the calculated hash value.
+
+.PP
+Parameters:
+Short hand methods
+.TP
+\fB\-\-hmark\-method\fP \fIL3\fP
+Do not use the L4 protocol field, ports or spi, only Layer 3 addresses. Mask lengths
+for the L3 addresses can still be used. Whether a packet is a fragment does not matter
+in this case, since only L3 addresses are used in the calculation of the hash value.
+.TP
+\fB\-\-hmark\-method\fP \fIL3-4\fP (Default)
+Include L4 in the calculation of the hash value, i.e. all masks below are valid.
+Fragments will be ignored (i.e. no hash value is produced).
+.PP
+For all masks the default is all ones; to disable a field use mask 0.
+.TP
+\fB\-\-hmark\-src\-mask\fP \fIlength\fP
+The length of the mask to AND the source address with (saddr & value).
+.TP
+\fB\-\-hmark\-dst\-mask\fP \fIlength\fP
+The length of the mask to AND the dest. address with (daddr & value).
+.TP
+\fB\-\-hmark\-sport\-mask\fP \fIvalue\fP
+A 16 bit value to AND the src port with (sport & value).
+.TP
+\fB\-\-hmark\-dport\-mask\fP \fIvalue\fP
+A 16 bit value to AND the dest port with (dport & value).
+.TP
+\fB\-\-hmark\-sport\-set\fP \fIvalue\fP
+A 16 bit value to OR the src port with (sport | value).
+.TP
+\fB\-\-hmark\-dport\-set\fP \fIvalue\fP
+A 16 bit value to OR the dest port with (dport | value).
+.TP
+\fB\-\-hmark\-spi\-mask\fP \fIvalue\fP
+Value to AND the spi field with (spi & value) valid for proto esp or ah.
+.TP
+\fB\-\-hmark\-spi\-set\fP \fIvalue\fP
+Value to OR the spi field with (spi | value) valid for proto esp or ah.
+.TP
+\fB\-\-hmark\-proto\-mask\fP \fIvalue\fP
+An 8 bit value to AND the L4 proto field with (proto & value).
+.TP
+\fB\-\-hmark\-ct\fP
+When this flag is set, conntrack data is used. Useful when NAT internal addresses should be used in the calculation.
+Be careful when using DNAT since the mangle table is handled before the nat table, i.e. it will not work as expected to put HMARK in the mangle table PREROUTING chain. The initial packet will have its hash based on the original address, while the rest of the flow will use the NATed address.
+.TP
+\fB\-\-hmark\-rnd\fP \fIvalue\fP
+A 32 bit initial value for hash calc, default is 0xc175a3b8.
+.PP
+Final processing of the mark in order of execution.
+.TP
+\fB\-\-hmark\-mod\fP \fIvalue (must be > 0)\fP
+The easiest way to describe this is:  hash = hash mod <value>
+.TP
+\fB\-\-hmark\-offset\fP \fIvalue\fP
+The easiest way to describe this is:  hash = hash + <value>
+.PP
+\fIExamples:\fP
+.PP
+Default rule handles all TCP, UDP, SCTP, ESP & AH
+.IP
+iptables \-t mangle \-A PREROUTING \-m state \-\-state NEW,ESTABLISHED,RELATED
+ \-j HMARK \-\-hmark\-offset 10000 \-\-hmark\-mod 10
+.PP
+Handle SCTP, hash on the dest port only, and produce an nfmark between 100 and 119.
+.IP
+iptables \-t mangle \-A PREROUTING \-p sctp \-j HMARK \-\-src\-mask 0 \-\-dst\-mask 0
+ \-\-sport\-mask 0 \-\-offset 100 \-\-mod 20
+.PP
+Fragment-safe Layer 3 only rule that keeps a class C network flow together
+.IP
+iptables \-t mangle \-A PREROUTING \-j HMARK \-\-method L3 \-\-src\-mask 24 \-\-mod 20 \-\-offset 100
+
diff --git a/include/linux/netfilter/xt_HMARK.h b/include/linux/netfilter/xt_HMARK.h
new file mode 100644
index 0000000..05e43ba
--- /dev/null
+++ b/include/linux/netfilter/xt_HMARK.h
@@ -0,0 +1,48 @@ 
+#ifndef XT_HMARK_H_
+#define XT_HMARK_H_
+
+#include <linux/types.h>
+
+enum {
+	XT_HMARK_NONE,
+	XT_HMARK_SADR_AND,
+	XT_HMARK_DADR_AND,
+	XT_HMARK_SPI_AND,
+	XT_HMARK_SPI_OR,
+	XT_HMARK_SPORT_AND,
+	XT_HMARK_DPORT_AND,
+	XT_HMARK_SPORT_OR,
+	XT_HMARK_DPORT_OR,
+	XT_HMARK_PROTO_AND,
+	XT_HMARK_RND,
+	XT_HMARK_MODULUS,
+	XT_HMARK_OFFSET,
+	XT_HMARK_CT,
+	XT_HMARK_METHOD_L3,
+	XT_HMARK_METHOD_L3_4,
+};
+#define XT_HMARK_FLAG(flag)	(1 << (flag))
+
+union hmark_ports {
+	struct {
+		__u16	src;
+		__u16	dst;
+	} p16;
+	__u32	v32;
+};
+
+struct xt_hmark_info {
+	union nf_inet_addr	src_mask;	/* Source address mask */
+	union nf_inet_addr	dst_mask;	/* Dest address mask */
+	union hmark_ports	port_mask;
+	union hmark_ports	port_set;
+	__u32			spi_mask;
+	__u32			spi_set;
+	__u32			flags;		/* Print out only */
+	__u16			proto_mask;	/* L4 Proto mask */
+	__u32			hashrnd;
+	__u32			hmodulus;	/* Modulus */
+	__u32			hoffset;	/* Offset */
+};
+
+#endif /* XT_HMARK_H_ */
-- 
1.7.2.3
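
For reviewers who want to sanity-check the documented semantics, here is a rough userspace sketch of how the options in the man page combine: AND masks, OR sets, then modulus, then offset. This is not the kernel implementation. The struct names, the demo_mix() mixer (an FNV-1a style stand-in) and the way the endpoints are ordered to get direction independence are all assumptions made for illustration; only the option semantics and the default seed 0xc175a3b8 come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key, host byte order; not a structure from the patch. */
struct demo_tuple {
	uint32_t src, dst;	/* IPv4 addresses */
	uint16_t sport, dport;
	uint8_t  proto;
};

/* Hypothetical mirror of the user-visible options. */
struct demo_cfg {
	uint32_t src_mask, dst_mask;	/* --hmark-src-mask / --hmark-dst-mask */
	uint16_t sport_mask, dport_mask;/* --hmark-sport-mask / --hmark-dport-mask */
	uint16_t sport_set, dport_set;	/* --hmark-sport-set / --hmark-dport-set */
	uint32_t rnd;			/* --hmark-rnd */
	uint32_t mod;			/* --hmark-mod, must be > 0 */
	uint32_t offset;		/* --hmark-offset */
};

/* Stand-in byte mixer (FNV-1a style); an assumption, NOT the kernel hash. */
static uint32_t demo_mix(uint32_t h, uint32_t v)
{
	int i;

	for (i = 0; i < 4; i++) {
		h ^= (v >> (8 * i)) & 0xff;
		h *= 16777619u;
	}
	return h;
}

static uint32_t demo_hmark(const struct demo_tuple *t, const struct demo_cfg *c)
{
	uint32_t src = t->src & c->src_mask;
	uint32_t dst = t->dst & c->dst_mask;
	uint16_t sp = (uint16_t)((t->sport & c->sport_mask) | c->sport_set);
	uint16_t dp = (uint16_t)((t->dport & c->dport_mask) | c->dport_set);
	uint32_t h, tmp32;
	uint16_t tmp16;

	assert(c->mod != 0);	/* a zero modulus would divide by zero */

	/* Order the endpoints so both directions of a flow hash alike. */
	if (dst < src) {
		tmp32 = src; src = dst; dst = tmp32;
	}
	if (dp < sp) {
		tmp16 = sp; sp = dp; dp = tmp16;
	}

	h = demo_mix(c->rnd, src);
	h = demo_mix(h, dst);
	h = demo_mix(h, ((uint32_t)sp << 16) | dp);
	h = demo_mix(h, t->proto);

	/* Final processing in documented order: modulus, then offset. */
	return h % c->mod + c->offset;
}
```

With mod 20 and offset 100 the result always lands in 100..119, matching the second man page example, and swapping src/dst together with sport/dport gives the same mark as long as the src and dst masks are symmetric.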