[GIT] Networking

Message ID	20151027.233235.1641084823622810663.davem@davemloft.net
State	Accepted, archived
Delegated to:	David Miller
Headers	show Return-Path: <netdev-owner@vger.kernel.org> sender: davem-davemloft) by shards.monkeyblade.net (Postfix) with ESMTPSA id 46F71592A19; Tue, 27 Oct 2015 23:15:56 -0700 (PDT) Date: Tue, 27 Oct 2015 23:32:35 -0700 (PDT) Message-Id: <20151027.233235.1641084823622810663.davem@davemloft.net> To: torvalds@linux-foundation.org CC: akpm@linux-foundation.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [GIT] Networking From: David Miller <davem@davemloft.net> Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-8859-1 Content-Transfer-Encoding: 8BIT Sender: netdev-owner@vger.kernel.org Precedence: bulk

David Miller Oct. 28, 2015, 6:32 a.m. UTC

This may look a bit scary this late in the release cycle, but as is typically
the case it's predominantly small driver fixes all over the place.

1) Fix two regressions in ipv6 route lookups, particularly wrt. output
   interface specifications in the lookup key.  From David Ahern.

2) Fix checks in ipv6 IPSEC tunnel pre-encap fragmentation, from
   Herbert Xu.

3) Fix mis-advertisement of 1000BASE-T on bcm63xx_enet, from Simon
   Arlott.

4) Some smsc phys misbehave with energy detect mode enabled, so add a
   DT property and disable it on such switches.  From Heiko Schocher.

5) Fix TSO corruption on TX in mv643xx_eth, from Philipp Kirchhofer.

6) Fix regression added by removal of openvswitch vport stats, from
   James Morse.

7) Vendor Kconfig options should be bool, not tristate, from Andreas
   Schwab.

8) Use non-_BH() net stats bump in tcp_xmit_probe_skb(), otherwise
   we barf during TCP REPAIR operations.

9) Fix various bugs in openvswitch conntrack support, from Joe
   Stringer.

10) Fix NETLINK_LIST_MEMBERSHIPS locking, from David Herrmann.

11) Don't have VSOCK do sock_put() in interrupt context, from Jorgen
    Hansen.

12) Fix skb_realloc_headroom() failures properly in ISDN, from Karsten
    Keil.

13) Add some device IDs to qmi_wwan, from Bjorn Mork.

14) Fix ovs egress tunnel information when using lwtunnel devices,
    from Pravin B Shelar.

15) Add missing NETIF_F_FRAGLIST to macvtab feature list, from Jason
    Wang.

16) Fix incorrect handling of throw routes when the result of the
    throw cannot find a match, from Xin Long.

17) Protect ipv6 MTU calculations from wrap-around, from Hannes
    Frederic Sowa.

18) Fix failed autonegotiation on KSZ9031 micrel PHYs, from Nathan
    Sullivan.

19) Add missing memory barries in descriptor accesses or xgbe driver,
    from Thomas Lendacky.

20) Fix release conditon test in pppoe_release(), from Guillaume Nault.

21) Fix gianfar bugs wrt. filter configuration, from Claudiu Manoil.

22) Fix violations of RX buffer alignment in sh_eth driver, from Sergei
    Shtylyov.

23) Fixing missing of_node_put() calls in various places around the
    networking, from Julia Lawall.

24) Fix incorrect leaf now walking in ipv4 routing tree, from Alexander
    Duyck.

25) RDS doesn't check pskb_pull()/pskb_trim() return values, from
    Sowmini Varadhan.

26) Fix VLAN configuration in mlx4 driver, from Jack Morgenstein.

Please pull, thanks a lot.

The following changes since commit 1099f86044111e9a7807f09523e42d4c9d0fb781:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2015-10-19 09:55:40 -0700)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

for you to fetch changes up to e18f6ac30d31433d8cd9ccf693d3cdd5d2e66ef9:

  Merge branch 'mlx4-fixes' (2015-10-27 20:27:45 -0700)

----------------------------------------------------------------

Alexander Duyck (1):
      fib_trie: leaf_walk_rcu should not compute key if key is less than pn->key

Andreas Schwab (1):
      net: cavium: change NET_VENDOR_CAVIUM to bool

Andrew F. Davis (1):
      net: phy: dp83848: Add TI DP83848 Ethernet PHY

Andrew Shewmaker (1):
      tcp: allow dctcp alpha to drop to zero

Bjørn Mork (1):
      qmi_wwan: add Sierra Wireless MC74xx/EM74xx

Carol L Soto (1):
      net/mlx4: Copy/set only sizeof struct mlx4_eqe bytes

Claudiu Manoil (4):
      gianfar: Remove duplicated argument to bitwise OR
      gianfar: Don't enable the Filer w/o the Parser
      gianfar: Fix Rx BSY error handling
      MAINTAINERS: Add entry for gianfar ethernet driver

Dan Carpenter (1):
      irda: precedence bug in irlmp_seq_hb_idx()

David Ahern (2):
      net: Really fix vti6 with oif in dst lookups
      net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr set

David Daney (1):
      net: thunderx: Rewrite silicon revision tests.

David Herrmann (1):
      netlink: fix locking around NETLINK_LIST_MEMBERSHIPS

David S. Miller (12):
      Merge branch 'smsc-energy-detect'
      Merge branch 'mv643xx-fixes'
      Merge git://git.kernel.org/.../pablo/nf
      Merge branch 'isdn-null-deref'
      Merge branch 'master' of git://git.kernel.org/.../klassert/ipsec
      Merge branch 'master' of git://git.kernel.org/.../jkirsher/net-queue
      Merge branch 'ipv6-overflow-arith'
      Merge branch 'thunderx-fixes'
      Merge branch 'gianfar-fixes'
      Merge branch 'sh_eth-fixes'
      Merge branch 'net_of_node_put'
      Merge branch 'mlx4-fixes'

Eric Dumazet (1):
      ipv6: gre: support SIT encapsulation

Florian Westphal (1):
      netfilter: sync with packet rx also after removing queue entries

Gao feng (1):
      vsock: fix missing cleanup when misc_register failed

Guillaume Nault (1):
      ppp: fix pppoe_dev deletion condition in pppoe_release()

Hannes Frederic Sowa (2):
      overflow-arith: begin to add support for overflow builtin functions
      ipv6: protect mtu calculation of wrap-around and infinite loop by rounding issues

Hans de Goede (1):
      net: sun4i-emac: Properly free resources on probe failure and remove

Heiko Schocher (2):
      drivers: net: cpsw: add phy-handle parsing
      net: phy: smsc: disable energy detect mode

Herbert Xu (1):
      ipv6: Fix IPsec pre-encap fragmentation check

Jack Morgenstein (1):
      net/mlx4_en: Explicitly set no vlan tags in WQE ctrl segment when no vlan is present

James Morse (1):
      openvswitch: Allocate memory for ovs internal device stats.

Jason Wang (1):
      macvtap: unbreak receiving of gro skb with frag list

Jesse Brandeburg (2):
      i40e: fix stats offsets
      i40e: fix annoying message

Joe Jin (1):
      xen-netfront: update num_queues to real created

Joe Stringer (7):
      openvswitch: Reject ct_state masks for unknown bits
      openvswitch: Clarify conntrack COMMIT behaviour
      openvswitch: Mark connections new when not confirmed.
      openvswitch: Serialize nested ct actions if provided
      openvswitch: Fix double-free on ip_defrag() errors
      ipv6: Export nf_ct_frag6_consume_orig()
      openvswitch: Fix skb leak using IPv6 defrag

Jon Paul Maloy (3):
      tipc: extend broadcast link window size
      tipc: allow non-linear first fragment buffer
      tipc: conditionally expand buffer headroom over udp tunnel

Jorgen Hansen (2):
      VSOCK: sock_put wasn't safe to call in interrupt context
      VSOCK: Fix lockdep issue.

Julia Lawall (6):
      net: thunderx: add missing of_node_put
      net: netcp: add missing of_node_put
      netdev/phy: add missing of_node_put
      net: phy: mdio: add missing of_node_put
      ath6kl: add missing of_node_put
      net: mv643xx_eth: add missing of_node_put

Karsten Keil (2):
      ISDN: fix OOM condition for sending queued I-Frames
      mISDN: fix OOM condition for sending queued I-Frames

Lendacky, Thomas (2):
      amd-xgbe: Use wmb before updating current descriptor count
      amd-xgbe: Fix race between access of desc and desc index

Li RongQing (2):
      af_key: fix two typos
      net: sysctl: fix a kmemleak warning

Mathias Krause (1):
      xfrm6: Fix ICMPv6 and MH header checks in _decode_session6

Michael Rossberg (1):
      xfrm: Fix state threshold configuration from userspace

Michael S. Tsirkin (1):
      vhost: fix performance on LE hosts

Nathan Sullivan (1):
      net/phy: micrel: Add workaround for bad autoneg

Neil Horman (1):
      forcedeth: fix unilateral interrupt disabling in netpoll path

Nikolay Borisov (1):
      netfilter: ipset: Fix sleeping memory allocation in atomic context

Pablo Neira Ayuso (1):
      netfilter: fix Kconfig dependencies for nf_dup_ipv{4,6}

Philipp Kirchhofer (2):
      net: mv643xx_eth: Ensure proper data alignment in TSO TX path
      net: mv643xx_eth: Defer writing the first TX descriptor when using TSO

Pravin B Shelar (1):
      openvswitch: Fix egress tunnel info.

Renato Westphal (1):
      tcp: remove improper preemption check in tcp_xmit_probe_skb()

Sergei Shtylyov (2):
      sh_eth: fix RX buffer size alignment
      sh_eth: fix RX buffer size calculation

Simon Arlott (1):
      bcm63xx_enet: check 1000BASE-T advertisement configuration

Sowmini Varadhan (1):
      RDS-TCP: Recover correctly from pskb_pull()/pksb_trim() failure in rds_tcp_data_recv

Steffen Klassert (1):
      xfrm: Fix pmtu discovery for local generated packets.

Sunil Goutham (2):
      net: thunderx: Remove PF soft reset.
      net: thunderx: Fix incorrect subsystem devid of VF on pass2 silicon

Thanneeru Srinivasulu (1):
      net: thunderx: Incorporate pass2 silicon CPI index configuration changes

Yang Shi (1):
      bpf: sample: define aarch64 specific registers

lucien (2):
      netfilter: ipt_rpfilter: remove the nh_scope test in rpfilter_lookup_reverse
      ipv6: fix the incorrect return value of throw route

 Documentation/devicetree/bindings/net/cpsw.txt         |    1 +
 Documentation/devicetree/bindings/net/smsc-lan87xx.txt |   24 ++++++++++++
 MAINTAINERS                                            |    8 ++++
 drivers/isdn/hisax/isdnl2.c                            |   20 ++++------
 drivers/isdn/mISDN/layer2.c                            |   54 ++++++++++---------------
 drivers/net/ethernet/allwinner/sun4i-emac.c            |   20 ++++++++--
 drivers/net/ethernet/amd/xgbe/xgbe-dev.c               |    2 +-
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c               |    8 +++-
 drivers/net/ethernet/broadcom/bcm63xx_enet.c           |   33 +++++++++-------
 drivers/net/ethernet/cavium/Kconfig                    |    2 +-
 drivers/net/ethernet/cavium/thunder/nic_main.c         |   42 +++++++++++++-------
 drivers/net/ethernet/cavium/thunder/nic_reg.h          |    4 ++
 drivers/net/ethernet/cavium/thunder/nicvf_main.c       |    2 +-
 drivers/net/ethernet/cavium/thunder/thunder_bgx.c      |    4 +-
 drivers/net/ethernet/freescale/gianfar.c               |    8 ++--
 drivers/net/ethernet/freescale/gianfar_ethtool.c       |    4 +-
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c         |    6 +++
 drivers/net/ethernet/intel/i40e/i40e_main.c            |    1 +
 drivers/net/ethernet/marvell/mv643xx_eth.c             |   52 ++++++++++++++++++++-----
 drivers/net/ethernet/mellanox/mlx4/cmd.c               |    2 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c             |    2 +
 drivers/net/ethernet/mellanox/mlx4/eq.c                |    2 +-
 drivers/net/ethernet/nvidia/forcedeth.c                |   24 ++++++------
 drivers/net/ethernet/renesas/sh_eth.c                  |   14 +++----
 drivers/net/ethernet/ti/cpsw.c                         |   15 +++++--
 drivers/net/ethernet/ti/netcp_ethss.c                  |    8 +++-
 drivers/net/geneve.c                                   |   40 +++++++++++++++----
 drivers/net/macvtap.c                                  |    2 +-
 drivers/net/phy/Kconfig                                |    5 +++
 drivers/net/phy/Makefile                               |    1 +
 drivers/net/phy/dp83848.c                              |   99 ++++++++++++++++++++++++++++++++++++++++++++++
 drivers/net/phy/mdio-mux-mmioreg.c                     |    2 +
 drivers/net/phy/mdio-mux.c                             |    1 +
 drivers/net/phy/micrel.c                               |   23 ++++++++++-
 drivers/net/phy/smsc.c                                 |   19 ++++++---
 drivers/net/ppp/pppoe.c                                |    2 +-
 drivers/net/usb/qmi_wwan.c                             |    4 ++
 drivers/net/vxlan.c                                    |   41 +++++++++++++++++++
 drivers/net/wireless/ath/ath6kl/init.c                 |    1 +
 drivers/net/xen-netfront.c                             |   14 +++----
 drivers/vhost/vhost.h                                  |    7 ++++
 include/linux/compiler-gcc.h                           |    4 ++
 include/linux/netdevice.h                              |    7 ++++
 include/linux/overflow-arith.h                         |   18 +++++++++
 include/net/dst_metadata.h                             |   32 +++++++++++++++
 include/uapi/linux/openvswitch.h                       |    3 +-
 net/core/dev.c                                         |   27 +++++++++++++
 net/ipv4/fib_trie.c                                    |    2 +-
 net/ipv4/gre_offload.c                                 |    3 +-
 net/ipv4/ip_gre.c                                      |   46 +++++++++++++++++-----
 net/ipv4/netfilter/Kconfig                             |    1 +
 net/ipv4/netfilter/ipt_rpfilter.c                      |    4 +-
 net/ipv4/tcp_dctcp.c                                   |    2 +-
 net/ipv4/tcp_output.c                                  |    2 +-
 net/ipv4/xfrm4_output.c                                |    2 +
 net/ipv6/fib6_rules.c                                  |   19 +++++++--
 net/ipv6/ip6_fib.c                                     |   12 +++++-
 net/ipv6/ip6_output.c                                  |    9 ++++-
 net/ipv6/netfilter/Kconfig                             |    1 +
 net/ipv6/netfilter/nf_conntrack_reasm.c                |    1 +
 net/ipv6/route.c                                       |    9 ++++-
 net/ipv6/xfrm6_output.c                                |   18 ++++++---
 net/ipv6/xfrm6_policy.c                                |    6 ++-
 net/irda/irlmp.c                                       |    2 +-
 net/key/af_key.c                                       |    2 +-
 net/netfilter/core.c                                   |    2 +
 net/netfilter/ipset/ip_set_list_set.c                  |    2 +-
 net/netlink/af_netlink.c                               |    4 +-
 net/openvswitch/actions.c                              |   13 +++----
 net/openvswitch/conntrack.c                            |   48 ++++++++++++++++-------
 net/openvswitch/conntrack.h                            |   17 +++-----
 net/openvswitch/datapath.c                             |    5 +--
 net/openvswitch/datapath.h                             |    1 -
 net/openvswitch/flow_netlink.c                         |   23 ++++++-----
 net/openvswitch/flow_netlink.h                         |    6 +--
 net/openvswitch/vport-geneve.c                         |   13 -------
 net/openvswitch/vport-gre.c                            |    8 ----
 net/openvswitch/vport-internal_dev.c                   |   46 ++++++++++++++++++++--
 net/openvswitch/vport-vxlan.c                          |   19 ---------
 net/openvswitch/vport.c                                |   58 ---------------------------
 net/openvswitch/vport.h                                |   35 -----------------
 net/rds/tcp_recv.c                                     |   11 +++++-
 net/sysctl_net.c                                       |    6 ++-
 net/tipc/bcast.c                                       |    8 ++--
 net/tipc/msg.c                                         |   12 ++++--
 net/tipc/udp_media.c                                   |    5 +++
 net/vmw_vsock/af_vsock.c                               |    7 ++--
 net/vmw_vsock/vmci_transport.c                         |  173 +++++++++++++++++++++++++++++++++++++++------------------------------------------
 net/vmw_vsock/vmci_transport.h                         |    4 +-
 net/xfrm/xfrm_user.c                                   |    4 +-
 samples/bpf/bpf_helpers.h                              |   12 ++++++
 91 files changed, 919 insertions(+), 478 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/smsc-lan87xx.txt
 create mode 100644 drivers/net/phy/dp83848.c
 create mode 100644 include/linux/overflow-arith.h
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds Oct. 28, 2015, 9:39 a.m. UTC | #1

On Wed, Oct 28, 2015 at 3:32 PM, David Miller <davem@davemloft.net> wrote:
>
> This may look a bit scary this late in the release cycle, but as is typically
> the case it's predominantly small driver fixes all over the place.

Christ people. This is just sh*t.

The conflict I get is due to stupid new gcc header file crap. But what
makes me upset is that the crap is for completely bogus reasons.

This is the old code in net/ipv6/ip6_output.c:

        mtu -= hlen + sizeof(struct frag_hdr);

and this is the new "improved" code that uses fancy stuff that wants
magical built-in compiler support and has silly wrapper functions for
when it doesn't exist:

        if (overflow_usub(mtu, hlen + sizeof(struct frag_hdr), &mtu) ||
            mtu <= 7)
                goto fail_toobig;

and anybody who thinks that the above is

 (a) legible
 (b) efficient (even with the magical compiler support)
 (c) particularly safe

is just incompetent and out to lunch.

The above code is sh*t, and it generates shit code. It looks bad, and
there's no reason for it.

The code could *easily* have been done with just a single and
understandable conditional, and the compiler would actually have
generated better code, and the code would look better and more
understandable. Why is this not

        if (mtu < hlen + sizeof(struct frag_hdr) + 8)
                goto fail_toobig;
        mtu -= hlen + sizeof(struct frag_hdr);

which is the same number of lines, doesn't use crazy helper functions
that nobody knows what they do, and is much more obvious what it
actually does.

I guarantee that the second more obvious version is easier to read and
understand. Does anybody really want to dispute this?

Really. Give me *one* reason why it was written in that idiotic way
with two different conditionals, and a shiny new nonstandard function
that wants particular compiler support to generate even half-way sane
code, and even then generates worse code? A shiny function that we
have never ever needed anywhere else, and that is just
compiler-masturbation.

And yes, you still could have overflow issues if the whole "hlen +
xyz" expression overflows, but quite frankly, the "overflow_usub()"
code had that too. So if you worry about that, then you damn well
didn't do the right thing to begin with.

So I really see no reason for this kind of complete idiotic crap.

Tell me why. Because I'm not pulling this kind of completely insane
stuff that generates conflicts at rc7 time, and that seems to have
absolutely no reason for being anm idiotic unreadable mess.

The code seems *designed* to use that new "overflow_usub()" code. It
seems to be an excuse to use that function.

And it's a f*cking bad excuse for that braindamage.

I'm sorry, but we don't add idiotic new interfaces like this for
idiotic new code like that.

Yes, yes, if this had stayed inside the network layer I would never
have noticed. But since I *did* notice, I really don't want to pull
this. In fact, I want to make it clear to *everybody* that code like
this is completely unacceptable. Anybody who thinks that code like
this is "safe" and "secure" because it uses fancy overflow detection
functions is so far out to lunch that it's not even funny. All this
kind of crap does is to make the code a unreadable mess with code that
no sane person will ever really understand what it actually does.

Get rid of it. And I don't *ever* want to see that shit again.

                Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Hannes Frederic Sowa Oct. 28, 2015, 11:03 a.m. UTC | #2

Hi Linus,

On Wed, Oct 28, 2015, at 10:39, Linus Torvalds wrote:
> On Wed, Oct 28, 2015 at 3:32 PM, David Miller <davem@davemloft.net>
> wrote:
> >
> > This may look a bit scary this late in the release cycle, but as is typically
> > the case it's predominantly small driver fixes all over the place.
> 
> Christ people. This is just sh*t.
> 
> The conflict I get is due to stupid new gcc header file crap. But what
> makes me upset is that the crap is for completely bogus reasons.
> 
> This is the old code in net/ipv6/ip6_output.c:
> 
>         mtu -= hlen + sizeof(struct frag_hdr);
> 
> and this is the new "improved" code that uses fancy stuff that wants
> magical built-in compiler support and has silly wrapper functions for
> when it doesn't exist:
> 
>         if (overflow_usub(mtu, hlen + sizeof(struct frag_hdr), &mtu) ||
>             mtu <= 7)
>                 goto fail_toobig;
> 
> and anybody who thinks that the above is
> 
>  (a) legible
>  (b) efficient (even with the magical compiler support)
>  (c) particularly safe

I still want to present an argument in favor of those overflow
functions:

On a very quick look it is obvious that someone cared about wrap-around
or overflow on this code without diagnosing the checks above.

And that was the reason I tried to use it.

> is just incompetent and out to lunch.
> 
> The above code is sh*t, and it generates shit code. It looks bad, and
> there's no reason for it.

Yes, overflow_usub is a bad example where IMHO the compiler cannot
improve a lot.

I think it gets more interesting in case of signed integers where the
compiler can simply generate a seto instruction instead of manually
checking the input variables for ranges before doing the calculation.
E.g. especially for multiplication it is quite clear.

> The code could *easily* have been done with just a single and
> understandable conditional, and the compiler would actually have
> generated better code, and the code would look better and more
> understandable. Why is this not
> 
>         if (mtu < hlen + sizeof(struct frag_hdr) + 8)
>                 goto fail_toobig;
>         mtu -= hlen + sizeof(struct frag_hdr);
> 
> which is the same number of lines, doesn't use crazy helper functions
> that nobody knows what they do, and is much more obvious what it
> actually does.
> 
> I guarantee that the second more obvious version is easier to read and
> understand. Does anybody really want to dispute this?

When reading through the code I have to jump back from the third line
back to the first one just to check if the lengths adds up. I absolutely
see your point, but I don't find the overflow_* helpers horrid but still
useful. If one is used to how the arguments line up on the overflow
helpers I find them quite easy to read.

> Really. Give me *one* reason why it was written in that idiotic way
> with two different conditionals, and a shiny new nonstandard function
> that wants particular compiler support to generate even half-way sane
> code, and even then generates worse code? A shiny function that we
> have never ever needed anywhere else, and that is just
> compiler-masturbation.

I agree as a bugfix I could have find a simpler solution.

> And yes, you still could have overflow issues if the whole "hlen +
> xyz" expression overflows, but quite frankly, the "overflow_usub()"
> code had that too. So if you worry about that, then you damn well
> didn't do the right thing to begin with.

Sure, there was no need to test against that because it couldn't.

> So I really see no reason for this kind of complete idiotic crap.
> 
> Tell me why. Because I'm not pulling this kind of completely insane
> stuff that generates conflicts at rc7 time, and that seems to have
> absolutely no reason for being anm idiotic unreadable mess.

I can understand that and will fix it up asap for rc7.

> The code seems *designed* to use that new "overflow_usub()" code. It
> seems to be an excuse to use that function.

Somehow I feel a bit guilty here. :) Actually I do find them quite
handy.

> And it's a f*cking bad excuse for that braindamage.
> 
> I'm sorry, but we don't add idiotic new interfaces like this for
> idiotic new code like that.
> 
> Yes, yes, if this had stayed inside the network layer I would never
> have noticed. But since I *did* notice, I really don't want to pull
> this. In fact, I want to make it clear to *everybody* that code like
> this is completely unacceptable. Anybody who thinks that code like
> this is "safe" and "secure" because it uses fancy overflow detection
> functions is so far out to lunch that it's not even funny. All this
> kind of crap does is to make the code a unreadable mess with code that
> no sane person will ever really understand what it actually does.
> 
> Get rid of it. And I don't *ever* want to see that shit again.

I don't want to give up on that this easily:

In future I would like to see an interface like this. It is often hard
to do correct overflow/wrap-around tests and it would be great if there
are helper functions which could easily and without a lot of thinking be
used by people to remove those problems from the kernel. While the
interface is at first difficult to use it is still much easier and less
error-prone than trying to come up with integer overflow checks e.g. for
multiplication in the integer domain.

It is not that the Linux kernel already had security vulnerabilities
because of missing overflow checks and explicitly pointing out that for
this variable it is handled seems like a good thing to me.

I will revert those patches in net and send them over to DaveM but I
think such an interface would still be nice to have.

Are you absolutely against such an interface in future? Don't you like
the design on how arguments are handled? Could I improve on that?

Thanks,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

David Miller Oct. 28, 2015, 1:21 p.m. UTC | #3

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 28 Oct 2015 18:39:56 +0900

> Get rid of it. And I don't *ever* want to see that shit again.

No problem, I'll revert it all.

I asked Hannes to repost his patches to linux-kernel hoping someone
would review and say it stunk or not, give him some feedback, or
whatever, and nobody reviewed the changes at all...

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Rasmus Villemoes Oct. 28, 2015, 2:27 p.m. UTC | #4

On Wed, Oct 28 2015, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:

> Hi Linus,
>
> On Wed, Oct 28, 2015, at 10:39, Linus Torvalds wrote:
>> Get rid of it. And I don't *ever* want to see that shit again.
>
> I don't want to give up on that this easily:
>
> In future I would like to see an interface like this. It is often hard
> to do correct overflow/wrap-around tests and it would be great if there
> are helper functions which could easily and without a lot of thinking be
> used by people to remove those problems from the kernel.

I agree - proper overflow checking can be really hard. Quick, assuming a
and b have the same unsigned integer type, is 'a+b<a' sufficient to
check overflow? Of course not (hint: promotion rules). And as you say,
it gets even more complicated for signed types.

A few months ago I tried posting a complete set of fallbacks for older
compilers (https://lkml.org/lkml/2015/7/19/358), but nothing really
happened. Now I know where Linus stands, so I guess I can just delete
that branch.

Rasmus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andy Lutomirski Nov. 2, 2015, 8:34 p.m. UTC | #5

On 10/28/2015 02:39 AM, Linus Torvalds wrote:
>
> I'm sorry, but we don't add idiotic new interfaces like this for
> idiotic new code like that.

As one of the people who encouraged gcc to add this interface, I'll 
speak up in its favor:

Getting overflow checking right in more complicated cases is a PITA. 
I'll admit that the "subtract from an unsigned integer if it won't go 
negative" isn't particularly useful, but there are other cases in which 
it's much more useful.

The one I care about the most is for multiplication.  Witness the 
never-ending debates about the proper way to implement things like 
kmalloc_array.  We currently do:

static inline void *kmalloc_array(size_t n, size_t size, gfp_t flags)
{
         if (size != 0 && n > SIZE_MAX / size)
                 return NULL;
         return __kmalloc(n * size, flags);
}

This is correct, and it's even reasonably efficient if size is a 
compile-time constant.  (On x86, it still might not be quite optimal, 
since there'll be an extra cmp instruction.  Sure, the difference could 
easily be a cycle or even less.)

But if size is not a constant, then, unless the compiler is quite 
clever, this ends up generating a division, and that sucks.

If we were willing to do:

size_t total_bytes;
#if efficient_overflow_detection_works
if (__builtin_mul_overflow(n, size, &total_bytes))
         return NULL;
#else
/* existing check goes here */
total_bytes = n * size;
#endif
return __kmalloc(n * size, flags);

then we get optimal code generation on new compilers and the result 
isn't even that ugly to look at.

For compiler flag settings in which signed overflow can cause subtle 
disasters, the signed addition overflow helpers can be nice, too.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds Nov. 2, 2015, 9:16 p.m. UTC | #6

On Mon, Nov 2, 2015 at 12:34 PM, Andy Lutomirski <luto@kernel.org> wrote:
> On 10/28/2015 02:39 AM, Linus Torvalds wrote:
>>
>> I'm sorry, but we don't add idiotic new interfaces like this for
>> idiotic new code like that.
>
>
> As one of the people who encouraged gcc to add this interface, I'll speak up
> in its favor:
>
> Getting overflow checking right in more complicated cases is a PITA.

No it is not. Not for unsigned values.

Stop this idiocy. Yes, overflow for signed integers can be complex.
But not for unsigned ones.

And that disgusting "overflow_usub()" in no way makes the code more
readable. EVER.

So stop just making things up. A helper function *could* have been
more legible and more efficient, if it had been about something
completely different.

But in this case it really wasn't. It wasn't more efficient, it wasn't
more legible, and it simply had no excuse for it. Stop making excuses
for shit.

                 Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds Nov. 2, 2015, 9:19 p.m. UTC | #7

On Mon, Nov 2, 2015 at 1:16 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 2, 2015 at 12:34 PM, Andy Lutomirski <luto@kernel.org> wrote:
>>
>> Getting overflow checking right in more complicated cases is a PITA.
>
> No it is not. Not for unsigned values.

Just to clarify. The "oevrflow" test for unsigned subtracts  of "a-b"
(it's really an underflow, but whatever) really is just

   (b > a)

Really. That's it. Claiming that that is "complicated" and needs a
helper function is not something sane people do. A fifth-grader that
isn't good at math can understand that.

In contrast, nobody sane understands "usub_overflow(a, b, &res)".

So really. Stop making inane arguments.

                  Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andy Lutomirski Nov. 2, 2015, 9:30 p.m. UTC | #8

On Mon, Nov 2, 2015 at 1:19 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Nov 2, 2015 at 1:16 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Mon, Nov 2, 2015 at 12:34 PM, Andy Lutomirski <luto@kernel.org> wrote:
>>>
>>> Getting overflow checking right in more complicated cases is a PITA.
>>
>> No it is not. Not for unsigned values.
>
> Just to clarify. The "oevrflow" test for unsigned subtracts  of "a-b"
> (it's really an underflow, but whatever) really is just
>
>    (b > a)
>
> Really. That's it. Claiming that that is "complicated" and needs a
> helper function is not something sane people do. A fifth-grader that
> isn't good at math can understand that.
>
> In contrast, nobody sane understands "usub_overflow(a, b, &res)".
>
> So really. Stop making inane arguments.

I'll stop making inane arguments if you stop bashing arguments I
didn't make. :)  I said the helpers were useful for multiplication (by
which I meant both signed and unsigned) and, to a lesser extent, for
signed addition and subtraction.

I don't believe I even tried to justify usub_overflow as anything
other than an extremely minor optimization that probably isn't
worthwhile.

--Andy, who still has inline asm that does 'cmovo' and such in his
code for work, sigh.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Hannes Frederic Sowa Nov. 2, 2015, 10:14 p.m. UTC | #9

Hello,

On Mon, Nov 2, 2015, at 22:30, Andy Lutomirski wrote:
> On Mon, Nov 2, 2015 at 1:19 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Mon, Nov 2, 2015 at 1:16 PM, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >> On Mon, Nov 2, 2015 at 12:34 PM, Andy Lutomirski <luto@kernel.org> wrote:
> >>>
> >>> Getting overflow checking right in more complicated cases is a PITA.
> >>
> >> No it is not. Not for unsigned values.
> >
> > Just to clarify. The "oevrflow" test for unsigned subtracts  of "a-b"
> > (it's really an underflow, but whatever) really is just
> >
> >    (b > a)
> >
> > Really. That's it. Claiming that that is "complicated" and needs a
> > helper function is not something sane people do. A fifth-grader that
> > isn't good at math can understand that.
> >
> > In contrast, nobody sane understands "usub_overflow(a, b, &res)".
> >
> > So really. Stop making inane arguments.
> 
> I'll stop making inane arguments if you stop bashing arguments I
> didn't make. :)  I said the helpers were useful for multiplication (by
> which I meant both signed and unsigned) and, to a lesser extent, for
> signed addition and subtraction.
> 
> I don't believe I even tried to justify usub_overflow as anything
> other than an extremely minor optimization that probably isn't
> worthwhile.

overflow_usub was part of a larger header I already prepared to offer
support for *all* overflow_* checking builtins. While fixing this IPv6
bug I thought I could hopefully introduce this interface slowly and
simply cut away the other versions.

The reasoning from my side, while I totally agree that unsigned add/sub
is very easy to test for, is that after using those builtins for a while
I absolutely don't consider them in any way unpleasant to read (but
mainly with signed operations, I have to admit).

They do the one operation and if something overflows or wraps around,
error out and enforce the failure case on the true branch. I simply like
the name 'overflow' (also it could be changed to wraparound in unsigned
operations) in the code. Also chaining multiple operations where each
one could overflow can simply be put into a single if with short-eval
or.

Simply, my point in submitting overflow_usub was to provide all helpers
sometime soon in this way but not submit any without users. I simply was
working down the list.

> --Andy, who still has inline asm that does 'cmovo' and such in his
> code for work, sigh.

In the past I also prepared some inline asm functions with
__builtin_constant_p and inline asm to do those checks but considered
them to complicated for submission. ;)

Linus, I am totally fine with leaving out the usub and uadd operations
but hope you are willing to accept the multiplication and signed add/sub
versions.

Bye,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds Nov. 2, 2015, 11:21 p.m. UTC | #10

On Mon, Nov 2, 2015 at 2:14 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
>
> overflow_usub was part of a larger header I already prepared to offer
> support for *all* overflow_* checking builtins. While fixing this IPv6
> bug I thought I could hopefully introduce this interface slowly and
> simply cut away the other versions.

Hell no.

Both you and Andy seem to argue that "since there are other totally
unrelated functions that look superficially similar and actually some
sense, we should add these stupid crap functions too".

In exactly *WHAT* crazy universe does that make sense as an argument?

It's like saying "I put literal shit on your plate, because there are
potentially nutritious sausages that look superficially a bit like the
dogshit I served you".

Seriously.

The fact that _valid_ overflow checking functions exist in _no_ way
support the crap that I got.

It's *exactly* the same argument as "dog poop superficially looks like
good sausages".

Is that really your argument?

There is never an excuse for "usub_overflow()". It's that simple.

No amount of _other_ overflow functions make that shit palatable.

                   Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Benjamin Herrenschmidt Nov. 3, 2015, 12:56 a.m. UTC | #11

On Mon, 2015-11-02 at 13:30 -0800, Andy Lutomirski wrote:
> 
> I'll stop making inane arguments if you stop bashing arguments I
> didn't make. :)  I said the helpers were useful for multiplication (by
> which I meant both signed and unsigned) and, to a lesser extent, for
> signed addition and subtraction.
> 
> I don't believe I even tried to justify usub_overflow as anything
> other than an extremely minor optimization that probably isn't
> worthwhile.

Also how much of the problem is simply that the function signature
(naming and choice of arguments) just plain sucks ?

Cheers,
Ben.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds Nov. 3, 2015, 1:54 a.m. UTC | #12

On Mon, Nov 2, 2015 at 4:56 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
>
> Also how much of the problem is simply that the function signature
> (naming and choice of arguments) just plain sucks ?

Some of that is pretty much inevitable.

C really has no good way to return multiple values. The traditional
(pass pointer to fill in result) one simply doesn't result in
good-looking code.

We've occasionally done it by simply breaking C syntax: see "do_div()"
(which returns a value *and* just changes the first argument directly
as a macro). People have tended to absolutely hate it, and while it
can be very practical, it has certainly also resulted in people being
confused. It was originally done for printing numbers, where the whole
"return remainder and divide argument" model was really convenient.

Sometimes we've done it by knowing the value space: the whole "return
error value _or_ a resulting pointer value" by just knowing the
domains (ie "ERR_PTR()" end friends). That tends to work really badly
for arithmetic overflows, though.

And at other times, we've returned structures, which can efficiently
contain two words, and gcc generates reasonable code for.

The *natural* thing to do would actually be to trap and set a flag. We
do that with

    put_user_try {
        ...
    } put_user_catch(err);

which sets "err" if one of the "put_user_ex()" or calls in between traps.

The "try/catch" model would probably be the best one syntactically for
overflow handling.  It could even be done with macros (ie the "catch"
thing would contain a special overflow label, and the "overflow
functions" would then just jump to that labeln in the error case as a
way to avoid the "return two different values" thing).

Of course, try/catch only really makes sense if you have multiple
operations that can overflow in different parts. That's can be the
pattern in many other cases, but for the kernel it's quite unusual.
It's seldom more than one single operation we need to worry about in
any particular sequence (the "put_user_try/catch" use we have is
exactly because signal handling writes multiple different values to
the same stack when it generates the stack frame).

And with all that said, realistically, in the kernel we seldom have a
ton of complex overflow issues. Most of the values we have are
unsigned, and we have historically just done them manually with

    sum = a+b;
    if (sum < a)
         .. handle error ..

which really doesn't get much better from any syntactic stuff around
it (because any other syntax will involve the whole "how do I return
two values" problem and make it less legible).

Gcc is even smart enough to turn that into a "just check the carry
flag" if the patterns are simple enough, so the simple approach can
even generate optimal code.

The biggest problem - and where the compiler could actually help us -
tends to be multiplication overflows. We have several (not *many*, but
certainly more than just a couple) cases where we simply check by
dividing MAX_INT or something.

See for example kmalloc_array(), which does

        if (size != 0 && n > SIZE_MAX / size)
                return NULL;

exactly to avoid the overflow when it does the "n*size" allocation.

So for multiplication, we really *could* use overflow logic. It's not
horribly common, but it definitely happens.

Signed integer overflows are annoying even for simple addition and
subtraction, but I can't off-hand recall any real case where that was
even an issue in the kernel.

                  Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andy Lutomirski Nov. 3, 2015, 1:58 a.m. UTC | #13

On Mon, Nov 2, 2015 at 5:54 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> The biggest problem - and where the compiler could actually help us -
> tends to be multiplication overflows. We have several (not *many*, but
> certainly more than just a couple) cases where we simply check by
> dividing MAX_INT or something.
>
> See for example kmalloc_array(), which does
>
>         if (size != 0 && n > SIZE_MAX / size)
>                 return NULL;
>
> exactly to avoid the overflow when it does the "n*size" allocation.
>
> So for multiplication, we really *could* use overflow logic. It's not
> horribly common, but it definitely happens.
>

Based in part on an old patch by Sasha, what if we relied on CSE:

if (mul_would_overflow(size, n))
  return NULL;
do_something_with(size * n);

I haven't checked, but it would be sad if gcc couldn't optimize this
correctly if we use the builtins.

The downside is that I don't see off the top of my head how this could
be implemented using inline asm if we want a fast fallback when the
builtins aren't available.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds Nov. 3, 2015, 2:38 a.m. UTC | #14

On Mon, Nov 2, 2015 at 5:58 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>
> Based in part on an old patch by Sasha, what if we relied on CSE:
>
> if (mul_would_overflow(size, n))
>   return NULL;
> do_something_with(size * n);

I suspect we wouldn't even have to rely on CSE. Are these things in
performance-critical areas? I suspect our current "use divides" is
actually slower than just using two multiplies, even if one is only
used for overflow checking.

That said, I also think that for something like this, where we
actually have a *reason* for using a special overflow helper function,
we could just try to use the gcc syntax.

I don't think it's wonderful syntax, but at least there's an excuse
for odd/ugly code in those kinds of situations. The reason I hated the
unsigned subtraction case so much was that the simple obvious code
just avoids all those issues entirely, and there wasn't any real
*reason* for the nasty syntax. For multiplication overflow, we'd at
least have a reason.

Sadly, the *nice* syntax, where we'd do something like "goto label"
when the multiply overflows, does not mesh well with inline asm.  Yes,
gcc now has "asm goto", but you can't use it with an output value ;(

But from a syntactic standpoint, the best syntax might actually be
something like

        mul = multiply_with_overflow(size, n, error_label);
        do_something_with(mul);

    error_label:
        return NULL;

and it would *almost* be possible to do this with inline asm if it
wasn't for the annoying "no output values" case.  There are many other
cases where I'd have wanted to do this (ie the whole "fetch value from
user space, but if an exception happens, point the exception handler
at the label).

Back in May, we talked to the gcc people about allowing output values
that are unspecified for the "goto" case (so only a fallthrough would
have them set), but I think that that doesn't match how gcc internally
thinks about branching instructions..

But you could still hide it inside a macro and make it expand to something like

   #define multiply_with_overflow(size, n, error_label) ({
        unsigned long result, error; \
        .. do multiply in asm, set result and error... \
        if (error) goto error_label;
        result; })

which would allow the above kind of odd hand-coded try/catch model in
C. Which I think would be pretty understandable and not very prone to
getting it wrong. Hmm?

                     Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Hannes Frederic Sowa Nov. 3, 2015, 12:53 p.m. UTC | #15

Hello,

On Tue, Nov 3, 2015, at 03:38, Linus Torvalds wrote:
> On Mon, Nov 2, 2015 at 5:58 PM, Andy Lutomirski <luto@amacapital.net>
> wrote:
> >
> > Based in part on an old patch by Sasha, what if we relied on CSE:
> >
> > if (mul_would_overflow(size, n))
> >   return NULL;
> > do_something_with(size * n);
> 
> I suspect we wouldn't even have to rely on CSE. Are these things in
> performance-critical areas? I suspect our current "use divides" is
> actually slower than just using two multiplies, even if one is only
> used for overflow checking.

And furthermore we don't actually have to rely on CSE if we want to, our
overflow checks could look much more simpler as in "ordinary" C code
because we tell the compiler that signed overflow is defined throughout
the kernel ( -fno-strict-overflow). Thus the checks can be done after
the calculations.

> That said, I also think that for something like this, where we
> actually have a *reason* for using a special overflow helper function,
> we could just try to use the gcc syntax.
>
> I don't think it's wonderful syntax, but at least there's an excuse
> for odd/ugly code in those kinds of situations. The reason I hated the
> unsigned subtraction case so much was that the simple obvious code
> just avoids all those issues entirely, and there wasn't any real
> *reason* for the nasty syntax. For multiplication overflow, we'd at
> least have a reason.
> 
> Sadly, the *nice* syntax, where we'd do something like "goto label"
> when the multiply overflows, does not mesh well with inline asm.  Yes,
> gcc now has "asm goto", but you can't use it with an output value ;(

I don't understand why you consider inline asm? Those builtins already
normally produce very reasonable code (and yes, I checked). We can wrap
the gcc builtins anyway and adapt the syntax as needed. inline asm does
prohibit constant folding etc, so a __builtin_constant_p check would be
necessary or helpful further adding complexity.

> But from a syntactic standpoint, the best syntax might actually be
> something like
> 
>         mul = multiply_with_overflow(size, n, error_label);
>         do_something_with(mul);
> 
>     error_label:
>         return NULL;
> 
> and it would *almost* be possible to do this with inline asm if it
> wasn't for the annoying "no output values" case.  There are many other
> cases where I'd have wanted to do this (ie the whole "fetch value from
> user space, but if an exception happens, point the exception handler
> at the label).

I don't see the problem with the

  if (multiply_with_overflow(...))
    overflowed_handle_error(...);

multiply_with_overflow can have a __must_check attribute, so we see
warnings if return value is not checked immediately.

It allows chaining easily

if (multiply_with_overflow(...) ||
    multiply_with_overflow(...))
      goto overflow;

without adding checks between the different stages or calculation. It
just composes nicely. The error handling is very explicit.

> Back in May, we talked to the gcc people about allowing output values
> that are unspecified for the "goto" case (so only a fallthrough would
> have them set), but I think that that doesn't match how gcc internally
> thinks about branching instructions..
> 
> But you could still hide it inside a macro and make it expand to
> something like
> 
>    #define multiply_with_overflow(size, n, error_label) ({
>         unsigned long result, error; \
>         .. do multiply in asm, set result and error... \
>         if (error) goto error_label;
>         result; })
> 
> which would allow the above kind of odd hand-coded try/catch model in
> C. Which I think would be pretty understandable and not very prone to
> getting it wrong. Hmm?

Hiding branches in macros seems not to be a good idea to me at all. I
actually think a lot of users in functions would simply check their
arguments and return -EINVAL in case they overflow. Forcing them to do a
jump seems inappropriate.

I also don't think that reordering the arguments makes a lot of sense:

bool overflow;
int a = multiply_with_overflow(b, c, &overflow);
if (overflow)
  error out;

This scheme might be composable if we ||= the overflow flag in the
helper functions/macros and force the user to initialize the overflow
boolean it to false in the beginning. Way too many things that can go
wrong and an auditor has to verify.

Bye,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds Nov. 3, 2015, 8:05 p.m. UTC | #16

On Tue, Nov 3, 2015 at 4:53 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
>
> And furthermore we don't actually have to rely on CSE if we want to, our
> overflow checks could look much more simpler as in "ordinary" C code
> because we tell the compiler that signed overflow is defined throughout
> the kernel ( -fno-strict-overflow). Thus the checks can be done after
> the calculations.

Yes. We could actually do overflow checks by just verifying the
result, even for signed stuff.

> I don't understand why you consider inline asm? Those builtins already
> normally produce very reasonable code

Those built-ins only exist with gcc-5+, afaik.

We'll have people who rely on old versions of gcc for *years* after
gcc5 is commonly available. They'll be running enterprise distros or
debian-stable or things like that.

So we do need to have reasonable backwards compatibility functions.
For things like multiplication overflow, inline asm may be the best
way to do that.

That said, we'll be able to work around it, I'm sure. But no, we're
not going to be in the situation where we just know we can use the
builtins.

> I don't see the problem with the
>
>   if (multiply_with_overflow(...))
>     overflowed_handle_error(...);

I do agree that it's likely not a big issue.

That said, I may be influenced by hardware design, but I think I'm
also influenced by traditional good C rules: I like functions that
return the *result*, so that the result can be used in a chain of
calculations. Like hardware, the "overflow" bit is separate and I
actually think the gcc overflow functions did the calling convention
wrong.

So even if we do the "pass one of the results by reference" thing, I'd
much rather that "pas by reference" be for the overflow condition.

And hardware that does it well tends to not just give an "overflow"
result, but a "summary overflow", so that you can do multiple
operations in series, and then just check the "summary overflow" at
the end.

So my gut feel is that overflow should either be an exception (ie the
whole "jump to another place" model), or it should be a flag value,
but it shouldn't be the "result" of the function.

For example, one of the overflow issues we've had occasionally has
been not about a single op, but a series of operations:
"multiply-and-add". Look at __timespec64_to_jiffies(), for example,
where the operation that can overflow is "seconds * SEC_CONVERSION +
nsec * NSEC_CONVERSION".

Now, in that case we currently handle the overflow by just knowing
that 'nsec' had better follow certain rules, so we can simply check
seconds against a known maximum, and we don't need to get the "exact"
overflow condition. And quite honestly, that may end up *always* being
the right thing to do - there just isn't any real reason to worry
about individual operations overflowing.

But imagine that we did. The "summary overflow" interface would allow
us to do something like

     bool overflow = 0;

     result = add_overflow(
        mul_overflow(sec, SEC_CONVERSION, &overflow),
        mul_overflow(nsec, NSEC_CONVERSION, &overflow),
        &overflow);

     return overflow ? MAX_JIFFIES : result;

which I'm not at all actually advocating (because (a) I think the
current code is simpler and (b) I don't like the silly
"add_overflow()" anyway), but that I'm giving as an example of why I
think the gcc builtin result passing choice looks a bit odd to me.

> multiply_with_overflow can have a __must_check attribute, so we see
> warnings if return value is not checked immediately.

Yes. There may be advantages to that too. That said, I'm not seeing
that as a big deal. If you use the overflow functions and don't check
the overflow condition, you kind of have bigger issues than "I'd like
to get a compiler warning". That's more of a "WTF is the person doing"
thing).

                  Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds Nov. 3, 2015, 8:44 p.m. UTC | #17

On Tue, Nov 3, 2015 at 12:05 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>      result = add_overflow(
>         mul_overflow(sec, SEC_CONVERSION, &overflow),
>         mul_overflow(nsec, NSEC_CONVERSION, &overflow),
>         &overflow);
>
>      return overflow ? MAX_JIFFIES : result;

Thinking more about this example, I think the gcc interface for
multiplication overflow is fine.

It would end up something like

    if (mul_overflow(sec, SEC_CONVERSION, &sec))
        return MAX_JIFFY_OFFSET;
    if (mul_overflow(nsec, NSEC_CONVERSION, &nsec))
        return MAX_JIFFY_OFFSET;
    sum = sec + nsec;
    if (sum < sec || sum > MAX_JIFFY_OFFSET)
        return MAX_JIFFY_OFFSET;
    return sum;

and that doesn't look horribly ugly to me.

That said, I do wonder how many of our interfaces really want
overflow, and how many just want saturation (or even clamping to a
maximum value).

For example, one of the *common* cases of multiplication overflow we
have had is for memory allocation where we do things like

    buffer = kmalloc(sizeof(something) * nr, GFP_KERNEL);

and we've fixed them by moving them to 'kcalloc()'. But as with the
jiffies conversion above, it would actually be sufficient to just
saturate to a maximum value instead, and depending on that causing the
allocation to fail.

So it may actually be that most users really don't even *want* "overflow".

Does anybody have any particular other "uhhuh, overflow in
multiplication" issues in mind? Because the interface for a saturating
multiplication (or addition, for that matter) would actually be much
easier. And would be trivial to have as an inline asm for
compatibility with older versions of gcc too.

Then you could just do that jiffies conversion - or allocation, for
that matter - without any special overflow handling at all. Doing

    buf = kmalloc(sat_mul(sizeof(x), nr), GFP_KERNEL);

would just magically work.

And the above jiffies conversion would still want to clamp things to
MAX_JIFFY_OFFSET (because we consider "jiffies" to be an offset from
now, and while it's "unsigned long", we clamp the offset to half the
range), but it would still be a rather natural model for it too.

                    Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

David Laight Nov. 6, 2015, 3:27 p.m. UTC | #18

> From: Linus Torvalds

> Sent: 03 November 2015 20:45

> On Tue, Nov 3, 2015 at 12:05 PM, Linus Torvalds

> <torvalds@linux-foundation.org> wrote:

> >      result = add_overflow(

> >         mul_overflow(sec, SEC_CONVERSION, &overflow),

> >         mul_overflow(nsec, NSEC_CONVERSION, &overflow),

> >         &overflow);

> >

> >      return overflow ? MAX_JIFFIES : result;

> 

> Thinking more about this example, I think the gcc interface for

> multiplication overflow is fine.

> 

> It would end up something like

> 

>     if (mul_overflow(sec, SEC_CONVERSION, &sec))

>         return MAX_JIFFY_OFFSET;

>     if (mul_overflow(nsec, NSEC_CONVERSION, &nsec))

>         return MAX_JIFFY_OFFSET;

>     sum = sec + nsec;

>     if (sum < sec || sum > MAX_JIFFY_OFFSET)

>         return MAX_JIFFY_OFFSET;

>     return sum;

> 

> and that doesn't look horribly ugly to me.

If mul_overflow() is a real function you've just forced some of the
values out to memory, generating a 'clobber' for all memory
(unless 'strict-aliasing' is enabled) and making a mess of other
optimisations.
(If it is a static inline that might not happen.)

If you assume that no one is stupid enough to multiply very large
values by 1 and not get an error you could have mul_overflow()
return the largest prime if the multiply overflowed.

	David

Andy Lutomirski Nov. 7, 2015, 12:49 a.m. UTC | #19

On Fri, Nov 6, 2015 at 7:27 AM, David Laight <David.Laight@aculab.com> wrote:
>> From: Linus Torvalds
>> Sent: 03 November 2015 20:45
>> On Tue, Nov 3, 2015 at 12:05 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> >      result = add_overflow(
>> >         mul_overflow(sec, SEC_CONVERSION, &overflow),
>> >         mul_overflow(nsec, NSEC_CONVERSION, &overflow),
>> >         &overflow);
>> >
>> >      return overflow ? MAX_JIFFIES : result;
>>
>> Thinking more about this example, I think the gcc interface for
>> multiplication overflow is fine.
>>
>> It would end up something like
>>
>>     if (mul_overflow(sec, SEC_CONVERSION, &sec))
>>         return MAX_JIFFY_OFFSET;
>>     if (mul_overflow(nsec, NSEC_CONVERSION, &nsec))
>>         return MAX_JIFFY_OFFSET;
>>     sum = sec + nsec;
>>     if (sum < sec || sum > MAX_JIFFY_OFFSET)
>>         return MAX_JIFFY_OFFSET;
>>     return sum;
>>
>> and that doesn't look horribly ugly to me.
>
> If mul_overflow() is a real function you've just forced some of the
> values out to memory, generating a 'clobber' for all memory
> (unless 'strict-aliasing' is enabled) and making a mess of other
> optimisations.
> (If it is a static inline that might not happen.)

I doubt anyone would ever make it a real function.  On new gcc, it
would be an inline backed by an intrinsic.  On old gcc it would be a
normal inline or perhaps an inline with inline asm in it.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ingo Molnar Nov. 9, 2015, 8:12 a.m. UTC | #20

* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Does anybody have any particular other "uhhuh, overflow in multiplication" 
> issues in mind? Because the interface for a saturating multiplication (or 
> addition, for that matter) would actually be much easier. And would be trivial 
> to have as an inline asm for compatibility with older versions of gcc too.
> 
> Then you could just do that jiffies conversion - or allocation, for that matter 
> - without any special overflow handling at all. Doing
> 
>     buf = kmalloc(sat_mul(sizeof(x), nr), GFP_KERNEL);
> 
> would just magically work.

Exactly: saturation is the default behavior for many GPU vector/pixel attributes 
as well, to simplify and speed up the code and the hardware. I always wanted our 
ABIs to saturate instead of duplicating complexity with overflow failure logic.

In the kernel the first point of failure is missing overflow checks. The second 
point of failure are buggy overflow checks. We can eliminate both if we just use 
safe operations that produce output that never exit the valid range. This also 
happens to result in the simplest code. We should start thinking of overflow 
checks as rootkit enablers.

And note how much this simplifies review and static analysis: if this is the 
dominant model used in new kernel code then the analysis (human or machine) would 
only have to ensure that no untrusted input values get multiplied (or added) in an 
unsafe way. It would not have to be able to understand and track any 'overflow 
logic' through a maze of return paths, and validate whether the 'overflow logic' 
is correct for all input parameter ranges...

The flip side is marginally less ABI robustness: random input parameters due to 
memory corruption will just saturate and produce nonsensical results. I don't 
think it's a big issue, and I also think the simplicity of input parameter 
validation is _way_ more important than our behavior to random input - but I've 
been overruled in the past when trying to introduce saturating ABIs, so saturation 
is something people sometimes find inelegant.

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Hannes Frederic Sowa Nov. 9, 2015, 10:38 a.m. UTC | #21

Hello,

Ingo Molnar <mingo@kernel.org> writes:

> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>> Does anybody have any particular other "uhhuh, overflow in multiplication" 
>> issues in mind? Because the interface for a saturating multiplication (or 
>> addition, for that matter) would actually be much easier. And would be trivial 
>> to have as an inline asm for compatibility with older versions of gcc too.
>> 
>> Then you could just do that jiffies conversion - or allocation, for that matter 
>> - without any special overflow handling at all. Doing
>> 
>>     buf = kmalloc(sat_mul(sizeof(x), nr), GFP_KERNEL);
>> 
>> would just magically work.
>
> Exactly: saturation is the default behavior for many GPU vector/pixel attributes 
> as well, to simplify and speed up the code and the hardware. I always wanted our 
> ABIs to saturate instead of duplicating complexity with overflow failure logic.

I don't think saturation arithmetic is useful at all in the kernel as a
replacement for overflow/wrap-around checks. Linus' example has a
discrepancy between what the caller expects and the actual number of
bytes allocated. Imagine sat_mul does the operation in signed char and
kmalloc takes only signed chars as an argument, it could actually be a
huge discrepancy that could lead to security vulnerabilities. The call
should definitely error out here and not try to allocate memory of some
different size and return it to the caller.

> In the kernel the first point of failure is missing overflow checks. The second 
> point of failure are buggy overflow checks. We can eliminate both if we just use 
> safe operations that produce output that never exit the valid range. This also 
> happens to result in the simplest code. We should start thinking of overflow 
> checks as rootkit enablers.

Sorry, I don't understand that at all. sat_mul is a rootkit enabler, I
fear. If you allocate a smalelr portion of memory as the caller actually
asked for because of saturation logic, this definitely could lead to
memory corruption and hard to diagnose bugs.

> And note how much this simplifies review and static analysis: if this is the 
> dominant model used in new kernel code then the analysis (human or machine) would 
> only have to ensure that no untrusted input values get multiplied (or added) in an 
> unsafe way. It would not have to be able to understand and track any 'overflow 
> logic' through a maze of return paths, and validate whether the 'overflow logic' 
> is correct for all input parameter ranges...

Sorry, I don't really understand that proposal. :/

> The flip side is marginally less ABI robustness: random input parameters due to 
> memory corruption will just saturate and produce nonsensical results. I don't 
> think it's a big issue, and I also think the simplicity of input parameter 
> validation is _way_ more important than our behavior to random input - but I've 
> been overruled in the past when trying to introduce saturating ABIs, so saturation 
> is something people sometimes find inelegant.

If those nonsensical results are memory corruptions I also don't
agree. I think we need to be very much accurate when dealing with
overflows.

Bye,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Hannes Frederic Sowa Nov. 9, 2015, 10:38 a.m. UTC | #22

Hello,

Ingo Molnar <mingo@kernel.org> writes:

> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>> Does anybody have any particular other "uhhuh, overflow in multiplication" 
>> issues in mind? Because the interface for a saturating multiplication (or 
>> addition, for that matter) would actually be much easier. And would be trivial 
>> to have as an inline asm for compatibility with older versions of gcc too.
>> 
>> Then you could just do that jiffies conversion - or allocation, for that matter 
>> - without any special overflow handling at all. Doing
>> 
>>     buf = kmalloc(sat_mul(sizeof(x), nr), GFP_KERNEL);
>> 
>> would just magically work.
>
> Exactly: saturation is the default behavior for many GPU vector/pixel attributes 
> as well, to simplify and speed up the code and the hardware. I always wanted our 
> ABIs to saturate instead of duplicating complexity with overflow failure logic.

I don't think saturation arithmetic is useful at all in the kernel as a
replacement for overflow/wrap-around checks. Linus' example has a
discrepancy between what the caller expects and the actual number of
bytes allocated. Imagine sat_mul does the operation in signed char and
kmalloc takes only signed chars as an argument, it could actually be a
huge discrepancy that could lead to security vulnerabilities. The call
should definitely error out here and not try to allocate memory of some
different size and return it to the caller.

> In the kernel the first point of failure is missing overflow checks. The second 
> point of failure are buggy overflow checks. We can eliminate both if we just use 
> safe operations that produce output that never exit the valid range. This also 
> happens to result in the simplest code. We should start thinking of overflow 
> checks as rootkit enablers.

Sorry, I don't understand that at all. sat_mul is a rootkit enabler, I
fear. If you allocate a smalelr portion of memory as the caller actually
asked for because of saturation logic, this definitely could lead to
memory corruption and hard to diagnose bugs.

> And note how much this simplifies review and static analysis: if this is the 
> dominant model used in new kernel code then the analysis (human or machine) would 
> only have to ensure that no untrusted input values get multiplied (or added) in an 
> unsafe way. It would not have to be able to understand and track any 'overflow 
> logic' through a maze of return paths, and validate whether the 'overflow logic' 
> is correct for all input parameter ranges...

Sorry, I don't really understand that proposal. :/

> The flip side is marginally less ABI robustness: random input parameters due to 
> memory corruption will just saturate and produce nonsensical results. I don't 
> think it's a big issue, and I also think the simplicity of input parameter 
> validation is _way_ more important than our behavior to random input - but I've 
> been overruled in the past when trying to introduce saturating ABIs, so saturation 
> is something people sometimes find inelegant.

If those nonsensical results are memory corruptions I also don't
agree. I think we need to be very much accurate when dealing with
overflows.

Bye,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Hannes Frederic Sowa Nov. 9, 2015, 12:09 p.m. UTC | #23

Hi,

On Wed, Oct 28, 2015, at 15:27, Rasmus Villemoes wrote:
> On Wed, Oct 28 2015, Hannes Frederic Sowa <hannes@stressinduktion.org>
> wrote:
> 
> > Hi Linus,
> >
> > On Wed, Oct 28, 2015, at 10:39, Linus Torvalds wrote:
> >> Get rid of it. And I don't *ever* want to see that shit again.
> >
> > I don't want to give up on that this easily:
> >
> > In future I would like to see an interface like this. It is often hard
> > to do correct overflow/wrap-around tests and it would be great if there
> > are helper functions which could easily and without a lot of thinking be
> > used by people to remove those problems from the kernel.
> 
> I agree - proper overflow checking can be really hard. Quick, assuming a
> and b have the same unsigned integer type, is 'a+b<a' sufficient to
> check overflow? Of course not (hint: promotion rules). And as you say,
> it gets even more complicated for signed types.
> 
> A few months ago I tried posting a complete set of fallbacks for older
> compilers (https://lkml.org/lkml/2015/7/19/358), but nothing really
> happened. Now I know where Linus stands, so I guess I can just delete
> that branch.

I actually like your approach of being type agnostic a bit more (in
comparison to static inline functions), mostly because of one specific
reason:

The type agnostic __builtin_*_overflow function even do the correct
things if you deal with types smaller than int. Imagine e.g. you want to
add to unsigned chars a and b,

unsigned char a, b;
if (a + b < a)
  goto overflow;
else
  a += b;

The overflow condition will never trigger, as the comparisons will
always be done in the integer domain and a + b < a is never true. I
actually think that this is easy to overlook and the functions should
handle that. The macro version does this quite nicely.

Bye,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Rasmus Villemoes Nov. 9, 2015, 2:16 p.m. UTC | #24

On Mon, Nov 09 2015, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:

> Hi,
>
> On Wed, Oct 28, 2015, at 15:27, Rasmus Villemoes wrote:
>> 
>> I agree - proper overflow checking can be really hard. Quick, assuming a
>> and b have the same unsigned integer type, is 'a+b<a' sufficient to
>> check overflow? Of course not (hint: promotion rules). And as you say,
>> it gets even more complicated for signed types.
>> 
>> A few months ago I tried posting a complete set of fallbacks for older
>> compilers (https://lkml.org/lkml/2015/7/19/358), but nothing really
>> happened. Now I know where Linus stands, so I guess I can just delete
>> that branch.
>
> I actually like your approach of being type agnostic a bit more (in
> comparison to static inline functions), mostly because of one specific
> reason:
>
> The type agnostic __builtin_*_overflow function even do the correct
> things if you deal with types smaller than int. Imagine e.g. you want to
> add to unsigned chars a and b,

If you read my mail again you'll see that I mentioned exactly this :-)
so obviously I agree that this is a nice part of it.

> unsigned char a, b;
> if (a + b < a)
>   goto overflow;
> else
>   a += b;
>
> The overflow condition will never trigger, as the comparisons will
> always be done in the integer domain and a + b < a is never true. I
> actually think that this is easy to overlook and the functions should
> handle that.

Yes. While people very rarely use local u8 or u16 variables for
computations, I think one could imagine a and b being struct members,
which for one reason or another happens to be of a type narrower than
int (which would also make the issue much harder to spot since the
struct definition is far away). Something like

combine_packets(struct foo *a, const struct foo *b)
{
  if (a->len + b->len < a->len)
    return -EOVERFLOW;
  /* ensure a->payload is big enough...*/
  memcpy(a->payload + a->len, b->payload, b->len);
  a->len += b->len;
  ...
}

which, depending on details, would either lead to memory corruption or
loss of parts of the packets.

I haven't actually found any instance of this in the kernel, but that
doesn't mean it couldn't get introduced (or that it doesn't exist).

Aside: It turns out clang is smart enough to optimize away the broken
overflow check, but gcc isn't. Neither issue a warning, despite the
intention being rather clear.

Rasmus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Pull-request

Message

Comments