rds: Error on offset mismatch if not loopback

Message ID 20120921213239.GJ14393@linux-tkdk.sfcn.org
State Changes Requested, archived
Delegated to: David Miller

Commit Message

John Jolly Sept. 21, 2012, 9:32 p.m. UTC
Attempting an rds connection from the IP address of an IPoIB interface
to itself causes a kernel panic due to a BUG_ON() being triggered.
Making the test less strict allows rds-ping to work without crashing
the machine.

A local unprivileged user could use this flaw to crash the system.

Signed-off-by: John Jolly <jjolly@suse.com>
---
 net/rds/ib_send.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

Venkat Venkatsubra Sept. 21, 2012, 9:38 p.m. UTC | #1
On 9/21/2012 4:32 PM, John Jolly wrote:
> Attempting an rds connection from the IP address of an IPoIB interface
> to itself causes a kernel panic due to a BUG_ON() being triggered.
> Making the test less strict allows rds-ping to work without crashing
> the machine.
>
> A local unprivileged user could use this flaw to crash the system.
>
> Signed-off-by: John Jolly<jjolly@suse.com>
> ---
>   net/rds/ib_send.c |    2 +-
>   1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/rds/ib_send.c b/net/rds/ib_send.c
> index e590949..7920c85 100644
> --- a/net/rds/ib_send.c
> +++ b/net/rds/ib_send.c
> @@ -544,7 +544,7 @@ int rds_ib_xmit(struct rds_connection *conn, struct rds_message *rm,
>          int flow_controlled = 0;
>          int nr_sig = 0;
>
> -       BUG_ON(off % RDS_FRAG_SIZE);
> +       BUG_ON(!conn->c_loopback && off % RDS_FRAG_SIZE);
>          BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));
>
>          /* Do not send cong updates to IB loopback */
Hi John,

How do you trigger this BUG_ON?
With rds-ping I could not hit the condition of a non-zero
"off % RDS_FRAG_SIZE".
rds-ping uses zero-byte messages to ping or pong back. How does
"off" become non-zero?

Thanks.

Venkat
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
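
For readers following the question above, here is a minimal user-space
sketch of the condition before and after the patch. It is only an
illustration, not kernel code: the helper names old_check_fires() and
new_check_fires() are hypothetical, the value 4096 for RDS_FRAG_SIZE is an
assumption about the kernel headers, and BUG_ON() is modelled as a boolean
"would this fire" result.

```c
#include <stdbool.h>

/* Assumption: the kernel defines RDS_FRAG_SIZE as 1 << 12 (4096 bytes). */
#define RDS_FRAG_SIZE 4096u

/*
 * Models the original BUG_ON(off % RDS_FRAG_SIZE):
 * fires for any offset that is not a multiple of the fragment size.
 */
bool old_check_fires(unsigned int off)
{
	return (off % RDS_FRAG_SIZE) != 0;
}

/*
 * Models the patched BUG_ON(!conn->c_loopback && off % RDS_FRAG_SIZE):
 * the same alignment test, but skipped entirely on loopback connections.
 */
bool new_check_fires(bool c_loopback, unsigned int off)
{
	return !c_loopback && (off % RDS_FRAG_SIZE) != 0;
}
```

With these definitions, an unaligned offset such as 100 trips the original
check on any connection but passes the patched check when c_loopback is
set. The sketch says nothing about how "off" ends up unaligned in the first
place, which is the open question in this thread.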
David Miller Sept. 22, 2012, 7:25 p.m. UTC | #2
From: John Jolly <jjolly@suse.com>
Date: Fri, 21 Sep 2012 15:32:40 -0600

> Attempting an rds connection from the IP address of an IPoIB interface
> to itself causes a kernel panic due to a BUG_ON() being triggered.
> Making the test less strict allows rds-ping to work without crashing
> the machine.
> 
> A local unprivileged user could use this flaw to crash the system.
> 
> Signed-off-by: John Jolly <jjolly@suse.com>

Besides the questions being asked of you by Venkat Venkatsubra, this
patch has another issue.

It has been completely corrupted by your email client, which has
turned all TAB characters into spaces, making the patch useless.

Please learn how to send a patch unmolested in the body of your
email.  Test it by emailing the patch to yourself, and verifying
that you can in fact apply the patch you receive in that email.
Then, and only then, should you consider making a new submission
of this patch.

Use Documentation/email-clients.txt for guidance.
Josh Hunt Nov. 13, 2013, 4:24 a.m. UTC | #3
On Tue, Nov 12, 2013 at 10:22 PM, Josh Hunt <joshhunt00@gmail.com> wrote:
> I think this issue was lost in the shuffle. It appears that Red Hat, Ubuntu,
> and Oracle are maintaining local patches to resolve this:
>
> https://oss.oracle.com/git/?p=redpatch.git;a=commit;h=c7b6a0a1d8d636852be130fa15fa8be10d4704e8
> https://bugzilla.redhat.com/show_bug.cgi?id=822754
> http://ubuntu.5.x6.nabble.com/CVE-2012-2372-RDS-local-ping-DOS-td4985388.html
>
> Given that Oracle has applied it I'll make the assumption that Venkat's
> question was answered at some point.
>
> David - I can resubmit the patch with the proper signed-off-by and
> formatting if you are willing to apply it unless John wants to try again. I
> think it's time this got upstream.
>
> --
> Josh

Ugh.. hopefully resending with all the html crap removed...
David Miller Nov. 13, 2013, 6:09 a.m. UTC | #4
From: Josh Hunt <joshhunt00@gmail.com>
Date: Tue, 12 Nov 2013 22:22:11 -0600

> David - I can resubmit the patch with the proper signed-off-by and
> formatting if you are willing to apply it unless John wants to try again. I
> think it's time this got upstream.

Nothing is going to happen until the patch is submitted properly, so
just do, don't ask.
Venkat Venkatsubra Nov. 13, 2013, 3:16 p.m. UTC | #5
Hi Josh,

No, I still haven't received an answer as to how "off" could be non-zero in the rds-ping case so as to hit BUG_ON(off % RDS_FRAG_SIZE), because rds-ping uses zero-byte messages to ping.
If you have a test case that reproduces the kernel panic, I can try it out and see how that can happen.
The Oracle internal code I checked doesn't have that patch applied.

Venkat
Josh Hunt Nov. 13, 2013, 5:40 p.m. UTC | #6
On Wed, Nov 13, 2013 at 9:16 AM, Venkat Venkatsubra
<venkat.x.venkatsubra@oracle.com> wrote:
> Hi Josh,
>
> No, I still didn't get an answer for how "off" could be non-zero in case of rds-ping to hit BUG_ON(off % RDS_FRAG_SIZE).
> Because, rds-ping uses zero byte messages to ping.
> If you have a test case that reproduces the kernel panic I can try it out and see how that can happen.
> The Oracle's internal code I checked doesn't have that patch applied.
>
> Venkat

No, I don't have a test case. I came across this CVE while doing an
audit and noticed it was patched in Ubuntu's kernel and other distros,
but was not in the upstream kernel yet. Quick googling of lkml showed
that there were at least two attempts to get this patch upstream, but
both had issues due to not following the proper submission process:

https://lkml.org/lkml/2012/10/22/433
https://lkml.org/lkml/2012/9/21/505

From my searching, it appears the initial bug was found by someone at Red Hat:
https://bugzilla.redhat.com/show_bug.cgi?id=822754

I've added Li Honggang, the reporter of this issue from Red Hat, to the
mail. Hopefully he can share his test case.

The bug possibly also requires certain hardware, as Jay writes in the first link above:
"...some Infiniband HCAs(QLogic, possibly others) the machine will panic..."

I was referring to this oracle commit:
https://oss.oracle.com/git/?p=redpatch.git;a=commit;h=c7b6a0a1d8d636852be130fa15fa8be10d4704e8

I have no experience with this code. There were a few comments around
the reset and xmit functions saying that the caller must do certain
things, otherwise they are racy, but I have no idea whether that comes
into play here.
Honggang LI Nov. 14, 2013, 12:55 a.m. UTC | #7
On 11/14/2013 01:40 AM, Josh Hunt wrote:
> No I don't have a test case. I came across this CVE while doing an
> audit and noticed it was patched in Ubuntu's kernel and other distros,
> but was not in the upstream kernel yet. Quick googling of lkml showed
> that there were at least two attempts to get this patch upstream, but
> both had issues due to not following the proper submission process:
>
> https://lkml.org/lkml/2012/10/22/433
> https://lkml.org/lkml/2012/9/21/505
>
> From my searching it appears the initial bug was found by someone at redhat:
> https://bugzilla.redhat.com/show_bug.cgi?id=822754
>
> I've added Li Honggang the reporter of this issue from Redhat to the
> mail. Hopefully he can share his testcase.
The test case is very simple:
Steps to Reproduce:
1. yum install -y rds-tools

2. [root@rdma3 ~]# ifconfig ib0 | grep 'inet addr'
          inet addr:172.31.0.3  Bcast:172.31.0.255  Mask:255.255.255.0

3. [root@rdma3 ~]# /usr/bin/rds-ping 172.31.0.3  <<<< kernel panic (You
may need to wait for a few seconds before the kernel panic.)
>
> and possibly requires certain hardware as Jay writes in the first link above:
> "...some Infiniband HCAs(QLogic, possibly others) the machine will panic..."
This bug can be reproduced with Mellanox HCAs (mlx4_ib.ko and mthca.ko)
and a QLogic HCA (ib_qib.ko). I did not test the QLogic HCA running
"ib_ipath.ko".

As far as I know, the upstream RDS code is broken. There are *many* RDS bugs.

Best regards.
Honggang
Josh Hunt Nov. 14, 2013, 1:27 a.m. UTC | #8
Thanks Honggang. I have resubmitted the patch for approval.
Venkat Venkatsubra Nov. 14, 2013, 1:43 p.m. UTC | #9
Hi Honggang,

I ran rds-ping over the local interface for 30 minutes and then stopped it.
It didn't hit any panic.

# ip addr show dev ib0
6: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast qlen 1024
    link/infiniband 80:00:00:48:fe:80:00:00:00:00:00:00:00:21:28:00:01:cf:63:db brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 10.196.4.125/30 brd 10.196.4.127 scope global ib0
    inet6 fe80::221:2800:1cf:63db/64 scope link
       valid_lft forever preferred_lft forever
#

# rds-ping  10.196.4.125
    1: 170 usec
    2: 171 usec 
   ....
   ....
   ....
 1860: 173 usec
 1861: 171 usec
 1862: 177 usec
 1863: 168 usec
 1864: 171 usec
 1865: 175 usec
^C#

I tested with Oracle UEK2, which is based on the 2.6.39 kernel, using a Mellanox IB adapter:
19:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)

There is something about your setup that must be causing it for you.
Can I work with you offline if you are available ?

The panic you are hitting is not making sense to me.

Venkat
Honggang LI Nov. 15, 2013, 2:32 a.m. UTC | #10
On 11/14/2013 09:43 PM, Venkat Venkatsubra wrote:
>
> -----Original Message-----
> From: Honggang LI [mailto:honli@redhat.com] 
> Sent: Wednesday, November 13, 2013 6:56 PM
> To: Josh Hunt; Venkat Venkatsubra
> Cc: David Miller; jjolly@suse.com; LKML; netdev@vger.kernel.org
> Subject: Re: [PATCH] rds: Error on offset mismatch if not loopback
>
> On 11/14/2013 01:40 AM, Josh Hunt wrote:
>> On Wed, Nov 13, 2013 at 9:16 AM, Venkat Venkatsubra 
>> <venkat.x.venkatsubra@oracle.com> wrote:
>>> -----Original Message-----
>>> From: Josh Hunt [mailto:joshhunt00@gmail.com]
>>> Sent: Tuesday, November 12, 2013 10:25 PM
>>> To: David Miller
>>> Cc: jjolly@suse.com; LKML; Venkat Venkatsubra; netdev@vger.kernel.org
>>> Subject: Re: [PATCH] rds: Error on offset mismatch if not loopback
>>>
>>> On Tue, Nov 12, 2013 at 10:22 PM, Josh Hunt <joshhunt00@gmail.com> wrote:
>>>> On Sat, Sep 22, 2012 at 2:25 PM, David Miller <davem@davemloft.net> wrote:
>>>>> From: John Jolly <jjolly@suse.com>
>>>>> Date: Fri, 21 Sep 2012 15:32:40 -0600
>>>>>
>>>>>> Attempting an rds connection from the IP address of an IPoIB 
>>>>>> interface to itself causes a kernel panic due to a BUG_ON() being triggered.
>>>>>> Making the test less strict allows rds-ping to work without 
>>>>>> crashing the machine.
>>>>>>
>>>>>> A local unprivileged user could use this flaw to crash the system.
>>>>>>
>>>>>> Signed-off-by: John Jolly <jjolly@suse.com>
>>>>> Besides the questions being asked of you by Venkat Venkatsubra, 
>>>>> this patch has another issue.
>>>>>
>>>>> It has been completely corrupted by your email client, it has 
>>>>> turned all TAB characters into spaces, making the patch useless.
>>>>>
>>>>> Please learn how to send a patch unmolested in the body of your 
>>>>> email.  Test it by emailing the patch to yourself, and verifying 
>>>>> that you can in fact apply the patch you receive in that email.
>>>>> Then, and only then, should you consider making a new submission of 
>>>>> this patch.
>>>>>
>>>>> Use Documentation/email-clients.txt for guidance.
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe 
>>>>> linux-kernel" in the body of a message to majordomo@vger.kernel.org 
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>> Please read the FAQ at  http://www.tux.org/lkml/
>>>> I think this issue was lost in the shuffle. It appears that redhat, 
>>>> ubuntu, and oracle are maintaining local patches to resolve this:
>>>>
>>>> https://oss.oracle.com/git/?p=redpatch.git;a=commit;h=c7b6a0a1d8d636852be130fa15fa8be10d4704e8
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=822754
>>>> http://ubuntu.5.x6.nabble.com/CVE-2012-2372-RDS-local-ping-DOS-td4985388.html
>>>>
>>>> Given that Oracle has applied it I'll make the assumption that 
>>>> Venkat's question was answered at some point.
>>>>
>>>> David - I can resubmit the patch with the proper signed-off-by and 
>>>> formatting if you are willing to apply it unless John wants to try 
>>>> again. I think it's time this got upstream.
>>>>
>>>> --
>>>> Josh
>>> Ugh.. hopefully resending with all the html crap removed...
>>>
>>> --
>>> Josh
>>>
>>> Hi Josh,
>>>
>>> No, I still didn't get an answer on how "off" could be non-zero with rds-ping so as to hit BUG_ON(off % RDS_FRAG_SIZE),
>>> because rds-ping uses zero-byte messages to ping.
>>> If you have a test case that reproduces the kernel panic, I can try it out and see how that can happen.
>>> The Oracle internal code I checked doesn't have that patch applied.
>>>
>>> Venkat
>> No I don't have a test case. I came across this CVE while doing an 
>> audit and noticed it was patched in Ubuntu's kernel and other distros, 
>> but was not in the upstream kernel yet. Quick googling of lkml showed 
>> that there were at least two attempts to get this patch upstream, but 
>> both had issues due to not following the proper submission process:
>>
>> https://lkml.org/lkml/2012/10/22/433
>> https://lkml.org/lkml/2012/9/21/505
>>
>> From my searching it appears the initial bug was found by someone at redhat:
>> https://bugzilla.redhat.com/show_bug.cgi?id=822754
>>
>> I've added Li Honggang the reporter of this issue from Redhat to the 
>> mail. Hopefully he can share his testcase.
> The test case is very simple:
> Steps to Reproduce:
> 1. yum install -y rds-tools
>
> 2. [root@rdma3 ~]# ifconfig ib0 | grep 'inet addr'
>           inet addr:172.31.0.3  Bcast:172.31.0.255  Mask:255.255.255.0
>
> 3. [root@rdma3 ~]# /usr/bin/rds-ping 172.31.0.3  <<<< kernel panic (You may need to wait for a few seconds before the kernel panic.)
>> and possibly requires certain hardware as Jay writes in the first link above:
>> "...some Infiniband HCAs(QLogic, possibly others) the machine will panic..."
> This bug can be reproduced with Mellanox HCAs (mlx4_ib.ko and mthca.ko) and a QLogic HCA (ib_qib.ko). I did not test the QLogic HCA running "ib_ipath.ko".
>
> As far as I know, the upstream RDS code is broken. There are *many* RDS bugs.
>
> Best regards.
> Honggang
>> I was referring to this oracle commit:
>> https://oss.oracle.com/git/?p=redpatch.git;a=commit;h=c7b6a0a1d8d636852be130fa15fa8be10d4704e8
>>
>> I have no experience with this code. There were a few comments around 
>> the reset and xmit fns about making sure the caller did certain things 
>> if not they were racy, but I have no idea if that's coming into play 
>> here.
>>
> Hi Honggang,
>
> I ran rds-ping over local interface for 30 minutes. I stopped it after that.
> It didn't hit any panic.
>
> # ip addr show dev ib0
> 6: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast qlen 1024
>     link/infiniband 80:00:00:48:fe:80:00:00:00:00:00:00:00:21:28:00:01:cf:63:db brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
>     inet 10.196.4.125/30 brd 10.196.4.127 scope global ib0
>     inet6 fe80::221:2800:1cf:63db/64 scope link
>        valid_lft forever preferred_lft forever
> #
>
> # rds-ping  10.196.4.125
>     1: 170 usec
>     2: 171 usec 
>    ....
>    ....
>    ....
>  1860: 173 usec
>  1861: 171 usec
>  1862: 177 usec
>  1863: 168 usec
>  1864: 171 usec
>  1865: 175 usec
> ^C#
>
> I tested with Oracle UEK2 which is based on 2.6.39 kernel. Mellanox IB adaptor.
> 19:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
>
> There is something about your setup that must be causing it for you.
> Can I work with you offline if you are available ?
>
> The panic you are hitting is not making sense to me.
>
> Venkat
Hi Venkat,
It seems we are in different time zones. Please contact me via email if
you need me to do something for this bug. Could you please try the upstream
2.6.39 kernel? I confirmed that the bug can be reproduced with Mellanox
and QLogic HCAs when running the upstream 2.6.39 kernel.

[root@rdma01 ~]# ifconfig mlx4_ib1
Ifconfig uses the ioctl access method to get the full address
information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are
displayed correctly.
Ifconfig is obsolete! For replacement check ip.
mlx4_ib1  Link encap:InfiniBand  HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:172.31.2.1  Bcast:172.31.2.255  Mask:255.255.255.0
          inet6 addr: fe80::7ae7:d1ff:ff6b:b01/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:5 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

[root@rdma01 ~]# rpm -qf /usr/bin/rds-ping
rds-tools-2.0.6-3.el6.x86_64
[root@rdma01 ~]# uname -a
Linux rdma01.rhts.eng.nay.redhat.com 2.6.39 #1 SMP Thu Nov 14 20:25:45 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@rdma01 ~]# ibstat
CA 'mlx4_0'
    CA type: MT26428
    Number of ports: 2
    Firmware version: 2.8.600
    Hardware version: b0
    Node GUID: 0x78e7d1ffff6b0b00
    System image GUID: 0x78e7d1ffff6b0b03
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 1
        LMC: 0
        SM lid: 4
        Capability mask: 0x02510868
        Port GUID: 0x78e7d1ffff6b0b01
        Link layer: InfiniBand
    Port 2:
        State: Down
        Physical state: Polling
        Rate: 70
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x02510868
        Port GUID: 0x78e7d1ffff6b0b02
        Link layer: InfiniBand
[root@rdma01 ~]# lspci | grep Mellanox
1f:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
[root@rdma01 ~]# ssh 172.31.2.2 hostname   (make sure the IPoIB interface works)
rdma02.rhts.eng.nay.redhat.com
[root@rdma01 ~]# ssh 172.31.2.1 hostname
rdma01.rhts.eng.nay.redhat.com
[root@rdma01 ~]# /usr/bin/rds-ping 172.31.2.1 (kernel panic, please see the attachment for console log)
Venkat Venkatsubra Nov. 19, 2013, 11:33 p.m. UTC | #11
We now have a lot more information than we did before.
When sending a "congestion update" in rds_ib_xmit() we are now returning an incorrect number of bytes sent:

        BUG_ON(off % RDS_FRAG_SIZE);
        BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));

        /* Do not send cong updates to IB loopback */
        if (conn->c_loopback
            && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
                rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
                scat = &rm->data.op_sg[sg];
                ret = sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
                ret = min_t(int, ret, scat->length - conn->c_xmit_data_off);
                return ret;
        }

It returns min(8240, 4096-0), i.e. 4096 bytes.
The caller rds_send_xmit() is made to think a partial message (4096 out of 8240 bytes) was sent.
It calls rds_ib_xmit() again with a data offset "off" of 4096-48 (rds header) = 4048 bytes, which is not a multiple of RDS_FRAG_SIZE, and we hit the BUG_ON.

The reason I didn't hit the panic in my test on Oracle UEK2, which is based on the 2.6.39 kernel, is that it had the code like this:
        BUG_ON(off % RDS_FRAG_SIZE);
        BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));

        /* Do not send cong updates to IB loopback */
        if (conn->c_loopback
            && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
                rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
                return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
        }
(So it wasn't 100% 2.6.39 ;-). )
It returned 8240 bytes. The caller rds_send_xmit decides the full message was sent (48 byte header + 4096 data + 4096 data).
And it worked.

Then I found this info on the change that was done upstream which now causes the panic:
http://marc.info/?l=linux-netdev&m=129908332903057
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6094628bfd94323fc1cea05ec2c6affd98c18f7f 

Will investigate more into which problem the above change addressed.

Venkat
Venkat Venkatsubra Nov. 20, 2013, 6:09 p.m. UTC | #12

Looks like the fix pointed to by the above link is for a panic on a PPC system with a PAGE_SIZE of 64Kbytes.
I think the sequence it was going through before that fix was:
/* Do not send cong updates to IB loopback */
        if (conn->c_loopback
            && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
                rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
                return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
        }
rds_ib_xmit returns 8240
rds_send_xmit : c_xmit_data_off = 0 + 8240 - 48 (rds header the first time) = 8192
                c_xmit_data_off < 65536 (sg->length)
                calls rds_ib_xmit again
rds_ib_xmit returns 8240
rds_send_xmit: c_xmit_data_off = 8192+8240 = 16432 and calls rds_ib_xmit
rds_ib_xmit : returns 8240
rds_send_xmit: c_xmit_data_off 24672 and calls rds_ib_xmit
...
...
and so on till
rds_send_xmit: c_xmit_data_off 57632 and calls rds_ib_xmit
rds_ib_xmit: returns 8240

On the last iteration it hits the below BUG_ON in rds_send_xmit.
while (ret) {
    tmp = min_t(int, ret, sg->length -
                         conn->c_xmit_data_off);
[tmp = 7904]
    conn->c_xmit_data_off += tmp;
[c_xmit_data_off = 65536]
    ret -= tmp;
[ret = 8240-7904 = 336]
    if (conn->c_xmit_data_off == sg->length) {
         conn->c_xmit_data_off = 0;
         sg++;
         conn->c_xmit_sg++;
         BUG_ON(ret != 0 &&
             conn->c_xmit_sg == rm->data.op_nents);
    }
}

Since the congestion update over loopback is not actually transmitted as a message,
the multiple iterations we see in the case of ppc are unnecessary.
All that rds_ib_xmit needs to do is return a number of bytes that will tell the caller that
we are done with this message.
  
This might fix the original problem without introducing the current panic:
/* Do not send cong updates to IB loopback */
        if (conn->c_loopback
            && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
                rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
                scat = &rm->data.op_sg[sg];
                ret = max_t(int, RDS_CONG_MAP_BYTES, scat->length);
                return ret + sizeof(struct rds_header);
        }
It will return 8240 when PAGE_SIZE is 4k, and 64k+48 on ppc when scat->length is 64k, and
will be done with one iteration of the rds_send_xmit/rds_ib_xmit loop.

Venkat
David Miller Nov. 20, 2013, 6:54 p.m. UTC | #13
Why are you posting this message a second time?
Venkat Venkatsubra Nov. 20, 2013, 9:28 p.m. UTC | #14
> Why are you posting this message a second time?

Reposting just the contents of the second message in case it got missed the previous time.

Looks like the fix pointed to by the previous link is for a panic on a PPC system with a PAGE_SIZE of 64Kbytes.
I think the sequence it was going through before that fix was:
/* Do not send cong updates to IB loopback */
        if (conn->c_loopback
            && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
                rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
                return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
        }
rds_ib_xmit returns 8240
rds_send_xmit : c_xmit_data_off = 0 + 8240 - 48 (rds header the first time) = 8192
                c_xmit_data_off < 65536 (sg->length)
                calls rds_ib_xmit again
rds_ib_xmit returns 8240
rds_send_xmit: c_xmit_data_off = 8192+8240 = 16432 and calls rds_ib_xmit
rds_ib_xmit : returns 8240
rds_send_xmit: c_xmit_data_off 24672 and calls rds_ib_xmit ...
...
and so on till
rds_send_xmit: c_xmit_data_off 57632 and calls rds_ib_xmit
rds_ib_xmit: returns 8240

On the last iteration it hits the below BUG_ON in rds_send_xmit.
while (ret) {
    tmp = min_t(int, ret, sg->length -
                         conn->c_xmit_data_off);
 [tmp = 7904]
    conn->c_xmit_data_off += tmp;
[c_xmit_data_off = 65536]
    ret -= tmp;
[ret = 8240-7904 = 336]
    if (conn->c_xmit_data_off == sg->length) {
         conn->c_xmit_data_off = 0;
         sg++;
         conn->c_xmit_sg++;
         BUG_ON(ret != 0 &&
             conn->c_xmit_sg == rm->data.op_nents);
    }
}

Since the congestion update over loopback is not actually transmitted as a message,
the multiple iterations we see in the case of ppc are unnecessary.
All that rds_ib_xmit needs to do is return a number of bytes that will tell the caller
that we are done with this message.
  
This might fix the original problem without introducing the current panic:
/* Do not send cong updates to IB loopback */
        if (conn->c_loopback
            && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
                rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
                scat = &rm->data.op_sg[sg];
                ret = max_t(int, RDS_CONG_MAP_BYTES, scat->length);
                return ret + sizeof(struct rds_header);
        }
It will return 8240 when PAGE_SIZE is 4k, and 64k+48 on ppc when scat->length is 64k, and
will be done with one iteration of the rds_send_xmit/rds_ib_xmit loop.

Venkat

Patch

diff --git a/net/rds/ib_send.c b/net/rds/ib_send.c
index e590949..7920c85 100644
--- a/net/rds/ib_send.c
+++ b/net/rds/ib_send.c
@@ -544,7 +544,7 @@  int rds_ib_xmit(struct rds_connection *conn, struct rds_message *rm,
        int flow_controlled = 0;
        int nr_sig = 0;
 
-       BUG_ON(off % RDS_FRAG_SIZE);
+       BUG_ON(!conn->c_loopback && off % RDS_FRAG_SIZE);
        BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));
 
        /* Do not send cong updates to IB loopback */