Message ID | 20120921213239.GJ14393@linux-tkdk.sfcn.org |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
On 9/21/2012 4:32 PM, John Jolly wrote:
> Attempting an rds connection from the IP address of an IPoIB interface
> to itself causes a kernel panic due to a BUG_ON() being triggered.
> Making the test less strict allows rds-ping to work without crashing
> the machine.
>
> A local unprivileged user could use this flaw to crash the system.
>
> Signed-off-by: John Jolly <jjolly@suse.com>
> ---
>  net/rds/ib_send.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/rds/ib_send.c b/net/rds/ib_send.c
> index e590949..7920c85 100644
> --- a/net/rds/ib_send.c
> +++ b/net/rds/ib_send.c
> @@ -544,7 +544,7 @@ int rds_ib_xmit(struct rds_connection *conn, struct rds_message *rm,
>          int flow_controlled = 0;
>          int nr_sig = 0;
>
> -        BUG_ON(off % RDS_FRAG_SIZE);
> +        BUG_ON(!conn->c_loopback && off % RDS_FRAG_SIZE);
>          BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));
>
>          /* Do not send cong updates to IB loopback */

Hi John,

How do you trigger this BUG_ON?
With rds-ping I could not hit this condition of non-zero "off % RDS_FRAG_SIZE".
rds-ping uses zero byte messages to ping or pong back.
How does the "off" become non-zero?

Thanks.

Venkat
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
From: John Jolly <jjolly@suse.com>
Date: Fri, 21 Sep 2012 15:32:40 -0600

> Attempting an rds connection from the IP address of an IPoIB interface
> to itself causes a kernel panic due to a BUG_ON() being triggered.
> Making the test less strict allows rds-ping to work without crashing
> the machine.
>
> A local unprivileged user could use this flaw to crash the system.
>
> Signed-off-by: John Jolly <jjolly@suse.com>

Besides the questions being asked of you by Venkat Venkatsubra, this
patch has another issue.

It has been completely corrupted by your email client, it has
turned all TAB characters into spaces, making the patch useless.

Please learn how to send a patch unmolested in the body of your
email. Test it by emailing the patch to yourself, and verifying
that you can in fact apply the patch you receive in that email.
Then, and only then, should you consider making a new submission
of this patch.

Use Documentation/email-clients.txt for guidance.
On Tue, Nov 12, 2013 at 10:22 PM, Josh Hunt <joshhunt00@gmail.com> wrote:
> On Sat, Sep 22, 2012 at 2:25 PM, David Miller <davem@davemloft.net> wrote:
>>
>> From: John Jolly <jjolly@suse.com>
>> Date: Fri, 21 Sep 2012 15:32:40 -0600
>>
>> > Attempting an rds connection from the IP address of an IPoIB interface
>> > to itself causes a kernel panic due to a BUG_ON() being triggered.
>> > Making the test less strict allows rds-ping to work without crashing
>> > the machine.
>> >
>> > A local unprivileged user could use this flaw to crash the system.
>> >
>> > Signed-off-by: John Jolly <jjolly@suse.com>
>>
>> Besides the questions being asked of you by Venkat Venkatsubra, this
>> patch has another issue.
>>
>> It has been completely corrupted by your email client, it has
>> turned all TAB characters into spaces, making the patch useless.
>>
>> Please learn how to send a patch unmolested in the body of your
>> email. Test it by emailing the patch to yourself, and verifying
>> that you can in fact apply the patch you receive in that email.
>> Then, and only then, should you consider making a new submission
>> of this patch.
>>
>> Use Documentation/email-clients.txt for guidance.
>
> I think this issue was lost in the shuffle. It appears that redhat, ubuntu,
> and oracle are maintaining local patches to resolve this:
>
> https://oss.oracle.com/git/?p=redpatch.git;a=commit;h=c7b6a0a1d8d636852be130fa15fa8be10d4704e8
> https://bugzilla.redhat.com/show_bug.cgi?id=822754
> http://ubuntu.5.x6.nabble.com/CVE-2012-2372-RDS-local-ping-DOS-td4985388.html
>
> Given that Oracle has applied it I'll make the assumption that Venkat's
> question was answered at some point.
>
> David - I can resubmit the patch with the proper signed-off-by and
> formatting if you are willing to apply it unless John wants to try again. I
> think it's time this got upstream.
>
> --
> Josh

Ugh.. hopefully resending with all the html crap removed...
From: Josh Hunt <joshhunt00@gmail.com>
Date: Tue, 12 Nov 2013 22:22:11 -0600

> David - I can resubmit the patch with the proper signed-off-by and
> formatting if you are willing to apply it unless John wants to try again. I
> think it's time this got upstream.

Nothing is going to happen until the patch is submitted properly, so
just do, don't ask.
-----Original Message-----
From: Josh Hunt [mailto:joshhunt00@gmail.com]
Sent: Tuesday, November 12, 2013 10:25 PM
To: David Miller
Cc: jjolly@suse.com; LKML; Venkat Venkatsubra; netdev@vger.kernel.org
Subject: Re: [PATCH] rds: Error on offset mismatch if not loopback

On Tue, Nov 12, 2013 at 10:22 PM, Josh Hunt <joshhunt00@gmail.com> wrote:
> I think this issue was lost in the shuffle. It appears that redhat,
> ubuntu, and oracle are maintaining local patches to resolve this:
>
> https://oss.oracle.com/git/?p=redpatch.git;a=commit;h=c7b6a0a1d8d636852be130fa15fa8be10d4704e8
> https://bugzilla.redhat.com/show_bug.cgi?id=822754
> http://ubuntu.5.x6.nabble.com/CVE-2012-2372-RDS-local-ping-DOS-td4985388.html
>
> Given that Oracle has applied it I'll make the assumption that
> Venkat's question was answered at some point.
>
> David - I can resubmit the patch with the proper signed-off-by and
> formatting if you are willing to apply it unless John wants to try
> again. I think it's time this got upstream.
>
> --
> Josh

Ugh.. hopefully resending with all the html crap removed...

--
Josh

Hi Josh,

No, I still didn't get an answer for how "off" could be non-zero in case of rds-ping to hit BUG_ON(off % RDS_FRAG_SIZE).
Because, rds-ping uses zero byte messages to ping.
If you have a test case that reproduces the kernel panic I can try it out and see how that can happen.
The Oracle's internal code I checked doesn't have that patch applied.

Venkat
On Wed, Nov 13, 2013 at 9:16 AM, Venkat Venkatsubra
<venkat.x.venkatsubra@oracle.com> wrote:
> Hi Josh,
>
> No, I still didn't get an answer for how "off" could be non-zero in case of rds-ping to hit BUG_ON(off % RDS_FRAG_SIZE).
> Because, rds-ping uses zero byte messages to ping.
> If you have a test case that reproduces the kernel panic I can try it out and see how that can happen.
> The Oracle's internal code I checked doesn't have that patch applied.
>
> Venkat

No I don't have a test case. I came across this CVE while doing an
audit and noticed it was patched in Ubuntu's kernel and other distros,
but was not in the upstream kernel yet. Quick googling of lkml showed
that there were at least two attempts to get this patch upstream, but
both had issues due to not following the proper submission process:

https://lkml.org/lkml/2012/10/22/433
https://lkml.org/lkml/2012/9/21/505

From my searching it appears the initial bug was found by someone at redhat:
https://bugzilla.redhat.com/show_bug.cgi?id=822754

I've added Li Honggang the reporter of this issue from Redhat to the
mail. Hopefully he can share his testcase.

and possibly requires certain hardware as Jay writes in the first link above:
"...some Infiniband HCAs(QLogic, possibly others) the machine will panic..."

I was referring to this oracle commit:
https://oss.oracle.com/git/?p=redpatch.git;a=commit;h=c7b6a0a1d8d636852be130fa15fa8be10d4704e8

I have no experience with this code. There were a few comments around
the reset and xmit fns about making sure the caller did certain things
if not they were racy, but I have no idea if that's coming into play
here.
On 11/14/2013 01:40 AM, Josh Hunt wrote:
> From my searching it appears the initial bug was found by someone at redhat:
> https://bugzilla.redhat.com/show_bug.cgi?id=822754
>
> I've added Li Honggang the reporter of this issue from Redhat to the
> mail. Hopefully he can share his testcase.

The test case is very simple:
Steps to Reproduce:
1. yum install -y rds-tools

2. [root@rdma3 ~]# ifconfig ib0 | grep 'inet addr'
          inet addr:172.31.0.3  Bcast:172.31.0.255  Mask:255.255.255.0

3. [root@rdma3 ~]# /usr/bin/rds-ping 172.31.0.3 <<<< kernel panic (You
may need to wait for a few seconds before the kernel panic.)

> and possibly requires certain hardware as Jay writes in the first link above:
> "...some Infiniband HCAs(QLogic, possibly others) the machine will panic..."

This bug can be reproduced with Mellanox HCAs (mlx4_ib.ko and mthca.ko),
QLogic HCA (ib_qib.ko). I did not test the QLogic HCA running "ib_ipath.ko".

As I know the upstream code of RDS is broken. There are *many* RDS bugs.

Best regards.
Honggang
On Wed, Nov 13, 2013 at 6:55 PM, Honggang LI <honli@redhat.com> wrote:
> The test case is very simple:
> Steps to Reproduce:
> 1. yum install -y rds-tools
>
> 2. [root@rdma3 ~]# ifconfig ib0 | grep 'inet addr'
>           inet addr:172.31.0.3  Bcast:172.31.0.255  Mask:255.255.255.0
>
> 3. [root@rdma3 ~]# /usr/bin/rds-ping 172.31.0.3 <<<< kernel panic (You
> may need to wait for a few seconds before the kernel panic.)
>
> This bug can be reproduced with Mellanox HCAs (mlx4_ib.ko and mthca.ko),
> QLogic HCA (ib_qib.ko). I did not test the QLogic HCA running "ib_ipath.ko".
>
> As I know the upstream code of RDS is broken. There are *many* RDS bugs.
>
> Best regards.
> Honggang

Thanks Honggang. I have resubmitted the patch for approval.
-----Original Message-----
From: Honggang LI [mailto:honli@redhat.com]
Sent: Wednesday, November 13, 2013 6:56 PM
To: Josh Hunt; Venkat Venkatsubra
Cc: David Miller; jjolly@suse.com; LKML; netdev@vger.kernel.org
Subject: Re: [PATCH] rds: Error on offset mismatch if not loopback

On 11/14/2013 01:40 AM, Josh Hunt wrote:
> I've added Li Honggang the reporter of this issue from Redhat to the
> mail. Hopefully he can share his testcase.

The test case is very simple:
Steps to Reproduce:
1. yum install -y rds-tools

2. [root@rdma3 ~]# ifconfig ib0 | grep 'inet addr'
          inet addr:172.31.0.3  Bcast:172.31.0.255  Mask:255.255.255.0

3. [root@rdma3 ~]# /usr/bin/rds-ping 172.31.0.3 <<<< kernel panic (You may need to wait for a few seconds before the kernel panic.)

> and possibly requires certain hardware as Jay writes in the first link above:
> "...some Infiniband HCAs(QLogic, possibly others) the machine will panic..."

This bug can be reproduced with Mellanox HCAs (mlx4_ib.ko and mthca.ko), QLogic HCA (ib_qib.ko). I did not test the QLogic HCA running "ib_ipath.ko".

As I know the upstream code of RDS is broken. There are *many* RDS bugs.

Best regards.
Honggang

Hi Honggang,

I ran rds-ping over local interface for 30 minutes. I stopped it after that.
It didn't hit any panic.

# ip addr show dev ib0
6: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast qlen 1024
    link/infiniband 80:00:00:48:fe:80:00:00:00:00:00:00:00:21:28:00:01:cf:63:db brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 10.196.4.125/30 brd 10.196.4.127 scope global ib0
    inet6 fe80::221:2800:1cf:63db/64 scope link
       valid_lft forever preferred_lft forever
#

# rds-ping 10.196.4.125
   1: 170 usec
   2: 171 usec
....
....
....
1860: 173 usec
1861: 171 usec
1862: 177 usec
1863: 168 usec
1864: 171 usec
1865: 175 usec
^C#

I tested with Oracle UEK2 which is based on 2.6.39 kernel. Mellanox IB adaptor.
19:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)

There is something about your setup that must be causing it for you.
Can I work with you offline if you are available ?

The panic you are hitting is not making sense to me.

Venkat
On 11/14/2013 09:43 PM, Venkat Venkatsubra wrote:
> Hi Honggang,
>
> I ran rds-ping over local interface for 30 minutes. I stopped it after that.
> It didn't hit any panic.
>
> I tested with Oracle UEK2 which is based on 2.6.39 kernel. Mellanox IB adaptor.
> 19:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
>
> There is something about your setup that must be causing it for you.
> Can I work with you offline if you are available ?
>
> The panic you are hitting is not making sense to me.
>
> Venkat

Hi, Venkat

It seems we are in different time zone. Please contact me via email if
you need I do something for this bug.

Could you please try upstream kernel 2.6.39. I confirmed that the bug
can be reproduced with Mellanox and QLogic HCA when running upstream
kernel-2.6.39.

[root@rdma01 ~]# ifconfig mlx4_ib1
Ifconfig uses the ioctl access method to get the full address information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are displayed correctly.
Ifconfig is obsolete! For replacement check ip.
mlx4_ib1 Link encap:InfiniBand HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 inet addr:172.31.2.1 Bcast:172.31.2.255 Mask:255.255.255.0 inet6 addr: fe80::7ae7:d1ff:ff6b:b01/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:5 overruns:0 carrier:0 collisions:0 txqueuelen:256 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) [root@rdma01 ~]# rpm -qf /usr/bin/rds-ping rds-tools-2.0.6-3.el6.x86_64 [root@rdma01 ~]# uname -a Linux rdma01.rhts.eng.nay.redhat.com 2.6.39 #1 SMP Thu Nov 14 20:25:45 EST 2013 x86_64 x86_64 x86_64 GNU/Linux [root@rdma01 ~]# ibstat CA 'mlx4_0' CA type: MT26428 Number of ports: 2 Firmware version: 2.8.600 Hardware version: b0 Node GUID: 0x78e7d1ffff6b0b00 System image GUID: 0x78e7d1ffff6b0b03 Port 1: State: Active Physical state: LinkUp Rate: 40 Base lid: 1 LMC: 0 SM lid: 4 Capability mask: 0x02510868 Port GUID: 0x78e7d1ffff6b0b01 Link layer: InfiniBand Port 2: State: Down Physical state: Polling Rate: 70 Base lid: 0 LMC: 0 SM lid: 0 Capability mask: 0x02510868 Port GUID: 0x78e7d1ffff6b0b02 Link layer: InfiniBand [root@rdma01 ~]# lspci | grep Mellanox 1f:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0) [root@rdma01 ~]# ssh 172.31.2.2 hostname (make sure the IPoIB interface works) rdma02.rhts.eng.nay.redhat.com [root@rdma01 ~]# ssh 172.31.2.1 hostname rdma01.rhts.eng.nay.redhat.com [root@rdma01 ~]# /usr/bin/rds-ping 172.31.2.1 (kernel panic, please see the attachment for console log)
We now have lot more information than we did before.

When sending a "congestion update" in rds_ib_xmit() we are now returning an incorrect number as bytes sent:

	BUG_ON(off % RDS_FRAG_SIZE);
	BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));

	/* Do not send cong updates to IB loopback */
	if (conn->c_loopback
	    && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
		rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
		scat = &rm->data.op_sg[sg];
		ret = sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
		ret = min_t(int, ret, scat->length - conn->c_xmit_data_off);
		return ret;
	}

It returns min(8240, 4096-0) i.e. 4096 bytes.
The caller rds_send_xmit() is made to think a partial message (4096 out of 8240) was sent.
It calls rds_ib_xmit() again with a data offset "off" of 4096-48 (rds header) (=4048 bytes). And we hit the BUG_ON.

The reason I didn't hit the panic on my test on Oracle UEK2 which is based on 2.6.39 kernel is it had it like this:

	BUG_ON(off % RDS_FRAG_SIZE);
	BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));

	/* Do not send cong updates to IB loopback */
	if (conn->c_loopback
	    && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
		rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
		return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
	}

(So it wasn't 100% 2.6.39 ;-). )
It returned 8240 bytes. The caller rds_send_xmit decides the full message was sent (48 byte header + 4096 data + 4096 data).
And it worked.

Then I found this info on the change that was done upstream which now causes the panic:
http://marc.info/?l=linux-netdev&m=129908332903057
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6094628bfd94323fc1cea05ec2c6affd98c18f7f

Will investigate more into which problem the above change addressed.

Venkat
-----Original Message-----
From: Venkat Venkatsubra
Sent: Tuesday, November 19, 2013 5:33 PM
To: Honggang LI; Josh Hunt
Cc: David Miller; jjolly@suse.com; LKML; netdev@vger.kernel.org
Subject: RE: [PATCH] rds: Error on offset mismatch if not loopback
We now have lot more information than we did before.
When sending a "congestion update" in rds_ib_xmit() we are now returning an incorrect number as bytes sent:
	BUG_ON(off % RDS_FRAG_SIZE);
	BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));

	/* Do not send cong updates to IB loopback */
	if (conn->c_loopback
	    && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
		rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
		scat = &rm->data.op_sg[sg];
		ret = sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
		ret = min_t(int, ret, scat->length - conn->c_xmit_data_off);
		return ret;
	}
It returns min(8240, 4096-0) i.e. 4096 bytes.
The caller rds_send_xmit() is made to think a partial message (4096 out of 8240) was sent.
It calls rds_ib_xmit() again with a data offset "off" of 4096-48 (rds header) (=4048 bytes). And we hit the BUG_ON.
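The arithmetic in this explanation can be checked with a small user-space model. This is only a sketch, not kernel code: the names xmit_cursor, loopback_cong_ret_clamped, account_sent and next_off_after_first_call are hypothetical, and the constants (48-byte header, 8192-byte congestion map, 4096-byte fragment) are assumptions taken from the numbers quoted in this thread.

```c
#include <assert.h>

/* Assumed values, taken from the figures quoted in the thread. */
enum {
	HDR_BYTES      = 48,    /* sizeof(struct rds_header) */
	CONG_MAP_BYTES = 8192,  /* RDS_CONG_MAP_BYTES */
	FRAG_SIZE      = 4096,  /* RDS_FRAG_SIZE */
};

/* Hypothetical stand-in for the connection's transmit cursor. */
struct xmit_cursor {
	int hdr_off;   /* header bytes accounted for so far */
	int data_off;  /* offset into the current scatterlist entry */
};

int min_int(int a, int b)
{
	return a < b ? a : b;
}

/* Byte count the loopback shortcut reports with the min_t() clamp. */
int loopback_cong_ret_clamped(int sg_len, int data_off)
{
	return min_int(HDR_BYTES + CONG_MAP_BYTES, sg_len - data_off);
}

/*
 * Charge the reported bytes to the header first, then to the data of
 * one scatterlist entry of sg_len bytes (enough for the first call).
 */
void account_sent(struct xmit_cursor *c, int ret, int sg_len)
{
	int tmp;

	assert(ret > 0);
	if (c->hdr_off < HDR_BYTES) {
		tmp = min_int(ret, HDR_BYTES - c->hdr_off);
		c->hdr_off += tmp;
		ret -= tmp;
	}
	c->data_off += min_int(ret, sg_len - c->data_off);
}

/* Data offset that the second transmit call would receive as "off". */
int next_off_after_first_call(int sg_len)
{
	struct xmit_cursor c = { 0, 0 };

	account_sent(&c, loopback_cong_ret_clamped(sg_len, c.data_off), sg_len);
	return c.data_off;
}
```

Under these assumptions, next_off_after_first_call(4096) gives 4048, which is not a multiple of FRAG_SIZE, matching the offset that trips BUG_ON(off % RDS_FRAG_SIZE).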
The reason I didn't hit the panic on my test on Oracle UEK2 which is based on 2.6.39 kernel is it had it like this:
	BUG_ON(off % RDS_FRAG_SIZE);
	BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));

	/* Do not send cong updates to IB loopback */
	if (conn->c_loopback
	    && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
		rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
		return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
	}
(So it wasn't 100% 2.6.39 ;-). )
It returned 8240 bytes. The caller rds_send_xmit decides the full message was sent (48 byte header + 4096 data + 4096 data).
And it worked.
Then I found this info on the change that was done upstream which now causes the panic:
http://marc.info/?l=linux-netdev&m=129908332903057
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6094628bfd94323fc1cea05ec2c6affd98c18f7f
Will investigate more into which problem the above change addressed.
Venkat
--
Looks like the fix pointed to by the above link is for a panic on a PPC system with a PAGE_SIZE of 64Kbytes.
I think the sequence it was going through before that fix was:
	/* Do not send cong updates to IB loopback */
	if (conn->c_loopback
	    && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
		rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
		return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
	}
rds_ib_xmit returns 8240
rds_send_xmit : c_xmit_data_off = 0 + 8240 - 48 (rds header the first time) = 8192
c_xmit_data_off < 65536 (sg->length)
calls rds_ib_xmit again
rds_ib_xmit returns 8240
rds_send_xmit: c_xmit_data_off = 8192+8240 = 16432 and calls rds_ib_xmit
rds_ib_xmit : returns 8240
rds_send_xmit: c_xmit_data_off 24672 and calls rds_ib_xmit
...
...
and so on till
rds_send_xmit: c_xmit_data_off 57632 and calls rds_ib_xmit
rds_ib_xmit: returns 8240
On the last iteration it hits the below BUG_ON in rds_send_xmit.
	while (ret) {
		tmp = min_t(int, ret, sg->length -
				      conn->c_xmit_data_off);
		[tmp = 7904]
		conn->c_xmit_data_off += tmp;
		[c_xmit_data_off = 65536]
		ret -= tmp;
		[ret = 8240-7904 = 336]
		if (conn->c_xmit_data_off == sg->length) {
			conn->c_xmit_data_off = 0;
			sg++;
			conn->c_xmit_sg++;
			BUG_ON(ret != 0 &&
			       conn->c_xmit_sg == rm->data.op_nents);
		}
	}
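The sequence traced above can be replayed in user space as a sanity check. The following is a rough sketch only: trace_result and trace_unclamped are hypothetical names, the loop is a simplified model of the accounting in rds_send_xmit rather than the kernel code itself, and it assumes the congestion map occupies a single 64K scatterlist entry on ppc (op_nents == 1), which the trace implies but does not state.

```c
#include <assert.h>

/* Assumed values, taken from the figures quoted in the thread. */
enum {
	HDR_BYTES      = 48,    /* sizeof(struct rds_header) */
	CONG_MAP_BYTES = 8192,  /* RDS_CONG_MAP_BYTES */
};

struct trace_result {
	int calls;     /* transmit calls made, including the one that trips */
	int leftover;  /* bytes left over when the check trips, 0 if it never does */
	int last_off;  /* data offset passed to the final call */
};

/*
 * Replay the accounting loop when every transmit call reports
 * HDR_BYTES + CONG_MAP_BYTES, for nents scatterlist entries of
 * sg_len bytes each.
 */
struct trace_result trace_unclamped(int sg_len, int nents)
{
	struct trace_result r = { 0, 0, 0 };
	int hdr_off = 0, data_off = 0, sg = 0;

	assert(sg_len > 0 && nents > 0);

	while (sg < nents) {
		int ret = HDR_BYTES + CONG_MAP_BYTES;
		int tmp;

		r.last_off = data_off;
		r.calls++;

		/* The header is charged once, on the first call. */
		if (hdr_off < HDR_BYTES) {
			tmp = ret < HDR_BYTES - hdr_off ? ret : HDR_BYTES - hdr_off;
			hdr_off += tmp;
			ret -= tmp;
		}
		while (ret) {
			tmp = ret < sg_len - data_off ? ret : sg_len - data_off;
			data_off += tmp;
			ret -= tmp;
			if (data_off == sg_len) {
				data_off = 0;
				sg++;
				/* Same condition as the BUG_ON in the loop above. */
				if (ret != 0 && sg == nents) {
					r.leftover = ret;
					return r;
				}
			}
		}
	}
	return r;
}
```

With these assumptions, trace_unclamped(65536, 1) trips the check on the eighth call with 336 bytes left over and a final offset of 57632, while trace_unclamped(4096, 2) completes in one call with nothing left over, consistent with the 4K-page case that worked.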
Since the congestion update over loopback is not actually transmitted as a message,
the multiple iterations we see in the case of ppc are unnecessary.
All that rds_ib_xmit needs to do is return a number of bytes that will tell the caller that
we are done with this message.
This might fix the original problem without introducing the current panic:
	/* Do not send cong updates to IB loopback */
	if (conn->c_loopback
	    && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
		rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
		scat = &rm->data.op_sg[sg];
		ret = max_t(int, RDS_CONG_MAP_BYTES, scat->length);
		return ret + sizeof(struct rds_header);
	}
It will return 8240 when PAGE_SIZE is 4k and 64k+48 in case of ppc when scat->length is 64k and
be done with one iteration of rds_send_xmit/rds_ib_xmit loop.
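The two return values quoted for the proposed change can be checked with a short user-space sketch. The names proposed_ret and finishes_in_one_call are hypothetical, the constants are assumptions taken from this thread, and the entry counts (two 4K entries, or one 64K entry) are likewise assumed from the figures above rather than read from the kernel source.

```c
#include <assert.h>

/* Assumed values, taken from the figures quoted in the thread. */
enum {
	HDR_BYTES      = 48,    /* sizeof(struct rds_header) */
	CONG_MAP_BYTES = 8192,  /* RDS_CONG_MAP_BYTES */
};

/* Byte count the proposed loopback shortcut would report. */
int proposed_ret(int sg_len)
{
	int body = CONG_MAP_BYTES > sg_len ? CONG_MAP_BYTES : sg_len;

	return body + HDR_BYTES;
}

/*
 * Return 1 if one reported byte count covers the header plus all
 * nents scatterlist entries of sg_len bytes exactly, and 0 if bytes
 * are left over or the message is only partly consumed.
 */
int finishes_in_one_call(int ret, int sg_len, int nents)
{
	int sg = 0, data_off = 0;

	assert(sg_len > 0 && nents > 0);
	if (ret < HDR_BYTES)
		return 0;
	ret -= HDR_BYTES;

	while (ret) {
		int room = sg_len - data_off;
		int tmp = ret < room ? ret : room;

		data_off += tmp;
		ret -= tmp;
		if (data_off == sg_len) {
			data_off = 0;
			sg++;
			if (ret != 0 && sg == nents)
				return 0;  /* would trip the BUG_ON in the caller */
		}
	}
	return sg == nents && data_off == 0;
}
```

Under these assumptions proposed_ret(4096) is 8240 and proposed_ret(65536) is 65584 (64k+48), and both finish in a single call, whereas the clamped value of 4096 on a 4K page leaves the message only partly consumed.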
Venkat
Why are you posting this message a second time?
> Why are you posting this message a second time?
Reposting just the contents of the second message in case it got missed the previous time.
Looks like the fix pointed to by the previous link is for a panic on a PPC system with a PAGE_SIZE of 64Kbytes.
I think the sequence it was going through before that fix was:
	/* Do not send cong updates to IB loopback */
	if (conn->c_loopback
	    && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
		rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
		return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
	}
rds_ib_xmit returns 8240
rds_send_xmit : c_xmit_data_off = 0 + 8240 - 48 (rds header the first time) = 8192
c_xmit_data_off < 65536 (sg->length)
calls rds_ib_xmit again
rds_ib_xmit returns 8240
rds_send_xmit: c_xmit_data_off = 8192+8240 = 16432 and calls rds_ib_xmit
rds_ib_xmit : returns 8240
rds_send_xmit: c_xmit_data_off 24672 and calls rds_ib_xmit ...
...
and so on till
rds_send_xmit: c_xmit_data_off 57632 and calls rds_ib_xmit
rds_ib_xmit: returns 8240
On the last iteration it hits the below BUG_ON in rds_send_xmit.
	while (ret) {
		tmp = min_t(int, ret, sg->length -
				      conn->c_xmit_data_off);
		[tmp = 7904]
		conn->c_xmit_data_off += tmp;
		[c_xmit_data_off = 65536]
		ret -= tmp;
		[ret = 8240-7904 = 336]
		if (conn->c_xmit_data_off == sg->length) {
			conn->c_xmit_data_off = 0;
			sg++;
			conn->c_xmit_sg++;
			BUG_ON(ret != 0 &&
			       conn->c_xmit_sg == rm->data.op_nents);
		}
	}
Since the congestion update over loopback is not actually transmitted as a message,
the multiple iterations we see in the case of ppc are unnecessary.
All that rds_ib_xmit needs to do is return a number of bytes that will tell the caller
that we are done with this message.
This might fix the original problem without introducing the current panic:
	/* Do not send cong updates to IB loopback */
	if (conn->c_loopback
	    && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
		rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
		scat = &rm->data.op_sg[sg];
		ret = max_t(int, RDS_CONG_MAP_BYTES, scat->length);
		return ret + sizeof(struct rds_header);
	}
It will return 8240 when PAGE_SIZE is 4k and 64k+48 in case of ppc when scat->length is 64k and
be done with one iteration of rds_send_xmit/rds_ib_xmit loop.
Venkat
diff --git a/net/rds/ib_send.c b/net/rds/ib_send.c
index e590949..7920c85 100644
--- a/net/rds/ib_send.c
+++ b/net/rds/ib_send.c
@@ -544,7 +544,7 @@ int rds_ib_xmit(struct rds_connection *conn, struct rds_message *rm,
 	int flow_controlled = 0;
 	int nr_sig = 0;
 
-	BUG_ON(off % RDS_FRAG_SIZE);
+	BUG_ON(!conn->c_loopback && off % RDS_FRAG_SIZE);
 	BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));
 
 	/* Do not send cong updates to IB loopback */
Attempting an rds connection from the IP address of an IPoIB interface
to itself causes a kernel panic due to a BUG_ON() being triggered.
Making the test less strict allows rds-ping to work without crashing
the machine.

A local unprivileged user could use this flaw to crash the system.

Signed-off-by: John Jolly <jjolly@suse.com>
---
 net/rds/ib_send.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)