Message ID | 1476282127.5271.30.camel@sipsolutions.net |
---|---|
State | Awaiting Upstream, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, Oct 12, 2016 at 7:22 AM, Johannes Berg <johannes@sipsolutions.net> wrote: > >> > Can you elaborate on how exactly it kills your system? >> >> the last time I saw it it was a NULL deref at >> ieee80211_aes_ccm_decrypt. > > Hm. I was expecting something within the crypto code would cause the > crash, this seems strange. > > Anyway, I'm surely out of my depth wrt. the actual cause. Something > like the patch below probably works around it, but it's horribly > inefficient due to the locking and doesn't cover CMAC/GMAC either. In a pinch, I have these patches sitting around: https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=0a39cfa6fbb5d5635c85253cc7d6b44b54822afd https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=bf8cfa200b5a01383ea39fc8ce2f32909767baa8 I don't like them, though. I think it's rather silly that we can't just pass virtual addresses to the crypto code.
On Wed, 2016-10-12 at 22:39 -0700, Andy Lutomirski wrote: > In a pinch, I have these patches sitting around: > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=0a39cfa6fbb5d5635c85253cc7d6b44b54822afd > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=bf8cfa200b5a01383ea39fc8ce2f32909767baa8 That truly sounds like something we'd rather avoid in the TX/RX paths though, which should perform well. > I don't like them, though. I think it's rather silly that we can't > just pass virtual addresses to the crypto code. I don't really understand it either, hence my question about the actual crash. I'll try to reproduce it in a VM. johannes
On (10/13/16 08:02), Johannes Berg wrote: > On Wed, 2016-10-12 at 22:39 -0700, Andy Lutomirski wrote: > > > In a pinch, I have these patches sitting around: > > > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=0a39cfa6fbb5d5635c85253cc7d6b44b54822afd > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=bf8cfa200b5a01383ea39fc8ce2f32909767baa8 > > That truly sounds like something we'd rather avoid in the TX/RX paths > though, which should perform well. didn't fix. so I finally had some time to do a better bug-reporter job. I added a bunch of printk-s and several virt_addr_valid()-s to ieee80211_aes_ccm_encrypt(). and right befoe the Oops I see the following report from virt_addr_valid() FAIL: 00004100002cba02 > ffffc900802cba02 || 1 -> (00004100002cba02 >> 39) == 130 which is basically failed '!phys_addr_valid(x)' in __virt_addr_valid() /* carry flag will be set if starting x was >= PAGE_OFFSET */ if ((x > y) || !phys_addr_valid(x)) return false; backtrace ------------[ cut here ]------------ WARNING: CPU: 7 PID: 246 at arch/x86/mm/physaddr.c:68 __virt_addr_valid+0xab/0xed ffffc900002cb6f0 ffffffff8122168c 0000000000000000 0000000000000000 ffffc900002cb730 ffffffff810428d8 0000004400000198 ffff88041bd21022 ffffc900002cba02 1ffff920000596ed ffff88041932d1e0 ffffc900002cba00 Call Trace: [<ffffffff8122168c>] dump_stack+0x4f/0x65 [<ffffffff810428d8>] __warn+0xc2/0xdd [<ffffffff81042963>] warn_slowpath_null+0x1d/0x1f [<ffffffff8103c226>] __virt_addr_valid+0xab/0xed [<ffffffff8146d31a>] ieee80211_aes_ccm_decrypt+0x8f/0x2da [<ffffffff812372de>] ? debug_smp_processor_id+0x17/0x19 [<ffffffff810fb7e1>] ? __put_page+0x3c/0x3f [<ffffffff8145b879>] ? ccmp_special_blocks.isra.1+0x51/0x12d [<ffffffff8145c445>] ieee80211_crypto_ccmp_decrypt+0x204/0x298 [<ffffffff81476dd1>] ieee80211_rx_handlers+0x7df/0x1c1d [<ffffffff814791c1>] ieee80211_prepare_and_rx_handle+0xdc2/0xe79 [<ffffffff814793cc>] ? ieee80211_rx_napi+0x154/0x7a5 [<ffffffff814796ec>] ieee80211_rx_napi+0x474/0x7a5 [<ffffffffa01fce3b>] iwl_mvm_rx_rx_mpdu+0x6e6/0x751 [iwlmvm] [<ffffffffa01f6c49>] iwl_mvm_rx+0x7e/0x98 [iwlmvm] [<ffffffffa01c0bca>] iwl_pcie_rx_handle+0x523/0x698 [iwlwifi] [<ffffffffa01c2015>] iwl_pcie_irq_handler+0x45d/0x64d [iwlwifi] [<ffffffff81089411>] ? irq_finalize_oneshot+0xd4/0xd4 [<ffffffff8108942e>] irq_thread_fn+0x1d/0x34 [<ffffffff810896a2>] irq_thread+0xe6/0x1bb [<ffffffff81089527>] ? wake_threads_waitq+0x2c/0x2c [<ffffffff810895bc>] ? irq_thread_dtor+0x95/0x95 [<ffffffff8105d7a3>] kthread+0xfc/0x104 [<ffffffff8107d3ad>] ? put_lock_stats.isra.9+0xe/0x20 [<ffffffff8105d6a7>] ? kthread_create_on_node+0x3f/0x3f [<ffffffff8105d6a7>] ? kthread_create_on_node+0x3f/0x3f [<ffffffff8105d6a7>] ? kthread_create_on_node+0x3f/0x3f [<ffffffff814b2952>] ret_from_fork+0x22/0x30 -ss
On (10/13/16 22:42), Sergey Senozhatsky wrote: > > On (10/13/16 08:02), Johannes Berg wrote: > > On Wed, 2016-10-12 at 22:39 -0700, Andy Lutomirski wrote: > > > > > In a pinch, I have these patches sitting around: > > > > > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=0a39cfa6fbb5d5635c85253cc7d6b44b54822afd > > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vmap_stack&id=bf8cfa200b5a01383ea39fc8ce2f32909767baa8 > > > > That truly sounds like something we'd rather avoid in the TX/RX paths > > though, which should perform well. > > didn't fix. > > so I finally had some time to do a better bug-reporter job. > > I added a bunch of printk-s and several virt_addr_valid()-s > to ieee80211_aes_ccm_encrypt(). > > and right befoe the Oops I see the following report from > virt_addr_valid() > > > FAIL: 00004100002cba02 > ffffc900802cba02 || 1 -> (00004100002cba02 >> 39) == 130 that `(00004100002cba02 >> 39) == 130' part is phys_addr_valid() { (addr >> boot_cpu_data.x86_phys_bits) } -ss
On Thu, 2016-10-13 at 22:42 +0900, Sergey Senozhatsky wrote: > > > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commi > > > t/?h=x86/vmap_stack&id=0a39cfa6fbb5d5635c85253cc7d6b44b54822afd > > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commi > > > t/?h=x86/vmap_stack&id=bf8cfa200b5a01383ea39fc8ce2f32909767baa8 > > > > That truly sounds like something we'd rather avoid in the TX/RX > > paths though, which should perform well. > > didn't fix. It couldn't, since the new helpers weren't used in mac80211 in those patches yet. > so I finally had some time to do a better bug-reporter job. > > I added a bunch of printk-s and several virt_addr_valid()-s > to ieee80211_aes_ccm_encrypt(). > > and right befoe the Oops I see the following report from > virt_addr_valid() > > > FAIL: 00004100002cba02 > ffffc900802cba02 || 1 -> (00004100002cba02 > >> 39) == 130 Yeah, we already know that in this function the aad variable is on the stack, it explicitly is. The question, though, is why precisely that fails in the crypto code. Can you send the Oops report itself? johannes
On (10/13/16 15:45), Johannes Berg wrote: > On Thu, 2016-10-13 at 22:42 +0900, Sergey Senozhatsky wrote: > > > > > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commi > > > > t/?h=x86/vmap_stack&id=0a39cfa6fbb5d5635c85253cc7d6b44b54822afd > > > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commi > > > > t/?h=x86/vmap_stack&id=bf8cfa200b5a01383ea39fc8ce2f32909767baa8 > > > > > > That truly sounds like something we'd rather avoid in the TX/RX > > > paths though, which should perform well. > > > > didn't fix. > > It couldn't, since the new helpers weren't used in mac80211 in those > patches yet. indeed. I thought they were. > > FAIL: 00004100002cba02 > ffffc900802cba02 || 1 -> (00004100002cba02 > > >> 39) == 130 > > The question, though, is why precisely that fails in the crypto code. > Can you send the Oops report itself? kernel: BUG: unable to handle kernel NULL pointer dereference at (null) kernel: IP: [<ffffffff8146d2f4>] ieee80211_aes_ccm_decrypt+0x107/0x27f kernel: PGD 0 kernel: kernel: Oops: 0000 [#1] PREEMPT SMP kernel: Modules linked in: nls_iso8859_1 nls_cp437 vfat fat mousedev psmouse serio_raw atkbd libps2 i915 coretemp i2c_algo_bit hwmon crc32c_intel mxm_wmi drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect iwlmvm sysimgblt fb_sys_fops i2c_i801 cfbcopyarea ie31200_edac drm iwlwifi i2c kernel: CPU: 3 PID: 245 Comm: irq/28-iwlwifi Not tainted 4.8.0-next-20161013-dbg-00002-ge789862-dirty #112 kernel: task: ffff88041bf01800 task.stack: ffffc900002d0000 kernel: RIP: 0010:[<ffffffff8146d2f4>] [<ffffffff8146d2f4>] ieee80211_aes_ccm_decrypt+0x107/0x27f kernel: RSP: 0018:ffffc900002d3770 EFLAGS: 00010246 kernel: RAX: ffffc900002d3930 RBX: ffff8804133cf606 RCX: 0000000000082000 kernel: RDX: 0000000000000000 RSI: 0000000000000018 RDI: 0000000000000a02 kernel: RBP: ffffc900002d39b8 R08: 00000000000005e4 R09: 00000004100002d3 kernel: R10: 000000000000001c R11: ffff8803e66d2d20 R12: ffff8804191c2780 kernel: R13: ffffc900002d39f0 R14: ffff8804133cf022 R15: 1ffff9200005a6ee kernel: FS: 0000000000000000(0000) GS:ffff88041ea00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 0000000000000000 CR3: 0000000001805000 CR4: 00000000001406e0 kernel: Stack: kernel: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 kernel: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 kernel: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 kernel: Call Trace: kernel: [<ffffffff8145c405>] ieee80211_crypto_ccmp_decrypt+0x204/0x298 kernel: [<ffffffff81476cd8>] ieee80211_rx_handlers+0x7df/0x1c1d kernel: [<ffffffff814790c8>] ieee80211_prepare_and_rx_handle+0xdc2/0xe79 kernel: [<ffffffff814792e7>] ? ieee80211_rx_napi+0x168/0x7b6 kernel: [<ffffffff8147960a>] ieee80211_rx_napi+0x48b/0x7b6 kernel: [<ffffffff8123729e>] ? debug_smp_processor_id+0x17/0x19 kernel: [<ffffffffa01cfe3b>] iwl_mvm_rx_rx_mpdu+0x6e6/0x751 [iwlmvm] kernel: [<ffffffffa01c9c49>] iwl_mvm_rx+0x7e/0x98 [iwlmvm] kernel: [<ffffffffa0131bca>] iwl_pcie_rx_handle+0x523/0x698 [iwlwifi] kernel: [<ffffffffa0133027>] iwl_pcie_irq_handler+0x46f/0x65f [iwlwifi] kernel: [<ffffffff810893d0>] ? irq_finalize_oneshot+0xd4/0xd4 kernel: [<ffffffff810893ed>] irq_thread_fn+0x1d/0x34 kernel: [<ffffffff81089661>] irq_thread+0xe6/0x1bb kernel: [<ffffffff810894e6>] ? wake_threads_waitq+0x2c/0x2c kernel: [<ffffffff8108957b>] ? irq_thread_dtor+0x95/0x95 kernel: [<ffffffff8105d762>] kthread+0xfc/0x104 kernel: [<ffffffff8107d36c>] ? put_lock_stats.isra.9+0xe/0x20 kernel: [<ffffffff8105d666>] ? kthread_create_on_node+0x3f/0x3f kernel: [<ffffffff814b2852>] ret_from_fork+0x22/0x30 kernel: Code: 01 ca 49 89 d1 48 89 d1 48 c1 ea 23 48 8b 14 d5 80 23 63 82 49 c1 e9 0c 48 c1 e9 1b 48 85 d2 74 0a 0f b6 c9 48 c1 e1 04 48 01 ca <48> 8b 12 49 c1 e1 06 b9 00 00 00 80 89 7d 80 89 75 84 48 8b 3d kernel: RIP [<ffffffff8146d2f4>] ieee80211_aes_ccm_decrypt+0x107/0x27f kernel: RSP <ffffc900002d3770> kernel: CR2: 0000000000000000 kernel: ---[ end trace 3cd1fcd496516f72 ]--- -ss
On (10/14/16 00:00), Sergey Senozhatsky wrote: > kernel: [<ffffffff8145c405>] ieee80211_crypto_ccmp_decrypt+0x204/0x298 > kernel: [<ffffffff81476cd8>] ieee80211_rx_handlers+0x7df/0x1c1d > kernel: [<ffffffff814790c8>] ieee80211_prepare_and_rx_handle+0xdc2/0xe79 > kernel: [<ffffffff814792e7>] ? ieee80211_rx_napi+0x168/0x7b6 > kernel: [<ffffffff8147960a>] ieee80211_rx_napi+0x48b/0x7b6 > kernel: [<ffffffff8123729e>] ? debug_smp_processor_id+0x17/0x19 > kernel: [<ffffffffa01cfe3b>] iwl_mvm_rx_rx_mpdu+0x6e6/0x751 [iwlmvm] > kernel: [<ffffffffa01c9c49>] iwl_mvm_rx+0x7e/0x98 [iwlmvm] > kernel: [<ffffffffa0131bca>] iwl_pcie_rx_handle+0x523/0x698 [iwlwifi] > kernel: [<ffffffffa0133027>] iwl_pcie_irq_handler+0x46f/0x65f [iwlwifi] > kernel: [<ffffffff810893d0>] ? irq_finalize_oneshot+0xd4/0xd4 > kernel: [<ffffffff810893ed>] irq_thread_fn+0x1d/0x34 > kernel: [<ffffffff81089661>] irq_thread+0xe6/0x1bb > kernel: [<ffffffff810894e6>] ? wake_threads_waitq+0x2c/0x2c > kernel: [<ffffffff8108957b>] ? irq_thread_dtor+0x95/0x95 > kernel: [<ffffffff8105d762>] kthread+0xfc/0x104 > kernel: [<ffffffff8107d36c>] ? put_lock_stats.isra.9+0xe/0x20 > kernel: [<ffffffff8105d666>] ? kthread_create_on_node+0x3f/0x3f > kernel: [<ffffffff814b2852>] ret_from_fork+0x22/0x30 > kernel: Code: 01 ca 49 89 d1 48 89 d1 48 c1 ea 23 48 8b 14 d5 80 23 63 82 49 c1 e9 0c 48 c1 e9 1b 48 85 d2 74 0a 0f b6 c9 48 c1 e1 04 48 01 ca <48> 8b 12 49 c1 e1 06 b9 00 00 00 80 89 7d 80 89 75 84 48 8b 3d > kernel: RIP [<ffffffff8146d2f4>] ieee80211_aes_ccm_decrypt+0x107/0x27f ffffffff8146d1ed <ieee80211_aes_ccm_decrypt>: ffffffff8146d1ed: e8 9e 67 04 00 callq ffffffff814b3990 <__fentry__> ffffffff8146d1f2: 55 push %rbp ffffffff8146d1f3: 48 89 e5 mov %rsp,%rbp ffffffff8146d1f6: 41 57 push %r15 ffffffff8146d1f8: 41 56 push %r14 ffffffff8146d1fa: 49 89 ce mov %rcx,%r14 ffffffff8146d1fd: 41 55 push %r13 ffffffff8146d1ff: 41 54 push %r12 ffffffff8146d201: 53 push %rbx ffffffff8146d202: 48 83 c4 80 add $0xffffffffffffff80,%rsp ffffffff8146d206: 8b 47 04 mov 0x4(%rdi),%eax ffffffff8146d209: 48 8d 48 50 lea 0x50(%rax),%rcx ffffffff8146d20d: 48 83 c0 5e add $0x5e,%rax ffffffff8146d211: 48 c1 e8 03 shr $0x3,%rax ffffffff8146d215: 48 c1 e0 03 shl $0x3,%rax ffffffff8146d219: 48 29 c4 sub %rax,%rsp ffffffff8146d21c: 4c 8d 7c 24 07 lea 0x7(%rsp),%r15 ffffffff8146d221: 49 c1 ef 03 shr $0x3,%r15 ffffffff8146d225: 4d 85 c0 test %r8,%r8 ffffffff8146d228: 4a 8d 04 fd 00 00 00 lea 0x0(,%r15,8),%rax ffffffff8146d22f: 00 ffffffff8146d230: 48 89 85 70 ff ff ff mov %rax,-0x90(%rbp) ffffffff8146d237: 75 0a jne ffffffff8146d243 <ieee80211_aes_ccm_decrypt+0x56> ffffffff8146d239: b8 ea ff ff ff mov $0xffffffea,%eax ffffffff8146d23e: e9 1a 02 00 00 jmpq ffffffff8146d45d <ieee80211_aes_ccm_decrypt+0x270> ffffffff8146d243: 31 c0 xor %eax,%eax ffffffff8146d245: 49 89 fc mov %rdi,%r12 ffffffff8146d248: 49 89 f5 mov %rsi,%r13 ffffffff8146d24b: 4c 89 85 58 ff ff ff mov %r8,-0xa8(%rbp) ffffffff8146d252: 4a 8d 3c fd 00 00 00 lea 0x0(,%r15,8),%rdi ffffffff8146d259: 00 ffffffff8146d25a: be 03 00 00 00 mov $0x3,%esi ffffffff8146d25f: 4c 89 cb mov %r9,%rbx ffffffff8146d262: 48 89 95 60 ff ff ff mov %rdx,-0xa0(%rbp) ffffffff8146d269: f3 aa rep stos %al,%es:(%rdi) ffffffff8146d26b: 48 8d 85 78 ff ff ff lea -0x88(%rbp),%rax ffffffff8146d272: 48 89 c7 mov %rax,%rdi ffffffff8146d275: 48 89 85 68 ff ff ff mov %rax,-0x98(%rbp) ffffffff8146d27c: e8 46 06 dc ff callq ffffffff8122d8c7 <sg_init_table> ffffffff8146d281: 48 8b 95 60 ff ff ff mov -0xa0(%rbp),%rdx ffffffff8146d288: 41 b9 00 00 00 80 mov $0x80000000,%r9d ffffffff8146d28e: 48 8b 0d 7b cd 39 00 mov 0x39cd7b(%rip),%rcx # ffffffff8180a010 <phys_base> ffffffff8146d295: 48 8b 85 68 ff ff ff mov -0x98(%rbp),%rax ffffffff8146d29c: 4c 8b 85 58 ff ff ff mov -0xa8(%rbp),%r8 ffffffff8146d2a3: 0f b7 32 movzwl (%rdx),%esi ffffffff8146d2a6: 48 83 c2 02 add $0x2,%rdx ffffffff8146d2aa: 89 d7 mov %edx,%edi ffffffff8146d2ac: 81 e7 ff 0f 00 00 and $0xfff,%edi ffffffff8146d2b2: 66 c1 c6 08 rol $0x8,%si ffffffff8146d2b6: 4c 01 ca add %r9,%rdx ffffffff8146d2b9: 0f b7 f6 movzwl %si,%esi ffffffff8146d2bc: 72 0a jb ffffffff8146d2c8 <ieee80211_aes_ccm_decrypt+0xdb> ffffffff8146d2be: 48 b9 00 00 00 80 ff movabs $0x77ff80000000,%rcx ffffffff8146d2c5: 77 00 00 ffffffff8146d2c8: 48 01 ca add %rcx,%rdx ffffffff8146d2cb: 49 89 d1 mov %rdx,%r9 ffffffff8146d2ce: 48 89 d1 mov %rdx,%rcx ffffffff8146d2d1: 48 c1 ea 23 shr $0x23,%rdx ffffffff8146d2d5: 48 8b 14 d5 80 23 63 mov -0x7d9cdc80(,%rdx,8),%rdx ffffffff8146d2dc: 82 ffffffff8146d2dd: 49 c1 e9 0c shr $0xc,%r9 ffffffff8146d2e1: 48 c1 e9 1b shr $0x1b,%rcx ffffffff8146d2e5: 48 85 d2 test %rdx,%rdx ffffffff8146d2e8: 74 0a je ffffffff8146d2f4 <ieee80211_aes_ccm_decrypt+0x107> ffffffff8146d2ea: 0f b6 c9 movzbl %cl,%ecx ffffffff8146d2ed: 48 c1 e1 04 shl $0x4,%rcx ffffffff8146d2f1: 48 01 ca add %rcx,%rdx ffffffff8146d2f4: 48 8b 12 mov (%rdx),%rdx ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ffffffff8146d2f7: 49 c1 e1 06 shl $0x6,%r9 ffffffff8146d2fb: b9 00 00 00 80 mov $0x80000000,%ecx ffffffff8146d300: 89 7d 80 mov %edi,-0x80(%rbp) ffffffff8146d303: 89 75 84 mov %esi,-0x7c(%rbp) ffffffff8146d306: 48 8b 3d 03 cd 39 00 mov 0x39cd03(%rip),%rdi # ffffffff8180a010 <phys_base> ffffffff8146d30d: 48 83 e2 fc and $0xfffffffffffffffc,%rdx ffffffff8146d311: 49 01 d1 add %rdx,%r9 ffffffff8146d314: 48 8b 95 78 ff ff ff mov -0x88(%rbp),%rdx -ss
On Oct 13, 2016 6:46 AM, "Johannes Berg" <johannes@sipsolutions.net> wrote: > > On Thu, 2016-10-13 at 22:42 +0900, Sergey Senozhatsky wrote: > > > > > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commi > > > > t/?h=x86/vmap_stack&id=0a39cfa6fbb5d5635c85253cc7d6b44b54822afd > > > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commi > > > > t/?h=x86/vmap_stack&id=bf8cfa200b5a01383ea39fc8ce2f32909767baa8 > > > > > > That truly sounds like something we'd rather avoid in the TX/RX > > > paths though, which should perform well. > > > > didn't fix. > > It couldn't, since the new helpers weren't used in mac80211 in those > patches yet. > > > so I finally had some time to do a better bug-reporter job. > > > > I added a bunch of printk-s and several virt_addr_valid()-s > > to ieee80211_aes_ccm_encrypt(). > > > > and right befoe the Oops I see the following report from > > virt_addr_valid() > > > > > > FAIL: 00004100002cba02 > ffffc900802cba02 || 1 -> (00004100002cba02 > > >> 39) == 130 > > Yeah, we already know that in this function the aad variable is on the > stack, it explicitly is. > > The question, though, is why precisely that fails in the crypto code. > Can you send the Oops report itself? > It's failing before that. With CONFIG_VMAP_STACK=y, the stack may not be physically contiguous and can't be used for DMA, so putting it in a scatterlist is bogus in general, and the crypto code mostly wants a scatterlist. There are a couple (faster!) APIs for crypto that don't use scatterlists, but I don't think AEAD works with them. --Andy
On Thu, 2016-10-13 at 14:49 -0700, Andy Lutomirski wrote: > > It's failing before that. With CONFIG_VMAP_STACK=y, the stack may > not be physically contiguous and can't be used for DMA, so putting it > in a scatterlist is bogus in general, and the crypto code mostly > wants a scatterlist. I see, so all this stuff is getting inlined, and we crash in sg_set_buf() because it does sg_set_page() and that obviously needs to do virt_to_page(), which is invalid on this address now. With CONFIG_DEBUG_SG we'd have hit the BUG_ON there instead. It does indeed look like AEAD doesn't have any non-SG API. So ultimately, the bug already goes back to Ard's commit 7ec7c4a9a686 ("mac80211: port CCMP to cryptoapi's CCM driver") since that already potentially used stack space for DMA. Since we don't have any space in the SKB or anywhere else at this point (other than the stack that we can't use), I see two ways out of this: 1. revert that patch (doing so would need some major adjustments now, since it's pretty old and a number of new things were added in the meantime) 2. allocate a per-CPU buffer for all the things that we put on the stack and use in SG lists, those are: * CCM/GCM: AAD (32B), B_0/J_0 (16B) * GMAC: AAD (20B), zero (16B) * (not sure why CMAC isn't using this API, but it would be like GMAC) Thoughts? johannes
> 1. revert that patch (doing so would need some major adjustments now, > since it's pretty old and a number of new things were added in the > meantime) This it will have to be, I guess. > 2. allocate a per-CPU buffer for all the things that we put on the > stack and use in SG lists, those are: > * CCM/GCM: AAD (32B), B_0/J_0 (16B) > * GMAC: AAD (20B), zero (16B) > * (not sure why CMAC isn't using this API, but it would be like GMAC) This doesn't work - I tried to move the mac80211 buffers, but because we also put the struct aead_request on the stack, and crypto_ccm has the "odata" in there, and we can't separate the odata from that struct, we'd have to also put that into a per-CPU buffer, but it's very big - 456 bytes for CCM, didn't measure the others but I'd expect them to be larger, if different. I don't think we can allocate half a kb for each CPU just to be able to possibly use the acceleration here. We can't even make that conditional on not having hardware crypto in the wifi NIC because drivers are always allowed to pass undecrypted frames, regardless of whether or not HW crypto was attempted, so we don't know upfront if we'll have to decrypt anything in software... Given that, I think we have had a bug in here basically since Ard's patch, we never should've put these structs on the stack. Herbert, you also touched this later and converted the API usage, did you see the way the stack is used here and think it should be OK, or did you simply not realize that? Ard, are you able to help out working on a revert of your patch? That would require also reverting a number of other patches (various fixes, API adjustments, etc. to the AEAD usage), but the more complicated part is that in the meantime Jouni introduced GCMP and CCMP-256, both of which we of course need to retain. johannes
On 14 October 2016 at 09:28, Johannes Berg <johannes@sipsolutions.net> wrote: > >> 1. revert that patch (doing so would need some major adjustments now, >> since it's pretty old and a number of new things were added in the >> meantime) > > This it will have to be, I guess. > >> 2. allocate a per-CPU buffer for all the things that we put on the >> stack and use in SG lists, those are: >> * CCM/GCM: AAD (32B), B_0/J_0 (16B) >> * GMAC: AAD (20B), zero (16B) >> * (not sure why CMAC isn't using this API, but it would be like GMAC) > > This doesn't work - I tried to move the mac80211 buffers, but because > we also put the struct aead_request on the stack, and crypto_ccm has > the "odata" in there, and we can't separate the odata from that struct, > we'd have to also put that into a per-CPU buffer, but it's very big - > 456 bytes for CCM, didn't measure the others but I'd expect them to be > larger, if different. > > I don't think we can allocate half a kb for each CPU just to be able to > possibly use the acceleration here. We can't even make that conditional > on not having hardware crypto in the wifi NIC because drivers are > always allowed to pass undecrypted frames, regardless of whether or not > HW crypto was attempted, so we don't know upfront if we'll have to > decrypt anything in software... > > Given that, I think we have had a bug in here basically since Ard's > patch, we never should've put these structs on the stack. Herbert, you > also touched this later and converted the API usage, did you see the > way the stack is used here and think it should be OK, or did you simply > not realize that? > > Ard, are you able to help out working on a revert of your patch? That > would require also reverting a number of other patches (various fixes, > API adjustments, etc. to the AEAD usage), but the more complicated part > is that in the meantime Jouni introduced GCMP and CCMP-256, both of > which we of course need to retain. > I am missing some context here, but could you explain what exactly is the problem here? Look at this code """ struct scatterlist sg[3]; char aead_req_data[sizeof(struct aead_request) + crypto_aead_reqsize(tfm)] __aligned(__alignof__(struct aead_request)); struct aead_request *aead_req = (void *) aead_req_data; memset(aead_req, 0, sizeof(aead_req_data)); sg_init_table(sg, 3); sg_set_buf(&sg[0], &aad[2], be16_to_cpup((__be16 *)aad)); sg_set_buf(&sg[1], data, data_len); sg_set_buf(&sg[2], mic, mic_len); aead_request_set_tfm(aead_req, tfm); aead_request_set_crypt(aead_req, sg, sg, data_len, b_0); aead_request_set_ad(aead_req, sg[0].length); """ I assume the stack buffer itself is not the problem here, but aad, which is allocated on the stack one frame up. Do we really need to revert the whole patch to fix that?
On (10/13/16 14:49), Andy Lutomirski wrote: [..] > > > FAIL: 00004100002cba02 > ffffc900802cba02 || 1 -> (00004100002cba02 > > > >> 39) == 130 > > > > Yeah, we already know that in this function the aad variable is on the > > stack, it explicitly is. > > > > The question, though, is why precisely that fails in the crypto code. > > Can you send the Oops report itself? > > > > It's failing before that. With CONFIG_VMAP_STACK=y, the stack may not > be physically contiguous and can't be used for DMA, so putting it in a > scatterlist is bogus in general, and the crypto code mostly wants a > scatterlist. > > There are a couple (faster!) APIs for crypto that don't use > scatterlists, but I don't think AEAD works with them. given that we have a known issue shouldn't VMAP_STACK be disabled for now, or would you rather prefer to mark MAC80211 as incompatible: "depends on CFG80211 && !VMAP_STACK"? -ss
On 14 October 2016 at 09:39, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > On 14 October 2016 at 09:28, Johannes Berg <johannes@sipsolutions.net> wrote: >> >>> 1. revert that patch (doing so would need some major adjustments now, >>> since it's pretty old and a number of new things were added in the >>> meantime) >> >> This it will have to be, I guess. >> >>> 2. allocate a per-CPU buffer for all the things that we put on the >>> stack and use in SG lists, those are: >>> * CCM/GCM: AAD (32B), B_0/J_0 (16B) >>> * GMAC: AAD (20B), zero (16B) >>> * (not sure why CMAC isn't using this API, but it would be like GMAC) >> >> This doesn't work - I tried to move the mac80211 buffers, but because >> we also put the struct aead_request on the stack, and crypto_ccm has >> the "odata" in there, and we can't separate the odata from that struct, >> we'd have to also put that into a per-CPU buffer, but it's very big - >> 456 bytes for CCM, didn't measure the others but I'd expect them to be >> larger, if different. >> >> I don't think we can allocate half a kb for each CPU just to be able to >> possibly use the acceleration here. We can't even make that conditional >> on not having hardware crypto in the wifi NIC because drivers are >> always allowed to pass undecrypted frames, regardless of whether or not >> HW crypto was attempted, so we don't know upfront if we'll have to >> decrypt anything in software... >> >> Given that, I think we have had a bug in here basically since Ard's >> patch, we never should've put these structs on the stack. Herbert, you >> also touched this later and converted the API usage, did you see the >> way the stack is used here and think it should be OK, or did you simply >> not realize that? >> >> Ard, are you able to help out working on a revert of your patch? That >> would require also reverting a number of other patches (various fixes, >> API adjustments, etc. to the AEAD usage), but the more complicated part >> is that in the meantime Jouni introduced GCMP and CCMP-256, both of >> which we of course need to retain. >> > > I am missing some context here, but could you explain what exactly is > the problem here? > > Look at this code > > """ > struct scatterlist sg[3]; > > char aead_req_data[sizeof(struct aead_request) + > crypto_aead_reqsize(tfm)] > __aligned(__alignof__(struct aead_request)); > struct aead_request *aead_req = (void *) aead_req_data; > > memset(aead_req, 0, sizeof(aead_req_data)); > > sg_init_table(sg, 3); > sg_set_buf(&sg[0], &aad[2], be16_to_cpup((__be16 *)aad)); > sg_set_buf(&sg[1], data, data_len); > sg_set_buf(&sg[2], mic, mic_len); > > aead_request_set_tfm(aead_req, tfm); > aead_request_set_crypt(aead_req, sg, sg, data_len, b_0); > aead_request_set_ad(aead_req, sg[0].length); > """ > > I assume the stack buffer itself is not the problem here, but aad, > which is allocated on the stack one frame up. > Do we really need to revert the whole patch to fix that? Ah never mind, this is about 'odata'. Apologies, should have read first
On Fri, 2016-10-14 at 09:41 +0100, Ard Biesheuvel wrote: > > I assume the stack buffer itself is not the problem here, but aad, > > which is allocated on the stack one frame up. > > Do we really need to revert the whole patch to fix that? > > Ah never mind, this is about 'odata'. Apologies, should have read > first Right, odata also goes into an sg list and further on. I think we should wait for Herbert to chime in before we do any further work though, perhaps he has any better ideas. johannes
On Fri, 2016-10-14 at 17:39 +0900, Sergey Senozhatsky wrote: > > given that we have a known issue shouldn't VMAP_STACK be > disabled for now, or would you rather prefer to mark MAC80211 > as incompatible: "depends on CFG80211 && !VMAP_STACK"? Yeah. It's a bit complicated by the fact that most people will probably have hardware crypto in their wifi NICs, so that they won't actually hit the software crypto path. As I said in my other email though, we can't guarantee - even if the driver says it can do hardware crypto - that it really will do it for all frames (some might not be able to do for management frames for example), so we also can't really catch this at runtime ... Making mac80211 depend on !VMAP_STACK is probably technically best, but I fear it'll break a lot of people's configurations who don't have a problem right now (e.g. Linus's, who probably enabled this, but I know where he uses wifi he uses an Intel NIC that will always do HW crypto). Andy, what do you think? johannes
On 14 October 2016 at 09:42, Johannes Berg <johannes@sipsolutions.net> wrote: > On Fri, 2016-10-14 at 09:41 +0100, Ard Biesheuvel wrote: > >> > I assume the stack buffer itself is not the problem here, but aad, >> > which is allocated on the stack one frame up. >> > Do we really need to revert the whole patch to fix that? >> >> Ah never mind, this is about 'odata'. Apologies, should have read >> first > > Right, odata also goes into an sg list and further on. > > I think we should wait for Herbert to chime in before we do any further > work though, perhaps he has any better ideas. > Do you have a reference for the sg_set_buf() call on odata? crypto/ccm.c does not seem to have it (afaict), and the same problem does not exist in the accelerated arm64 implementation. In the mean time, I will try and see if we can move aad[] off the stack in the WPA code.
On Fri, 2016-10-14 at 09:47 +0100, Ard Biesheuvel wrote: > > Do you have a reference for the sg_set_buf() call on odata? > crypto/ccm.c does not seem to have it (afaict), It's indirect - crypto_ccm_encrypt() calls crypto_ccm_init_crypt() which does it. > and the same problem > does not exist in the accelerated arm64 implementation. In the mean > time, I will try and see if we can move aad[] off the stack in the > WPA code. I had that with per-CPU buffers, just sent the patch upthread. johannes
On 14 October 2016 at 09:55, Johannes Berg <johannes@sipsolutions.net> wrote: > On Fri, 2016-10-14 at 09:47 +0100, Ard Biesheuvel wrote: >> >> Do you have a reference for the sg_set_buf() call on odata? >> crypto/ccm.c does not seem to have it (afaict), > > It's indirect - crypto_ccm_encrypt() calls crypto_ccm_init_crypt() > which does it. > Indeed. And the decrypt path does the same for auth_tag[]. But that still means there are two separate problems here, one which affects the WPA code, and one that only affects the generic CCM chaining mode (but not the accelerated arm64 implementation) Unsurprisingly, I would strongly prefer those to be fixed properly rather than backing out my patch, but I'm happy to help out whichever solution we reach consensus on. >> and the same problem >> does not exist in the accelerated arm64 implementation. In the mean >> time, I will try and see if we can move aad[] off the stack in the >> WPA code. > > I had that with per-CPU buffers, just sent the patch upthread. > I will check whether this removes the issue when not using crypto/ccm.ko
On Fri, 2016-10-14 at 10:05 +0100, Ard Biesheuvel wrote: > > Indeed. And the decrypt path does the same for auth_tag[]. Hadn't gotten that far, due to the BUG_ON() in CONFIG_DEBUG_SG in the encrypt path :) > But that still means there are two separate problems here, one which > affects the WPA code, and one that only affects the generic CCM > chaining mode (but not the accelerated arm64 implementation) Yes. The generic CCM chaining still doesn't typically have a request on the stack though. In fact, ESP (net/ipv4/esp4.c) for example will do temporary allocations with kmalloc for every frame, it seems. > Unsurprisingly, I would strongly prefer those to be fixed properly > rather than backing out my patch, but I'm happy to help out whichever > solution we reach consensus on. Yeah, obviously, it would be good to use the accelerated versions after all. > I will check whether this removes the issue when not using > crypto/ccm.ko Ok. I think we can probably live with having those 48 bytes in per-CPU buffers, but I suppose we don't really want to have ~500. johannes
On 14 October 2016 at 10:10, Johannes Berg <johannes@sipsolutions.net> wrote: > On Fri, 2016-10-14 at 10:05 +0100, Ard Biesheuvel wrote: >> >> Indeed. And the decrypt path does the same for auth_tag[]. > > Hadn't gotten that far, due to the BUG_ON() in CONFIG_DEBUG_SG in the > encrypt path :) > >> But that still means there are two separate problems here, one which >> affects the WPA code, and one that only affects the generic CCM >> chaining mode (but not the accelerated arm64 implementation) > > Yes. The generic CCM chaining still doesn't typically have a request on > the stack though. In fact, ESP (net/ipv4/esp4.c) for example will do > temporary allocations with kmalloc for every frame, it seems. > It is annotated with a TODO, though :-) 38320c70d282b (Herbert Xu 2008-01-28 19:35:05 -0800 41) * TODO: Use spare space in skb for this where possible. >> Unsurprisingly, I would strongly prefer those to be fixed properly >> rather than backing out my patch, but I'm happy to help out whichever >> solution we reach consensus on. > > Yeah, obviously, it would be good to use the accelerated versions after > all. > >> I will check whether this removes the issue when not using >> crypto/ccm.ko > > Ok. I think we can probably live with having those 48 bytes in per-CPU > buffers, but I suppose we don't really want to have ~500. > Agreed.
On Fri, 2016-10-14 at 10:21 +0100, Ard Biesheuvel wrote: > It is annotated with a TODO, though :-) > > 38320c70d282b (Herbert Xu 2008-01-28 19:35:05 > -0800 41) > * TODO: Use spare space in skb for this where possible. I saw that, but I don't think generally there will be spare space for it - the stuff there is likely far too big. Anyway ... same problem that we have. I'm not inclined to allocate ~500 bytes temporarily for every frame either though. Maybe we could try to manage it in mac80211, we'd "only" need 5 AEAD structs (which are today on the stack) in parallel for each key (4 TX, 1 RX), but in a typical case of having 3 keys that's already 7.5K worth of memory that we almost never use. Again, with more complexity, we could know that the TX will not be used if the driver does the TX, but the single RX one we'd need unconditionally... decisions decisions... johannes
On 14 October 2016 at 10:25, Johannes Berg <johannes@sipsolutions.net> wrote: > On Fri, 2016-10-14 at 10:21 +0100, Ard Biesheuvel wrote: > >> It is annotated with a TODO, though :-) >> >> 38320c70d282b (Herbert Xu 2008-01-28 19:35:05 >> -0800 41) >> * TODO: Use spare space in skb for this where possible. > > I saw that, but I don't think generally there will be spare space for > it - the stuff there is likely far too big. Anyway ... same problem > that we have. > > I'm not inclined to allocate ~500 bytes temporarily for every frame > either though. > > Maybe we could try to manage it in mac80211, we'd "only" need 5 AEAD > structs (which are today on the stack) in parallel for each key (4 TX, > 1 RX), but in a typical case of having 3 keys that's already 7.5K worth > of memory that we almost never use. Again, with more complexity, we > could know that the TX will not be used if the driver does the TX, but > the single RX one we'd need unconditionally... decisions decisions... > So why is the performance hit acceptable for ESP but not for WPA? We could easily implement the same thing, i.e., kmalloc(GFP_ATOMIC)/kfree the aead_req struct rather than allocate it on the stack
> So why is the performance hit acceptable for ESP but not for WPA? We > could easily implement the same thing, i.e., > kmalloc(GFP_ATOMIC)/kfree the aead_req struct rather than allocate it > on the stack Yeah, maybe we should. It's likely a much bigger allocation, but I don't actually know if that affects speed. In most cases where you want high performance we never hit this anyway since we'll have hardware crypto. I know for our (Intel's) devices we normally never hit these code paths. But on the other hand, you also did your changes for a reason, and the only reason I can see of that is performance. So you'd be the one with most "skin in the game", I guess? johannes
On 14 October 2016 at 11:00, Johannes Berg <johannes@sipsolutions.net> wrote: > >> So why is the performance hit acceptable for ESP but not for WPA? We >> could easily implement the same thing, i.e., >> kmalloc(GFP_ATOMIC)/kfree the aead_req struct rather than allocate it >> on the stack > > Yeah, maybe we should. It's likely a much bigger allocation, but I > don't actually know if that affects speed. > > In most cases where you want high performance we never hit this anyway > since we'll have hardware crypto. I know for our (Intel's) devices we > normally never hit these code paths. > > But on the other hand, you also did your changes for a reason, and the > only reason I can see of that is performance. So you'd be the one with > most "skin in the game", I guess? > Well, what sucks here is that the accelerated driver I implemented for arm64 does not actually need this, as long as we take aad[] off the stack. And note that the API was changed since my patch, to add aad[] to the scatterlist: prior to this change, it used aead_request_set_assoc() to set the associated data separately.
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h index 103187ca9474..e820f437f02e 100644 --- a/net/mac80211/ieee80211_i.h +++ b/net/mac80211/ieee80211_i.h @@ -27,6 +27,7 @@ #include <linux/leds.h> #include <linux/idr.h> #include <linux/rhashtable.h> +#include <crypto/aes.h> #include <net/ieee80211_radiotap.h> #include <net/cfg80211.h> #include <net/mac80211.h> @@ -1224,6 +1225,10 @@ struct ieee80211_local { spinlock_t rx_path_lock; + /* temporary buffers for software crypto */ + u8 aad[2 * AES_BLOCK_SIZE]; + u8 b_0[AES_BLOCK_SIZE]; + /* Station data */ /* * The mutex only protects the list, hash table and diff --git a/net/mac80211/wpa.c b/net/mac80211/wpa.c index b48c1e13e281..a3f17a710b85 100644 --- a/net/mac80211/wpa.c +++ b/net/mac80211/wpa.c @@ -405,8 +405,8 @@ static int ccmp_encrypt_skb(struct ieee80211_tx_data *tx, struct sk_buff *skb, u8 *pos; u8 pn[6]; u64 pn64; - u8 aad[2 * AES_BLOCK_SIZE]; - u8 b_0[AES_BLOCK_SIZE]; + u8 *aad = tx->local->aad; + u8 *b_0 = tx->local->b_0; if (info->control.hw_key && !(info->control.hw_key->flags & IEEE80211_KEY_FLAG_GENERATE_IV) && @@ -460,9 +460,11 @@ static int ccmp_encrypt_skb(struct ieee80211_tx_data *tx, struct sk_buff *skb, return 0; pos += IEEE80211_CCMP_HDR_LEN; + spin_lock_bh(&tx->local->rx_path_lock); ccmp_special_blocks(skb, pn, b_0, aad); ieee80211_aes_ccm_encrypt(key->u.ccmp.tfm, b_0, aad, pos, len, skb_put(skb, mic_len), mic_len); + spin_unlock_bh(&tx->local->rx_path_lock); return 0; } @@ -534,8 +536,9 @@ ieee80211_crypto_ccmp_decrypt(struct ieee80211_rx_data *rx, } if (!(status->flag & RX_FLAG_DECRYPTED)) { - u8 aad[2 * AES_BLOCK_SIZE]; - u8 b_0[AES_BLOCK_SIZE]; + u8 *aad = rx->local->aad; + u8 *b_0 = rx->local->b_0; + /* hardware didn't decrypt/verify MIC */ ccmp_special_blocks(skb, pn, b_0, aad); @@ -639,8 +642,8 @@ static int gcmp_encrypt_skb(struct ieee80211_tx_data *tx, struct sk_buff *skb) u8 *pos; u8 pn[6]; u64 pn64; - u8 aad[2 * AES_BLOCK_SIZE]; - u8 j_0[AES_BLOCK_SIZE]; + u8 *aad = tx->local->aad; + u8 *j_0 = tx->local->b_0; if (info->control.hw_key && !(info->control.hw_key->flags & IEEE80211_KEY_FLAG_GENERATE_IV) && @@ -695,9 +698,11 @@ static int gcmp_encrypt_skb(struct ieee80211_tx_data *tx, struct sk_buff *skb) return 0; pos += IEEE80211_GCMP_HDR_LEN; + spin_lock_bh(&tx->local->rx_path_lock); gcmp_special_blocks(skb, pn, j_0, aad); ieee80211_aes_gcm_encrypt(key->u.gcmp.tfm, j_0, aad, pos, len, skb_put(skb, IEEE80211_GCMP_MIC_LEN)); + spin_unlock_bh(&tx->local->rx_path_lock); return 0; } @@ -764,8 +769,9 @@ ieee80211_crypto_gcmp_decrypt(struct ieee80211_rx_data *rx) } if (!(status->flag & RX_FLAG_DECRYPTED)) { - u8 aad[2 * AES_BLOCK_SIZE]; - u8 j_0[AES_BLOCK_SIZE]; + u8 *aad = rx->local->aad; + u8 *j_0 = rx->local->b_0; + /* hardware didn't decrypt/verify MIC */ gcmp_special_blocks(skb, pn, j_0, aad);