| Message ID | aWmqFAzdtsR0t5aJ@autotest-wegao.qe.prg2.suse.org |
|---|---|
| State | Superseded |
| Series | clone10.c failed on 32bit compilation |
FYI the patch is invalid. It should have been:

diff --git include/lapi/tls.h include/lapi/tls.h
index a067872e0f..eee77899e8 100644
--- include/lapi/tls.h
+++ include/lapi/tls.h
@@ -64,7 +64,7 @@ static inline void init_tls(void)
 	tls_ptr = allocate_tls_area();
 	tls_desc = SAFE_MALLOC(sizeof(*tls_desc));
 	memset(tls_desc, 0, sizeof(*tls_desc));
-	tls_desc->entry_number = -1;
+	tls_desc->entry_number = 13;
 	tls_desc->base_addr = (unsigned long)tls_ptr;
 	tls_desc->limit = TLS_SIZE;
 	tls_desc->seg_32bit = 1;
@@ -72,7 +72,7 @@ static inline void init_tls(void)
 	tls_desc->read_exec_only = 0;
 	tls_desc->limit_in_pages = 0;
 	tls_desc->seg_not_present = 0;
-	tls_desc->useable = 1;
+	tls_ptr = tls_desc;

 #else
 	tst_brk(TCONF, "Unsupported architecture for TLS");
Wei Gao <wegao@suse.com> wrote:

> Try following patch firslty still got EINVAL since tls_desc->entry_number will be -1 still go same error.
>
> diff --git a/include/lapi/tls.h b/include/lapi/tls.h
> index a067872e0..aedc907d9 100644
> --- a/include/lapi/tls.h
> +++ b/include/lapi/tls.h
> @@ -73,6 +73,7 @@ static inline void init_tls(void)
>  	tls_desc->limit_in_pages = 0;
>  	tls_desc->seg_not_present = 0;
>  	tls_desc->useable = 1;
> +	tls_ptr = tls_desc;
>
>  #else
>  	tst_brk(TCONF, "Unsupported architecture for TLS");
>
> so i try to change entry_number to correct one base kernel src logic.
> diff --git a/include/lapi/tls.h b/include/lapi/tls.h
> index a067872e0..9e69244da 100644
> --- a/include/lapi/tls.h
> +++ b/include/lapi/tls.h
> @@ -64,7 +64,7 @@ static inline void init_tls(void)
>  	tls_ptr = allocate_tls_area();
>  	tls_desc = SAFE_MALLOC(sizeof(*tls_desc));
>  	memset(tls_desc, 0, sizeof(*tls_desc));
> -	tls_desc->entry_number = -1;
> +	tls_desc->entry_number = 13;
>  	tls_desc->base_addr = (unsigned long)tls_ptr;
>  	tls_desc->limit = TLS_SIZE;
>  	tls_desc->seg_32bit = 1;
>
> Now i get following output, no EINVAL now but it seems child and parent point to same tls area.
> tst_tmpdir.c:316: TINFO: Using /tmp/LTP_cloa20awq as tmpdir (tmpfs filesystem)
> tst_test.c:2025: TINFO: LTP version: 20250130-546-g13dbd838c
> tst_test.c:2028: TINFO: Tested kernel: 6.19.0-rc5-gb71e635feefc #11 SMP PREEMPT_DYNAMIC Tue Jan 13 16:16:15 CST 2026 x86_64
> tst_kconfig.c:71: TINFO: Couldn't locate kernel config!
> tst_test.c:1846: TINFO: Overall timeout per run is 0h 00m 30s
> clone10.c:48: TINFO: Child (PID: 5262, TID: 5263): TLS value set to: 101
> clone10.c:72: TFAIL: Parent (PID: 5262, TID: 5262): TLS value mismatch: got 101, expected 100

Well, this indicates that the child does not switch to its own TLS
correctly and still uses the parent's, so the '__thread tls_var' value
gets modified.
After two days of research, I have a rough picture of the likely cause:

I used a naked clone() to verify that CLONE_SETTLS is unreliable
when running 32-bit on x86_64. i386 TLS requires two steps,
writing the descriptor and switching the selector, but CLONE_SETTLS
only performs the first one.

This is the simplified call chain:

  kernel_clone()
    copy_process()
      copy_thread()
        set_new_tls()
          do_set_thread_area()

In copy_thread(), the child’s register frame is copied from the parent:

  *childregs = *current_pt_regs();

and on the 32-bit side it also does

  savesegment(gs, p->thread.gs);

which saves the current %gs into thread_struct.

Together, this means that unless something explicitly overwrites it later,
the child’s initial %gs selector is inherited from the parent.

https://elixir.bootlin.com/linux/v6.18/source/arch/x86/kernel/process.c#L243

Then, in do_set_thread_area(), the kernel updates the TLS descriptor:

  set_tls_desc(p, idx, &info, 1);

However, when (p != current), the x86_32 path does not update or refresh
any segment selector. So it updates the descriptor but does not switch
the child’s %gs selector to the new modified_sel.

https://elixir.bootlin.com/linux/v6.18/source/arch/x86/kernel/tls.c#L150

Therefore, relying on CLONE_SETTLS alone can leave the child executing
with the parent’s %gs selector, so TLS accesses still resolve to the
old TLS base.

In summary, if this analysis makes sense, we need to make the test
skip with TCONF on i386.
Petr Vorel <pvorel@suse.cz> wrote:

> FYI the patch is invalid. It should have been:
>
> diff --git include/lapi/tls.h include/lapi/tls.h
> index a067872e0f..eee77899e8 100644
> --- include/lapi/tls.h
> +++ include/lapi/tls.h
> @@ -64,7 +64,7 @@ static inline void init_tls(void)
>  	tls_ptr = allocate_tls_area();
>  	tls_desc = SAFE_MALLOC(sizeof(*tls_desc));
>  	memset(tls_desc, 0, sizeof(*tls_desc));
> -	tls_desc->entry_number = -1;
> +	tls_desc->entry_number = 13;
>  	tls_desc->base_addr = (unsigned long)tls_ptr;
>  	tls_desc->limit = TLS_SIZE;
>  	tls_desc->seg_32bit = 1;
> @@ -72,7 +72,7 @@ static inline void init_tls(void)
>  	tls_desc->read_exec_only = 0;
>  	tls_desc->limit_in_pages = 0;
>  	tls_desc->seg_not_present = 0;
> -	tls_desc->useable = 1;
> +	tls_ptr = tls_desc;

@Wei, @Petr, did you get it to work after trying the above diff?
Which kernel did you use?

Unfortunately, neither of these methods (including Wei's method) works
properly on my kernel-6.19.0-rc2 platform.

No matter which method I try, the child process still cannot switch
to the new TLS. For more details, see what I posted in the previous thread.
On Tue, Jan 20, 2026 at 08:58:05PM +0800, Li Wang wrote: > Wei Gao <wegao@suse.com> wrote: > > > Try following patch firslty still got EINVAL since tls_desc->entry_number will be -1 still go same error. > > > > diff --git a/include/lapi/tls.h b/include/lapi/tls.h > > index a067872e0..aedc907d9 100644 > > --- a/include/lapi/tls.h > > +++ b/include/lapi/tls.h > > @@ -73,6 +73,7 @@ static inline void init_tls(void) > > tls_desc->limit_in_pages = 0; > > tls_desc->seg_not_present = 0; > > tls_desc->useable = 1; > > + tls_ptr = tls_desc; > > > > #else > > tst_brk(TCONF, "Unsupported architecture for TLS"); > > > > so i try to change entry_number to correct one base kernel src logic. > > diff --git a/include/lapi/tls.h b/include/lapi/tls.h > > index a067872e0..9e69244da 100644 > > --- a/include/lapi/tls.h > > +++ b/include/lapi/tls.h > > @@ -64,7 +64,7 @@ static inline void init_tls(void) > > tls_ptr = allocate_tls_area(); > > tls_desc = SAFE_MALLOC(sizeof(*tls_desc)); > > memset(tls_desc, 0, sizeof(*tls_desc)); > > - tls_desc->entry_number = -1; > > + tls_desc->entry_number = 13; > > tls_desc->base_addr = (unsigned long)tls_ptr; > > tls_desc->limit = TLS_SIZE; > > tls_desc->seg_32bit = 1; > > > > Now i get following output, no EINVAL now but it seems child and parent point to same tls area. > > tst_tmpdir.c:316: TINFO: Using /tmp/LTP_cloa20awq as tmpdir (tmpfs filesystem) > > tst_test.c:2025: TINFO: LTP version: 20250130-546-g13dbd838c > > tst_test.c:2028: TINFO: Tested kernel: 6.19.0-rc5-gb71e635feefc #11 SMP PREEMPT_DYNAMIC Tue Jan 13 16:16:15 CST 2026 x86_64 > > tst_kconfig.c:71: TINFO: Couldn't locate kernel config! 
> > tst_test.c:1846: TINFO: Overall timeout per run is 0h 00m 30s > > clone10.c:48: TINFO: Child (PID: 5262, TID: 5263): TLS value set to: 101 > > clone10.c:72: TFAIL: Parent (PID: 5262, TID: 5262): TLS value mismatch: got 101, expected 100 > > Well, this indicates that the Child does not switch to itself TLS > correctly, still use the parent, and so the '__thread tls_var' value > gets modified. > > With two days of research, I roughly drew the picture of the possible > reason as below: > > Using a naked clone() to verify that CLONE_SETTLS is unreliable > when running 32-bit on x86_64, since i386 TLS requires two steps: > writing the descriptor and switching the selector, but CLONE_SETTLS > only overrides the former: > > This is the simplified call chain: > > kernel_clone() > copy_process() > copy_thread() > set_new_tls() > do_set_thread_area() > > In copy_thread(), the child’s register frame is copied from the parent > *childregs = *current_pt_regs(); and on the 32-bit side it also does > savesegment(gs, p->thread.gs); saving the current %gs into thread_struct. > > Together, this means that unless something explicitly overwrites it later, > the child’s initial %gs selector is inherited from the parent. > > https://elixir.bootlin.com/linux/v6.18/source/arch/x86/kernel/process.c#L243 > > Then, in do_set_thread_area(), the kernel updates the TLS descriptor > set_tls_desc(p, idx, &info, 1); However, when (p != current), the x86_32 path > does not update or refresh any segment selector. So it updates the descriptor > but does not switch the child’s %gs selector to the new modified_sel. > > https://elixir.bootlin.com/linux/v6.18/source/arch/x86/kernel/tls.c#L150 > > Therefore, relying on CLONE_SETTLS alone can leave the child executing > with the parent’s %gs selector, so TLS accesses still resolve to the > old TLS base. > > In summary, if this analysis is make sense, we need to configure the TCONF > test branch skip on i386. 
Your analysis is correct when we use 13 as tls_desc->entry_number. Unless we
change the kernel logic (to force the child's %gs to point to GDT entry 13),
we cannot test this feature.

So the correct solution is not to create a new GDT TLS entry but to reuse the
existing one. By default the parent uses GDT_ENTRY_TLS_MIN, which is 12.
So we clone with tls_desc->entry_number set to 12 and ONLY change
tls_desc->base_addr. When we switch to the child, %gs is still the same, but
GDT entry 12 has already been updated with the new base_addr. But now I
encounter a strange SIGSEGV when switching to the child, and I have no idea why.

BTW: on pure 32-bit we should reuse entry_number 6, so basically the code
should first call __NR_get_thread_area to get the entry_number currently in use.

I agree we can skip the test on i386 for now, since this test case does not
support it yet.

>
> --
> Regards,
> Li Wang
>
On Tue, Jan 20, 2026 at 09:13:25PM +0800, Li Wang wrote:
> Petr Vorel <pvorel@suse.cz> wrote:
>
> > FYI the patch is invalid. It should have been:
> >
> > diff --git include/lapi/tls.h include/lapi/tls.h
> > index a067872e0f..eee77899e8 100644
> > --- include/lapi/tls.h
> > +++ include/lapi/tls.h
> > @@ -64,7 +64,7 @@ static inline void init_tls(void)
> >  	tls_ptr = allocate_tls_area();
> >  	tls_desc = SAFE_MALLOC(sizeof(*tls_desc));
> >  	memset(tls_desc, 0, sizeof(*tls_desc));
> > -	tls_desc->entry_number = -1;
> > +	tls_desc->entry_number = 13;
> >  	tls_desc->base_addr = (unsigned long)tls_ptr;
> >  	tls_desc->limit = TLS_SIZE;
> >  	tls_desc->seg_32bit = 1;
> > @@ -72,7 +72,7 @@ static inline void init_tls(void)
> >  	tls_desc->read_exec_only = 0;
> >  	tls_desc->limit_in_pages = 0;
> >  	tls_desc->seg_not_present = 0;
> > -	tls_desc->useable = 1;
> > +	tls_ptr = tls_desc;
>
> @Wei, @Petr, did you get it to work after trying the above diff?
> Which kernel did you use?

6.19.0-rc5-gb71e635feefc. The above diff does not work; it was just an experiment.

>
> Unfortunately, neither of these methods (including Wei's method) works
> properly on my kernel-6.19.0-rc2 platform.
>
> And no matter what method I try, the child process still cannot switch
> to the new TLS. More details see I posted in the pre-thread.

Yes, I guess we are still blocked on the i386 scenario. But we can rewrite
the parent TLS's base_addr instead of switching to a new TLS. This approach is
correct based on the current kernel logic, but it still needs further
implementation.

>
>
> --
> Regards,
> Li Wang
>
Wei Gao <wegao@suse.com> wrote: > > Well, this indicates that the Child does not switch to itself TLS > > correctly, still use the parent, and so the '__thread tls_var' value > > gets modified. > > > > With two days of research, I roughly drew the picture of the possible > > reason as below: > > > > Using a naked clone() to verify that CLONE_SETTLS is unreliable > > when running 32-bit on x86_64, since i386 TLS requires two steps: > > writing the descriptor and switching the selector, but CLONE_SETTLS > > only overrides the former: > > > > This is the simplified call chain: > > > > kernel_clone() > > copy_process() > > copy_thread() > > set_new_tls() > > do_set_thread_area() > > > > In copy_thread(), the child’s register frame is copied from the parent > > *childregs = *current_pt_regs(); and on the 32-bit side it also does > > savesegment(gs, p->thread.gs); saving the current %gs into thread_struct. > > > > Together, this means that unless something explicitly overwrites it later, > > the child’s initial %gs selector is inherited from the parent. > > > > https://elixir.bootlin.com/linux/v6.18/source/arch/x86/kernel/process.c#L243 > > > > Then, in do_set_thread_area(), the kernel updates the TLS descriptor > > set_tls_desc(p, idx, &info, 1); However, when (p != current), the x86_32 path > > does not update or refresh any segment selector. So it updates the descriptor > > but does not switch the child’s %gs selector to the new modified_sel. > > > > https://elixir.bootlin.com/linux/v6.18/source/arch/x86/kernel/tls.c#L150 > > > > Therefore, relying on CLONE_SETTLS alone can leave the child executing > > with the parent’s %gs selector, so TLS accesses still resolve to the > > old TLS base. > > > > In summary, if this analysis is make sense, we need to configure the TCONF > > test branch skip on i386. > > Your analysis is correct when we use 13 as tls_desc->entry_number. 
> If we not change kernel logic (force switch child's %gs to poing gdt 13 entry then
> we can not test this feature).

When we allocate a new TLS entry (e.g., 13) and don’t actually switch
%gs to it, nothing breaks: user space continues to run with the
original %gs, so it survives.

> So correct solution is not create new GDT TLS entry but reuse exist

No, we shouldn't do this for clone10.

The purpose of clone10 is to test that the cloned child gets a new
TLS area and to verify that it is different from the parent's.

> entry, by default parent will use GDT_ENTRY_TLS_MIN which number is 12.
> So we clone with tls_desc->entry_number to 12 and ONLY change
> tls_desc->base_addr, when switch to child %gs still same but GDT
> number 12 entry already updated by new base_addr. But now i encounter strange
> SIGSEGV error when switch to child, no idea why.

If you reuse the currently active entry (often 12) and change only the
descriptor base, you are effectively changing GS.base while keeping
%gs the same. That means all existing TLS references now land at a
different base, which is almost certainly not a valid TCB, so you crash
quickly.

> BTW: if we use pure 32bit we should resuse entry_number to 6, so basicly
> in code we should first use __NR_get_thread_area get currently using
> entry_number.
>
> I agree we can skip test on i386 firstly since we are currently still not support
> in this test case.
Wei Gao <wegao@suse.com> wrote:

> > > diff --git include/lapi/tls.h include/lapi/tls.h
> > > index a067872e0f..eee77899e8 100644
> > > --- include/lapi/tls.h
> > > +++ include/lapi/tls.h
> > > @@ -64,7 +64,7 @@ static inline void init_tls(void)
> > >  	tls_ptr = allocate_tls_area();
> > >  	tls_desc = SAFE_MALLOC(sizeof(*tls_desc));
> > >  	memset(tls_desc, 0, sizeof(*tls_desc));
> > > -	tls_desc->entry_number = -1;
> > > +	tls_desc->entry_number = 13;
> > >  	tls_desc->base_addr = (unsigned long)tls_ptr;
> > >  	tls_desc->limit = TLS_SIZE;
> > >  	tls_desc->seg_32bit = 1;
> > > @@ -72,7 +72,7 @@ static inline void init_tls(void)
> > >  	tls_desc->read_exec_only = 0;
> > >  	tls_desc->limit_in_pages = 0;
> > >  	tls_desc->seg_not_present = 0;
> > > -	tls_desc->useable = 1;
> > > +	tls_ptr = tls_desc;
> >
> > @Wei, @Petr, did you get it to work after trying the above diff?
> > Which kernel did you use?
> 6.19.0-rc5-gb71e635feefc , above diff can not work, just some try.
> >
> > Unfortunately, neither of these methods (including Wei's method) works
> > properly on my kernel-6.19.0-rc2 platform.
> >
> > And no matter what method I try, the child process still cannot switch
> > to the new TLS. More details see I posted in the pre-thread.
> Yes, i guess we are still blocking on i386 scenario. But we can rewrite
> parent TLS's base_addr instead of switch new TLS, this way is correct
> base current kernel's logic but still need further implementation.

Hmm, but that deviates from the topic of our test.

Rewriting tls_desc->base_addr and manually switching the %gs selector
to validate the TLS state is not necessary, IMHO.

--
Regards,
Li Wang
> > Yes, i guess we are still blocking on i386 scenario. But we can rewrite
> > parent TLS's base_addr instead of switch new TLS, this way is correct
> > base current kernel's logic but still need further implementation.

Oh, sorry. You mean only rewriting the parent TLS base_addr.

I guess that can work only if the new base_addr points to a valid TLS
block that matches the runtime's expected layout. We would also need to
construct a real TCB+TLS image at the new location.

However, that also goes beyond the goal of the CLONE_SETTLS test in clone10.
On Wed, Jan 21, 2026 at 04:11:05PM +0800, Li Wang wrote:
> > > Yes, i guess we are still blocking on i386 scenario. But we can rewrite
> > > parent TLS's base_addr instead of switch new TLS, this way is correct
> > > base current kernel's logic but still need further implementation.
>
> Oh, sorry. You mean only rewrite parent TLS base_addr.
>
> I guess it can work only if the new base_addr points to a valid TLS
> block that matches the runtime's expected layout. And, we need to
> construct a real TCB+TLS image for the new location.
>
> However, it is also over the goal of the CLONE_SETTLS test for clone10.

Indeed, this way we would be faking too many things :)

>
>
> --
> Regards,
> Li Wang
>
diff --git a/include/lapi/tls.h b/include/lapi/tls.h
index a067872e0..aedc907d9 100644
--- a/include/lapi/tls.h
+++ b/include/lapi/tls.h
@@ -73,6 +73,7 @@ static inline void init_tls(void)
 	tls_desc->limit_in_pages = 0;
 	tls_desc->seg_not_present = 0;
 	tls_desc->useable = 1;
+	tls_ptr = tls_desc;

 #else
 	tst_brk(TCONF, "Unsupported architecture for TLS");

So I tried changing entry_number to a valid value, based on the kernel source logic.

diff --git a/include/lapi/tls.h b/include/lapi/tls.h
index a067872e0..9e69244da 100644
--- a/include/lapi/tls.h
+++ b/include/lapi/tls.h
@@ -64,7 +64,7 @@ static inline void init_tls(void)
 	tls_ptr = allocate_tls_area();
 	tls_desc = SAFE_MALLOC(sizeof(*tls_desc));
 	memset(tls_desc, 0, sizeof(*tls_desc));
-	tls_desc->entry_number = -1;
+	tls_desc->entry_number = 13;
 	tls_desc->base_addr = (unsigned long)tls_ptr;
 	tls_desc->limit = TLS_SIZE;
 	tls_desc->seg_32bit = 1;
Hi Chunfu Wen, Li Wang,

We're encountering an issue where the new test case clone10.c is consistently
failing in 32-bit mode. Have you experienced similar problems?
I've attached some preliminary investigation results for reference.
Any insights or suggestions would be greatly appreciated.

uname -r
6.19.0-rc5-gb71e635feefc

export CFLAGS="-m32 -fstack-protector-strong"
export LDFLAGS="-m32"
make clone10

tst_tmpdir.c:316: TINFO: Using /tmp/LTP_cloV8soQm as tmpdir (tmpfs filesystem)
tst_test.c:2025: TINFO: LTP version: 20250130-546-g13dbd838c
tst_test.c:2028: TINFO: Tested kernel: 6.19.0-rc5-gb71e635feefc #11 SMP PREEMPT_DYNAMIC Tue Jan 13 16:16:15 CST 2026 x86_64
tst_kconfig.c:71: TINFO: Couldn't locate kernel config!
tst_test.c:1846: TINFO: Overall timeout per run is 0h 00m 30s
clone10.c:63: TBROK: clone() failed: EINVAL (22)

I traced the kernel source and found that the error is triggered by the following functions:

arch/x86/kernel/tls.c

static int set_new_tls(struct task_struct *p, unsigned long tls)
{
	struct user_desc __user *utls = (struct user_desc __user *)tls;

	if (in_ia32_syscall())
		return do_set_thread_area(p, -1, utls, 0); <<<<<<<<<<<<
	else
		return do_set_thread_area_64(p, ARCH_SET_FS, tls);
}

int do_set_thread_area(struct task_struct *p, int idx,
		       struct user_desc __user *u_info,
		       int can_allocate)
{
	...
	if (idx == -1)
		idx = info.entry_number; <<<< entry_number is 0, as set by clone10

	/*
	 * index -1 means the kernel should try to find and
	 * allocate an empty descriptor:
	 */
	if (idx == -1 && can_allocate) { <<< can_allocate is 0, so this is SKIPPED: 32-bit mode cannot auto-allocate here
		idx = get_free_idx();
		if (idx < 0)
			return idx;

		if (put_user(idx, &u_info->entry_number))
			return -EFAULT;
	}

	if (idx < GDT_ENTRY_TLS_MIN || idx > GDT_ENTRY_TLS_MAX)
		return -EINVAL; <<<< idx is now 0, so this return is what clone10 finally hits

I first tried the following patch but still got EINVAL: tls_desc->entry_number is then -1, which hits the same error path.
Now I get the following output. There is no EINVAL any more, but it seems the child and parent point to the same TLS area.

tst_tmpdir.c:316: TINFO: Using /tmp/LTP_cloa20awq as tmpdir (tmpfs filesystem)
tst_test.c:2025: TINFO: LTP version: 20250130-546-g13dbd838c
tst_test.c:2028: TINFO: Tested kernel: 6.19.0-rc5-gb71e635feefc #11 SMP PREEMPT_DYNAMIC Tue Jan 13 16:16:15 CST 2026 x86_64
tst_kconfig.c:71: TINFO: Couldn't locate kernel config!
tst_test.c:1846: TINFO: Overall timeout per run is 0h 00m 30s
clone10.c:48: TINFO: Child (PID: 5262, TID: 5263): TLS value set to: 101
clone10.c:72: TFAIL: Parent (PID: 5262, TID: 5262): TLS value mismatch: got 101, expected 100

Thanks.
Regards
Wei Gao