diff mbox series

[v10,3/5] powerpc/crash: add crash CPU hotplug support

Message ID 20230423105213.70795-4-sourabhjain@linux.ibm.com (mailing list archive)
State Changes Requested
Headers show
Series PowerPC: In-kernel handling of CPU/Memory hotplug/online/offline events for kdump kernel | expand

Commit Message

Sourabh Jain April 23, 2023, 10:52 a.m. UTC
Introduce powerpc crash hotplug handler to update the necessary kexec
segments in the kernel on CPU/Memory hotplug events. Currently, these
updates are done by monitoring CPU/Memory hotplug events in userspace.

A common crash hotplug handler is triggered from generic infrastructure
for both CPU/Memory hotplug events. But in this patch, crash updates are
handled only for CPU hotplug events. Support for the crash update on
memory hotplug events is added in upcoming patches.

The elfcorehdr segment is used to exchange the CPU and other
dump-related information between the kernels. Ideally, the elfcorehdr
segment needs to be recreated on CPU hotplug events to reflect the
changes. But on powerpc, the elfcorehdr is built with possible CPUs
hence there is no need to update/recreate the elfcorehdr on CPU hotplug
events.

In addition to elfcorehdr, there is another kexec segment that holds CPU
data on powerpc is FDT (Flattened Device Tree). During the kdump kernel
boot, it is expected that the crashing CPU must be present in FDT, else
kdump kernel boot fails.

Now the only action needed on powerpc to handle the crash CPU hotplug
event is to add hot added CPUs in the kdump FDT segment to avoid kdump
kernel boot failure. So for the CPU hot add event, the FDT segment is
updated with hot added CPU and Since there is no need to remove the hot
unplugged CPUs from the FDT segment hence no action was taken for CPU
hot remove event.

To accommodate a growing number of CPUs, FDT is built with additional
buffer space to ensure that it can hold possible CPU nodes.

The changes done here will also work for the kexec_load system call
given that the kexec tool builds the FDT segment with additional space
to accommodate possible CPU nodes.

Since memory crash hotplug support is not there yet the crash hotplug
the handler simply warns the user and returns.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com>
---
 arch/powerpc/include/asm/kexec.h  |  4 ++
 arch/powerpc/kexec/core_64.c      | 61 +++++++++++++++++++++++++++++++
 arch/powerpc/kexec/elf_64.c       | 12 +++++-
 arch/powerpc/kexec/file_load_64.c | 14 +++++++
 4 files changed, 90 insertions(+), 1 deletion(-)

Comments

Laurent Dufour April 24, 2023, 9:59 a.m. UTC | #1
On 23/04/2023 12:52:11, Sourabh Jain wrote:
> Introduce powerpc crash hotplug handler to update the necessary kexec
> segments in the kernel on CPU/Memory hotplug events. Currently, these
> updates are done by monitoring CPU/Memory hotplug events in userspace.
> 
> A common crash hotplug handler is triggered from generic infrastructure
> for both CPU/Memory hotplug events. But in this patch, crash updates are
> handled only for CPU hotplug events. Support for the crash update on
> memory hotplug events is added in upcoming patches.
> 
> The elfcorehdr segment is used to exchange the CPU and other
> dump-related information between the kernels. Ideally, the elfcorehdr
> segment needs to be recreated on CPU hotplug events to reflect the
> changes. But on powerpc, the elfcorehdr is built with possible CPUs
> hence there is no need to update/recreate the elfcorehdr on CPU hotplug
> events.
> 
> In addition to elfcorehdr, there is another kexec segment that holds CPU
> data on powerpc is FDT (Flattened Device Tree). During the kdump kernel
> boot, it is expected that the crashing CPU must be present in FDT, else
> kdump kernel boot fails.
> 
> Now the only action needed on powerpc to handle the crash CPU hotplug
> event is to add hot added CPUs in the kdump FDT segment to avoid kdump
> kernel boot failure. So for the CPU hot add event, the FDT segment is
> updated with hot added CPU and Since there is no need to remove the hot
> unplugged CPUs from the FDT segment hence no action was taken for CPU
> hot remove event.
> 
> To accommodate a growing number of CPUs, FDT is built with additional
> buffer space to ensure that it can hold possible CPU nodes.
> 
> The changes done here will also work for the kexec_load system call
> given that the kexec tool builds the FDT segment with additional space
> to accommodate possible CPU nodes.
> 
> Since memory crash hotplug support is not there yet the crash hotplug
> the handler simply warns the user and returns.
> 
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com
I don't remember sending a review-by on this patch earlier, do you?

> ---
>  arch/powerpc/include/asm/kexec.h  |  4 ++
>  arch/powerpc/kexec/core_64.c      | 61 +++++++++++++++++++++++++++++++
>  arch/powerpc/kexec/elf_64.c       | 12 +++++-
>  arch/powerpc/kexec/file_load_64.c | 14 +++++++
>  4 files changed, 90 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
> index 8090ad7d97d9d..f01ba767af56e 100644
> --- a/arch/powerpc/include/asm/kexec.h
> +++ b/arch/powerpc/include/asm/kexec.h
> @@ -103,6 +103,10 @@ void kexec_copy_flush(struct kimage *image);
>  struct crash_mem;
>  int update_cpus_node(void *fdt);
>  int get_crash_memory_ranges(struct crash_mem **mem_ranges);
> +#if defined(CONFIG_CRASH_HOTPLUG)
> +void arch_crash_handle_hotplug_event(struct kimage *image);
> +#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
> +#endif
>  #endif
>  
>  #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
> diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
> index 0b292f93a74cc..611b89bcea2be 100644
> --- a/arch/powerpc/kexec/core_64.c
> +++ b/arch/powerpc/kexec/core_64.c
> @@ -543,6 +543,67 @@ int update_cpus_node(void *fdt)
>  	return ret;
>  }
>  
> +#if defined(CONFIG_CRASH_HOTPLUG)
> +#undef pr_fmt
> +#define pr_fmt(fmt) "crash hp: " fmt
> +
> +/**
> + * arch_crash_hotplug_handler() - Handle crash CPU/Memory hotplug events to update the
> + *                                necessary kexec segments based on the hotplug event.
> + * @image: the active struct kimage
> + *
> + * Update FDT segment to include newly added CPU. No action for CPU remove case.
> + */
> +void arch_crash_handle_hotplug_event(struct kimage *image)
> +{
> +	void *fdt, *ptr;
> +	unsigned long mem;
> +	int i, fdt_index = -1;
> +	unsigned int hp_action = image->hp_action;
> +
> +	/*
> +	 * Since the hot-unplugged CPU is already part of crash FDT,
> +	 * no action is needed for CPU remove case.
> +	 */
> +	if (hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
> +		return;
> +
> +	/* crash update on memory hotplug events is not supported yet */
> +	if (hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY || hp_action == KEXEC_CRASH_HP_ADD_MEMORY) {
> +		pr_info_once("Crash update is not supported for memory hotplug\n");
> +		return;
> +	}
> +
> +	/* Find the FDT segment index in kexec segment array. */
> +	for (i = 0; i < image->nr_segments; i++) {
> +		mem = image->segment[i].mem;
> +		ptr = __va(mem);
> +
> +		if (ptr && fdt_magic(ptr) == FDT_MAGIC) {
> +			fdt_index = i;
> +			break;
> +		}
> +	}
> +
> +	if (fdt_index < 0) {
> +		pr_err("Unable to locate FDT segment.\n");
> +		return;
> +	}
> +
> +	fdt = __va((void *)image->segment[fdt_index].mem);
> +
> +	/* Temporarily invalidate the crash image while it is replaced */
> +	xchg(&kexec_crash_image, NULL);
> +
> +	/* update FDT to refelect changes in CPU resrouces */
> +	if (update_cpus_node(fdt))
> +		pr_err("Failed to update crash FDT");
> +
> +	/* The crash image is now valid once again */
> +	xchg(&kexec_crash_image, image);
> +}
> +#endif
> +
>  #ifdef CONFIG_PPC_64S_HASH_MMU
>  /* Values we need to export to the second kernel via the device tree. */
>  static unsigned long htab_base;
> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
> index eeb258002d1e0..d8f6d967231e8 100644
> --- a/arch/powerpc/kexec/elf_64.c
> +++ b/arch/powerpc/kexec/elf_64.c
> @@ -30,6 +30,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
>  			unsigned long cmdline_len)
>  {
>  	int ret;
> +	bool do_pack_fdt = true;
>  	unsigned long kernel_load_addr;
>  	unsigned long initrd_load_addr = 0, fdt_load_addr;
>  	void *fdt;
> @@ -116,7 +117,16 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
>  	if (ret)
>  		goto out_free_fdt;
>  
> -	fdt_pack(fdt);
> +#if defined(CONFIG_CRASH_HOTPLUG)
> +	/*
> +	 * Do not pack FDT, additional space is reserved to accommodate
> +	 * possible CPU nodes which are not yet present in the system.
> +	 */
> +	if (image->type == KEXEC_TYPE_CRASH)
> +		do_pack_fdt = false;
> +#endif
> +	if (do_pack_fdt)
> +		fdt_pack(fdt);
>  
>  	kbuf.buffer = fdt;
>  	kbuf.bufsz = kbuf.memsz = fdt_totalsize(fdt);
> diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
> index 5b0b3f61e0e72..3554168687869 100644
> --- a/arch/powerpc/kexec/file_load_64.c
> +++ b/arch/powerpc/kexec/file_load_64.c
> @@ -908,6 +908,9 @@ unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
>  	unsigned int cpu_nodes, extra_size = 0;
>  	struct device_node *dn;
>  	u64 usm_entries;
> +#if defined(CONFIG_CRASH_HOTPLUG)
> +	unsigned int possible_cpu_nodes;
> +#endif
>  
>  	// Budget some space for the password blob. There's already extra space
>  	// for the key name
> @@ -940,6 +943,17 @@ unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
>  	if (cpu_nodes > boot_cpu_node_count)
>  		extra_size += (cpu_nodes - boot_cpu_node_count) * cpu_node_size();
>  
> +#if defined(CONFIG_CRASH_HOTPLUG)
> +	/*
> +	 * Make sure enough space is reserved to accommodate possible CPU nodes
> +	 * in the crash FDT. This allows packing possible CPU nodes which are
> +	 * not yet present in the system without regenerating the entire FDT.
> +	 */
> +	possible_cpu_nodes = num_possible_cpus() / threads_per_core;
> +	if (image->type == KEXEC_TYPE_CRASH && possible_cpu_nodes > cpu_nodes)
> +		extra_size += (possible_cpu_nodes - cpu_nodes) * cpu_node_size();
> +#endif
> +
>  	return extra_size;
>  }
>
diff mbox series

Patch

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 8090ad7d97d9d..f01ba767af56e 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -103,6 +103,10 @@  void kexec_copy_flush(struct kimage *image);
 struct crash_mem;
 int update_cpus_node(void *fdt);
 int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+#if defined(CONFIG_CRASH_HOTPLUG)
+void arch_crash_handle_hotplug_event(struct kimage *image);
+#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
+#endif
 #endif
 
 #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 0b292f93a74cc..611b89bcea2be 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -543,6 +543,67 @@  int update_cpus_node(void *fdt)
 	return ret;
 }
 
+#if defined(CONFIG_CRASH_HOTPLUG)
+#undef pr_fmt
+#define pr_fmt(fmt) "crash hp: " fmt
+
+/**
+ * arch_crash_hotplug_handler() - Handle crash CPU/Memory hotplug events to update the
+ *                                necessary kexec segments based on the hotplug event.
+ * @image: the active struct kimage
+ *
+ * Update FDT segment to include newly added CPU. No action for CPU remove case.
+ */
+void arch_crash_handle_hotplug_event(struct kimage *image)
+{
+	void *fdt, *ptr;
+	unsigned long mem;
+	int i, fdt_index = -1;
+	unsigned int hp_action = image->hp_action;
+
+	/*
+	 * Since the hot-unplugged CPU is already part of crash FDT,
+	 * no action is needed for CPU remove case.
+	 */
+	if (hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
+		return;
+
+	/* crash update on memory hotplug events is not supported yet */
+	if (hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY || hp_action == KEXEC_CRASH_HP_ADD_MEMORY) {
+		pr_info_once("Crash update is not supported for memory hotplug\n");
+		return;
+	}
+
+	/* Find the FDT segment index in kexec segment array. */
+	for (i = 0; i < image->nr_segments; i++) {
+		mem = image->segment[i].mem;
+		ptr = __va(mem);
+
+		if (ptr && fdt_magic(ptr) == FDT_MAGIC) {
+			fdt_index = i;
+			break;
+		}
+	}
+
+	if (fdt_index < 0) {
+		pr_err("Unable to locate FDT segment.\n");
+		return;
+	}
+
+	fdt = __va((void *)image->segment[fdt_index].mem);
+
+	/* Temporarily invalidate the crash image while it is replaced */
+	xchg(&kexec_crash_image, NULL);
+
+	/* update FDT to refelect changes in CPU resrouces */
+	if (update_cpus_node(fdt))
+		pr_err("Failed to update crash FDT");
+
+	/* The crash image is now valid once again */
+	xchg(&kexec_crash_image, image);
+}
+#endif
+
 #ifdef CONFIG_PPC_64S_HASH_MMU
 /* Values we need to export to the second kernel via the device tree. */
 static unsigned long htab_base;
diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index eeb258002d1e0..d8f6d967231e8 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -30,6 +30,7 @@  static void *elf64_load(struct kimage *image, char *kernel_buf,
 			unsigned long cmdline_len)
 {
 	int ret;
+	bool do_pack_fdt = true;
 	unsigned long kernel_load_addr;
 	unsigned long initrd_load_addr = 0, fdt_load_addr;
 	void *fdt;
@@ -116,7 +117,16 @@  static void *elf64_load(struct kimage *image, char *kernel_buf,
 	if (ret)
 		goto out_free_fdt;
 
-	fdt_pack(fdt);
+#if defined(CONFIG_CRASH_HOTPLUG)
+	/*
+	 * Do not pack FDT, additional space is reserved to accommodate
+	 * possible CPU nodes which are not yet present in the system.
+	 */
+	if (image->type == KEXEC_TYPE_CRASH)
+		do_pack_fdt = false;
+#endif
+	if (do_pack_fdt)
+		fdt_pack(fdt);
 
 	kbuf.buffer = fdt;
 	kbuf.bufsz = kbuf.memsz = fdt_totalsize(fdt);
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index 5b0b3f61e0e72..3554168687869 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -908,6 +908,9 @@  unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
 	unsigned int cpu_nodes, extra_size = 0;
 	struct device_node *dn;
 	u64 usm_entries;
+#if defined(CONFIG_CRASH_HOTPLUG)
+	unsigned int possible_cpu_nodes;
+#endif
 
 	// Budget some space for the password blob. There's already extra space
 	// for the key name
@@ -940,6 +943,17 @@  unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
 	if (cpu_nodes > boot_cpu_node_count)
 		extra_size += (cpu_nodes - boot_cpu_node_count) * cpu_node_size();
 
+#if defined(CONFIG_CRASH_HOTPLUG)
+	/*
+	 * Make sure enough space is reserved to accommodate possible CPU nodes
+	 * in the crash FDT. This allows packing possible CPU nodes which are
+	 * not yet present in the system without regenerating the entire FDT.
+	 */
+	possible_cpu_nodes = num_possible_cpus() / threads_per_core;
+	if (image->type == KEXEC_TYPE_CRASH && possible_cpu_nodes > cpu_nodes)
+		extra_size += (possible_cpu_nodes - cpu_nodes) * cpu_node_size();
+#endif
+
 	return extra_size;
 }