Message ID | 20241124052040.239813-1-haren@linux.ibm.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | powerpc/pseries: Add papr-platform-dump character driver for dump retrieval | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/github-powerpc_ppctests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_selftests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_sparse | success | Successfully ran 4 jobs. |
snowpatch_ozlabs/github-powerpc_clang | success | Successfully ran 5 jobs. |
snowpatch_ozlabs/github-powerpc_kernel_qemu | success | Successfully ran 21 jobs. |
Hi Haren, kernel test robot noticed the following build warnings: [auto build test WARNING on powerpc/next] [also build test WARNING on powerpc/fixes linus/master v6.12 next-20241125] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Haren-Myneni/powerpc-pseries-Add-papr-platform-dump-character-driver-for-dump-retrieval/20241125-112816 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next patch link: https://lore.kernel.org/r/20241124052040.239813-1-haren%40linux.ibm.com patch subject: [PATCH] powerpc/pseries: Add papr-platform-dump character driver for dump retrieval config: powerpc64-randconfig-001-20241125 (https://download.01.org/0day-ci/archive/20241126/202411260057.wrWAIl67-lkp@intel.com/config) compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 592c0fe55f6d9a811028b5f3507be91458ab2713) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241126/202411260057.wrWAIl67-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202411260057.wrWAIl67-lkp@intel.com/ All warnings (new ones prefixed by >>): >> arch/powerpc/platforms/pseries/papr-platform-dump.c:51: warning: Function parameter or struct member 'dump_tag_hi' not described in 'ibm_platform_dump_params' >> arch/powerpc/platforms/pseries/papr-platform-dump.c:51: warning: Function parameter or struct member 'dump_tag_lo' not described in 'ibm_platform_dump_params' >> arch/powerpc/platforms/pseries/papr-platform-dump.c:51: warning: Function parameter or struct member 'sequence_hi' not described in 'ibm_platform_dump_params' >> arch/powerpc/platforms/pseries/papr-platform-dump.c:51: warning: Function parameter or struct member 'sequence_lo' not described in 'ibm_platform_dump_params' >> arch/powerpc/platforms/pseries/papr-platform-dump.c:51: warning: Function parameter or struct member 'bytes_ret_hi' not described in 'ibm_platform_dump_params' >> arch/powerpc/platforms/pseries/papr-platform-dump.c:51: warning: Function parameter or struct member 'bytes_ret_lo' not described in 'ibm_platform_dump_params' >> arch/powerpc/platforms/pseries/papr-platform-dump.c:51: warning: Excess struct member 'dump_tag' description in 'ibm_platform_dump_params' >> arch/powerpc/platforms/pseries/papr-platform-dump.c:51: warning: Excess struct member 'sequence' description in 'ibm_platform_dump_params' >> arch/powerpc/platforms/pseries/papr-platform-dump.c:51: warning: Excess struct member 'written' description in 'ibm_platform_dump_params' vim +51 arch/powerpc/platforms/pseries/papr-platform-dump.c 25 26 /** 27 * struct ibm_platform_dump_params - Parameters (in and out) for 28 * ibm,platform-dump 29 * @work_area: In: work area buffer for results. 30 * @buf_length: In: work area buffer length in bytes 31 * @dump_tag: In: Dump_Tag representing an id of the dump being processed 32 * @sequence: In: Sequence number. Out: Next sequence number. 33 * @written: Out: Bytes written by ibm,platform-dump to @work_area. 34 * @status: Out: RTAS call status. 35 * @list: Maintain the list of dumps are in progress. Can retrieve 36 * multiple dumps with different dump IDs at the same time, 37 * but not with the same dump ID. This list is used to 38 * determine whether the dump for the same ID is in progress. 39 */ 40 struct ibm_platform_dump_params { 41 struct rtas_work_area *work_area; 42 u32 buf_length; 43 u32 dump_tag_hi; 44 u32 dump_tag_lo; 45 u32 sequence_hi; 46 u32 sequence_lo; 47 u32 bytes_ret_hi; 48 u32 bytes_ret_lo; 49 s32 status; 50 struct list_head list; > 51 }; 52
On 2024-11-23 21:20:39 Sat, Haren Myneni wrote: [...] > +static ssize_t papr_platform_dump_handle_read(struct file *file, > + char __user *buf, size_t size, loff_t *off) > +{ > + struct ibm_platform_dump_params *params = file->private_data; > + u64 total_bytes; > + s32 fwrc; > + > + /* > + * Dump already completed with the previous read calls. > + * In case if the user space issues further reads, returns > + * -EINVAL. > + */ > + if (!params->buf_length) { > + pr_warn_once("Platform dump completed for dump ID %llu\n", > + (u64) (((u64)params->dump_tag_hi << 32) | > + params->dump_tag_lo)); > + return -EINVAL; > + } > + > + if (size < SZ_1K) { > + pr_err_once("Buffer length should be minimum 1024 bytes\n"); > + return -EINVAL; > + } > + > + /* > + * The hypervisor returns status 0 if no more data available to > + * download. Then expects the last RTAS call with NULL buffer > + * to invalidate the dump which means dump will be freed in the > + * hypervisor. > + */ > + if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) { > + params->buf_length = 0; > + fwrc = rtas_ibm_platform_dump(params, 0, 0); > + /* > + * Returns 0 (success) to the user space so that user > + * space read stops. Does this implicitly invalidates the dump irrespective of whether userspace has requested or not ? Copy-pasting bellow code snippet from librtas side patch posted by you to librtas-devel mailing list: + /* + * rtas_platform_dump() is called with buf = NULL and length = 0 + * for "dump complete" RTAS call to invalidate dump. + * For kernel interface, read() will be continued until the + * return value = 0. Means kernel API will return this value only + * after issued "dump complete" call. So nothing to do further + * for the last RTAS call. + */ + if (buffer == NULL) + return 0; If I understand this correctly, it looks like calling rtas_platform_dump() with buf = NULL and length = 0, now does nothing. Dump is already invalidated even before userspace makes this call. Wouldn't this break ABI ? Correct me if I am wrong. Consider a scenario where userspace tool using librtas interface rtas_platform_dump() reads the last set of data but unable to write it to the disk due to insufficient disk space. In that case, tool may error out without invalidating the dump and expect user to cleanup the disk space, re-run the tool to save platform dump to disk. If last read implicitly invalidates the dump, then in this scenario user will loose the platform dump. Shouldn't we wait for explicit request from user to invalidate the dump to avoid this ? Thanks, -Mahesh.
On Wed, 2024-11-27 at 00:42 +0530, Mahesh J Salgaonkar wrote: > On 2024-11-23 21:20:39 Sat, Haren Myneni wrote: > [...] > > +static ssize_t papr_platform_dump_handle_read(struct file *file, > > + char __user *buf, size_t size, loff_t *off) > > +{ > > + struct ibm_platform_dump_params *params = file->private_data; > > + u64 total_bytes; > > + s32 fwrc; > > + > > + /* > > + * Dump already completed with the previous read calls. > > + * In case if the user space issues further reads, returns > > + * -EINVAL. > > + */ > > + if (!params->buf_length) { > > + pr_warn_once("Platform dump completed for dump ID > > %llu\n", > > + (u64) (((u64)params->dump_tag_hi << 32) | > > + params->dump_tag_lo)); > > + return -EINVAL; > > + } > > + > > + if (size < SZ_1K) { > > + pr_err_once("Buffer length should be minimum 1024 > > bytes\n"); > > + return -EINVAL; > > + } > > + > > + /* > > + * The hypervisor returns status 0 if no more data available to > > + * download. Then expects the last RTAS call with NULL buffer > > + * to invalidate the dump which means dump will be freed in the > > + * hypervisor. > > + */ > > + if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) { > > + params->buf_length = 0; > > + fwrc = rtas_ibm_platform_dump(params, 0, 0); > > + /* > > + * Returns 0 (success) to the user space so that user > > + * space read stops. > > Does this implicitly invalidates the dump irrespective of whether > userspace has requested or not ? No, the RTAS call from the last read() will invalidate the dump. Expect the user space make the read() until returns 0 which means the last read() will return 0 after invalidate the dump. size_t read() { if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) { RTAS call to invalidate dump return 0; } get the data from RTAS call If the RTAS call return status is DUMP_COMPLETE params->status = RTAS_IBM_PLATFORM_DUMP_COMPLETE return bytes_written } If the RTAS call returns DUMP_COMPLETE, the hypervisor expects one more RTAS call to invalidate the dump which is done as part of the last read() > > Copy-pasting bellow code snippet from librtas side patch posted by > you to > librtas-devel mailing list: > + /* > + * rtas_platform_dump() is called with buf = NULL and length = 0 > + * for "dump complete" RTAS call to invalidate dump. > + * For kernel interface, read() will be continued until the > + * return value = 0. Means kernel API will return this value only > + * after issued "dump complete" call. So nothing to do further > + * for the last RTAS call. > + */ > + if (buffer == NULL) > + return 0; > > If I understand this correctly, it looks like calling > rtas_platform_dump() with buf = NULL and length = 0, now does > nothing. Following the same read() ABI - expects to make calls until returns size 0. The current usage of rtas_platform_dump() in ppc64- diag/rtas_errd/extract_platdump.c dump_complete = rtas_platform_dump(dump_tag, 0, dump_buf, DUMP_BUF_SZ, &seq_next, &bytes); while (dump_complete) { dump_complete = rtas_platform_dump(dump_tag, seq, dump_buf, DUMP_BUF_SZ, &seq_next, &bytes); } ret = rtas_platform_dump(dump_tag, seq, NULL, 0, &seq_next, &bytes); we need to support both new and old interfaces and not changing the above code which uses librtas API. So the new rtas_platform_dump() interface { if the buffer == NULL return - nothing to do here. Dump is invalidated with the previous rtas_platform_dump() size = read() if size == 0 dump complete and invalidate the dump return 0 return 1; } > Dump is already invalidated even before userspace makes this call. > Wouldn't this break ABI ? Correct me if I am wrong. > > Consider a scenario where userspace tool using librtas interface > rtas_platform_dump() reads the last set of data but unable to write > it > to the disk due to insufficient disk space. In that case, tool may > error > out without invalidating the dump and expect user to cleanup the disk > space, re-run the tool to save platform dump to disk. If last read > implicitly invalidates the dump, then in this scenario user will > loose > the platform dump. Shouldn't we wait for explicit request from user > to > invalidate the dump to avoid this ? In case if the user space could not proceed with more read() calls due to disk space issue or etc, the dump is still valid and the user space can restart with sequenece 0. it means the dump has to be written completly to the disk before issuing the last read() call which returns size = 0 after invalidating the dump. Thanks Haren > > Thanks, > -Mahesh.
On Tue, Nov 26, 2024 at 12:40:20PM -0800, Haren Myneni wrote: > On Wed, 2024-11-27 at 00:42 +0530, Mahesh J Salgaonkar wrote: > > On 2024-11-23 21:20:39 Sat, Haren Myneni wrote: > > [...] > > > +static ssize_t papr_platform_dump_handle_read(struct file *file, > > > + char __user *buf, size_t size, loff_t *off) > > > +{ > > > + struct ibm_platform_dump_params *params = file->private_data; > > > + u64 total_bytes; > > > + s32 fwrc; > > > + > > > + /* > > > + * Dump already completed with the previous read calls. > > > + * In case if the user space issues further reads, returns > > > + * -EINVAL. > > > + */ > > > + if (!params->buf_length) { > > > + pr_warn_once("Platform dump completed for dump ID > > > %llu\n", > > > + (u64) (((u64)params->dump_tag_hi << 32) | > > > + params->dump_tag_lo)); > > > + return -EINVAL; > > > + } > > > + > > > + if (size < SZ_1K) { > > > + pr_err_once("Buffer length should be minimum 1024 > > > bytes\n"); > > > + return -EINVAL; > > > + } > > > + > > > + /* > > > + * The hypervisor returns status 0 if no more data available to > > > + * download. Then expects the last RTAS call with NULL buffer > > > + * to invalidate the dump which means dump will be freed in the > > > + * hypervisor. > > > + */ > > > + if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) { > > > + params->buf_length = 0; > > > + fwrc = rtas_ibm_platform_dump(params, 0, 0); > > > + /* > > > + * Returns 0 (success) to the user space so that user > > > + * space read stops. > > > > Does this implicitly invalidates the dump irrespective of whether > > userspace has requested or not ? > > No, the RTAS call from the last read() will invalidate the dump. Expect > the user space make the read() until returns 0 which means the last > read() will return 0 after invalidate the dump. > > size_t read() > { > if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) { > RTAS call to invalidate dump > return 0; > } > get the data from RTAS call > If the RTAS call return status is DUMP_COMPLETE > params->status = RTAS_IBM_PLATFORM_DUMP_COMPLETE > return bytes_written > } > > If the RTAS call returns DUMP_COMPLETE, the hypervisor expects one more > RTAS call to invalidate the dump which is done as part of the last > read() > > > > > Copy-pasting bellow code snippet from librtas side patch posted by > > you to > > librtas-devel mailing list: > > + /* > > + * rtas_platform_dump() is called with buf = NULL and length = 0 > > + * for "dump complete" RTAS call to invalidate dump. > > + * For kernel interface, read() will be continued until the > > + * return value = 0. Means kernel API will return this value only > > + * after issued "dump complete" call. So nothing to do further > > + * for the last RTAS call. > > + */ > > + if (buffer == NULL) > > + return 0; > > > > If I understand this correctly, it looks like calling > > rtas_platform_dump() with buf = NULL and length = 0, now does > > nothing. > Following the same read() ABI - expects to make calls until returns > size 0. > > The current usage of rtas_platform_dump() in ppc64- > diag/rtas_errd/extract_platdump.c > > dump_complete = rtas_platform_dump(dump_tag, 0, dump_buf, DUMP_BUF_SZ, > &seq_next, &bytes); > > while (dump_complete) { > > dump_complete = rtas_platform_dump(dump_tag, seq, dump_buf, > DUMP_BUF_SZ, &seq_next, &bytes); > > } > > ret = rtas_platform_dump(dump_tag, seq, NULL, 0, > &seq_next, &bytes); > > we need to support both new and old interfaces and not changing the > above code which uses librtas API. > > So the new rtas_platform_dump() interface > { > if the buffer == NULL > return - nothing to do here. Dump is invalidated with the previous > rtas_platform_dump() > > size = read() > if size == 0 > dump complete and invalidate the dump > return 0 > > return 1; > } No EOF? So no standard file handling code can use this FD? But also the size 0 read both indicates the EOF and invalidates the dump, these should be separate. Which differs from the hypervisor API that makes it impossible to save the dump without deleting it, and introduces a regression. If you are doing IOCTLs anyway the invalidation could be an IOCTL. Or you could really follow the RTAS ABI and only incalidate if the user passes NULL and 0. But more generally the previously added RTAS wrappers do not closely follow the RTAS ABI, and do accumulation of read data in the kernel, passing the whole buffer to userspace afterwards. Why cannot it be done in this case? Even more generally if the dump IDs are stable these could be listed in some sysfs or debugfs directory, and provide standard file operations, including unlink() to remove the dump. With the bonus that at least one of these filesystems has some provisions for confidentiality lockdown. The implementation could use that to make the dumps unavailable in confidentiality lockdown level if the dumps are considered confidential without reimplementing the check. Thanks Michal
On Wed, 2024-11-27 at 10:11 +0100, Michal Suchánek wrote: > On Tue, Nov 26, 2024 at 12:40:20PM -0800, Haren Myneni wrote: > > On Wed, 2024-11-27 at 00:42 +0530, Mahesh J Salgaonkar wrote: > > > On 2024-11-23 21:20:39 Sat, Haren Myneni wrote: > > > [...] > > > > +static ssize_t papr_platform_dump_handle_read(struct file > > > > *file, > > > > + char __user *buf, size_t size, loff_t *off) > > > > +{ > > > > + struct ibm_platform_dump_params *params = file- > > > > >private_data; > > > > + u64 total_bytes; > > > > + s32 fwrc; > > > > + > > > > + /* > > > > + * Dump already completed with the previous read calls. > > > > + * In case if the user space issues further reads, > > > > returns > > > > + * -EINVAL. > > > > + */ > > > > + if (!params->buf_length) { > > > > + pr_warn_once("Platform dump completed for dump > > > > ID > > > > %llu\n", > > > > + (u64) (((u64)params->dump_tag_hi << 32) > > > > | > > > > + params->dump_tag_lo)); > > > > + return -EINVAL; > > > > + } > > > > + > > > > + if (size < SZ_1K) { > > > > + pr_err_once("Buffer length should be minimum > > > > 1024 > > > > bytes\n"); > > > > + return -EINVAL; > > > > + } > > > > + > > > > + /* > > > > + * The hypervisor returns status 0 if no more data > > > > available to > > > > + * download. Then expects the last RTAS call with NULL > > > > buffer > > > > + * to invalidate the dump which means dump will be > > > > freed in the > > > > + * hypervisor. > > > > + */ > > > > + if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) > > > > { > > > > + params->buf_length = 0; > > > > + fwrc = rtas_ibm_platform_dump(params, 0, 0); > > > > + /* > > > > + * Returns 0 (success) to the user space so > > > > that user > > > > + * space read stops. > > > > > > Does this implicitly invalidates the dump irrespective of whether > > > userspace has requested or not ? > > > > No, the RTAS call from the last read() will invalidate the dump. > > Expect > > the user space make the read() until returns 0 which means the last > > read() will return 0 after invalidate the dump. > > > > size_t read() > > { > > if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) { > > RTAS call to invalidate dump > > return 0; > > } > > get the data from RTAS call > > If the RTAS call return status is DUMP_COMPLETE > > params->status = RTAS_IBM_PLATFORM_DUMP_COMPLETE > > return bytes_written > > } > > > > If the RTAS call returns DUMP_COMPLETE, the hypervisor expects one > > more > > RTAS call to invalidate the dump which is done as part of the last > > read() > > > > > Copy-pasting bellow code snippet from librtas side patch posted > > > by > > > you to > > > librtas-devel mailing list: > > > + /* > > > + * rtas_platform_dump() is called with buf = NULL and length = 0 > > > + * for "dump complete" RTAS call to invalidate dump. > > > + * For kernel interface, read() will be continued until the > > > + * return value = 0. Means kernel API will return this value > > > only > > > + * after issued "dump complete" call. So nothing to do further > > > + * for the last RTAS call. > > > + */ > > > + if (buffer == NULL) > > > + return 0; > > > > > > If I understand this correctly, it looks like calling > > > rtas_platform_dump() with buf = NULL and length = 0, now does > > > nothing. > > Following the same read() ABI - expects to make calls until returns > > size 0. > > > > The current usage of rtas_platform_dump() in ppc64- > > diag/rtas_errd/extract_platdump.c > > > > dump_complete = rtas_platform_dump(dump_tag, 0, dump_buf, > > DUMP_BUF_SZ, > > &seq_next, &bytes); > > > > while (dump_complete) { > > > > dump_complete = rtas_platform_dump(dump_tag, seq, dump_buf, > > DUMP_BUF_SZ, &seq_next, &bytes); > > > > } > > > > ret = rtas_platform_dump(dump_tag, seq, NULL, 0, > > &seq_next, &bytes); > > > > we need to support both new and old interfaces and not changing the > > above code which uses librtas API. > > > > So the new rtas_platform_dump() interface > > { > > if the buffer == NULL > > return - nothing to do here. Dump is invalidated with the > > previous > > rtas_platform_dump() > > > > size = read() > > if size == 0 > > dump complete and invalidate the dump > > return 0 > > > > return 1; > > } > > No EOF? read() returns size, 0 or < 0. Returns 0 is like EOF. > > So no standard file handling code can use this FD? Yes, providing support for FD from ioctl for the following reasons: - get-vpd char driver is providing only for FD from ioctl. So thought of using same interface for platform-dump so that having consitent interface for all RTAS function char drivers. - file->private_data is assigned to miscdevice in misc_register and also assigned to some other miscdevice struct in driver specific code. So was thinking of not following semantics in the existing code if I private_data to save internal param struct. Please let me know if you prefer to use FD from open() for platform- dump read(). > > But also the size 0 read both indicates the EOF and invalidates the > dump, these should be separate. > > Which differs from the hypervisor API that makes it impossible to > save > the dump without deleting it, and introduces a regression. The hypervisor API says to invalidate the dump after saving it. In the current interface it does - The user space makes the last read() (for return 0) after saving the complete dump. All read() calls return size (> 0) == RTAS calls for dump read() expects return 0 == same RTAS invalidate dump So the only difference is if the user does not call to invalidate dump explicitly even though saved the dump, but we do not have the current usage, Only the extract-dump command is the only usage now and please note that this command is used for non-HMC manages system. It is not used on HMC managed system which has its own command gets the dump directly from the hypervisor. > > If you are doing IOCTLs anyway the invalidation could be an IOCTL. Or > you could really follow the RTAS ABI and only incalidate if the user > passes NULL and 0. I could use this ioctl interface to invalidate the dump. devfd = ioctl(fd ..) for dump ID read(devfd) ret = ioctl(devfd ...) to invalidate dump I will make changes if you are OK with this interface > > But more generally the previously added RTAS wrappers do not closely > follow the RTAS ABI, and do accumulation of read data in the kernel, > passing the whole buffer to userspace afterwards. > > Why cannot it be done in this case? platform-dump RTAS returns more data unlike in get-vpd. In one case noticed 3G data which is not the ideal case to save in the kernel buffer within ioctl. Also platform-dump RTAS function supports interleave calls for multiple dump IDs at the same time unlike in get-vpd case. > > Even more generally if the dump IDs are stable these could be listed > in > some sysfs or debugfs directory, and provide standard file > operations, > including unlink() to remove the dump. dump IDs are not stable. The hypervisor can have several dumps with different IDs at the same time. so unlink() can not be used. > > With the bonus that at least one of these filesystems has some > provisions for confidentiality lockdown. The implementation could use > that to make the dumps unavailable in confidentiality lockdown level > if > the dumps are considered confidential without reimplementing the > check. We are adding new driver (interfaces) to support lockdown. Otherwise the current usage is sufficient. But we could modify to restrict the interface for confidentiality lockdown in future if we have that restriction. Thanks Haren > > Thanks > > Michal
On Mon, Dec 02, 2024 at 08:40:05PM -0800, Haren Myneni wrote: > On Wed, 2024-11-27 at 10:11 +0100, Michal Suchánek wrote: > > On Tue, Nov 26, 2024 at 12:40:20PM -0800, Haren Myneni wrote: > > > On Wed, 2024-11-27 at 00:42 +0530, Mahesh J Salgaonkar wrote: > > > > On 2024-11-23 21:20:39 Sat, Haren Myneni wrote: > > > > [...] > > > > > +static ssize_t papr_platform_dump_handle_read(struct file > > > > > *file, > > > > > + char __user *buf, size_t size, loff_t *off) > > > > > +{ > > > > > + struct ibm_platform_dump_params *params = file- > > > > > >private_data; > > > > > + u64 total_bytes; > > > > > + s32 fwrc; > > > > > + > > > > > + /* > > > > > + * Dump already completed with the previous read calls. > > > > > + * In case if the user space issues further reads, > > > > > returns > > > > > + * -EINVAL. > > > > > + */ > > > > > + if (!params->buf_length) { > > > > > + pr_warn_once("Platform dump completed for dump > > > > > ID > > > > > %llu\n", > > > > > + (u64) (((u64)params->dump_tag_hi << 32) > > > > > | > > > > > + params->dump_tag_lo)); > > > > > + return -EINVAL; > > > > > + } > > > > > + > > > > > + if (size < SZ_1K) { > > > > > + pr_err_once("Buffer length should be minimum > > > > > 1024 > > > > > bytes\n"); > > > > > + return -EINVAL; > > > > > + } > > > > > + > > > > > + /* > > > > > + * The hypervisor returns status 0 if no more data > > > > > available to > > > > > + * download. Then expects the last RTAS call with NULL > > > > > buffer > > > > > + * to invalidate the dump which means dump will be > > > > > freed in the > > > > > + * hypervisor. > > > > > + */ > > > > > + if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) > > > > > { > > > > > + params->buf_length = 0; > > > > > + fwrc = rtas_ibm_platform_dump(params, 0, 0); > > > > > + /* > > > > > + * Returns 0 (success) to the user space so > > > > > that user > > > > > + * space read stops. > > > > > > > > Does this implicitly invalidates the dump irrespective of whether > > > > userspace has requested or not ? > > > > > > No, the RTAS call from the last read() will invalidate the dump. > > > Expect > > > the user space make the read() until returns 0 which means the last > > > read() will return 0 after invalidate the dump. > > > > > > size_t read() > > > { > > > if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) { > > > RTAS call to invalidate dump > > > return 0; > > > } > > > get the data from RTAS call > > > If the RTAS call return status is DUMP_COMPLETE > > > params->status = RTAS_IBM_PLATFORM_DUMP_COMPLETE > > > return bytes_written > > > } > > > > > > If the RTAS call returns DUMP_COMPLETE, the hypervisor expects one > > > more > > > RTAS call to invalidate the dump which is done as part of the last > > > read() > > > > > > > Copy-pasting bellow code snippet from librtas side patch posted > > > > by > > > > you to > > > > librtas-devel mailing list: > > > > + /* > > > > + * rtas_platform_dump() is called with buf = NULL and length = 0 > > > > + * for "dump complete" RTAS call to invalidate dump. > > > > + * For kernel interface, read() will be continued until the > > > > + * return value = 0. Means kernel API will return this value > > > > only > > > > + * after issued "dump complete" call. So nothing to do further > > > > + * for the last RTAS call. > > > > + */ > > > > + if (buffer == NULL) > > > > + return 0; > > > > > > > > If I understand this correctly, it looks like calling > > > > rtas_platform_dump() with buf = NULL and length = 0, now does > > > > nothing. > > > Following the same read() ABI - expects to make calls until returns > > > size 0. > > > > > > The current usage of rtas_platform_dump() in ppc64- > > > diag/rtas_errd/extract_platdump.c > > > > > > dump_complete = rtas_platform_dump(dump_tag, 0, dump_buf, > > > DUMP_BUF_SZ, > > > &seq_next, &bytes); > > > > > > while (dump_complete) { > > > > > > dump_complete = rtas_platform_dump(dump_tag, seq, dump_buf, > > > DUMP_BUF_SZ, &seq_next, &bytes); > > > > > > } > > > > > > ret = rtas_platform_dump(dump_tag, seq, NULL, 0, > > > &seq_next, &bytes); > > > > > > we need to support both new and old interfaces and not changing the > > > above code which uses librtas API. > > > > > > So the new rtas_platform_dump() interface > > > { > > > if the buffer == NULL > > > return - nothing to do here. Dump is invalidated with the > > > previous > > > rtas_platform_dump() > > > > > > size = read() > > > if size == 0 > > > dump complete and invalidate the dump > > > return 0 > > > > > > return 1; > > > } > > > > No EOF? > > read() returns size, 0 or < 0. Returns 0 is like EOF. Indeed, looks like the special EOF value is only added by libc, sorry about the noise. > > So no standard file handling code can use this FD? > > Yes, providing support for FD from ioctl for the following reasons: > > - get-vpd char driver is providing only for FD from ioctl. So thought > of using same interface for platform-dump so that having consitent > interface for all RTAS function char drivers. > > - file->private_data is assigned to miscdevice in misc_register and > also assigned to some other miscdevice struct in driver specific code. > So was thinking of not following semantics in the existing code if I > private_data to save internal param struct. > > Please let me know if you prefer to use FD from open() for platform- > dump read(). There is no way to specify the dump id with open() of the character device. So for open() other mechanism would have to be used, such as sysfs/debugfs which can list the IDs as filenames, or something similar. > > But also the size 0 read both indicates the EOF and invalidates the > > dump, these should be separate. > > > > Which differs from the hypervisor API that makes it impossible to > > save > > the dump without deleting it, and introduces a regression. > > The hypervisor API says to invalidate the dump after saving it. In the > current interface it does - The user space makes the last read() (for > return 0) after saving the complete dump. > > All read() calls return size (> 0) == RTAS calls for dump > read() expects return 0 == same RTAS invalidate dump > > So the only difference is if the user does not call to invalidate dump > explicitly even though saved the dump, but we do not have the current > usage, Only the extract-dump command is the only usage now and please > note that this command is used for non-HMC manages system. It is not > used on HMC managed system which has its own command gets the dump > directly from the hypervisor. However, the hypervisor does provide the option to read the dump without invalidating it. At the very least this option is useful for testing. Not making it a separate call makes it needlessly difficult for people testing the feature. If as you say this is used exclusively by extract-dump, and it already calls an IOCTL to obtain a FD it can call another IOCTL on it to do the invalidation. There is hte corner case that the kernel might need to finish the reading of the dump on its own if the userspace did not to perform the IOCTL, or return an error if the dump is not fully read yet. > > If you are doing IOCTLs anyway the invalidation could be an IOCTL. Or > > you could really follow the RTAS ABI and only incalidate if the user > > passes NULL and 0. > > I could use this ioctl interface to invalidate the dump. > > devfd = ioctl(fd ..) for dump ID > read(devfd) > ret = ioctl(devfd ...) to invalidate dump > > I will make changes if you are OK with this interface Yes, for the ioctl interface this looks like a better option. > > But more generally the previously added RTAS wrappers do not closely > > follow the RTAS ABI, and do accumulation of read data in the kernel, > > passing the whole buffer to userspace afterwards. > > > > Why cannot it be done in this case? > > platform-dump RTAS returns more data unlike in get-vpd. In one case > noticed 3G data which is not the ideal case to save in the kernel > buffer within ioctl. > > Also platform-dump RTAS function supports interleave calls for multiple > dump IDs at the same time unlike in get-vpd case. It would be nice to document why this call is incompatible with the existing implementations somewhere in the commit message or a comment. > > Even more generally if the dump IDs are stable these could be listed > > in > > some sysfs or debugfs directory, and provide standard file > > operations, > > including unlink() to remove the dump. > > dump IDs are not stable. The hypervisor can have several dumps with > different IDs at the same time. so unlink() can not be used. I mean stable in the sense that same dump corresponds always to the same ID. Not in the sense that new dumps cannot be addded, and old removed. It is indeed unclear what is the lifecycle of these dump IDs, and how they are created, and how the userspace is obtaining them. Thanks Michal > > With the bonus that at least one of these filesystems has some > > provisions for confidentiality lockdown. The implementation could use > > that to make the dumps unavailable in confidentiality lockdown level > > if > > the dumps are considered confidential without reimplementing the > > check. > > We are adding new driver (interfaces) to support lockdown. Otherwise > the current usage is sufficient. But we could modify to restrict the > interface for confidentiality lockdown in future if we have that > restriction. > > Thanks > Haren > > > > > Thanks > > > > Michal >
On Wed, 2024-12-04 at 17:57 +0100, Michal Suchánek wrote: > On Mon, Dec 02, 2024 at 08:40:05PM -0800, Haren Myneni wrote: > > On Wed, 2024-11-27 at 10:11 +0100, Michal Suchánek wrote: > > > On Tue, Nov 26, 2024 at 12:40:20PM -0800, Haren Myneni wrote: > > > > On Wed, 2024-11-27 at 00:42 +0530, Mahesh J Salgaonkar wrote: > > > > > On 2024-11-23 21:20:39 Sat, Haren Myneni wrote: > > > > > [...] > > > > > > +static ssize_t papr_platform_dump_handle_read(struct file > > > > > > *file, > > > > > > + char __user *buf, size_t size, loff_t *off) > > > > > > +{ > > > > > > + struct ibm_platform_dump_params *params = file- > > > > > > > private_data; > > > > > > + u64 total_bytes; > > > > > > + s32 fwrc; > > > > > > + > > > > > > + /* > > > > > > + * Dump already completed with the previous read calls. > > > > > > + * In case if the user space issues further reads, > > > > > > returns > > > > > > + * -EINVAL. > > > > > > + */ > > > > > > + if (!params->buf_length) { > > > > > > + pr_warn_once("Platform dump completed for dump > > > > > > ID > > > > > > %llu\n", > > > > > > + (u64) (((u64)params->dump_tag_hi << 32) > > > > > > + params->dump_tag_lo)); > > > > > > + return -EINVAL; > > > > > > + } > > > > > > + > > > > > > + if (size < SZ_1K) { > > > > > > + pr_err_once("Buffer length should be minimum > > > > > > 1024 > > > > > > bytes\n"); > > > > > > + return -EINVAL; > > > > > > + } > > > > > > + > > > > > > + /* > > > > > > + * The hypervisor returns status 0 if no more data > > > > > > available to > > > > > > + * download. Then expects the last RTAS call with NULL > > > > > > buffer > > > > > > + * to invalidate the dump which means dump will be > > > > > > freed in the > > > > > > + * hypervisor. > > > > > > + */ > > > > > > + if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) > > > > > > { > > > > > > + params->buf_length = 0; > > > > > > + fwrc = rtas_ibm_platform_dump(params, 0, 0); > > > > > > + /* > > > > > > + * Returns 0 (success) to the user space so > > > > > > that user > > > > > > + * space read stops. > > > > > > > > > > Does this implicitly invalidates the dump irrespective of > > > > > whether > > > > > userspace has requested or not ? > > > > > > > > No, the RTAS call from the last read() will invalidate the > > > > dump. > > > > Expect > > > > the user space make the read() until returns 0 which means the > > > > last > > > > read() will return 0 after invalidate the dump. > > > > > > > > size_t read() > > > > { > > > > if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) { > > > > RTAS call to invalidate dump > > > > return 0; > > > > } > > > > get the data from RTAS call > > > > If the RTAS call return status is DUMP_COMPLETE > > > > params->status = RTAS_IBM_PLATFORM_DUMP_COMPLETE > > > > return bytes_written > > > > } > > > > > > > > If the RTAS call returns DUMP_COMPLETE, the hypervisor expects > > > > one > > > > more > > > > RTAS call to invalidate the dump which is done as part of the > > > > last > > > > read() > > > > > > > > > Copy-pasting bellow code snippet from librtas side patch > > > > > posted > > > > > by > > > > > you to > > > > > librtas-devel mailing list: > > > > > + /* > > > > > + * rtas_platform_dump() is called with buf = NULL and length > > > > > = 0 > > > > > + * for "dump complete" RTAS call to invalidate dump. > > > > > + * For kernel interface, read() will be continued until the > > > > > + * return value = 0. Means kernel API will return this value > > > > > only > > > > > + * after issued "dump complete" call. So nothing to do > > > > > further > > > > > + * for the last RTAS call. > > > > > + */ > > > > > + if (buffer == NULL) > > > > > + return 0; > > > > > > > > > > If I understand this correctly, it looks like calling > > > > > rtas_platform_dump() with buf = NULL and length = 0, now does > > > > > nothing. > > > > Following the same read() ABI - expects to make calls until > > > > returns > > > > size 0. > > > > > > > > The current usage of rtas_platform_dump() in ppc64- > > > > diag/rtas_errd/extract_platdump.c > > > > > > > > dump_complete = rtas_platform_dump(dump_tag, 0, dump_buf, > > > > DUMP_BUF_SZ, > > > > &seq_next, &bytes); > > > > > > > > while (dump_complete) { > > > > > > > > dump_complete = rtas_platform_dump(dump_tag, seq, dump_buf, > > > > DUMP_BUF_SZ, &seq_next, > > > > &bytes); > > > > > > > > } > > > > > > > > ret = rtas_platform_dump(dump_tag, seq, NULL, 0, > > > > &seq_next, &bytes); > > > > > > > > we need to support both new and old interfaces and not changing > > > > the > > > > above code which uses librtas API. > > > > > > > > So the new rtas_platform_dump() interface > > > > { > > > > if the buffer == NULL > > > > return - nothing to do here. Dump is invalidated with the > > > > previous > > > > rtas_platform_dump() > > > > > > > > size = read() > > > > if size == 0 > > > > dump complete and invalidate the dump > > > > return 0 > > > > > > > > return 1; > > > > } > > > > > > No EOF? > > > > read() returns size, 0 or < 0. Returns 0 is like EOF. > > Indeed, looks like the special EOF value is only added by libc, sorry > about the noise. > > > > So no standard file handling code can use this FD? > > > > Yes, providing support for FD from ioctl for the following reasons: > > > > - get-vpd char driver is providing only for FD from ioctl. So > > thought > > of using same interface for platform-dump so that having consitent > > interface for all RTAS function char drivers. > > > > - file->private_data is assigned to miscdevice in misc_register and > > also assigned to some other miscdevice struct in driver specific > > code. > > So was thinking of not following semantics in the existing code if > > I > > private_data to save internal param struct. > > > > Please let me know if you prefer to use FD from open() for > > platform- > > dump read(). > > There is no way to specify the dump id with open() of the character > device. So for open() other mechanism would have to be used, such as > sysfs/debugfs which can list the IDs as filenames, or something > similar. you mentioned about standard file handling code using FD from open. So I was referring to passing dump ID with ioctl as in the current patch, but using FD returned from open() for read() calls like: fd = open() ret = ioctl(fd,.. dumpid) read(fd, buf, size) As in get-vpd, we will have one other RTAS stream function get-indices. I will add this support later. So I am proposing same interfaces for RTAS functions as provided by papr-vpd so that same type of interfaces will be used across all RTAS functions fd = open() devfd = ioctl(fd,.. dumpid) read(devfd, buf, size) --> will just returns 0 if the previous RTAS call returns dump complete status. But will not invalidate the dump. ret = ioctl(devfd, .. dumpid) - can be used to invalidate dump as you suggested. > > > > But also the size 0 read both indicates the EOF and invalidates > > > the > > > dump, these should be separate. > > > > > > Which differs from the hypervisor API that makes it impossible to > > > save > > > the dump without deleting it, and introduces a regression. > > > > The hypervisor API says to invalidate the dump after saving it. In > > the > > current interface it does - The user space makes the last read() > > (for > > return 0) after saving the complete dump. > > > > All read() calls return size (> 0) == RTAS calls for dump > > read() expects return 0 == same RTAS invalidate dump > > > > So the only difference is if the user does not call to invalidate > > dump > > explicitly even though saved the dump, but we do not have the > > current > > usage, Only the extract-dump command is the only usage now and > > please > > note that this command is used for non-HMC manages system. It is > > not > > used on HMC managed system which has its own command gets the dump > > directly from the hypervisor. > > However, the hypervisor does provide the option to read the dump > without > invalidating it. > > At the very least this option is useful for testing. Not making it a > separate call makes it needlessly difficult for people testing the > feature. > > If as you say this is used exclusively by extract-dump, and it > already > calls an IOCTL to obtain a FD it can call another IOCTL on it to do > the > invalidation. There is hte corner case that the kernel might need to > finish the reading of the dump on its own if the userspace did not to > perform the IOCTL, or return an error if the dump is not fully read > yet. kernel should not read the remaining dump if the user space did not issue or returns an error if the complete dump is not read. should leave it to the user space as it is doing now. As Mahesh pointed before, what if the user space gets exited for some reason in the middle of dump collection. should be same before but one more ioctl to invaldate the dump read() { if the status is 'dump complete' return 0; if the status is 'dump invalidated' return -EINVAL -- means user space issued ioctl to invalidate before. should not expect this but in case RTAS call to get the dump return size; } ioctl(devfd, ...) { if the status is 'dump complete' RTAS call to invalidate dump return 0; return -EINVAL --- expects the user space to issue this ioctl after read() returns 0 } > > > > If you are doing IOCTLs anyway the invalidation could be an > > > IOCTL. Or > > > you could really follow the RTAS ABI and only incalidate if the > > > user > > > passes NULL and 0. > > > > I could use this ioctl interface to invalidate the dump. > > > > devfd = ioctl(fd ..) for dump ID > > read(devfd) > > ret = ioctl(devfd ...) to invalidate dump > > > > I will make changes if you are OK with this interface > > Yes, for the ioctl interface this looks like a better option. Thanks for your suggestion. As I mentioned above, will add this change. > > > > But more generally the previously added RTAS wrappers do not > > > closely > > > follow the RTAS ABI, and do accumulation of read data in the > > > kernel, > > > passing the whole buffer to userspace afterwards. > > > > > > Why cannot it be done in this case? > > > > platform-dump RTAS returns more data unlike in get-vpd. In one case > > noticed 3G data which is not the ideal case to save in the kernel > > buffer within ioctl. > > > > Also platform-dump RTAS function supports interleave calls for > > multiple > > dump IDs at the same time unlike in get-vpd case. > > It would be nice to document why this call is incompatible with the > existing implementations somewhere in the commit message or a > comment. I think mentioned about interleave calls. Sure will update in comments. > > > > Even more generally if the dump IDs are stable these could be > > > listed > > > in > > > some sysfs or debugfs directory, and provide standard file > > > operations, > > > including unlink() to remove the dump. > > > > dump IDs are not stable. The hypervisor can have several dumps with > > different IDs at the same time. so unlink() can not be used. > > I mean stable in the sense that same dump corresponds always to the > same > ID. Not in the sense that new dumps cannot be addded, and old > removed. > > It is indeed unclear what is the lifecycle of these dump IDs, and how > they are created, and how the userspace is obtaining them. Dump IDs can be 64bit value and so I think they may not be reused. Whenever the hypervisor has the dump ot if the user initiates the dump with BMC interface, notifies to the partition with the dump ID and automatically executes extract_platdump command. We should be OK Since using ioctl to invalidate fd = open() devfd = ioctl(fd,.. dumpid) read(devfd, buf, size) ret = ioctl(devfd, .. dumpid) Will implement the above change. Please let me know if you have any concerns. Thanks Haren > > Thanks > > Michal > > > > With the bonus that at least one of these filesystems has some > > > provisions for confidentiality lockdown. The implementation could > > > use > > > that to make the dumps unavailable in confidentiality lockdown > > > level > > > if > > > the dumps are considered confidential without reimplementing the > > > check. > > > > We are adding new driver (interfaces) to support lockdown. > > Otherwise > > the current usage is sufficient. But we could modify to restrict > > the > > interface for confidentiality lockdown in future if we have that > > restriction. > > > > Thanks > > Haren > > > > > Thanks > > > > > > Michal
On Wed, Dec 04, 2024 at 06:14:06PM -0800, Haren Myneni wrote: > On Wed, 2024-12-04 at 17:57 +0100, Michal Suchánek wrote: > > On Mon, Dec 02, 2024 at 08:40:05PM -0800, Haren Myneni wrote: > > > On Wed, 2024-11-27 at 10:11 +0100, Michal Suchánek wrote: > > > > On Tue, Nov 26, 2024 at 12:40:20PM -0800, Haren Myneni wrote: > > > > > On Wed, 2024-11-27 at 00:42 +0530, Mahesh J Salgaonkar wrote: > > > > > > On 2024-11-23 21:20:39 Sat, Haren Myneni wrote: > > > > > > [...] > > > > > > > +static ssize_t papr_platform_dump_handle_read(struct file > > > > > > > *file, > > > > > > > + char __user *buf, size_t size, loff_t *off) > > > > > > > +{ > > > > > > > + struct ibm_platform_dump_params *params = file- > > > > > > > > private_data; > > > > > > > + u64 total_bytes; > > > > > > > + s32 fwrc; > > > > > > > + > > > > > > > + /* > > > > > > > + * Dump already completed with the previous read calls. > > > > > > > + * In case if the user space issues further reads, > > > > > > > returns > > > > > > > + * -EINVAL. > > > > > > > + */ > > > > > > > + if (!params->buf_length) { > > > > > > > + pr_warn_once("Platform dump completed for dump > > > > > > > ID > > > > > > > %llu\n", > > > > > > > + (u64) (((u64)params->dump_tag_hi << 32) > > > > > > > + params->dump_tag_lo)); > > > > > > > + return -EINVAL; > > > > > > > + } > > > > > > > + > > > > > > > + if (size < SZ_1K) { > > > > > > > + pr_err_once("Buffer length should be minimum > > > > > > > 1024 > > > > > > > bytes\n"); > > > > > > > + return -EINVAL; > > > > > > > + } > > > > > > > + > > > > > > > + /* > > > > > > > + * The hypervisor returns status 0 if no more data > > > > > > > available to > > > > > > > + * download. Then expects the last RTAS call with NULL > > > > > > > buffer > > > > > > > + * to invalidate the dump which means dump will be > > > > > > > freed in the > > > > > > > + * hypervisor. > > > > > > > + */ > > > > > > > + if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) > > > > > > > { > > > > > > > + params->buf_length = 0; > > > > > > > + fwrc = rtas_ibm_platform_dump(params, 0, 0); > > > > > > > + /* > > > > > > > + * Returns 0 (success) to the user space so > > > > > > > that user > > > > > > > + * space read stops. > > > > > > > > > > > > Does this implicitly invalidates the dump irrespective of > > > > > > whether > > > > > > userspace has requested or not ? > > > > > > > > > > No, the RTAS call from the last read() will invalidate the > > > > > dump. > > > > > Expect > > > > > the user space make the read() until returns 0 which means the > > > > > last > > > > > read() will return 0 after invalidate the dump. > > > > > > > > > > size_t read() > > > > > { > > > > > if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) { > > > > > RTAS call to invalidate dump > > > > > return 0; > > > > > } > > > > > get the data from RTAS call > > > > > If the RTAS call return status is DUMP_COMPLETE > > > > > params->status = RTAS_IBM_PLATFORM_DUMP_COMPLETE > > > > > return bytes_written > > > > > } > > > > > > > > > > If the RTAS call returns DUMP_COMPLETE, the hypervisor expects > > > > > one > > > > > more > > > > > RTAS call to invalidate the dump which is done as part of the > > > > > last > > > > > read() > > > Yes, providing support for FD from ioctl for the following reasons: > > > > > > - get-vpd char driver is providing only for FD from ioctl. So > > > thought > > > of using same interface for platform-dump so that having consitent > > > interface for all RTAS function char drivers. > > > > > > - file->private_data is assigned to miscdevice in misc_register and > > > also assigned to some other miscdevice struct in driver specific > > > code. > > > So was thinking of not following semantics in the existing code if > > > I > > > private_data to save internal param struct. > > > > > > Please let me know if you prefer to use FD from open() for > > > platform- > > > dump read(). > > > > There is no way to specify the dump id with open() of the character > > device. So for open() other mechanism would have to be used, such as > > sysfs/debugfs which can list the IDs as filenames, or something > > similar. > > you mentioned about standard file handling code using FD from open. So > I was referring to passing dump ID with ioctl as in the current patch, > but using FD returned from open() for read() calls like: > fd = open() > ret = ioctl(fd,.. dumpid) > read(fd, buf, size) > > As in get-vpd, we will have one other RTAS stream function get-indices. It would be much clearer if these two related functions were implemented together. > I will add this support later. So I am proposing same interfaces for > RTAS functions as provided by papr-vpd so that same type of interfaces > will be used across all RTAS functions The proposed interface is similar but subtly different. While papr_vpd accumulates the data in kernel and returns them in single call, this API requires the user to call read() repeatedly to obtain the data, unlike papr_vpd. With the requirement to introduce a new API it can as well be designed to fit the call in question. > fd = open() > devfd = ioctl(fd,.. dumpid) > read(devfd, buf, size) --> will just returns 0 if the previous RTAS > call returns dump complete status. But will not invalidate the dump. > ret = ioctl(devfd, .. dumpid) - can be used to invalidate dump as you > suggested. > > > > > > > But also the size 0 read both indicates the EOF and invalidates > > > > the > > > > dump, these should be separate. > > > > > > > > Which differs from the hypervisor API that makes it impossible to > > > > save > > > > the dump without deleting it, and introduces a regression. > > > > > > The hypervisor API says to invalidate the dump after saving it. In > > > the > > > current interface it does - The user space makes the last read() > > > (for > > > return 0) after saving the complete dump. > > > > > > All read() calls return size (> 0) == RTAS calls for dump > > > read() expects return 0 == same RTAS invalidate dump > > > > > > So the only difference is if the user does not call to invalidate > > > dump > > > explicitly even though saved the dump, but we do not have the > > > current > > > usage, Only the extract-dump command is the only usage now and > > > please > > > note that this command is used for non-HMC manages system. It is > > > not > > > used on HMC managed system which has its own command gets the dump > > > directly from the hypervisor. > > > > However, the hypervisor does provide the option to read the dump > > without > > invalidating it. > > > > At the very least this option is useful for testing. Not making it a > > separate call makes it needlessly difficult for people testing the > > feature. > > > > If as you say this is used exclusively by extract-dump, and it > > already > > calls an IOCTL to obtain a FD it can call another IOCTL on it to do > > the > > invalidation. There is hte corner case that the kernel might need to > > finish the reading of the dump on its own if the userspace did not to > > perform the IOCTL, or return an error if the dump is not fully read > > yet. > > kernel should not read the remaining dump if the user space did not > issue or returns an error if the complete dump is not read. should > leave it to the user space as it is doing now. As Mahesh pointed > before, what if the user space gets exited for some reason in the > middle of dump collection. > > should be same before but one more ioctl to invaldate the dump > > read() > { > if the status is 'dump complete' > return 0; > > if the status is 'dump invalidated' > return -EINVAL -- means user space issued ioctl to invalidate > before. should not expect this but in case > > RTAS call to get the dump > return size; > } > > ioctl(devfd, ...) > { > if the status is 'dump complete' > RTAS call to invalidate dump > return 0; > > return -EINVAL --- expects the user space to issue this ioctl after > read() returns 0 > } That makes not a very nice API, though. Also EINVAL is not exactly helpful: the call arguments are valid, the kernel merely refused to do the remaining reads until it can perform the invalidation according to the hypervisor API. > > > > If you are doing IOCTLs anyway the invalidation could be an > > > > IOCTL. Or > > > > you could really follow the RTAS ABI and only incalidate if the > > > > user > > > > passes NULL and 0. > > > > > > I could use this ioctl interface to invalidate the dump. > > > > > > devfd = ioctl(fd ..) for dump ID > > > read(devfd) > > > ret = ioctl(devfd ...) to invalidate dump > > > > > > I will make changes if you are OK with this interface > > > > Yes, for the ioctl interface this looks like a better option. > > Thanks for your suggestion. As I mentioned above, will add this change. > > > > > > But more generally the previously added RTAS wrappers do not > > > > closely > > > > follow the RTAS ABI, and do accumulation of read data in the > > > > kernel, > > > > passing the whole buffer to userspace afterwards. > > > > > > > > Why cannot it be done in this case? > > > > > > platform-dump RTAS returns more data unlike in get-vpd. In one case > > > noticed 3G data which is not the ideal case to save in the kernel > > > buffer within ioctl. > > > > > > Also platform-dump RTAS function supports interleave calls for > > > multiple > > > dump IDs at the same time unlike in get-vpd case. > > > > It would be nice to document why this call is incompatible with the > > existing implementations somewhere in the commit message or a > > comment. > > I think mentioned about interleave calls. Sure will update in comments. > > > > > > > Even more generally if the dump IDs are stable these could be > > > > listed > > > > in > > > > some sysfs or debugfs directory, and provide standard file > > > > operations, > > > > including unlink() to remove the dump. > > > > > > dump IDs are not stable. The hypervisor can have several dumps with > > > different IDs at the same time. so unlink() can not be used. > > > > I mean stable in the sense that same dump corresponds always to the > > same > > ID. Not in the sense that new dumps cannot be addded, and old > > removed. > > > > It is indeed unclear what is the lifecycle of these dump IDs, and how > > they are created, and how the userspace is obtaining them. > > Dump IDs can be 64bit value and so I think they may not be reused. > Whenever the hypervisor has the dump ot if the user initiates the dump > with BMC interface, notifies to the partition with the dump ID and > automatically executes extract_platdump command. So the reason you can get away with not implementing get-indices is that the dump ID is passed in an event that invokes saving the dump. That should perhaps be documented as well. Also it does sound like get-indices together with these dump reading functions would work with a filesystem API with readdir returning the get-indices, and then the filename in open() or unlink() giving the dump ID. It's not really a problem if dump IDs are reused over time, it's also possible to delete a file, and create one with the same name later. What would be a problem for this is if get-indices gave different results each time even without the available dumps changing. However, it would make unlink() quite inefficient requiring to read the dump a second time. Also if the only known tool used for reading dumps always reads the dump once and then invalidates it repeated dump reading may trigger not yet encountered hyperviisor bugs. Thanks Michal > > We should be OK Since using ioctl to invalidate > > fd = open() > devfd = ioctl(fd,.. dumpid) > read(devfd, buf, size) > ret = ioctl(devfd, .. dumpid) > > Will implement the above change. Please let me know if you have any > concerns. > > Thanks > Haren > > > > > Thanks > > > > Michal > > > > > > With the bonus that at least one of these filesystems has some > > > > provisions for confidentiality lockdown. The implementation could > > > > use > > > > that to make the dumps unavailable in confidentiality lockdown > > > > level > > > > if > > > > the dumps are considered confidential without reimplementing the > > > > check. > > > > > > We are adding new driver (interfaces) to support lockdown. > > > Otherwise > > > the current usage is sufficient. But we could modify to restrict > > > the > > > interface for confidentiality lockdown in future if we have that > > > restriction. > > > > > > Thanks > > > Haren > > > > > > > Thanks > > > > > > > > Michal >
On Thu, 2024-12-05 at 11:42 +0100, Michal Suchánek wrote: > On Wed, Dec 04, 2024 at 06:14:06PM -0800, Haren Myneni wrote: > > On Wed, 2024-12-04 at 17:57 +0100, Michal Suchánek wrote: > > > On Mon, Dec 02, 2024 at 08:40:05PM -0800, Haren Myneni wrote: > > > > On Wed, 2024-11-27 at 10:11 +0100, Michal Suchánek wrote: > > > > > On Tue, Nov 26, 2024 at 12:40:20PM -0800, Haren Myneni wrote: > > > > > > On Wed, 2024-11-27 at 00:42 +0530, Mahesh J Salgaonkar > > > > > > wrote: > > > > > > > On 2024-11-23 21:20:39 Sat, Haren Myneni wrote: > > > > > > > [...] > > > > > > > > +static ssize_t papr_platform_dump_handle_read(struct > > > > > > > > file > > > > > > > > *file, > > > > > > > > + char __user *buf, size_t size, loff_t > > > > > > > > *off) > > > > > > > > +{ > > > > > > > > + struct ibm_platform_dump_params *params = file- > > > > > > > > > private_data; > > > > > > > > + u64 total_bytes; > > > > > > > > + s32 fwrc; > > > > > > > > + > > > > > > > > + /* > > > > > > > > + * Dump already completed with the previous > > > > > > > > read calls. > > > > > > > > + * In case if the user space issues further > > > > > > > > reads, > > > > > > > > returns > > > > > > > > + * -EINVAL. > > > > > > > > + */ > > > > > > > > + if (!params->buf_length) { > > > > > > > > + pr_warn_once("Platform dump completed > > > > > > > > for dump > > > > > > > > ID > > > > > > > > %llu\n", > > > > > > > > + (u64) (((u64)params- > > > > > > > > >dump_tag_hi << 32) > > > > > > > > + params->dump_tag_lo)); > > > > > > > > + return -EINVAL; > > > > > > > > + } > > > > > > > > + > > > > > > > > + if (size < SZ_1K) { > > > > > > > > + pr_err_once("Buffer length should be > > > > > > > > minimum > > > > > > > > 1024 > > > > > > > > bytes\n"); > > > > > > > > + return -EINVAL; > > > > > > > > + } > > > > > > > > + > > > > > > > > + /* > > > > > > > > + * The hypervisor returns status 0 if no more > > > > > > > > data > > > > > > > > available to > > > > > > > > + * download. Then expects the last RTAS call > > > > > > > > with NULL > > > > > > > > buffer > > > > > > > > + * to invalidate the dump which means dump will > > > > > > > > be > > > > > > > > freed in the > > > > > > > > + * hypervisor. > > > > > > > > + */ > > > > > > > > + if (params->status == > > > > > > > > RTAS_IBM_PLATFORM_DUMP_COMPLETE) > > > > > > > > { > > > > > > > > + params->buf_length = 0; > > > > > > > > + fwrc = rtas_ibm_platform_dump(params, > > > > > > > > 0, 0); > > > > > > > > + /* > > > > > > > > + * Returns 0 (success) to the user > > > > > > > > space so > > > > > > > > that user > > > > > > > > + * space read stops. > > > > > > > > > > > > > > Does this implicitly invalidates the dump irrespective of > > > > > > > whether > > > > > > > userspace has requested or not ? > > > > > > > > > > > > No, the RTAS call from the last read() will invalidate the > > > > > > dump. > > > > > > Expect > > > > > > the user space make the read() until returns 0 which means > > > > > > the > > > > > > last > > > > > > read() will return 0 after invalidate the dump. > > > > > > > > > > > > size_t read() > > > > > > { > > > > > > if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) { > > > > > > RTAS call to invalidate dump > > > > > > return 0; > > > > > > } > > > > > > get the data from RTAS call > > > > > > If the RTAS call return status is DUMP_COMPLETE > > > > > > params->status = > > > > > > RTAS_IBM_PLATFORM_DUMP_COMPLETE > > > > > > return bytes_written > > > > > > } > > > > > > > > > > > > If the RTAS call returns DUMP_COMPLETE, the hypervisor > > > > > > expects > > > > > > one > > > > > > more > > > > > > RTAS call to invalidate the dump which is done as part of > > > > > > the > > > > > > last > > > > > > read() > > > > Yes, providing support for FD from ioctl for the following > > > > reasons: > > > > > > > > - get-vpd char driver is providing only for FD from ioctl. So > > > > thought > > > > of using same interface for platform-dump so that having > > > > consitent > > > > interface for all RTAS function char drivers. > > > > > > > > - file->private_data is assigned to miscdevice in misc_register > > > > and > > > > also assigned to some other miscdevice struct in driver > > > > specific > > > > code. > > > > So was thinking of not following semantics in the existing code > > > > if > > > > I > > > > private_data to save internal param struct. > > > > > > > > Please let me know if you prefer to use FD from open() for > > > > platform- > > > > dump read(). > > > > > > There is no way to specify the dump id with open() of the > > > character > > > device. So for open() other mechanism would have to be used, such > > > as > > > sysfs/debugfs which can list the IDs as filenames, or something > > > similar. > > > > you mentioned about standard file handling code using FD from open. > > So > > I was referring to passing dump ID with ioctl as in the current > > patch, > > but using FD returned from open() for read() calls like: > > fd = open() > > ret = ioctl(fd,.. dumpid) > > read(fd, buf, size) > > > > As in get-vpd, we will have one other RTAS stream function get- > > indices. > > It would be much clearer if these two related functions were > implemented together. I need more time to implement ibm,get-indices support. Do you prefer to wait platform-dump RTAS call support along with ibm,get-indices? > > > I will add this support later. So I am proposing same interfaces > > for > > RTAS functions as provided by papr-vpd so that same type of > > interfaces > > will be used across all RTAS functions > > The proposed interface is similar but subtly different. While > papr_vpd accumulates the data in kernel and returns them in single > call, this API requires the user to call read() repeatedly to obtain > the data, unlike papr_vpd. > > With the requirement to introduce a new API it can as well be > designed > to fit the call in question. For papr-vpd, the user space can issue read call to get all data. But can also use multiple read calls until EOF. See rtas_get_vpd() implementation in librtas sources get_vpd_chardev() - https://github.com/ibm-power-utilities/librtas/blob/next/librtas_src/vpd.c Adding the following stream RTAS calls support in the kernel for lockdown: ibm,get-vpd: included in papr-vpd.c, Address of location code as input and starts with sequence 0. Interleaved RTAS calls are not supported. So get the complete buffer within ioctl and the buffer is available to the user space with read() calls. current kernel implementation: ioctl: - first RTAS call with sequence 0 and continue for the complete buffer - If need to start, start with sequence 0 again; Interface usage in the user space: see librtas code. fd = open() devfd = ioctl(fd ..) while (read()); ibm,platform-dump: The user space passes dumpID as input with ioctl(). Interleaved RTAS calls are supported - with multiple dump IDs. Also the dump may need large buffer (noticed 3G dump in my testing) if get the complete dump within ioctl. So proposing RTAS call for each read. Then dump has to be invalidated at the end of dump to remove the dump from the hypervisor. fd= open() devfd = ioctl(fd, ) - sets sequence =1; read(devfd, ) { RTAS call with sequence and dump ID which returns next sequence# until end of dump } ioctl(devfd, ..) - invalidate dump -- If this ioctl is not called by the user space, the dump is still valid and the user space can restart the sequence with the original ioctl(fd, ..) which sets the sequence = 1; The current platform-dump implementation: ppc64-diag/rtas_errd/extract_platdump.c { status = First RTAS call with sequence = 1 while (status == dump_complete) status = RTAS call with next sequence status = RTAS call to invalidate dump } get-indices RTAS call: Takes indicator or sensor and type as inputs from the user space with ioctl(). Multiple interleave calls are not supported with different indices. So will be proposing similar interface as in get-vpd - Read complete buffer in ioctl() and the buffer will be available with read(). In the case of get-vpd, each RTAS call returns #bytes written and the user space can read continuous buffer. Where as get-indices, RTAS call does not tell number of bytes written and the output has to be certain format for each RTAS call given below: - Number Returned: 32-bit integer representing the number of indicator indices returned on this call - Sets of (32-bit integer index, 32-bit integer length of location code including NULLs, location code string (NULL terminated and padded to nearest 4 byte boundary)), one set per indicator or sensor, with the number of sets indicated by the first integer in this work buffer The hypervisor determine how many indices can fit in to the requested buffer and fills the buffer. For example if 1K buffer is requested, we may not get the data in all 1K buffer. So will be gettng 1K buffer (fixed) for each RTAS call and appends to the one buffer as in get-vpd during ioctl. Each read returns 1K to the user space (as in the current implementation in librtas) ioctl() { start sequence 1, get 1K buffer from RTAS call for each sequence copy 1K to large buffer if need to start sequence again, start with sequence 1. } read() { return 1K buffer until EOF } The current implementaion in the user space: https://github.com/ibm-power-utilities/librtas/tree/next/librtas_src > > > fd = open() > > devfd = ioctl(fd,.. dumpid) > > read(devfd, buf, size) --> will just returns 0 if the previous > > RTAS > > call returns dump complete status. But will not invalidate the > > dump. > > ret = ioctl(devfd, .. dumpid) - can be used to invalidate dump as > > you > > suggested. > > > > > > > But also the size 0 read both indicates the EOF and > > > > > invalidates > > > > > the > > > > > dump, these should be separate. > > > > > > > > > > Which differs from the hypervisor API that makes it > > > > > impossible to > > > > > save > > > > > the dump without deleting it, and introduces a regression. > > > > > > > > The hypervisor API says to invalidate the dump after saving it. > > > > In > > > > the > > > > current interface it does - The user space makes the last > > > > read() > > > > (for > > > > return 0) after saving the complete dump. > > > > > > > > All read() calls return size (> 0) == RTAS calls for dump > > > > read() expects return 0 == same RTAS invalidate dump > > > > > > > > So the only difference is if the user does not call to > > > > invalidate > > > > dump > > > > explicitly even though saved the dump, but we do not have the > > > > current > > > > usage, Only the extract-dump command is the only usage now and > > > > please > > > > note that this command is used for non-HMC manages system. It > > > > is > > > > not > > > > used on HMC managed system which has its own command gets the > > > > dump > > > > directly from the hypervisor. > > > > > > However, the hypervisor does provide the option to read the dump > > > without > > > invalidating it. > > > > > > At the very least this option is useful for testing. Not making > > > it a > > > separate call makes it needlessly difficult for people testing > > > the > > > feature. > > > > > > If as you say this is used exclusively by extract-dump, and it > > > already > > > calls an IOCTL to obtain a FD it can call another IOCTL on it to > > > do > > > the > > > invalidation. There is hte corner case that the kernel might need > > > to > > > finish the reading of the dump on its own if the userspace did > > > not to > > > perform the IOCTL, or return an error if the dump is not fully > > > read > > > yet. > > > > kernel should not read the remaining dump if the user space did not > > issue or returns an error if the complete dump is not read. should > > leave it to the user space as it is doing now. As Mahesh pointed > > before, what if the user space gets exited for some reason in the > > middle of dump collection. > > > > should be same before but one more ioctl to invaldate the dump > > > > read() > > { > > if the status is 'dump complete' > > return 0; > > > > if the status is 'dump invalidated' > > return -EINVAL -- means user space issued ioctl to > > invalidate > > before. should not expect this but in case > > > > RTAS call to get the dump > > return size; > > } > > > > ioctl(devfd, ...) > > { > > if the status is 'dump complete' > > RTAS call to invalidate dump > > return 0; > > > > return -EINVAL --- expects the user space to issue this ioctl > > after > > read() returns 0 > > } > > That makes not a very nice API, though. > > Also EINVAL is not exactly helpful: the call arguments are valid, the > kernel merely refused to do the remaining reads until it can perform > the > invalidation according to the hypervisor API. We can change to some other errno ex: -EPERM - not allowed since the dump is not completed. "dump-complete' status is returned from RTAS call only after all data is retrieved. So you mean the user space will be reading again starting from sequence 1. Not sure about this special case. But even if the tool starts again. it can restart the dump sequence with new ioctl(fd,..) before dump invalidate. ex: close(devfd) devfd = ioctl(fd, ..) - sets sequence 0 and next read() calls will get the complete dump > > > > > > If you are doing IOCTLs anyway the invalidation could be an > > > > > IOCTL. Or > > > > > you could really follow the RTAS ABI and only incalidate if > > > > > the > > > > > user > > > > > passes NULL and 0. > > > > > > > > I could use this ioctl interface to invalidate the dump. > > > > > > > > devfd = ioctl(fd ..) for dump ID > > > > read(devfd) > > > > ret = ioctl(devfd ...) to invalidate dump > > > > > > > > I will make changes if you are OK with this interface > > > > > > Yes, for the ioctl interface this looks like a better option. > > > > Thanks for your suggestion. As I mentioned above, will add this > > change. > > > > > But more generally the previously added RTAS wrappers do not > > > > > closely > > > > > follow the RTAS ABI, and do accumulation of read data in the > > > > > kernel, > > > > > passing the whole buffer to userspace afterwards. > > > > > > > > > > Why cannot it be done in this case? > > > > > > > > platform-dump RTAS returns more data unlike in get-vpd. In one > > > > case > > > > noticed 3G data which is not the ideal case to save in the > > > > kernel > > > > buffer within ioctl. > > > > > > > > Also platform-dump RTAS function supports interleave calls for > > > > multiple > > > > dump IDs at the same time unlike in get-vpd case. > > > > > > It would be nice to document why this call is incompatible with > > > the > > > existing implementations somewhere in the commit message or a > > > comment. > > > > I think mentioned about interleave calls. Sure will update in > > comments. > > > > > > > Even more generally if the dump IDs are stable these could be > > > > > listed > > > > > in > > > > > some sysfs or debugfs directory, and provide standard file > > > > > operations, > > > > > including unlink() to remove the dump. > > > > > > > > dump IDs are not stable. The hypervisor can have several dumps > > > > with > > > > different IDs at the same time. so unlink() can not be used. > > > > > > I mean stable in the sense that same dump corresponds always to > > > the > > > same > > > ID. Not in the sense that new dumps cannot be addded, and old > > > removed. > > > > > > It is indeed unclear what is the lifecycle of these dump IDs, and > > > how > > > they are created, and how the userspace is obtaining them. > > > > Dump IDs can be 64bit value and so I think they may not be reused. > > Whenever the hypervisor has the dump ot if the user initiates the > > dump > > with BMC interface, notifies to the partition with the dump ID and > > automatically executes extract_platdump command. > > So the reason you can get away with not implementing get-indices is > that > the dump ID is passed in an event that invokes saving the dump. Sorry if I created confusion. platform-dump and get-indices are separate RTAS calls for different usage. > > That should perhaps be documented as well. > > Also it does sound like get-indices together with these dump reading > functions would work with a filesystem API with readdir returning the > get-indices, and then the filename in open() or unlink() giving the > dump > ID. > > It's not really a problem if dump IDs are reused over time, it's also > possible to delete a file, and create one with the same name later. > What > would be a problem for this is if get-indices gave different results > each time even without the available dumps changing. > > However, it would make unlink() quite inefficient requiring to read > the > dump a second time. Are you proposing the following interfaces with unlink() usage: fd= open() ret = ioctl(dumpid) - which should create <dumpID> file name under some sysfs interface devfd = open(sysfs-dumpid-file) read(devfd,..) -- Note: First read should pass sequence=1 to RTAS call unlink(sysfs-dumpid-file) close(devfd) Suppose if the tool likes to read the dump again, read() call has to be with sequence 1. I am not sure how do we tell read to start the dump. I think it becomes more complex interface. So thinking of having similar interface for all RTAS stream calls to make it simple. > > Also if the only known tool used for reading dumps always reads the > dump > once and then invalidates it repeated dump reading may trigger not > yet > encountered hyperviisor bugs. Not sure what kind of bugs are exposed here. Expects the tool follow same semantics as in the current usage. As I mentioned about the behavior of 3 different RTAS calls, please let me know if you have any comments. Thanks Haren > > Thanks > > Michal > > > We should be OK Since using ioctl to invalidate > > > > fd = open() > > devfd = ioctl(fd,.. dumpid) > > read(devfd, buf, size) > > ret = ioctl(devfd, .. dumpid) > > > > Will implement the above change. Please let me know if you have any > > concerns. > > > > Thanks > > Haren > > > > > Thanks > > > > > > Michal > > > > > > > > With the bonus that at least one of these filesystems has > > > > > some > > > > > provisions for confidentiality lockdown. The implementation > > > > > could > > > > > use > > > > > that to make the dumps unavailable in confidentiality > > > > > lockdown > > > > > level > > > > > if > > > > > the dumps are considered confidential without reimplementing > > > > > the > > > > > check. > > > > > > > > We are adding new driver (interfaces) to support lockdown. > > > > Otherwise > > > > the current usage is sufficient. But we could modify to > > > > restrict > > > > the > > > > interface for confidentiality lockdown in future if we have > > > > that > > > > restriction. > > > > > > > > Thanks > > > > Haren > > > > > > > > > Thanks > > > > > > > > > > Michal
diff --git a/arch/powerpc/include/uapi/asm/papr-platform-dump.h b/arch/powerpc/include/uapi/asm/papr-platform-dump.h new file mode 100644 index 000000000000..67f45399f641 --- /dev/null +++ b/arch/powerpc/include/uapi/asm/papr-platform-dump.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI_PAPR_PLATFORM_DUMP_H_ +#define _UAPI_PAPR_PLATFORM_DUMP_H_ + +#include <asm/ioctl.h> +#include <asm/papr-miscdev.h> + +/* + * ioctl for /dev/papr-platform-dump. Returns a VPD handle fd corresponding to + * the location code. + */ +#define PAPR_PLATFORM_DUMP_IOC_CREATE_HANDLE _IOW(PAPR_MISCDEV_IOC_ID, 3, __u64) + +#endif /* _UAPI_PAPR_PLATFORM_DUMP_H_ */ diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile index 7bf506f6b8c8..5de27b5c8c12 100644 --- a/arch/powerpc/platforms/pseries/Makefile +++ b/arch/powerpc/platforms/pseries/Makefile @@ -3,7 +3,7 @@ ccflags-$(CONFIG_PPC_PSERIES_DEBUG) += -DDEBUG obj-y := lpar.o hvCall.o nvram.o reconfig.o \ of_helpers.o rtas-work-area.o papr-sysparm.o \ - papr-vpd.o \ + papr-vpd.o papr-platform-dump.o \ setup.o iommu.o event_sources.o ras.o \ firmware.o power.o dlpar.o mobility.o rng.o \ pci.o pci_dlpar.o eeh_pseries.o msi.o \ diff --git a/arch/powerpc/platforms/pseries/papr-platform-dump.c b/arch/powerpc/platforms/pseries/papr-platform-dump.c new file mode 100644 index 000000000000..671a58630c4a --- /dev/null +++ b/arch/powerpc/platforms/pseries/papr-platform-dump.c @@ -0,0 +1,356 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#define pr_fmt(fmt) "papr-platform-dump: " fmt + +#include <linux/anon_inodes.h> +#include <linux/file.h> +#include <linux/fs.h> +#include <linux/init.h> +#include <linux/kernel.h> +#include <linux/miscdevice.h> +#include <asm/machdep.h> +#include <asm/rtas-work-area.h> +#include <asm/rtas.h> +#include <uapi/asm/papr-platform-dump.h> + +/* + * Function-specific return values for ibm,platform-dump, derived from + * PAPR+ v2.13 7.3.3.4.1 "ibm,platform-dump RTAS Call". + */ +#define RTAS_IBM_PLATFORM_DUMP_COMPLETE 0 /* Complete dump retrieved. */ +#define RTAS_IBM_PLATFORM_DUMP_CONTINUE 1 /* Continue dump */ +#define RTAS_NOT_AUTHORIZED -9002 /* Not Authorized */ + +#define RTAS_IBM_PLATFORM_DUMP_START 2 /* Linux status to start dump */ + +/** + * struct ibm_platform_dump_params - Parameters (in and out) for + * ibm,platform-dump + * @work_area: In: work area buffer for results. + * @buf_length: In: work area buffer length in bytes + * @dump_tag: In: Dump_Tag representing an id of the dump being processed + * @sequence: In: Sequence number. Out: Next sequence number. + * @written: Out: Bytes written by ibm,platform-dump to @work_area. + * @status: Out: RTAS call status. + * @list: Maintain the list of dumps are in progress. Can retrieve + * multiple dumps with different dump IDs at the same time, + * but not with the same dump ID. This list is used to + * determine whether the dump for the same ID is in progress. + */ +struct ibm_platform_dump_params { + struct rtas_work_area *work_area; + u32 buf_length; + u32 dump_tag_hi; + u32 dump_tag_lo; + u32 sequence_hi; + u32 sequence_lo; + u32 bytes_ret_hi; + u32 bytes_ret_lo; + s32 status; + struct list_head list; +}; + +/* + * Multiple dumps with different dump IDs can be retrieved at the same + * time, but not with dame dump ID. platform_dump_list_mutex and + * platform_dump_list are used to prevent this behavior. + */ +static DEFINE_MUTEX(platform_dump_list_mutex); +static LIST_HEAD(platform_dump_list); + +/** + * rtas_ibm_platform_dump() - Call ibm,platform-dump to fill a work area + * buffer. + * @params: See &struct ibm_platform_dump_params. + * @buf_addr: Address of dump buffer (work_area) + * @buf_length: Length of the buffer in bytes (min. 1024) + * + * Calls ibm,platform-dump until it errors or successfully deposits data + * into the supplied work area. Handles RTAS retry statuses. Maps RTAS + * error statuses to reasonable errno values. + * + * Can request multiple dumps with different dump IDs at the same time, + * but not with the same dump ID which is prevented with the check in + * the ioctl code (papr_platform_dump_create_handle()). + * + * The caller should inspect @params.status to determine whether more + * calls are needed to complete the sequence. + * + * Context: May sleep. + * Return: -ve on error, 0 for dump complete and 1 for continue dump + */ +static int rtas_ibm_platform_dump(struct ibm_platform_dump_params *params, + phys_addr_t buf_addr, u32 buf_length) +{ + u32 rets[4]; + s32 fwrc; + int ret = 0; + + do { + fwrc = rtas_call(rtas_function_token(RTAS_FN_IBM_PLATFORM_DUMP), + 6, 5, + rets, + params->dump_tag_hi, + params->dump_tag_lo, + params->sequence_hi, + params->sequence_lo, + buf_addr, + buf_length); + } while (rtas_busy_delay(fwrc)); + + switch (fwrc) { + case RTAS_HARDWARE_ERROR: + ret = -EIO; + break; + case RTAS_NOT_AUTHORIZED: + ret = -EPERM; + break; + case RTAS_IBM_PLATFORM_DUMP_CONTINUE: + case RTAS_IBM_PLATFORM_DUMP_COMPLETE: + params->sequence_hi = rets[0]; + params->sequence_lo = rets[1]; + params->bytes_ret_hi = rets[2]; + params->bytes_ret_lo = rets[3]; + break; + default: + ret = -EIO; + pr_err_ratelimited("unexpected ibm,platform-dump status %d\n", + fwrc); + break; + } + + params->status = fwrc; + return ret; +} + +/* + * Platform dump is used with multiple RTAS calls to retrieve the + * complete dump for the provided dump ID. Once the complete dump is + * retrieved, the hypervisor returns dump complete status (0) for the + * last RTAS call and expects the caller issues one more HCALL with + * NULL buffer to invalidate the dump so that the hypervisor can remove + * the dump. + * + * After the specific dump is invalidated in the hypervisor, expect the + * dump complete status for the new sequence - the user space initiates + * new request for the same dump ID. + */ +static ssize_t papr_platform_dump_handle_read(struct file *file, + char __user *buf, size_t size, loff_t *off) +{ + struct ibm_platform_dump_params *params = file->private_data; + u64 total_bytes; + s32 fwrc; + + /* + * Dump already completed with the previous read calls. + * In case if the user space issues further reads, returns + * -EINVAL. + */ + if (!params->buf_length) { + pr_warn_once("Platform dump completed for dump ID %llu\n", + (u64) (((u64)params->dump_tag_hi << 32) | + params->dump_tag_lo)); + return -EINVAL; + } + + if (size < SZ_1K) { + pr_err_once("Buffer length should be minimum 1024 bytes\n"); + return -EINVAL; + } + + /* + * The hypervisor returns status 0 if no more data available to + * download. Then expects the last RTAS call with NULL buffer + * to invalidate the dump which means dump will be freed in the + * hypervisor. + */ + if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) { + params->buf_length = 0; + fwrc = rtas_ibm_platform_dump(params, 0, 0); + /* + * Returns 0 (success) to the user space so that user + * space read stops. + */ + return fwrc; + } else if (size > params->buf_length) { + /* + * Allocate 4K work area. So if the user requests > 4K, + * resize the buffer length. + */ + size = params->buf_length; + } + + fwrc = rtas_ibm_platform_dump(params, + rtas_work_area_phys(params->work_area), + size); + if (fwrc < 0) + return fwrc; + + total_bytes = (u64) (((u64)params->bytes_ret_hi << 32) | + params->bytes_ret_lo); + + /* + * Kernel or firmware bug, do not continue. + */ + if (WARN(total_bytes > size, "possible write beyond end of work area")) + return -EFAULT; + + if (copy_to_user(buf, rtas_work_area_raw_buf(params->work_area), + total_bytes)) + return -EFAULT; + + return total_bytes; +} + +static int papr_platform_dump_handle_release(struct inode *inode, + struct file *file) +{ + struct ibm_platform_dump_params *params = file->private_data; + s32 fwrc = 0; + + if (params->work_area) + rtas_work_area_free(params->work_area); + + mutex_lock(&platform_dump_list_mutex); + list_del(¶ms->list); + mutex_unlock(&platform_dump_list_mutex); + + kfree(params); + + return fwrc; +} + +static const struct file_operations papr_platform_dump_handle_ops = { + .read = papr_platform_dump_handle_read, + .release = papr_platform_dump_handle_release, +}; + +/** + * papr_platform_dump_create_handle() - Create a fd-based handle for + * reading platform dump + * + * Handler for PAPR_PLATFORM_DUMP_IOC_CREATE_HANDLE ioctl command + * Allocates RTAS parameter struct and work area and attached to the + * file descriptor for reading by user space with the multiple RTAS + * calls until the dump is completed. This memory allocation is freed + * when the file is released. + * + * Multiple dump requests with different IDs are allowed at the same + * time, but not with the same dump ID. So if the user space is + * already opened file descriptor for the specific dump ID, return + * -EALREADY for the next request. + * + * @dump_tag: Dump ID for the dump requested to retrieve from the + * hypervisor + * + * Return: The installed fd number if successful, -ve errno otherwise. + */ +static long papr_platform_dump_create_handle(u64 dump_tag) +{ + struct ibm_platform_dump_params *params; + u64 param_dump_tag; + struct file *file; + long err; + int fd; + + /* + * Return failure if the user space is already opened FD for + * the specific dump ID. This check will prevent multiple dump + * requests for the same dump ID at the same time. Generally + * should not expect this, but in case. + */ + list_for_each_entry(params, &platform_dump_list, list) { + param_dump_tag = (u64) (((u64)params->dump_tag_hi << 32) | + params->bytes_ret_lo); + if (dump_tag == param_dump_tag) { + pr_err("Platform dump for ID(%llu) is already in progress\n", + dump_tag); + return -EALREADY; + } + } + + params = kzalloc(sizeof(struct ibm_platform_dump_params), + GFP_KERNEL_ACCOUNT); + if (!params) + return -ENOMEM; + + params->work_area = rtas_work_area_alloc(SZ_4K); + params->buf_length = SZ_4K; + params->dump_tag_hi = (u32)(dump_tag >> 32); + params->dump_tag_lo = (u32)(dump_tag & 0x00000000ffffffffULL); + params->status = RTAS_IBM_PLATFORM_DUMP_START; + + fd = get_unused_fd_flags(O_RDONLY | O_CLOEXEC); + if (fd < 0) { + err = fd; + goto put_fd; + } + + file = anon_inode_getfile("[papr-platform-dump]", + &papr_platform_dump_handle_ops, + (void *)params, O_RDONLY); + if (IS_ERR(file)) { + err = PTR_ERR(file); + goto put_fd; + } + + file->f_mode |= FMODE_LSEEK | FMODE_PREAD; + fd_install(fd, file); + + list_add(¶ms->list, &platform_dump_list); + + pr_info("%s (%d) initiated platform dump for dump tag %llu\n", + current->comm, current->pid, dump_tag); + return fd; +put_fd: + rtas_work_area_free(params->work_area); + kfree(params); + put_unused_fd(fd); + return err; +} + +/* + * Top-level ioctl handler for /dev/papr-platform-dump. + */ +static long papr_platform_dump_dev_ioctl(struct file *filp, unsigned int ioctl, + unsigned long arg) +{ + u64 __user *argp = (void __user *)arg; + u64 dump_tag; + long ret; + + if (get_user(dump_tag, argp)) + return -EFAULT; + + switch (ioctl) { + case PAPR_PLATFORM_DUMP_IOC_CREATE_HANDLE: + mutex_lock(&platform_dump_list_mutex); + ret = papr_platform_dump_create_handle(dump_tag); + mutex_unlock(&platform_dump_list_mutex); + break; + default: + ret = -ENOIOCTLCMD; + break; + } + return ret; +} + +static const struct file_operations papr_platform_dump_ops = { + .unlocked_ioctl = papr_platform_dump_dev_ioctl, +}; + +static struct miscdevice papr_platform_dump_dev = { + .minor = MISC_DYNAMIC_MINOR, + .name = "papr-platform-dump", + .fops = &papr_platform_dump_ops, +}; + +static __init int papr_platform_dump_init(void) +{ + if (!rtas_function_implemented(RTAS_FN_IBM_PLATFORM_DUMP)) + return -ENODEV; + + return misc_register(&papr_platform_dump_dev); +} +machine_device_initcall(pseries, papr_platform_dump_init);
ibm,platform-dump RTAS call in combination with writable mapping /dev/mem is issued to collect platform dump from the hypervisor and may need multiple calls to get the complete dump. The current implementation uses rtas_platform_dump() API provided by librtas library to issue these HCALLs. But /dev/mem access by the user space is prohibited under system lockdown. The solution should be to restrict access to RTAS function in user space and provide kernel interfaces to collect dump. This patch adds papr-platform-dump character driver and expose standard interfaces such as open / ioctl/ read to user space in ways that are compatible with lockdown. PAPR (7.3.3.4.1 ibm,platform-dump) provides a method to obtain the complete dump: - Each dump will be identified by ID called dump tag. - A sequence of RTAS calls have to be issued until retrieve the complete dump. The hypervisor expects the first RTAS call with the sequence 0 and the subsequent calls with the sequence number returned from the previous calls. - The hypervisor returns dump complete status once the complete dump is retrieved. But expects one more RTAS call from the partition with the NULL buffer to invalidate dump which means the dump will be removed in the hypervisor. - Sequence of calls are allowed with different dump IDs at the same time but not with the same dump ID. Expose these interfaces to user space with a /dev/papr-platform-dump character device using the following programming model: int devfd = open("/dev/papr-platform-dump", O_RDONLY); int fd = ioctl(devfd,PAPR_PLATFORM_DUMP_IOC_CREATE_HANDLE, &dump_id) - Restrict user space to access with the same dump ID. Typically we do not expect user space requests the dump again for the same dump ID. char *buf = malloc(size); length = read(fd, buf, size); - size should be minimum 1K based on PAPR and <= 4K based on RTAS work area size. It will be restrict to RTAS work area size. Using 4K work area based on the current implementation in librtas library - Each read call issue RTAS call to get the data based on the size requirement and returns bytes returned from the hypervisor - If the previous HCALL returns dump complete status, the next read will be the last one which issue dump invalidate call and returns 0. The read API should use the file descriptor obtained from ioctl based on dump ID so that gets dump contents for the corresponding dump ID. Implemented support in librtas (rtas_platform_dump()) for this new ABI to support system lockdown. Signed-off-by: Haren Myneni <haren@linux.ibm.com> --- .../include/uapi/asm/papr-platform-dump.h | 14 + arch/powerpc/platforms/pseries/Makefile | 2 +- .../platforms/pseries/papr-platform-dump.c | 356 ++++++++++++++++++ 3 files changed, 371 insertions(+), 1 deletion(-) create mode 100644 arch/powerpc/include/uapi/asm/papr-platform-dump.h create mode 100644 arch/powerpc/platforms/pseries/papr-platform-dump.c