From patchwork Fri Feb 1 16:27:34 2019
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 1034903
From: Wilco Dijkstra
To: 'GNU C Library'
CC: nd
Subject: [PATCH] Add malloc micro benchmark
Date: Fri, 1 Feb 2019 16:27:34 +0000

Add a malloc micro benchmark to enable accurate testing of the various
paths in malloc and free.  The benchmark performs a varying number of
allocations of a given block size, then frees them again.

It tests 3 different scenarios: single-threaded using the main arena,
multi-threaded using a thread arena, and the main arena with
SINGLE_THREAD_P false.

OK for commit?

ChangeLog:
2019-02-01  Wilco Dijkstra

	* benchtests/Makefile: Add malloc-simple benchmark.
	* benchtests/bench-malloc-simple.c: New benchmark.

Reviewed-by: Carlos O'Donell

diff --git a/benchtests/Makefile b/benchtests/Makefile
index 12036b1935dc7ea84b421f024d6fe3190ae35a6e..09f7cb8e475a312268eebb4d346edde70d22bb3d 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -90,7 +90,7 @@ CFLAGS-bench-trunc.c += -fno-builtin
 CFLAGS-bench-truncf.c += -fno-builtin
 
 ifeq (${BENCHSET},)
-bench-malloc := malloc-thread
+bench-malloc := malloc-thread malloc-simple
 else
 bench-malloc := $(filter malloc-%,${BENCHSET})
 endif
@@ -98,7 +98,7 @@ endif
 $(addprefix $(objpfx)bench-,$(bench-math)): $(libm)
 $(addprefix $(objpfx)bench-,$(math-benchset)): $(libm)
 $(addprefix $(objpfx)bench-,$(bench-pthread)): $(shared-thread-library)
-$(objpfx)bench-malloc-thread: $(shared-thread-library)
+$(addprefix $(objpfx)bench-,$(bench-malloc)): $(shared-thread-library)
 
 
 
@@ -165,7 +165,7 @@ bench-clean:
 ifneq ($(strip ${BENCHSET}),)
 VALIDBENCHSETNAMES := bench-pthread bench-math bench-string string-benchset \
    wcsmbs-benchset stdlib-benchset stdio-common-benchset math-benchset \
-   malloc-thread
+   malloc-thread malloc-simple
 INVALIDBENCHSETNAMES := $(filter-out ${VALIDBENCHSETNAMES},${BENCHSET})
 ifneq (${INVALIDBENCHSETNAMES},)
 $(info The following values in BENCHSET are invalid: ${INVALIDBENCHSETNAMES})
@@ -194,10 +194,18 @@ bench-set: $(binaries-benchset)
 
 bench-malloc: $(binaries-bench-malloc)
 	for run in $^; do \
+	  echo "$${run}"; \
+	  if [ `basename $${run}` = "bench-malloc-thread" ]; then \
 		for thr in 1 8 16 32; do \
 			echo "Running $${run} $${thr}"; \
-		$(run-bench) $${thr} > $${run}-$${thr}.out; \
-	  done;\
+			$(run-bench) $${thr} > $${run}-$${thr}.out; \
+		done;\
+	  else \
+		for thr in 8 16 32 64 128 256 512 1024 2048 4096; do \
+		  echo "Running $${run} $${thr}"; \
+		  $(run-bench) $${thr} > $${run}-$${thr}.out; \
+		done;\
+	  fi;\
 	done
 
 # Build and execute the benchmark functions.  This target generates JSON
diff --git a/benchtests/bench-malloc-simple.c b/benchtests/bench-malloc-simple.c
new file mode 100644
index 0000000000000000000000000000000000000000..995d78965fd65fdf1c84cf85bf38990cd49402b3
--- /dev/null
+++ b/benchtests/bench-malloc-simple.c
@@ -0,0 +1,182 @@
+/* Benchmark malloc and free functions.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <malloc.h>
+#include <pthread.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/resource.h>
+#include "bench-timing.h"
+#include "json-lib.h"
+
+#define NUM_ITERS 1000000
+#define NUM_ALLOCS 4
+#define MAX_ALLOCS 1600
+
+typedef struct
+{
+  size_t iters;
+  size_t size;
+  int n;
+  timing_t elapsed;
+} malloc_args;
+
+static void
+do_benchmark (malloc_args *args, int **arr)
+{
+  timing_t start, stop;
+  size_t iters = args->iters;
+  size_t size = args->size;
+  int n = args->n;
+
+  TIMING_NOW (start);
+
+  for (int j = 0; j < iters; j++)
+    {
+      for (int i = 0; i < n; i++)
+        arr[i] = malloc (size);
+
+      for (int i = 0; i < n; i++)
+        free (arr[i]);
+    }
+
+  TIMING_NOW (stop);
+
+  TIMING_DIFF (args->elapsed, start, stop);
+}
+
+static malloc_args tests[3][NUM_ALLOCS];
+static int allocs[NUM_ALLOCS] = { 25, 100, 400, MAX_ALLOCS };
+
+static void *
+thread_test (void *p)
+{
+  int **arr = (int**)p;
+
+  /* Run benchmark multi-threaded.  */
+  for (int i = 0; i < NUM_ALLOCS; i++)
+    do_benchmark (&tests[2][i], arr);
+
+  return p;
+}
+
+void
+bench (unsigned long size)
+{
+  size_t iters = NUM_ITERS;
+  int **arr = (int**) malloc (MAX_ALLOCS * sizeof (void*));
+  unsigned long res;
+
+  TIMING_INIT (res);
+
+  for (int t = 0; t < 3; t++)
+    for (int i = 0; i < NUM_ALLOCS; i++)
+      {
+        tests[t][i].n = allocs[i];
+        tests[t][i].size = size;
+        tests[t][i].iters = iters / allocs[i];
+
+        /* Do a quick warmup run.  */
+        if (t == 0)
+          do_benchmark (&tests[0][i], arr);
+      }
+
+  /* Run benchmark single threaded in main_arena.  */
+  for (int i = 0; i < NUM_ALLOCS; i++)
+    do_benchmark (&tests[0][i], arr);
+
+  /* Run benchmark in a thread_arena.  */
+  pthread_t t;
+  pthread_create (&t, NULL, thread_test, (void*)arr);
+  pthread_join (t, NULL);
+
+  /* Repeat benchmark in main_arena with SINGLE_THREAD_P == false.  */
+  for (int i = 0; i < NUM_ALLOCS; i++)
+    do_benchmark (&tests[1][i], arr);
+
+  free (arr);
+
+  json_ctx_t json_ctx;
+
+  json_init (&json_ctx, 0, stdout);
+
+  json_document_begin (&json_ctx);
+
+  json_attr_string (&json_ctx, "timing_type", TIMING_TYPE);
+
+  json_attr_object_begin (&json_ctx, "functions");
+
+  json_attr_object_begin (&json_ctx, "malloc");
+
+  char s[100];
+  double iters2 = iters;
+
+  json_attr_object_begin (&json_ctx, "");
+  json_attr_double (&json_ctx, "malloc_block_size", size);
+
+  struct rusage usage;
+  getrusage (RUSAGE_SELF, &usage);
+  json_attr_double (&json_ctx, "max_rss", usage.ru_maxrss);
+
+  for (int i = 0; i < NUM_ALLOCS; i++)
+    {
+      sprintf (s, "main_arena_st_allocs_%04d_time", allocs[i]);
+      json_attr_double (&json_ctx, s, tests[0][i].elapsed / iters2);
+    }
+
+  for (int i = 0; i < NUM_ALLOCS; i++)
+    {
+      sprintf (s, "main_arena_mt_allocs_%04d_time", allocs[i]);
+      json_attr_double (&json_ctx, s, tests[1][i].elapsed / iters2);
+    }
+
+  for (int i = 0; i < NUM_ALLOCS; i++)
+    {
+      sprintf (s, "thread_arena__allocs_%04d_time", allocs[i]);
+      json_attr_double (&json_ctx, s, tests[2][i].elapsed / iters2);
+    }
+
+  json_attr_object_end (&json_ctx);
+
+  json_attr_object_end (&json_ctx);
+
+  json_attr_object_end (&json_ctx);
+
+  json_document_end (&json_ctx);
+}
+
+static void usage (const char *name)
+{
+  fprintf (stderr, "%s: <alloc_size>\n", name);
+  exit (1);
+}
+
+int
+main (int argc, char **argv)
+{
+  long val = 16;
+  if (argc == 2)
+    val = strtol (argv[1], NULL, 0);
+
+  if (argc > 2 || val <= 0)
+    usage (argv[0]);
+
+  bench (val);
+
+  return 0;
+}
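
For readers who want to try the measurement idea outside the glibc tree,
the allocate-then-free loop described in the commit message can be
sketched roughly as below.  This is not part of the patch.  It assumes
POSIX clock_gettime in place of the TIMING_NOW/TIMING_DIFF macros from
bench-timing.h, and the names elapsed_ns, time_alloc_free and
SKETCH_MAX_ALLOCS are hypothetical, chosen only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical limit, mirroring MAX_ALLOCS in the patch.  */
#define SKETCH_MAX_ALLOCS 1600

/* Hypothetical helper: nanoseconds elapsed between two timestamps.  */
static double
elapsed_ns (const struct timespec *start, const struct timespec *stop)
{
  return (double) (stop->tv_sec - start->tv_sec) * 1e9
         + (double) (stop->tv_nsec - start->tv_nsec);
}

/* Allocate N blocks of SIZE bytes, free them all, and repeat ITERS times.
   Returns the average time in nanoseconds per malloc/free pair, or -1.0
   if an allocation fails.  Assumption: CLOCK_MONOTONIC stands in for
   the timing macros used by the real benchmark.  */
static double
time_alloc_free (size_t size, int n, size_t iters)
{
  static void *arr[SKETCH_MAX_ALLOCS];
  struct timespec start, stop;

  assert (n > 0 && n <= SKETCH_MAX_ALLOCS);
  assert (iters > 0);

  clock_gettime (CLOCK_MONOTONIC, &start);

  for (size_t j = 0; j < iters; j++)
    {
      for (int i = 0; i < n; i++)
        {
          arr[i] = malloc (size);
          if (arr[i] == NULL)
            {
              /* Release what was allocated so far and give up.  */
              while (i-- > 0)
                free (arr[i]);
              return -1.0;
            }
        }

      for (int i = 0; i < n; i++)
        free (arr[i]);
    }

  clock_gettime (CLOCK_MONOTONIC, &stop);

  return elapsed_ns (&start, &stop) / ((double) iters * n);
}
```

Calling time_alloc_free (16, 25, 40000) would time roughly one million
malloc/free pairs of 16-byte blocks, in batches of 25, on whichever
arena the calling thread uses.  Running the same call from a thread
created with pthread_create would be one way to exercise the
thread-arena scenario, though the numbers are not comparable with those
produced by the glibc benchtests harness.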