From patchwork Sat Nov 13 14:32:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Malcolm X-Patchwork-Id: 1554715 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=pu4XjEV+; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Received: from sourceware.org (ip-8-43-85-97.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4Hryc30SVJz9sCD for ; Sun, 14 Nov 2021 01:33:05 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 025643858435 for ; Sat, 13 Nov 2021 14:33:03 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 025643858435 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1636813983; bh=UI5V2RnZQoEomtcltAFgoOMhWlJjHQnaI7s3HZ7RTqQ=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=pu4XjEV++KC0BPEJCsWdlIrIxPQ8JNOZnBUbby257HsP0ha9gygExFxHCim4YDcte cUDiXIBs/w0sBj5DmyihcONUpokMy87/VBI265CQEKJ19g6xVfT+IymwvIR54FRaQC donp1VTmrtH8rYNW6dWILr1lOQoJFaQRGPD5Ep/g= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.151.124]) by sourceware.org (Postfix) with ESMTPS id 1416A3858403 for ; Sat, 13 Nov 2021 14:32:11 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1416A3858403 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-2-t6ZkLg7HO2qQaAJA3M-5fQ-1; Sat, 13 Nov 2021 09:32:07 -0500 X-MC-Unique: t6ZkLg7HO2qQaAJA3M-5fQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4A32B1023F4D for ; Sat, 13 Nov 2021 14:32:06 +0000 (UTC) Received: from t14s.localdomain.com (ovpn-113-54.phx2.redhat.com [10.3.113.54]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2939D5D6B1; Sat, 13 Nov 2021 14:32:04 +0000 (UTC) To: gcc-patches@gcc.gnu.org Subject: [committed] analyzer: add four new taint-based warnings Date: Sat, 13 Nov 2021 09:32:02 -0500 Message-Id: <20211113143202.2073099-1-dmalcolm@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: David Malcolm via Gcc-patches From: David Malcolm Reply-To: David Malcolm Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" The initial commit of the analyzer in GCC 10 had a single warning, -Wanalyzer-tainted-array-index and required manually enabling the taint checker with -fanalyzer-checker=taint (due to scaling issues). This patch extends the taint detection to add four new taint-based warnings: -Wanalyzer-tainted-allocation-size for e.g. attacker-controlled malloc/alloca -Wanalyzer-tainted-divisor for detecting where an attacker can inject a divide-by-zero -Wanalyzer-tainted-offset for attacker-controlled pointer offsets -Wanalyzer-tainted-size for e.g. attacker-controlled memset and rewords all the warnings to talk about "attacker-controlled" values rather than "tainted" values. Unfortunately I haven't yet addressed the scaling issues, so all of these still require -fanalyzer-checker=taint (in addition to -fanalyzer). Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as b9365b93212041f14a7f71ba8da5af4d82240dc6. gcc/analyzer/ChangeLog: * analyzer.opt (Wanalyzer-tainted-allocation-size): New. (Wanalyzer-tainted-divisor): New. (Wanalyzer-tainted-offset): New. (Wanalyzer-tainted-size): New. * engine.cc (impl_region_model_context::get_taint_map): New. * exploded-graph.h (impl_region_model_context::get_taint_map): New decl. * program-state.cc (sm_state_map::get_state): Call alt_get_inherited_state. (sm_state_map::impl_set_state): Modify states within compound svalues. (program_state::impl_call_analyzer_dump_state): Undo casts. (selftest::test_program_state_1): Update for new context param of create_region_for_heap_alloc. (selftest::test_program_state_merging): Likewise. * region-model-impl-calls.cc (region_model::impl_call_alloca): Likewise. (region_model::impl_call_calloc): Likewise. (region_model::impl_call_malloc): Likewise. (region_model::impl_call_operator_new): Likewise. (region_model::impl_call_realloc): Likewise. * region-model.cc (region_model::check_region_access): Call check_region_for_taint. (region_model::get_representative_path_var_1): Handle binops. (region_model::create_region_for_heap_alloc): Add "ctxt" param and pass it to set_dynamic_extents. (region_model::create_region_for_alloca): Likewise. (region_model::set_dynamic_extents): Add "ctxt" param and use it to call check_dynamic_size_for_taint. (selftest::test_state_merging): Update for new context param of create_region_for_heap_alloc. (selftest::test_malloc_constraints): Likewise. (selftest::test_malloc): Likewise. (selftest::test_alloca): Likewise for create_region_for_alloca. * region-model.h (region_model::create_region_for_heap_alloc): Add "ctxt" param. (region_model::create_region_for_alloca): Likewise. (region_model::set_dynamic_extents): Likewise. (region_model::check_dynamic_size_for_taint): New decl. (region_model::check_region_for_taint): New decl. (region_model_context::get_taint_map): New vfunc. (noop_region_model_context::get_taint_map): New. * sm-taint.cc: Remove include of "diagnostic-event-id.h"; add includes of "gimple-iterator.h", "tristate.h", "selftest.h", "ordered-hash-map.h", "cgraph.h", "cfg.h", "digraph.h", "analyzer/supergraph.h", "analyzer/call-string.h", "analyzer/program-point.h", "analyzer/store.h", "analyzer/region-model.h", and "analyzer/program-state.h". (enum bounds): Move to top of file. (class taint_diagnostic): New. (class tainted_array_index): Convert to subclass of taint_diagnostic. (tainted_array_index::emit): Add CWE-129. Reword warning to use "attacker-controlled" rather than "tainted". (tainted_array_index::describe_state_change): Move to taint_diagnostic::describe_state_change. (tainted_array_index::describe_final_event): Reword to use "attacker-controlled" rather than "tainted". (class tainted_offset): New. (class tainted_size): New. (class tainted_divisor): New. (class tainted_allocation_size): New. (taint_state_machine::alt_get_inherited_state): New. (taint_state_machine::on_stmt): In assignment handling, remove ARRAY_REF handling in favor of check_region_for_taint. Add detection of tainted divisors. (taint_state_machine::get_taint): New. (taint_state_machine::combine_states): New. (region_model::check_region_for_taint): New. (region_model::check_dynamic_size_for_taint): New. * sm.h (state_machine::alt_get_inherited_state): New. gcc/ChangeLog: * doc/invoke.texi (Static Analyzer Options): Add -Wno-analyzer-tainted-allocation-size, -Wno-analyzer-tainted-divisor, -Wno-analyzer-tainted-offset, and -Wno-analyzer-tainted-size to list. Add -Wanalyzer-tainted-allocation-size, -Wanalyzer-tainted-divisor, -Wanalyzer-tainted-offset, and -Wanalyzer-tainted-size to list of options effectively enabled by -fanalyzer. (-Wanalyzer-tainted-allocation-size): New. (-Wanalyzer-tainted-array-index): Tweak wording; add link to CWE. (-Wanalyzer-tainted-divisor): New. (-Wanalyzer-tainted-offset): New. (-Wanalyzer-tainted-size): New. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/pr93382.c: Tweak expected wording. * gcc.dg/analyzer/taint-alloc-1.c: New test. * gcc.dg/analyzer/taint-alloc-2.c: New test. * gcc.dg/analyzer/taint-divisor-1.c: New test. * gcc.dg/analyzer/taint-1.c: Rename to... * gcc.dg/analyzer/taint-read-index-1.c: ...this. Tweak expected wording. Mark some events as xfail. * gcc.dg/analyzer/taint-read-offset-1.c: New test. * gcc.dg/analyzer/taint-size-1.c: New test. * gcc.dg/analyzer/taint-write-index-1.c: New test. * gcc.dg/analyzer/taint-write-offset-1.c: New test. Signed-off-by: David Malcolm --- gcc/analyzer/analyzer.opt | 16 + gcc/analyzer/engine.cc | 18 + gcc/analyzer/exploded-graph.h | 3 + gcc/analyzer/program-state.cc | 26 +- gcc/analyzer/region-model-impl-calls.cc | 15 +- gcc/analyzer/region-model.cc | 47 +- gcc/analyzer/region-model.h | 27 +- gcc/analyzer/sm-taint.cc | 826 ++++++++++++++++-- gcc/analyzer/sm.h | 9 + gcc/doc/invoke.texi | 66 +- gcc/testsuite/gcc.dg/analyzer/pr93382.c | 2 +- gcc/testsuite/gcc.dg/analyzer/taint-alloc-1.c | 64 ++ gcc/testsuite/gcc.dg/analyzer/taint-alloc-2.c | 27 + .../gcc.dg/analyzer/taint-divisor-1.c | 26 + .../{taint-1.c => taint-read-index-1.c} | 19 +- .../gcc.dg/analyzer/taint-read-offset-1.c | 128 +++ gcc/testsuite/gcc.dg/analyzer/taint-size-1.c | 32 + .../gcc.dg/analyzer/taint-write-index-1.c | 132 +++ .../gcc.dg/analyzer/taint-write-offset-1.c | 132 +++ 19 files changed, 1492 insertions(+), 123 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-1.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-2.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-divisor-1.c rename gcc/testsuite/gcc.dg/analyzer/{taint-1.c => taint-read-index-1.c} (72%) create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-read-offset-1.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-size-1.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-write-index-1.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-write-offset-1.c diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt index 6ddb6e3abb3..c85e30f8b11 100644 --- a/gcc/analyzer/analyzer.opt +++ b/gcc/analyzer/analyzer.opt @@ -114,10 +114,26 @@ Wanalyzer-stale-setjmp-buffer Common Var(warn_analyzer_stale_setjmp_buffer) Init(1) Warning Warn about code paths in which a longjmp rewinds to a jmp_buf saved in a stack frame that has returned. +Wanalyzer-tainted-allocation-size +Common Var(warn_analyzer_tainted_allocation_size) Init(1) Warning +Warn about code paths in which an unsanitized value is used as an allocation size. + Wanalyzer-tainted-array-index Common Var(warn_analyzer_tainted_array_index) Init(1) Warning Warn about code paths in which an unsanitized value is used as an array index. +Wanalyzer-tainted-divisor +Common Var(warn_analyzer_tainted_divisor) Init(1) Warning +Warn about code paths in which an unsanitized value is used as a divisor. + +Wanalyzer-tainted-offset +Common Var(warn_analyzer_tainted_offset) Init(1) Warning +Warn about code paths in which an unsanitized value is used as a pointer offset. + +Wanalyzer-tainted-size +Common Var(warn_analyzer_tainted_size) Init(1) Warning +Warn about code paths in which an unsanitized value is used as a size. + Wanalyzer-use-after-free Common Var(warn_analyzer_use_after_free) Init(1) Warning Warn about code paths in which a freed value is used. diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc index b29a21cce30..096e219392d 100644 --- a/gcc/analyzer/engine.cc +++ b/gcc/analyzer/engine.cc @@ -220,6 +220,24 @@ impl_region_model_context::get_malloc_map (sm_state_map **out_smap, return true; } +bool +impl_region_model_context::get_taint_map (sm_state_map **out_smap, + const state_machine **out_sm, + unsigned *out_sm_idx) +{ + if (!m_new_state) + return false; + + unsigned taint_sm_idx; + if (!m_ext_state.get_sm_idx_by_name ("taint", &taint_sm_idx)) + return false; + + *out_smap = m_new_state->m_checker_states[taint_sm_idx]; + *out_sm = &m_ext_state.get_sm (taint_sm_idx); + *out_sm_idx = taint_sm_idx; + return true; +} + /* struct setjmp_record. */ int diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-graph.h index b9c17672aec..9b18b487999 100644 --- a/gcc/analyzer/exploded-graph.h +++ b/gcc/analyzer/exploded-graph.h @@ -86,6 +86,9 @@ class impl_region_model_context : public region_model_context bool get_malloc_map (sm_state_map **out_smap, const state_machine **out_sm, unsigned *out_sm_idx) FINAL OVERRIDE; + bool get_taint_map (sm_state_map **out_smap, + const state_machine **out_sm, + unsigned *out_sm_idx) FINAL OVERRIDE; exploded_graph *m_eg; log_user m_logger; diff --git a/gcc/analyzer/program-state.cc b/gcc/analyzer/program-state.cc index 8230140cec6..1c87af0f00b 100644 --- a/gcc/analyzer/program-state.cc +++ b/gcc/analyzer/program-state.cc @@ -420,6 +420,10 @@ sm_state_map::get_state (const svalue *sval, } } + if (state_machine::state_t state + = m_sm.alt_get_inherited_state (*this, sval, ext_state)) + return state; + return m_sm.get_default_state (sval); } @@ -495,6 +499,18 @@ sm_state_map::impl_set_state (const svalue *sval, gcc_assert (sval->can_have_associated_state_p ()); + if (m_sm.inherited_state_p ()) + { + if (const compound_svalue *compound_sval + = sval->dyn_cast_compound_svalue ()) + for (auto iter : *compound_sval) + { + const svalue *inner_sval = iter.second; + if (inner_sval->can_have_associated_state_p ()) + impl_set_state (inner_sval, state, origin, ext_state); + } + } + /* Special-case state 0 as the default value. */ if (state == 0) { @@ -1384,6 +1400,10 @@ program_state::impl_call_analyzer_dump_state (const gcall *call, const svalue *sval = cd.get_arg_svalue (1); + /* Strip off cast to int (due to variadic args). */ + if (const svalue *cast = sval->maybe_undo_cast ()) + sval = cast; + state_machine::state_t state = smap->get_state (sval, ext_state); warning_at (call->location, 0, "state: %qs", state->get_name ()); } @@ -1543,7 +1563,8 @@ test_program_state_1 () region_model *model = s.m_region_model; const svalue *size_in_bytes = mgr->get_or_create_unknown_svalue (size_type_node); - const region *new_reg = model->create_region_for_heap_alloc (size_in_bytes); + const region *new_reg + = model->create_region_for_heap_alloc (size_in_bytes, NULL); const svalue *ptr_sval = mgr->get_ptr_svalue (ptr_type_node, new_reg); model->set_value (model->get_lvalue (p, NULL), ptr_sval, NULL); @@ -1599,7 +1620,8 @@ test_program_state_merging () region_model *model0 = s0.m_region_model; const svalue *size_in_bytes = mgr->get_or_create_unknown_svalue (size_type_node); - const region *new_reg = model0->create_region_for_heap_alloc (size_in_bytes); + const region *new_reg + = model0->create_region_for_heap_alloc (size_in_bytes, NULL); const svalue *ptr_sval = mgr->get_ptr_svalue (ptr_type_node, new_reg); model0->set_value (model0->get_lvalue (p, &ctxt), ptr_sval, &ctxt); diff --git a/gcc/analyzer/region-model-impl-calls.cc b/gcc/analyzer/region-model-impl-calls.cc index ff2ae9ca77d..90d4cf9c2db 100644 --- a/gcc/analyzer/region-model-impl-calls.cc +++ b/gcc/analyzer/region-model-impl-calls.cc @@ -221,7 +221,7 @@ void region_model::impl_call_alloca (const call_details &cd) { const svalue *size_sval = cd.get_arg_svalue (0); - const region *new_reg = create_region_for_alloca (size_sval); + const region *new_reg = create_region_for_alloca (size_sval, cd.get_ctxt ()); const svalue *ptr_sval = m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg); cd.maybe_set_lhs (ptr_sval); @@ -302,7 +302,8 @@ region_model::impl_call_calloc (const call_details &cd) const svalue *prod_sval = m_mgr->get_or_create_binop (size_type_node, MULT_EXPR, nmemb_sval, size_sval); - const region *new_reg = create_region_for_heap_alloc (prod_sval); + const region *new_reg + = create_region_for_heap_alloc (prod_sval, cd.get_ctxt ()); zero_fill_region (new_reg); if (cd.get_lhs_type ()) { @@ -408,7 +409,8 @@ void region_model::impl_call_malloc (const call_details &cd) { const svalue *size_sval = cd.get_arg_svalue (0); - const region *new_reg = create_region_for_heap_alloc (size_sval); + const region *new_reg + = create_region_for_heap_alloc (size_sval, cd.get_ctxt ()); if (cd.get_lhs_type ()) { const svalue *ptr_sval @@ -471,7 +473,8 @@ void region_model::impl_call_operator_new (const call_details &cd) { const svalue *size_sval = cd.get_arg_svalue (0); - const region *new_reg = create_region_for_heap_alloc (size_sval); + const region *new_reg + = create_region_for_heap_alloc (size_sval, cd.get_ctxt ()); if (cd.get_lhs_type ()) { const svalue *ptr_sval @@ -579,7 +582,7 @@ region_model::impl_call_realloc (const call_details &cd) const svalue *size_sval = cd.get_arg_svalue (1); if (const region *buffer_reg = ptr_sval->maybe_get_region ()) if (compat_types_p (size_sval->get_type (), size_type_node)) - model->set_dynamic_extents (buffer_reg, size_sval); + model->set_dynamic_extents (buffer_reg, size_sval, ctxt); if (cd.get_lhs_region ()) { model->set_value (cd.get_lhs_region (), ptr_sval, cd.get_ctxt ()); @@ -619,7 +622,7 @@ region_model::impl_call_realloc (const call_details &cd) /* Create the new region. */ const region *new_reg - = model->create_region_for_heap_alloc (new_size_sval); + = model->create_region_for_heap_alloc (new_size_sval, ctxt); const svalue *new_ptr_sval = model->m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg); if (cd.get_lhs_type ()) diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc index a14d107709c..416a5ac7249 100644 --- a/gcc/analyzer/region-model.cc +++ b/gcc/analyzer/region-model.cc @@ -2301,6 +2301,8 @@ region_model::check_region_access (const region *reg, if (!ctxt) return; + check_region_for_taint (reg, dir, ctxt); + switch (dir) { default: @@ -2862,6 +2864,17 @@ region_model::get_representative_path_var_1 (const svalue *sval, parent_pv.m_stack_depth); } + /* Handle binops. */ + if (const binop_svalue *binop_sval = sval->dyn_cast_binop_svalue ()) + if (path_var lhs_pv + = get_representative_path_var (binop_sval->get_arg0 (), visited)) + if (path_var rhs_pv + = get_representative_path_var (binop_sval->get_arg1 (), visited)) + return path_var (build2 (binop_sval->get_op (), + sval->get_type (), + lhs_pv.m_tree, rhs_pv.m_tree), + lhs_pv.m_stack_depth); + if (pvs.length () < 1) return path_var (NULL_TREE, 0); @@ -3720,36 +3733,45 @@ region_model::append_ssa_names_cb (const region *base_reg, } } -/* Return a new region describing a heap-allocated block of memory. */ +/* Return a new region describing a heap-allocated block of memory. + Use CTXT to complain about tainted sizes. */ const region * -region_model::create_region_for_heap_alloc (const svalue *size_in_bytes) +region_model::create_region_for_heap_alloc (const svalue *size_in_bytes, + region_model_context *ctxt) { const region *reg = m_mgr->create_region_for_heap_alloc (); if (compat_types_p (size_in_bytes->get_type (), size_type_node)) - set_dynamic_extents (reg, size_in_bytes); + set_dynamic_extents (reg, size_in_bytes, ctxt); return reg; } /* Return a new region describing a block of memory allocated within the - current frame. */ + current frame. + Use CTXT to complain about tainted sizes. */ const region * -region_model::create_region_for_alloca (const svalue *size_in_bytes) +region_model::create_region_for_alloca (const svalue *size_in_bytes, + region_model_context *ctxt) { const region *reg = m_mgr->create_region_for_alloca (m_current_frame); if (compat_types_p (size_in_bytes->get_type (), size_type_node)) - set_dynamic_extents (reg, size_in_bytes); + set_dynamic_extents (reg, size_in_bytes, ctxt); return reg; } -/* Record that the size of REG is SIZE_IN_BYTES. */ +/* Record that the size of REG is SIZE_IN_BYTES. + Use CTXT to complain about tainted sizes. */ void region_model::set_dynamic_extents (const region *reg, - const svalue *size_in_bytes) + const svalue *size_in_bytes, + region_model_context *ctxt) { assert_compat_types (size_in_bytes->get_type (), size_type_node); + if (ctxt) + check_dynamic_size_for_taint (reg->get_memory_space (), size_in_bytes, + ctxt); m_dynamic_extents.put (reg, size_in_bytes); } @@ -5096,7 +5118,8 @@ test_state_merging () region_model model0 (&mgr); tree size = build_int_cst (size_type_node, 1024); const svalue *size_sval = mgr.get_or_create_constant_svalue (size); - const region *new_reg = model0.create_region_for_heap_alloc (size_sval); + const region *new_reg + = model0.create_region_for_heap_alloc (size_sval, &ctxt); const svalue *ptr_sval = mgr.get_ptr_svalue (ptr_type_node, new_reg); model0.set_value (model0.get_lvalue (p, &ctxt), ptr_sval, &ctxt); @@ -5484,7 +5507,7 @@ test_malloc_constraints () const svalue *size_in_bytes = mgr.get_or_create_unknown_svalue (size_type_node); - const region *reg = model.create_region_for_heap_alloc (size_in_bytes); + const region *reg = model.create_region_for_heap_alloc (size_in_bytes, NULL); const svalue *sval = mgr.get_ptr_svalue (ptr_type_node, reg); model.set_value (model.get_lvalue (p, NULL), sval, NULL); model.set_value (q, p, NULL); @@ -5705,7 +5728,7 @@ test_malloc () /* "p = malloc (n * 4);". */ const svalue *size_sval = model.get_rvalue (n_times_4, &ctxt); - const region *reg = model.create_region_for_heap_alloc (size_sval); + const region *reg = model.create_region_for_heap_alloc (size_sval, &ctxt); const svalue *ptr = mgr.get_ptr_svalue (int_star, reg); model.set_value (model.get_lvalue (p, &ctxt), ptr, &ctxt); ASSERT_EQ (model.get_capacity (reg), size_sval); @@ -5739,7 +5762,7 @@ test_alloca () NULL, &ctxt); /* "p = alloca (n * 4);". */ const svalue *size_sval = model.get_rvalue (n_times_4, &ctxt); - const region *reg = model.create_region_for_alloca (size_sval); + const region *reg = model.create_region_for_alloca (size_sval, &ctxt); ASSERT_EQ (reg->get_parent_region (), frame_reg); const svalue *ptr = mgr.get_ptr_svalue (int_star, reg); model.set_value (model.get_lvalue (p, &ctxt), ptr, &ctxt); diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h index 5fabf7881e2..13e8109aa51 100644 --- a/gcc/analyzer/region-model.h +++ b/gcc/analyzer/region-model.h @@ -676,8 +676,10 @@ class region_model region_model_context *ctxt, rejected_constraint **out); - const region *create_region_for_heap_alloc (const svalue *size_in_bytes); - const region *create_region_for_alloca (const svalue *size_in_bytes); + const region *create_region_for_heap_alloc (const svalue *size_in_bytes, + region_model_context *ctxt); + const region *create_region_for_alloca (const svalue *size_in_bytes, + region_model_context *ctxt); tree get_representative_tree (const svalue *sval) const; path_var @@ -703,7 +705,8 @@ class region_model } const svalue *get_dynamic_extents (const region *reg) const; void set_dynamic_extents (const region *reg, - const svalue *size_in_bytes); + const svalue *size_in_bytes, + region_model_context *ctxt); void unset_dynamic_extents (const region *reg); region_model_manager *get_manager () const { return m_mgr; } @@ -792,6 +795,14 @@ class region_model tree expr, region_model_context *ctxt) const; + void check_dynamic_size_for_taint (enum memory_space mem_space, + const svalue *size_in_bytes, + region_model_context *ctxt) const; + + void check_region_for_taint (const region *reg, + enum access_direction dir, + region_model_context *ctxt) const; + void check_for_writable_region (const region* dest_reg, region_model_context *ctxt) const; void check_region_access (const region *reg, @@ -891,6 +902,10 @@ class region_model_context virtual bool get_malloc_map (sm_state_map **out_smap, const state_machine **out_sm, unsigned *out_sm_idx) = 0; + /* Likewise for the "taint" state machine. */ + virtual bool get_taint_map (sm_state_map **out_smap, + const state_machine **out_sm, + unsigned *out_sm_idx) = 0; }; /* A "do nothing" subclass of region_model_context. */ @@ -935,6 +950,12 @@ public: { return false; } + bool get_taint_map (sm_state_map **, + const state_machine **, + unsigned *) OVERRIDE + { + return false; + } }; /* A subclass of region_model_context for determining if operations fail diff --git a/gcc/analyzer/sm-taint.cc b/gcc/analyzer/sm-taint.cc index 721d3eabb8f..0a51a1fe2ea 100644 --- a/gcc/analyzer/sm-taint.cc +++ b/gcc/analyzer/sm-taint.cc @@ -33,9 +33,21 @@ along with GCC; see the file COPYING3. If not see #include "function.h" #include "json.h" #include "analyzer/analyzer.h" -#include "diagnostic-event-id.h" #include "analyzer/analyzer-logging.h" +#include "gimple-iterator.h" +#include "tristate.h" +#include "selftest.h" +#include "ordered-hash-map.h" +#include "cgraph.h" +#include "cfg.h" +#include "digraph.h" +#include "analyzer/supergraph.h" +#include "analyzer/call-string.h" +#include "analyzer/program-point.h" +#include "analyzer/store.h" +#include "analyzer/region-model.h" #include "analyzer/sm.h" +#include "analyzer/program-state.h" #include "analyzer/pending-diagnostic.h" #if ENABLE_ANALYZER @@ -44,6 +56,20 @@ namespace ana { namespace { +/* An enum for describing tainted values. */ + +enum bounds +{ + /* This tainted value has no upper or lower bound. */ + BOUNDS_NONE, + + /* This tainted value has an upper bound but not lower bound. */ + BOUNDS_UPPER, + + /* This tainted value has a lower bound but no upper bound. */ + BOUNDS_LOWER +}; + /* An experimental state machine, for tracking "taint": unsanitized uses of data potentially under an attacker's control. */ @@ -54,6 +80,11 @@ public: bool inherited_state_p () const FINAL OVERRIDE { return true; } + state_t alt_get_inherited_state (const sm_state_map &map, + const svalue *sval, + const extrinsic_state &ext_state) + const FINAL OVERRIDE; + bool on_stmt (sm_context *sm_ctxt, const supernode *node, const gimple *stmt) const FINAL OVERRIDE; @@ -67,6 +98,10 @@ public: bool can_purge_p (state_t s) const FINAL OVERRIDE; + bool get_taint (state_t s, tree type, enum bounds *out) const; + + state_t combine_states (state_t s0, state_t s1) const; + /* State for a "tainted" value: unsanitized data potentially under an attacker's control. */ state_t m_tainted; @@ -81,31 +116,65 @@ public: state_t m_stop; }; -enum bounds +/* Class for diagnostics relating to taint_state_machine. */ + +class taint_diagnostic : public pending_diagnostic { - BOUNDS_NONE, - BOUNDS_UPPER, - BOUNDS_LOWER +public: + taint_diagnostic (const taint_state_machine &sm, tree arg, + enum bounds has_bounds) + : m_sm (sm), m_arg (arg), m_has_bounds (has_bounds) + {} + + bool subclass_equal_p (const pending_diagnostic &base_other) const OVERRIDE + { + return same_tree_p (m_arg, ((const taint_diagnostic &)base_other).m_arg); + } + + label_text describe_state_change (const evdesc::state_change &change) + FINAL OVERRIDE + { + if (change.m_new_state == m_sm.m_tainted) + { + if (change.m_origin) + return change.formatted_print ("%qE has an unchecked value here" + " (from %qE)", + change.m_expr, change.m_origin); + else + return change.formatted_print ("%qE gets an unchecked value here", + change.m_expr); + } + else if (change.m_new_state == m_sm.m_has_lb) + return change.formatted_print ("%qE has its lower bound checked here", + change.m_expr); + else if (change.m_new_state == m_sm.m_has_ub) + return change.formatted_print ("%qE has its upper bound checked here", + change.m_expr); + return label_text (); + } +protected: + const taint_state_machine &m_sm; + tree m_arg; + enum bounds m_has_bounds; }; -class tainted_array_index - : public pending_diagnostic_subclass +/* Concrete taint_diagnostic subclass for reporting attacker-controlled + array index. */ + +class tainted_array_index : public taint_diagnostic { public: tainted_array_index (const taint_state_machine &sm, tree arg, enum bounds has_bounds) - : m_sm (sm), m_arg (arg), m_has_bounds (has_bounds) {} + : taint_diagnostic (sm, arg, has_bounds) + {} const char *get_kind () const FINAL OVERRIDE { return "tainted_array_index"; } - bool operator== (const tainted_array_index &other) const - { - return same_tree_p (m_arg, other.m_arg); - } - bool emit (rich_location *rich_loc) FINAL OVERRIDE { diagnostic_metadata m; + /* CWE-129: "Improper Validation of Array Index". */ m.add_cwe (129); switch (m_has_bounds) { @@ -113,45 +182,197 @@ public: gcc_unreachable (); case BOUNDS_NONE: return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_array_index, - "use of tainted value %qE in array lookup" - " without bounds checking", + "use of attacker-controlled value %qE" + " in array lookup without bounds checking", m_arg); break; case BOUNDS_UPPER: return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_array_index, - "use of tainted value %qE in array lookup" - " without lower-bounds checking", + "use of attacker-controlled value %qE" + " in array lookup without checking for negative", m_arg); break; case BOUNDS_LOWER: return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_array_index, - "use of tainted value %qE in array lookup" - " without upper-bounds checking", + "use of attacker-controlled value %qE" + " in array lookup without upper-bounds checking", m_arg); break; } } - label_text describe_state_change (const evdesc::state_change &change) - FINAL OVERRIDE + label_text describe_final_event (const evdesc::final_event &ev) FINAL OVERRIDE { - if (change.m_new_state == m_sm.m_tainted) + switch (m_has_bounds) { - if (change.m_origin) - return change.formatted_print ("%qE has an unchecked value here" - " (from %qE)", - change.m_expr, change.m_origin); - else - return change.formatted_print ("%qE gets an unchecked value here", - change.m_expr); + default: + gcc_unreachable (); + case BOUNDS_NONE: + return ev.formatted_print + ("use of attacker-controlled value %qE in array lookup" + " without bounds checking", + m_arg); + case BOUNDS_UPPER: + return ev.formatted_print + ("use of attacker-controlled value %qE" + " in array lookup without checking for negative", + m_arg); + case BOUNDS_LOWER: + return ev.formatted_print + ("use of attacker-controlled value %qE" + " in array lookup without upper-bounds checking", + m_arg); + } + } +}; + +/* Concrete taint_diagnostic subclass for reporting attacker-controlled + pointer offset. */ + +class tainted_offset : public taint_diagnostic +{ +public: + tainted_offset (const taint_state_machine &sm, tree arg, + enum bounds has_bounds) + : taint_diagnostic (sm, arg, has_bounds) + {} + + const char *get_kind () const FINAL OVERRIDE { return "tainted_offset"; } + + bool emit (rich_location *rich_loc) FINAL OVERRIDE + { + diagnostic_metadata m; + /* CWE-823: "Use of Out-of-range Pointer Offset". */ + m.add_cwe (823); + if (m_arg) + switch (m_has_bounds) + { + default: + gcc_unreachable (); + case BOUNDS_NONE: + return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_offset, + "use of attacker-controlled value %qE as offset" + " without bounds checking", + m_arg); + break; + case BOUNDS_UPPER: + return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_offset, + "use of attacker-controlled value %qE as offset" + " without lower-bounds checking", + m_arg); + break; + case BOUNDS_LOWER: + return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_offset, + "use of attacker-controlled value %qE as offset" + " without upper-bounds checking", + m_arg); + break; + } + else + switch (m_has_bounds) + { + default: + gcc_unreachable (); + case BOUNDS_NONE: + return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_offset, + "use of attacker-controlled value as offset" + " without bounds checking"); + break; + case BOUNDS_UPPER: + return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_offset, + "use of attacker-controlled value as offset" + " without lower-bounds checking"); + break; + case BOUNDS_LOWER: + return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_offset, + "use of attacker-controlled value as offset" + " without upper-bounds checking"); + break; + } + } + + label_text describe_final_event (const evdesc::final_event &ev) FINAL OVERRIDE + { + if (m_arg) + switch (m_has_bounds) + { + default: + gcc_unreachable (); + case BOUNDS_NONE: + return ev.formatted_print ("use of attacker-controlled value %qE" + " as offset without bounds checking", + m_arg); + case BOUNDS_UPPER: + return ev.formatted_print ("use of attacker-controlled value %qE" + " as offset without lower-bounds checking", + m_arg); + case BOUNDS_LOWER: + return ev.formatted_print ("use of attacker-controlled value %qE" + " as offset without upper-bounds checking", + m_arg); + } + else + switch (m_has_bounds) + { + default: + gcc_unreachable (); + case BOUNDS_NONE: + return ev.formatted_print ("use of attacker-controlled value" + " as offset without bounds checking"); + case BOUNDS_UPPER: + return ev.formatted_print ("use of attacker-controlled value" + " as offset without lower-bounds" + " checking"); + case BOUNDS_LOWER: + return ev.formatted_print ("use of attacker-controlled value" + " as offset without upper-bounds" + " checking"); + } + } +}; + +/* Concrete taint_diagnostic subclass for reporting attacker-controlled + size. */ + +class tainted_size : public taint_diagnostic +{ +public: + tainted_size (const taint_state_machine &sm, tree arg, + enum bounds has_bounds, + enum access_direction dir) + : taint_diagnostic (sm, arg, has_bounds), + m_dir (dir) + {} + + const char *get_kind () const FINAL OVERRIDE { return "tainted_size"; } + + bool emit (rich_location *rich_loc) FINAL OVERRIDE + { + diagnostic_metadata m; + m.add_cwe (129); + switch (m_has_bounds) + { + default: + gcc_unreachable (); + case BOUNDS_NONE: + return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_size, + "use of attacker-controlled value %qE as size" + " without bounds checking", + m_arg); + break; + case BOUNDS_UPPER: + return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_size, + "use of attacker-controlled value %qE as size" + " without lower-bounds checking", + m_arg); + break; + case BOUNDS_LOWER: + return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_size, + "use of attacker-controlled value %qE as size" + " without upper-bounds checking", + m_arg); + break; } - else if (change.m_new_state == m_sm.m_has_lb) - return change.formatted_print ("%qE has its lower bound checked here", - change.m_expr); - else if (change.m_new_state == m_sm.m_has_ub) - return change.formatted_print ("%qE has its upper bound checked here", - change.m_expr); - return label_text (); } label_text describe_final_event (const evdesc::final_event &ev) FINAL OVERRIDE @@ -161,24 +382,194 @@ public: default: gcc_unreachable (); case BOUNDS_NONE: - return ev.formatted_print ("use of tainted value %qE in array lookup" - " without bounds checking", + return ev.formatted_print ("use of attacker-controlled value %qE" + " as size without bounds checking", m_arg); case BOUNDS_UPPER: - return ev.formatted_print ("use of tainted value %qE in array lookup" - " without lower-bounds checking", + return ev.formatted_print ("use of attacker-controlled value %qE" + " as size without lower-bounds checking", m_arg); case BOUNDS_LOWER: - return ev.formatted_print ("use of tainted value %qE in array lookup" - " without upper-bounds checking", + return ev.formatted_print ("use of attacker-controlled value %qE" + " as size without upper-bounds checking", m_arg); } } private: - const taint_state_machine &m_sm; - tree m_arg; - enum bounds m_has_bounds; + enum access_direction m_dir; +}; + +/* Concrete taint_diagnostic subclass for reporting attacker-controlled + divisor (so that an attacker can trigger a divide by zero). */ + +class tainted_divisor : public taint_diagnostic +{ +public: + tainted_divisor (const taint_state_machine &sm, tree arg, + enum bounds has_bounds) + : taint_diagnostic (sm, arg, has_bounds) + {} + + const char *get_kind () const FINAL OVERRIDE { return "tainted_divisor"; } + + bool emit (rich_location *rich_loc) FINAL OVERRIDE + { + diagnostic_metadata m; + /* CWE-369: "Divide By Zero". */ + m.add_cwe (369); + if (m_arg) + return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_divisor, + "use of attacker-controlled value %qE as divisor" + " without checking for zero", + m_arg); + else + return warning_meta (rich_loc, m, OPT_Wanalyzer_tainted_divisor, + "use of attacker-controlled value as divisor" + " without checking for zero"); + } + + label_text describe_final_event (const evdesc::final_event &ev) FINAL OVERRIDE + { + if (m_arg) + return ev.formatted_print + ("use of attacker-controlled value %qE as divisor" + " without checking for zero", + m_arg); + else + return ev.formatted_print + ("use of attacker-controlled value as divisor" + " without checking for zero"); + } +}; + +/* Concrete taint_diagnostic subclass for reporting attacker-controlled + size of a dynamic allocation. */ + +class tainted_allocation_size : public taint_diagnostic +{ +public: + tainted_allocation_size (const taint_state_machine &sm, tree arg, + enum bounds has_bounds, enum memory_space mem_space) + : taint_diagnostic (sm, arg, has_bounds), + m_mem_space (mem_space) + { + gcc_assert (mem_space == MEMSPACE_STACK || mem_space == MEMSPACE_HEAP); + } + + const char *get_kind () const FINAL OVERRIDE + { + return "tainted_allocation_size"; + } + + bool emit (rich_location *rich_loc) FINAL OVERRIDE + { + diagnostic_metadata m; + /* "CWE-789: Memory Allocation with Excessive Size Value". */ + m.add_cwe (789); + gcc_assert (m_mem_space == MEMSPACE_STACK || m_mem_space == MEMSPACE_HEAP); + // TODO: make use of m_mem_space + if (m_arg) + switch (m_has_bounds) + { + default: + gcc_unreachable (); + case BOUNDS_NONE: + return warning_meta (rich_loc, m, + OPT_Wanalyzer_tainted_allocation_size, + "use of attacker-controlled value %qE as" + " allocation size without bounds checking", + m_arg); + break; + case BOUNDS_UPPER: + return warning_meta (rich_loc, m, + OPT_Wanalyzer_tainted_allocation_size, + "use of attacker-controlled value %qE as" + " allocation size without lower-bounds checking", + m_arg); + break; + case BOUNDS_LOWER: + return warning_meta (rich_loc, m, + OPT_Wanalyzer_tainted_allocation_size, + "use of attacker-controlled value %qE as" + " allocation size without upper-bounds checking", + m_arg); + break; + } + else + switch (m_has_bounds) + { + default: + gcc_unreachable (); + case BOUNDS_NONE: + return warning_meta (rich_loc, m, + OPT_Wanalyzer_tainted_allocation_size, + "use of attacker-controlled value as" + " allocation size without bounds" + " checking"); + break; + case BOUNDS_UPPER: + return warning_meta (rich_loc, m, + OPT_Wanalyzer_tainted_allocation_size, + "use of attacker-controlled value as" + " allocation size without lower-bounds" + " checking"); + break; + case BOUNDS_LOWER: + return warning_meta (rich_loc, m, + OPT_Wanalyzer_tainted_allocation_size, + "use of attacker-controlled value as" + " allocation size without upper-bounds" + " checking"); + break; + } + } + + label_text describe_final_event (const evdesc::final_event &ev) FINAL OVERRIDE + { + if (m_arg) + switch (m_has_bounds) + { + default: + gcc_unreachable (); + case BOUNDS_NONE: + return ev.formatted_print + ("use of attacker-controlled value %qE as allocation size" + " without bounds checking", + m_arg); + case BOUNDS_UPPER: + return ev.formatted_print + ("use of attacker-controlled value %qE as allocation size" + " without lower-bounds checking", + m_arg); + case BOUNDS_LOWER: + return ev.formatted_print + ("use of attacker-controlled value %qE as allocation size" + " without upper-bounds checking", + m_arg); + } + else + switch (m_has_bounds) + { + default: + gcc_unreachable (); + case BOUNDS_NONE: + return ev.formatted_print + ("use of attacker-controlled value as allocation size" + " without bounds checking"); + case BOUNDS_UPPER: + return ev.formatted_print + ("use of attacker-controlled value as allocation size" + " without lower-bounds checking"); + case BOUNDS_LOWER: + return ev.formatted_print + ("use of attacker-controlled value as allocation size" + " without upper-bounds checking"); + } + } + +private: + enum memory_space m_mem_space; }; /* taint_state_machine's ctor. */ @@ -192,6 +583,79 @@ taint_state_machine::taint_state_machine (logger *logger) m_stop = add_state ("stop"); } +state_machine::state_t +taint_state_machine::alt_get_inherited_state (const sm_state_map &map, + const svalue *sval, + const extrinsic_state &ext_state) + const +{ + switch (sval->get_kind ()) + { + default: + break; + case SK_UNARYOP: + { + const unaryop_svalue *unaryop_sval + = as_a (sval); + enum tree_code op = unaryop_sval->get_op (); + const svalue *arg = unaryop_sval->get_arg (); + switch (op) + { + case NOP_EXPR: + { + state_t arg_state = map.get_state (arg, ext_state); + return arg_state; + } + default: + gcc_unreachable (); + break; + } + } + break; + case SK_BINOP: + { + const binop_svalue *binop_sval = as_a (sval); + enum tree_code op = binop_sval->get_op (); + const svalue *arg0 = binop_sval->get_arg0 (); + const svalue *arg1 = binop_sval->get_arg1 (); + switch (op) + { + default: + break; + case PLUS_EXPR: + case MINUS_EXPR: + case MULT_EXPR: + case POINTER_PLUS_EXPR: + case TRUNC_DIV_EXPR: + case TRUNC_MOD_EXPR: + { + state_t arg0_state = map.get_state (arg0, ext_state); + state_t arg1_state = map.get_state (arg1, ext_state); + return combine_states (arg0_state, arg1_state); + } + break; + + case EQ_EXPR: + case GE_EXPR: + case LE_EXPR: + case NE_EXPR: + case GT_EXPR: + case LT_EXPR: + case UNORDERED_EXPR: + case ORDERED_EXPR: + /* Comparisons are just booleans. */ + return m_start; + + case BIT_AND_EXPR: + case RSHIFT_EXPR: + return NULL; + } + } + break; + } + return NULL; +} + /* Implementation of state_machine::on_stmt vfunc for taint_state_machine. */ bool @@ -220,53 +684,35 @@ taint_state_machine::on_stmt (sm_context *sm_ctxt, if (const gassign *assign = dyn_cast (stmt)) { - tree rhs1 = gimple_assign_rhs1 (assign); enum tree_code op = gimple_assign_rhs_code (assign); - /* Check array accesses. */ - if (op == ARRAY_REF) + switch (op) { - tree arg = TREE_OPERAND (rhs1, 1); - - /* Unsigned types have an implicit lower bound. */ - bool is_unsigned = false; - if (INTEGRAL_TYPE_P (TREE_TYPE (arg))) - is_unsigned = TYPE_UNSIGNED (TREE_TYPE (arg)); - - state_t state = sm_ctxt->get_state (stmt, arg); - /* Can't use a switch as the states are non-const. */ - if (state == m_tainted) - { - /* Complain about missing bounds. */ - tree diag_arg = sm_ctxt->get_diagnostic_tree (arg); - pending_diagnostic *d - = new tainted_array_index (*this, diag_arg, - is_unsigned - ? BOUNDS_LOWER : BOUNDS_NONE); - sm_ctxt->warn (node, stmt, arg, d); - sm_ctxt->set_next_state (stmt, arg, m_stop); - } - else if (state == m_has_lb) - { - /* Complain about missing upper bound. */ - tree diag_arg = sm_ctxt->get_diagnostic_tree (arg); - sm_ctxt->warn (node, stmt, arg, - new tainted_array_index (*this, diag_arg, - BOUNDS_LOWER)); - sm_ctxt->set_next_state (stmt, arg, m_stop); - } - else if (state == m_has_ub) - { - /* Complain about missing lower bound. */ - if (!is_unsigned) - { - tree diag_arg = sm_ctxt->get_diagnostic_tree (arg); - sm_ctxt->warn (node, stmt, arg, - new tainted_array_index (*this, diag_arg, - BOUNDS_UPPER)); - sm_ctxt->set_next_state (stmt, arg, m_stop); - } - } + default: + break; + case TRUNC_DIV_EXPR: + case CEIL_DIV_EXPR: + case FLOOR_DIV_EXPR: + case ROUND_DIV_EXPR: + case TRUNC_MOD_EXPR: + case CEIL_MOD_EXPR: + case FLOOR_MOD_EXPR: + case ROUND_MOD_EXPR: + case RDIV_EXPR: + case EXACT_DIV_EXPR: + { + tree divisor = gimple_assign_rhs2 (assign);; + state_t state = sm_ctxt->get_state (stmt, divisor); + enum bounds b; + if (get_taint (state, TREE_TYPE (divisor), &b)) + { + tree diag_divisor = sm_ctxt->get_diagnostic_tree (divisor); + sm_ctxt->warn (node, stmt, divisor, + new tainted_divisor (*this, diag_divisor, b)); + sm_ctxt->set_next_state (stmt, divisor, m_stop); + } + } + break; } } @@ -324,6 +770,62 @@ taint_state_machine::can_purge_p (state_t s ATTRIBUTE_UNUSED) const return true; } +/* If STATE is a tainted state, write the bounds to *OUT and return true. + Otherwise return false. + Use the signedness of TYPE to determine if "has_ub" is tainted. */ + +bool +taint_state_machine::get_taint (state_t state, tree type, + enum bounds *out) const +{ + /* Unsigned types have an implicit lower bound. */ + bool is_unsigned = false; + if (type) + if (INTEGRAL_TYPE_P (type)) + is_unsigned = TYPE_UNSIGNED (type); + + /* Can't use a switch as the states are non-const. */ + if (state == m_tainted) + { + *out = is_unsigned ? BOUNDS_LOWER : BOUNDS_NONE; + return true; + } + else if (state == m_has_lb) + { + *out = BOUNDS_LOWER; + return true; + } + else if (state == m_has_ub && !is_unsigned) + { + /* Missing lower bound. */ + *out = BOUNDS_UPPER; + return true; + } + return false; +} + +/* Find the most tainted state of S0 and S1. */ + +state_machine::state_t +taint_state_machine::combine_states (state_t s0, state_t s1) const +{ + gcc_assert (s0); + gcc_assert (s1); + if (s0 == s1) + return s0; + if (s0 == m_tainted || s1 == m_tainted) + return m_tainted; + if (s0 == m_stop) + return s1; + if (s1 == m_stop) + return s0; + if (s0 == m_start) + return s1; + if (s1 == m_start) + return s0; + gcc_unreachable (); +} + } // anonymous namespace /* Internal interface to this file. */ @@ -334,6 +836,152 @@ make_taint_state_machine (logger *logger) return new taint_state_machine (logger); } +/* Complain to CTXT if accessing REG leads could lead to arbitrary + memory access under an attacker's control (due to taint). */ + +void +region_model::check_region_for_taint (const region *reg, + enum access_direction dir, + region_model_context *ctxt) const +{ + gcc_assert (reg); + gcc_assert (ctxt); + + LOG_SCOPE (ctxt->get_logger ()); + + sm_state_map *smap; + const state_machine *sm; + unsigned sm_idx; + if (!ctxt->get_taint_map (&smap, &sm, &sm_idx)) + return; + + gcc_assert (smap); + gcc_assert (sm); + + const taint_state_machine &taint_sm = (const taint_state_machine &)*sm; + + const extrinsic_state *ext_state = ctxt->get_ext_state (); + if (!ext_state) + return; + + const region *iter_region = reg; + while (iter_region) + { + switch (iter_region->get_kind ()) + { + default: + break; + + case RK_ELEMENT: + { + const element_region *element_reg + = (const element_region *)iter_region; + const svalue *index = element_reg->get_index (); + const state_machine::state_t + state = smap->get_state (index, *ext_state); + gcc_assert (state); + enum bounds b; + if (taint_sm.get_taint (state, index->get_type (), &b)) + { + tree arg = get_representative_tree (index); + ctxt->warn (new tainted_array_index (taint_sm, arg, b)); + } + } + break; + + case RK_OFFSET: + { + const offset_region *offset_reg + = (const offset_region *)iter_region; + const svalue *offset = offset_reg->get_byte_offset (); + const state_machine::state_t + state = smap->get_state (offset, *ext_state); + gcc_assert (state); + /* Handle implicit cast to sizetype. */ + tree effective_type = offset->get_type (); + if (const svalue *cast = offset->maybe_undo_cast ()) + if (cast->get_type ()) + effective_type = cast->get_type (); + enum bounds b; + if (taint_sm.get_taint (state, effective_type, &b)) + { + tree arg = get_representative_tree (offset); + ctxt->warn (new tainted_offset (taint_sm, arg, b)); + } + } + break; + + case RK_CAST: + { + const cast_region *cast_reg + = as_a (iter_region); + iter_region = cast_reg->get_original_region (); + continue; + } + + case RK_SIZED: + { + const sized_region *sized_reg + = (const sized_region *)iter_region; + const svalue *size_sval = sized_reg->get_byte_size_sval (m_mgr); + const state_machine::state_t + state = smap->get_state (size_sval, *ext_state); + gcc_assert (state); + enum bounds b; + if (taint_sm.get_taint (state, size_sval->get_type (), &b)) + { + tree arg = get_representative_tree (size_sval); + ctxt->warn (new tainted_size (taint_sm, arg, b, dir)); + } + } + break; + } + + iter_region = iter_region->get_parent_region (); + } +} + +/* Complain to CTXT about a tainted allocation size if SIZE_IN_BYTES is + under an attacker's control (due to taint), where the allocation + is happening within MEM_SPACE. */ + +void +region_model::check_dynamic_size_for_taint (enum memory_space mem_space, + const svalue *size_in_bytes, + region_model_context *ctxt) const +{ + gcc_assert (mem_space == MEMSPACE_STACK || mem_space == MEMSPACE_HEAP); + gcc_assert (size_in_bytes); + gcc_assert (ctxt); + + LOG_SCOPE (ctxt->get_logger ()); + + sm_state_map *smap; + const state_machine *sm; + unsigned sm_idx; + if (!ctxt->get_taint_map (&smap, &sm, &sm_idx)) + return; + + gcc_assert (smap); + gcc_assert (sm); + + const taint_state_machine &taint_sm = (const taint_state_machine &)*sm; + + const extrinsic_state *ext_state = ctxt->get_ext_state (); + if (!ext_state) + return; + + const state_machine::state_t + state = smap->get_state (size_in_bytes, *ext_state); + gcc_assert (state); + enum bounds b; + if (taint_sm.get_taint (state, size_in_bytes->get_type (), &b)) + { + tree arg = get_representative_tree (size_in_bytes); + ctxt->warn (new tainted_allocation_size (taint_sm, arg, b, mem_space)); + } +} + } // namespace ana #endif /* #if ENABLE_ANALYZER */ diff --git a/gcc/analyzer/sm.h b/gcc/analyzer/sm.h index 02faffbff99..b8570e8a1ac 100644 --- a/gcc/analyzer/sm.h +++ b/gcc/analyzer/sm.h @@ -69,6 +69,15 @@ public: within a heap-allocated struct. */ virtual bool inherited_state_p () const = 0; + /* A vfunc for more general handling of inheritance. */ + virtual state_t + alt_get_inherited_state (const sm_state_map &, + const svalue *, + const extrinsic_state &) const + { + return NULL; + } + virtual state_machine::state_t get_default_state (const svalue *) const { return m_start; diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index a7c4d24a762..518a5216c73 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -449,7 +449,11 @@ Objective-C and Objective-C++ Dialects}. -Wno-analyzer-shift-count-negative @gol -Wno-analyzer-shift-count-overflow @gol -Wno-analyzer-stale-setjmp-buffer @gol +-Wno-analyzer-tainted-allocation-size @gol -Wno-analyzer-tainted-array-index @gol +-Wno-analyzer-tainted-divisor @gol +-Wno-analyzer-tainted-offset @gol +-Wno-analyzer-tainted-size @gol -Wanalyzer-too-complex @gol -Wno-analyzer-unsafe-call-within-signal-handler @gol -Wno-analyzer-use-after-free @gol @@ -9400,7 +9404,11 @@ Enabling this option effectively enables the following warnings: -Wanalyzer-shift-count-negative @gol -Wanalyzer-shift-count-overflow @gol -Wanalyzer-stale-setjmp-buffer @gol +-Wanalyzer-tainted-allocation-size @gol -Wanalyzer-tainted-array-index @gol +-Wanalyzer-tainted-divisor @gol +-Wanalyzer-tainted-offset @gol +-Wanalyzer-tainted-size @gol -Wanalyzer-unsafe-call-within-signal-handler @gol -Wanalyzer-use-after-free @gol -Wanalyzer-use-of-uninitialized-value @gol @@ -9583,6 +9591,21 @@ when the function containing the @code{setjmp} call returns. Attempting to rewind to it via @code{longjmp} would reference a stack frame that no longer exists, and likely lead to a crash (or worse). +@item -Wno-analyzer-tainted-allocation-size +@opindex Wanalyzer-tainted-allocation-size +@opindex Wno-analyzer-tainted-allocation-size +This warning requires both @option{-fanalyzer} and +@option{-fanalyzer-checker=taint} to enable it; +use @option{-Wno-analyzer-tainted-allocation-size} to disable it. + +This diagnostic warns for paths through the code in which a value +that could be under an attacker's control is used as the size +of an allocation without being sanitized, so that an attacker could +inject an excessively large allocation and potentially cause a denial +of service attack. + +See @url{https://cwe.mitre.org/data/definitions/789.html, CWE-789: Memory Allocation with Excessive Size Value}. + @item -Wno-analyzer-tainted-array-index @opindex Wanalyzer-tainted-array-index @opindex Wno-analyzer-tainted-array-index @@ -9592,7 +9615,48 @@ use @option{-Wno-analyzer-tainted-array-index} to disable it. This diagnostic warns for paths through the code in which a value that could be under an attacker's control is used as the index -of an array access without being sanitized. +of an array access without being sanitized, so that an attacker +could inject an out-of-bounds access. + +See @url{https://cwe.mitre.org/data/definitions/129.html, CWE-129: Improper Validation of Array Index}. + +@item -Wno-analyzer-tainted-divisor +@opindex Wanalyzer-tainted-divisor +@opindex Wno-analyzer-tainted-divisor +This warning requires both @option{-fanalyzer} and +@option{-fanalyzer-checker=taint} to enable it; +use @option{-Wno-analyzer-tainted-divisor} to disable it. + +This diagnostic warns for paths through the code in which a value +that could be under an attacker's control is used as the divisor +in a division or modulus operation without being sanitized, so that +an attacker could inject a division-by-zero. + +@item -Wno-analyzer-tainted-offset +@opindex Wanalyzer-tainted-offset +@opindex Wno-analyzer-tainted-offset +This warning requires both @option{-fanalyzer} and +@option{-fanalyzer-checker=taint} to enable it; +use @option{-Wno-analyzer-tainted-offset} to disable it. + +This diagnostic warns for paths through the code in which a value +that could be under an attacker's control is used as a pointer offset +without being sanitized, so that an attacker could inject an out-of-bounds +access. + +See @url{https://cwe.mitre.org/data/definitions/823.html, CWE-823: Use of Out-of-range Pointer Offset}. + +@item -Wno-analyzer-tainted-size +@opindex Wanalyzer-tainted-size +@opindex Wno-analyzer-tainted-size +This warning requires both @option{-fanalyzer} and +@option{-fanalyzer-checker=taint} to enable it; +use @option{-Wno-analyzer-tainted-size} to disable it. + +This diagnostic warns for paths through the code in which a value +that could be under an attacker's control is used as the size of +an operation such as @code{memset} without being sanitized, so that an +attacker could inject an out-of-bounds access. @item -Wno-analyzer-unsafe-call-within-signal-handler @opindex Wanalyzer-unsafe-call-within-signal-handler diff --git a/gcc/testsuite/gcc.dg/analyzer/pr93382.c b/gcc/testsuite/gcc.dg/analyzer/pr93382.c index 210b97d1a47..1e6612ddc05 100644 --- a/gcc/testsuite/gcc.dg/analyzer/pr93382.c +++ b/gcc/testsuite/gcc.dg/analyzer/pr93382.c @@ -23,5 +23,5 @@ int pl (void) { ql (); - return arr[idx]; /* { dg-warning "use of tainted value 'idx' in array lookup without bounds checking" "" { xfail *-*-* } } */ + return arr[idx]; /* { dg-warning "use of attacker-controlled value 'idx' in array lookup without bounds checking" "" { xfail *-*-* } } */ } diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-1.c b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-1.c new file mode 100644 index 00000000000..102aa4b9220 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-1.c @@ -0,0 +1,64 @@ +// TODO: remove need for this option +/* { dg-additional-options "-fanalyzer-checker=taint" } */ + +#include "analyzer-decls.h" +#include +#include +#include + +struct foo +{ + size_t sz; +}; + +/* malloc with tainted size. */ + +void *test_1 (FILE *f) +{ + struct foo tmp; + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { /* { dg-message "\\(\[0-9\]+\\) 'tmp' gets an unchecked value here" "event: tmp gets unchecked value" { xfail *-*-* } } */ + /* { dg-message "\\(\[0-9\]+\\) following 'true' branch\\.\\.\\." "event: following true branch" { target *-*-* } .-1 } */ + __analyzer_dump_state ("taint", tmp.sz); /* { dg-warning "state: 'tainted'" } */ + /* { dg-message "\\(\[0-9\]+\\) \\.\\.\\.to here" "event: to here" { target *-*-* } .-1 } */ + + return malloc (tmp.sz); /* { dg-warning "use of attacker-controlled value 'tmp\\.sz' as allocation size without upper-bounds checking" "warning" } */ + /* { dg-message "23: \\(\[0-9\]+\\) 'tmp.i' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.i has an unchecked value" { xfail *-*-* } .-1 } */ + /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value 'tmp\\.sz' as allocation size without upper-bounds checking" "final event" { target *-*-* } .-2 } */ + + // TOOD: better messages for state changes + } + return 0; +} + +/* VLA with tainted size. */ + +void *test_2 (FILE *f) +{ + struct foo tmp; + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { /* { dg-message "\\(\[0-9\]+\\) 'tmp' gets an unchecked value here" "event: tmp gets unchecked value" { xfail *-*-* } } */ + /* { dg-message "\\(\[0-9\]+\\) following 'true' branch\\.\\.\\." "event: following true branch" { target *-*-* } .-1 } */ + __analyzer_dump_state ("taint", tmp.sz); /* { dg-warning "state: 'tainted'" } */ + /* { dg-message "\\(\[0-9\]+\\) \\.\\.\\.to here" "event: to here" { target *-*-* } .-1 } */ + + /* VLA with tainted size. */ + { + char buf[tmp.sz]; /* { dg-warning "use of attacker-controlled value 'tmp\\.sz' as allocation size without upper-bounds checking" "warning" } */ + /* { dg-message "\\(\[0-9\]+\\) 'tmp.i' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.i has an unchecked value" { xfail *-*-* } .-1 } */ + /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value 'tmp\\.sz' as allocation size without upper-bounds checking" "final event" { target *-*-* } .-2 } */ + fread (buf, tmp.sz, 1, f); + } + + // TOOD: better messages for state changes + } + return 0; +} + +void *test_3 (FILE *f) +{ + int num; + fread (&num, sizeof (int), 1, f); + __analyzer_dump_state ("taint", num); /* { dg-warning "state: 'tainted'" } */ + __analyzer_dump_state ("taint", num * 16); /* { dg-warning "state: 'tainted'" } */ + __analyzer_dump_state ("taint", (size_t)(num * 16)); /* { dg-warning "state: 'tainted'" } */ + return malloc (num * 16); /* { dg-warning "use of attacker-controlled value 'num \\* 16' as allocation size without upper-bounds checking" } */ +} diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-alloc-2.c b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-2.c new file mode 100644 index 00000000000..72dbca5cbf0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/taint-alloc-2.c @@ -0,0 +1,27 @@ +// TODO: remove need for this option: +/* { dg-additional-options "-fanalyzer-checker=taint" } */ + +#include "analyzer-decls.h" +#include +#include +#include + +struct foo +{ + int num; +}; + +/* malloc with tainted size from a field. */ + +void *test_1 (FILE *f) +{ + struct foo tmp; + fread(&tmp, sizeof(tmp), 1, f); /* { dg-message "\\(\[0-9\]+\\) 'tmp' gets an unchecked value here" "event: tmp gets unchecked value" { xfail *-*-* } } */ + + __analyzer_dump_state ("taint", tmp.num); /* { dg-warning "state: 'tainted'" } */ + __analyzer_dump_state ("taint", tmp.num * 16); /* { dg-warning "state: 'tainted'" } */ + + return malloc (tmp.num * 16); /* { dg-warning "use of attacker-controlled value 'tmp\\.num \\* 16' as allocation size without upper-bounds checking" "warning" } */ + /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value 'tmp\\.num \\* 16' as allocation size without upper-bounds checking" "final event with expr" { target *-*-* } .-1 } */ + // TODO: show where tmp.num * 16 gets the bogus value +} diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-divisor-1.c b/gcc/testsuite/gcc.dg/analyzer/taint-divisor-1.c new file mode 100644 index 00000000000..5a5a0b93ce0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/taint-divisor-1.c @@ -0,0 +1,26 @@ +// TODO: remove need for this option: +/* { dg-additional-options "-fanalyzer-checker=taint" } */ + +#include "analyzer-decls.h" +#include + +struct st1 +{ + int a; + int b; +}; + + +int test_1 (FILE *f) +{ + struct st1 s; + fread (&s, sizeof (s), 1, f); + return s.a / s.b; /* { dg-warning "use of attacker-controlled value 's\\.b' as divisor without checking for zero" } */ +} + +int test_2 (FILE *f) +{ + struct st1 s; + fread (&s, sizeof (s), 1, f); + return s.a % s.b; /* { dg-warning "use of attacker-controlled value 's\\.b' as divisor without checking for zero" } */ +} diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-1.c b/gcc/testsuite/gcc.dg/analyzer/taint-read-index-1.c similarity index 72% rename from gcc/testsuite/gcc.dg/analyzer/taint-1.c rename to gcc/testsuite/gcc.dg/analyzer/taint-read-index-1.c index cd46dd5fc14..71c0816fd1f 100644 --- a/gcc/testsuite/gcc.dg/analyzer/taint-1.c +++ b/gcc/testsuite/gcc.dg/analyzer/taint-read-index-1.c @@ -1,3 +1,4 @@ +// TODO: remove need for this option: /* { dg-additional-options "-fanalyzer-checker=taint" } */ #include @@ -18,10 +19,10 @@ char test_1(FILE *f) /* { dg-message "\\(\[0-9\]+\\) following 'true' branch\\.\\.\\." "event: following true branch" { target *-*-* } .-1 } */ /* BUG: the following array lookup trusts that the input data's index is in the range 0 <= i < 256; otherwise it's accessing the stack */ - return tmp.buf[tmp.i]; // { dg-warning "use of tainted value 'tmp.i' in array lookup without bounds checking" "warning" } */ + return tmp.buf[tmp.i]; // { dg-warning "use of attacker-controlled value 'tmp.i' in array lookup without bounds checking" "warning" } */ /* { dg-message "23: \\(\[0-9\]+\\) \\.\\.\\.to here" "event: to here" { target *-*-* } .-1 } */ /* { dg-message "23: \\(\[0-9\]+\\) 'tmp.i' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.i has an unchecked value" { xfail *-*-* } .-2 } */ - /* { dg-message "\\(\[0-9\]+\\) use of tainted value 'tmp.i' in array lookup without bounds checking" "final event" { target *-*-* } .-3 } */ + /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value 'tmp.i' in array lookup without bounds checking" "final event" { target *-*-* } .-3 } */ // TOOD: better messages for state changes } @@ -53,8 +54,8 @@ char test_4(FILE *f) if (1 == fread(&tmp, sizeof(tmp), 1, f)) { if (tmp.i >= 0) { /* { dg-message "'tmp.i' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.i has an unchecked value" { xfail *-*-* } } */ - /* { dg-message "'tmp.i' has its lower bound checked here" "event: lower bound checked" { target *-*-* } .-1 } */ - return tmp.buf[tmp.i]; /* { dg-warning "use of tainted value 'tmp.i' in array lookup without upper-bounds checking" "warning" } */ + /* { dg-message "'tmp.i' has its lower bound checked here" "event: lower bound checked" { xfail *-*-* } .-1 } */ + return tmp.buf[tmp.i]; /* { dg-warning "use of attacker-controlled value 'tmp.i' in array lookup without upper-bounds checking" "warning" } */ } } return 0; @@ -66,8 +67,8 @@ char test_5(FILE *f) if (1 == fread(&tmp, sizeof(tmp), 1, f)) { if (tmp.i < 256) { /* { dg-message "'tmp.i' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.i has an unchecked value" { xfail *-*-* } } */ - /* { dg-message "'tmp.i' has its upper bound checked here" "event: upper bound checked" { target *-*-* } .-1 } */ - return tmp.buf[tmp.i]; /* { dg-warning "use of tainted value 'tmp.i' in array lookup without lower-bounds checking" "warning" } */ + /* { dg-message "'tmp.i' has its upper bound checked here" "event: upper bound checked" { xfail *-*-* } .-1 } */ + return tmp.buf[tmp.i]; /* { dg-warning "use of attacker-controlled value 'tmp.i' in array lookup without checking for negative" "warning" } */ } } return 0; @@ -85,7 +86,7 @@ char test_6(FILE *f) struct bar tmp; if (1 == fread(&tmp, sizeof(tmp), 1, f)) { - return tmp.buf[tmp.i]; /* { dg-warning "use of tainted value 'tmp.i' in array lookup without upper-bounds checking" } */ + return tmp.buf[tmp.i]; /* { dg-warning "use of attacker-controlled value 'tmp.i' in array lookup without upper-bounds checking" } */ } return 0; } @@ -96,7 +97,7 @@ char test_7(FILE *f) if (1 == fread(&tmp, sizeof(tmp), 1, f)) { if (tmp.i >= 0) { - return tmp.buf[tmp.i]; /* { dg-warning "use of tainted value 'tmp.i' in array lookup without upper-bounds checking" } */ + return tmp.buf[tmp.i]; /* { dg-warning "use of attacker-controlled value 'tmp.i' in array lookup without upper-bounds checking" } */ } } return 0; @@ -122,7 +123,7 @@ char test_9(FILE *f) if (1 == fread(&tmp, sizeof(tmp), 1, f)) { if (tmp.i == 42) { /* not a bug: tmp.i compared against a specific value: */ - return tmp.buf[tmp.i]; /* { dg-bogus "tainted" "" { xfail *-*-* } } */ + return tmp.buf[tmp.i]; /* { dg-bogus "attacker-controlled" "" { xfail *-*-* } } */ // TODO: xfail } } diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-read-offset-1.c b/gcc/testsuite/gcc.dg/analyzer/taint-read-offset-1.c new file mode 100644 index 00000000000..6db59bcc615 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/taint-read-offset-1.c @@ -0,0 +1,128 @@ +// TODO: remove need for this option: +/* { dg-additional-options "-fanalyzer-checker=taint" } */ + +#include +#include +#include + +struct foo +{ + ssize_t offset; +}; + +char *p; + +char test_1(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { /* { dg-message "\\(\[0-9\]+\\) 'tmp' gets an unchecked value here" "event: tmp gets unchecked value" { xfail *-*-* } } */ + /* { dg-message "\\(\[0-9\]+\\) following 'true' branch\\.\\.\\." "event: following true branch" { target *-*-* } .-1 } */ + return *(p + tmp.offset); // { dg-warning "use of attacker-controlled value 'tmp.offset' as offset without bounds checking" "warning" } */ + /* { dg-message "\\(\[0-9\]+\\) \\.\\.\\.to here" "event: to here" { target *-*-* } .-1 } */ + /* { dg-message "\\(\[0-9\]+\\) 'tmp.offset' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.offset has an unchecked value" { xfail *-*-* } .-2 } */ + /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value 'tmp.offset' as offset without bounds checking" "final event" { target *-*-* } .-3 } */ + + // TOOD: better messages for state changes + } + return 0; +} + +char test_2(struct foo *f) +{ + /* not a bug: the data is not known to be tainted: */ + return *(p + f->offset); +} + +char test_3(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset >= 0 && tmp.offset < 256) { + /* not a bug: the access is guarded by upper and lower bounds: */ + return *(p + tmp.offset); + } + } + return 0; +} + +char test_4(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset >= 0) { /* { dg-message "'tmp.offset' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.offset has an unchecked value" { xfail *-*-* } } */ + /* { dg-message "'tmp.offset' has its lower bound checked here" "event: lower bound checked" { xfail *-*-* } .-1 } */ + return *(p + tmp.offset); /* { dg-warning "use of attacker-controlled value 'tmp.offset' as offset without upper-bounds checking" "warning" } */ + } + } + return 0; +} + +char test_5(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset < 256) { /* { dg-message "'tmp.offset' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.offset has an unchecked value" { xfail *-*-* } } */ + /* { dg-message "'tmp.offset' has its upper bound checked here" "event: upper bound checked" { xfail *-*-* } .-1 } */ + return *(p + tmp.offset); /* { dg-warning "use of attacker-controlled value 'tmp.offset' as offset without lower-bounds checking" "warning" } */ + } + } + return 0; +} + +/* unsigned types have a natural lower bound of 0 */ +struct bar +{ + size_t offset; +}; + +char test_6(FILE *f) +{ + struct bar tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + return *(p + tmp.offset); /* { dg-warning "use of attacker-controlled value 'tmp.offset' as offset without upper-bounds checking" } */ + } + return 0; +} + +char test_7(FILE *f) +{ + struct bar tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset >= 0) { + return *(p + tmp.offset); /* { dg-warning "use of attacker-controlled value 'tmp.offset' as offset without upper-bounds checking" } */ + } + } + return 0; +} + +char test_8(FILE *f) +{ + struct bar tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset < 256) { + /* not a bug: has an upper bound, and an implicit lower bound: */ + return *(p + tmp.offset); + } + } + return 0; +} + +char test_9(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset == 42) { + /* not a bug: tmp.offset compared against a specific value: */ + return *(p + tmp.offset); /* { dg-bogus "attacker-controlled" "" { xfail *-*-* } } */ + } + } + return 0; +} diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-size-1.c b/gcc/testsuite/gcc.dg/analyzer/taint-size-1.c new file mode 100644 index 00000000000..64c9c26aa2d --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/taint-size-1.c @@ -0,0 +1,32 @@ +// TODO: remove need for this option: +/* { dg-additional-options "-fanalyzer-checker=taint" } */ + +#include "analyzer-decls.h" +#include +#include +#include + +struct foo +{ + size_t sz; +}; + +char buf[100]; + +/* memset with tainted size. */ + +void test_1 (FILE *f) +{ + struct foo tmp; + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { /* { dg-message "\\(\[0-9\]+\\) 'tmp' gets an unchecked value here" "event: tmp gets unchecked value" { xfail *-*-* } } */ + /* { dg-message "\\(\[0-9\]+\\) following 'true' branch\\.\\.\\." "event: following true branch" { target *-*-* } .-1 } */ + __analyzer_dump_state ("taint", tmp.sz); /* { dg-warning "state: 'tainted'" } */ + /* { dg-message "\\(\[0-9\]+\\) \\.\\.\\.to here" "event: to here" { target *-*-* } .-1 } */ + + memset (buf, 0, tmp.sz); /* { dg-warning "use of attacker-controlled value 'tmp\\.sz' as size without upper-bounds checking" "warning" } */ + /* { dg-message "23: \\(\[0-9\]+\\) 'tmp.i' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.i has an unchecked value" { xfail *-*-* } .-1 } */ + /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value 'tmp\\.sz' as size without upper-bounds checking" "final event" { target *-*-* } .-2 } */ + + // TOOD: better messages for state changes + } +} diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-write-index-1.c b/gcc/testsuite/gcc.dg/analyzer/taint-write-index-1.c new file mode 100644 index 00000000000..cc7ab1ca4f6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/taint-write-index-1.c @@ -0,0 +1,132 @@ +// TODO: remove need for this option: +/* { dg-additional-options "-fanalyzer-checker=taint" } */ + +#include +#include +#include + +struct foo +{ + signed int i; + char buf[256]; +}; + +struct foo g; +char test_1(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { /* { dg-message "\\(\[0-9\]+\\) 'tmp' gets an unchecked value here" "event: tmp gets unchecked value" { xfail *-*-* } } */ + /* { dg-message "\\(\[0-9\]+\\) following 'true' branch\\.\\.\\." "event: following true branch" { target *-*-* } .-1 } */ + /* BUG: the following array lookup trusts that the input data's index is + in the range 0 <= i < 256; otherwise it's accessing the stack */ + g.buf[tmp.i] = 42; // { dg-warning "use of attacker-controlled value 'tmp.i' in array lookup without bounds checking" "warning" } */ + /* { dg-message "\\(\[0-9\]+\\) \\.\\.\\.to here" "event: to here" { target *-*-* } .-1 } */ + /* { dg-message "\\(\[0-9\]+\\) 'tmp.i' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.i has an unchecked value" { xfail *-*-* } .-2 } */ + /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value 'tmp.i' in array lookup without bounds checking" "final event" { target *-*-* } .-3 } */ + + // TOOD: better messages for state changes + } + return 0; +} + +char test_2(struct foo *f, int i) +{ + /* not a bug: the data is not known to be tainted: */ + return f->buf[f->i]; +} + +char test_3(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.i >= 0 && tmp.i < 256) { + /* not a bug: the access is guarded by upper and lower bounds: */ + g.buf[tmp.i] = 42; + } + } + return 0; +} + +char test_4(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.i >= 0) { /* { dg-message "'tmp.i' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.i has an unchecked value" { xfail *-*-* } } */ + /* { dg-message "'tmp.i' has its lower bound checked here" "event: lower bound checked" { xfail *-*-* } .-1 } */ + g.buf[tmp.i] = 42; /* { dg-warning "use of attacker-controlled value 'tmp.i' in array lookup without upper-bounds checking" "warning" } */ + } + } + return 0; +} + +char test_5(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.i < 256) { /* { dg-message "'tmp.i' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.i has an unchecked value" { xfail *-*-* } } */ + /* { dg-message "'tmp.i' has its upper bound checked here" "event: upper bound checked" { xfail *-*-* } .-1 } */ + g.buf[tmp.i] = 42; /* { dg-warning "use of attacker-controlled value 'tmp.i' in array lookup without checking for negative" "warning" } */ + } + } + return 0; +} + +/* unsigned types have a natural lower bound of 0 */ +struct bar +{ + unsigned int i; + char buf[256]; +}; + +char test_6(FILE *f) +{ + struct bar tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + g.buf[tmp.i] = 42; /* { dg-warning "use of attacker-controlled value 'tmp.i' in array lookup without upper-bounds checking" } */ + } + return 0; +} + +char test_7(FILE *f) +{ + struct bar tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.i >= 0) { + g.buf[tmp.i] = 42; /* { dg-warning "use of attacker-controlled value 'tmp.i' in array lookup without upper-bounds checking" } */ + } + } + return 0; +} + +char test_8(FILE *f) +{ + struct bar tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.i < 256) { + /* not a bug: has an upper bound, and an implicit lower bound: */ + g.buf[tmp.i] = 42; + } + } + return 0; +} + +char test_9(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.i == 42) { + /* not a bug: tmp.i compared against a specific value: */ + g.buf[tmp.i] = 42; /* { dg-bogus "attacker-controlled" "" { xfail *-*-* } } */ + // TODO: xfail + } + } + return 0; +} diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-write-offset-1.c b/gcc/testsuite/gcc.dg/analyzer/taint-write-offset-1.c new file mode 100644 index 00000000000..d0df6220315 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/taint-write-offset-1.c @@ -0,0 +1,132 @@ +// TODO: remove need for this option: +/* { dg-additional-options "-fanalyzer-checker=taint" } */ + +#include +#include +#include + +struct foo +{ + ssize_t offset; +}; + +char *p; + +char test_1(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { /* { dg-message "\\(\[0-9\]+\\) 'tmp' gets an unchecked value here" "event: tmp gets unchecked value" { xfail *-*-* } } */ + /* { dg-message "\\(\[0-9\]+\\) following 'true' branch\\.\\.\\." "event: following true branch" { target *-*-* } .-1 } */ + /* BUG: the following array lookup trusts that the input data's index is + in the range 0 <= i < 256; otherwise it's accessing the stack */ + *(p + tmp.offset) = 42; // { dg-warning "use of attacker-controlled value 'tmp.offset' as offset without bounds checking" "warning" } */ + /* { dg-message "\\(\[0-9\]+\\) \\.\\.\\.to here" "event: to here" { target *-*-* } .-1 } */ + /* { dg-message "\\(\[0-9\]+\\) 'tmp.offset' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.offset has an unchecked value" { xfail *-*-* } .-2 } */ + /* { dg-message "\\(\[0-9\]+\\) use of attacker-controlled value 'tmp.offset' as offset without bounds checking" "final event" { target *-*-* } .-3 } */ + + // TOOD: better messages for state changes + } + return 0; +} + +char test_2(struct foo *f) +{ + /* not a bug: the data is not known to be tainted: */ + *(p + f->offset) = 42; +} + +char test_3(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset >= 0 && tmp.offset < 256) { + /* not a bug: the access is guarded by upper and lower bounds: */ + *(p + tmp.offset) = 42; + } + } + return 0; +} + +char test_4(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset >= 0) { /* { dg-message "'tmp.offset' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.offset has an unchecked value" { xfail *-*-* } } */ + /* { dg-message "'tmp.offset' has its lower bound checked here" "event: lower bound checked" { xfail *-*-* } .-1 } */ + *(p + tmp.offset) = 42; /* { dg-warning "use of attacker-controlled value 'tmp.offset' as offset without upper-bounds checking" "warning" } */ + } + } + return 0; +} + +char test_5(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset < 256) { /* { dg-message "'tmp.offset' has an unchecked value here \\(from 'tmp'\\)" "event: tmp.offset has an unchecked value" { xfail *-*-* } } */ + /* { dg-message "'tmp.offset' has its upper bound checked here" "event: upper bound checked" { xfail *-*-* } .-1 } */ + *(p + tmp.offset) = 42; /* { dg-warning "use of attacker-controlled value 'tmp.offset' as offset without lower-bounds checking" "warning" } */ + } + } + return 0; +} + +/* unsigned types have a natural lower bound of 0 */ +struct bar +{ + unsigned int offset; + char buf[256]; +}; + +char test_6(FILE *f) +{ + struct bar tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + *(p + tmp.offset) = 42; /* { dg-warning "use of attacker-controlled value 'tmp.offset' as offset without upper-bounds checking" } */ + } + return 0; +} + +char test_7(FILE *f) +{ + struct bar tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset >= 0) { + *(p + tmp.offset) = 42; /* { dg-warning "use of attacker-controlled value 'tmp.offset' as offset without upper-bounds checking" } */ + } + } + return 0; +} + +char test_8(FILE *f) +{ + struct bar tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset < 256) { + /* not a bug: has an upper bound, and an implicit lower bound: */ + *(p + tmp.offset) = 42; + } + } + return 0; +} + +char test_9(FILE *f) +{ + struct foo tmp; + + if (1 == fread(&tmp, sizeof(tmp), 1, f)) { + if (tmp.offset == 42) { + /* not a bug: tmp.offset compared against a specific value: */ + *(p + tmp.offset) = 42; /* { dg-bogus "tainted" "" { xfail *-*-* } } */ + // TODO: xfail + } + } + return 0; +}