From: Ian Lance Taylor
To: overseers@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Don't let search bots look at buglist.cgi
Date: Fri, 13 May 2011 10:14:33 -0700
X-Patchwork-Id: 95501

I noticed that buglist.cgi was taking quite a bit of CPU time.  I looked
at some of the long running instances, and they were coming from
searchbots.  I can't think of a good reason for this, so I have
committed this patch to the gcc.gnu.org robots.txt file to not let
searchbots search through lists of bugs.  I plan to make a similar
change on the sourceware.org and cygwin.com sides.

Please let me know if this seems like a mistake.

Does anybody have any experience with
http://code.google.com/p/bugzilla-sitemap/ ?  That might be a slightly
better approach.

Ian

Index: robots.txt
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/robots.txt,v
retrieving revision 1.9
diff -u -r1.9 robots.txt
--- robots.txt	22 Sep 2009 19:19:30 -0000	1.9
+++ robots.txt	13 May 2011 17:08:33 -0000
@@ -5,4 +5,5 @@
 User-Agent: *
 Disallow: /viewcvs/
 Disallow: /cgi-bin/
+Disallow: /bugzilla/buglist.cgi
 Crawl-Delay: 60
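
For anyone who wants to check what the new rule covers, here is a rough sketch using Python's standard urllib.robotparser. It parses only the five lines visible in the hunk above (the rest of the real robots.txt is not shown here and is left out), and the bot name and query strings are made up for illustration. It assumes a crawler that does plain prefix matching the way this parser does; actual search bots may interpret the file differently.

```python
from urllib.robotparser import RobotFileParser

# Only the lines visible in the diff hunk, with the new Disallow applied.
# The first four lines of the real file are not shown in the patch and are omitted.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /viewcvs/
Disallow: /cgi-bin/
Disallow: /bugzilla/buglist.cgi
Crawl-Delay: 60
"""


def build_parser(text):
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

BOT = "ExampleBot"           # hypothetical user agent, matched by "User-Agent: *"
BASE = "http://gcc.gnu.org"  # host named in the mail

# Illustrative URLs: a bug list query and a single bug page.
buglist_allowed = parser.can_fetch(BOT, BASE + "/bugzilla/buglist.cgi?quicksearch=foo")
single_bug_allowed = parser.can_fetch(BOT, BASE + "/bugzilla/show_bug.cgi?id=1")
delay = parser.crawl_delay(BOT)

print("buglist.cgi allowed:", buglist_allowed)
print("show_bug.cgi allowed:", single_bug_allowed)
print("crawl delay:", delay)
```

Under those assumptions the rule only blocks bug list queries; individual bug pages under /bugzilla/ stay crawlable, which seems to be the intent.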