From 8d133850499c24d622e9a2569c69134604d04647 Mon Sep 17 00:00:00 2001
From: cvsdist
Date: Sep 09 2004 12:36:11 +0000
Subject: auto-import changelog data from squid-2.3.STABLE4-1.src.rpm

* Fri Jul 28 2000 Bill Nottingham
- clean up init script; fix condrestart
- update to STABLE4, more bugfixes
- update FAQ

* Tue Jul 18 2000 Nalin Dahyabhai
- fix syntax error in init script
- finish adding condrestart support

* Fri Jul 14 2000 Bill Nottingham
- move initscript back

* Wed Jul 12 2000 Prospector
- automatic rebuild

* Thu Jul 06 2000 Bill Nottingham
- prereq /etc/init.d
- add bugfix patch
- update FAQ

* Thu Jun 29 2000 Bill Nottingham
- fix init script

* Tue Jun 27 2000 Bill Nottingham
- don't prereq new initscripts

* Mon Jun 26 2000 Bill Nottingham
- initscript munging

* Sat Jun 10 2000 Bill Nottingham
- rebuild for exciting FHS stuff

* Wed May 31 2000 Bill Nottingham
- fix init script again (#11699)
- add --enable-delay-pools (#11695)
- update to STABLE3
- update FAQ

* Fri Apr 28 2000 Bill Nottingham
- fix init script (#11087)

* Fri Apr 07 2000 Bill Nottingham
- three more bugfix patches from the squid people
- buildprereq jade, sgmltools

* Sun Mar 26 2000 Florian La Roche
- make %pre more portable

* Thu Mar 16 2000 Bill Nottingham
- bugfix patches
- fix dependency on /usr/local/bin/perl

* Sat Mar 04 2000 Bill Nottingham
- 2.3.STABLE2

* Mon Feb 14 2000 Bill Nottingham
- Yet More Bugfix Patches

* Tue Feb 08 2000 Bill Nottingham
- add more bugfix patches
- --enable-heap-replacement

* Mon Jan 31 2000 Cristian Gafton
- rebuild to fix dependencies

* Fri Jan 28 2000 Bill Nottingham
- grab some bugfix patches

* Mon Jan 10 2000 Bill Nottingham
- 2.3.STABLE1 (whee, another serial number)

* Tue Dec 21 1999 Bernhard Rosenkraenzer
- Fix compliance with ftp RFCs (http://www.wu-ftpd.org/broken-clients.html)
- Work around a bug in some versions of autoconf
- BuildPrereq sgml-tools - we're using sgml2html

* Mon Oct 18 1999 Bill Nottingham
- add a couple of bugfix patches

* Wed Oct 13 1999 Bill Nottingham
- update to 2.2.STABLE5.
- update FAQ, fix URLs.

* Sat Sep 11 1999 Cristian Gafton
- transform restart in reload and add restart to the init script

* Tue Aug 31 1999 Bill Nottingham
- add squid user as user 23.

* Mon Aug 16 1999 Bill Nottingham
- initscript munging
- fix conflict between logrotate & squid -k (#4562)

* Wed Jul 28 1999 Bill Nottingham
- put cachemgr.cgi back in /usr/lib/squid

* Wed Jul 14 1999 Bill Nottingham
- add webdav bugfix patch (#4027)

* Mon Jul 12 1999 Bill Nottingham
- fix path to config in squid.init (confuses linuxconf)

* Wed Jul 07 1999 Bill Nottingham
- 2.2.STABLE4

* Wed Jun 09 1999 Dale Lovelace
- logrotate changes
- errors from find when /var/spool/squid or
- /var/log/squid didn't exist

* Thu May 20 1999 Bill Nottingham
- 2.2.STABLE3

* Thu Apr 22 1999 Bill Nottingham
- update to 2.2.STABLE.2

* Sun Apr 18 1999 Bill Nottingham
- update to 2.2.STABLE1

* Thu Apr 15 1999 Bill Nottingham
- don't need to run groupdel on remove
- fix useradd

* Mon Apr 12 1999 Bill Nottingham
- fix effective_user (bug #2124)

* Mon Apr 05 1999 Bill Nottingham
- strip binaries

* Thu Apr 01 1999 Bill Nottingham
- duh. adduser does require a user name.
- add a serial number

* Tue Mar 30 1999 Bill Nottingham
- add an adduser in %pre, too

* Thu Mar 25 1999 Bill Nottingham
- oog.
  chkconfig must be in %preun, not %postun

* Wed Mar 24 1999 Bill Nottingham
- switch to using group squid
- turn off icmp (insecure)
- update to 2.2.DEVEL3
- build FAQ docs from source

* Tue Mar 23 1999 Bill Nottingham
- logrotate changes

* Sun Mar 21 1999 Cristian Gafton
- auto rebuild in the new build environment (release 4)

* Wed Feb 10 1999 Bill Nottingham
- update to 2.2.PRE2

* Wed Dec 30 1998 Bill Nottingham
- cache & log dirs shouldn't be world readable
- remove preun script (leave logs & cache @ uninstall)

* Tue Dec 29 1998 Bill Nottingham
- fix initscript to get cache_dir correct

* Fri Dec 18 1998 Bill Nottingham
- update to 2.1.PATCH2
- merge in some changes from RHCN version

* Sat Oct 10 1998 Cristian Gafton
- strip binaries
- version 1.1.22

* Sun May 10 1998 Cristian Gafton
- don't make packages conflict with each other...

* Sat May 02 1998 Cristian Gafton
- added a proxy auth patch from Alex deVries
- fixed initscripts

* Thu Apr 09 1998 Cristian Gafton
- rebuilt for Manhattan

* Fri Mar 20 1998 Cristian Gafton
- upgraded to 1.1.21/1.NOVM.21

* Mon Mar 02 1998 Cristian Gafton
- updated the init script to use reconfigure option to restart squid
  instead of shutdown/restart (both safer and quicker)

* Sat Feb 07 1998 Cristian Gafton
- upgraded to 1.1.20
- added the NOVM package and tried to reduce the mess in the spec file

* Wed Jan 07 1998 Cristian Gafton
- first build against glibc
- patched out the use of setresuid(), which is available only on
  kernels 2.1.44 and later

---

diff --git a/.cvsignore b/.cvsignore
index e69de29..ab64fb1 100644
--- a/.cvsignore
+++ b/.cvsignore
@@ -0,0 +1 @@
+squid-2.3.STABLE4-src.tar.gz
diff --git a/FAQ.sgml b/FAQ.sgml
new file mode 100644
index 0000000..d31e450
--- /dev/null
+++ b/FAQ.sgml
@@ -0,0 +1,11925 @@
+
+
+ +SQUID Frequently Asked Questions +Duane Wessels, +Frequently Asked Questions (with answers!) about the Squid Internet +Object Cache software. + + + + + + +About Squid, this FAQ, and other Squid information resources + +What is Squid? +

+Squid is a high-performance proxy caching server for web clients, +supporting FTP, gopher, and HTTP data objects. Unlike traditional +caching software, Squid handles all requests in a single, +non-blocking, I/O-driven process. + +Squid keeps +meta data and especially hot objects cached in RAM, caches +DNS lookups, supports non-blocking DNS lookups, and implements +negative caching of failed requests. + +Squid supports SSL, extensive +access controls, and full request logging. By using the +lightweight Internet Cache Protocol, Squid caches can be arranged +in a hierarchy or mesh for additional bandwidth savings. + +

+Squid consists of a main server program squid, a Domain Name System
+lookup program dnsserver, and some management and client tools.

+Squid is derived from the ARPA-funded Harvest project.

+What is Internet object caching?

+Internet object caching is a way to store requested Internet objects +(i.e., data available via the HTTP, FTP, and gopher protocols) on a +system closer to the requesting site than to the source. Web browsers +can then use the local Squid cache as a proxy HTTP server, reducing +access time as well as bandwidth consumption. + +Why is it called Squid? +

+Harris' Lament says, ``All the good ones are taken.''

+We needed to distinguish this new version from the Harvest +cache software. Squid was the code name for initial +development, and it stuck. + +What is the latest version of Squid? +

+Squid is updated often; please see + +for the most recent versions. + +Who is responsible for Squid? +

+Squid is the result of efforts by numerous individuals from +the Internet community. + +of the National Laboratory for Applied Network Research (funded by +the National Science Foundation) leads code development. +Please see + +for a list of our excellent contributors. + +Where can I get Squid? +

+You can download Squid via FTP from the primary FTP site
+or one of the many worldwide mirror sites.

+Many sushi bars also have Squid. + +What Operating Systems does Squid support? +

+The software is designed to operate on any modern Unix system, and +is known to work on at least the following platforms: + + Linux + FreeBSD + NetBSD + BSDI + OSF and Digital Unix + IRIX + SunOS/Solaris + NeXTStep + SCO Unix + AIX + HP-UX + + + +

+For more specific information, please see +. +If you encounter any platform-specific problems, please +let us know by sending email to +. + +Does Squid run on Windows NT? +

+Recent versions of Squid will compile and run on Windows/NT
+with the GNU-Win32 package.

+ +has ported Squid to Windows NT and sells a supported +version. You can also download the source from +. +Thanks to LogiSense for making the code available as required by the GPL terms. + +

+ +is working on a Windows NT port as well. You can find more information from him +at . + +What Squid mailing lists are available? +

+ + squid-users@ircache.net: general discussions about the +Squid cache software. Subscribe via +, +and also at . + + +squid-users-digest: digested (daily) version of +above. Subscribe via + +squid-announce@ircache.net: A receive-only list for +announcements of new versions. +Subscribe via + + + + + +

+We also have a few other mailing lists which are not strictly +Squid-related. + + + + + + +I can't figure out how to unsubscribe from your mailing list. +

+All of our mailing lists have ``-request'' addresses that you must
+use for subscribe and unsubscribe requests. To unsubscribe from
+the squid-users list, you send a message to
+squid-users-request@ircache.net.

+What Squid web pages are available?

+Several Squid and Caching-related web pages are available:

+for information on the Squid software

+gives information on our operational mesh of caches.

+ (uh, you're reading it).

+.

+.
+Yeah, it's extremely incomplete. I assure you this is the most recent version.

+ ICPv2 -- Protocol
+ ICPv2 -- Application

+Does Squid support SSL?

+Squid can proxy SSL requests. By default, Squid will forward all
+SSL requests directly to their origin servers. In firewall configurations,
+Squid will forward all SSL requests to one other proxy, defined with
+the ssl_proxy configuration option.

+What's the legal status of Squid?

+Squid is copyrighted by the University of California San Diego.
+Squid uses some code developed by others.

+Squid is Free Software.

+Squid is licensed under the terms of the
+GNU General Public License.

+We think so. Squid uses the Unix time format for all internal time
+representations. Potential problem areas are in printing and
+parsing other time representations. We have made the following
+fixes to address the year 2000:

+ timestamps use 4-digit years instead of just 2 digits.

+Year-2000 fixes were applied to the following Squid versions: + + +: +Year parsing bug fixed for dates in the "Wed Jun 9 01:29:59 1993 GMT" +format (Richard Kettlewell). + +squid-1.1.22: +Fixed likely year-2000 bug in ftpget's timestamp parsing (Henrik Nordstrom). + +squid-1.1.20: +Misc fixes (Arjan de Vet). + + +

Patches: + + +. +If you are still running 1.1.X, then you should apply this patch to +your source and recompile. + +. + +. + + +

+Squid-2.2 and earlier versions have a New Year bug. This is not strictly
+a Year-2000 bug; it would happen on the first day of any year.

+Yep. The following companies will support Squid for you:

+We provide commercial Squid support; we
+frequently deploy squids in caching proxy, transparent caching
+proxy, httpd accelerator, and hierarchical modes, for a wide variety
+of corporate and public sector clients, including members of the
+Fortune 500. Inquiries can be directed to .

+We are supporting the complete SA territory and speak Portuguese,
+Spanish, English and German. We are experienced in compiling Squid for
+SCO and FreeBSD. We do custom configurations, OS and cache fine tuning,
+maintenance and remote administration. We can also help integrate your
+server into existing hierarchies for best performance. Contact
+us on or send
+some e-mail to our address.

+ provides commercial Squid support. We are specialized in installing
+Squid Proxy Systems on Linux machines. Please contact us by sending an
+e-mail to .

+ provides a version
+of Squid and an accompanying library modified to support push as well as
+the traditional pull. We support our software and traditional Squid.
+Contact us at info@pushcache.com.

+ support squid, apache,
+linux and other public-license software for professional use in Germany.
+.

+Plugged In Software provides commercial support for Squid, Apache,
+sendmail, Samba, RedHat Linux and other Open Source (TM) software. For
+further information, please see or email our .

+ provides
+managed firewall solutions and support for Squid proxies running
+under linux based firewalls. For further information, contact:

+Garry Thorpe
+Chief Engineer
+NEC Australia
+Ph +61 2 6250 8749
+Email: garry.thorpe@nec.com.au

+Linux Manages! also provides support for squid caches on Linux. We do
+installations, maintenance, linking squids into existing hierarchies,
+remote support and tuning of the cache to the customer's needs.
+Contact address is linux@thing.at.

+We provide commercial support for Squid, SCO UNIX, Cisco, Apache,
+Sendmail and RedHat Linux, especially Squid- and Linux-based transparent
+caches. We also build and provide support for LANs. For further
+information, please email us at: wizards@brain.net.pk

+ATRC provides commercial squid support on Linux in Pakistan.
+Our web address is: and .
+Email : knehal@bigfoot.com

+We provide commercial support for Squid, SCO UNIX, Cisco, Apache, Sendmail and
+any kind of Linux. Find us at: E-mail us at: info@edpweb.com.

+, located
+in Cambridge, Ontario, Canada.

+Snerpa has provided commercial Squid support since 1995. We specialize
+in Linux solutions and we provide the Web content control database for Squid. You can
+contact us at .

+If you know someone who takes money for supporting Squid, let us know and +we will add their information here. + +Squid FAQ contributors +

+The following people have made contributions to this document: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+Please send corrections, updates, and comments to: +. + +About This Document +

+This document is copyrighted (2000) by Duane Wessels. + +

+This document was written in SGML and converted with the +. This document is available in +, +, and +. + +Want to contribute? Please write in SGML... + +

+It is easier for us if you send us text which is close to ``correct'' SGML.
+The SQUID FAQ currently uses the LINUXDOC DTD. It's probably easiest
+to follow the examples in this file.
+Here are the basics:

+Use the <url> tag for links, instead of HTML <A HREF ...> + + <url url="http://www.squid-cache.org" name="Squid Home Page"> + + +

+Use <em> for emphasis, config options, and pathnames: + + <em>usr/local/squid/etc/squid.conf</em> + <em/cache_peer/ + + +

+Here is how you do lists: + + <itemize> + <item>foo + <item>bar + </itemize> + + +

+Use <verb>, just like HTML's <PRE> to show +unformatted text. + + + +Getting and Compiling Squid +

+You must download a source archive file of the form
+squid-x.y.z-src.tar.gz (eg, squid-1.1.6-src.tar.gz) from
+the primary FTP site or one of the mirror sites.
+Context diffs are available for upgrading to new versions.
+These can be applied with the patch program (see below).

+How do I compile Squid?

+For Squid-1.1:

+	% tar xzf squid-1.1.21-src.tar.gz
+	% cd squid-1.1.21
+	% make

+For Squid-2:

+	% tar xzf squid-2.0.RELEASE-src.tar.gz
+	% cd squid-2.0.RELEASE
+	% ./configure
+	% make

+To compile Squid, you will need an ANSI C compiler. Almost all
+modern Unix systems come with pre-installed compilers which work
+just fine.
+If you are uncertain about your system's C compiler, the GNU C compiler
+is available at
+.
+In addition to gcc, you may also want or need to install the

+Do you have pre-compiled binaries available?

+The developers do not have the resources to make pre-compiled +binaries available. Instead, we invest effort into making +the source code very portable. Some people have made +binary packages available. Please see our +. + +

+The site +has pre-compiled packages for SGI IRIX. + +

+Squid binaries for +. + +

+Squid binaries for + + +How do I apply a patch or a diff? +

+You need the patch program. Apply a patch like this:

+	cd squid-1.1.10
+	mkdir ../squid-1.1.11
+	find . -depth -print | cpio -pdv ../squid-1.1.11
+	cd ../squid-1.1.11
+	patch < /tmp/diff-1.1.10-1.1.11

+After the patch has been applied, you must rebuild Squid from the
+very beginning, i.e.:

+	make realclean
+	./configure
+	make
+	make install

+Note: in later distributions (Squid 2), 'realclean' has been changed
+to 'distclean'.

+If patch keeps asking for a file name, try adding ``-p0'': + + patch -p0 < filename + + +

+The configure script can take numerous options. The most
+useful is --prefix, which sets the installation directory;
+the default is /usr/local/squid/. To
+change the default, you could do:

+	% cd squid-x.y.z
+	% ./configure --prefix=/some/other/directory/squid

+Type + + % ./configure --help + +to see all available options. You will need to specify some +of these options to enable or disable certain features. +Some options which are used often include: + + + --prefix=PREFIX install architecture-independent files in PREFIX + [/usr/local/squid] + --enable-dlmalloc[=LIB] Compile & use the malloc package by Doug Lea + --enable-gnuregex Compile GNUregex + --enable-splaytree Use SPLAY trees to store ACL lists + --enable-xmalloc-debug Do some simple malloc debugging + --enable-xmalloc-debug-trace + Detailed trace of memory allocations + --enable-xmalloc-statistics + Show malloc statistics in status page + --enable-carp Enable CARP support + --enable-async-io Do ASYNC disk I/O using threads + --enable-icmp Enable ICMP pinging + --enable-delay-pools Enable delay pools to limit bandwith usage + --enable-mem-gen-trace Do trace of memory stuff + --enable-useragent-log Enable logging of User-Agent header + --enable-kill-parent-hack + Kill parent on shutdown + --enable-snmp Enable SNMP monitoring + --enable-time-hack Update internal timestamp only once per second + --enable-cachemgr-hostname[=hostname] + Make cachemgr.cgi default to this host + --enable-arp-acl Enable use of ARP ACL lists (ether address) + --enable-htpc Enable HTCP protocol + --enable-forw-via-db Enable Forw/Via database + --enable-cache-digests Use Cache Digests + see http://www.squid-cache.org/Doc/FAQ/FAQ-16.html + --enable-err-language=lang + Select language for Error pages (see errors dir) + + +undefined reference to __inet_ntoa + +

+by +and . + +

+Probably you've recently installed bind 8.x. There is a mismatch between +the header files and DNS library that Squid has found. There are a couple +of things you can try. + +

+First, try adding src/Makefile. +If +If that doesn't seem to work, edit your arpa/inet.h file and comment out the following: + + + #define inet_addr __inet_addr + #define inet_aton __inet_aton + #define inet_lnaof __inet_lnaof + #define inet_makeaddr __inet_makeaddr + #define inet_neta __inet_neta + #define inet_netof __inet_netof + #define inet_network __inet_network + #define inet_net_ntop __inet_net_ntop + #define inet_net_pton __inet_net_pton + #define inet_ntoa __inet_ntoa + #define inet_pton __inet_pton + #define inet_ntop __inet_ntop + #define inet_nsap_addr __inet_nsap_addr + #define inet_nsap_ntoa __inet_nsap_ntoa + + +How can I get true DNS TTL info into Squid's IP cache? +

+If you have source for BIND, you can modify it as indicated in the diff +below. It causes the global variable _dns_ttl_ to be set with the TTL +of the most recent lookup. Then, when you compile Squid, the configure +script will look for the _dns_ttl_ symbol in libresolv.a. If found, +dnsserver will return the TTL value for every lookup. +

+This hack was contributed by +. + + +diff -ru bind-4.9.4-orig/res/gethnamaddr.c bind-4.9.4/res/gethnamaddr.c +--- bind-4.9.4-orig/res/gethnamaddr.c Mon Aug 5 02:31:35 1996 ++++ bind-4.9.4/res/gethnamaddr.c Tue Aug 27 15:33:11 1996 +@@ -133,6 +133,7 @@ + } align; + + extern int h_errno; ++int _dns_ttl_; + + #ifdef DEBUG + static void +@@ -223,6 +224,7 @@ + host.h_addr_list = h_addr_ptrs; + haveanswer = 0; + had_error = 0; ++ _dns_ttl_ = -1; + while (ancount-- > 0 && cp < eom && !had_error) { + n = dn_expand(answer->buf, eom, cp, bp, buflen); + if ((n < 0) || !(*name_ok)(bp)) { +@@ -232,8 +234,11 @@ + cp += n; /* name */ + type = _getshort(cp); + cp += INT16SZ; /* type */ +- class = _getshort(cp); +- cp += INT16SZ + INT32SZ; /* class, TTL */ ++ class = _getshort(cp); ++ cp += INT16SZ; /* class */ ++ if (qtype == T_A && type == T_A) ++ _dns_ttl_ = _getlong(cp); ++ cp += INT32SZ; /* TTL */ + n = _getshort(cp); + cp += INT16SZ; /* len */ + if (class != C_IN) { + + +

+And here is a patch for BIND-8: + +*** src/lib/irs/dns_ho.c.orig Tue May 26 21:55:51 1998 +--- src/lib/irs/dns_ho.c Tue May 26 21:59:57 1998 +*************** +*** 87,92 **** +--- 87,93 ---- + #endif + + extern int h_errno; ++ int _dns_ttl_; + + /* Definitions. */ + +*************** +*** 395,400 **** +--- 396,402 ---- + pvt->host.h_addr_list = pvt->h_addr_ptrs; + haveanswer = 0; + had_error = 0; ++ _dns_ttl_ = -1; + while (ancount-- > 0 && cp < eom && !had_error) { + n = dn_expand(ansbuf, eom, cp, bp, buflen); + if ((n < 0) || !(*name_ok)(bp)) { +*************** +*** 404,411 **** + cp += n; /* name */ + type = ns_get16(cp); + cp += INT16SZ; /* type */ +! class = ns_get16(cp); +! cp += INT16SZ + INT32SZ; /* class, TTL */ + n = ns_get16(cp); + cp += INT16SZ; /* len */ + if (class != C_IN) { +--- 406,416 ---- + cp += n; /* name */ + type = ns_get16(cp); + cp += INT16SZ; /* type */ +! class = _getshort(cp); +! cp += INT16SZ; /* class */ +! if (qtype == T_A && type == T_A) +! _dns_ttl_ = _getlong(cp); +! cp += INT32SZ; /* TTL */ + n = ns_get16(cp); + cp += INT16SZ; /* len */ + if (class != C_IN) { + + +My platform is BSD/OS or BSDI and I can't compile Squid +

+ + cache_cf.c: In function `parseConfigFile': + cache_cf.c:1353: yacc stack overflow before `token' + ... + + +

+You may need to upgrade your gcc installation to a more recent version. +Check your gcc version with + + gcc -v + +If it is earlier than 2.7.2, you might consider upgrading. + +

+Alternatively, you can get pre-compiled Squid binaries for BSD/OS 2.1 at
+the ,
+patch .

+Problems compiling

+The following error occurs on Solaris systems using gcc when the Solaris C
+compiler is not installed:

+	/usr/bin/rm -f libmiscutil.a
+	/usr/bin/false r libmiscutil.a rfc1123.o rfc1738.o util.o ...
+	make[1]: *** [libmiscutil.a] Error 255
+	make[1]: Leaving directory `/tmp/squid-1.1.11/lib'
+	make: *** [all] Error 1

+Note on the second line the /usr/bin/false. This is supposed
+to be a path to the ar program.
+To fix this you either need to:

+ Add /usr/ccs/bin to your PATH. This is where the ar command lives.
+ Install the .
+ This package includes programs such as ar.

+I have problems compiling Squid on Platform Foo.

+Please check the list of
+platforms on which Squid is known to compile. Your problem might be listed
+there together with a solution. If it isn't listed there, mail
+us what you are trying, your Squid version, and the problems
+you encounter.

+I see a lot of warnings while compiling Squid.

+Warnings are usually not a big concern, and can be common with software +designed to operate on multiple platforms. If you feel like fixing +compile-time warnings, please do so and send us the patches. + + +Building Squid on OS/2 +

+by + +

+In order to compile squid, you need to have a reasonable facsimile of a
+Unix system installed. This includes
+I made a few modifications to the pristine EMX 0.9d install.

+added defines for
+changed all occurrences of time_t to signed long instead
+of unsigned long

+hacked ld.exe

+ to search for both xxxx.a and libxxxx.a
+ to produce the correct filename when using the
+ -Zexe option

+You will need to run scripts/convert.configure.to.os2 (in the +Squid source distribution) to modify +the configure script so that it can search for the various programs. + +

+Next, you need to set a few environment variables (see EMX docs +for meaning): + + export EMXOPT="-h256 -c" + export LDFLAGS="-Zexe -Zbin -s" + + +

+Now you are ready to configure squid: + + ./configure + +

+Compile everything: + + make + +

+and finally, install: + + make install + +

+This will, by default, install into /usr/local/squid. If you wish
+to install somewhere else, see the --prefix option described above.
+Now, don't forget to set EMXOPT before running squid each time. I
+recommend using the -Y and -N options.

+There are no hard-and-fast rules. The most important resource +for Squid is physical memory. Your processor does not need +to be ultra-fast. Your disk system will be the major bottleneck, +so fast disks are important for high-volume caches. Do not use +IDE disks if you can help it. + +

+In late 1998, if you are buying a new machine for
+a cache, I would recommend the following configuration:

+300 MHz Pentium II CPU
+512 MB RAM
+Five 9 GB UW-SCSI disks

+Your system disk and logfile disk can probably be IDE without losing
+any cache performance.

+Also, see the page by Martin Hamilton. This is a
+very nice page summarizing system configurations people are using for
+large Squid caches.

+After compiling Squid, you can install it
+with this simple command:

+	% make install

+If you have enabled the

+then you will also want to type

+	% su
+	# make install-pinger

+After installing, you will want to edit and customize +the /usr/local/squid/etc/squid.conf. + +
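+As an illustration, a very small starting configuration might look like
+the sketch below. The directive names are standard, but treat the values
+as assumptions to adjust for your site, and note that the exact cache_dir
+syntax varies a little between Squid versions:

+	# port for client HTTP requests
+	http_port 3128
+	# 100 MB cache, 16 first-level and 256 second-level directories
+	cache_dir /usr/local/squid/cache 100 16 256
+	# where complaints about the cache should go
+	cache_mgr webmaster@mydomain.com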

+Also, a QUICKSTART guide has been included with the source +distribution. Please see the directory where you +unpacked the source archive. + +What does the +The Do you have a +Yes, after you + +How do I start Squid? +

+After you've finished editing the configuration file, you can +start Squid for the first time. The procedure depends a little +bit on which version you are using. + +Squid version 2.X +

+First, you must create the swap directories. Do this by +running Squid with the -z option: + + % /usr/local/squid/bin/squid -z + +Once that completes, you can start Squid and try it out. +Probably the best thing to do is run it from your terminal +and watch the debugging output. Use this command: + + % /usr/local/squid/bin/squid -NCd1 + +If everything is working okay, you will see the line: + + Ready to serve requests. + +If you want to run squid in the background, as a daemon process, +just leave off all options: + + % /usr/local/squid/bin/squid + +

+NOTE: depending on your configuration, you may need to start +squid as root. + +Squid version 1.1.X + +

+With version 1.1.16 and later, you must first run Squid with the -z
+option to create the swap directories:

+	% /usr/local/squid/bin/squid -z

+Squid will exit when it finishes creating all of the directories.
+Next you can start RunCache:

+	% /usr/local/squid/bin/RunCache &

+For versions before 1.1.6 you should just start How do I start Squid automatically when the system boots? + +Squid Version 2.X + +

+Squid-2 has a restart feature built in. This greatly simplifies +starting Squid and means that you don't need to use + /usr/local/squid/bin/squid + + +

+Squid will automatically background itself and then spawn
+a child process. In your

+	Sep 23 23:55:58 kitty squid[14616]: Squid Parent: child process 14617 started

+That means that process ID 14616 is the parent process which monitors the child
+process (pid 14617). The child process is the one that does all of the
+work. The parent process just waits for the child process to exit. If the
+child process exits unexpectedly, the parent will automatically start another
+child process. In that case,

+	Sep 23 23:56:02 kitty squid[14616]: Squid Parent: child process 14617 exited with status 1
+	Sep 23 23:56:05 kitty squid[14616]: Squid Parent: child process 14619 started

+If there is some problem, and Squid can not start, the parent process will give up +after a while. Your + Sep 23 23:56:12 kitty squid[14616]: Exiting due to repeated, frequent failures + +When this happens you should check your +When you look at a process ( + 24353 ?? Ss 0:00.00 /usr/local/squid/bin/squid + 24354 ?? R 0:03.39 (squid) (squid) + +The first is the parent process, and the child process is the one called ``(squid)''. +Note that if you accidentally kill the parent process, the child process will not +notice. + +

+If you want to run Squid from your terminal and prevent it from
+backgrounding and spawning a child process, use the -N command
+line option:

+	/usr/local/squid/bin/squid -N

+On systems which have an /etc/inittab file (Digital Unix, +Solaris, IRIX, HP-UX, Linux), you can add a line like this: + + sq:3:respawn:/usr/local/squid/bin/squid.sh < /dev/null >> /tmp/squid.log 2>&1 + +We recommend using a + #!/bin/sh + C=/usr/local/squid + PATH=/usr/bin:$C/bin + TZ=PST8PDT + export PATH TZ + + notify="root" + cd $C + umask 022 + sleep 10 + while [ -f /tmp/nosquid ]; do + sleep 1 + done + /usr/bin/tail -20 $C/logs/cache.log \ + | Mail -s "Squid restart on `hostname` at `date`" $notify + exec bin/squid -CYs + + +From rc.local +

+On BSD-ish systems, you will need to start Squid from the ``rc'' files, +usually /etc/rc.local. For example: + + if [ -f /usr/local/squid/bin/RunCache ]; then + echo -n ' Squid' + (/usr/local/squid/bin/RunCache &) + fi + + +From init.d +

+Some people may want to use the ``init.d'' startup system. +If you start Squid (or RunCache) from an ``init.d'' script, then you +should probably use + nohup squid -sY $conf >> $logdir/squid.out 2>&1 + +Also, you may need to add a line to trap certain signals +and prevent them from being sent to the Squid process. +Add this line at the top of your script: + + trap '' 1 2 3 18 + + +How do I tell if Squid is running? +

+You can use the client program that comes with Squid:

+	% client http://www.netscape.com/ > test

+There are other command-line HTTP client programs available +as well. Two that you may find useful are + +and +. + +

+Another way is to use Squid itself to see if it can signal a running +Squid process: + + % squid -k check + +And then check the shell's exit status variable. + +
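+For example, a minimal shell wrapper around the exit status might look
+like this sketch (binary path as used elsewhere in this FAQ):

+	#!/bin/sh
+	# squid -k check exits with status 0 when a running Squid
+	# process accepts the signal check, non-zero otherwise.
+	if /usr/local/squid/bin/squid -k check 2>/dev/null; then
+		echo "Squid is running"
+	else
+		echo "Squid is NOT running"
+	fi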

+Also, check the log files, most importantly the +These are the command line options for + + +How do I see how Squid works? +

+ + +Check the +Install and use the +. + + + + + +Configuration issues + +How do I join a cache hierarchy? +

+To place your cache in a hierarchy, use the +For example, the following + # squid.conf - On the host: childcache.example.com + # + # Format is: hostname type http_port udp_port + # + cache_host parentcache.example.com parent 3128 3130 + cache_host childcache2.example.com sibling 3128 3130 + cache_host childcache3.example.com sibling 3128 3130 + + +The + # squid.conf - On the host: sv.cache.nlanr.net + # + # Format is: hostname type http_port udp_port + # + + cache_host electraglide.geog.unsw.edu.au parent 3128 3130 + cache_host cache1.nzgate.net.nz parent 3128 3130 + cache_host pb.cache.nlanr.net parent 3128 3130 + cache_host it.cache.nlanr.net parent 3128 3130 + cache_host sd.cache.nlanr.net parent 3128 3130 + cache_host uc.cache.nlanr.net sibling 3128 3130 + cache_host bo.cache.nlanr.net sibling 3128 3130 + cache_host_domain electraglide.geog.unsw.edu.au .au + cache_host_domain cache1.nzgate.net.nz .au .aq .fj .nz + cache_host_domain pb.cache.nlanr.net .uk .de .fr .no .se .it + cache_host_domain it.cache.nlanr.net .uk .de .fr .no .se .it + cache_host_domain sd.cache.nlanr.net .mx .za .mu .zm + + +The configuration above indicates that the cache will use +How do I join NLANR's cache hierarchy? +

+We have a simple set of + +the NLANR cache hierarchy. + +Why should I want to join NLANR's cache hierarchy? +

+The NLANR hierarchy can provide you with an initial source for parent or +sibling caches. Joining the NLANR global cache system will frequently +improve the performance of your caching service. + +How do I register my cache with NLANR's registration service? +

+Just enable these options in your + cache_announce 24 + announce_to sd.cache.nlanr.net:3131 + + + + +How do I find other caches close to me and arrange parent/child/sibling relationships with them? +

+Visit the NLANR cache + +to discover other caches near you. Keep in mind that just because +a cache is registered in the database + +My cache registration is not appearing in the Tracker database. + +

+ + +Your site will not be listed if your cache IP address does not have +a DNS PTR record. If we can't map the IP address back to a domain +name, it will be listed as ``Unknown.'' + +The registration messages are sent with UDP. We may not be receiving +your announcement message due to firewalls which block UDP, or +dropped packets due to congestion. + + +What is the httpd-accelerator mode? +

+This entry has been moved to . + +How do I configure Squid to work behind a firewall? +

+Note: The information here is current for version 2.2. + +

+If you are behind a firewall then you can't make direct connections
+to the outside world, so you
+You can use the

+	acl INSIDE dstdomain mydomain.com
+	never_direct deny INSIDE

+Note that the outside domains will not match the
+You could also specify internal servers by IP address:

+	acl INSIDE_IP dst 1.2.3.4/24
+	never_direct deny INSIDE_IP

+Note, however, that when you use IP addresses, Squid must
+perform a DNS lookup to convert URL hostnames to an
+address. Your internal DNS servers may not be able to
+look up external domains.

+If you use

+	cache_peer xyz.mydomain.com parent 3128 0 default

+How do I configure Squid to forward all requests to another proxy?

+Note: The information here is current for version 2.2. +

+First, you need to give Squid a parent cache. Second, you need
+to tell Squid it cannot connect directly to origin servers. This is done
+with three configuration file lines:

+	cache_peer parentcache.foo.com parent 3128 0 no-query default
+	acl all src 0.0.0.0/0.0.0.0
+	never_direct allow all

+Note that with this configuration, if the parent cache fails or becomes
+unreachable, then every request will result in an error message.

+In case you want to be able to use direct connections when all the +parents go down you should use a different approach: + + cache_peer parentcache.foo.com parent 3128 0 no-query + prefer_direct off + +The default behaviour of Squid in the absence of positive ICP, HTCP, etc +replies is to connect to the origin server instead of using parents. +The prefer_direct off directive tells Squid to try parents first. + +I have +The +It's very important that there are enough My +First, find out if you have enough +Another factor which affects the How can I easily change the default HTTP port? +

+Before you run the configure script, simply set the CACHE_HTTP_PORT
+environment variable:

+	setenv CACHE_HTTP_PORT 8080
+	./configure
+	make
+	make install

+Is it possible to control how big each
+With Squid-1.1 it is NOT possible. Each

+What
+Most people have a disk partition dedicated to the Squid cache.
+You don't want to use the entire partition size. You have to leave
+some extra room. Currently, Squid is not very tolerant of running
+out of disk space.

+Let's say you have a 9GB disk.
+Remember that disk manufacturers lie about the space available.
+A so-called 9GB disk usually results in about 8.5GB of raw, usable space.
+First, put a filesystem on it, and mount
+it. Then check the ``available space'' with your df program.
+Next, I suggest taking off another 10%
+or so for Squid overheads, and a ``safe buffer.'' Squid normally puts
+its

+cache_dir ... 7000 16 256
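+To make the arithmetic concrete, here is a sketch of the calculation
+for the example above (the df output is invented for illustration):

+	% df -k /cache
+	Filesystem   kbytes    used   avail capacity  Mounted on
+	/dev/sd1a   8820000       0 8820000     0%    /cache

+About 8.4GB is available; taking off roughly 10% for overhead and a
+safety buffer leaves about 7.6GB, which rounds down conservatively to
+the 7000 (megabytes) used in the cache_dir line above.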

+It's better to start out conservative. After the cache becomes full,
+look at the disk usage. If you think there is plenty of unused space,
+then increase the cache_dir setting a little.
+If you're getting ``disk full'' write errors, then you definitely need
+to decrease your cache size.

+I'm adding a new cache_dir. Will I lose my existing cache?

+With Squid-2, you will not lose your existing cache.
+You can add and delete

+Squid and

+Several people on both the .
+The most elegant way in my opinion is to run an internal Squid caching
+proxy server which handles client requests and let this server forward
+its requests to the http-gw running on the firewall. Cache hits won't
+need to be handled by the firewall.

+In this example Squid runs on the same server as the http-gw, Squid uses
+port 8000 and http-gw uses port 8080 (web). The local domain is home.nl.

+Firewall configuration:

+Either run http-gw as a daemon from the /etc/rc.d/rc.local (Linux +Slackware): + + exec /usr/local/fwtk/http-gw -daemon 8080 + +or run it from inetd like this: + + web stream tcp nowait.100 root /usr/local/fwtk/http-gw http-gw + +I increased the watermark to 100 because a lot of people run into +problems with the default value. + +

+Make sure you have at least the following line in +/usr/local/etc/netperm-table: + + http-gw: hosts 127.0.0.1 + +You could add the IP-address of your own workstation to this rule and +make sure the http-gw by itself works, like: + + http-gw: hosts 127.0.0.1 10.0.0.1 + + +Squid configuration: + +

+The following settings are important: + + + http_port 8000 + icp_port 0 + + cache_host localhost.home.nl parent 8080 0 default + acl HOME dstdomain home.nl + never_direct deny HOME + +This tells Squid to use the parent for all domains other than +872739961.631 1566 10.0.0.21 ERR_CLIENT_ABORT/304 83 GET http://www.squid-cache.org/ - DEFAULT_PARENT/localhost.home.nl - +872739962.976 1266 10.0.0.21 TCP_CLIENT_REFRESH/304 88 GET http://www.nlanr.net/Images/cache_now.gif - DEFAULT_PARENT/localhost.home.nl - +872739963.007 1299 10.0.0.21 ERR_CLIENT_ABORT/304 83 GET http://www.squid-cache.org/Icons/squidnow.gif - DEFAULT_PARENT/localhost.home.nl - +872739963.061 1354 10.0.0.21 TCP_CLIENT_REFRESH/304 83 GET http://www.squid-cache.org/Icons/Squidlogo2.gif - DEFAULT_PARENT/localhost.home.nl + + +

+http-gw entries in syslog: + + +Aug 28 02:46:00 memo http-gw[2052]: permit host=localhost/127.0.0.1 use of gateway (V2.0beta) +Aug 28 02:46:00 memo http-gw[2052]: log host=localhost/127.0.0.1 protocol=HTTP cmd=dir dest=www.squid-cache.org path=/ +Aug 28 02:46:01 memo http-gw[2052]: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=1 +Aug 28 02:46:01 memo http-gw[2053]: permit host=localhost/127.0.0.1 use of gateway (V2.0beta) +Aug 28 02:46:01 memo http-gw[2053]: log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.squid-cache.org path=/Icons/Squidlogo2.gif +Aug 28 02:46:01 memo http-gw[2054]: permit host=localhost/127.0.0.1 use of gateway (V2.0beta) +Aug 28 02:46:01 memo http-gw[2054]: log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.squid-cache.org path=/Icons/squidnow.gif +Aug 28 02:46:01 memo http-gw[2055]: permit host=localhost/127.0.0.1 use of gateway (V2.0beta) +Aug 28 02:46:01 memo http-gw[2055]: log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.nlanr.net path=/Images/cache_now.gif +Aug 28 02:46:02 memo http-gw[2055]: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=1 +Aug 28 02:46:03 memo http-gw[2053]: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=2 +Aug 28 02:46:04 memo http-gw[2054]: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=3 + + + +

+To summarize: + +

+Advantages:

+http-gw allows you to selectively block ActiveX and Java, and its
+primary design goal is security.

+The firewall doesn't need to run large applications like Squid.

+The internal Squid-server still gives you the benefit of caching.

+Disadvantages: + + +The internal Squid proxyserver can't (and shouldn't) work with other +parent or neighbor caches. + +Initial requests are slower because these go through http-gw, http-gw +also does reverse lookups. Run a nameserver on the firewall or use an +internal nameserver. + + + + +-- + + + +What is ``HTTP_X_FORWARDED_FOR''? Why does squid provide it to WWW servers, and how can I stop it? + +

+When a proxy-cache is used, a server does not see the connection
+coming from the originating client. Many people like to implement
+access controls based on the client address.
+To accommodate these people, Squid adds its own request header
+called "X-Forwarded-For" which looks like this:

+	X-Forwarded-For: 128.138.243.150, unknown, 192.52.106.30

+Entries are always IP addresses, or the word ``unknown'' when the
+address could not be determined.

+We must note that access controls based on this header are extremely
+weak and simple to fake. Anyone may hand-enter a request with any IP
+address whatsoever. This is perhaps the reason why client IP addresses
+have been omitted from the HTTP/1.1 specification.

+Yes it can, however the way of doing it has changed from earlier versions +of squid. As of squid-2.2 a more customisable method has been introduced. +Please follow the instructions for the version of squid that you are using. +As a default, no anonymizing is done. + +

+If you choose to use the anonymizer you might wish to investigate the forwarded_for
+option to prevent the client address being disclosed. Failure to turn off the
+forwarded_for option will reduce the effectiveness of the anonymizer. Finally,
+if you filter the User-Agent header, the fake_user_agent option can
+prevent some user problems, as some sites require the User-Agent header.
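+As a minimal sketch, the relevant squid.conf lines for this might be
+the following (the User-Agent string is an arbitrary example):

+	forwarded_for off
+	fake_user_agent Nutscrape/1.0 (CP/M; 8-bit)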

+With the introduction of squid 2.2 the anonymizer has become more customisable.
+It now allows specification of exactly which headers will be allowed to pass.

+The new anonymizer uses the 'anonymize_headers' tag. It has two modes:
+deny all and allow the specified headers, or allow all and deny the
+specified headers. The following example will simulate the old
+paranoid mode.

+	anonymize_headers allow Allow Authorization Cache-Control
+	anonymize_headers allow Content-Encoding Content-Length
+	anonymize_headers allow Content-Type Date Expires Host
+	anonymize_headers allow If-Modified-Since Last-Modified
+	anonymize_headers allow Location Pragma Accept Charset
+	anonymize_headers allow Accept-Encoding Accept-Language
+	anonymize_headers allow Content-Language Mime-Version
+	anonymize_headers allow Retry-After Title Connection
+	anonymize_headers allow Proxy-Connection

+This will prevent any headers other than those listed from being passed by the
+proxy.

+The second mode is 'allow' all and deny the specified headers. The example +replicates the old standard mode. + + + anonymize_headers deny From Referer Server + anonymize_headers deny User-Agent WWW-Authenticate Link + + +It allows all headers to pass unless they are listed. + +

+You cannot mix allow and deny in a squid configuration; it is either one
+or the other!

+There are three modes, selected with the http_anonymizer
+configuration option.

+With no anonymizing (the default), Squid forwards all request headers +as received from the client, to the origin server (subject to the regular +rules of HTTP). +

+In the standard mode, Squid filters out the following specific
+headers:

+From:
+Referer:
+Server:
+User-Agent:
+WWW-Authenticate:
+Link:

+In the paranoid mode, Squid allows only the following specific
+headers through:

+Allow:
+Authorization:
+Cache-Control:
+Content-Encoding:
+Content-Length:
+Content-Type:
+Date:
+Expires:
+Host:
+If-Modified-Since:
+Last-Modified:
+Location:
+Pragma:
+Accept:
+Accept-Charset:
+Accept-Encoding:
+Accept-Language:
+Content-Language:
+Mime-Version:
+Retry-After:
+Title:
+Connection:
+Proxy-Connection:

+References: + + + + + + +Communication between browsers and Squid + +

+Most web browsers available today support proxying and are easily configured +to use a Squid server as a proxy. Some browsers support advanced features +such as lists of domains or URL patterns that shouldn't be fetched through +the proxy, or JavaScript automatic proxy configuration. + +Netscape manual configuration +

+Select
+Here is a screen shot of the Netscape Navigator manual proxy
+configuration screen.

+ +Netscape automatic configuration +

+Netscape Navigator's proxy configuration can be automated with +JavaScript (for Navigator versions 2.0 or higher). Select + +Here is a + +of the Netscape Navigator automatic proxy configuration screen. + +You may also wish to consult Netscape's documentation for the Navigator + + +

+Here is a sample auto configuration JavaScript from Oskar Pearson: + +//We (www.is.co.za) run a central cache for our customers that they +//access through a firewall - thus if they want to connect to their intranet +//system (or anything in their domain at all) they have to connect +//directly - hence all the "fiddling" to see if they are trying to connect +//to their local domain. + +//Replace each occurrence of company.com with your domain name +//and if you have some kind of intranet system, make sure +//that you put it's name in place of "internal" below. + +//We also assume that your cache is called "cache.company.com", and +//that it runs on port 8080. Change it down at the bottom. + +//(C) Oskar Pearson and the Internet Solution (http://www.is.co.za) + + function FindProxyForURL(url, host) + { + //If they have only specified a hostname, go directly. + if (isPlainHostName(host)) + return "DIRECT"; + + //These connect directly if the machine they are trying to + //connect to starts with "intranet" - ie http://intranet + //Connect directly if it is intranet.* + //If you have another machine that you want them to + //access directly, replace "internal*" with that + //machine's name + if (shExpMatch( host, "intranet*")|| + shExpMatch(host, "internal*")) + return "DIRECT"; + + //Connect directly to our domains (NB for Important News) + if (dnsDomainIs( host,"company.com")|| + //If you have another domain that you wish to connect to + //directly, put it in here + dnsDomainIs(host,"sistercompany.com")) + return "DIRECT"; + + //So the error message "no such host" will appear through the + //normal Netscape box - less support queries :) + if (!isResolvable(host)) + return "DIRECT"; + + //We only cache http, ftp and gopher + if (url.substring(0, 5) == "http:" || + url.substring(0, 4) == "ftp:"|| + url.substring(0, 7) == "gopher:") + + //Change the ":8080" to the port that your cache + //runs on, and "cache.company.com" to the machine that + //you run the cache on + return "PROXY cache.company.com:8080; DIRECT"; + + //We don't cache WAIS + if (url.substring(0, 5) == "wais:") + return "DIRECT"; + + else + return "DIRECT"; + } + + +Lynx and Mosaic configuration +

+For Mosaic and Lynx, you can set environment variables +before starting the application. For example (assuming csh or tcsh): +

+ + % setenv http_proxy http://mycache.example.com:3128/ + % setenv gopher_proxy http://mycache.example.com:3128/ + % setenv ftp_proxy http://mycache.example.com:3128/ + +

+For Lynx you can also edit the + http_proxy:http://mycache.example.com:3128/ + ftp_proxy:http://mycache.example.com:3128/ + gopher_proxy:http://mycache.example.com:3128/ + + +Redundant Auto-Proxy-Configuration + +

+There's one nasty side-effect to using auto-proxy scripts: if you start +the web browser it will try and load the auto-proxy-script. + +

+If your script isn't available either because the web server hosting the +script is down or your workstation can't reach the web server (e.g. +because you're working off-line with your notebook and just want to +read a previously saved HTML-file) you'll get different errors depending +on the browser you use. + +

+The Netscape browser will just return an error after a timeout (after +that it tries to find the site 'www.proxy.com' if the script you use is +called 'proxy.pac'). + +

+Microsoft Internet Explorer, on the other hand, won't even start: no
+window displays, and only after about a minute does it display a window
+asking whether to go on with or without proxy configuration.

+The point is that your workstations always need to locate the +proxy-script. I created some extra redundancy by hosting the script on +two web servers (actually Apache web servers on the proxy servers +themselves) and adding the following records to my primary nameserver: + + proxy CNAME proxy1 + CNAME proxy2 + +The clients just refer to 'http://proxy/proxy.pac'. This script looks like this: + + +function FindProxyForURL(url,host) +{ + // Hostname without domainname or host within our own domain? + // Try them directly: + // http://www.domain.com actually lives before the firewall, so + // make an exception: + if ((isPlainHostName(host)||dnsDomainIs( host,".domain.com")) && + !localHostOrDomainIs(host, "www.domain.com")) + return "DIRECT"; + + // First try proxy1 then proxy2. One server mostly caches '.com' + // to make sure both servers are not + // caching the same data in the normal situation. The other + // server caches the other domains normally. + // If one of 'm is down the client will try the other server. + else if (shExpMatch(host, "*.com")) + return "PROXY proxy1.domain.com:8080; PROXY proxy2.domain.com:8081; DIRECT"; + return "PROXY proxy2.domain.com:8081; PROXY proxy1.domain.com:8080; DIRECT"; +} + + +

+I made sure every client domain has the appropriate 'proxy' entry. +The clients are automatically configured with two nameservers using +DHCP. + + +-- + + + +Microsoft Internet Explorer configuration +

+Select
+Here is a screen shot of the Internet Explorer proxy
+configuration screen.

+Microsoft is also starting to support Netscape-style JavaScript +automated proxy configuration. As of now, only MSIE version 3.0a +for Windows 3.1 and Windows NT 3.51 supports this feature (i.e., +as of version 3.01 build 1225 for Windows 95 and NT 4.0, the feature +was not included). +

+If you have a version of MSIE that does have this feature, select

+Netmanage Internet Chameleon WebSurfer configuration

+Netmanage WebSurfer supports manual proxy configuration and exclusion +lists for hosts or domains that should not be fetched via proxy +(this information is current as of WebSurfer 5.0). Select + +Take a look at this + +if the instructions confused you. +

+On the same configuration window, you'll find a button to bring up +the exclusion list dialog box, which will let you enter some hosts +or domains that you don't want fetched via proxy. It should be +self-explanatory, but you might look at this + +just for fun anyway. + +Opera 2.12 proxy configuration + +

+Select +Notes: + + +Opera 2.12 doesn't support gopher on its own, but requires a proxy; therefore +Squid's gopher proxying can extend the utility of your Opera immensely. + +Unfortunately, Opera 2.12 chokes on some HTTP requests, for example +. +At the moment I think it has something to do with cookies. If you have +trouble with a site, try disabling the HTTP proxying by unchecking +that protocol in the + + +-- + + +How do I tell Squid to use a specific username for FTP urls? + +

+Insert your username in the host part of the URL, for example: + + ftp://joecool@ftp.foo.org/ + +Squid should then prompt you for your account password. Alternatively, +you can specify both your username and password in the URL itself: + + ftp://joecool:secret@ftp.foo.org/ + +However, we certainly do not recommend this, as it could be very +easy for someone to see or grab your password. + +Configuring Browsers for WPAD +

+by +

+You may like to start by reading the + +that describes WPAD. + +

+After reading the 8 steps below, if you don't understand any of the
+terms or methods mentioned, you probably shouldn't be doing this.
+Implementing wpad requires you to be familiar with:

+ web server installations and modifications.
+ squid proxy server (or others) installation etc.
+ Domain Name System maintenance etc.

+Please don't bombard the squid list with web server or dns questions. See
+your system administrator, or do some more research on those topics.

+This is not a recommendation for any product or version. As far as I +know IE5 is the only browser out now implementing wpad. I think wpad +is an excellent feature that will return several hours of life per month. +Hopefully, all browser clients will implement it as well. But it will take +years for all the older browsers to fade away though. + +

+I have only focused on the domain name method, to the exclusion of the
+DHCP method. I think the dns method might be easier for most people.
+I don't currently, and may never, fully understand wpad and IE5, but this
+method worked for me.
+But if you'd rather just have a go ...

+ Create a standard . The sample provided there is more than
+ adequate to get you going. No doubt all the other load balancing
+ and backup scripts will be fine also.

+ Store the resultant file in the document root directory of a
+ handy web server as wpad.dat.

+ notes that you should be able to use an HTTP redirect if you
+ want to store the wpad.dat file somewhere else. You can probably
+ even redirect wpad.dat to an existing proxy.pac:

+Redirect /wpad.dat http://racoon.riga.lv/proxy.pac

+ If you do nothing more, a url like
+ http://www.your.domain.name/wpad.dat should bring up
+ the script text in your browser window.

+ Insert the following mime type entry into your web server
+ configuration:

+application/x-ns-proxy-autoconfig dat

+ And then restart your web server, for the new mime type to work.

+ Assuming Internet Explorer 5, set the Auto Config Script location
+ to http://www.your.domain.name/wpad.dat. Test
+ that all works as per your script and network. There's no point
+ continuing until this works ...

+ Create/install/implement a DNS record so that
+ wpad.your.domain.name resolves to the host above where
+ you have a functioning auto config script running. You should
+ now be able to use http://wpad.your.domain.name/wpad.dat
+ as the Auto Config Script location in step 5 above.

+ And finally, go back to the setup screen detailed in 5 above,
+ and choose nothing but the
+ One final question might be 'Which domain name does the client
+ (IE5) use for the wpad... lookup?' It uses the hostname from
+ the control panel setting. It starts the search by adding the
+ hostname "WPAD" to the current fully-qualified domain name. For
+ instance, a client in a.b.Microsoft.com would search for a WPAD
+ server at wpad.a.b.microsoft.com. If it could not locate one,
+ it would remove the bottom-most domain and try again; for
+ instance, it would try wpad.b.microsoft.com next. IE 5 would
+ stop searching when it found a WPAD server or reached the
+ third-level domain, wpad.microsoft.com.

+Anybody using these steps to install and test, please feel free to make +notes, corrections or additions for improvements, and post back to the +squid list... + +

+There are probably many more tricks and tips which hopefully will be +detailed here in the future. Things like IE 5.0x crops trailing slashes from FTP URL's +

+by +

+There was a bug in the 5.0x releases of Internet Explorer in which IE +cropped any trailing slash off an FTP URL. The URL showed up correctly in +the browser's ``Address:'' field, however squid logs show that the trailing +slash was being taken off. +

+An example of where this impacted squid is if you had a setup where squid
+would go direct for FTP directory listings but forward a request to a
+parent for FTP file transfers. This was useful if your upstream proxy was
+an older version of Squid or another vendor's software which displayed
+directory listings with broken icons and you wanted your own local version
+of squid to generate proper FTP directory listings instead.
+The workaround for this is to add a double slash to any directory listing
+in which the slash was important, or else upgrade to IE 5.5. (Or use Netscape)

+The logs are a valuable source of information about Squid workloads and
+performance. The logs record not only access information, but also system
+configuration errors and resource consumption (eg, memory, disk
+space). There are several log files maintained by Squid. Some have to be
+explicitly activated during compile time, while others can safely be
+deactivated during run-time.

+There are a few basic points common to all log files. The time stamps +logged into the log files are usually UTC seconds unless stated otherwise. +The initial time stamp usually contains a millisecond extension. + +

+The frequent time lookups on busy caches may have a performance impact on
+some systems. The compile time configuration option

+If you run your Squid from the
+The
+From the area of automatic log file analysis, the
+The user agent log file is only maintained if

+ you configured the --enable-useragent-log compile time option, and
+ you pointed the useragent_log configuration option to a file.

+From the user agent log file you are able to find out about the
+distribution of your clients' browsers. Using this option in conjunction
+with a loaded production squid might not be the best of all ideas.

+The
+The
+The print format for a store log entry (one line) consists of eleven
+space-separated columns, compare with the src/store_log.c:

+	"%9d.%03d %-7s %08X %4d %9d %9d %9d %s %d/%d %s %s\n"

+The timestamp when the line was logged in UTC with a millisecond fraction.

+The action the object was submitted to, compare with src/store_log.c:

+).

+The file number for the object storage file. Please note that the path to +this file is calculated according to your +A file number of +The HTTP reply status code. + + +

+The value of the HTTP "Date: " reply header. + + +

+The value of the HTTP "Last-Modified: " reply header. + + +

+The value of the HTTP "Expires: " reply header. + + +The HTTP "Content-Type" major value, or "unknown" if it cannot be +determined. + + +This column consists of two slash separated fields: + + +The advertised content length from the HTTP "Content-Length: " reply + header. +The size actually read. + + +

+If the advertised (or expected) length is missing, it will be set to
zero. If the advertised length is not zero, but not equal to the real
length, the object will be released from the cache.

The request method for the object, e.g. GET.

+The key to the object, usually the URL. + + +

+The timestamps in the date, last-modified, and expires columns are all
expressed in UTC seconds. The actual values are parsed from the HTTP reply
headers. An unparsable header is represented by a value of -1, and a
missing header is represented by a value of -2.
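+For illustration, here is a made-up store.log entry matching the format
shown above; all of the values are hypothetical:

 874881783.113 RELEASE 0000F8E2  200 874881782 874881781 874881790 text/html 8822/8822 GET http://www.example.com/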

+The key column usually contains just the URL of
the object. Some objects though will never become public. Thus the key is
said to include a unique integer number and the request method in addition
to the URL.

This logfile exists for Squid-1.0 only. The format is

  [date] URL peerstatus peerhost

Most log file analysis programs are based on the entries in the access
log.

The common log file format contains different (and less) information than
the native log file format. The native format contains more information
for the admin interested in cache evaluation.

The common log file format

is used by numerous HTTP servers. This format consists of the following
seven fields:

  remotehost rfc931 authuser [date] "method URL" status bytes

+It is parsable by a variety of tools. The common format contains different
information than the native log file format: the HTTP version is logged in
the common format, but not in the native log file format.
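+A hypothetical entry in common log file format, for illustration only:

 192.168.1.98 - - [09/Nov/1999:12:00:00 +0000] "GET http://www.example.com/ HTTP/1.0" 200 8822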

+For Squid-1.1, the information from the hierarchy log was folded into the
native access.log:

  time elapsed remotehost code/status bytes method URL rfc931 peerstatus/peerhost type

+For Squid-2 the columns stay the same, though the content within may change +a little. + +

+The native log file format logs more and different information than the
common log file format: the request duration, some timeout information,
the next upstream server address, and the content type.

There exist tools which convert one file format into the other. Please
bear in mind that even though the log formats share most information, both
formats contain information which is not part of the other format, and
thus this part of the information is lost when converting. In particular,
converting back and forth is not possible without loss.

It is recommended though to use Squid's native log format due to its
greater amount of information made available for later analysis. The print
format line for native access.log entries looks like this:

  "%9d.%03d %6d %s %s/%03d %d %s %s %s %s%s/%s %s"
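+Here is a made-up native access.log entry to match the format above;
the values are hypothetical:

 942148800.113   1523 192.168.1.98 TCP_MISS/200 8822 GET http://www.example.com/ - DIRECT/www.example.com text/html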

+The columns of a native access.log entry are described below.

A Unix timestamp as UTC seconds with a millisecond resolution. You
can convert Unix timestamps into something more human readable using
this short perl script:

  #! /usr/bin/perl -p
  s/^\d+\.\d+/localtime $&/e;

The elapsed time considers how many milliseconds the transaction
busied the cache. It differs in interpretation between TCP and UDP:

For HTTP/1.0, this is basically the time between accept() and close().
For persistent connections, this ought to be the time between
  scheduling the reply and finishing sending it.
For ICP, this is the time between scheduling a reply and actually
  sending it.
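+To apply the timestamp conversion script shown above, save it as (say)
convert.pl and run it over a log file; the file name and the output line
here are illustrative:

 % chmod +x convert.pl
 % ./convert.pl access.log | head -1
 Tue Nov  9 12:00:00 1999   1523 192.168.1.98 TCP_MISS/200 8822 GET http://www.example.com/ - DIRECT/www.example.com text/html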

+Please note that the entries are logged when the request is finished, not
at the time it arrived.
The IP address of the requesting instance, the client IP address. The
client_netmask configuration option can mask out parts of the client
address for privacy reasons, at the cost of making analysis harder.
Also, the log_fqdn option will log the client's fully qualified domain
name instead, if it is available.
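+For example, to zero the host part of every logged client address with
client_netmask:

 client_netmask 255.255.255.0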

+This column is made up of two entries separated by a slash. This column
encodes the transaction result:

The cache result of the request contains information on the kind of
request, how it was satisfied, or in what way it failed. Please refer
to the section on Squid result codes below for valid symbolic result
codes.

Several codes from older versions are no longer available, were
 renamed, or split. Refer to the section on Squid result codes for
 details on the codes no longer available in Squid-2.

The NOVM versions and Squid-2 also rely on the Unix buffer cache.
The status part contains the HTTP result codes with some Squid specific
 extensions. Squid uses a subset of the RFC defined error codes for
 HTTP. Refer to the section on HTTP status codes below
 for details of the status codes recognized by a Squid-2.

The size is the amount of data delivered to the client. Mind that this
does not constitute the net object size, as headers are also counted.
Also, failed requests may deliver an error page, the size of which is
also logged here.

The request method to obtain an object. Please refer to the section on
request methods below for available methods.

This column contains the URL requested. Please note that the log file
may contain whitespace in the URI; the uri_whitespace configuration
option controls how such URIs are handled.

The eighth column may contain the ident lookup for the requesting
client. Since ident lookups have a performance impact, the default
configuration turns ident_lookup off.

The hierarchy information consists of three items:

Any hierarchy tag may be prefixed with TIMEOUT_, if the timeout
  occurred waiting for all ICP replies to arrive from neighbors.
A code that explains how the request was handled, e.g. by
  forwarding it to a peer, or going straight to the source. Refer to
  the section on hierarchy codes below for details on hierarchy codes and
  removed hierarchy codes.
The name of the host the object was requested from. This host may
  be the origin site, a parent or any other peer. Also note that the
  hostname may be numerical.

  The content type of the object as seen in the HTTP reply
  header. Please note that ICP exchanges usually don't have any content
  type, and thus are logged ``-''. Also, some weird replies have content
  types ``:'' or even empty ones.

+There may be two more columns in the access.log if the log_mime_hdrs
option is enabled; these contain the HTTP request and reply headers.

Squid result codes

+The following result codes were taken from a Squid-2, compare with
src/access_log.c:

  The client issued a "no-cache" pragma, or some analogous cache
  control command along with the request. Thus, the cache has to
  refetch the object.

  The client issued an IMS request for an object which was in the
  cache and fresh.

  The object was believed to be in the cache,
  but could not be accessed.

  During "-Y" startup, or during frequent
  failures, a cache in hit only mode will return either UDP_HIT or
  this code. Neighbours will thus only fetch hits.

+Several result codes from earlier Squid versions are no longer available
in Squid-2; compare with src/access_log.c for the current list.

HTTP status codes

+These are taken from the HTTP RFCs and verified for Squid. Squid-2 uses
almost all codes except 307 (Temporary Redirect), 416 (Request Range Not
Satisfiable), and 417 (Expectation Failed). Extra codes include 0 for a
result code being unavailable, and 600 to signal an invalid header, a
proxy error. Also, some definitions were added as per RFC 2518
(WebDAV). Yes, there are really two entries for status code
424, compare with src/enums.h:

  000 Used mostly with UDP traffic.

  100 Continue
  101 Switching Protocols
 *102 Processing

  200 OK
  201 Created
  202 Accepted
  203 Non-Authoritative Information
  204 No Content
  205 Reset Content
  206 Partial Content
 *207 Multi Status

  300 Multiple Choices
  301 Moved Permanently
  302 Moved Temporarily
  303 See Other
  304 Not Modified
  305 Use Proxy
 [307 Temporary Redirect]

  400 Bad Request
  401 Unauthorized
  402 Payment Required
  403 Forbidden
  404 Not Found
  405 Method Not Allowed
  406 Not Acceptable
  407 Proxy Authentication Required
  408 Request Timeout
  409 Conflict
  410 Gone
  411 Length Required
  412 Precondition Failed
  413 Request Entity Too Large
  414 Request URI Too Large
  415 Unsupported Media Type
 [416 Request Range Not Satisfiable]
 [417 Expectation Failed]
 *424 Locked
 *424 Failed Dependency
 *433 Unprocessable Entity

  500 Internal Server Error
  501 Not Implemented
  502 Bad Gateway
  503 Service Unavailable
  504 Gateway Timeout
  505 HTTP Version Not Supported
 *507 Insufficient Storage

  600 Squid header parsing error

Request methods

+Squid recognizes several request methods as defined in RFC 2068.
Newer versions of Squid (2.2.STABLE5 and above) also recognize RFC 2518
``HTTP Extensions for Distributed Authoring -- WEBDAV'' extensions.

  method    defined    cachabil.  meaning
  --------- ---------- ---------- -------------------------------------------
  GET       HTTP/0.9   possibly   object retrieval and simple searches.
  HEAD      HTTP/1.0   possibly   metadata retrieval.
  POST      HTTP/1.0   CC or Exp. submit data (to a program).
  PUT       HTTP/1.1   never      upload data (e.g. to a file).
  DELETE    HTTP/1.1   never      remove resource (e.g. file).
  TRACE     HTTP/1.1   never      appl. layer trace of request route.
  OPTIONS   HTTP/1.1   never      request available comm. options.
  CONNECT   HTTP/1.1r3 never      tunnel SSL connection.

  ICP_QUERY Squid      never      used for ICP based exchanges.
  PURGE     Squid      never      remove object from cache.

  PROPFIND  rfc2518    ?          retrieve properties of an object.
  PROPPATCH rfc2518    ?          change properties of an object.
  MKCOL     rfc2518    never      create a new collection.
  COPY      rfc2518    never      create a duplicate of src in dst.
  MOVE      rfc2518    never      atomically move src to dst.
  LOCK      rfc2518    never      lock an object against modifications.
  UNLOCK    rfc2518    never      unlock an object.

Hierarchy Codes

+The following hierarchy codes are used with Squid-2; compare with
src/peer_select.c:hier_strings[].

+Almost any of these may be preceded by 'TIMEOUT_' if the two-second
(default) timeout occurs waiting for all ICP replies to arrive from
neighbors; see also the icp_query_timeout configuration option.
The following hierarchy codes were removed from Squid-2:

 code                 meaning
 -------------------- -------------------------------------------------
 PARENT_UDP_HIT_OBJ   hit objects are no longer available.
 SIBLING_UDP_HIT_OBJ  hit objects are no longer available.
 SSL_PARENT_MISS      SSL can now be handled by squid.
 FIREWALL_IP_DIRECT   No special logging for hosts inside the firewall.
 LOCAL_IP_DIRECT      No special logging for local networks.

cache/log (Squid-1.x)

+This file has a rather unfortunate name. It is also often called the
``swap log.'' It is not a log of messages, but a record of the objects
currently stored on disk. If you accidentally delete it while Squid is
running, you can get it back by shutting Squid down cleanly:

 % squid -k shutdown

This will disrupt service, but at least you will have your swap log
back.
Alternatively, you can tell squid to rotate its log files. This also
causes a clean swap log to be written.

 % squid -k rotate

+For Squid-1.1, each swap log entry consists of six fields.

swap.state (Squid-2.x)

+In Squid-2, the swap log file is now called swap.state; it serves the
same purpose as the Squid-1.1 cache/log file described above.

+If you remove swap.state while Squid is running, simply ask Squid to
rotate its log files; a clean copy is written as part of the rotation:

 % squid -k rotate

Alternatively, you can tell Squid to shutdown and it will
rewrite this file before it exits.

+If you remove the swap.state file while Squid is NOT running, Squid will
rebuild its index by scanning every object in the cache directories when
it next starts, which takes considerably longer.
By default the swap.state file is kept in the top-level directory of each
cache_dir.

Which log files can I delete safely?

+You should never delete access.log, store.log, cache.log, or
swap.state while Squid is running.
If you accidentally delete one of them, rotate the logs (see below) so
the file is recreated.
The correct way to maintain your log files is with Squid's ``rotate''
feature. You should rotate your log files at least once per day.
The current log files are closed and then renamed with numeric extensions
(.0, .1, etc). If you want to, you can write your own scripts
to archive or remove the old log files. If not, Squid will
only keep up to logfile_rotate versions of each log file.
To rotate Squid's logs, simply use this command:

 squid -k rotate

For example, use this cron entry to rotate the logs at midnight:

 0 0 * * * /usr/local/squid/bin/squid -k rotate

How can I disable Squid's log files?

+To disable access.log, point it at /dev/null:

 cache_access_log /dev/null

+To disable store.log:

 cache_store_log none

+It is a bad idea to disable the cache.log, since it records errors and
important status messages. But if you really want to:

 cache_log /dev/null

My log files get very big!

+You need to rotate your log files with a cron job. For example:

 0 0 * * * /usr/local/squid/bin/squid -k rotate

Managing log files

+The preferred log file for analysis is the access.log in native format.
Depending on the disk space allocated for log file storage, it is
recommended to set up a cron job which rotates the log files every 24, 12,
or 8 hours. You will need to set your logfile_rotate count accordingly.
Before transport, the log files can be compressed during off-peak time. On
the analysis host, the log files are concatenated into one file, so one
file for 24 hours is the yield.
One EU project developed some guidelines
to obey when handling and processing log files:

Respect the privacy of your clients when publishing results.
Keep logs unavailable unless anonymized. Most countries have laws on
  privacy protection, and some even on how long you are legally allowed to
  keep certain kinds of information.
Rotate and process log files at least once a day. Even if you don't
  process the log files, they will grow quite large, see the section
  above. If you rely on processing the log files, reserve
  a large enough partition solely for log files.
Keep the size in mind when processing. It might take longer to
  process log files than to generate them!
Limit yourself to the numbers you are interested in. There is data
  beyond your dreams available in your log file, some quite obvious, others
  by combination of different views. Here are some examples for figures to
  watch:

  The hosts using your cache.
  The elapsed time for HTTP requests - this is the latency the user
   sees. Usually, you will want to make a distinction for HITs and MISSes
   and overall times. Also, medians are preferred over averages.
  The requests handled per interval (e.g. second, minute or hour).

Why do I get ERR_NO_CLIENTS_BIG_OBJ messages so often?

+This message means that the requested object was in ``Delete Behind''
mode and the user aborted the transfer. An object will go into
``Delete Behind'' mode if:

It is larger than the maximum object size.
It is being fetched from a neighbor which has the proxy-only option set.

What does ERR_LIFETIME_EXP mean?

+This means that a timeout occurred while the object was being
transferred. Most likely the retrieval of this object was very slow (or it
stalled before finishing) and the user aborted the request. However,
depending on your settings for quick_abort, Squid may have continued to
try retrieving the object.

Retrieving ``lost'' files from the cache

I've been asked to retrieve an object which was accidentally
destroyed at the source, so that it can be recovered from the cache.
So, how do I figure out where the things are so I can copy
them out and strip off the headers?

+The following method applies only to the Squid-1.1 versions: +

+Use grep to find the named object (URL) in the cache/log
file. The first field in this file is an integer file number.

Then, convert the file number to a pathname:

 perl fileno-to-pathname.pl [-c squid.conf]

File numbers are read on stdin, and pathnames are printed on
stdout.

Operational issues

How do I see system level Squid statistics?

+The Squid distribution includes a CGI utility called cachemgr.cgi which
displays these statistics; see the Cache Manager section of this FAQ.

How can I find the biggest objects in my cache?

+ + sort -r -n +4 -5 access.log | awk '{print $5, $7}' | head -25 + + +I want to restart Squid with a clean cache +

+Note: The information here is current for version 2.2. +

+First of all, you must stop Squid of course. You can use +the command: + + % squid -k shutdown + + +

+The fastest way to restart with an entirely clean cache is
to overwrite the swap.state files:

 % echo "" > /cache1/swap.state

Repeat that for every cache_dir.

Another way, which takes longer, is to have squid recreate all the
cache_dir subdirectories. Move the existing contents out of the way and
remove them in the background:

 % cd /cache1
 % mkdir JUNK
 % mv ?? swap.state* JUNK
 % rm -rf JUNK &

Repeat this for your other cache_dirs, then run

 % squid -z

How can I proxy/cache Real Audio?


Point the RealPlayer at your Squid server's HTTP port (e.g. 3128)
in its proxy preferences.

Using the Preferences->Transport tab, select ``Use HTTP Only.''

The RealPlayer (and RealPlayer Plus) manual states:

 Use HTTP Only
 Select this option if you are behind a firewall and cannot
 receive data through TCP. All data will be streamed through
 HTTP.

 Note: You may not be able to receive some content if you select
 this option.

+Again, from the documentation: + +RealPlayer 4.0 identifies itself to the firewall when making a +request for content to a RealServer. The following string is +attached to any URL that the Player requests using HTTP GET: + + /SmpDsBhgRl + +Thus, to identify an HTTP GET request from the RealPlayer, look +for: + + http://[^/]+/SmpDsBhgRl + +The Player can also be identified by the mime type in a POST to +the RealServer. The RealPlayer POST has the following mime +type: + + "application/x-pncmd" + + +Note that the first request is a POST, and the second has a '?' in the URL, so +standard Squid configurations would treat it as non-cachable. It also looks +rather ``magic.'' + +
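+If you want to match RealPlayer requests in your own access lists (for
logging or policy), a sketch based on the identification string above; the
ACL name is made up:

 acl realplayer url_regex /SmpDsBhgRl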

+HTTP is an alternative delivery mechanism introduced with version 3
players, and it allows a reasonable approximation to ``streaming'' data -
that is playing it as you receive it. For more details, see the
RealNetworks documentation.

+It isn't available in the general case: only if someone has made the realaudio +file available via an HTTP server, or they're using a version 4 server, they've +switched it on, and you're using a version 4 client. If someone has made the +file available via their HTTP server, then it'll be cachable. Otherwise, it +won't be (as far as we can tell.) + +

+The more common RealAudio link connects via RealNetworks' own
proprietary protocol, which Squid cannot proxy or cache.
Some confusion arises because there is also a configuration option to use
an HTTP proxy (such as Squid) with the RealAudio/RealVideo players. This
is because the players can fetch the ``.ram'' metafile via HTTP.

How can I purge an object from my cache?

+Squid does not allow
you to purge objects unless it is configured with access controls
in squid.conf. First you must add something like:

 acl PURGE method PURGE
 acl localhost src 127.0.0.1
 http_access allow PURGE localhost
 http_access deny PURGE

The above only allows purge requests which come from the local host and
denies all other purge requests.

+To purge an object, you can use the client program distributed with
Squid:

 client -m PURGE http://www.miscreant.com/

If the purge was successful, you will see a ``200 OK'' response:

 HTTP/1.0 200 OK
 Date: Thu, 17 Jul 1997 16:03:32 GMT
 Server: Squid/1.1.14

If the object was not found in the cache, you will see a ``404 Not Found''
response:

 HTTP/1.0 404 Not Found
 Date: Thu, 17 Jul 1997 16:03:22 GMT
 Server: Squid/1.1.14

Using ICMP to Measure the Network

+As of version 1.1.9, Squid is able to utilize ICMP Round-Trip-Time (RTT) +measurements to select the optimal location to forward a cache miss. +Previously, cache misses would be forwarded to the parent cache +which returned the first ICP reply message. These were logged +with FIRST_PARENT_MISS in the access.log file. Now we can +select the parent which is closest (RTT-wise) to the origin +server. + +Supporting ICMP in your Squid cache + +

+ It is more important that your parent caches enable the ICMP + features. If you are acting as a parent, then you may want + to enable ICMP on your cache. Also, if your cache makes + RTT measurements, it will fetch objects directly if your + cache is closer than any of the parents. + +

+ If you want your Squid cache to measure RTT's to origin servers, + Squid must be compiled with the USE_ICMP option. This is easily + accomplished by uncommenting "-DUSE_ICMP=1" in src/Makefile and/or + src/Makefile.in. + +

 An external program called pinger is responsible for sending and
 receiving the ICMP packets. Because it needs a raw ICMP socket, it
 must be installed with a separate, privileged make target:

 % make install
 % su
 # make install-pinger

 There are three configuration file options (netdb_low, netdb_high,
 and netdb_ping_period) for tuning the measurement database on your
 cache.

Utilizing your parents database

 Your parent caches can be asked to include the RTT measurements
 in their ICP replies. To do this, you must enable query_icmp in your
 config file; this causes a flag to be set in your outgoing ICP queries.
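+In squid.conf this is simply:

 query_icmp on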

 If your parent caches return ICMP RTT measurements then
 the hierarchy (ninth) column of your access.log will have lines
 similar to:

 CLOSEST_PARENT_MISS/it.cache.nlanr.net

 In this case, it means that it.cache.nlanr.net returned the lowest RTT
 to the origin server. If your cache measured its own RTT to the origin
 server and found itself closest, you will instead see:

 CLOSEST_DIRECT/www.sample.com

Inspecting the database

+ The measurement database can be viewed from the cachemgr by + selecting "Network Probe Database." Hostnames are aggregated + into /24 networks. All measurements made are averaged over + time. Measurements are made to specific hosts, taken from + the URLs of HTTP requests. The recv and sent fields are the + number of ICMP packets sent and received. At this time they + are only informational. +

 A typical database entry looks something like this:

 Network recv/sent RTT Hops Hostnames
 192.41.10.0 20/ 21 82.3 6.0 www.jisedu.org www.dozo.com
 bo.cache.nlanr.net 42.0 7.0
 uc.cache.nlanr.net 48.0 10.0
 pb.cache.nlanr.net 55.0 10.0
 it.cache.nlanr.net 185.0 13.0

 This means we have sent 21 pings to both www.jisedu.org and
 www.dozo.com. The average RTT is 82.3 milliseconds. The
 next four lines show the measured values from our parent
 caches. Since bo.cache.nlanr.net has the lowest RTT to this
 network, it would be selected as the parent to forward misses to.

Why are so few requests logged as TCP_IMS_MISS?

+When Squid receives an If-Modified-Since (IMS) request, it will not
forward the request unless the object needs to be refreshed.
If the request is not forwarded, Squid replies to the IMS request
according to the object in its cache. If the modification times are the
same, then Squid returns TCP_IMS_HIT. If the modification times are
different, then Squid returns TCP_IMS_MISS. In most cases, the cached
object will not have changed, so the result is TCP_IMS_HIT. Squid will
only return TCP_IMS_MISS if some other client causes a newer version of
the object to be pulled into the cache.

How can I make Squid NOT cache some servers or URLs?

+In Squid-2, you use the no_cache option in combination with ACL
elements. This example makes all responses for a local network
uncachable:

 acl Local dst 10.0.1.0/24
 no_cache deny Local

+This example makes all URL's with '.html' uncachable: + + acl HTML url_regex .html$ + no_cache deny HTML + + +

+This example makes a specific URL uncachable: + + acl XYZZY url_regex ^http://www.i.suck.com/foo.html$ + no_cache deny XYZZY + + +

+This example caches nothing between the hours of 8AM to 11AM: + + acl Morning time 08:00-11:00 + no_cache deny Morning + + +

+In Squid-1.1,
whether or not an object gets cached is controlled by the
cache_stoplist parameter:

 cache_stoplist my.domain.com

Specifying uncachable objects by IP address is harder; a patch was
available to add that capability.

How can I delete and recreate a cache directory?

+Deleting an existing cache directory is easy to do. Unfortunately,
it may require a brief interruption of service.

Edit your squid.conf and remove the cache_dir line for the directory
you want to delete.
You can not delete a cache directory from a running Squid process.
Thus, you can not simply reconfigure squid. You must
shutdown Squid:

 squid -k shutdown

Once Squid exits, you may immediately start it up again. If you
use the RunCache script, Squid should start up again automatically.

Now Squid is no longer using the cache directory that you removed
from the config file. You can verify this by checking "Store Directory"
information with the cache manager. From the command line, type:

 client mgr:storedir

+Now that Squid is not using the cache directory, you can remove it at
your leisure (e.g. rm -rf /cache1).
The procedure for recreating a directory is similar:

Edit squid.conf and add the cache_dir line back.
Initialize the new directory by running

 % squid -z

NOTE: it is safe to run this even if Squid is already running.
Reconfigure Squid

 squid -k reconfigure

Unlike deleting, you can add new cache directories while Squid is
already running.

Why can't I run Squid as root?

+by Dave J Woolley +

+If someone were to discover a buffer overrun bug in Squid and it runs as
a user other than root, they can only corrupt the files writeable to
that user; but if it runs as root, they can take over the whole machine.
This applies to all programs that don't absolutely need root status, not
just squid.

+Here is a technique that was described on the mailing list.

+Start a second Squid server on an unused HTTP port (say 4128). This
instance of Squid probably doesn't need a large disk cache. When this
second server has finished reloading the disk store, swap the
http_port values in the two configuration files and send both Squids a
reconfigure signal. New requests then arrive at the upgraded server, and
you can take the old one down at your convenience.

Can Squid listen on more than one HTTP port?

+Note: The information here is current for version 2.3. +

+Yes, you can specify multiple http_port lines in your squid.conf file,
one port per line.
With version 2.3 and later you can specify IP addresses
and port numbers together (see the squid.conf comments).

Memory

Why does Squid use so much memory!?

+Squid uses a lot of memory for performance reasons. It takes much, much +longer to read something from disk than it does to read directly from +memory. + +

+A small amount of metadata for each cached object is kept in memory.
This is the StoreEntry data, which forms the in-memory index of the
cache.

Squid-1.1 also uses a lot of memory to store in-transit objects.
This version stores incoming objects only in memory, until the transfer
is complete. At that point it decides whether or not to store the object
on disk. This means that when users download large files, your memory
usage will increase significantly. The maximum_object_size parameter in
squid.conf determines when an incoming object is too large to cache;
such objects are handled in ``delete behind'' mode instead of being held
entirely in memory.
Other uses of memory by Squid include:

 Disk buffers for reading and writing
 Network I/O buffers
 IP Cache contents
 FQDN Cache contents
 Netdb ICMP measurement database
 Per-request state information, including full request and
  reply headers
 Miscellaneous statistics collection.
 ``Hot objects'' which are kept entirely in memory.

How can I tell how much memory my Squid process is using?

+One way is to simply look at ps output:

 wessels ˜ 236% ps -axuhm
 USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
 squid 9631 4.6 26.4 141204 137852 ?? S 10:13PM 78:22.80 squid -NCYs

The output above is from a BSD-ish system; for SYSV-ish systems the
ps options will differ.
A nicer way to check the memory usage is with the top program:

 last pid: 20128; load averages: 0.06, 0.12, 0.11 14:10:58
 46 processes: 1 running, 45 sleeping
 CPU states: % user, % nice, % system, % interrupt, % idle
 Mem: 187M Active, 1884K Inact, 45M Wired, 268M Cache, 8351K Buf, 1296K Free
 Swap: 1024M Total, 256K Used, 1024M Free

 PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
 9631 squid 2 0 138M 135M select 78:45 3.93% 3.93% squid
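+If you just want the Squid process, most modern systems also let you ask
ps for it directly; this assumes the default PID file location:

 % ps -o pid,vsz,rss -p `cat /usr/local/squid/logs/squid.pid`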

+Finally, you can ask the Squid process to report its own memory
usage. This is available on the Cache Manager info page:

 Resource usage for squid:
 Maximum Resident Size: 137892 KB
 Memory usage for squid via mstats():
 Total space in arena: 140144 KB
 Total free: 8153 KB 6%

+If your RSS (Resident Set Size) value is much lower than your
process size, then your cache performance is most likely suffering
due to paging.

My Squid process grows without bounds.

+You might just have your cache_mem parameter set too high. See the
``I set cache_mem to XX, but the process grows beyond that!''
entry below.

+When a process continually grows in size, without levelling off +or slowing down, it often indicates a memory leak. A memory leak +is when some chunk of memory is used, but not free'd when it is +done being used. + +

+Memory leaks are a real problem for programs (like Squid) which do all
of their processing within a single process. Historically, Squid has
had real memory leak problems. But as the software has matured, we
believe almost all of Squid's memory leaks have been eliminated, and
new ones are at least easy to identify.

+Memory leaks may also be present in your system's libraries, such
as the C library's malloc implementation.

I set cache_mem to XX, but the process grows beyond that!

The cache_mem parameter does NOT specify the maximum size of the
process. It only places a limit on how much additional memory Squid
uses for in-transit and frequently referenced (``hot'') objects.
Squid's total memory usage will always be larger than cache_mem.

How do I analyze memory usage from the cache manager output?

+ +

+ +Note: This information is specific to Squid-1.1 versions + + +

+Look at your cache manager memory usage output:

 Memory usage for squid via mallinfo():
 Total space in arena: 94687 KB
 Ordinary blocks: 32019 KB 210034 blks
 Small blocks: 44364 KB 569500 blks
 Holding blocks: 0 KB 5695 blks
 Free Small blocks: 6650 KB
 Free Ordinary blocks: 11652 KB
 Total in use: 76384 KB 81%
 Total free: 18302 KB 19%

 Meta Data:
 StoreEntry 246043 x 64 bytes = 15377 KB
 IPCacheEntry 971 x 88 bytes = 83 KB
 Hash link 2 x 24 bytes = 0 KB
 URL strings = 11422 KB
 Pool MemObject structures 514 x 144 bytes = 72 KB ( 70 free)
 Pool for Request structur 516 x 4380 bytes = 2207 KB ( 2121 free)
 Pool for in-memory object 6200 x 4096 bytes = 24800 KB ( 22888 free)
 Pool for disk I/O 242 x 8192 bytes = 1936 KB ( 1888 free)
 Miscellaneous = 2600 KB
 total Accounted = 58499 KB

+First note that the total space in the arena (94687 KB) is the amount of
memory the process has obtained from the system.
Of that 94M, 81% (76M) is actually being used at the moment. The
rest has been freed, or pre-allocated by malloc() and not yet used.

Of the 76M in use, we can account for 58.5M (76%). The remainder is
allocation overhead and allocations that Squid does not track
explicitly.

The pool sizes are specified by squid.conf parameters, except for the
pool for disk I/O, which is hardcoded at 200 entries.

If you need to lower your process size, we recommend lowering the
max object sizes in the 'http', 'ftp' and 'gopher' config lines.
You may also want to lower cache_mem.

The ``Total memory accounted'' value is less than the size of my Squid process.

+We are not able to account for all of the memory that Squid uses,
because some of it is allocated by library routines and internal
overhead that Squid does not track.
Also, note that the malloc library normally does not return freed
memory to the kernel, so the process size never shrinks.

xmalloc: Unable to allocate 4096 bytes!


+Messages like "FATAL: xcalloc: Unable to allocate 4096 blocks of 1 bytes!"
appear when Squid can't allocate more memory, and on most operating
systems (including BSD) there are only two possible reasons:

The machine is out of swap
The process' maximum data segment size has been reached

The first case is detected using the normal swap monitoring tools
available on the platform (such as top or swapinfo).
To tell if it is the second case, first rule out the first case and then
monitor the size of the Squid process. If it dies at a certain size with
plenty of swap left then the max data segment size has been reached,
without a doubt.

+The data segment size can be limited by two factors: + +Kernel imposed maximum, which no user can go above +The size set with ulimit, which the user can control. + +

+When squid starts it sets the data and file ulimits to the hard level. If
you manually tune ulimit before starting Squid make sure that you set
the hard limit and not only the soft limit (the default operation of
ulimit is to only change the soft limit). Only root is allowed to raise
the hard limit.

+This command prints the hard limits: + + ulimit -aH + +

+This command sets the data size to unlimited: + + ulimit -HSd unlimited + + + +BSD/OS +


+The default kernel limit on BSD/OS for datasize is 64MB (at least on 3.0 +which I'm using). + +

+Recompile a kernel with larger datasize settings: + + + maxusers 128 + # Support for large inpcb hash tables, e.g. busy WEB servers. + options INET_SERVER + # support for large routing tables, e.g. gated with full Internet routing: + options "KMEMSIZE=\(16*1024*1024\)" + options "DFLDSIZ=\(128*1024*1024\)" + options "DFLSSIZ=\(8*1024*1024\)" + options "SOMAXCONN=128" + options "MAXDSIZ=\(256*1024*1024\)" + + +See /usr/share/doc/bsdi/config.n for more info. + +

+In /etc/login.conf I have this: + + + default:\ + :path=/bin /usr/bin /usr/contrib/bin:\ + :datasize-cur=256M:\ + :openfiles-cur=1024:\ + :openfiles-max=1024:\ + :maxproc-cur=1024:\ + :stacksize-cur=64M:\ + :radius-challenge-styles=activ,crypto,skey,snk,token:\ + :tc=auth-bsdi-defaults:\ + :tc=auth-ftp-bsdi-defaults: + + # + # Settings used by /etc/rc and root + # This must be set properly for daemons started as root by inetd as well. + # Be sure reset these values back to system defaults in the default class! + # + daemon:\ + :path=/bin /usr/bin /sbin /usr/sbin:\ + :widepasswords:\ + :tc=default: + # :datasize-cur=128M:\ + # :openfiles-cur=256:\ + # :maxproc-cur=256:\ + + +

+This should give enough space for a 256MB squid process. + +FreeBSD (2.2.X) +

+by Duane Wessels +

+The procedure is almost identical to that for BSD/OS above. +Increase the open filedescriptor limit in /sys/conf/param.c: + + int maxfiles = 4096; + int maxfilesperproc = 1024; + +Increase the maximum and default data segment size in your kernel +config file, e.g. /sys/conf/i386/CONFIG: + + options "MAXDSIZ=(512*1024*1024)" + options "DFLDSIZ=(128*1024*1024)" + +We also found it necessary to increase the number of mbuf clusters: + + options "NMBCLUSTERS=10240" + +And, if you have more than 256 MB of physical memory, you probably +have to disable BOUNCE_BUFFERS (whatever that is), so comment +out this line: + + #options BOUNCE_BUFFERS #include support for DMA bounce buffers + + + +Also, update limits in /etc/login.conf: + + # Settings used by /etc/rc + # + daemon:\ + :coredumpsize=infinity:\ + :datasize=infinity:\ + :maxproc=256:\ + :maxproc-cur@:\ + :memoryuse-cur=64M:\ + :memorylocked-cur=64M:\ + :openfiles=4096:\ + :openfiles-cur@:\ + :stacksize=64M:\ + :tc=default: + +And don't forget to run ``cap_mkdb /etc/login.conf'' after editing that file. + + +OSF, Digital Unix +


+To increase the data size for Digital UNIX, edit the file /etc/sysconfigtab +and add the entry... + + proc: + per-proc-data-size=1073741824 + +Or, with csh, use the limit command, such as + + > limit datasize 1024M + + +

+Editing /etc/sysconfigtab requires a reboot, but the limit command +doesn't. + +fork: (12) Cannot allocate memory +

+When Squid is reconfigured (SIGHUP) or the logs are rotated (SIGUSR1),
some of the helper processes (dnsserver) must be killed and
restarted. If your system does not have enough virtual memory,
the Squid process may not be able to fork to start the new helper
processes.
The best way to fix this is to increase your virtual memory by adding
swap space. Normally your system uses raw disk partitions for swap
space, but most operating systems also support swapping on regular
files (Digital Unix excepted). See your system manual pages for
swap and swapon.

What can I do to reduce Squid's memory usage?

+If your cache performance is suffering because of memory limitations,
you might consider buying more memory. But if that is not an option,
there are a number of things to try:

Try a different malloc library (see below).
Reduce the cache_mem parameter in the config file.
Turn the memory_pools option off in the config file.
Reduce the cache_swap (disk cache) size; the in-memory index grows
 with the number of objects on disk.
Reduce the maximum object size.
If you are using Squid-1.1.x, try the ``NOVM'' version.

Using an alternate malloc library

+Many users have found improved performance and memory utilization when +linking Squid with an external malloc library. We recommend either +GNU malloc, or dlmalloc. + +Using GNU malloc + +

+To make Squid use GNU malloc follow these simple steps:

Download the GNU malloc source, available from one of
the GNU FTP mirror sites.
Compile GNU malloc

 % gzip -dc malloc.tar.gz | tar xf -
 % cd malloc
 % vi Makefile # edit as needed
 % make

Copy libmalloc.a to your system's library directory and be sure to
 name it libgnumalloc.a.

 % su
 # cp malloc.a /usr/lib/libgnumalloc.a

(Optional) Copy the GNU malloc.h to your system's include directory and
 be sure to name it gnumalloc.h.

 # cp malloc.h /usr/include/gnumalloc.h

Reconfigure and recompile Squid

 % make realclean
 % ./configure ...
 % make
 % make install

Note, In later distributions, 'realclean' has been changed to 'distclean'.
As the configure script runs, watch its output. You should find that
it locates libgnumalloc.a and optionally gnumalloc.h.

dlmalloc

dlmalloc has been written by Doug Lea. According to Doug:

This is not the fastest, most space-conserving, most portable, or
most tunable malloc ever written. However it is among the fastest
while also being among the most space-conserving, portable and tunable.

+dlmalloc is included with the Squid source distribution. To use this
library, you simply give an option to the configure script:

 % ./configure --enable-dlmalloc ...

The Cache Manager

What is the cache manager?

+The cache manager (cachemgr.cgi) is a CGI utility for displaying
statistics about the squid process as it runs.

How do you set it up?

+That depends on which web server you're using. Below you will
find instructions for configuring the CERN and Apache servers
to permit cachemgr.cgi usage.

After you edit the server configuration files, you will probably
need to either restart your web server or send it a SIGHUP signal
to tell it to re-read its configuration files.
When you're done configuring your web server, you'll connect to
the cache manager with a web browser, using a URL such as:

 http://www.example.com/Squid/cgi-bin/cachemgr.cgi/

Cache manager configuration for CERN httpd 3.0

+First, you should ensure that only specified workstations can access
the cache manager. That is done in your CERN httpd.conf:

 Protection MGR-PROT {
 Mask @(workstation.example.com)
 }

Wildcards are acceptable, IP addresses are acceptable, and others
can be added with a comma-separated list of IP addresses. There
are many more ways of protection. Your server documentation has
details.

+You also need to add:

 Protect /Squid/* MGR-PROT
 Exec /Squid/cgi-bin/*.cgi /usr/local/squid/bin/*.cgi

This marks the script as executable to those allowed by MGR-PROT.

Cache manager configuration for Apache

+First, make sure the cgi-bin directory you're using is listed with a
ScriptAlias in your Apache httpd.conf file, like this:

 ScriptAlias /Squid/cgi-bin/ /usr/local/squid/cgi-bin/

It's probably a bad idea to ScriptAlias the entire directory where all
of the Squid executables live.
Next, you should ensure that only specified workstations can access
the cache manager. That is done in your Apache httpd.conf with a
Location block:

 <Location /Squid/cgi-bin/cachemgr.cgi>
 order deny,allow
 deny from all
 allow from workstation.example.com
 </Location>

You can have more than one allow line, and you can allow
domains or networks.

+Alternately, cachemgr.cgi can be password-protected. You'd add the
following to httpd.conf:

 <Location /Squid/cgi-bin/cachemgr.cgi>
 AuthUserFile /path/to/password/file
 AuthGroupFile /dev/null
 AuthName User/Password Required
 AuthType Basic
 require user cachemanager
 </Location>

Consult the Apache documentation for information on using htpasswd to
set a password for this ``user.''

Cache manager configuration for Roxen 2.0 and later

+by Francesco ``kinkie'' Chemolli +

+This is what's required to start up a fresh Virtual Server, only
serving the cache manager. If you already have some Virtual Server
you wish to use to host the Cache Manager, just add a new CGI
support module to it.

+Create a new virtual server, and set it to host http://www.example.com/. +Add to it at least the following modules: + +Content Types +CGI scripting support + + +

+In the CGI scripting support module, set:

CGI-bin path: set to /Squid/cgi-bin/
Handle *.cgi: set to yes
Run user scripts as owner: set to no
Search path: set to the directory containing the cachemgr.cgi file

+In the security restrictions section, add a line like:

allow ip=1.2.3.4

where 1.2.3.4 is the IP address for workstation.example.com

+Save the configuration, and you're done.

Cache manager ACLs in squid.conf

The default cache manager access configuration in squid.conf is:

 acl manager proto cache_object
 acl localhost src 127.0.0.1/255.255.255.255
 acl all src 0.0.0.0/0.0.0.0

With the following rules:

 http_access deny manager !localhost
 http_access allow all

+The first ACL is the most important as the cache manager program
interrogates squid using special cache_object URLs, e.g.:

 telnet mycache.example.com 3128
 GET cache_object://mycache.example.com/info HTTP/1.0

+The default ACLs say that if the request is for a cache_object URL and
it does not come from localhost, then deny access; otherwise allow
access.

In fact, only allowing localhost access means that the web server
running cachemgr.cgi must be on the same machine as Squid. To permit a
web server running elsewhere, add its address, for example:

 acl manager proto cache_object
 acl localhost src 127.0.0.1/255.255.255.255
 acl example src 123.123.123.123/255.255.255.255
 acl all src 0.0.0.0/0.0.0.0

Where 123.123.123.123 is the IP address of your web server. The rules
become:

 http_access allow manager localhost
 http_access allow manager example
 http_access deny manager
 http_access allow all

If you're using miss_access, then you will also need:

 miss_access allow manager

+The default ACLs assume that your web server is on the same machine
as Squid. Remember that the connection to Squid originates at the web
server host, not at the browser.

Always be sure to send Squid a reconfigure signal after changing the
ACLs in squid.conf.

Why does it say I need a password and a URL?

+If you ``drop'' the list box, and browse it, you will see that the
password is only required to shutdown the cache, and the URL is
required to refresh an object (i.e., retrieve it from its original
source again). Otherwise these fields can be left blank: a password
is not required to obtain access to the informational aspects of
the cache manager.

I want to shutdown the cache remotely. What's the password?

+See the cachemgr_passwd directive in squid.conf.

How do I make the cache host default to my cache?

When you run configure, use the --enable-cachemgr-hostname option:

% ./configure --enable-cachemgr-hostname=`hostname` ...

+Note, if you do this after you already installed Squid before, you need
to make sure cachemgr.cgi gets recompiled:

% cd src
% rm cachemgr.o cachemgr.cgi
% make cachemgr.cgi

+Then copy cachemgr.cgi to your web server's cgi-bin directory.

What's the difference between Squid TCP connections and Squid UDP connections?

+Browsers and caches use TCP connections to retrieve web objects +from web servers or caches. UDP connections are used when another +cache using you as a sibling or parent wants to find out if you +have an object in your cache that it's looking for. The UDP +connections are ICP queries. + +It says the storage expiration will happen in 1970! +

+Don't worry. An unset expiration time is stored as zero, and zero
seconds from the Unix epoch displays as January 1970. The default (and
sensible) behavior of Squid is to not pre-assign an expiration time;
freshness is decided when the object is next requested.

What do the Meta Data entries mean?

+ + + + +

+Basically just like the Meta Data section of the memory usage output
shown earlier:

MemObject structures: info about objects currently in memory
(eg, in the process of being transferred).
Request structures: information about each request as it happens.
In-memory object pools: space for object data as it is retrieved.

+In the utilization section, why is the Transfer KB/sec
column always zero?

+This column contains gross estimations of data transfer rates
averaged over the entire time the cache has been running. These
numbers are unreliable and mostly useless.

In the utilization section, what is the Number column?

The number of objects of that type in the cache right now.

In the utilization section, what is the Max/Current/Min KB?

+These refer to the sizes all the objects of this type have grown
to/currently are/shrunk to.

What is the I/O section about?

+These are histograms on the number of bytes read from the network
per read(2) call.

What is the Objects section for?

Selecting it will download to your browser
a list of every URL in the cache and statistics about it. It can
be very, very large. You
probably don't need this information anyway.

What does AvgRTT mean?

Average Round Trip Time. This is how long on average after
an ICP ping is sent that a reply is received.

In the IP cache section, what's the difference between a hit, a negative hit and a miss?

+ +A HIT means that the document was found in the cache. A +MISS, that it wasn't found in the cache. A negative hit +means that it was found in the cache, but it doesn't exist. + +What do the IP cache contents mean anyway? +

+ +The hostname is the name that was requested to be resolved. + +

+For the meaning of the flags and the other columns, see the legend in
the FQDN cache example below.

The rest of the line lists all the IP addresses that have been associated
with that IP cache entry.

+ +What is the fqdncache and how is it different from the ipcache? +

+IPCache contains data for the Hostname to IP-Number mapping, and +FQDNCache does it the other way round. For example: + + + Hostname Flags lstref TTL N [IP-Number] + gorn.cc.fh-lippe.de C 0 21581 1 193.16.112.73 + lagrange.uni-paderborn.de C 6 21594 1 131.234.128.245 + www.altavista.digital.com C 10 21299 4 204.123.2.75 ... + 2/ftp.symantec.com DL 1583 -772855 0 + + Flags: C --> Cached + D --> Dispatched + N --> Negative Cached + L --> Locked + lstref: Time since last use + TTL: Time-To-Live until information expires + N: Count of addresses + + +

+ + IP-Number Flags TTL N Hostname + 130.149.17.15 C -45570 1 andele.cs.tu-berlin.de + 194.77.122.18 C -58133 1 komet.teuto.de + 206.155.117.51 N -73747 0 + + Flags: C --> Cached + D --> Dispatched + N --> Negative Cached + L --> Locked + TTL: Time-To-Live until information expires + N: Count of names + + +What does ``Page faults with physical i/o: 4897'' mean? +

+This question was asked on the squid-users mailing list.

+You get a ``page fault'' when your OS tries to access something in memory +which is actually swapped to disk. The term ``page fault'' while correct at +the kernel and CPU level, is a bit deceptive to a user, as there's no +actual error - this is a normal feature of operation. + +

+Also, this doesn't necessarily mean your squid is swapping by that much. +Most operating systems also implement paging for executables, so that only +sections of the executable which are actually used are read from disk into +memory. Also, whenever squid needs more memory, the fact that the memory +was allocated will show up in the page faults. + +

+However, if the number of faults is unusually high, and getting bigger, +this could mean that squid is swapping. Another way to verify this is using +a program called ``vmstat'' which is found on most UNIX platforms. If you run +this as ``vmstat 5'' this will update a display every 5 seconds. This can +tell you if the system as a whole is swapping a lot (see your local man +page for vmstat for more information). + +

+It is very bad for squid to swap, as every single request will be blocked
until the requested data is swapped in. It is better to tweak the
cache_mem and other memory parameters so that swapping is avoided.

+There's two different operations at work, paging and swapping. Paging
is when individual pages are shuffled (either discarded or swapped
to/from disk), while ``swapping'' means the entire process gets moved
to disk.
Needless to say, swapping a process is a pretty drastic event, and usually
only reserved for when there's a memory crunch and paging out cannot free
enough memory quickly enough. Also, there's some variation on how
swapping is implemented in OS's. Some don't do it at all or do a hybrid
of paging and swapping instead.

+As you say, paging out doesn't necessarily involve disk IO, eg: text (code) +pages are read-only and can simply be discarded if they are not used (and +reloaded if/when needed). Data pages are also discarded if unmodified, and +paged out if there's been any changes. Allocated memory (malloc) is always +saved to disk since there's no executable file to recover the data from. +mmap() memory is variable.. If it's backed from a file, it uses the same +rules as the data segment of a file - ie: either discarded if unmodified or +paged out. + +

+There's also ``demand zeroing'' of pages as well that cause faults.. If you +malloc memory and it calls brk()/sbrk() to allocate new pages, the chances +are that you are allocated demand zero pages. Ie: the pages are not +``really'' attached to your process yet, but when you access them for the +first time, the page fault causes the page to be connected to the process +address space and zeroed - this saves unnecessary zeroing of pages that are +allocated but never used. + +

+The ``page faults with physical IO'' comes from the OS via getrusage(). It's +highly OS dependent on what it means. Generally, it means that the process +accessed a page that was not present in memory (for whatever reason) and +there was disk access to fetch it. Many OS's load executables by demand +paging as well, so the act of starting squid implicitly causes page faults +with disk IO - however, many (but not all) OS's use ``read ahead'' and +``prefault'' heuristics to streamline the loading. Some OS's maintain +``intent queues'' so that pages can be selected as pageout candidates ahead +of time. When (say) squid touches a freshly allocated demand zero page and +one is needed, the OS can page out one of the candidates on the spot, +causing a 'fault with physical IO' with demand zeroing of allocated memory +which doesn't happen on many other OS's. (The other OS's generally put +the process to sleep while the pageout daemon finds a page for it). + +

+The meaning of ``swapping'' varies. On FreeBSD for example, swapping out is
implemented as unlocking upages, kernel stack, PTD etc for aggressive
pageout with the process. The only thing left of the process in memory is
the 'struct proc'. The FreeBSD paging system is highly adaptive and can
resort to paging in a way that is equivalent to the traditional swapping
style operation (ie: entire process). FreeBSD also tries stealing pages
from active processes in order to make space for disk cache. I suspect
this is why setting 'memory_pools off' on the non-NOVM squids on FreeBSD is
reported to work better - the VM/buffer system could be competing with
squid to cache the same pages. It's a pity that squid cannot use mmap() to
do file IO on the 4K chunks in its memory pool (I can see that this is not
a simple thing to do though, but that won't stop me wishing. :-).


+The comments so far have been about what paging/swapping figures mean in +a ``traditional'' context, but it's worth bearing in mind that on some systems +(Sun's Solaris 2, at least), the virtual memory and filesystem handling are +unified and what a user process sees as reading or writing a file, the system +simply sees as paging something in from disk or a page being updated so it +needs to be paged out. (I suppose you could view it as similar to the operating +system memory-mapping the files behind-the-scenes.) + +

+The effect of this is that on Solaris 2, paging figures will also include file +I/O. Or rather, the figures from vmstat certainly appear to include file I/O, +and I presume (but can't quickly test) that figures such as those quoted by +Squid will also include file I/O. + +

+To confirm the above (which represents an impression from what I've read and
observed, rather than 100% certain facts...), using an otherwise idle Sun Ultra
1 system I just tried using cat (small, shouldn't need to page) to copy
(a) one file to another, (b) a file to /dev/null, (c) /dev/zero to a file, and
(d) /dev/zero to /dev/null (interrupting the last two with control-C after a
while!), while watching with vmstat. 300-600 page-ins or page-outs per second
when reading or writing a file (rather than a device), essentially zero in
other cases (and when not cat-ing).

+So ... beware assuming that all systems are similar and that paging figures +represent *only* program code and data being shuffled to/from disk - they +may also include the work in reading/writing all those files you were +accessing... + +Ok, so what is unusually high? + +

+You'll probably want to compare the number of page faults to the number of +HTTP requests. If this ratio is close to, or exceeding 1, then +Squid is paging too much. + +What does the IGNORED field mean in the 'cache server list'? +

+This refers to ICP replies which Squid ignored, for one of these
reasons:

 The URL in the reply could not be found in the cache at all.

 The URL in the reply was already being fetched. Probably
 this ICP reply arrived too late.

 The URL in the reply did not have a MemObject associated with
 it. Either the request is already finished, or the user aborted
 before the ICP arrived.

 The reply came from a multicast-responder, but that host is not a
 configured neighbor.

 Source-Echo replies from known neighbors are ignored.

 ICP_OP_DENIED replies are ignored after the first 100.

Access Controls

+As an example, we will assume that you would like to prevent users from +accessing cooking recipes. + +

+One way to implement this would be to deny access to any URLs
that contain the words ``cooking'' or ``recipe.''
You would use these configuration lines:

 acl Cooking1 url_regex cooking
 acl Recipe1 url_regex recipe
 http_access deny Cooking1
 http_access deny Recipe1
 http_access allow all

The url_regex ACL searches the entire URL for the regular expression you
specify. Note that these regular expressions are case-sensitive; use
url_regex -i for a case-insensitive match.
Another way is to deny access to specific servers which are known
to hold recipes. For example:

 acl Cooking2 dstdomain gourmet-chef.com
 http_access deny Cooking2
 http_access allow all

The dstdomain ACL matches against the domain name portion of the
requested URL.

How do I block specific users or groups from accessing my cache?

Ident

+You can use ident lookups
to allow specific users access to your cache. This requires that an
ident (RFC 931) server
process runs on the user's machine(s).
In your squid.conf configuration file you would write something like
this:

 ident_lookup on
 acl friends user kim lisa frank joe
 http_access allow friends
 http_access deny all

Proxy Authentication

+Another option is to use proxy-authentication. In this scheme, you assign +usernames and passwords to individuals. When they first use the proxy +they are asked to authenticate themselves by entering their username and +password. + +

+In Squid v2 this authentication is handled via external processes. For
information on how to configure this, please see the section on proxy
authentication.

Do you have a CGI program which lets users change their own proxy passwords?

There is a CGI program, adapted from Apache's htpasswd, which lets
users change their own proxy passwords.

Is there a way to do ident lookups only for a certain host and compare the result with a userlist in squid.conf?

+Sort of. + +

+If you use a user ACL, then Squid-1.1 will perform an ident lookup
for every client request. In other words, Squid-1.1 will perform
ident lookups for all requests or no requests; defining a user ACL
for only certain source addresses does not change this.
However, even though ident lookups are performed for every request, Squid does
not wait for the lookup to complete unless the ACL rules require it. Consider this
configuration:

 acl host1 src 10.0.0.1
 acl host2 src 10.0.0.2
 acl pals user kim lisa frank joe
 http_access allow host1
 http_access allow host2 pals

Requests coming from 10.0.0.1 will be allowed immediately because
there are no user requirements for that host. However, requests
from 10.0.0.2 will be allowed only after the ident lookup completes, and
if the username is in the set kim, lisa, frank, or joe.

Common Mistakes

And/Or logic

+You've probably noticed (and been frustrated by) the fact that
you cannot combine access controls with terms like ``and'' or ``or.''
These operations are already built in to the access control scheme
in a fundamental way which you must understand.

All elements of an acl entry are OR'ed together.

All elements of an access entry are AND'ed together,
e.g. http_access and icp_access.

+For example, the following access control configuration will never work: + + acl ME src 10.0.0.1 + acl YOU src 10.0.0.2 + http_access allow ME YOU + +In order for the request to be allowed, it must match the ``ME'' acl AND the ``YOU'' acl. +This is impossible because any IP address could only match one or the other. This +should instead be rewritten as: + + acl ME src 10.0.0.1 + acl YOU src 10.0.0.2 + http_access allow ME + http_access allow YOU + +Or, alternatively, this would also work: + + acl US src 10.0.0.1 10.0.0.2 + http_access allow US + + +allow/deny mixups + +

+ +I have read through my squid.conf numerous times, spoken to my +neighbors, read the FAQ and Squid Docs and cannot for the life of +me work out why the following will not work. + + +

+ +I can successfully access cachemgr.cgi from our web server machine here, +but I would like to use MRTG to monitor various aspects of our proxy. +When I try to use 'client' or GET cache_object from the machine the +proxy is running on, I always get access denied. + + + + acl manager proto cache_object + acl localhost src 127.0.0.1/255.255.255.255 + acl server src 1.2.3.4/255.255.255.255 + acl all src 0.0.0.0/0.0.0.0 + acl ourhosts src 1.2.0.0/255.255.0.0 + + http_access deny manager !localhost !server + http_access allow ourhosts + http_access deny all + + +

+The intent here is to allow cache manager requests from the
localhost and server addresses, and deny all others. This policy
has been expressed here:

 http_access deny manager !localhost !server

+The problem here is that for allowable requests, this access rule is
not matched. For example, if the source IP address is 127.0.0.1, then
``!localhost'' is false and the rule is not matched. Squid continues
checking the remaining rules: ``allow ourhosts'' does not match
127.0.0.1 either, so the request is finally caught by ``deny all.''
To implement the desired policy correctly, the access rules should be
rewritten as

 http_access allow manager localhost
 http_access allow manager server
 http_access deny manager
 http_access allow ourhosts
 http_access deny all

If you're using miss_access, then you will also need:

 miss_access allow manager

+You may be concerned that having five access rules instead of three
may have an impact on the cache performance. In our experience this is
not the case. Squid is able to handle a moderate amount of access control
checking without degrading overall performance. You may like to verify
that for yourself, however.

I set up my access controls, but they don't work! why?

+You can debug your access control configuration by setting the
debug_options parameter in squid.conf and watching cache.log:

 debug_options ALL,1 28,9

Proxy-authentication and neighbor caches

+The problem... + + + [ Parents ] + / \ + / \ + [ Proxy A ] --- [ Proxy B ] + | + | + USER + +

Proxy A sends an ICP query to Proxy B about an object; Proxy B replies
with an ICP_HIT. Proxy A forwards the HTTP request to Proxy B, but
does not pass on the authentication details, therefore the HTTP GET from
Proxy A fails.

+Only ONE proxy cache in a chain is allowed to ``use'' the Proxy-Authentication +request header. Once the header is used, it must not be passed on to +other proxies. + +

+Therefore, you must allow the neighbor caches to request from each other
without proxy authentication. This is simply accomplished by listing
the neighbor ACL's first in the list of http_access lines. For example:

 acl proxy-A src 10.0.0.1
 acl proxy-B src 10.0.0.2
 acl user_passwords proxy_auth /tmp/user_passwds

 http_access allow proxy-A
 http_access allow proxy-B
 http_access allow user_passwords
 http_access deny all

Is there an easy way of banning all Destination addresses except one?

+ + acl GOOD dst 10.0.0.1 + acl BAD dst 0.0.0.0/0.0.0.0 + http_access allow GOOD + http_access deny BAD + + +Does anyone have a ban list of porn sites and such? + +

Snerpa, an ISP in Iceland, operates a DNS database of
IP-addresses of blacklisted sites containing porn, violence,
etc., which is utilized using a small perl-script redirector.
Information about this is available on their web page.

Squid doesn't match my subdomains

+There is a subtle problem with domain-name based access controls +when a single ACL element has an entry that is a subdomain of +another entry. For example, consider this list: + + acl FOO dstdomain boulder.co.us vail.co.us co.us + +

+In the first place, the above list is simply wrong because
the first two entries (boulder.co.us and vail.co.us) are unnecessary:
any domain name that matches one of them also matches the last
entry (co.us).
The problem stems from the data structure used to index domain
names in an access control list. Squid uses a binary search on the
sorted list of domain names.
The problem is that it is wrong to say that co.us is simply
greater-than or less-than boulder.co.us: with suffix matching,
co.us is effectively equal to every domain that ends in co.us.
For example, if the search decided that co.us sorts before
boulder.co.us, it might never test co.us as a match for vail.co.us;
similarly, if it decided co.us sorts after, other subdomains could be
missed.
The bottom line is that you can't have one entry that is a subdomain
of another. Squid-2.2 will warn you if it detects this condition.

Why does Squid deny some port numbers?
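+In the example above, the fix is simply to keep only the broadest
entry, since it already covers the two subdomains:

 acl FOO dstdomain co.us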

+It is dangerous to allow Squid to connect to certain port numbers. +For example, it has been demonstrated that someone can use Squid +as an SMTP (email) relay. As I'm sure you know, SMTP relays are +one of the ways that spammers are able to flood our mailboxes. +To prevent mail relaying, Squid denies requests when the URL port +number is 25. Other ports should be blocked as well, as a precaution. + +

+There are two ways to filter by port number: either allow specific
+ports, or deny specific ports. By default, Squid does the first. This
+is the ACL entry that comes in the default squid.conf:
+
+	acl Safe_ports port 80 21 443 563 70 210 1025-65535
+	http_access deny !Safe_ports
+
+The above configuration denies requests when the URL port number is
+not in the list. The list allows connections to the standard
+ports for HTTP, FTP, Gopher, SSL, WAIS, and all non-privileged
+ports.
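+The default configuration of this era also restricts the CONNECT
+method to the well-known SSL ports; a sketch of those companion
+rules, as found in default squid.conf files of the time:
+
+	acl SSL_ports port 443 563
+	acl CONNECT method CONNECT
+	http_access deny CONNECT !SSL_ports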

+Another approach is to deny dangerous ports. The dangerous
+port list should look something like:
+
+	acl Dangerous_ports port 7 9 19 22 23 25 53 109 110 119
+	http_access deny Dangerous_ports
+
+...and probably many others.
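+To check which service a given port number belongs to, you can
+search the /etc/services file described next, for example:
+
+	% grep -w 25 /etc/services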

+Please consult the /etc/services file on your system +for a list of known ports and protocols. + +Does Squid support the use of a database such as mySQL for storing the ACL list? +

+Note: The information here is current for version 2.2. +

+No, it does not. + +How can I allow a single address to access a specific URL? +

+This example allows only the one client address to access the one
+specific URL; all other requests for that URL are denied:
+
+	acl special_client src 10.1.2.3
+	acl special_url url_regex ^http://www.squid-cache.org/Doc/FAQ/$
+	http_access allow special_client special_url
+	http_access deny special_url
+
+
+How can I allow some clients to use the cache at specific times?

+Let's say you have two workstations that should only be allowed access +to the Internet during working hours (8:30 - 17:30). You can use +something like this: + +acl FOO src 10.1.2.3 10.1.2.4 +acl WORKING time MTWHF 08:30-17:30 +http_access allow FOO WORKING +http_access deny FOO + + +Problems with IP ACL's that have complicated netmasks +

+Note: The information here is current for version 2.3. +

+The following ACL entry gives inconsistent or unexpected results: + + acl restricted src 10.0.0.128/255.0.0.128 10.85.0.0/16 + +The reason is that IP access lists are stored in ``splay'' tree +data structures. These trees require the keys to be sortable. +When you use a complicated, or non-standard, netmask (255.0.0.128), it confuses +the function that compares two address/mask pairs. +

+The best way to fix this problem is to use separate ACL names +for each ACL value. For example, change the above to: + + acl restricted1 src 10.0.0.128/255.0.0.128 + acl restricted2 src 10.85.0.0/16 + +
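+The matching access rules then reference both names; a minimal
+sketch, assuming the original intent was to deny these addresses:
+
+	http_access deny restricted1
+	http_access deny restricted2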

+Then, of course, you'll have to rewrite your http_access rules in
+the same fashion, as sketched above.
+
+Can I set up ACL's based on MAC address rather than IP?

+Yes, for some operating systems. Squid calls these ``ARP ACLs'' and
+they are supported on Linux, Solaris, and probably BSD variants.

+NOTE: Squid can only determine the MAC address for clients that +are on the same subnet. If the client is on a different subnet, +then Squid can not find out its MAC address. +

+To use ARP (MAC) access controls, you
+first need to compile in the optional code. Do this with
+the --enable-arp-acl configure option:
+
+% ./configure --enable-arp-acl ...
+% make clean
+% make
+
+If src/acl.c doesn't compile, then ARP ACLs are probably not
+supported on your system.

+If everything compiles, then you can add some ARP ACL lines to
+your squid.conf:
+
+acl M1 arp 01:02:03:04:05:06
+acl M2 arp 11:12:13:14:15:16
+http_access allow M1
+http_access allow M2
+http_access deny all
+
+
+
+
+Troubleshooting
+
+Why am I getting ``Proxy Access Denied?''

+You may need to set up the http_access rules to allow requests
+from your client addresses. Please see the Access Controls section
+of this FAQ for information about that.

+If Squid is in the httpd-accelerator mode, it will accept normal HTTP
+requests and forward them to an HTTP server, but it will not honor
+proxy requests. If you want your cache to also accept proxy requests,
+you must enable this feature:
+
+	httpd_accel_with_proxy on
+
+Alternately, you may have misconfigured one of your ACLs. Check the
+access.log and squid.conf files for clues.
+
+I can't get local_domain to work; Squid caches the objects from my local servers.
+
+The local_domain directive does not prevent local objects from being
+cached. It prevents the use of sibling caches when fetching local
+objects.
+
+I get Connection Refused when the cache tries to retrieve an object located on a sibling, even though the sibling thinks it delivered the object to my cache.
+
+If the HTTP port number is wrong but the ICP port is correct you
+will send ICP queries correctly and the ICP replies will fool your
+cache into thinking the configuration is correct but large objects
+will fail since you don't have the correct HTTP port for the sibling
+in your squid.conf file. Double-check the HTTP port numbers on your
+cache_host lines against the sibling's actual configuration.
+
+Running out of filedescriptors

+
+If you see ``Too many open files'' error messages, you are running
+out of filedescriptors and should raise the limits for your
+operating system, as described below.
+
+Linux

+Start with Dancer's patch, but realize that
+this information is specific to the Linux 2.0.36 kernel.

+You also might want to
+have a look at the filehandle patch information available on the web.
+
+

+If your kernel version is 2.2.x or greater, you can read and write +the maximum number of file handles and/or inodes +simply by accessing the special files: + + /proc/sys/fs/file-max + /proc/sys/fs/inode-max + +So, to increase your file descriptor limit: + + echo 3072 > /proc/sys/fs/file-max + + +
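+On distributions that read /etc/sysctl.conf at boot time (check your
+init scripts; this is an assumption about your setup), you can make
+the change permanent with an entry like:
+
+	fs.file-max = 3072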

+If your kernel version is between 2.0.35 and 2.1.x (?), you can read and write +the maximum number of file handles and/or inodes +simply by accessing the special files: + + /proc/sys/kernel/file-max + /proc/sys/kernel/inode-max + + +

+While this does increase the current number of file descriptors,
+Squid's build may still be limited by the compile-time value found in
+/usr/include/linux/limits.h.
+
+Solaris

+Add the following to your /etc/system file to +increase your maximum file descriptors per process: +

+ + set rlim_fd_max = 4096 + +

+Next you should re-run the configure script
+in the top directory so that it finds the new value.
+If it does not find the new limit, then you might try
+editing include/autoconf.h and setting
+SQUID_MAXFD by hand. Note that
+include/autoconf.h is created from autoconf.h.in
+every time you run configure. Thus, if you edit it by
+hand, you might lose your changes later on.

+If you have a very old version of Squid (1.1.X), and you
+want to use more than 1024 descriptors, then you must
+edit src/Makefile and enable the poll() option, since
+select(2) is normally limited to 1024 descriptors.
+
+One contributor advises that you should NOT change the soft limit
+(rlim_fd_cur) above 256, since doing so is known to break other
+programs.
+
+IRIX

+For some hints, please see SGI's document. + +FreeBSD +

+(This section was contributed by a FreeBSD user.)
+
+How do I check my maximum filedescriptors?

Do sysctl kern.maxfiles and sysctl kern.maxfilesperproc and look at
the current values.

How do I increase them?
+
+	sysctl -w kern.maxfiles=XXXX
+	sysctl -w kern.maxfilesperproc=XXXX
+
+Warning: You probably want maxfiles > maxfilesperproc if you're
+going to be pushing the limit.
+
+What is the upper limit?

I don't think there is a formal upper limit inside the kernel. +All the data structures are dynamically allocated. In practice +there might be unintended metaphenomena (kernel spending too much +time searching tables, for example). + + +General BSD +

+For most BSD-derived systems (SunOS, 4.4BSD, OpenBSD, FreeBSD, +NetBSD, BSD/OS, 386BSD, Ultrix) you can also use the ``brute force'' +method to increase these values in the kernel (requires a kernel +rebuild): + +How do I check my maximum filedescriptors? +

Do pstat -T and look for the files value, typically expressed as
the ratio of current to maximum.

How do I increase them the easy way?

One way is to increase the value of the maxusers variable in the
kernel configuration file and build a new kernel. This method is
quick and easy, but it also increases a wide variety of other
variables that you may not need or want increased.

Is there a more precise method?

Another way is to find the param.c file in your kernel build area
and change the arithmetic behind the relationship between
maxusers and the maximum number of open files.
+
+Here are a few examples which should lead you in the right direction:
+
+SunOS

Change the value of nfile in param.c by altering this equation:
+
+	int nfile = 16 * (NPROC + 16 + MAXUSERS) / 10 + 64;
+
+Where NPROC is defined by:
+
+	#define NPROC (10 + 16 * MAXUSERS)
+
+FreeBSD (from the 2.1.6 kernel)

Very similar to SunOS, edit /usr/src/sys/conf/param.c +and alter the relationship between maxfiles and maxfilesperproc variables: + + int maxfiles = NPROC*2; + int maxfilesperproc = NPROC*2; + +Where NPROC is defined by: +#define NPROC (20 + 16 * MAXUSERS) +The per-process limit can also be adjusted directly in the kernel +configuration file with the following directive: +options OPEN_MAX=128 +BSD/OS (from the 2.1 kernel) +

Edit /usr/src/sys/conf/param.c and adjust the +maxfiles math here: + + int maxfiles = 3 * (NPROC + MAXUSERS) + 80; + +Where NPROC is defined by: +#define NPROC (20 + 16 * MAXUSERS) +You should also set the OPEN_MAX value in your kernel +configuration file to change the per-process limit. + + +Reconfigure afterwards +

+After increasing your kernel's filedescriptor limits, you must
+recompile Squid so that its configure script detects the new values:
+
+	cd squid-1.1.x
+	make realclean
+	./configure --prefix=/usr/local/squid
+	make
+
+
+What are these strange lines about removing objects?

+For example:
+
+	97/01/23 22:31:10| Removed 1 of 9 objects from bucket 3913
+	97/01/23 22:33:10| Removed 1 of 5 objects from bucket 4315
+	97/01/23 22:35:40| Removed 1 of 14 objects from bucket 6391
+
+
+These log entries are normal, and do not indicate that your cache
+has a problem.
+
+Consult your cache information page in cachemgr.cgi for a line like
+this:
+
+	Storage LRU Expiration Age:	364.01 days
+
+
+Objects which have not been used for that amount of time are removed as
+a part of the regular maintenance. You can set an upper limit on the
+LRU Expiration Age value with reference_age in the config file.
+
+Can I change a Windows NT FTP server to list directories in Unix format?
+

+Why, yes you can! Select the following menus: + +Start +Programs +Microsoft Internet Server (Common) +Internet Service Manager + +

+This will bring up a box with icons for your various services. One of +them should be a little ftp ``folder.'' Double click on this. +

+You will then have to select the server (there should only be one).
+Select it, then choose ``Properties'' from the menu, and choose the
+``directories'' tab along the top.

+There will be an option at the bottom saying ``Directory listing style.'' +Choose the ``Unix'' type, not the ``MS-DOS'' type. +

+ +--Oskar Pearson <oskar@is.co.za> + + +Why am I getting ``Ignoring MISS from non-peer x.x.x.x?'' + +

+You are receiving ICP MISSes (via UDP) from a parent or sibling cache +whose IP address your cache does not know about. This may happen +in two situations. + +

+ + +If the peer is multihomed, it is sending packets out an interface +which is not advertised in the DNS. Unfortunately, this is a +configuration problem at the peer site. You can tell them to either +add the IP address interface to their DNS, or use Squid's +'udp_outgoing_address' option to force the replies +out a specific interface. For example: +

+
+on your parent's squid.conf:
+
+	udp_outgoing_address proxy.parent.com
+
+and on your own squid.conf:
+
+	cache_host proxy.parent.com parent 3128 3130
+
+
+
+You can also see this warning when sending ICP queries to
+multicast addresses. For security reasons, Squid requires
+your configuration to list all other caches listening on the
+multicast group address. If an unknown cache listens to that address
+and sends replies, your cache will log the warning message. To fix
+this situation, either tell the unknown cache to stop listening
+on the multicast address, or if they are legitimate, add them
+to your configuration file.
+
+
+DNS lookups for domain names with underscores (_) always fail.
+

+The standards for naming hosts
+(RFC 952 and RFC 1101)
+do not allow underscores in domain names:
+
+A "name" (Net, Host, Gateway, or Domain name) is a text string up
+to 24 characters drawn from the alphabet (A-Z), digits (0-9), minus
+sign (-), and period (.).
+
+The resolver library that ships with recent versions of BIND enforces
+this restriction, returning an error for any host with underscore in
+the hostname. The best solution is to complain to the hostmaster of the
+offending site, and ask them to rename their host.
+

+Some people have noticed that RFC 1033
+implies that underscores are allowed. However, that is an
+informational RFC with a poorly chosen example, and not a standard.
+
+Why does Squid say: ``Illegal character in hostname; underscores are not allowed?'
+

+See the above question. The underscore character is not +valid for hostnames. + +

+Some DNS resolvers allow the underscore, so yes, the hostname +might work fine when you don't use Squid. + +

+To make Squid allow underscores in hostnames, re-run the +configure script with this option: + + % ./configure --enable-underscores ... + +and then recompile: + + % make clean + % make + + +Why am I getting access denied from a sibling cache? + +

+The answer to this is somewhat complicated, so please hold on.
+

+An ICP query does not include any parent or sibling designation,
+so the receiver really has no indication of how the peer
+cache is configured to use it. This issue becomes important
+when a cache is willing to serve cache hits to anyone, but only
+handle cache misses for its paying users or customers. In other
+words, whether or not to allow the request depends on if the
+result is a hit or a miss. To accomplish this,
+Squid acquired the miss_access feature.
+
+The necessity of ``miss access'' makes life a little bit complicated,
+and not only because it was awkward to implement. Miss access
+means that the ICP query reply must be an extremely accurate prediction
+of the result of a subsequent HTTP request. Ascertaining
+this result is actually very hard, if not impossible to
+do, since the ICP request cannot convey the
+full HTTP request.
+Additionally, there are more types of HTTP request results than there
+are for ICP. The ICP query reply will either be a hit or miss.
+However, the HTTP request might result in a ``304 Not Modified'' reply
+sent from the origin server. Such a reply is not strictly a hit, since
+the peer needed to forward a conditional request to the source. At the
+same time it is not strictly a miss either, since the local object
+data is still valid and the Not-Modified reply is quite small.
+
+One serious problem for cache hierarchies is mismatched freshness
+parameters. Consider a cache C using ``strict'' freshness parameters
+so its users get maximally current data. C has a sibling S with less
+strict freshness parameters. When an object is requested at C, C might
+find that S already has the object via an ICP query and ICP HIT
+response. C then retrieves the object from S.
+
+In an HTTP/1.0 world, C (and C's client) will receive an object that
+was never subject to its local freshness rules. Neither HTTP/1.0 nor
+ICP provides any way to ask only for objects less than a certain age.
+
+HTTP/1.1 provides numerous request headers to specify freshness
+requirements, which actually introduces
+a different problem for cache hierarchies: ICP
+still does not include any age information, neither in query nor
+reply. So S may return an ICP HIT if its copy of the object is fresh
+by its own configuration parameters, but the subsequent HTTP request
+may result in a cache miss due to any Cache-Control headers
+originated by C (or by C's client). Situations now emerge where the
+ICP reply no longer matches the HTTP request result.
+
+In the end, the fundamental problem is that the ICP query does not
+provide enough information to accurately predict whether
+the HTTP request
+will be a hit or miss. In fact, the current ICP Internet Draft is very
+vague on this subject. What does ICP HIT really mean? Does it mean
+``I know a little about that URL and have some copy of the object?'' Or
+does it mean ``I have a valid copy of that object and you are allowed to
+get it from me?''
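+As an aside, the ``miss access'' feature mentioned above is configured
+with miss_access rules. A minimal sketch, with a hypothetical
+customer network:
+
+	acl customers src 10.0.0.0/255.0.0.0
+	acl all src 0.0.0.0/0.0.0.0
+	miss_access allow customers
+	miss_access deny all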

+So, what can be done about this problem? We really need to change ICP
+so that freshness parameters are included. Until that happens, the members
+of a cache hierarchy have only two options to totally eliminate the ``access
+denied'' messages from sibling caches:
+
+Make sure all members have the same refresh rules
+(refresh_pattern parameters).
+Do not use miss_access at all.
+
+If neither of these is realistic, then the sibling relationship should not
+exist.
+
+Cannot bind socket FD NN to *:8080 (125) Address already in use
+

+This means that another process is already listening on port 8080
+(or whatever you're using). It could mean that you have a Squid process
+already running, or it could be from another program. To verify, use
+the netstat command:
+
+	netstat -naf inet | grep LISTEN
+
+That will show all sockets in the LISTEN state. You might also try
+
+	netstat -naf inet | grep 8080
+
+If you find that some process has bound to your port, but you're not sure
+which process it is, you might be able to use the excellent lsof
+program (for example, lsof -i :8080). It will show you which processes
+own every open file descriptor on your system.
+
+icpDetectClientClose: ERROR xxx.xxx.xxx.xxx: (32) Broken pipe
+

+This means that the client socket was closed by the client
+before Squid was finished sending data to it. Squid detects this
+by trying to read(2) some data from the socket; the read fails,
+indicating the connection has been closed.
+
+icpDetectClientClose: FD 135, 255 unexpected bytes

+These are caused by misbehaving Web clients attempting to use persistent +connections. Squid-1.1 does not support persistent connections. + +How come Squid doesn't work with NTLM Authorization. + +

+We are not sure. We were unable to find any detailed information +on NTLM (thanks Microsoft!), but here is our best guess: + +

+Squid transparently passes the NTLM request and response headers between +clients and servers. The encrypted challenge and response strings most likely +encode the IP address of the client. Because the proxy is passing these +strings and is connected with a different IP address, the authentication +scheme breaks down. +This implies that if NTLM authentication works at all with proxy caches, the proxy +would need to intercept the NTLM headers and process them itself. + +

+Henrik Nordstrom adds the following information about NTLM: + +

+NTLM authentication is carried entirely inside the HTTP protocol, but is
+different from Basic authentication in many ways.
+
+
+
+It is dependent on the IP addresses of both the server and the
+client, and thus cannot be proxied by an application level proxy (not
+even Microsoft Proxy server).
+
+
+It takes place only once per connection, not per request. Once
+the connection is authenticated then all future requests on the same
+connection inherit the authentication. The connection must be
+reestablished to set up other authentication.
+
+

+The reasons why it is not implemented in Netscape are probably:
+
+
+	It is very specific for the Windows platform
+
+	It is not defined in any RFC or even internet draft.
+
+	The protocol has several shortcomings, where the most apparent one is
+that it cannot be proxied.
+
+	There exists an open internet standard which does mostly the same but
+without the shortcomings or platform dependencies: Digest
+authentication.
+
+
+
+The default parent option isn't working!
+
+This message was received on the Squid mailing list:
+
+If you have only one parent, configured as:
+
+	cache_host xxxx parent 3128 3130 no-query default
+
+nothing is sent to the parent; neither UDP packets, nor TCP connections.
+
+

+Simply adding default to a parent does not force all requests to be
+sent to that parent; it only marks the parent as a last-resort
+choice. To force all requests through the parent, use never_direct:
+
+	acl all src 0.0.0.0/0.0.0.0
+	never_direct allow all
+
+
+``Hot Mail'' complains about: Intrusion Logged.  Access denied.
+

+``Hot Mail'' is proxy-unfriendly and requires all requests to come from
+the same IP address. You can fix this by adding to your squid.conf:
+
+	hierarchy_stoplist hotmail.com
+
+
+My Squid becomes very slow after it has been running for some time.
+

+This is most likely because Squid is using more memory than it should be +for your system. When the Squid process becomes large, it experiences a lot +of paging. This will very rapidly degrade the performance of Squid. +Memory usage is a complicated problem. There are a number +of things to consider. + +

+First, examine the Cache Manager info page to find these two lines:
+
+	Number of HTTP requests received:	121104
+	Page faults with physical i/o:	16720
+
+Note, if your system does not have the getrusage() function, then you
+will not see the page faults counter.
+
+Divide the number of page faults by the number of connections. In this
+case 16720/121104 = 0.14. Ideally this ratio should be in the 0.0 - 0.1
+range. It may be acceptable to be in the 0.1 - 0.2 range. Above that,
+however, and you will most likely find that Squid's performance is
+unacceptably slow.

+If the ratio is too high, you will need to make some changes to
+reduce the amount of memory Squid uses; see the sections of this FAQ
+on memory usage.
+
+WARNING: Failed to start 'dnsserver'
+

+This could be a permission problem. Does the Squid userid have
+permission to execute the dnsserver program?
+
+You might also try testing dnsserver from the command line:
+
+	> echo oceana.nlanr.net | ./dnsserver
+
+Should produce something like:
+
+	$name oceana.nlanr.net
+	$h_name oceana.nlanr.net
+	$h_len 4
+	$ipcount 1
+	132.249.40.200
+	$aliascount 0
+	$ttl 82067
+	$end
+
+
+Sending in Squid bug reports

+Bug reports for Squid should be sent to the squid-bugs mailing
+list. Any bug report must include
+
+The Squid version
+Your Operating System type and version
+A clear description of the bug symptoms
+
+
+crashes and core dumps

+There are two conditions under which squid will exit abnormally and +generate a coredump. First, a SIGSEGV or SIGBUS signal will cause Squid +to exit and dump core. Second, many functions include consistency +checks. If one of those checks fail, Squid calls abort() to generate a +core dump. + +

+Many people report that Squid doesn't leave a coredump anywhere. This may be +due to one of the following reasons: + + + Resource Limits. The shell has limits on the size of a coredump + file. You may need to increase the limit. + + No debugging symbols. + The Squid binary must have debugging symbols in order to get + a meaningful coredump. + + Threads and Linux. On Linux, threaded applications do not generate + core dumps. When you use --enable-async-io, it uses threads and + you can't get a coredump. + + It did leave a coredump file, you just can't find it. + + + +

+Resource limits are set in the
+/etc/login.conf file on FreeBSD and maybe other BSD
+systems.

+To change the coredumpsize limit you might use a command like: + + limit coredumpsize unlimited + +or + + limits coredump unlimited + + +

+To see if the binary has debugging symbols, use the nm command:
+
+	% nm /usr/local/squid/bin/squid | head
+
+The binary has debugging symbols if you see gobbledegook like this:
+
+	0812abec B AS_tree_head
+	080a7540 D AclMatchedName
+	080a73fc D ActionTable
+	080908a4 r B_BYTES_STR
+	080908bc r B_GBYTES_STR
+	080908ac r B_KBYTES_STR
+	080908b4 r B_MBYTES_STR
+	080a7550 D Biggest_FD
+	08097c0c R CacheDigestHashFuncCount
+	08098f00 r CcAttrs
+
+There are no debugging symbols if you see this instead:
+
+	/usr/local/squid/bin/squid: no symbols
+
+Debugging symbols may have been
+removed by your install program.
+
+The core dump file will be left in one of the following locations:
+the coredump_dir directory, if you set that option; the first
+cache_dir directory, if you have used the cache_effective_user
+option; or the current directory when Squid was started.
+
+Recent versions of Squid report their current directory after
+starting, so look there first:
+
+	2000/03/14 00:12:36| Set Current Directory to /usr/local/squid/cache
+
+If you cannot find a core file, then either Squid does not have
+permission to write in its current directory, or perhaps your shell
+limits (csh and clones) are preventing the core file from being written.
+

+Often you can get a coredump if you run Squid from the
+command line like this:
+
+	% limit coredumpsize unlimited
+	% /usr/local/squid/bin/squid -NCd1
+
+

+Once you have located the core dump file, use a debugger such as
+gdb. For example:
+
+
+tirana-wessels squid/src 270% gdb squid /T2/Cache/core
+GDB is free software and you are welcome to distribute copies of it
+ under certain conditions; type "show copying" to see the conditions.
+There is absolutely no warranty for GDB; type "show warranty" for details.
+GDB 4.15.1 (hppa1.0-hp-hpux10.10), Copyright 1995 Free Software Foundation, Inc...
+Core was generated by `squid'.
+Program terminated with signal 6, Aborted.
+
+[...]
+
+(gdb) where
+#0  0xc01277a8 in _kill ()
+#1  0xc00b2944 in _raise ()
+#2  0xc007bb08 in abort ()
+#3  0x53f5c in __eprintf (string=0x7b037048 "", expression=0x5f..., line=8, filename=0x6b...)
+#4  0x29828 in fd_open (fd=10918, type=3221514150, desc=0x95e4 "HTTP Request") at fd.c:71
+#5  0x24f40 in comm_accept (fd=2063838200, peer=0x7b0390b0, me=0x6b) at comm.c:574
+#6  0x23874 in httpAccept (sock=33, notused=0xc00467a6) at client_side.c:1691
+#7  0x25510 in comm_select_incoming () at comm.c:784
+#8  0x25954 in comm_select (sec=29) at comm.c:1052
+#9  0x3b04c in main (argc=1073745368, argv=0x40000dd8) at main.c:671

+If possible, you might keep the coredump file around for a day or +two. It is often helpful if we can ask you to send additional +debugger output, such as the contents of some variables. + +Debugging Squid + +

+If you believe you have found a non-fatal bug (such as incorrect HTTP +processing) please send us a section of your cache.log with debugging to +demonstrate the problem. The cache.log file can become very large, so +alternatively, you may want to copy it to an FTP or HTTP server where we +can download it. + +

+It is very simple to
+enable full debugging on a running squid process. Simply use the
+-k debug command line option:
+
+	% ./squid -k debug
+
+This causes every debug() statement in the source code to write a
+line in the cache.log file. Use the same command again to return to
+the normal debugging level.
+
+To enable selective debugging (e.g. for one source file only), you
+need to know which debug section controls it; consult
+doc/debug-levels.txt. Then set the debug_options line in your
+configuration file accordingly, for example:
+
+	debug_options ALL,1 28,9
+
+Then you have to restart or reconfigure Squid.
+

+Once you have the debugging output captured to cache.log, take a
+look at it and include the relevant portion with your report.
+
+FATAL: ipcache_init: DNS name lookup tests failed

+Squid normally tests your system's DNS configuration before
+it starts server requests. Squid tries to resolve some
+common DNS names, as defined in the dns_testnames configuration
+option. If these tests fail, it could mean:
+
+your DNS nameserver is unreachable or not running.
+your /etc/resolv.conf file may contain incorrect information.
+your /etc/resolv.conf file may have incorrect permissions, and
+ may be unreadable by Squid.
+
+
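+If your resolver itself is fine but the default test names are not
+resolvable from your network, you can point the startup tests at
+names you know will resolve (the hostnames here are only examples):
+
+	dns_testnames internal.example.com www.example.com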

+To disable this feature, use the -D command line option.
+
+Note, Squid does NOT use the dnsserver processes to perform these
+tests; they are done internally, before the dnsservers start.
+
+FATAL: Failed to make swap directory /var/spool/cache: (13) Permission denied

+Starting with version 1.1.15, we have required that you first run
+
+	squid -z
+
+to create the swap directories on your filesystem. If you have set the
+cache_effective_user option, then the Squid process takes on that
+userid before making the directories. If the directories do not
+already exist and the Squid userid cannot create them, you get this
+error. The fix is to create the top-level directory by hand and give
+it to the Squid userid:
+
+	# mkdir /var/spool/cache
+	# chown <userid> /var/spool/cache
+	# squid -z
+
+

+Alternatively, if the directory already exists, then your operating
+system may be returning ``Permission Denied'' instead of ``File Exists''
+on the mkdir() system call. A small patch to the Squid source
+(available from the Squid web site) should fix it.
+
+FATAL: Cannot open HTTP Port

+Either (1) the Squid userid does not have permission to bind to the port, or
+(2) some other process has bound itself to the port.
+Remember that root privileges are required to open port numbers
+less than 1024. If you see this message when using a high port number,
+or even when starting Squid as root, then the port has already been
+opened by another process.
+Maybe you are running in the HTTP Accelerator mode and there is
+already a HTTP server running on port 80? If you're really stuck,
+install the way cool lsof
+utility to show you which process has your port in use.
+
+FATAL: All redirectors have exited!

+This is explained in the Redirectors section of this FAQ.
+
+FATAL: file_map_allocate: Exceeded filemap limit

+See the next question. + +FATAL: You've run out of swap file numbers. +

+Note: The information here applies to version 2.2 and earlier. +

+Squid keeps an in-memory bitmap of disk files that are
+available for use, or are being used. The size of this
+bitmap is determined at run time, based on two things:
+the size of your cache, and the average (mean) cache object size.
+
+The size of your cache is specified in squid.conf, on the cache_dir
+lines. The mean object size is also a squid.conf parameter,
+store_avg_object_size (13 KB by default).
+
+When allocating the bitmaps, Squid allocates this many bits:
+
+	2 * cache_size / store_avg_object_size
+
+
+So, if you exactly specify the correct average object size,
+Squid should have 50% filemap bits free when the cache is full.
+You can see how many filemap bits are being used by looking
+at the 'storedir' cache manager page. It looks like this:
+
+
+	Store Directory #0: /usr/local/squid/cache
+	First level subdirectories: 4
+	Second level subdirectories: 4
+	Maximum Size: 1024000 KB
+	Current Size: 924837 KB
+	Percent Used: 90.32%
+	Filemap bits in use: 77308 of 157538 (49%)
+	Flags:
+
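+As a check on the arithmetic, the listing above is consistent with
+the formula: a 1024000 KB cache and the default
+store_avg_object_size of 13 KB give 2 * 1024000 / 13 = 157538
+filemap bits.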

+Now, if you see the ``You've run out of swap file numbers'' message,
+then it means one of two things:
+
+
+	You've found a Squid bug.
+
+	Your cache's average file size is much smaller
+	than the 'store_avg_object_size' value.
+
+
+To check the average file size of the objects currently in your
+cache, look at the cache manager 'info' page, and you will
+find a line like:
+
+	Mean Object Size:	11.96 KB
+

+To make the warning message go away, set 'store_avg_object_size' +to that value (or lower) and then restart Squid. + +I am using up over 95% of the filemap bits?!! +

+Note: The information here is current for version 2.3 +

+Calm down, this is now normal. Squid now dynamically allocates +filemap bits based on the number of objects in your cache. +You won't run out of them, we promise. + + +FATAL: Cannot open /usr/local/squid/logs/access.log: (13) Permission denied +

+In Unix, things like processes and files have owners. For Squid to
+write its log files, the files must be writable by the owner of the
+Squid process.
+
+To find out who owns a file, use the ls -l command:
+
+	% ls -l /usr/local/squid/logs/access.log
+
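+If the owner is wrong, reassign the file to the userid that Squid
+runs as (the username squid here is only an example):
+
+	# chown squid /usr/local/squid/logs/access.log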

+A process is normally owned by the user who starts it. However,
+Unix sometimes allows a process to change its owner. If you
+specified a value for the cache_effective_user option in squid.conf,
+then that is the userid which must be able to write the log files.
+
+If all this is confusing, then you probably should not be
+running Squid until you learn some more about Unix.
+As a reference, I suggest an introductory Unix book.
+
+When using a username and password, I can not access some files.
+

+If, by way of a test, I try to access
+
+	ftp://username:password@ftpserver/somewhere/foo.tar.gz
+
+I get
+
+	somewhere/foo.tar.gz: Not a directory.
+
+

+Use this URL instead: + + ftp://username:password@ftpserver/%2fsomewhere/foo.tar.gz + + +pingerOpen: icmp_sock: (13) Permission denied +

+This means your pinger program does not
+have root privileges. You should either do this:
+
+	% su
+	# make install-pinger
+
+or
+
+	# chown root /usr/local/squid/bin/pinger
+	# chmod 4755 /usr/local/squid/bin/pinger
+
+
+What is a forwarding loop?

+A forwarding loop is when a request passes through one proxy more than +once. You can get a forwarding loop if + +a cache forwards requests to itself. This might happen with + transparent caching (or server acceleration) configurations. +a pair or group of caches forward requests to each other. This can + happen when Squid uses ICP, Cache Digests, or the ICMP RTT database + to select a next-hop cache. + + +

+Forwarding loops are detected by examining the Via request header.
+Each cache which ``touches'' a request must add its hostname to the
+Via header. If a cache notices its own hostname in the Via header of
+an incoming request, it knows there is a forwarding loop somewhere.
+
+When Squid detects a forwarding loop, it is logged to the cache.log
+file with the received Via header.
+
+One way to reduce forwarding loops is to change a parent relationship
+to a sibling relationship.
+
+Another way is to use cache_peer_access rules. For example:
+
+	# Our parent caches
+	cache_peer A.example.com parent 3128 3130
+	cache_peer B.example.com parent 3128 3130
+	cache_peer C.example.com parent 3128 3130
+
+	# An ACL list
+	acl PEERS src A.example.com
+	acl PEERS src B.example.com
+	acl PEERS src C.example.com
+
+	# Prevent forwarding loops
+	cache_peer_access A.example.com allow !PEERS
+	cache_peer_access B.example.com allow !PEERS
+	cache_peer_access C.example.com allow !PEERS
+
+The above configuration instructs squid to NOT forward a request
+to parents A, B, or C when a request is received from any one
+of those caches.
+
+accept failure: (71) Protocol error

+This error message is seen mostly on Solaris systems. One user
+gives a great explanation:
+
+Error 71 [EPROTO] is an obscure way of reporting that clients made it onto your
+server's TCP incoming connection queue but the client tore down the
+connection before the server could accept it. I.e. your server ignored
+its clients for too long. We've seen this happen when we ran out of
+file descriptors. I guess it could also happen if something made squid
+block for a long time.
+
+
+storeSwapInFileOpened: ... Size mismatch

+ +Got these messages in my cache log - I guess it means that the index +contents do not match the contents on disk. + + +1998/09/23 09:31:30| storeSwapInFileOpened: /var/cache/00/00/00000015: Size mismatch: 776(fstat) != 3785(object) +1998/09/23 09:31:31| storeSwapInFileOpened: /var/cache/00/00/00000017: Size mismatch: 2571(fstat) != 4159(object) + + +

+ +What does Squid do in this case? + + +

+NOTE, these messages are specific to Squid-2. These happen when Squid +reads an object from disk for a cache hit. After it opens the file, +Squid checks to see if the size is what it expects it should be. If the +size doesn't match, the error is printed. In this case, Squid does not +send the wrong object to the client. It will re-fetch the object from +the source. + +Why do I get fwdDispatch: Cannot retrieve 'https://www.buy.com/corp/ordertracking.asp' +

+These messages are caused by buggy clients, mostly Netscape Navigator.
+What happens is, Netscape sends an HTTPS/SSL request over a persistent HTTP connection.
+Normally, when Squid gets an SSL request, it looks like this:
+
+	CONNECT www.buy.com:443 HTTP/1.0
+
+Then Squid opens a TCP connection to the destination host and port, and
+the encrypted data simply flows through the tunnel in both directions,
+uninterpreted.
+
+With this client bug, however, Squid receives a request like this:
+
+	GET https://www.buy.com/corp/ordertracking.asp HTTP/1.0
+	Accept: */*
+	User-agent: Netscape ...
+	...
+
+Now, all of the headers, and the message body have been sent
+unencrypted, and Squid has no way to service an https URL itself, so
+it logs the ``Cannot retrieve'' message.
+
+Note, this browser bug does represent a security risk because the browser
+is sending sensitive information unencrypted over the network.
+
+Squid can't access URLs like http://3626046468/ab2/cybercards/moreinfo.html

+by Dave J Woolley (DJW at bts dot co dot uk) +

+These are illegal URLs, generally only used by illegal sites; +typically the web site that supports a spammer and is expected to +survive a few hours longer than the spamming account. +

+ Their intention is to: + + + confuse content filtering rules on proxies, and possibly + some browsers' idea of whether they are trusted sites on + the local intranet; + + confuse whois (?); + + make people think they are not IP addresses and unknown + domain names, in an attempt to stop them trying to locate + and complain to the ISP. + +

+Any browser or proxy that works with them should be considered a +security risk. +

+
+The URL syntax standard has this to say about the hostname part of a URL:
+
+	The fully qualified domain name of a network host, or its IP
+	address as a set of four decimal digit groups separated by
+	".". Fully qualified domain names take the form as described
+	in Section 3.5 of RFC 1034 [13] and Section 2.1 of RFC 1123
+	[5]: a sequence of domain labels separated by ".", each domain
+	label starting and ending with an alphanumerical character and
+	possibly also containing "-" characters. The rightmost domain
+	label will never start with a digit, though, which
+	syntactically distinguishes all domain names from the IP
+	addresses.
+
+
+I get a lot of ``URI has whitespace'' error messages in my cache log, what should I do?
+

+Whitespace characters (space, tab, newline, carriage return) are
+not allowed in URI's and URL's. Unfortunately, a number of Web services
+generate URL's with whitespace. Of course your favorite browser silently
+accommodates these bad URL's. The servers (or people) that generate
+these URL's are in violation of Internet standards. The whitespace
+characters should be encoded.

+If you want Squid to accept URL's with whitespace, you have to
+decide how to handle them. There are four choices that you
+can set with the uri_whitespace option in squid.conf:
+
+
+	DENY:
+	The request is denied with an ``Invalid Request'' message.
+	This is the default.
+
+	ALLOW:
+	The request is allowed and the URL remains unchanged.
+
+	ENCODE:
+	The whitespace characters are encoded according to
+	RFC 1738. This can be considered a violation
+	of the HTTP specification.
+
+	CHOP:
+	The URL is chopped at the first whitespace character
+	and then processed normally. This also can be considered
+	a violation of HTTP.
+
+
+commBind: Cannot bind socket FD 5 to 127.0.0.1:0: (49) Can't assign requested address

+This likely means that your system does not have a loopback network device, or
+that device is not properly configured.
+All Unix systems should have a network device named lo0, configured
+with the address 127.0.0.1. To check, run the ifconfig command:
+
+	% ifconfig lo0
+
+The result should look something like:
+
+	lo0: flags=8049 mtu 16384
+		inet 127.0.0.1 netmask 0xff000000
+
+
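+If the loopback device exists but is not configured, you can usually
+bring it up by hand, though the exact syntax varies between systems;
+for example:
+
+	# ifconfig lo0 inet 127.0.0.1 netmask 255.0.0.0 up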

+If you use FreeBSD, see the FreeBSD-specific note in the online
+version of this FAQ.
+
+Unknown cache_dir type '/var/squid/cache'

+The format of the cache_dir option changed with Squid 2.3: it now
+takes a filesystem type as its first argument. The line should read:
+
+	cache_dir ufs /var/squid/cache ...
+
+
+unrecognized: 'cache_dns_program /usr/local/squid/bin/dnsserver'

+As of Squid 2.3, the default is to use internal DNS lookup code.
+The cache_dns_program and dns_children options are then not
+recognized in squid.conf.
+If you want to use external DNS lookups, with the dnsserver program,
+then configure Squid with this option:
+
+	--disable-internal-dns
+
+
+Is dns_defnames broken in Squid 2.3?
+
+Sort of. As of Squid 2.3, the default is to use internal DNS lookup code.
+The dns_defnames option is only used with the external dnsserver
+processes. If you relied on dns_defnames, you have three choices:
+
+
+	See if the append_domain option will work for you instead.
+
+	Configure squid with --disable-internal-dns to use the external
+	dnsservers.
+
+	Enhance src/dns_internal.c to understand the search and domain
+	lines from /etc/resolv.conf.
+
+
+What does sslReadClient: FD 14: read failure: (104) Connection reset by peer mean?

+``Connection reset by peer'' is an error code that Unix operating systems
+sometimes return for read(2), write(2), connect(2), and other system calls.
+
+Connection reset means that the other host, the peer, sent us a RESET
+packet on a TCP connection. A host sends a RESET when it receives
+an unexpected packet for a nonexistent connection. For example, if
+one side sends data at the same time that the other side closes
+a connection, when the other side receives the data it may send
+a reset back.

+The fact that these messages appear in Squid's log might indicate +a problem, such as a broken origin server or parent cache. On +the other hand, they might be ``normal,'' especially since +some applications are known to force connection resets rather +than a proper close. +

+You probably don't need to worry about them, unless you receive +a lot of user complaints relating to SSL sites. +

+One user notes that
+if the server is running a Microsoft TCP stack, clients
+receive RST segments whenever the listen queue overflows. In other words,
+if the server is really busy, new connections receive the reset message.
+This is contrary to rational behaviour, but is unlikely to change.
+
+
+What does Connection refused mean?

+This is an error message, generated by your operating system,
+in response to a connect(2) system call.
+
+It's quite easy to generate this error on your own. Simply
+telnet to a random, high numbered port:
+
+% telnet localhost 12345
+Trying 127.0.0.1...
+telnet: Unable to connect to remote host: Connection refused
+
+It happens because there is no server listening for connections
+on port 12345.

+When you see this in response to a URL request, it probably means +the origin server web site is temporarily down. It may also mean +that your parent cache is down, if you have one. + +squid: ERROR: no running copy +

+You may get this message when you run commands like
+squid -k reconfigure.
+
+This error message usually means that the squid.pid file is missing.
+Since the PID file is normally present when Squid is running, its
+absence usually means Squid is not running.
+
+If you accidentally removed the PID file, there are two ways to get it back.
+
+run ps and find the Squid process id:
+
+bender-wessels % ps ax | grep squid
+83617  ??  Ss     0:00.00 squid -s
+83619  ??  S      0:00.48 (squid) -s (squid)
+
+You want the second process id, 83619 in this case. Create the PID file and put the
+process id number there. For example:
+
+echo 83619 > /usr/local/squid/logs/squid.pid
+
+
+Use the above technique to find the Squid process id. Send the process a HUP
+signal, which is the same as squid -k reconfigure:
+
+kill -HUP 83619
+
+The reconfigure process creates a new PID file automatically.
+
+
+
+
+How does Squid work?
+
+What are cachable objects?

+An Internet Object is a file, document or response to a query for +an Internet service such as FTP, HTTP, or gopher. A client requests +an Internet object from a caching proxy; if the object +is not already cached, the proxy server fetches +the object (either from the host specified in the URL or from a +parent or sibling cache) and delivers it to the client. + +What is the ICP protocol? +

+ICP is a protocol used for communication among squid caches. +The ICP protocol is defined in two Internet RFC's. + +describes the protocol itself, while + +describes the application of ICP to hierarchical Web caching. + +

+ICP is primarily used within a cache hierarchy to locate specific +objects in sibling caches. If a squid cache does not have a +requested document, it sends an ICP query to its siblings, and the +siblings respond with ICP replies indicating a ``HIT'' or a ``MISS.'' +The cache then uses the replies to choose from which cache to +resolve its own MISS. + +

+ICP also supports multiplexed transmission of multiple object
+streams over a single TCP connection. ICP is currently implemented
+on top of UDP. Current versions of Squid also support ICP via
+multicast.
+
+What is the dnsserver?
+
+The dnsserver is an external process that performs DNS lookups on
+Squid's behalf. It is needed because
+the gethostbyname(3) function blocks the calling process
+until the DNS query is completed.

+Squid must use non-blocking I/O at all times, so DNS lookups are
+implemented external to the main process. The dnsserver processes do
+not cache DNS lookups; that is implemented inside the main Squid
+process.
+
+What is the ftpget program for?
+
+The ftpget program is an FTP client used by Squid 1.1 (and earlier)
+for retrieving files from FTP servers; Squid 2 speaks FTP itself.
+
+FTP PUT's don't work!

+FTP PUT should work with Squid-2.0 and later versions. If you +are using Squid-1.1, then you need to upgrade before PUT will work. + +What is a cache hierarchy? What are parents and siblings? +

+ +A cache hierarchy is a collection of caching proxy servers organized +in a logical parent/child and sibling arrangement so that caches +closest to Internet gateways (closest to the backbone transit +entry-points) act as parents to caches at locations farther from +the backbone. The parent caches resolve ``misses'' for their children. +In other words, when a cache requests an object from its parent, +and the parent does not have the object in its cache, the parent +fetches the object, caches it, and delivers it to the child. This +ensures that the hierarchy achieves the maximum reduction in +bandwidth utilization on the backbone transit links, helps reduce +load on Internet information servers outside the network served by +the hierarchy, and builds a rich cache on the parents so that the +other child caches in the hierarchy will obtain better ``hit'' rates +against their parents. + +

+In addition to the parent-child relationships, squid supports the
+notion of siblings: caches at the same level in the hierarchy,
+provided to distribute cache server load. Each cache in the
+hierarchy independently decides whether to fetch the reference from
+the object's home site or from parent or sibling caches, using
+a simple resolution protocol. Siblings will not fetch an object
+for another sibling to resolve a cache ``miss.''
+
+What is the Squid cache resolution algorithm?

+ + +Send ICP queries to all appropriate siblings +Wait for all replies to arrive with a configurable timeout +(the default is two seconds). +Begin fetching the object upon receipt of the first HIT reply, +or +Fetch the object from the first parent which replied with MISS +(subject to weighting values), or +Fetch the object from the source + + +

+The algorithm is somewhat more complicated when firewalls +are involved. + +

+The single_parent_bypass directive can be used to skip the ICP
+queries if the only appropriate sibling is a parent cache (i.e., if
+there's only one place you'd fetch the object from, why bother
+querying?)
+
+What features are Squid developers currently working on?

+ +There are several open issues for the caching project namely +more automatic load balancing and (both configured and +dynamic) selection of parents, routing, multicast +cache-to-cache communication, and better recognition of URLs +that are not worth caching. +

+For our other to-do list items, please +see our ``TODO'' file in the recent source distributions. + +

+Prospective developers should review the resources available at the
+Squid web site.
+
+
+Tell me more about Internet traffic workloads

+
+Workload can be characterized as the burden a client or
+group of clients imposes on a system. Understanding the
+nature of workloads is important to managing system
+capacity.
+
+If you are interested in Internet traffic workloads then NLANR's
+web site is a good place to start.
+
+What are the tradeoffs of caching with the NLANR cache system?

+ +The NLANR root caches are at the NSF supercomputer centers (SCCs), +which are interconnected via NSF's high speed backbone service +(vBNS). So inter-cache communication between the NLANR root caches +does not cross the Internet. + +

+The benefits of hierarchical caching (namely, reduced network +bandwidth consumption, reduced access latency, and improved +resiliency) come at a price. Caches higher in the hierarchy must +field the misses of their descendents. If the equilibrium hit rate +of a leaf cache is 50%, half of all leaf references have to be +resolved through a second level cache rather than directly from +the object's source. If this second level cache has most of the +documents, it is usually still a win, but if higher level caches +often don't have the document, or become overloaded, then they +could actually increase access latency, rather than reduce it. +

+ +Where can I find out more about firewalls? + +

+Please see the Firewalls FAQ
+information site.
+
+What is the ``Storage LRU Expiration Age?''

+For example: + + Storage LRU Expiration Age: 4.31 days + + +

+The LRU expiration age is a dynamically-calculated value. Any objects +which have not been accessed for this amount of time will be removed from +the cache to make room for new, incoming objects. Another way of looking +at this is that it would +take your cache approximately this many days to go from empty to full at +your current traffic levels. + +

+As your cache becomes more busy, the LRU age becomes lower so that more +objects will be removed to make room for the new ones. Ideally, your +cache will have an LRU age value in the range of at least 3 days. If the +LRU age is lower than 3 days, then your cache is probably not big enough +to handle the volume of requests it receives. By adding more disk space +you could increase your cache hit ratio. + +

+The reference_age configuration parameter sets an upper limit on the
+LRU expiration age.
+
+What is ``Failure Ratio at 1.01; Going into hit-only-mode for 5 minutes''?

+Consider a pair of caches named A and B. It may be the case that A can +reach B, and vice-versa, but B has poor reachability to the rest of the +Internet. +In this case, we would like B to recognize that it has poor reachability +and somehow convey this fact to its neighbor caches. + +

+Squid will track the ratio of failed-to-successful requests over short +time periods. A failed request is one which is logged as ERR_DNS_FAIL, ERR_CONNECT_FAIL, or ERR_READ_ERROR. When the failed-to-successful ratio exceeds 1.0, +then Squid will return ICP_MISS_NOFETCH instead of ICP_MISS to neighbors. +Note, Squid will still return ICP_HIT for cache hits. + +Does squid periodically re-read its configuration file? +

+No, you must send a HUP signal to have Squid re-read its configuration file,
+including access control lists. An easy way to do this is with the
+-k command line option:
+
+	squid -k reconfigure
+
+
+How does Squid handle the unlinking of cache files?
+
+Unlinking a disk file must be carefully sequenced with the reuse of
+swap file numbers, or a race like the following can occur:
+
+
+	An object with swap file number S is removed from the cache.
+	We want to unlink file S.
+	We have a new object to swap out. It is allocated to the first available
+	 file number, which happens to be S.
+	The unlink request for file S is finally executed, destroying the
+	 data belonging to the new object.
+
+So, the problem is, how can we guarantee that the unlink operation
+completes before the swap file number is reused?
+In terms of implementation, the only way to send unlink requests to
+the external unlinkd process is through a queue, so the unlink may
+happen some time after it was requested.
+Unfortunately there are times when Squid can not use the external
+unlink process and must call unlink(2) directly.
+
+What is an icon URL?

+One of the most unpleasant things Squid must do is generate HTML
+pages of Gopher and FTP directory listings. For some strange
+reason, people like to have little icons next to the listed files.
+
+In Squid 1.0 and 1.1, we used internal browser icons with names
+like gopher-internal-image. Unfortunately, these were only recognized
+by some browsers.
+
+For Squid 2 we include a set of icons in the source distribution.
+These icon files are loaded by Squid as cached objects at runtime.
+Thus, every Squid cache now has its own icons to use in Gopher and FTP
+listings. Just like other objects available on the web, we refer to
+the icons with URLs; these are the ``icon URLs.''
+
+Can I make my regular FTP clients use a Squid cache?
+

+Nope, it's not possible. Squid only accepts HTTP requests. It speaks
+FTP on the server-side, but not on the client-side.
+
+The very cool wget program
+will download FTP URLs via Squid (and probably any other proxy cache).
+
+Why is the select loop average time so high?

+ + +Is there any way to speed up the time spent dealing with select? Cachemgr +shows: + + + Select loop called: 885025 times, 714.176 ms avg + + +

+This number is NOT how much time it takes to handle filedescriptor I/O. +We simply count the number of times select was called, and divide the +total process running time by the number of select calls. + +

+This means, on average it takes your cache .714 seconds to check all +the open file descriptors once. But this also includes time select() +spends in a wait state when there is no I/O on any file descriptors. +My relatively idle workstation cache has similar numbers: + + Select loop called: 336782 times, 715.938 ms avg + +But my busy caches have much lower times: + + Select loop called: 16940436 times, 10.427 ms avg + Select loop called: 80524058 times, 10.030 ms avg + Select loop called: 10590369 times, 8.675 ms avg + Select loop called: 84319441 times, 9.578 ms avg + + +How does Squid deal with Cookies? + +

+The presence of Cookie: headers in client requests does not affect
+whether or not an HTTP reply can be cached. Similarly, the presence
+of Set-Cookie: headers in server replies does not affect whether the
+reply can be cached.
+
+The proper way to deal with Set-Cookie: reply headers, according to
+RFC 2109, is to cache the whole object, EXCEPT the Set-Cookie:
+header lines.
+
+With Squid-1.1, we can not filter out specific HTTP headers, so
+Squid-1.1 does not cache any response which contains a Set-Cookie:
+header.
+
+With Squid-2, however, we can filter out specific HTTP headers. But instead
+of filtering them on the receiving-side, we filter them on the sending-side.
+Thus, Squid-2 does cache replies with Set-Cookie: headers, but it
+filters out the Set-Cookie: header itself for cache hits.
+
+How does Squid decide when to refresh a cached object?
+

+When checking the object freshness, we calculate these values:
+
+	OBJ_AGE = NOW - OBJ_DATE
+		how much the object has aged since it was retrieved
+
+	LM_AGE = OBJ_DATE - OBJ_LASTMOD
+		how old the object was when it was retrieved
+
+	LM_FACTOR = OBJ_AGE / LM_AGE
+		the ratio of the object's age in the cache to its age
+		when it was retrieved
+

+These values are compared with the parameters of the refresh_pattern
+rules in the configuration file. The refresh parameters are:
+
+	the URL regular expression
+	CONF_MIN: the minimum age (in minutes) at which an object
+	 without an explicit expiry time is still considered fresh
+	CONF_PERCENT: a percentage of the object's age when retrieved
+	 (the LM_FACTOR threshold)
+	CONF_MAX: an upper limit (in minutes) on how long an object
+	 is considered fresh
+
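+In squid.conf these appear as refresh_pattern lines of the form
+``refresh_pattern regex min percent max'' (times in minutes). For
+example, the long-standing defaults shipped with Squid:
+
+	refresh_pattern ^ftp:		1440	20%	10080
+	refresh_pattern ^gopher:	1440	0%	1440
+	refresh_pattern .		0	20%	4320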

+The URL regular expressions are checked in the order listed until a +match is found. Then the algorithms below are applied for determining +if an object is fresh or stale. + +Squid-1.1 and Squid-1.NOVM algorithm +

+ + if (CLIENT_MAX_AGE) + if (OBJ_AGE > CLIENT_MAX_AGE) + return STALE + if (OBJ_AGE <= CONF_MIN) + return FRESH + if (EXPIRES) { + if (EXPIRES <= NOW) + return STALE + else + return FRESH + } + if (OBJ_AGE > CONF_MAX) + return STALE + if (LM_FACTOR < CONF_PERCENT) + return FRESH + return STALE + + +

+
+One contributor has made an excellent flow chart showing this process.
+
+Squid-2 algorithm
+

+For Squid-2 the refresh algorithm has been slightly modified to give
+the Expires: header a higher precedence, and the CONF_MIN value a
+lower precedence:
+
+	if (CLIENT_MAX_AGE)
+		if (OBJ_AGE > CLIENT_MAX_AGE)
+			return STALE
+	if (EXPIRES) {
+		if (EXPIRES <= NOW)
+			return STALE
+		else
+			return FRESH
+	}
+	if (OBJ_AGE > CONF_MAX)
+		return STALE
+	if (OBJ_DATE > OBJ_LASTMOD) {
+		if (LM_FACTOR < CONF_PERCENT)
+			return FRESH
+		else
+			return STALE
+	}
+	if (OBJ_AGE <= CONF_MIN)
+		return FRESH
+	return STALE
+
+
+What exactly is a deferred read?
+
+The cachemanager I/O page lists deferred reads for the various
+server-side protocols.
+
+Sometimes reading on the server-side gets ahead of writing to the
+client-side. Especially if your cache is on a fast network and your
+clients are connected at modem speeds. Squid-1.1 will read up to 256k
+(per request) ahead before it starts to defer the server-side reads.
+
+Why is my cache's inbound traffic equal to the outbound traffic?

+
+I've been monitoring
+the traffic on my cache's ethernet adapter and found a behavior I can't explain:
+the inbound traffic is equal to the outbound traffic. The differences are
+negligible. The hit ratio reports 40%.
+Shouldn't the outbound be at least 40% greater than the inbound?
+

+One list member replies:

+I can't account for the exact behavior you're seeing, but I can offer this +advice; whenever you start measuring raw Ethernet or IP traffic on +interfaces, you can forget about getting all the numbers to exactly match what +Squid reports as the amount of traffic it has sent/received. + +

+Why? + +

+Squid is an application - it counts whatever data is sent to, or received +from, the lower-level networking functions; at each successively lower layer, +additional traffic is involved (such as header overhead, retransmits and +fragmentation, unrelated broadcasts/traffic, etc.). The additional traffic is +never seen by Squid and thus isn't counted - but if you run MRTG (or any +SNMP/RMON measurement tool) against a specific interface, all this additional +traffic will "magically appear". + +

+Also remember that an interface has no concept of upper-layer networking (so +an Ethernet interface doesn't distinguish between IP traffic that's entirely +internal to your organization, and traffic that's to/from the Internet); this +means that when you start measuring an interface, you have to be aware of +*what* you are measuring before you can start comparing numbers elsewhere. + +

+It is possible (though by no means guaranteed) that you are seeing roughly +equivalent input/output because you're measuring an interface that both +retrieves data from the outside world (Internet), *and* serves it to end users +(internal clients). That wouldn't be the whole answer, but hopefully it gives +you a few ideas to start applying to your own circumstance. + +

+To interpret any statistic, you have to first know what you are measuring; +for example, an interface counts inbound and outbound bytes - that's it. The +interface doesn't distinguish between inbound bytes from external Internet +sites or from internal (to the organization) clients (making requests). If +you want that, try looking at RMON2. + +

+Also, if you're talking about a 40% hit rate in terms of object +requests/counts then there's absolutely no reason why you should expect a 40% +reduction in traffic; after all, not every request/object is going to be the +same size so you may be saving a lot in terms of requests but very little in +terms of actual traffic. + +How come some objects do not get cached? + +

+To determine whether a given object may be cached, Squid takes many
+things into consideration. The current algorithm (for Squid-2)
+goes something like this:
+
+
+	Responses with Cache-Control: Private are NOT cachable.
+	Responses with Cache-Control: No-Cache are NOT cachable.
+	Responses with Cache-Control: No-Store are NOT cachable.
+	Responses for requests with an Authorization header are NOT cachable.
+	Responses with a Vary header are NOT cachable (Squid does not
+	 implement Vary processing).
+	The following HTTP status codes are cachable:
+
+		200 OK
+		203 Non-Authoritative Information
+		300 Multiple Choices
+		301 Moved Permanently
+		410 Gone
+
+	However, if Squid receives one of these responses from a neighbor
+	cache, it will NOT be cached if ALL of the Date, Last-Modified,
+	and Expires reply headers are missing.
+	A 302 Moved Temporarily response is cachable ONLY if the response
+	also includes an Expires header.
+	The following HTTP status codes are ``negatively cached'' for
+	a short amount of time (configurable):
+
+		204 No Content
+		305 Use Proxy
+		400 Bad Request
+		403 Forbidden
+		404 Not Found
+		405 Method Not Allowed
+		414 Request-URI Too Large
+		500 Internal Server Error
+		501 Not Implemented
+		502 Bad Gateway
+		503 Service Unavailable
+		504 Gateway Time-out
+
+	All other HTTP status codes are NOT cachable, including:
+
+		206 Partial Content
+		303 See Other
+		304 Not Modified
+		401 Unauthorized
+		407 Proxy Authentication Required
+
+
+What does keep-alive ratio mean?
+
+The keep-alive ratio shows up in the server_list cache manager page.
+
+This is a mechanism to try detecting neighbor caches which might
+not be able to deal with HTTP/1.1 persistent connections. Every
+time we send a keep-alive request header to a neighbor, we count it,
+and we also count how often the neighbor keeps the connection open
+in its reply.
+
+If the ratio stays above 0.5, then we continue to assume the neighbor
+properly implements persistent connections. Otherwise, we will stop
+sending the keep-alive request header to that neighbor.
+
+How does Squid's cache replacement algorithm work?
+

+Squid uses an LRU (least recently used) algorithm to replace old cache +objects. This means objects which have not been accessed for the +longest time are removed first. In the source code, the +StoreEntry->lastref value is updated every time an object is accessed. + +

+Objects are not necessarily removed ``on-demand.'' Instead, a regularly +scheduled event runs to periodically remove objects. Normally this +event runs every second. + +

+Squid keeps the cache disk usage between the low and high water marks. +By default the low mark is 90%, and the high mark is 95% of the total +configured cache size. When the disk usage is close to the low mark, +the replacement is less aggressive (fewer objects removed). When the +usage is close to the high mark, the replacement is more aggressive +(more objects removed). + +
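+These watermarks correspond to squid.conf directives; with the
+defaults just described, the lines would read:
+
+	cache_swap_low 90
+	cache_swap_high 95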

+When selecting objects for removal, Squid examines some number of objects +and determines which can be removed and which cannot. +A number of factors determine whether or not any given object can be +removed. If the object is currently being requested, or retrieved +from an upstream site, it will not be removed. If the object is +``negatively-cached'' it will be removed. If the object has a private +cache key, it will be removed (there would be no reason to keep it -- +because the key is private, it can never be ``found'' by subsequent requests). +Finally, if the time since last access is greater than the LRU threshold, +the object is removed. + +

+The LRU threshold value is dynamically calculated based on the current
+cache size and the low and high marks. The LRU threshold is scaled
+exponentially between the high and low water marks. When the store swap
+size is near the low water mark, the LRU threshold is large. When the
+store swap size is near the high water mark, the LRU threshold is small.
+The threshold automatically adjusts to the rate of incoming requests.
+In fact, when your cache size has stabilized, the LRU threshold
+represents how long it takes to fill (or fully replace) your cache at
+the current request rate. Typical values for the LRU threshold are 1 to
+10 days.

+Back to selecting objects for removal. Obviously it is not possible to +check every object in the cache every time we need to remove some of them. +We can only check a small subset each time. The way in which +this is implemented is very different between Squid-1.1 and Squid-2. + +Squid 1.1 +

+The Squid cache storage is implemented as a hash table with some number +of "hash buckets." Squid-1.1 scans one bucket at a time and sorts all the +objects in the bucket by their LRU age. Objects with an LRU age +over the threshold are removed. The scan rate is adjusted so that +it takes approximately 24 hours to scan the entire cache. The +store buckets are randomized so that we don't always scan the same +buckets at the same time of the day. + +

+This algorithm has some flaws. Because we only scan one bucket, +there are going to be better candidates for removal in some of +the other 16,000 or so buckets. Also, the qsort() function +might take a non-trivial amount of CPU time, depending on how many +entries are in each bucket. + +Squid 2 +

+For Squid-2 we eliminated the need to use qsort() by indexing +cached objects into an automatically sorted linked list. Every time +an object is accessed, it gets moved to the top of the list. Over time, +the least used objects migrate to the bottom of the list. When looking +for objects to remove, we only need to check the last 100 or so objects +in the list. Unfortunately this approach increases our memory usage +because of the need to store three additional pointers per cache object. +But for Squid-2 we're still ahead of the game because we also replaced +plain-text cache keys with MD5 hashes. + +What are private and public keys? +

+
+The Squid cache uses the notions of private and public cache keys.
+A cache key is the internal lookup value for a stored object; it is
+normally based on the request method and URL. While an object is
+being retrieved for the first time, it is stored under a private key
+that only the requesting client can match.
+
+Objects are changed from private to public after all of the HTTP
+reply headers have been received and parsed. In some cases, the
+reply headers will indicate the object should not be made public.
+For example, if the reply includes a Cache-Control: private header,
+the object is never given a public key.
+
+What is FORW_VIA_DB for?

+We use it to collect data for . + +Does Squid send packets to port 7 (echo)? If so, why? +

+It may. This is an old feature from the Harvest cache software. +The cache would send ICP ``SECHO'' message to the echo ports of +origin servers. If the SECHO message came back before any of the +other ICP replies, then it meant the origin server was probably +closer than any neighbor cache. In that case Harvest/Squid sent +the request directly to the origin server. + +

+With more attention focused on security, many administrators filter +UDP packets to port 7. The Computer Emergency Response Team (CERT) +once issued an advisory note () that says UDP +echo and chargen services can be used for a denial of service +attack. This made admins extremely nervous about any packets +hitting port 7 on their systems, and they made complaints. + +

What does ``WARNING: Reply from unknown nameserver [a.b.c.d]'' mean?

+It means Squid sent a DNS query to one IP address, but the response +came back from a different IP address. By default Squid checks that +the addresses match. If not, Squid ignores the response. + +

There are a number of reasons why this would happen:

 1. Your DNS name server just works this way, either because
 it's been configured to, or because it's stupid and doesn't
 know any better.

 2. You have a weird broadcast address, like 0.0.0.0, in
 your /etc/resolv.conf file.

 3. Somebody is trying to send spoofed DNS responses to
 your cache.

+If you recognize the IP address in the warning as one of your
name server hosts, then it's probably reasons (1) or (2).

+You can make these warnings stop, and allow responses from +``unknown'' name servers by setting this configuration option: + + ignore_unknown_nameservers off + + +How does Squid distribute cache files among the available directories? +

+Note: The information here is current for version 2.2. +

+When Squid wants to create a new disk file for storing an object, it
first selects which cache_dir the object will go into. If you have N
cache directories, Squid identifies the 3N/4 (75%)
of them with the most available space. These directories are
then used, in order of having the most available space. When Squid has
stored one URL to each of these 3N/4 cache directories, the process
repeats: Squid finds a new set of 3N/4
cache directories with the most available space.
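+A sketch in C of this selection (the CacheDir type is hypothetical):

 #include <stdlib.h>

 typedef struct {
     long avail_kb;                 /* available space in this cache_dir */
     /* ... path, etc. ... */
 } CacheDir;

 /* most available space first */
 static int
 by_space_desc(const void *a, const void *b)
 {
     long x = ((const CacheDir *) a)->avail_kb;
     long y = ((const CacheDir *) b)->avail_kb;
     return (x < y) - (x > y);
 }

 /* order the N dirs by free space; the first 3N/4 get used, in order */
 static size_t
 select_dirs(CacheDir *dirs, size_t n)
 {
     qsort(dirs, n, sizeof(*dirs), by_space_desc);
     return (3 * n) / 4;
 }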

Why do I see negative byte hit ratio?

+Byte hit ratio is calculated a bit differently than +Request hit ratio. Squid counts the number of bytes read +from the network on the server-side, and the number of bytes written to +the client-side. The byte hit ratio is calculated as + + (client_bytes - server_bytes) / client_bytes + +If server_bytes is greater than client_bytes, you end up +with a negative value. + +
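+For example, if Squid wrote 100 MB to its clients over some interval but
read 120 MB from origin servers, the byte hit ratio would be:

 (100 - 120) / 100 = -0.20, i.e. -20%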

+The server_bytes may be greater than client_bytes for a number
of reasons, including:

 Cache Digests and other internally generated requests.
 Cache Digest messages are quite large. They are counted
 in the server_bytes, but since they are consumed internally,
 they do not count in client_bytes.

 User-aborted requests. If your quick_abort setting allows it,
 Squid sometimes continues to fetch aborted requests from the
 server-side, without sending any data to the client-side.

What does ``Disabling use of private keys'' mean?

+First you need to understand the difference between private and public
cache keys, described in the section above.

+When Squid sends ICP queries, it uses the ICP reqnum field to hold
the private key data, and the reply echoes that field back so Squid
can match the reply to the correct private cache key.
Some ICP implementations always set the reqnum field to zero in their
replies. In that case Squid can not match replies to private keys, so
it disables the use of private keys for queries to such a neighbor.
Not having private cache keys has some important privacy
implications. Two users could receive one response that was
meant for only one of the users. This response could contain
personal, confidential information. You will need to disable
the ``zero reqnum'' neighbor if you want Squid to use private
cache keys.

What is a half-closed filedescriptor?

+TCP allows connections to be in a ``half-closed'' state. This
is accomplished with the shutdown(2) system call. In Squid, a
half-closed connection is one where the client has closed its side
of the connection for writing, but may still want to read data.
If Squid tries to read a connection, and read() returns 0 (end-of-file),
Squid can not tell whether the client aborted or merely half-closed the
connection, so by default it marks the filedescriptor as half-closed and
keeps the connection open.
To disable half-closed connections, simply put this in
squid.conf:

 half_closed_clients off

Then, Squid will always close its side of the connection
instead of marking it as half-closed.

What does --enable-heap-replacement do?

+Squid has traditionally used an LRU replacement algorithm. As of
more recent versions, you can use some other replacement algorithms by
using the --enable-heap-replacement configure option.
The heap replacement code was contributed by John Dilley and others
from Hewlett-Packard. Their work is described in these papers:

(HP Tech Report).

(WCW 1999 paper).

Why is actual filesystem space used greater than what Squid thinks?

+If you compare the disk usage reported by your filesystem with what
Squid thinks it is using, the filesystem figure is usually larger, for
a number of reasons:

 Squid doesn't keep track of the size of the swap.state
 file that lives in each cache directory.

 Directory entries themselves take up filesystem space.

 Other applications might be using the same disk partition.

 Your filesystem block size might be larger than what Squid
 thinks. When calculating total disk usage, Squid rounds
 file sizes up to a whole number of 1024 byte blocks. If
 your filesystem uses larger blocks, then some "wasted" space
 is not accounted.

How do positive_dns_ttl and negative_dns_ttl work?

Hostname TTLs from the DNS are honoured when you are running:

 Squid-2.3 and later versions with internal DNS lookups. Internal
 lookups are the default for Squid-2.3 and later.

 An external dnsserver with the ``DNS TTL'' patch applied
 for BIND.

 FreeBSD, which already has the DNS TTL patch built in.

+Let's say you have the following settings:

positive_dns_ttl 1 hours
negative_dns_ttl 1 minutes

When Squid looks up a name like www.squid-cache.org, a successful
answer is cached for one hour, while a failed lookup is cached for
only one minute.
If you have the DNS TTL patch, or are using internal lookups, then
each hostname has its own TTL value, which was set by the domain
name administrator. You can see these values in the 'ipcache'
cache manager page. For example:

 Hostname Flags lstref TTL N
 www.squid-cache.org C 73043 12784 1( 0) 204.144.128.89-OK
 www.ircache.net C 73812 10891 1( 0) 192.52.106.12-OK
 polygraph.ircache.net C 241768 -181261 1( 0) 192.52.106.12-OK

The TTL field shows how many seconds until the entry expires.
Negative values mean the entry is already expired, and will be refreshed
upon next use.

What does swapin MD5 mismatch mean?

+It means that Squid opened up a disk file to serve a cache hit, but
it found that the stored object doesn't match the user's request.
Squid stores the MD5 digest of the URL at the start of each disk file.
When the file is opened, Squid checks that the disk file MD5 matches the
MD5 of the URL requested by the user. If they don't match, the warning
is printed and Squid forwards the request to the origin server.
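+In outline, the check is just a comparison of two 16-byte digests (the
function name here is hypothetical):

 #include <string.h>

 /* compare the MD5 stored in the swapfile header with the MD5 of the
  * requested URL; a mismatch produces the warning above and the
  * request is handled as a miss */
 int
 swapin_key_matches(const unsigned char stored_md5[16],
     const unsigned char request_md5[16])
 {
     return memcmp(stored_md5, request_md5, 16) == 0;
 }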

+You do not need to worry about this warning. It means that Squid is +recovering from a corrupted cache directory. + +What does failed to unpack swapfile meta data mean? +

+Each of Squid's disk cache files has a metadata section at the beginning. +This header is used to store the URL MD5, some StoreEntry data, and more. +When Squid opens a disk file for reading, it looks for the meta data +header and unpacks it. +

+This warning means that Squid couldn't unpack the meta data. This is
a non-fatal bug, from which Squid can recover. Perhaps
the meta data was just missing, or perhaps the file got corrupted.

+You do not need to worry about this warning. It means that Squid is
double-checking that the disk file matches what Squid thinks should
be there, and the check failed. Squid recovers and generates
a cache miss in this case.

Why doesn't Squid make ident lookups in interception mode?

It's a side-effect of the way interception proxying works.

+When Squid is configured for interception proxying, the operating system
pretends that it is the origin server. That means that the "local" socket
address for intercepted TCP
connections is really the origin server's IP address. If you run
netstat -n on your Squid box, you will see a lot of local addresses
that are not really local.
When Squid wants to make an ident query, it creates a new TCP socket
and binds the local endpoint to the same IP address as the local end
of the client's connection. Since that address is really the origin
server's, the bind() fails, and the ident query can not be made.

Multicast

What is Multicast?

+Multicast is essentially the ability to send one IP packet to multiple +receivers. Multicast is often used for audio and video conferencing systems. + +

+You often hear about the Mbone in
reference to Multicast. The Mbone is essentially a ``virtual backbone''
which exists in the Internet itself. If you want to send and/or receive
Multicast, you need to be ``on the Mbone.''

How do I know if I'm on the Mbone?

+One way is to ask someone who manages your network. If your network manager
doesn't know, or looks at you funny, then you are very likely NOT on the
Mbone.

+Another way is to use the mtrace program. Mtrace is similar to traceroute.
It will
tell you about the multicast path between your site and another. For example:

 > mtrace mbone.ucar.edu
 mtrace: WARNING: no multicast group specified, so no statistics printed
 Mtrace from 128.117.64.29 to 192.172.226.25 via group 224.2.0.1
 Querying full reverse path... * switching to hop-by-hop:
 0 oceana-ether.nlanr.net (192.172.226.25)
 -1 avidya-ether.nlanr.net (192.172.226.57) DVMRP thresh^ 1
 -2 mbone.sdsc.edu (198.17.46.39) DVMRP thresh^ 1
 -3 * nccosc-mbone.dren.net (138.18.5.224) DVMRP thresh^ 48
 -4 * * FIXW-MBONE.NSN.NASA.GOV (192.203.230.243) PIM/Special thresh^ 64
 -5 dec3800-2-fddi-0.SanFrancisco.mci.net (204.70.158.61) DVMRP thresh^ 64
 -6 dec3800-2-fddi-0.Denver.mci.net (204.70.152.61) DVMRP thresh^ 1
 -7 mbone.ucar.edu (192.52.106.7) DVMRP thresh^ 64
 -8 mbone.ucar.edu (128.117.64.29)
 Round trip time 196 ms; total ttl of 68 required.

+If you think you need to be on the Mbone, this is +. + +Should I be using Multicast ICP? + +

+Short answer: No, probably not. + +

+Reasons why you SHOULD use Multicast:

It reduces the number of times Squid calls sendto() to transmit
ICP queries.

It's trendy and cool to use Multicast.

+Reasons why you SHOULD NOT use Multicast:

Multicast tunnels/configurations/infrastructure are often unstable.
You may lose multicast connectivity but still have unicast connectivity.

Multicast does not simplify your Squid configuration file. Every trusted
neighbor cache must still be specified.

Multicast does not reduce the number of ICP replies being sent around.
It does reduce the number of ICP queries sent, but not the number of replies.

Multicast exposes your cache to some privacy issues. There are no special
permissions required to join a multicast group. Anyone may join your
group and eavesdrop on ICP query messages. However, the scope of your
multicast traffic can be controlled such that it does not exceed certain
boundaries.

+We only recommend using Multicast ICP over network
infrastructure which you have close control over. In other words, only
use Multicast over your local area network, or maybe your wide area
network if you are an ISP. We think it is probably a bad idea to use
Multicast ICP over congested links or commodity backbones.

How do I configure Squid to send Multicast ICP queries?

+To configure Squid to send ICP queries to a Multicast address, you
need to create another neighbour cache entry specified as
multicast:

 cache_host 224.9.9.9 multicast 3128 3130 ttl=64

224.9.9.9 is a sample multicast group address.

You must also specify which of your neighbours will respond
to your multicast queries, since it would
be a bad idea to implicitly trust any ICP reply from an unknown
address. Note that ICP replies are sent back to your unicast
address, so each trusted neighbour must be marked as a
multicast-responder:

 cache_host cache1 sibling 3128 3130 multicast-responder
 cache_host cache2 sibling 3128 3130 multicast-responder

Here all fields are relevant. The ICP port number (3130)
must be the same as in the multicast cache_host line above.

How do I know what Multicast TTL to use?

+The Multicast TTL (which is specified with the ttl= option on the
cache_host line) determines how far your multicast ICP queries can
travel. Conventional values are:

 32 for links that separate sites within an organization.
 64 for links that separate communities or organizations, and are
 attached to the Internet MBONE.
 128 for links that separate continents on the MBONE.

+A good way to determine the TTL you need is to run mtrace as shown
above and look at the last line, which shows the total TTL required
to reach the other end.
If you set your TTL too high, then your ICP messages may travel ``too far''
and will be subject to eavesdropping by others.
If you're only using multicast on your LAN, as we suggest, then your TTL will
be quite small.

How do I configure Squid to receive and respond to Multicast ICP?

+You must tell Squid to join a multicast group address with the
mcast_groups configuration option:

 mcast_groups 224.9.9.9

Of course, all members of your Multicast ICP group will need to use the
exact same multicast group address.

+ + +Use a unique group address + +Limit the scope of multicast messages with TTLs or administrative scoping. + + +

+Using a unique address is a good idea, but not without some potential
problems. If you choose an address randomly, how do you know that
someone else will not also randomly choose the same address? NLANR
has been assigned a block of multicast addresses by the IANA for use
in situations such as this. If you would like to be assigned one
of these addresses, please . However, note that neither NLANR nor
IANA has any authority to prevent anyone from using an address
assigned to you.

+Limiting the scope of your multicast messages is probably a better
solution. They can be limited with the TTL value discussed above, or
with some newer techniques known as administratively scoped
addresses. Here you can configure well-defined boundaries for the
traffic to a specific address. The RFC on administratively scoped IP
multicast describes this.

System-Dependent Weirdnesses

Solaris

select()

+For older Squid versions you can enable poll() by editing
include/autoconf.h, or
by adding -DUSE_POLL=1 to the DEFINES in src/Makefile.

malloc

+libmalloc.a is leaky. Squid's configure does not use -lmalloc on Solaris.

DNS lookups and nscd

by .

+DNS lookups can be slow because of some mysterious thing called nscd
(the name service cache daemon). To keep it from caching host lookups,
edit the file /etc/nscd.conf and make it say:

 enable-cache hosts no

+Apparently nscd serializes DNS queries thus slowing everything down when +an application (such as Squid) hits the resolver hard. You may notice +something similar if you run a log processor executing many DNS resolver +queries - the resolver starts to slow.. right.. down.. . . . + +

+According to
,
users of Solaris starting from version 2.6 and up should NOT
completely disable the nscd daemon.
Several library calls rely on available free file descriptors
FD < 256. Systems running without nscd may fail on such calls
if the first 256 files are all in use.

+Since Solaris 2.6, Sun has changed the way some system calls
work and is using the nscd daemon to implement them.

DNS lookups and /etc/nsswitch.conf

+by . +

+The /etc/nsswitch.conf file determines the order of searches
for lookups (amongst other things). You might only have it set up to
allow NIS and HOSTS files to work. You definitely want the ``hosts:''
line to include the word dns, e.g.:

 hosts: nis dns [NOTFOUND=return] files

DNS lookups and NIS

+by . + +

+Our site cache is running on a Solaris 2.6 machine. We use NIS to distribute +authentication and local hosts information around and in common with our +multiuser systems, we run a slave NIS server on it to help the response of +NIS queries. + +

+We were seeing very high name->ip lookup times (avg ~2 sec)
and ip->name lookup times (avg ~8 sec), although there didn't
seem to be that much of a problem with response times for valid
sites until the cache was being placed under high load. Then,
performance went down the toilet.

+After some time, and a bit of detective work, we found the problem.
On Solaris 2.6, if you have a local NIS server running (ypserv) and
NIS in your /etc/nsswitch.conf hosts entry,
then check the flags it is being started with. The 2.6 ypstart
script checks to see if there is a resolv.conf file present when it
starts ypserv, and if so starts it with the -d option.
This has the same effect as putting the YP_INTERDOMAIN key in the
hosts table: failed NIS host lookups are passed on to the DNS by
the NIS server.
This is a bad thing(tm)! Instead of each client resolving via the
DNS itself, the lookups get serialised through the NIS server.
If you're running in this kind of setup, then you will want to make
sure that

ypserv doesn't start with the -d option, and

you don't have the YP_INTERDOMAIN key in the hosts table (comment
out the B=-b line in /var/yp/Makefile and rebuild the hosts maps).

+We changed these here, and saw our average lookup times drop by up
to an order of magnitude (~150 msec for name->ip queries and
~1.5 sec for ip->name queries, the latter still so high, I
suspect, because more of these fail and timeout since they are not
made so often and the entries are frequently non-existent anyway).

Tuning

+ by + +disk write error: (28) No space left on device +

+You might get this error even if your disk is not full, and is not out +of inodes. Check your syslog logs (/var/adm/messages, normally) for +messages like either of these: + + NOTICE: realloccg /proxy/cache: file system full + NOTICE: alloc: /proxy/cache: file system full + + +

+In a nutshell, the UFS filesystem used by Solaris can't cope with the +workload squid presents to it very well. The filesystem will end up +becoming highly fragmented, until it reaches a point where there are +insufficient free blocks left to create files with, and only fragments +available. At this point, you'll get this error and squid will revise +its idea of how much space is actually available to it. You can do a +"fsck -n raw_device" (no need to unmount, this checks in read only mode) +to look at the fragmentation level of the filesystem. It will probably +be quite high (>15%). + +

+Sun suggests two solutions to this problem. One costs money, the other is
free but may result in a loss of performance (although Sun does claim it
shouldn't, given the already highly random nature of squid disk access).

+The first is to buy a copy of VxFS, the Veritas Filesystem. This is an +extent-based filesystem and it's capable of having online defragmentation +performed on mounted filesystems. This costs money, however (VxFS is not +very cheap!) + +

+The second is to change certain parameters of the UFS filesystem. Unmount
your cache filesystems and use tunefs to change optimization to "space" and
to reduce the "minfree" value to 3-5% (under Solaris 2.6 and higher, very
large filesystems will almost certainly have a minfree of 2% already and you
shouldn't increase this). You should be able to get fragmentation down to
around 3% by doing this, with an accompanying increase in the amount of space
available.
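+For example (the device name is illustrative; the filesystem must be
unmounted first):

 # umount /cache0
 # tunefs -o space -m 5 /dev/rdsk/c0t1d0s0
 # mount /cache0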

+Thanks to . + +Solaris X86 and IPFilter +

+by +

+Important update regarding Squid running on Solaris x86. I have been +working for several months to resolve what appeared to be a memory leak in +squid when running on Solaris x86 regardless of the malloc that was used. I +have made 2 discoveries that anyone running Squid on this platform may be +interested in. +

+Number 1: There is not a memory leak in Squid even though after the system
runs for some amount of time (this varies depending on the load the system
is under), Top reports that there is very little memory free. True to the
claims of the Sun engineer I spoke to, this statistic from Top is incorrect.
The odd thing is that you do begin to see performance suffer substantially
as time goes on, and the only way to correct the situation is to reboot the
system. This leads me to discovery number 2.

+Number 2: There is some type of resource problem, memory or other, with
IPFilter on Solaris x86. I have not taken the time to investigate what the
problem is because we no longer are using IPFilter. We have switched to an
Alteon ACE 180 Gigabit switch which will do the trans-proxy for you. After
moving the trans-proxy redirection process out to the Alteon switch, Squid
has run for 3 days straight under a huge load with no problem whatsoever.
We currently have 2 boxes with 40 GB of cached objects on each box. This 40
GB was accumulated in the 3 days, from this you can see what type of load
these boxes are under. Prior to this change we were never able to operate
for more than 4 hours.

+Because the problem appears to be with IPFilter, I would guess that you
would only run into this issue if you are trying to run Squid as a
transparent proxy using IPFilter. If there is anyone
with information that would indicate my findings are incorrect, I am willing
to investigate further.

Changing the directory lookup cache size

+by +

+On Solaris, the kernel variable for the directory name lookup cache size is
ncsize. In /etc/system, you might want to try

 set ncsize = 8192

or even
higher. The kernel variable ufs_ninode is the size of the inode cache;
you can set it in /etc/system as well, usually to the same value as
ncsize.
Defaults are:

 Solaris 2.5.1 : (max_nprocs + 16 + maxusers) + 64
 Solaris 2.6/Solaris 7 : 4 * (max_nprocs + maxusers) + 320

The priority_paging algorithm

+by +

+Another new tuneable (actually a toggle) in Solaris 2.5.1, 2.6 or Solaris 7 is
the priority_paging algorithm, which you can switch on in /etc/system
(priority_paging=1). As you may know, the Solaris buffer cache grows to fill
available pages, and under the old VM system, applications could get paged out
to make way for the buffer cache, which can lead to swap thrashing and
degraded application performance. The new algorithm stops the buffer
cache from forcing application pages out of memory.

FreeBSD

T/TCP bugs

+We have found that with FreeBSD-2.2.2-RELEASE, there are some bugs with
T/TCP. FreeBSD will
try to use T/TCP if you've enabled the ``TCP Extensions.'' To disable T/TCP,
use /etc/rc.conf and set

 tcp_extensions="NO" # Allow RFC1323 & RFC1544 extensions (or NO).

or add this to your /etc/rc files:

 sysctl -w net.inet.tcp.rfc1644=0

mbuf size

+We noticed an odd thing with some of Squid's interprocess communication.
Often, output from the dnsserver processes would not be read in one
chunk. With full debugging, it looks like this:

1998/04/02 15:18:48| comm_select: FD 46 ready for reading
1998/04/02 15:18:48| ipcache_dnsHandleRead: Result from DNS ID 2 (100 bytes)
1998/04/02 15:18:48| ipcache_dnsHandleRead: Incomplete reply
....other processing occurs...
1998/04/02 15:18:48| comm_select: FD 46 ready for reading
1998/04/02 15:18:48| ipcache_dnsHandleRead: Result from DNS ID 2 (9 bytes)
1998/04/02 15:18:48| ipcache_parsebuffer: parsing:
$name www.karup.com
$h_name www.karup.inter.net
$h_len 4
$ipcount 2
38.15.68.128
38.15.67.128
$ttl 2348
$end

Interestingly, it is very common to get only 100 bytes on the first
read. When two read() calls are required, this adds additional latency
to the overall request.

Here is a simple patch to fix the bug:

===================================================================
RCS file: /home/ncvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.40
retrieving revision 1.41
diff -p -u -r1.40 -r1.41
--- src/sys/kern/uipc_socket.c 1998/05/15 20:11:30 1.40
+++ /home/ncvs/src/sys/kern/uipc_socket.c 1998/07/06 19:27:14 1.41
@@ -31,7 +31,7 @@
 * SUCH DAMAGE.
 *
 * @(#)uipc_socket.c 8.3 (Berkeley) 4/15/94
- * $Id: FAQ.sgml,v 1.1 2004/09/09 12:36:11 cvsdist Exp $
+ * $Id: FAQ.sgml,v 1.1 2004/09/09 12:36:11 cvsdist Exp $
 */

 #include

@@ -491,6 +491,7 @@ restart:
 mlen = MCLBYTES;
 len = min(min(mlen, resid), space);
 } else {
+ atomic = 1;
 nopages:
 len = min(min(mlen, resid), space);
 /*

+Another technique which may help, but does not fix the bug, is to +increase the kernel's mbuf size. +The default is 128 bytes. The MSIZE symbol is defined in +/usr/include/machine/param.h. However, to change it we added +this line to our kernel configuration file: + + options MSIZE="256" + + +Dealing with NIS +

+/var/yp/Makefile has the following section:

 # The following line encodes the YP_INTERDOMAIN key into the hosts.byname
 # and hosts.byaddr maps so that ypserv(8) will do DNS lookups to resolve
 # hosts not in the current domain. Commenting this line out will disable
 # the DNS lookups.
 B=-b

You will want to comment out the B=-b line so that ypserv does not do
DNS lookups.

FreeBSD 3.3: The lo0 (loop-back) device is not configured on startup

+Squid requires the loopback interface to be up and configured. If it is
not, you will get errors such as ``Cannot bind socket FD 5 to
127.0.0.1:0''.

+From : +

+Fix: Assuming that you experience this problem at all, edit /etc/rc.conf
 and search for where the network_interfaces variable is set. In
 its value, change the word ``auto'' to a list of your interfaces that
 includes lo0.

+Thanks to . + + +FreeBSD 3.x or newer: Speed up disk writes using Softupdates +

+by + +

+FreeBSD 3.x and newer support Softupdates. This is a mechanism to
speed up disk writes, much as mounting ufs volumes
async does. However, Softupdates achieves performance
similar to or better than async mounts, but without losing safety
in the case of a system crash. For more detailed information and the
copyright terms see /sys/contrib/softupdates/README and
/sys/ufs/ffs/README.softupdate.

+To build a system supporting softupdates, you have to build
a kernel with options SOFTUPDATES set (see your kernel configuration
documentation for details). After rebooting with the new kernel, you
can enable softupdates on a per-filesystem basis with:

 $ tunefs -n enable /mountpoint

The filesystem in question MUST NOT be mounted at
this time. After that, softupdates are permanently enabled and the
filesystem can be mounted normally. To verify that the softupdates
code is running, simply issue a mount command and an output similar
to the following will appear:

 $ mount
 /dev/da2a on /usr/local/squid/cache (ufs, local, noatime, soft-updates, writes: sync 70 async 225)

OSF1/3.2

+If you compile both libgnumalloc.a and Squid with BSD/OS +gcc/yacc +

+Some people report +. + +process priority +

+ +I've noticed that my Squid process +seems to stick at a nice value of four, and clicks back to that even +after I renice it to a higher priority. However, looking through the +Squid source, I can't find any instance of a setpriority() call, or +anything else that would seem to indicate Squid's adjusting its own +priority. + +

+by +

+BSD Unices traditionally have auto-niced non-root processes to 4 after
they used a lot (4 minutes???) of CPU time. My guess is that it's BSD/OS,
not Squid, that is doing this. I don't know offhand if there is a way to
disable this on BSD/OS.

+by +

+You can get around this by +starting Squid with nice-level -4 (or another negative value). +

+by +

+The autonice behavior is a leftover from the history of BSD as a +university OS. It penalises CPU bound jobs by nicing them after using 600 +CPU seconds. +Adding + + sysctl -w kern.autonicetime=0 + +to /etc/rc.local will disable the behavior systemwide. + + + +Linux + +Cannot bind socket FD 5 to 127.0.0.1:0: (49) Can't assign requested address + +

+Try a different version of Linux. We have received many reports of this
``bug'' from people running Linux 2.0.30.

FATAL: Don't run Squid as root, set 'cache_effective_user'!

+Some users have reported that setting cache_effective_user to nobody
does not work under Linux, while other usernames do.
Another problem is that RedHat 5.0 Linux seems to have a broken
setresuid() function. There are two ways to work around this.
Either set an environment variable before running configure:

 % setenv ac_cv_func_setresuid no
 % ./configure ...
 % make clean
 % make install

Or after running configure, manually edit include/autoconf.h and
change the HAVE_SETRESUID line to:

 #define HAVE_SETRESUID 0

+Also, some users report this error is due to a NIS configuration
problem. By adjusting the lookup entries in /etc/nsswitch.conf, the
problem goes away.
().

Large ACL lists make Squid slow

+The regular expression library which comes with Linux is known +to be very slow. Some people report it entirely fails to work +after long periods of time. + +

+To fix, use the GNUregex library included with the Squid source code.
With Squid-2, use the --enable-gnuregex configure option.

gethostbyname() leaks memory in RedHat 6.0 with glibc 2.1.1.

+by +

+The gethostbyname() function leaks memory in RedHat
6.0 with glibc 2.1.1. The quick fix is to delete the nisplus service from
the hosts entry in /etc/nsswitch.conf. In my tests, dnsserver memory use
remained stable after I made the above change.

+See . + +assertion failed: StatHist.c:91: `statHistBin(H, max) == H->capacity - 1' on Alpha system. +

+by +

+Some early versions of Linux have a kernel bug that causes this. +All that is needed is a recent kernel that doesn't have the mentioned bug. + +tools.c:605: storage size of `rl' isn't known +

+This is a bug with some versions of glibc. The glibc headers +incorrectly depended on the contents of some kernel headers. +Everything broke down when the kernel folks rearranged a bit in +the kernel-specific header files. +

+We think this glibc bug is present in versions
2.1.1 (or 2.1.0) and earlier. There are two solutions:

Make sure /usr/include/linux and /usr/include/asm are from the kernel
version glibc is built/configured for, not any other kernel version.
Only compiling of loadable kernel modules outside of the kernel sources
depends on having the current versions of these, and for such builds
-I/usr/src/linux/include (or wherever the new kernel headers are
located) can be used to resolve the matter.

Upgrade glibc to 2.1.2 or later. This is always a good idea anyway,
provided a prebuilt upgrade package exists for the Linux distribution
used. Note: Do not attempt to manually build and install glibc from
source unless you know exactly what you are doing, as this can easily
render the system unusable.

HP-UX

StatHist.c:74: failed assertion `statHistBin(H, min) == 0'

+This was a very mysterious and unexplainable bug with GCC on HP-UX.

IRIX

There is a problem with GCC (2.8.1 at least) on
Irix 6 which causes it to always return the string 255.255.255.255 for _ANY_
address when calling inet_ntoa(). If this happens to you, compile Squid
with the native C compiler instead of GCC.

SCO-UNIX

+by +

+To make squid run comfortably on SCO-unix you need to do the following:

+Increase the number of filedescriptors available to Squid in the
kernel configuration.
One thing left is the number of tcp-connections the system can handle.
Default is 256, but I increase that as well because of the number of
clients we have.

Redirectors

What is a redirector?

+Squid has the ability to rewrite requested URLs. Implemented
as an external process (similar to a dnsserver), Squid can be
configured to pass every incoming URL through a redirector
process that returns either a new URL, or a blank line to
indicate no change.
The redirector program is NOT a standard part of the Squid package;
you must write your own, or use one of the examples below.

Why use a redirector?

+A redirector allows the administrator to control the locations to which
their users may go. Using this in conjunction with transparent proxies
allows simple but effective porn control.

How does it work?

+The redirector program must read URLs (one per line) on standard input, +and write rewritten URLs or blank lines on standard output. Note that +the redirector program can not use buffered I/O. Squid writes +additional information after the URL which a redirector can use to make +a decision. The input line consists of four fields: + + URL ip-address/fqdn ident method + + + +

Do you have any examples? + +

+A simple, very fast redirector called Squirm is a good place to
start; it uses the regex library to allow pattern matching.

+Also see . + +

+The following Perl script may also be used as a template for writing +your own redirector: + + #!/usr/local/bin/perl + $|=1; + while (<>) { + s@http://fromhost.com@http://tohost.org@; + print; + } + + + +Can I use the redirector to return HTTP redirect messages? +

+Normally, the rewritten URL returned by the redirector is fetched by
Squid itself, and the client never knows the URL changed. You can,
however, have Squid return an HTTP redirect message so that the client
re-requests the new URL directly.
Simply modify your redirector program to append either "301:" or "302:"
before the new URL. For example, the following script might be used
to direct external clients to a secure Web server for internal documents:

#!/usr/local/bin/perl
$|=1;
 while (<>) {
 @X = split;
 $url = $X[0];
 if ($url =~ /^http:\/\/internal\.foo\.com/) {
 $url =~ s/^http/https/;
 $url =~ s/internal/secure/;
 print "302:$url\n";
 } else {
 print "$url\n";
 }
 }

+Please see sections 10.3.2 and 10.3.3 of + +for an explanation of the 301 and 302 HTTP reply codes. + +FATAL: All redirectors have exited! +

+A redirector process must never exit while Squid is running. Squid
prints this fatal message when it can no longer read replies from its
redirector programs, which usually means the redirector program has a
bug (a common one is using buffered I/O).

Cache Digests

+Cache Digest FAQs compiled by +. + + +What is a Cache Digest? + +

+A Cache Digest is a summary of the contents of an Internet Object Caching +Server. +It contains, in a compact (i.e. compressed) format, an indication of whether +or not particular URLs are in the cache. + +

+A "lossy" technique is used for compression, which means that very high +compression factors can be achieved at the expense of not having 100% +correct information. + + +How and why are they used? + +

+Cache servers periodically exchange their digests with each other. + +

+When a request for an object (URL) is received from a client a cache +can use digests from its peers to find out which of its peers (if any) +have that object. +The cache can then request the object from the closest peer (Squid +uses the NetDB database to determine this). + +

+Note that Squid will only make digest queries in those digests that are
enabled.
It will disable a peer's digest IFF it cannot fetch a valid digest
for that peer.
It will enable that peer's digest again when a valid one is fetched.

+The checks in the digest are very fast and they eliminate the need +for per-request queries to peers. Hence: + +

+ +Latency is eliminated and client response time should be improved. +Network utilisation may be improved. + + +

+Note that the use of Cache Digests (for querying the cache contents of peers) +and the generation of a Cache Digest (for retrieval by peers) are independent. +So, it is possible for a cache to make a digest available for peers, and not +use the functionality itself and vice versa. + + +What is the theory behind Cache Digests? + +

+Cache Digests are based on Bloom Filters - they are a method for +representing a set of keys with lookup capabilities; +where lookup means "is the key in the filter or not?". + +

+In building a cache digest: + +

+ + A vector (1-dimensional array) of m bits is allocated, with all +bits initially set to 0. + A number, k, of independent hash functions are chosen, h1, h2, +..., hk, with range { 1, ..., m } +(i.e. a key hashed with any of these functions gives a value between 1 +and m inclusive). + The set of n keys to be operated on are denoted by: +A = { a1, a2, a3, ..., an }. + + + +Adding a Key + +

+To add a key the value of each hash function for that key is calculated. +So, if the key was denoted by a, then h1(a), h2(a), ..., +hk(a) are calculated. + +

+The value of each hash function for that key represents an index into +the array and the corresponding bits are set to 1. So, a digest with +6 hash functions would have 6 bits to be set to 1 for each key added. + +

+Note that the addition of a number of different keys could +cause one particular bit to be set to 1 multiple times. + + +Querying a Key + +

+To query for the existence of a key the indices into the array are +calculated from the hash functions as above. + +

+ + If any of the corresponding bits in the array are 0 then the key is +not present. + If all of the corresponding bits in the array are 1 then the key is +likely to be present. + + +

+Note the term likely.
It is possible that a collision in the digest can occur, whereby
the digest incorrectly indicates a key is present.
This is the price paid for the compact representation.
While the probability of a collision can never be reduced to zero it can
be controlled.
Larger values for the ratio of the digest size to the number of entries added
lower the probability.
The number of hash functions chosen also influences the probability.
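+The following is a sketch in C of the add and query operations described
above. The salted string hash is a stand-in for illustration; Squid's
real code derives its hash values from MD5 keys, as described later:

 #include <stdlib.h>
 #include <string.h>

 #define K 4                        /* number of hash functions */

 typedef struct {
     unsigned char *bits;           /* m bits, initially all 0 */
     size_t m;                      /* number of bits in the array */
 } bloom;

 /* one cheap string hash, salted to give K different functions */
 static size_t
 hash(const char *key, unsigned salt, size_t m)
 {
     size_t h = 5381 + salt;
     while (*key)
         h = h * 33 + (unsigned char) *key++;
     return h % m;
 }

 /* adding a key: set the K corresponding bits to 1 */
 static void
 bloom_add(bloom *b, const char *key)
 {
     unsigned i;
     for (i = 0; i < K; i++) {
         size_t bit = hash(key, i, b->m);
         b->bits[bit / 8] |= (unsigned char) (1 << (bit % 8));
     }
 }

 /* querying a key: 0 = definitely absent, 1 = likely present */
 static int
 bloom_query(const bloom *b, const char *key)
 {
     unsigned i;
     for (i = 0; i < K; i++) {
         size_t bit = hash(key, i, b->m);
         if (!(b->bits[bit / 8] & (1 << (bit % 8))))
             return 0;
     }
     return 1;
 }

+For n keys stored in m bits with k hash functions, the probability that
a query for an absent key answers ``likely present'' is approximately
(1 - exp(-k*n/m))^k; this is the collision probability that the size
ratio and the choice of k control.

Deleting a Key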

+ +To delete a key, it is not possible to simply set the associated bits +to 0 since any one of those bits could have been set to 1 by the addition +of a different key! + +

+Therefore, to support deletions a counter is required for each bit position +in the array. +The procedures to follow would be: + +

+ + When adding a key, set appropriate bits to 1 and increment the +corresponding counters. + When deleting a key, decrement the appropriate counters (while > 0), +and if a counter reaches 0 then the corresponding bit is set to 0. + + + +How is the size of the Cache Digest in Squid determined? + +

+Upon initialisation, the capacity is set to the number +of objects that can be (are) stored in the cache. +Note that there are upper and lower limits here. + +

+An arbitrary constant, bits_per_entry (currently set to 5), is +used to calculate the size of the array using the following formula: + +

+ + number of bits in array = capacity * bits_per_entry + 7 + + +

+The size of the digest, in bytes, is therefore: + +

+ + digest size = int (number of bits in array / 8) + + +
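+Using the figures from the statistics example later in this section
(capacity 1228800 and bits_per_entry 5):

 number of bits in array = 1228800 * 5 + 7 = 6144007
 digest size = int (6144007 / 8) = 768000 bytes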

+When a digest rebuild occurs, the change in the cache size (capacity) +is measured. +If the capacity has changed by a large enough amount (10%) then +the digest array is freed and reallocated memory, otherwise the +same digest is re-used. + + +What hash functions (and how many of them) does Squid use? + +

+The protocol design allows for a variable number of hash functions (k). +However, Squid employs a very efficient method using a fixed number - four. + +

+Rather than computing a number of independent hash functions over a URL +Squid uses a 128-bit MD5 hash of the key (actually a combination of the URL +and the HTTP retrieval method) and then splits this into four equal +chunks. + +Each chunk, modulo the digest size (m), is used as the value for one of +the hash functions - i.e. an index into the bit array. + +
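+A sketch in C of this splitting (how Squid assembles each chunk may
differ in byte order and detail):

 #include <stdint.h>
 #include <string.h>

 /* derive four bit-array indices from a 128-bit MD5 key by splitting
  * it into four 32-bit chunks, each taken modulo the digest size m */
 void
 md5_to_hashes(const unsigned char md5[16], uint32_t m, uint32_t out[4])
 {
     int i;
     for (i = 0; i < 4; i++) {
         uint32_t chunk;
         memcpy(&chunk, md5 + 4 * i, sizeof(chunk));
         out[i] = chunk % m;
     }
 }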

+Note: As Squid retrieves objects and stores them in its cache on disk, +it adds them to the in-RAM index using a lookup key which is an MD5 hash +- the very one discussed above. +This means that the values for the Cache Digest hash functions are +already available and consequently the operations are extremely +efficient! + +

+Obviously, modifying the code to support a variable number of hash functions +would prove a little more difficult and would most likely reduce efficiency. + + +How are objects added to the Cache Digest in Squid? + +

+Every object referenced in the index in RAM is checked to see if +it is suitable for addition to the digest. + +A number of objects are not suitable, e.g. those that are private, +not cachable, negatively cached etc. and are skipped immediately. + +

+A freshness test is next made in an attempt to guess if +the object will expire soon, since if it does, it is not worthwhile +adding it to the digest. +The object is checked against the refresh patterns for staleness... + +

+Since Squid stores references to objects in its index using the MD5 key +discussed earlier there is no URL actually available for each object - +which means that the pattern used will fall back to the default pattern, ".". +This is an unfortunate state of affairs, but little can be done about +it. +A cd_refresh_pattern option will be added to the configuration +file soon which will at least make the confusion a little clearer :-) + +

+Note that it is best to be conservative with your refresh pattern +for the Cache Digest, i.e. +do not add objects if they might become stale soon. +This will reduce the number of False Hits. + + +Does Squid support deletions in Cache Digests? What are diffs/deltas? + +

+Squid does not support deletions from the digest. +Because of this the digest must, periodically, be rebuilt from scratch to +erase stale bits and prevent digest pollution. + +

+A more sophisticated option is to use diffs or deltas. +These would be created by building a new digest and comparing with the +current/old one. +They would essentially consist of aggregated deletions and additions +since the previous digest. + +

+Since less bandwidth should be required using these it would be possible +to have more frequent updates (and hence, more accurate information). + +

+Costs: + +

+RAM - extra RAM needed to hold two digests while comparisons take place.
+CPU - probably a negligible amount.

When and how often is the local digest built?

+The local digest is built: + +

+ + when store_rebuild completes after startup +(the cache contents have been indexed in RAM), and + periodically thereafter. Currently, it is rebuilt every hour +(more data and experience is required before other periods, whether +fixed or dynamically varying, can "intelligently" be chosen). +The good thing is that the local cache decides on the expiry time and +peers must obey (see later). + + +

+While the [new] digest is being built in RAM the old version (stored +on disk) is still valid, and will be returned to any peer requesting it. +When the digest has completed building it is then swapped out to disk, +overwriting the old version. + +

+The rebuild is CPU intensive, but not overly so. +Since Squid is programmed using an event-handling model, the approach +taken is to split the digest building task into chunks (i.e. chunks +of entries to add) and to register each chunk as an event. +If CPU load is overly high, it is possible to extend the build period +- as long as it is finished before the next rebuild is due! + +

+It may prove more efficient to implement the digest building as a separate +process/thread in the future... + + +How are Cache Digests transferred between peers? + +

+Cache Digests are fetched from peers using the standard HTTP protocol +(note that a pull rather than push technique is +used). + +

+After the first access to a peer, a peerDigestValidate event +is queued +(this event decides if it is time to fetch a new version of a digest +from a peer). +The queuing delay depends on the number of peers already queued +for validation - so that all digests from different peers are not +fetched simultaneously. + +

+A peer answering a request for its digest will specify an expiry
time for that digest by using the HTTP Expires header.
The requesting cache thus knows when it should request a fresh
copy of that peer's digest.
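+A hypothetical exchange might look like this (headers abbreviated; the
request URL is the special internal object under which a Squid-2 peer
publishes its digest):

 GET http://peer.example.net:3128/squid-internal-periodic/store_digest HTTP/1.0

 HTTP/1.0 200 OK
 Content-Type: application/cache-digest
 Expires: Sat, 01 Jan 2000 01:00:00 GMT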

+Note: requesting caches use an If-Modified-Since request in case the peer +has not rebuilt its digest for some reason since the last time it was +fetched. + + +How and where are Cache Digests stored? + +

+Cache Digest built locally + +

+Since the local digest is generated purely for the benefit of its neighbours +keeping it in RAM is not strictly required. +However, it was decided to keep the local digest in RAM partly because of +the following: + +

+ + Approximately the same amount of memory will be (re-)allocated on every +rebuild of the digest, + the memory requirements are probably quite small (when compared to other +requirements of the cache server), + if ongoing updates of the digest are to be supported (e.g. additions/deletions) it will be necessary to perform these operations on a digest +in RAM, and + if diffs/deltas are to be supported the "old" digest would have to +be swapped into RAM anyway for the comparisons. + + +

+When the digest is built in RAM, it is then swapped out to disk, where it is +stored as a "normal" cache item - which is how peers request it. + + +Cache Digest fetched from peer + +

+When a query from a client arrives, fast lookups are
required to decide if a request should be made to a neighbour cache.
It is therefore required to keep all peer digests in RAM.

+Peer digests are also stored on disk for the following reasons: + +

+ +Recovery +- If stopped and restarted, peer digests can be reused from the local +on-disk copy (they will soon be validated using an HTTP IMS request +to the appropriate peers as discussed earlier), and +Sharing +- peer digests are stored as normal objects in the cache. This +allows them to be given to neighbour caches. + + + +How are the Cache Digest statistics in the Cache Manager to be interpreted? + +

+Cache Digest statistics can be seen from the Cache Manager or through the +client utility. +The following examples show how to use the client utility +to request the list of possible operations from the localhost, local +digest statistics from the localhost, refresh statistics from the +localhost and local digest statistics from another cache, respectively. + +

+ + ./client mgr:menu + + ./client mgr:store_digest + + ./client mgr:refresh + + ./client -h peer mgr:store_digest + + +

+The available statistics provide a lot of useful debugging information. +The refresh statistics include a section for Cache Digests which +explains why items were added (or not) to the digest. + +

+The following example shows local digest statistics for a 16GB +cache in a corporate intranet environment +(may be a useful reference for the discussion below). + +

+ +store digest: size: 768000 bytes + entries: count: 588327 capacity: 1228800 util: 48% + deletion attempts: 0 + bits: per entry: 5 on: 1953311 capacity: 6144000 util: 32% + bit-seq: count: 2664350 avg.len: 2.31 + added: 588327 rejected: 528703 ( 47.33 %) del-ed: 0 + collisions: on add: 0.23 % on rej: 0.23 % + + +

+entries:capacity is a measure of how many items "are likely" to +be added to the digest. +It represents the number of items that were in the local cache at the +start of digest creation - however, upper and lower limits currently +apply. +This value is multiplied by bits: per entry (an arbitrary constant) +to give bits:capacity, which is the size of the cache digest in bits. +Dividing this by 8 will give store digest: size which is the +size in bytes. + +

+The number of items represented in the digest is given by +entries:count. +This should be equal to added minus deletion attempts. + +Since (currently) no modifications are made to the digest after the initial +build (no additions are made and deletions are not supported) +deletion attempts will always be 0 and entries:count +should simply be equal to added. + +

+entries:util is not really a significant statistic. +At most it gives a measure of how many of the items in the store were +deemed suitable for entry into the cache compared to how many were +"prepared" for. + +

+rej shows how many objects were rejected. +Objects will not be added for a number of reasons, the most common being +refresh pattern settings. +Remember that (currently) the default refresh pattern will be used for +checking for entry here and also note that changing this pattern can +significantly affect the number of items added to the digest! +Too relaxed and False Hits increase, too strict and False Misses increase. +Remember also that at time of validation (on the peer) the "real" refresh +pattern will be used - so it is wise to keep the default refresh pattern +conservative. + +

+bits: on indicates the number of bits in the digest that are set +to 1. +bits: util gives this figure as a percentage of the total number +of bits in the digest. +As we saw earlier, a figure of 50% represents the optimal trade-off. +Values too high (say > 75%) would cause a larger number of collisions, +and hence False Hits, +while lower values mean the digest is under-utilised (using unnecessary RAM). +Note that low values are normal for caches that are starting to fill up. + +

+A bit sequence is an uninterrupted sequence of bits with the same value.
bit-seq: avg.len gives some insight into the quality of the hash
functions.
Long values indicate a problem, even if bits:util is 50%
(> 3 = suspicious, > 10 = very suspicious).

What are False Hits and how should they be handled?

+A False Hit occurs when a cache believes a peer has an object +and asks the peer for it but the peer is not able to +satisfy the request. + +

+Expiring or stale objects on the peer are frequent causes of False +Hits. +At the time of the query actual refresh patterns are used on the +peer and stale entries are marked for revalidation. +However, revalidation is prohibited unless the peer is behaving +as a parent, or miss_access is enabled. +Thus, clients can receive error messages instead of revalidated +objects! + +

+The frequency of False Hits can be reduced but never eliminated +completely, therefore there must be a robust way of handling them +when they occur. +The philosophy behind the design of Squid is to use lightweight +techniques and optimise for the common case and robustly handle the +unusual case (False Hits). + +

+Squid will soon support the HTTP only-if-cached header. +Requests for objects made to a peer will use this header and if +the objects are not available, the peer can reply appropriately +allowing Squid to recognise the situation. +The following describes what Squid is aiming towards: + +

+ +Cache Digests used to obtain good estimates of where a +requested object is located in a Cache Hierarchy. +Persistent HTTP Connections between peers. +There will be no TCP startup overhead and both latency and +network load will be similar for ICP (i.e. fast). +HTTP False Hit Recognition using the only-if-cached +HTTP header - allowing fall back to another peer or, if no other +peers are available with the object, then going direct (or +through a parent if behind a firewall). + + + +How can Cache Digest related activity be traced/debugged? + +

+Enabling Cache Digests + +

+If you wish to use Cache Digests (available in Squid version 2) you need to +add a configure option, so that the relevant code is compiled in: + +

+ + ./configure --enable-cache-digests ... + + + +What do the access.log entries look like? + +

+If a request is forwarded to a neighbour due to a HIT in that neighbour's
Cache Digest, the hierarchy (9th) field of the access.log file for
the local cache will look like CACHE_DIGEST_HIT/neighbour.
The Log Tag (4th field) should obviously show a MISS.
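+For example, a native-format access.log line on the local cache might
end like this (the hosts and numbers are made up):

 954638998.899 1233 10.0.0.1 TCP_MISS/200 8714 GET http://www.example.com/ - CACHE_DIGEST_HIT/peer.example.net text/html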

+On the peer cache the request should appear as a normal HTTP request +from the first cache. + + +What does a False Hit look like? + +

+The easiest situation to analyse is when two caches (say A and B) are +involved neither of which uses the other as a parent. +In this case, a False Hit would show up as a CACHE_DIGEST_HIT on A and +NOT as a TCP_HIT on B (or vice versa). +If B does not fetch the object for A then the hierarchy field will +look like NONE/- (and A will have received an Access Denied +or Forbidden message). +This will happen if the object is not "available" on B and B does not +have miss_access enabled for A (or is not acting as a parent +for A). + + +How is the cause of a False Hit determined? + +

+Assume A requests a URL from B and receives a False Hit + + + Using the client utility PURGE the URL from A, e.g. + +

+ + ./client -m PURGE 'URL' + + + Using the client utility request the object from A, e.g. + +

+ + ./client 'URL' + + + + +

+The HTTP headers of the request are available. +Two header types are of particular interest: + +

+ + X-Cache - this shows whether an object is available or not. + X-Cache-Lookup - this keeps the result of a store table lookup +before refresh causing rules are checked (i.e. it indicates if the +object is available before any validation would be attempted). + + +

+The X-Cache and X-Cache-Lookup headers from A should both show MISS. + +

+If A requests the object from B (which it will if the digest lookup indicates +B has it - assuming B is closest peer of course :-) then there will be another +set of these headers from B. + +

+If the X-Cache header from B shows a MISS a False Hit has occurred. +This means that A thought B had an object but B tells A it does not +have it available for retrieval. +The reason why it is not available for retrieval is indicated by the +X-Cache-Lookup header. If: + +

+ + + X-Cache-Lookup = MISS then either A's (version of + B's) digest is out-of-date or corrupt OR a collision occurred + in the digest (very small probability) OR B recently purged + the object. + + X-Cache-Lookup = HIT then B had the object, but + refresh rules (or A's max-age requirements) prevent A from + getting a HIT (validation failed). + + + +Use The Source + +

+If there is something else you need to check you can always look at the +source code. +The main Cache Digest functionality is organised as follows: + +

+ + CacheDigest.c (debug section 70) Generic Cache Digest routines + store_digest.c (debug section 71) Local Cache Digest routines + peer_digest.c (debug section 72) Peer Cache Digest routines + + +

+Note that in the source the term Store Digest refers to the digest
created locally.
The Cache Digest code is fairly self-explanatory (once you understand
how Cache Digests work).

What about ICP?

+ +COMING SOON! + + +Is there a Cache Digest Specification? + +

+There is now, thanks to + and +. + +

+Cache Digests, as implemented in Squid 2.1.PATCH2, are described in +. + +You'll notice the format is similar to an Internet Draft. +We decided not to submit this document as a draft because Cache Digests +will likely undergo some important changes before we want to try to make +it a standard. + +Would it be possible to stagger the timings when cache_digests are retrieved from peers? +

+Note: The information here is current for version 2.2. +

+Squid already has code to spread the digest updates. The algorithm is
currently controlled by a few hard-coded constants in peer_digest.c.
Note that whatever you do, you still need to give Squid enough time and
bandwidth to fetch all the digests. Depending on your environment, that
bandwidth may be more or less than an ICP would require. Upcoming digest
deltas (x10 smaller than the digests themselves) may be the only way to
solve the ``big scale'' problem.

+How can I make my users' browsers use my cache without configuring
the browsers for proxying?

First, it is worth knowing that this technique is usually called
``transparent caching'' or ``interception proxying.''
Getting transparent caching to work requires four distinct steps:

Configure your router or host so that port 80 traffic is delivered
to the machine Squid runs on.

Configure the operating system on that machine to redirect the
intercepted traffic to Squid's port.

Compile Squid with the interception support appropriate for your
operating system (see the platform-specific sections below).

Add these lines to squid.conf:

 http_port 8080
 httpd_accel_host virtual
 httpd_accel_port 80
 httpd_accel_with_proxy on
 httpd_accel_uses_host_header on

The http_port must match the port your redirection rules send
traffic to. In the httpd_accel_host option, the special word
virtual makes Squid determine the origin server from each request.
The httpd_accel_with_proxy on line lets Squid continue to serve
regular proxy requests as well. You also need
httpd_accel_uses_host_header on so that Squid can use the HTTP Host
header to reconstruct the requested URL.

Transparent caching for Solaris, SunOS, and BSD systems

Install IP Filter

+First, get and install the +. + +Configure ipnat +

+Put these lines in /etc/ipnat.rules: + + # Redirect direct web traffic to local web server. + rdr de0 1.2.3.4/32 port 80 -> 1.2.3.4 port 80 tcp + + # Redirect everything else to squid on port 8080 + rdr de0 0.0.0.0/0 port 80 -> 1.2.3.4 port 8080 tcp + + +

+Modify your startup scripts to enable ipnat. For example, on FreeBSD it +looks something like this: + + /sbin/modload /lkm/if_ipl.o + /sbin/ipnat -f /etc/ipnat.rules + chgrp nobody /dev/ipnat + chmod 644 /dev/ipnat + + +Configure Squid +Squid-2 +

+Squid-2 (after version beta25) has IP filter support built in.
Simply enable it when you run configure:

 ./configure --enable-ipf-transparent

Add these lines to your squid.conf:

 http_port 8080
 httpd_accel_host virtual
 httpd_accel_port 80
 httpd_accel_with_proxy on
 httpd_accel_uses_host_header on

Note, you don't have to use port 8080, but it must match whatever you
used in the /etc/ipnat.rules file.

Squid-1.1

+Patches for Squid-1.X are available from
.
Add these lines to squid.conf:

 http_port 8080
 httpd_accel virtual 80
 httpd_accel_with_proxy on
 httpd_accel_uses_host_header on

+Thanks to . + +Transparent caching with Linux +

+by + +


+ +Since the browser wasn't set up to use a proxy server, it uses +the FTP protocol (with destination port 21) and not the required +HTTP protocol. You can't setup a redirection-rule to the proxy +server since the browser is speaking the wrong protocol. A similar +problem occurs with gopher. Normally all proxy requests are +translated by the client into the HTTP protocol, but since the +client isn't aware of the redirection, this never happens. + + +

+If you can live with the side-effects, go ahead and compile your +kernel with firewalling and redirection support. Here are the +important parameters from /usr/src/linux/.config: + + + # + # Code maturity level options + # + CONFIG_EXPERIMENTAL=y + # + # Networking options + # + CONFIG_FIREWALL=y + # CONFIG_NET_ALIAS is not set + CONFIG_INET=y + CONFIG_IP_FORWARD=y + # CONFIG_IP_MULTICAST is not set + CONFIG_IP_FIREWALL=y + # CONFIG_IP_FIREWALL_VERBOSE is not set + CONFIG_IP_MASQUERADE=y + CONFIG_IP_TRANSPARENT_PROXY=y + CONFIG_IP_ALWAYS_DEFRAG=y + # CONFIG_IP_ACCT is not set + CONFIG_IP_ROUTER=y + + +

+You may also need to enable IP forwarding, for example:

 echo 1 > /proc/sys/net/ipv4/ip_forward

+Go to the ipfwadm
page,
obtain the source distribution and install it. I added the redirection
rules to /etc/rc.d/rc.inet1
(Slackware) which sets up the interfaces at boot-time. The redirection
should be done before any other Input-accept rule. To really make
sure it worked I disabled the forwarding (masquerading) I normally
do.

+/etc/rc.d/rc.firewall: + + + #!/bin/sh + # rc.firewall Linux kernel firewalling rules + FW=/sbin/ipfwadm + + # Flush rules, for testing purposes + for i in I O F # A # If we enabled accounting too + do + ${FW} -$i -f + done + + # Default policies: + ${FW} -I -p rej # Incoming policy: reject (quick error) + ${FW} -O -p acc # Output policy: accept + ${FW} -F -p den # Forwarding policy: deny + + # Input Rules: + + # Loopback-interface (local access, eg, to local nameserver): + ${FW} -I -a acc -S localhost/32 -D localhost/32 + + # Local Ethernet-interface: + + # Redirect to Squid proxy server: + ${FW} -I -a acc -P tcp -D default/0 80 -r 8080 + + # Accept packets from local network: + ${FW} -I -a acc -P all -S localnet/8 -D default/0 -W eth0 + + # Only required for other types of traffic (FTP, Telnet): + + # Forward localnet with masquerading (udp and tcp, no icmp!): + ${FW} -F -a m -P tcp -S localnet/8 -D default/0 + ${FW} -F -a m -P udp -S localnet/8 -D default/0 + + +

+Here all traffic from the local LAN with any destination gets redirected to +the local port 8080. Rules can be viewed like this: + + IP firewall input rules, default policy: reject + type prot source destination ports + acc all 127.0.0.1 127.0.0.1 n/a + acc/r tcp 10.0.0.0/8 0.0.0.0/0 * -> 80 => 8080 + acc all 10.0.0.0/8 0.0.0.0/0 n/a + acc tcp 0.0.0.0/0 0.0.0.0/0 * -> * + + +

+I did some testing on Windows 95 with both Microsoft Internet +Explorer 3.01 and Netscape Communicator pre-release and it worked +with both browsers with the proxy-settings disabled. + +

+At one time I also rejected web traffic going from the local network
directly to port 80 on the proxy host itself:

 ${FW} -I -a rej -P tcp -S localnet/8 -D hostname/32 80

 IP firewall input rules, default policy: reject
 type prot source destination ports
 acc all 127.0.0.1 127.0.0.1 n/a
 rej tcp 10.0.0.0/8 10.0.0.1 * -> 80
 acc/r tcp 10.0.0.0/8 0.0.0.0/0 * -> 80 => 8080
 acc all 10.0.0.0/8 0.0.0.0/0 n/a
 acc tcp 0.0.0.0/0 0.0.0.0/0 * -> *

+ +If you're already running a nameserver at the firewall or proxy server +(which is a good idea anyway IMHO) let the workstations use this +nameserver. + +

+Additional notes from + + + +

+I'm using such a setup. The only issues so far have been that: + + + +It's fairly useless to use my service providers parent caches +(cache-?.www.demon.net) because by proxying squid only sees IP addresses, +not host names and demon aren't generally asked for IP addresses by other +users; + + +Linux kernel 2.0.30 is a no-no as transparent proxying is broken (I use +2.0.29); + + +Client browsers must do host name lookups themselves, as they don't know +they're using a proxy; + + +The Microsoft Network won't authorize its users through a proxy, so I +have to specifically *not* redirect those packets (my company is a MSN +content provider). + + +Aside from this, I get a 30-40% hit rate on a 50MB cache for 30-40 users and +am quite pleased with the results. + + +

+See also . + + +Transparent caching with Cisco routers + +

+by + +

+This works with at least IOS 11.1 and later, I guess. Possibly earlier,
+but as I'm no Cisco expert I can't say for sure. If your router is doing
+anything more complicated than shuffling packets between an ethernet
+interface and either a serial port or BRI port, then you should work
+through whether this will work for you.
+

+First define a route map with a name of proxy-redirect (name doesn't
+matter) and specify the next hop to be the machine Squid runs on.
+
+	!
+	route-map proxy-redirect permit 10
+	 match ip address 110
+	 set ip next-hop 203.24.133.2
+	!
+
+Define an access list to trap HTTP requests. The second line allows
+the Squid host direct access so a routing loop is not formed.
+By carefully writing your access list as shown below, common
+cases are found quickly and this can greatly reduce the load on your
+router's processor.
+
+	!
+	access-list 110 deny tcp any any neq www
+	access-list 110 deny tcp host 203.24.133.2 any
+	access-list 110 permit tcp any any
+	!
+
+Apply the route map to the ethernet interface.
+
+	!
+	interface Ethernet0
+	 ip policy route-map proxy-redirect
+	!
+
+
+possible bugs

+ notes that +there is a Cisco bug relating to transparent proxying using IP +policy route maps, that causes NFS and other applications to break. +Apparently there are two bug reports raised in Cisco, but they are +not available for public dissemination. + +

+The problem occurs with oversized packets carrying more than 1472 data
+bytes. If you try to ping a host with more than 1472 data bytes across a
+Cisco interface with the access-lists and ip policy route map, the icmp
+request will fail. The packet will be fragmented, and the first fragment
+is checked against the access-list and rejected - it goes the "normal
+path" as it is an icmp packet - however when the second fragment is
+checked against the access-list it is accepted (it isn't regarded as an
+icmp packet), and goes to the action determined by the policy route map!
+

+ notes that you
+may be able to get around this bug by carefully writing your access lists.
+If the last/default rule is to permit, then this bug
+would be a problem, but if the last/default rule is to deny, then
+it won't be a problem. I guess fragments, other than the first,
+don't have the information available to properly policy route them.
+Normally TCP packets should not be fragmented; at least my network
+runs an MTU of 1500 everywhere to avoid fragmentation. So this would
+affect UDP and ICMP traffic only.
+

+Basically, you will have to choose between living with the bug and better
+performance. This set has better performance, but suffers from the
+bug:
+
+	access-list 110 deny tcp any any neq www
+	access-list 110 deny tcp host 10.1.2.3 any
+	access-list 110 permit tcp any any
+
+Conversely, this set has worse performance, but works for all protocols:
+
+	access-list 110 deny tcp host 10.1.2.3 any
+	access-list 110 permit tcp any any eq www
+	access-list 110 deny tcp any any
+
+
+Transparent caching with LINUX 2.0.29 and CISCO IOS 11.1

+Just for kicks, here's an email message posted to squid-users +on how to make transparent proxying work with a Cisco router +and Squid running on Linux. + +

+by + +

+Here is how I have Transparent proxying working for me, in an environment +where my router is a Cisco 2501 running IOS 11.1, and Squid machine is +running Linux 2.0.33. + +

+Many thanks to the following individuals and the squid-users list for +helping me get redirection and transparent proxying working on my +Cisco/Linux box. + + +Lincoln Dale +Riccardo Vratogna +Mark White +Henrik Nordstrom + + +

+First, here is what I added to my Cisco, which is running IOS 11.1. In
+IOS 11.1 the route-map command is "process switched" as opposed to the
+faster "fast-switched" route-map which is found in IOS 11.2 and later.
+You may wish to be running IOS 11.2. I am running 11.1, and have had no
+problems with my current load of about 150 simultaneous connections to
+squid:
+
+	!
+	interface Ethernet0
+	 description To Office Ethernet
+	 ip address 208.206.76.1 255.255.255.0
+	 no ip directed-broadcast
+	 no ip mroute-cache
+	 ip policy route-map proxy-redir
+	!
+	access-list 110 deny tcp host 208.206.76.44 any eq www
+	access-list 110 permit tcp any any eq www
+	route-map proxy-redir permit 10
+	 match ip address 110
+	 set ip next-hop 208.206.76.44
+

+So basically from above you can see I added the "route-map" declaration
+and an access-list, and then turned the route-map on under int e0 with
+"ip policy route-map proxy-redir".
+

+OK, so the Cisco is taken care of at this point. The host above,
+208.206.76.44, is the IP address of my Squid host.
+

+My squid box runs Linux, so I had to do the following on it: + +

+my kernel (2.0.33) config looks like this: + + # + # Networking options + # + CONFIG_FIREWALL=y + # CONFIG_NET_ALIAS is not set + CONFIG_INET=y + CONFIG_IP_FORWARD=y + CONFIG_IP_MULTICAST=y + CONFIG_SYN_COOKIES=y + # CONFIG_RST_COOKIES is not set + CONFIG_IP_FIREWALL=y + # CONFIG_IP_FIREWALL_VERBOSE is not set + CONFIG_IP_MASQUERADE=y + # CONFIG_IP_MASQUERADE_IPAUTOFW is not set + CONFIG_IP_MASQUERADE_ICMP=y + CONFIG_IP_TRANSPARENT_PROXY=y + CONFIG_IP_ALWAYS_DEFRAG=y + # CONFIG_IP_ACCT is not set + CONFIG_IP_ROUTER=y + + +

+You will need Firewalling and Transparent Proxy turned on at a minimum. + +

+Then some ipfwadm stuff: + + # Accept all on loopback + ipfwadm -I -a accept -W lo + # Accept my own IP, to prevent loops (repeat for each interface/alias) + ipfwadm -I -a accept -P tcp -D 208.206.76.44 80 + # Send all traffic destined to port 80 to Squid on port 3128 + ipfwadm -I -a accept -P tcp -D 0/0 80 -r 3128 + +

+It accepts packets on port 80 (redirected from the Cisco), and redirects
+them to 3128, which is the port my squid process is listening on. I put all
+this in /etc/rc.d/rc.local
+

+I am using + with + +installed. You will want to install this patch if using a setup similar +to mine. + +The cache is trying to connect to itself... +

+by +

+I think almost everyone who has tried to build a transparent proxy
+setup has been bitten by this one.
+

+Measures you can take:
+
+
+Deny Squid from fetching objects from itself, using ACL lists (a minimal
+sketch follows this list).
+
+Apply a small patch that prevents Squid from looping infinitely
+(available from )
+
+Don't run Squid on port 80, and redirect port 80 not destined for
+the local machine to Squid (redirection == ipfilter/ipfw/ipfwadm). This
+avoids the most common loops.
+
+If you are using ipfilter then you should also use transproxyd in
+front of Squid. Squid does not yet know how to interface to ipfilter
+(patches are welcome: squid-bugs@ircache.net).
+
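+A minimal sketch of the first measure, assuming the cache host's own
+address is 10.0.3.22 (substitute your cache's IP); dst ACLs and
+http_access are standard squid.conf directives:
+
+	# Refuse to fetch objects from our own address (loop prevention)
+	acl myownip dst 10.0.3.22/255.255.255.255
+	http_access deny myownip
+
+
+Transparent caching with FreeBSD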

+by Duane Wessels +

+I set out yesterday to make transparent caching work with Squid and +FreeBSD. It was, uh, fun. +

+It was relatively easy to configure a cisco to divert port 80 +packets to my FreeBSD box. Configuration goes something like this: + +access-list 110 deny tcp host 10.0.3.22 any eq www +access-list 110 permit tcp any any eq www +route-map proxy-redirect permit 10 + match ip address 110 + set ip next-hop 10.0.3.22 +int eth2/0 + ip policy route-map proxy-redirect + +Here, 10.0.3.22 is the IP address of the FreeBSD cache machine. + +

+Once I have packets going to the FreeBSD box, I need to get the
+kernel to deliver them to Squid.
+I started on FreeBSD-2.2.7, and then downloaded
+. This was a dead end for me. The IPFilter distribution
+includes patches to the FreeBSD kernel sources, but many of these had
+conflicts. Then I noticed that the IPFilter page says
+``It comes as a part of [FreeBSD-2.2 and later].'' Fair enough. Unfortunately,
+you can't hijack connections with the FreeBSD-2.2.X IPFIREWALL code (ipfw).
+FreeBSD-3.0 has much better support for connection hijacking, so I suggest
+you start with that. You need to build a kernel with the following options:
+
+	options IPFIREWALL
+	options IPFIREWALL_FORWARD
+

+Next, it's time to configure the IP firewall rules in /etc/rc.local.
+First, just to be able to use the machine on my network:
+
+	ipfw add 60000 allow all from any to any
+
+But we're still not hijacking connections. To accomplish that,
+add these rules:
+
+	ipfw add 49 allow tcp from 10.0.3.22 to any
+	ipfw add 50 fwd 127.0.0.1 tcp from any to any 80
+
+The second line (rule 50) is the one which hijacks the connection.
+The first line makes sure we never hit rule 50 for traffic originated
+by the local machine. This prevents forwarding loops.
+
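+To confirm the rules are installed, you can list them back; the exact
+output format varies between FreeBSD versions:
+
+	ipfw list
+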

+Note that I am not changing the port number here. That is, +port 80 packets are simply diverted to Squid on port 80. +My Squid configuration is: + + http_port 80 + httpd_accel_host virtual + httpd_accel_port 80 + httpd_accel_with_proxy on + httpd_accel_uses_host_header on + + +

+If you don't want Squid to listen on port 80 (because that
+requires root privileges) then you can use another port.
+In that case your ipfw redirect rule looks like:
+
+	ipfw add 50 fwd 127.0.0.1,3128 tcp from any to any 80
+
+and the squid.conf directives become:
+
+	http_port 3128
+	httpd_accel_host virtual
+	httpd_accel_port 80
+	httpd_accel_with_proxy on
+	httpd_accel_uses_host_header on
+
+
+Transparent caching with Linux and ipchains

+by +

+The following shows important kernel features to include: + + [*] Network firewalls + [ ] Socket Filtering + [*] Unix domain sockets + [*] TCP/IP networking + [ ] IP: multicasting + [ ] IP: advanced router + [ ] IP: kernel level autoconfiguration + [*] IP: firewalling + [ ] IP: firewall packet netlink device + [*] IP: always defragment (required for masquerading) + [*] IP: transparent proxy support + +

+You must include the IP: always defragment option, otherwise you will
+be prevented from using the REDIRECT chain.
+

+The following script is used to configure ipchains: + + #Send all traffic destined to port 80 to squid on port 8081 + /sbin/ipchains -A input -p tcp -s 10.0.3.22/16 -d 0/0 80 -j REDIRECT 8081 + + +

+Also, +notes that with 2.0.x kernels you don't need to enable packet forwarding, +but with the 2.1.x and 2.2.x kernels using ipchains you do. Packet +forwarding is enabled with the following command: + + echo 1 > /proc/sys/net/ipv4/ip_forward + + +Transparent caching with ACC Tigris digital access server +

+by +

+This is to do with configuring transparent proxy +for an ACC Tigris digital access server (like a CISCO 5200/5300 +or an Ascend MAX 4000). I've found that doing this in the NAS +reduces traffic on the LAN and reduces processing load on the +CISCO. The Tigris has ample CPU for filtering. + +

+Step 1 is to create filters that allow local traffic to pass. +Add as many as needed for all of your address ranges. + + ADD PROFILE IP FILTER ENTRY local1 INPUT 10.0.3.0 255.255.255.0 0.0.0.0 0.0.0.0 NORMAL + ADD PROFILE IP FILTER ENTRY local2 INPUT 10.0.4.0 255.255.255.0 0.0.0.0 0.0.0.0 NORMAL + + +

+Step 2 is to create a filter to trap port 80 traffic. + + ADD PROFILE IP FILTER ENTRY http INPUT 0.0.0.0 0.0.0.0 0.0.0.0 0.0.0.0 = 0x6 D= 80 NORMAL + + +

+Step 3 is to set the "APPLICATION_ID" on port 80 traffic to 80. +This causes all packets matching this filter to have ID 80 +instead of the default ID of 0. + + SET PROFILE IP FILTER APPLICATION_ID http 80 + + +

+Step 4 is to create a special route that is used for +packets with "APPLICATION_ID" set to 80. The routing +engine uses the ID to select which routes to use. + + ADD IP ROUTE ENTRY 0.0.0.0 0.0.0.0 PROXY-IP 1 + SET IP ROUTE APPLICATION_ID 0.0.0.0 0.0.0.0 PROXY-IP 80 + + +

+Step 5 is to bind everything to a filter ID called transproxy. +List all local filters first and the http one last. + + ADD PROFILE ENTRY transproxy local1 local2 http + + +

+With this in place use your RADIUS server to send back the +``Framed-Filter-Id = transproxy'' key/value pair to the NAS. + +

+You can check if the filter is being assigned to logins with +the following command: + + display profile port table + + +``Connection reset by peer'' and Cisco policy routing +

+ +has tracked down the cause of unusual ``connection reset by peer'' messages +when using Cisco policy routing to hijack HTTP requests. +

+When the network link between router and the cache goes down for just a +moment, the packets that are supposed to be redirected are instead sent +out the default route. If this happens, a TCP ACK from the client host +may be sent to the origin server, instead of being diverted to the +cache. The origin server, upon receiving an unexpected ACK packet, +sends a TCP RESET back to the client, which aborts the client's request. +

+To work around this problem, you can install a static route to the
+Null0 interface for the cache address, with a high administrative
+distance so it is only used when the link is down:
+
+	ip route 1.2.3.4 255.255.255.255 Null0 250
+
+This appears to cause the correct behaviour.
+
+
+WCCP - Web Cache Coordination Protocol
+

+Contributors: and +. + +Does Squid support WCCP? + +

+CISCO's Web Cache Coordination Protocol V1.0 is supported in Squid
+2.3 and later. Due to licensing requirements Squid is not able to
+support WCCP V2.0. If CISCO chooses to open WCCP V2 and relax the
+licensing terms, Squid may be able to support it in the future.
+
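+On the Squid side, WCCP is enabled in squid.conf by telling Squid which
+router to exchange WCCP messages with. A minimal sketch - the router
+address is a placeholder, and you should check squid.conf.default for
+the full set of wccp_* options your build supports:
+
+	# Hypothetical address; use your WCCP router's IP
+	wccp_router 192.168.1.1
+
+
+Configuring your Router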

+There are two different methods of configuring WCCP on CISCO routers. +The first method is for routers that only support V1.0 of the +protocol. The second is for routers that support both. + +IOS Version 11.x + +

+It is possible that later versions of IOS 11.x will support V2.0 of the
+protocol. If that is the case follow the 12.x instructions. Several
+people have reported that the squid implementation of WCCP does not
+work with their 11.x routers. If you experience this please mail the
+debug output from your router to .
+
+	conf t
+
+	wccp enable
+	!
+	interface [Interface Carrying Outgoing Traffic]x/x
+	!
+	ip wccp web-cache redirect
+	!
+	CTRL Z
+	write mem
+
+
+IOS Version 12.x
+

+Some of the early versions of 12.x do not have the 'ip wccp version' +command. You will need to upgrade your IOS version to use V1.0. + +

+You will need to be running at least IOS Software Release .
+
+	conf t
+
+	ip wccp version 1
+	ip wccp web-cache
+	!
+	interface [Interface Carrying Outgoing/Incoming Traffic]x/x
+	ip wccp web-cache redirect out|in
+	!
+	CTRL Z
+	write mem
+
+
+Configuring FreeBSD
+

+FreeBSD first needs to be configured to receive and strip the GRE
+encapsulation from the packets from the router. To do this you will
+need to patch and recompile your kernel.
+

+First, a patch needs to be applied to your kernel for GRE +support. Apply the + +or the + +as appropriate. + +

+Secondly you will need to download + +or + +and copy it to /usr/src/sys/netinet/gre.c. + +

+Finally add "options GRE" to your kernel config file and rebuild
+your kernel. Note, the .
+
+Configuring Linux 2.2
+

+There are currently two methods for supporting WCCP with Linux 2.2:
+a special-purpose module, or the standard Linux GRE tunneling
+driver. People have reported difficulty with the standard GRE
+tunneling driver, however it does allow GRE functionality other
+than WCCP. You should choose the method that suits your environment.
+
+Standard Linux GRE Tunnel
+

+Linux 2.2 kernels already support GRE, as long as the GRE module is +compiled into the kernel. + +

+You will need to apply the patch supplied by .
+

+Ensure that the GRE code is either built statically or as a module by
+choosing the appropriate option in your kernel config. Then rebuild your
+kernel. If it is a module you will need to:
+
+	modprobe ip_gre
+
+
+The next step is to tell Linux to establish an IP tunnel between the router and
+your host. Daniele Orlandi reports
+that you have to give the gre1 interface an address, but any old
+address seems to work.
+
+	iptunnel add gre1 mode gre remote <Router-IP> local <Host-IP> dev <interface>
+	ifconfig gre1 127.0.0.2 up
+
+<Router-IP> is the IP address of your router that is intercepting the
+HTTP packets. <Host-IP> is the IP address of your cache, and
+<interface> is the network interface that receives those packets (probably eth0).
+
+WCCP Specific Module
+

+This module is not part of the standard Linux distribution. It needs
+to be compiled as a module and loaded on your system to function.
+Do not attempt to build this in as a static part of your kernel.
+

+Download the +and compile it as you would any Linux network module. + +

+Copy the module to /lib/modules/kernel-version/ipv4/ip_wccp.o. Edit +/lib/modules/kernel-version/modules.dep and add: + + + /lib/modules/kernel-version/ipv4/ip_wccp.o: + + +

+Finally you will need to load the module: + + + modprobe ip_wccp + + +Common Steps + +

+The machine should now be stripping the GRE encapsulation from any packets
+received and requeuing them. The system will also need to be configured
+for transparent proxying, either with
+or with .
+
+Configuring Others
+

+If you have managed to configure your operating system to support WCCP
+with Squid,
+please contact us with the details so we may share them with others.
+
+Can someone tell me what version of cisco IOS WCCP is added in?

+IOS releases: + +11.1(19?)CA/CC or later +11.2(14)P or later +12.0(anything) or later + + +Transparent caching with Foundry L4 switches +

+by . +

+First, configure Squid for transparent caching as detailed +at the . +

+Next, configure +the Foundry layer 4 switch to +transparently redirect traffic to your Squid box or boxes. By default, +the Foundry +redirects to port 80 of your squid box. This can +be changed to a different port if needed, but won't be covered +here. + +

+In addition, the switch does a "health check" of the port to make
+sure your squid is answering. If your squid does not answer, the
+switch defaults to sending traffic directly through instead of
+redirecting it. When the squid comes back up, it begins
+redirecting once again.
+

+This example assumes you have two squid caches: + +squid1.foo.com 192.168.1.10 +squid2.foo.com 192.168.1.11 + + +

+We will assume you have various workstations, customers, etc., plugged
+into the switch, whose traffic you want to be transparently proxied.
+The squid caches themselves should be plugged into the switch as well.
+Only the interface that the router is connected to is important. Where you
+put the squid caches or other connections does not matter.
+

+This example assumes your router is plugged into interface 17
+(matching the "int e 17" below).
+
+Enter configuration mode:
+
+telnet@ServerIron#conf t
+
+
+Configure each squid on the Foundry:
+
+telnet@ServerIron(config)# server cache-name squid1 192.168.1.10
+telnet@ServerIron(config)# server cache-name squid2 192.168.1.11
+
+
+Add the squids to a cache-group:
+
+telnet@ServerIron(config)#server cache-group 1
+telnet@ServerIron(config-tc-1)#cache-name squid1
+telnet@ServerIron(config-tc-1)#cache-name squid2
+
+
+Create a policy for caching http on a local port:
+
+telnet@ServerIron(config)# ip policy 1 cache tcp http local
+
+
+Enable that policy on the port connected to your router:
+
+telnet@ServerIron(config)#int e 17
+telnet@ServerIron(config-if-17)# ip-policy 1
+
+

+Since all outbound traffic to the Internet goes out interface 17, that
+is where the policy is applied.
+
+The default port to redirect to can be changed. The load balancing
+algorithm used can be changed (Least Used, Round Robin, etc). Ports
+can be exempted from caching if needed. Access Lists can be applied
+so that only certain source IP Addresses are redirected, etc. This
+information was left out of this document since this was just a quick
+howto that would apply for most people, not meant to be a comprehensive
+manual of how to configure a Foundry switch. I can however revise this
+with any information necessary if people feel it should be included.
+
+
+
+SNMP
+

+Contributors: . + +Does Squid support SNMP? + +

+True SNMP support is available in Squid 2 and above. A significant change in the implementation
+occurred starting with the development 2.2 code. Therefore there are two sets of instructions
+on how to configure SNMP in Squid; please make sure that you follow the correct one.
+
+Enabling SNMP in Squid

+To use SNMP, it must first be enabled with the configure script:
+
+	./configure --enable-snmp  [ ... other configure options ]
+
+Next, recompile after cleaning the source tree:
+
+	make clean
+	make all
+	make install
+
+Once the compile is completed and the new binary is installed, the SNMP
+agent can be configured in squid.conf as described below.
+
+Configuring Squid 2.2
+

+To configure SNMP first specify a list of communities that you would like to allow access +by using a standard acl of the form: + + acl aclname snmp_community string + +For example: + + acl snmppublic snmp_community public + acl snmpjoebloggs snmp_community joebloggs + +This creates two acl's, with two different communities, public and joebloggs. You can +name the acl's and the community strings anything that you like. + +

+To specify the port that the agent will listen on, modify the "snmp_port"
+parameter; it defaults to 3401. The port that the agent will forward
+requests to (those that cannot be fulfilled by this agent) is set by
+"forward_snmpd_port"; it defaults to off. It must be configured for this
+to work. Remember that as the requests will be originating from this
+agent, you will need to make sure that you configure your access
+accordingly.
+

+To allow access to Squid's SNMP agent, define an snmp_access list:
+
+	snmp_access allow snmppublic localhost
+	snmp_access deny all
+
+The above will allow anyone on the localhost who uses the community
+"public" to read the SNMP variables.
+
+If you do not define any snmp_access lines, access is denied by default.
+
+Finally, squid allows you to configure the address that the agent will bind to
+for incoming and outgoing traffic. These default to 0.0.0.0; changing these
+will cause the agent to bind to a specific address on the host, rather than the
+default, which is all.
+
+	snmp_incoming_address 0.0.0.0
+	snmp_outgoing_address 0.0.0.0
+
+
+Configuring Squid 2.1

+Prior to Squid 2.1 the SNMP code had a number of issues with the ACL's. If you are +a frequent user of SNMP with Squid, please upgrade to 2.2 or higher. +

+ +A sort of default, working configuration is: + + snmp_port 3401 + snmp_mib_path /local/squid/etc/mib.txt + + snmp_agent_conf view all .1.3.6 included + snmp_agent_conf view squid .1.3.6 included + snmp_agent_conf user squid - all all public + snmp_agent_conf user all all all all squid + snmp_agent_conf community public squid squid + snmp_agent_conf community readwrite all all + +

+Note that for security you are advised to restrict SNMP access to your +caches. You can do this easily as follows: + + acl snmpmanagementhosts 1.2.3.4/255.255.255.255 1.2.3.0/255.255.255.0 + snmp_acl public deny all !snmpmanagementhosts + snmp_acl readwrite deny all + +You must follow these instructions for 2.1 and below exactly or you are +likely to have problems. The parser has some issues which have been corrected +in 2.2. + +How can I query the Squid SNMP Agent + +

+You can test if your Squid supports SNMP with the snmpwalk program.
+Note that you have to specify the SNMP port, which in Squid defaults to
+3401.
+
+	snmpwalk -p 3401 hostname communitystring .1.3.6.1.4.1.3495.1.1
+
+If it gives output like:
+
+	enterprises.nlanr.squid.cacheSystem.cacheSysVMsize = 7970816
+	enterprises.nlanr.squid.cacheSystem.cacheSysStorage = 2796142
+	enterprises.nlanr.squid.cacheSystem.cacheUptime = Timeticks: (766299) 2:07:42.99
+
+then it is working ok, and you should be able to make nice statistics out of it.
+

+For an explanation of what every string (OID) does, you should +refer to the . + +What can I use SNMP and Squid for? +

+There are a lot of things you can do with SNMP and Squid. It can be useful
+to some extent for a longer-term overview of how your proxy is doing. It can
+also be used as a problem solver. For example: how is it going with your
+filedescriptor usage? Or how much does your LRU vary over a day? These are things
+you can't monitor very well normally, aside from checking the cachemgr
+frequently. Why not let MRTG do it for you?
+
+How can I use SNMP with Squid?

+There are a number of tools that you can use to monitor Squid via SNMP. A
+very popular one is MRTG; there are, however, a number of others. To learn
+what they are and to get additional documentation, please visit the .
+
+MRTG
+

+We use MRTG
+to query Squid through its SNMP agent.
+
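+As a hedged sketch, here is an MRTG target for the two storage gauges
+shown in the snmpwalk output above. The OIDs (cacheSysVMsize and
+cacheSysStorage under enterprises.nlanr.squid), hostname and community
+string are assumptions - verify them against the Squid MIB before use:
+
+	# Graph Squid's VM size and storage use via the agent on port 3401
+	Target[squid]: 1.3.6.1.4.1.3495.1.1.1.0&1.3.6.1.4.1.3495.1.1.2.0:public@proxy.example.com:3401
+	Title[squid]: Squid memory and disk usage
+	MaxBytes[squid]: 1000000000
+	Options[squid]: gauge, nopercent
+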

+To get instructions on using MRTG with Squid please visit the
+.
+
+Where can I get more information/discussion about Squid and SNMP?
+

+General Discussion: +These messages are . + +

+Subscriptions should be sent to: . + + + + +Squid version 2 + +What are the new features? + +

+ +HTTP/1.1 persistent connections. +Lower VM usage; in-transit objects are not held fully in memory. +Totally independent swap directories. +Customizable error texts. +FTP supported internally; no more ftpget. +Asynchronous disk operations (optional, requires pthreads library). +Internal icons for FTP and gopher directories. +snprintf() used everywhere instead of sprintf(). +SNMP. + +Routing requests based on AS numbers. + +...and many more! + + + + +How do I configure 'ssl_proxy' now? +

+By default, Squid connects directly to origin servers for SSL requests.
+But if you must force SSL requests through a parent, first tell Squid
+it can not go direct for SSL:
+
+	acl SSL method CONNECT
+	never_direct allow SSL
+
+With this in place, Squid will pick one of its parents for SSL requests:
+
+	cache_peer parent1 parent 3128 3130
+	cache_peer parent2 parent 3128 3130
+	cache_peer_access parent2 allow !SSL
+
+The above lines tell Squid to NOT use parent2 for SSL requests; they
+will all be forwarded through parent1.
+
+Logfile rotation doesn't work with Async I/O
+

+It is a known limitation when using Async I/O on Linux. The Linux
+Threads package steals (uses internally) the SIGUSR1 signal that squid uses
+to rotate logs.
+

+In order not to disturb the threads package, SIGUSR1 use is disabled in
+Squid when threads are enabled on Linux.
+

+Simply add your new cache_dir line to squid.conf, then run squid -z
+again. Squid will create swap directories in the new cache_dir and
+leave the existing ones in place.
+
+Squid 2 performs badly on Linux

+by +

+You may have enabled Asynchronous I/O with the --enable-async-io
+configure option. You should also know that .
+
+How do I configure proxy authentication with Squid-2?

+For Squid-2, the implementation and configuration has changed.
+Authentication is now handled via external processes.
+Arjan's
+describes how to set it up. Some simple instructions are given below as well.
+
+
+
+We assume you have configured an ACL entry with proxy_auth, for example:
+
+	acl foo proxy_auth REQUIRED
+	http_access allow foo
+
+
+
+You will need to compile and install an external authenticator program.
+Most people will want to use the NCSA authenticator, found in the
+auth_modules/NCSA directory:
+
+	% cd auth_modules/NCSA
+	% make
+	% make install
+
+You should now have an ncsa_auth program installed.
+
+You may need to create a password file. If you have been using
+proxy authentication before, you probably already have such a file.
+You can get
+from our server. Pick a pathname for your password file. We will assume
+you will want to put it in the same directory as your squid.conf.
+
+
+Configure the external authenticator in squid.conf:
+
+	authenticate_program /usr/local/squid/bin/ncsa_auth /usr/local/squid/etc/passwd
+
+
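+If you need to create the password file from scratch and have Apache's
+htpasswd program available, one way is (the path and username are
+examples only):
+
+	% htpasswd -c /usr/local/squid/etc/passwd alice
+	New password:
+	Re-type new password:
+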

+After all that, you should be able to start up Squid. If we left something out, or +haven't been clear enough, please let us know (squid-faq@ircache.net). + +Why does proxy-auth reject all users with Squid-2.2? +

+The ACL for proxy-authentication has changed from:
+
+	acl foo proxy_auth timeout
+
+to:
+
+	acl foo proxy_auth username
+
+Please update your ACL appropriately - a username of REQUIRED matches
+any valid username. The timeout is now specified with the configuration
+option:
+
+	authenticate_ttl timeout
+
+
+Delay Pools
+

+by . + +

+ +The information here is current for version 2.2. It is strongly +recommended that you use at least Squid 2.2 if you wish to use delay pools. + + +

+Delay pools provide a way to limit the bandwidth of certain requests +based on any list of criteria. The idea came from a Western Australian +university who wanted to restrict student traffic costs (without +affecting staff traffic, and still getting cache and local peering hits +at full speed). There was some early Squid 1.0 code by Central Network +Services at Murdoch University, which I then developed (at the University +of Western Australia) into a much more complex patch for Squid 1.0 +called ``DELAY_HACK.'' I then tried to code it in a much cleaner style +and with slightly more generic options than I personally needed, and +called this ``delay pools'' in Squid 2. I almost completely recoded +this in Squid 2.2 to provide the greater flexibility requested by people +using the feature. + +

+To enable delay pool features in Squid 2.2, you must use the
+--enable-delay-pools configure option before compilation.
+

+Terminology for this FAQ entry: + + + + +

+Delay pools allow you to limit traffic for clients or client groups,
+with various features:
+
+
+	can specify peer hosts which aren't affected by delay pools,
+	ie, local peering or other 'free' traffic (with the
+	no-delay peer option).
+
+
+	delay behavior is selected by ACLs (low and high priority
+	traffic, staff vs students or student vs authenticated student
+	or so on).
+
+
+	each group of users has a number of buckets; a bucket has an
+	amount coming into it in a second and a maximum amount it can
+	grow to; when it reaches zero, object reads are deferred
+	until one of the object's clients has some traffic allowance.
+
+
+	any number of pools can be configured with a given class and
+	any set of limits within the pools can be disabled, for example
+	you might only want to use the aggregate and per-host bucket
+	groups of class 3, not the per-network one.
+
+

+This allows options such as creating a number of class 1 delay pools
+and allowing a certain amount of bandwidth to given object types (by
+using URL regular expressions or similar), and many other uses I'm sure
+I haven't even thought of beyond the original fair balancing of a
+relatively small traffic allocation across a large number of users.
+

+There are some limitations of delay pools:
+
+
+	delay pools are incompatible with slow aborts; quick abort
+	should be set fairly low to prevent objects being retrieved at
+	full speed once there are no clients requesting them (as the
+	traffic allocation is based on the current clients, and when
+	there are no clients attached to the object there is no way to
+	determine the traffic allocation).
+
+	delay pools only limit the actual data transferred and are not
+	inclusive of overheads such as TCP overheads, ICP, DNS, icmp
+	pings, etc.
+
+	it is possible for one connection or a small number of
+	connections to take all the bandwidth from a given bucket and
+	the other connections to be starved completely, which can be a
+	major problem if there are a number of large objects being
+	transferred and the parameters are set in a way that a few
+	large objects will cause all clients to be starved (potentially
+	fixed by a currently experimental patch).
+
+

+ + acl all src 0.0.0.0/0.0.0.0 # might already be defined + delay_pools 1 + delay_class 1 1 + delay_access 1 allow all + delay_parameters 1 64000/64000 # 512 kbits == 64 kbytes per second + + + +For an explanation of these tags please see the configuration file. + + + +

+The 1 second buffer (max = restore = 64kbytes/sec) is because a limit
+is requested, and no responsiveness to a burst is requested. If you
+want it to be able to respond to a burst, increase the aggregate_max to
+a larger value, and traffic bursts will be handled. It is recommended
+that the maximum is at least twice the restore value - if there is only
+a single object being downloaded, sometimes the download rate will fall
+below the requested throughput as the bucket is not empty when it comes
+to be replenished.
+
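+For example, a minimal sketch of a burst-friendly variant of the earlier
+pool; the 2 Mbyte maximum is an illustrative value, not a recommendation:
+
+	# Still refills at 64 kbytes/sec, but idle time banks up to 2 Mbytes
+	delay_parameters 1 64000/2097152
+
+
+How to limit a single connection to 128 Kbps?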

+You cannot limit a single HTTP request's connection speed. You
+can limit individual hosts to some bandwidth rate. To limit a
+specific host, define an acl for that host and use the example
+above. To limit a group of hosts, you must use a delay pool of
+class 2 or 3. For example:
+
+	acl only128kusers src 192.168.1.0/255.255.192.0
+	acl all src 0.0.0.0/0.0.0.0
+	delay_pools 1
+	delay_class 1 3
+	delay_access 1 allow only128kusers
+	delay_access 1 deny all
+	delay_parameters 1 64000/64000 -1/-1 16000/64000
+
+
+
+For an explanation of these tags please see the configuration file.
+
+
+The above gives a solution where a cache is given a total of 512kbits to
+operate in, and each IP address gets only 128kbits out of that pool.
+
+How do you personally use delay pools?
+

+We have six local cache peers, all with the options 'proxy-only no-delay' +since they are fast machines connected via a fast ethernet and microwave (ATM) +network. + +

+For our local access we use a dstdomain ACL, and for delay pool exceptions +we use a dst ACL as well since the delay pool ACL processing is done using +'fast lookups', which means (among other things) it won't wait for a DNS +lookup if it would need one. + +

+Our proxy has two virtual interfaces, one which requires student +authentication to connect from machines where a department is not +paying for traffic, and one which uses delay pools. Also, users of the +main Unix system are allowed to choose slow or fast traffic, but must +pay for any traffic they do using the fast cache. Ident lookups are +disabled for accesses through the slow cache since they aren't needed. +Slow accesses are delayed using a class 3 delay pool to give fairness +between departments as well as between users. We recognize users of +Lynx on the main host are grouped together in one delay bucket but they +are mostly viewing text pages anyway, so this isn't considered a +serious problem. If it was we could take those hosts into a class 1 +delay pool and give it a larger allocation. + +

+I prefer using a slow restore rate and a large maximum rate to give
+preference to people who are looking at web pages, as their individual
+bucket fills while they are reading, while those downloading large
+objects are disadvantaged. This depends on which clients you believe
+are more important. Also, one individual 8 bit network (a residential
+college) has paid extra to get more bandwidth.
+

+The relevant parts of my configuration file are (IP addresses, etc, all +changed): + + # ACL definitions + # Local network definitions, domains a.net, b.net + acl LOCAL-NET dstdomain a.net b.net + # Local network; nets 64 - 127. Also nearby network class A, 10. + acl LOCAL-IP dst 192.168.64.0/255.255.192.0 10.0.0.0/255.0.0.0 + # Virtual i/f used for slow access + acl virtual_slowcache myip 192.168.100.13/255.255.255.255 + # All permitted slow access, nets 96 - 127 + acl slownets src 192.168.96.0/255.255.224.0 + # Special 'fast' slow access, net 123 + acl fast_slow src 192.168.123.0/255.255.255.0 + # User hosts + acl my_user_hosts src 192.168.100.2/255.255.255.254 + # "All" ACL + acl all src 0.0.0.0/0.0.0.0 + + # Don't need ident lookups for billing on (free) slow cache + ident_lookup_access allow my_user_hosts !virtual_slowcache + ident_lookup_access deny all + + # Security access checks + http_access [...] + + # These people get in for slow cache access + http_access allow virtual_slowcache slownets + http_access deny virtual_slowcache + + # Access checks for main cache + http_access [...] + + # Delay definitions (read config file for clarification) + delay_pools 2 + delay_initial_bucket_level 50 + + delay_class 1 3 + delay_access 1 allow virtual_slowcache !LOCAL-NET !LOCAL-IP !fast_slow + delay_access 1 deny all + delay_parameters 1 8192/131072 1024/65536 256/32768 + + delay_class 2 2 + delay_access 2 allow virtual_slowcache !LOCAL-NET !LOCAL-IP fast_slow + delay_access 2 deny all + delay_parameters 2 2048/65536 512/32768 + + +

+The same code is also used by some departments using class 2 delay
+pools to give them more flexibility in giving different performance to
+different labs or students.
+

+This is also pretty well documented in the configuration file, with
+examples. Since people seem to lose their config files, here's a copy
+of the relevant section.
+
+
+# DELAY POOL PARAMETERS (all require DELAY_POOLS compilation option)
+# -----------------------------------------------------------------------------
+
+# TAG: delay_pools
+#	This represents the number of delay pools to be used. For example,
+#	if you have one class 2 delay pool and one class 3 delay pool, you
+#	have a total of 2 delay pools.
+#
+#	To enable this option, you must use --enable-delay-pools with the
+#	configure script.
+#delay_pools 0
+
+# TAG: delay_class
+#	This defines the class of each delay pool. There must be exactly one
+#	delay_class line for each delay pool. For example, to define two
+#	delay pools, one of class 2 and one of class 3, the settings above
+#	and here would be:
+#
+#delay_pools 2		# 2 delay pools
+#delay_class 1 2	# pool 1 is a class 2 pool
+#delay_class 2 3	# pool 2 is a class 3 pool
+#
+#	The delay pool classes are:
+#
+#	class 1		Everything is limited by a single aggregate
+#			bucket.
+#
+#	class 2		Everything is limited by a single aggregate
+#			bucket as well as an "individual" bucket chosen
+#			from bits 25 through 32 of the IP address.
+#
+#	class 3		Everything is limited by a single aggregate
+#			bucket as well as a "network" bucket chosen
+#			from bits 17 through 24 of the IP address and a
+#			"individual" bucket chosen from bits 17 through
+#			32 of the IP address.
+#
+#	NOTE: If an IP address is a.b.c.d
+#		-> bits 25 through 32 are "d"
+#		-> bits 17 through 24 are "c"
+#		-> bits 17 through 32 are "c * 256 + d"
+
+# TAG: delay_access
+#	This is used to determine which delay pool a request falls into.
+#	The first matched delay pool is always used, ie, if a request falls
+#	into delay pool number one, no more delay pools are checked, otherwise the
+#	rest are checked in order of their delay pool number until they have
+#	all been checked. For example, if you want some_big_clients in delay
+#	pool 1 and lotsa_little_clients in delay pool 2:
+#
+#delay_access 1 allow some_big_clients
+#delay_access 1 deny all
+#delay_access 2 allow lotsa_little_clients
+#delay_access 2 deny all
+
+# TAG: delay_parameters
+#	This defines the parameters for a delay pool. Each delay pool has
+#	a number of "buckets" associated with it, as explained in the
+#	description of delay_class. For a class 1 delay pool, the syntax is:
+#
+#delay_parameters pool aggregate
+#
+#	For a class 2 delay pool:
+#
+#delay_parameters pool aggregate individual
+#
+#	For a class 3 delay pool:
+#
+#delay_parameters pool aggregate network individual
+#
+#	The variables here are:
+#
+#	pool		a pool number - ie, a number between 1 and the
+#			number specified in delay_pools as used in
+#			delay_class lines.
+#
+#	aggregate	the "delay parameters" for the aggregate bucket
+#			(class 1, 2, 3).
+#
+#	individual	the "delay parameters" for the individual
+#			buckets (class 2, 3).
+#
+#	network		the "delay parameters" for the network buckets
+#			(class 3).
+#
+#	A pair of delay parameters is written restore/maximum, where restore is
+#	the number of bytes (not bits - modem and network speeds are usually
+#	quoted in bits) per second placed into the bucket, and maximum is the
+#	maximum number of bytes which can be in the bucket at any time.
+# +# For example, if delay pool number 1 is a class 2 delay pool as in the +# above example, and is being used to strictly limit each host to 64kbps +# (plus overheads), with no overall limit, the line is: +# +#delay_parameters 1 -1/-1 8000/8000 +# +# Note that the figure -1 is used to represent "unlimited". +# +# And, if delay pool number 2 is a class 3 delay pool as in the above +# example, and you want to limit it to a total of 256kbps (strict limit) +# with each 8-bit network permitted 64kbps (strict limit) and each +# individual host permitted 4800bps with a bucket maximum size of 64kb +# to permit a decent web page to be downloaded at a decent speed +# (if the network is not being limited due to overuse) but slow down +# large downloads more significantly: +# +#delay_parameters 2 32000/32000 8000/8000 600/64000 +# +# There must be one delay_parameters line for each delay pool. + +# TAG: delay_initial_bucket_level (percent, 0-100) +# The initial bucket percentage is used to determine how much is put +# in each bucket when squid starts, is reconfigured, or first notices +# a host accessing it (in class 2 and class 3, individual hosts and +# networks only have buckets associated with them once they have been +# "seen" by squid). +# +#delay_initial_bucket_level 50 + + +Can I preserve my cache when upgrading from 1.1 to 2? +

+At the moment we do not have a script which will convert your cache +contents from the 1.1 to the Squid-2 format. If enough people ask for +one, then somebody will probably write such a script. + +

+If you like, you can configure a new Squid-2 cache with your old
+Squid-1.1 cache as a sibling. After a few days, weeks, or
+however long you want to wait, shut down the old Squid cache.
+If you want to force-load your new cache with the objects
+from the old cache, you can try something like this:
+
+
+Install Squid-2 and configure it to have the same
+amount of disk space as your Squid-1 cache, even
+if there is not currently that much space free.
+
+Configure Squid-2 with Squid-1 as a parent cache.
+You might want to enable never_direct so that Squid-2
+always forwards its requests to Squid-1.
+
+Enable the PURGE method on Squid-1.
+
+Set the refresh rules on Squid-1 to be very liberal so that it
+does not generate IMS requests for cached objects.
+
+Create a list of all the URLs in the Squid-1 cache. These can
+be extracted from the access.log, store.log and swap logs.
+
+For every URL in the list, request the URL from Squid-2, and then
+immediately send a PURGE request to Squid-1 (a sketch of this loop
+follows the list).
+
+Eventually Squid-2 will have all the objects, and Squid-1
+will be empty.
+
+
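+A hedged sketch of the fetch-and-purge loop, using the "client" program
+that ships with Squid; hostnames, ports and the urls.txt file are
+placeholders for your own values:
+
+	#!/bin/sh
+	# For each URL: fetch it through Squid-2, then purge it from Squid-1
+	while read url; do
+		client -h squid2.example.com -p 3128 "$url" > /dev/null
+		client -h squid1.example.com -p 3128 -m PURGE "$url" > /dev/null
+	done < urls.txt
+
+
+Customizable Error Messages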

+Squid-2 lets you customize your error messages. The source distribution +includes error messages in different languages. You can select the +language with the configure option: + + --enable-err-language=lang + + +

+Furthermore, you can rewrite the error message template files if you like. +This list describes the tags which Squid will insert into the messages: + + + +My squid.conf from version 1.1 doesn't work! +

+Yes, a number of configuration directives have been renamed.
+Here are some of them:
+
+
+	acl Uncachable url_regex cgi ?
+	no_cache deny Uncachable
+
+
+	acl that-AS dst_as 1241
+	cache_peer_access thatcache.thatdomain.net allow that-AS
+	cache_peer_access thatcache.thatdomain.net deny all
+
+This example sends requests to your peer thatcache.thatdomain.net only
+for origin servers in AS 1241.
+
+	connect_timeout 120 seconds
+	read_timeout 15 minutes
+
+
+
+
+
+httpd-accelerator mode
+
+What is the httpd-accelerator mode?

+Occasionally people have trouble understanding accelerators and
+proxy caches, usually resulting from mixed-up interpretations of
+``incoming'' and ``outgoing'' data. I think in terms of requests (i.e.,
+an outgoing request is from the local site out to the big bad
+Internet). The data received in reply is incoming, of course.
+Others think in the opposite sense of ``a request for incoming data''.
+

+An accelerator caches incoming requests for outgoing data (i.e.,
+that which you publish to the world). It takes load away from your
+HTTP server and internal network. You move the server away from
+port 80 (or whatever your published port is), and substitute the
+accelerator, which then pulls the HTTP data from the ``real''
+HTTP server (only the accelerator needs to know where the real
+server is). The outside world sees no difference (apart from an
+increase in speed, with luck).
+

+Quite apart from taking the load of a site's normal web server, +accelerators can also sit outside firewalls or other network +bottlenecks and talk to HTTP servers inside, reducing traffic across +the bottleneck and simplifying the configuration. Two or more +accelerators communicating via ICP can increase the speed and +resilience of a web service to any single failure. + +

+The Squid redirector can make one accelerator act as a single +front-end for multiple servers. If you need to move parts of your +filesystem from one server to another, or if separately administered +HTTP servers should logically appear under a single URL hierarchy, +the accelerator makes the right thing happen. + +

+If you wish only to cache the ``rest of the world'' to improve local users'
+browsing performance, then accelerator mode is irrelevant. Sites which
+own and publish a URL hierarchy use an accelerator to improve other
+sites' access to it. Sites wishing to improve their local users' access
+to other sites' URLs use proxy caches. Many sites, like us, do both and
+hence run both.
+

+Measurement of the Squid cache and its Harvest counterpart suggests an
+order of magnitude performance improvement over CERN or other widely
+available caching software. This order of magnitude performance
+improvement on hits suggests that the cache can serve as an httpd
+accelerator, a cache configured to act as a site's primary httpd server
+(on port 80), forwarding references that miss to the site's real httpd
+(on port 81).
+

+In such a configuration, the web administrator renames all +non-cachable URLs to the httpd's port (81). The cache serves +references to cachable objects, such as HTML pages and GIFs, and +the true httpd (on port 81) serves references to non-cachable +objects, such as queries and cgi-bin programs. If a site's usage +characteristics tend toward cachable objects, this configuration +can dramatically reduce the site's web workload. + +

+Note that it is best not to run a single Squid process as both an
+httpd-accelerator and a proxy cache.
+
+How do I set it up?

+First, you have to tell Squid to listen on port 80 (usually), so set the 'http_port' +option: + + http_port 80 + +

+Next, you need to move your normal HTTP server to another port and/or +another machine. If you want to run your HTTP server on the same +machine, then it can not also use port 80 (except see the next FAQ entry +below). A common choice is port 81. Configure squid as follows: + + httpd_accel_host localhost + httpd_accel_port 81 + +Alternatively, you could move the HTTP server to another machine and leave it +on port 80: + + httpd_accel_host otherhost.foo.com + httpd_accel_port 80 + +

+You should now be able to start Squid and it will serve requests as an HTTP server.
+
+

+If you are using Squid as an accelerator for a virtual host system, then you
+need to specify:
+
+	httpd_accel_host virtual
+
+

+Finally, if you want Squid to also accept normal proxy requests, you
+must enable:
+
+	httpd_accel_with_proxy on
+
+
+When using an httpd-accelerator, the port number for redirects is wrong
+

+Yes, this is because you probably moved your real httpd to port 81. When
+your httpd issues a redirect message (e.g. 302 Moved Temporarily), it knows
+it is not running on the standard port (80), so it inserts its port number
+(e.g. :81) into the redirected URL.
+
+How can you fix this?
+

+One way is to leave your httpd running on port 80, but bind the httpd
+socket to the loopback address only. With Apache you can do it
+like this in httpd.conf:
+
+	Port 80
+	BindAddress 127.0.0.1
+
+Then, in your squid.conf file:
+
+	httpd_accel_host 127.0.0.1
+	httpd_accel_port 80
+

+Note, you probably also need to add an /etc/hosts entry +of 127.0.0.1 for your server hostname. Otherwise, Squid may +get stuck in a forwarding loop. + + + +Related Software + +Clients + +Wget +

+ is a +command-line Web client. It supports recursive retrievals and +HTTP proxies. + +echoping +

+If you want to test your Squid cache in batch (from a cron command, for +instance), you can use the program, +which will tell you (in plain text or via an exit code) if the cache is +up or not, and will indicate the response times. + +Logfile Analysis + +

+Rather than maintain the same list in two places, please see the + page +on the Web server. + +Configuration Tools + +3Dhierarchy.pl +

+Kenichi Matsui has a simple perl script which generates a 3D hierarchy map (in VRML) from +squid.conf. +. + +Squid add-ons + +transproxy +

+ +is a program used in conjunction with the Linux Transparent Proxy +networking feature, and ipfwadm, to transparently proxy HTTP and +other requests. Transproxy is written by . + +Iain's redirector package +

+A redirector package from
+ to allow Intranet (restricted) or Internet
+(full) access with URL deny and redirection for sites that are not deemed
+acceptable for a userbase, all via a single proxy port.
+
+Junkbusters
+

+ Corp has a +copyleft privacy-enhancing, ad-blocking proxy server which you can +use in conjunction with Squid. + +Squirm +

+ is a configurable, efficient redirector for Squid
+by . Features:
+
+	Very fast
+	Virtually no memory usage
+	It can re-read its config files while running by sending it a HUP signal
+	Interactive test mode for checking new configs
+	Full regular expression matching and replacement
+	Config files for patterns and IP addresses.
+	If you mess up the config file, Squirm runs in Dodo Mode so your squid keeps working :-)
+
+
+chpasswd.cgi
+

+has adapted Apache's htpasswd into a CGI program
+called chpasswd.cgi.
+

+ +by . + +squidGuard +

+ is
+a free (GPL), flexible and efficient filter and
+redirector program for squid. It lets you define multiple access
+rules with different restrictions for different user groups on a squid
+cache. squidGuard uses Squid's standard redirector interface.
+

+The +(or 'Central Squid Server' - CSS) is a cut-down +version of Squid without HTTP or object caching functionality. The +CSS deals only with ICP messages. Instead of caching objects, the CSS +records the availability of objects in each of its neighbour caches. +Caches that have smart neighbours update each smart neighbour with the +status of their cache by sending ICP_STORE_NOTIFY/ICP_RELEASE_NOTIFY +messages upon storing/releasing an object from their cache. The CSS +maintains an up to date 'object map' recording the availability of +objects in its neighbouring caches. + +Ident Servers +

+For +, +, +and +. + + + + +DISKD + +What is DISKD? +

+DISKD refers to some features in Squid-2.4 to improve Disk I/O performance.
+The basic idea is that each cache_dir has its own diskd child process.
+The diskd process performs the disk I/O for its cache_dir, and
+communicates with Squid via message queues and shared memory.
+
+Does it perform better?

+Yes. We benchmarked Squid-2.4 with DISKD at the +. +The results are also described . +At the bakeoff, we got 160 req/sec with diskd. Without diskd, we'd have gotten about 40 req/sec. + +What do I need to use it? +

+ + + Squid-2.4 + + Your operating system must support message queues. + + Your operating system must support shared memory. + + +If I use DISKD, do I have to wipe out my current cache? +

+No. Diskd uses the same storage scheme as the standard "UFS" +type. It only changes how I/O is performed. + +How do I configure message queues? +

+Most Unix operating systems have message queue support
+by default. One way to check is to see if you have
+an ipcs command.
+However, you will likely need to increase the message
+queue parameters for Squid. Message queue implementations
+normally have the following parameters:
+
+	MSGMAX	Maximum size of a single message
+	MSGMNB	Maximum number of bytes per message queue
+	MSGMNI	Maximum number of message queue identifiers
+	MSGSEG	Maximum number of message segments
+	MSGSSZ	Size of a message segment
+	MSGTQL	Maximum number of messages in the system
+

+The messages between Squid and diskd are 32 bytes. Thus, MSGMAX +should be 32 or greater. You may want to set it to a larger +value, just to be safe. + +

+We'll have two queues for each cache_dir (one in each direction).
+MSGMNB and MSGTQL affect how many messages can be in the queues
+at one time. I've found that 75 messages per queue is about
+the limit of decent performance. Thus, MSGMNB must be
+at least 75*MSGMAX, and MSGTQL must be at least 75 times
+the number of queues.
+
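+A worked example, under the assumption of four cache_dir lines (and thus
+eight queues) with 32-byte messages:
+
+	MSGMAX >= 32 bytes
+	MSGMNB >= 75 * 32 = 2400 bytes per queue
+	MSGTQL >= 75 * 8  = 600 messages system-wide
+
+
+FreeBSD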

+Your kernel must have + +options SYSVMSG + + +

+You can set the parameters in the kernel as follows. This is just +an example. Make sure the values are appropriate for your system: + +options MSGMNB=16384 # max # of bytes in a queue +options MSGMNI=41 # number of message queue identifiers +options MSGSEG=2049 # number of message segments +options MSGSSZ=64 # size of a message segment +options MSGTQL=512 # max messages in system + + +Digital Unix +

+Message queue support seems to be in the kernel +by default. Setting the options is as follows: + +options MSGMNB="8192" # max # bytes on queue +options MSGMNI="31" # # of message queue identifiers +options MSGMAX="2049" # max message size +options MSGTQL="1024" # # of system message headers + + +

+by +

+If you have a newer version (DU64), then you can probably use + +# sysconfig -q ipc + +To change them make a file like this called ipc.stanza: + +ipc: + msg-max = 2049 + msg-mni = 31 + msg-tql = 1024 + msg-mnb = 8192 + +then run + +# sysconfigdb -a -f ipc.stanza + +You have to reboot for the change to take effect. + + + +Linux +

+In my limited browsing on Linux, I didn't see any way to change +message queue parameters except to modify the include files +and build a new kernel. On my system, the file +is /usr/src/linux/include/linux/msg.h. + +Solaris +

+Refer to in Sunworld Magazine. + +

+I don't think the above article really tells you how to set the parameters. +You do it in /etc/system with lines like this: + +set msgsys:msginfo_msgmax=2049 +set msgsys:msginfo_msgmnb=8192 +set msgsys:msginfo_msgmni=31 +set msgsys:msginfo_msgsz=64 +set msgsys:msginfo_msgtql=1024 + +

+Of course, you must reboot whenever you modify /etc/system +before changes take effect. + +How do I configure shared memory? +

+Shared memory uses a set of parameters similar to the ones for message +queues. The Squid DISKD implementation uses one shared memory area +for each cache_dir. Each shared memory area is about +800 kilobytes in size. You may need to modify your system's +shared memory parameters: + +

+	SHMSEG	Maximum number of shared memory segments per process
+	SHMMNI	Maximum number of shared memory segments system-wide
+	SHMMAX	Maximum size of a shared memory segment (bytes)
+	SHMALL	Maximum total shared memory (pages)
+

+For Squid and DISKD, the limits must allow one such segment per
+cache_dir.
+
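+A worked example, again assuming four cache_dirs and 4-kbyte pages; the
+parameter names match the FreeBSD and Solaris settings shown below:
+
+	SHMSEG, SHMMNI >= 4 segments (one per cache_dir)
+	SHMMAX         >= 819200 bytes (about 800 kilobytes) per segment
+	SHMALL         >= 4 * 200 = 800 pages in total
+
+
+FreeBSD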

+Your kernel must have + +options SYSVSHM + + +

+You can set the parameters in the kernel as follows. This is just +an example. Make sure the values are appropriate for your system: + +options SHMSEG=16 # max shared mem id's per process +options SHMMNI=32 # max shared mem id's per system +options SHMMAX=2097152 # max shared memory segment size (bytes) +options SHMALL=4096 # max amount of shared memory (pages) + + +Digital Unix +

+Shared memory support seems to be in the kernel
+by default. Setting the options is as follows:
+
+options SHMSEG="16"      # max shared mem id's per process
+options SHMMNI="32"      # max shared mem id's per system
+options SHMMAX="2097152" # max shared memory segment size (bytes)
+options SHMALL=4096      # max amount of shared memory (pages)
+

+by +

+If you have a newer version (DU64), then you can probably use + +# sysconfig -q ipc + +To change them make a file like this called ipc.stanza: + +ipc: + shm-seg = 16 + shm-mni = 32 + shm-max = 2097152 + shm-all = 4096 + +then run + +# sysconfigdb -a -f ipc.stanza + +You have to reboot for the change to take effect. + + +Linux +

+In my limited browsing on Linux, I didn't see any way to change +shared memory parameters except to modify the include files +and build a new kernel. On my system, the file +is /usr/src/linux/include/asm-i386/shmparam.h + +

+Oh, it looks like you can change /proc/sys/kernel/shmmax. + +Solaris + +

+Refer to + +in Sunworld Magazine. + +

+To set the values, you can put these lines in /etc/system: + +set shmsys:shminfo_shmmax=2097152 +set shmsys:shminfo_shmmni=32 +set shmsys:shminfo_shmseg=16 + + +Sometimes shared memory and message queues aren't released when Squid exits. + +

+Insert this command into your squid startup (RunCache) script:
+
+ipcs | grep '^[mq]' | awk '{printf "ipcrm -%s %s\n", $1, $2}' | /bin/sh
+
+
+
+
+Authentication
+
+How does Proxy Authentication work in Squid?

+Note: The information here is current for version 2.4. +

+Users will be authenticated if squid is configured to use proxy_auth
+ACLs (see next question).
+
+Browsers send the user's authentication credentials in the
+Proxy-Authorization request header.
+
+If Squid gets a request and the http_access rule list gets to a
+proxy_auth ACL, Squid looks for the Proxy-Authorization header.
+If the header is missing, Squid returns
+an HTTP reply with status 407 (Proxy Authentication Required).
+The user agent (browser) receives the 407 reply and then prompts
+the user to enter a name and password. The name and password are
+encoded, and sent in the Proxy-Authorization header with subsequent
+requests to the proxy.
+
+Authentication is actually performed outside of the main Squid process.
+When Squid starts, it spawns a number of authentication subprocesses.
+These processes read usernames and passwords on stdin, and reply
+with "OK" or "ERR" on stdout. This technique allows you to use
+a number of different authentication schemes, although currently
+you can only use one scheme at a time.

+The Squid source code comes with a few authentication processes.
+These include:
+
+
+LDAP: Uses the Lightweight Directory Access Protocol
+
+NCSA: Uses an NCSA-style username and password file.
+
+MSNT: Uses a Windows NT authentication domain.
+
+PAM: Uses the Linux Pluggable Authentication Modules scheme.
+
+SMB: Uses a SMB server like Windows NT or Samba.
+
+getpwam: Uses the old-fashioned Unix password file.
+
+

+In order to authenticate users, you need to compile and install
+one of the supplied authentication modules listed above,
+or supply your own.
+

+You tell Squid which authentication program to use with the
+authenticate_program option in squid.conf:
+
+authenticate_program /usr/local/squid/bin/ncsa_auth /usr/local/squid/etc/passwd
+
+
+
+How do I use authentication in access controls?

+Make sure that your authentication program is installed +and working correctly. You can test it by hand. +
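For example, assuming the ncsa_auth helper and password file from the
previous answer, run the helper directly and type a username and
password on one line; it should print OK for good credentials and ERR
otherwise (the username and password shown here are made up):

% /usr/local/squid/bin/ncsa_auth /usr/local/squid/etc/passwd
lisa secretpw
OK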

+Add some proxy_auth ACL entries to your squid configuration file:

acl foo proxy_auth REQUIRED
acl all src 0/0
http_access allow foo
http_access deny all

The REQUIRED term means that any authenticated user will match the
ACL named foo.

Squid allows you to provide fine-grained controls
by specifying individual user names. For example:

acl foo proxy_auth REQUIRED
acl bar proxy_auth lisa sarah frank joe
acl daytime time 08:00-17:00
acl all src 0/0
http_access allow bar
http_access allow foo daytime
http_access deny all

In this example, users named lisa, sarah, joe, and frank
are allowed to use the proxy at all times. Other users
are allowed only during daytime hours.

Does Squid cache authentication lookups?

+Yes. Successful authentication lookups are cached for
one hour by default. That means (in the worst case) it's possible
for someone to keep using your cache for up to an hour after being
removed from the authentication database.

+You can control the expiration
time with the authenticate_ttl option.
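For example, to drop cached credentials after five minutes instead of
an hour (a sketch; the value is in seconds, but check your release's
squid.conf.default for the exact syntax):

authenticate_ttl 300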

Are passwords stored in clear text or encrypted?

+Squid stores cleartext passwords in its memory cache.

+Squid writes cleartext usernames and passwords when talking to
the external authentication processes. Note, however, that this
interprocess communication occurs over TCP connections bound to
the loopback interface. Thus, it's not possible for processes on
other computers to "snoop" on the authentication traffic.


+Each authentication program must select its own scheme for persistent +storage of passwords and usernames. + + + +Terms and Definitions + +Neighbor + +

+In Squid, the term neighbor refers to either a parent or a sibling cache.

In Harvest 1.4, neighbor referred to what Squid calls a sibling. That is, Harvest
had parents and neighbors.

Regular Expression

+Regular expressions are patterns that are used for matching sequences
of characters in text. For more information, consult your system's
regular expression documentation (for example, the regex manual page).


$Id: FAQ.sgml,v 1.1 2004/09/09 12:36:11 cvsdist Exp $


+
+
diff --git a/sources b/sources
index e69de29..8a6ae46 100644
--- a/sources
+++ b/sources
@@ -0,0 +1 @@
+c38c083f44c222a8d026fa129c30b98f squid-2.3.STABLE4-src.tar.gz
diff --git a/squid.init b/squid.init
new file mode 100644
index 0000000..bf051eb
--- /dev/null
+++ b/squid.init
@@ -0,0 +1,134 @@
+#!/bin/bash
+# squid		This shell script takes care of starting and stopping
+#		Squid Internet Object Cache
+#
+# chkconfig: - 90 25
+# description: Squid - Internet Object Cache. Internet object caching is \
+#	a way to store requested Internet objects (i.e., data available \
+#	via the HTTP, FTP, and gopher protocols) on a system closer to the \
+#	requesting site than to the source. Web browsers can then use the \
+#	local Squid cache as a proxy HTTP server, reducing access time as \
+#	well as bandwidth consumption.
+# pidfile: /var/run/squid.pid
+# config: /etc/squid/squid.conf
+
+PATH=/usr/bin:/sbin:/bin:/usr/sbin
+export PATH
+
+# Source function library.
+. /etc/rc.d/init.d/functions
+
+# Source networking configuration.
+. /etc/sysconfig/network
+
+# Check that networking is up.
+[ "${NETWORKING}" = "no" ] && exit 0
+
+# check if the squid conf file is present
+[ -f /etc/squid/squid.conf ] || exit 0
+
+# determine the name of the squid binary
+[ -f /usr/sbin/squid ] && SQUID=squid
+[ -z "$SQUID" ] && exit 0
+
+# determine which one is the cache_swap directory
+CACHE_SWAP=`sed -e 's/#.*//g' /etc/squid/squid.conf | \
+	grep cache_dir | awk '{ print $3 }'`
+[ -z "$CACHE_SWAP" ] && CACHE_SWAP=/var/spool/squid
+
+# default squid options
+# -D disables initial dns checks. Leave it enabled if squid may be
+# started without a working internet connection.
+SQUID_OPTS="-D"
+
+RETVAL=0
+
+start() {
+	echo -n "Starting $SQUID: "
+	for adir in $CACHE_SWAP; do
+		if [ ! -d $adir/00 ]; then
+			echo -n "init_cache_dir $adir... "
+			$SQUID -z -F 2>/dev/null
+		fi
+	done
+	$SQUID $SQUID_OPTS &
+	RETVAL=$?
+	echo $SQUID
+	[ $RETVAL -eq 0 ] && touch /var/lock/subsys/$SQUID
+	return $RETVAL
+}
+
+stop() {
+	echo -n "Stopping $SQUID: "
+	$SQUID -k shutdown &
+	RETVAL=$?
+	if [ $RETVAL -eq 0 ] ; then
+		rm -f /var/lock/subsys/$SQUID
+		while : ; do
+			[ -f /var/run/squid.pid ] || break
+			sleep 2 && echo -n "."
+		done
+		echo "done"
+	else
+		echo
+	fi
+	return $RETVAL
+}
+
+reload() {
+	$SQUID $SQUID_OPTS -k reconfigure
+}
+
+restart() {
+	stop
+	start
+}
+
+condrestart() {
+	[ -e /var/lock/subsys/squid ] && restart || :
+}
+
+rhstatus() {
+	status $SQUID
+	$SQUID -k check
+}
+
+probe() {
+	return 0
+}
+
+case "$1" in
+start)
+	start
+	;;
+
+stop)
+	stop
+	;;
+
+reload)
+	reload
+	;;
+
+restart)
+	restart
+	;;
+
+condrestart)
+	condrestart
+	;;
+
+status)
+	rhstatus
+	;;
+
+probe)
+	exit 0
+	;;
+
+*)
+	echo "Usage: $0 {start|stop|status|reload|restart|condrestart}"
+	exit 1
esac
+
+exit $?
diff --git a/squid.logrotate b/squid.logrotate
new file mode 100644
index 0000000..be3f30f
--- /dev/null
+++ b/squid.logrotate
@@ -0,0 +1,31 @@
+/var/log/squid/access.log {
+    weekly
+    rotate 5
+    copytruncate
+    compress
+    notifempty
+    missingok
+}
+/var/log/squid/cache.log {
+    weekly
+    rotate 5
+    copytruncate
+    compress
+    notifempty
+    missingok
+}
+
+/var/log/squid/store.log {
+    weekly
+    rotate 5
+    copytruncate
+    compress
+    notifempty
+    missingok
+# This script asks squid to rotate its logs on its own.
+# Restarting squid is a long process and it is not worth +# doing it just to rotate logs + postrotate + /usr/sbin/squid -k rotate + endscript +} diff --git a/squid.spec b/squid.spec new file mode 100644 index 0000000..83fc827 --- /dev/null +++ b/squid.spec @@ -0,0 +1,428 @@ +Summary: The Squid proxy caching server. +Name: squid +Version: 2.3.STABLE4 +Release: 1 +Serial: 6 +Copyright: GPL +Group: System Environment/Daemons +Source: http://www.squid-cache.org/Squid/v2/squid-%{version}-src.tar.gz +Source1: http://www.squid-cache.org/Squid/FAQ/FAQ.sgml +Source2: squid.init +Source3: squid.logrotate +Patch0: squid-2.1-make.patch +Patch1: squid-2.3-config.patch +Patch2: squid-perlpath.patch +Patch10: squid-2.3.stable4-ftp_icon_not_found.patch +Patch11: squid-2.3.stable4-internal_dns_rcode_table_formatting.patch +BuildRoot: /var/tmp/squid-root +Prereq: /sbin/chkconfig logrotate shadow-utils /etc/init.d +BuildPrereq: jade sgml-tools +Obsoletes: squid-novm + +%description +Squid is a high-performance proxy caching server for Web clients, +supporting FTP, gopher, and HTTP data objects. Unlike traditional +caching software, Squid handles all requests in a single, +non-blocking, I/O-driven process. Squid keeps meta data and especially +hot objects cached in RAM, caches DNS lookups, supports non-blocking +DNS lookups, and implements negative caching of failed requests. + +Squid consists of a main server program squid, a Domain Name System +lookup program (dnsserver), a program for retrieving FTP data +(ftpget), and some management and client tools. + +Install squid if you need a proxy caching server. + +%prep +%setup -q +%patch0 -p1 -b .make +%patch1 -p1 -b .config +%patch2 -p1 +%patch10 -p0 -b .ftp-icon +%patch11 -p0 -b .dns + +%build +%configure \ + --exec_prefix=/usr --bindir=/usr/sbin --libexecdir=/usr/lib/squid \ + --localstatedir=/var --sysconfdir=/etc/squid \ + --enable-poll --enable-snmp --enable-heap-replacement \ + --enable-delay-pools # --enable-icmp + +# Some versions of autoconf fail to detect sys/resource.h correctly; +# apparently because it generates a compiler warning. + +if [ -e /usr/include/sys/resource.h ]; then +cat >>include/autoconf.h </dev/null 2>&1 + +for i in /var/log/squid /var/spool/squid ; do + if [ -d $i ] ; then + for adir in `find $i -maxdepth 0 \! 
-user squid`; do
+			chown -R squid.squid $adir
+		done
+	fi
+done
+
+exit 0
+
+%post
+/sbin/chkconfig --add squid
+if [ "$1" = "1" ]; then
+	case "$LANG" in
+	bg*)
+		DIR=Bulgarian
+		;;
+	cs*)
+		DIR=Czech
+		;;
+	da*)
+		DIR=Danish
+		;;
+	nl*)
+		DIR=Dutch
+		;;
+	en*)
+		DIR=English
+		;;
+	et*)
+		DIR=Estonian
+		;;
+	fi*)
+		DIR=Finnish
+		;;
+	fr*)
+		DIR=French
+		;;
+	de*)
+		DIR=German
+		;;
+	hu*)
+		DIR=Hungarian
+		;;
+	it*)
+		DIR=Italian
+		;;
+	ja*)
+		DIR=Japanese
+		;;
+	ko*)
+		DIR=Korean
+		;;
+	pl*)
+		DIR=Polish
+		;;
+	pt*)
+		DIR=Portuguese
+		;;
+	ro*)
+		DIR=Romanian
+		;;
+	ru*)
+		DIR=Russian-koi8-r
+		;;
+	sk*)
+		DIR=Slovak
+		;;
+	es*)
+		DIR=Spanish
+		;;
+	sv*)
+		DIR=Swedish
+		;;
+	zh*)
+		DIR=Traditional_Chinese
+		;;
+	tr*)
+		DIR=Turkish
+		;;
+	*)
+		DIR=English
+		;;
+	esac
+	ln -snf /usr/lib/squid/errors/$DIR /etc/squid/errors
+fi
+
+%preun
+if [ $1 = 0 ] ; then
+	service squid stop >/dev/null 2>&1
+	/sbin/chkconfig --del squid
+	rm -f /var/log/squid/*
+fi
+
+%postun
+if [ $1 = 0 ] ; then
+	userdel squid
+fi
+if [ "$1" -ge "1" ] ; then
+	service squid condrestart >/dev/null 2>&1
+fi
+
+%changelog
+* Fri Jul 28 2000 Bill Nottingham
+- clean up init script; fix condrestart
+- update to STABLE4, more bugfixes
+- update FAQ
+
+* Tue Jul 18 2000 Nalin Dahyabhai
+- fix syntax error in init script
+- finish adding condrestart support
+
+* Fri Jul 14 2000 Bill Nottingham
+- move initscript back
+
+* Wed Jul 12 2000 Prospector
+- automatic rebuild
+
+* Thu Jul 6 2000 Bill Nottingham
+- prereq /etc/init.d
+- add bugfix patch
+- update FAQ
+
+* Thu Jun 29 2000 Bill Nottingham
+- fix init script
+
+* Tue Jun 27 2000 Bill Nottingham
+- don't prereq new initscripts
+
+* Mon Jun 26 2000 Bill Nottingham
+- initscript munging
+
+* Sat Jun 10 2000 Bill Nottingham
+- rebuild for exciting FHS stuff
+
+* Wed May 31 2000 Bill Nottingham
+- fix init script again (#11699)
+- add --enable-delay-pools (#11695)
+- update to STABLE3
+- update FAQ
+
+* Fri Apr 28 2000 Bill Nottingham
+- fix init script (#11087)
+
+* Fri Apr 7 2000 Bill Nottingham
+- three more bugfix patches from the squid people
+- buildprereq jade, sgmltools
+
+* Sun Mar 26 2000 Florian La Roche
+- make %pre more portable
+
+* Thu Mar 16 2000 Bill Nottingham
+- bugfix patches
+- fix dependency on /usr/local/bin/perl
+
+* Sat Mar 4 2000 Bill Nottingham
+- 2.3.STABLE2
+
+* Mon Feb 14 2000 Bill Nottingham
+- Yet More Bugfix Patches
+
+* Tue Feb 8 2000 Bill Nottingham
+- add more bugfix patches
+- --enable-heap-replacement
+
+* Mon Jan 31 2000 Cristian Gafton
+- rebuild to fix dependencies
+
+* Fri Jan 28 2000 Bill Nottingham
+- grab some bugfix patches
+
+* Mon Jan 10 2000 Bill Nottingham
+- 2.3.STABLE1 (whee, another serial number)
+
+* Tue Dec 21 1999 Bernhard Rosenkraenzer
+- Fix compliance with ftp RFCs
+  (http://www.wu-ftpd.org/broken-clients.html)
+- Work around a bug in some versions of autoconf
+- BuildPrereq sgml-tools - we're using sgml2html
+
+* Mon Oct 18 1999 Bill Nottingham
+- add a couple of bugfix patches
+
+* Wed Oct 13 1999 Bill Nottingham
+- update to 2.2.STABLE5.
+- update FAQ, fix URLs.
+
+* Sat Sep 11 1999 Cristian Gafton
+- transform restart in reload and add restart to the init script
+
+* Tue Aug 31 1999 Bill Nottingham
+- add squid user as user 23.
+ +* Mon Aug 16 1999 Bill Nottingham +- initscript munging +- fix conflict between logrotate & squid -k (#4562) + +* Wed Jul 28 1999 Bill Nottingham +- put cachemgr.cgi back in /usr/lib/squid + +* Wed Jul 14 1999 Bill Nottingham +- add webdav bugfix patch (#4027) + +* Mon Jul 12 1999 Bill Nottingham +- fix path to config in squid.init (confuses linuxconf) + +* Wed Jul 7 1999 Bill Nottingham +- 2.2.STABLE4 + +* Wed Jun 9 1999 Dale Lovelace +- logrotate changes +- errors from find when /var/spool/squid or +- /var/log/squid didn't exist + +* Thu May 20 1999 Bill Nottingham +- 2.2.STABLE3 + +* Thu Apr 22 1999 Bill Nottingham +- update to 2.2.STABLE.2 + +* Sun Apr 18 1999 Bill Nottingham +- update to 2.2.STABLE1 + +* Thu Apr 15 1999 Bill Nottingham +- don't need to run groupdel on remove +- fix useradd + +* Mon Apr 12 1999 Bill Nottingham +- fix effective_user (bug #2124) + +* Mon Apr 5 1999 Bill Nottingham +- strip binaries + +* Thu Apr 1 1999 Bill Nottingham +- duh. adduser does require a user name. +- add a serial number + +* Tue Mar 30 1999 Bill Nottingham +- add an adduser in %pre, too + +* Thu Mar 25 1999 Bill Nottingham +- oog. chkconfig must be in %preun, not %postun + +* Wed Mar 24 1999 Bill Nottingham +- switch to using group squid +- turn off icmp (insecure) +- update to 2.2.DEVEL3 +- build FAQ docs from source + +* Tue Mar 23 1999 Bill Nottingham +- logrotate changes + +* Sun Mar 21 1999 Cristian Gafton +- auto rebuild in the new build environment (release 4) + +* Wed Feb 10 1999 Bill Nottingham +- update to 2.2.PRE2 + +* Wed Dec 30 1998 Bill Nottingham +- cache & log dirs shouldn't be world readable +- remove preun script (leave logs & cache @ uninstall) + +* Tue Dec 29 1998 Bill Nottingham +- fix initscript to get cache_dir correct + +* Fri Dec 18 1998 Bill Nottingham +- update to 2.1.PATCH2 +- merge in some changes from RHCN version + +* Sat Oct 10 1998 Cristian Gafton +- strip binaries +- version 1.1.22 + +* Sun May 10 1998 Cristian Gafton +- don't make packages conflict with each other... + +* Sat May 02 1998 Cristian Gafton +- added a proxy auth patch from Alex deVries +- fixed initscripts + +* Thu Apr 09 1998 Cristian Gafton +- rebuilt for Manhattan + +* Fri Mar 20 1998 Cristian Gafton +- upgraded to 1.1.21/1.NOVM.21 + +* Mon Mar 02 1998 Cristian Gafton +- updated the init script to use reconfigure option to restart squid instead + of shutdown/restart (both safer and quicker) + +* Sat Feb 07 1998 Cristian Gafton +- upgraded to 1.1.20 +- added the NOVM package and tryied to reduce the mess in the spec file + +* Wed Jan 7 1998 Cristian Gafton +- first build against glibc +- patched out the use of setresuid(), which is available only on kernels + 2.1.44 and later +