| Author |
Message |
alexc Site Admin

Joined: 17 Dec 2004 Posts: 25292 Location: Birmingham, UK
|
Posted: Tue Oct 14, 2008 10:01 pm Post subject: How to diagnose nodes that generate too many errors |
|
|
This thread applies to nodes that generate too many DNS, timeout, or connection errors - usually each of these should be under a few percent. The main purpose of this thread is to serve as the link sent to node owners when too many errors are detected, covering the possible issues affecting their nodes (soon the server will probably send these emails automatically - to me at first, and it will be configurable for sure).
Having nodes that generate too many errors is very bad for the project - such buckets will need to be re-issued and we will have to clean up assigned credits for badly crawled buckets. So your help in getting a quality crawl is much appreciated - this thread will help solve the key reasons for too many errors.
-----------------------------------------------------------------
Reason 1 (Windows only): Microsoft by default cripples the TCP/IP stack to 10 half-open connections.
Diagnose - open Control Panel -> Admin Tools -> Event Viewer, switch to the System log and look for a Warning message with Source Tcpip saying:
"TCP/IP has reached the security limit imposed on the number of concurrent TCP connect attempts."
If you see that then consider saying a few good swear words towards the guy at Microsoft who dreamed up this limit that can't be easily changed by the end user (at least for some apps that are known to be good).
Solution: Use the Half-Open fix tool to increase the max number of half-open connections (recommended new default - 100) - a reboot is required after this (or so it seems).
Windows may reset the TCP/IP stack after some updates, and you would then need to re-run the half-open tool!
-----------------------------------------------------------------
Reason 2 (all OSes): running high CPU usage processes like other Distributed Computing projects - they grab all the CPU they can and the node is starved of CPU; this usually shows up as a high number of timeouts.
Solution: give the node higher priority in Options->Crawler - ideally make high CPU usage programs run at Idle and give the node Above Normal priority - maybe apart from the Archiving thread, which you might still want to run at lower priority.
-----------------------------------------------------------------
Reason 3 (all OSes): your router fails to handle that many connections.
Solution: reduce the number of workers - don't go too high! Usually 75-100 workers are enough for a healthy 5-7 Mbit/s crawl rate. If you see high error rates and the other reasons do not apply, cut down the workers - start low, say at 25-30 workers, and work your way up.
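The "start low and work your way up" advice can be sketched as a simple rule of thumb. This is only an illustration: the node does not expose such a function, you change the worker count by hand in its options, and the name `next_worker_count`, the step of 10 workers and the 3% error threshold are all assumptions made for the example.

```python
def next_worker_count(current, error_rate, max_error_rate=0.03,
                      step=10, floor=25, ceiling=100):
    """Suggest the next worker count to try by hand.

    error_rate is the combined share of DNS/timeout/connection errors
    (0.05 means 5%). The 3% threshold and step of 10 are assumed values,
    not settings taken from the node.
    """
    if error_rate > max_error_rate:
        # Too many errors: back off, but never below the starting point.
        return max(floor, current - step)
    # Errors look fine: go up a little, but never above the ceiling.
    return min(ceiling, current + step)


if __name__ == "__main__":
    workers = 25
    # Hypothetical error rates observed after each adjustment.
    for observed in (0.01, 0.02, 0.06, 0.02):
        workers = next_worker_count(workers, observed)
        print(workers)
```

Let the node crawl for a while between adjustments, otherwise one bad bucket can push you in the wrong direction.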
-----------------------------------------------------------------
Reason 4 (all OSes) - massive error counts due to a firewall update.
Solution: make sure the node is given the right to access the network when the firewall asks for permission - sometimes the firewall or the node gets updated, so permission may need to be given again.
-----------------------------------------------------------------
Reason 5 (all OSes) - the DNS server can't handle the load - too many DNS errors
This can happen with some routers. Use the free OpenDNS service - http://www.opendns.com/ - all you need to do is point your DNS servers to their IPs:
208.67.222.222
208.67.220.220
This is known to fix a number of issues with some routers - the DNS change can be undone at any time and no software download is necessary. The change can be made in the router OR on the PC which runs the node - Network Options -> TCP/IP Properties and it's there.
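Before changing any settings you may want to check that the OpenDNS servers answer from your network. Below is a minimal sketch using only the Python standard library: it builds a plain DNS A-record query by hand and sends it over UDP to the given server. The function names and the test hostname `example.com` are assumptions for the example, and this is a quick reachability check rather than a full DNS client.

```python
import socket
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Build a standard DNS query packet for an A record."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + qname + struct.pack("!HH", 1, 1)


def resolver_answers(server, hostname, timeout=3.0, query_id=0x1234):
    """Return True if the server replies with at least one answer."""
    query = build_dns_query(hostname, query_id)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts and unreachable-network errors.
            return False
    if len(reply) < 12:
        return False
    reply_id, flags, _, ancount, _, _ = struct.unpack("!HHHHHH", reply[:12])
    # Low 4 bits of flags are the response code; 0 means no error.
    return reply_id == query_id and (flags & 0x000F) == 0 and ancount > 0


if __name__ == "__main__":
    for server in ("208.67.222.222", "208.67.220.220"):
        print(server, resolver_answers(server, "example.com"))
```

If both servers report True from the PC that runs the node, pointing your DNS settings at them should be safe to try.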
-----------------------------------------------------------------
Other reasons may be responsible for errors - sometimes some buckets are simply bad, so it is usually a good idea to let the node crawl 50-70k URLs before jumping to conclusions. It is much appreciated if you keep an eye on your node for abnormal error levels. At the time of writing, if your success rate is much lower than 85% then you might have a problem - fear not, it can be fixed. |
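The rules of thumb in this post can be put together as a quick check on the counts shown in the node's stats. This is a rough sketch, not something the node does itself: the function name and dictionary keys are made up for the example, "a few percent" is taken as 3%, and the 85% figure is read as a minimum success rate, which are both assumptions.

```python
def crawl_warnings(counts, min_urls=50_000, max_error_share=0.03,
                   min_success_share=0.85):
    """Return a list of warnings for one node's crawl counts.

    counts needs the keys: total, successes, dns_errors, timeouts,
    conn_errors. All thresholds are assumed values for illustration.
    """
    total = counts["total"]
    if total < min_urls:
        # Too early to judge: a single bad bucket can skew the numbers.
        return ["too few URLs crawled to judge"]
    warnings = []
    for key in ("dns_errors", "timeouts", "conn_errors"):
        share = counts[key] / total
        if share > max_error_share:
            warnings.append(f"{key} at {share:.1%}")
    success_share = counts["successes"] / total
    if success_share < min_success_share:
        warnings.append(f"successes at {success_share:.1%}")
    return warnings


if __name__ == "__main__":
    # Made-up numbers for a node with a DNS problem.
    sample = {"total": 60_000, "successes": 48_000, "dns_errors": 3_000,
              "timeouts": 600, "conn_errors": 600}
    print(crawl_warnings(sample))
```

An empty list means nothing looks abnormal by these thresholds; anything else is a hint to go through the reasons above.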
|
|
 |
kharri1073 Senior Member
Joined: 17 Apr 2007 Posts: 436
|
Posted: Wed Dec 31, 2008 12:50 am Post subject: Re: How to diagnose nodes that generate too many errors |
|
|
| alexc wrote: |
Reason 1 (Windows only): Microsoft by default cripples TCP/IP stack to 10 half open connections.
Diagnose - check in Control Panel -> Admin Tools -> Events Viewers, switch to System tab look for Warning message saying for TCPIP Source:
"TCP/IP has reached the security limit imposed on the number of concurrent TCP connect attempts."
If you see that then consider saying a few good swear words towards the guy at Microsoft who dreamed up this limit that can't be easily changed by the end user (at least for some apps that are known to be good).
Solution: Use Half-Open fix tool to increase max number of half open connections (recommended new default - 100) - reboot is required after this (it seems this way).
Windows may reset TCP/IP stack after some updates and you'd need to re-run half-open tool again! |
It looks like the half-open fix tool is no longer being developed. They only offer a download of an "undo" tool that patches your tcpip.sys file back to the original.
They posted a link to a "new and better" patch tool, DeepXW or TCP-Z - I'm not sure of the name. This tool lets you patch tcpip.sys in memory so you don't have to restart your computer.
Anyone try it out yet? I'm about to. _________________ Kevin's Blog
 |
|
|
 |
nibbs Junior Member
Joined: 30 Dec 2008 Posts: 6 Location: Germany, Berlin
|
Posted: Wed Dec 31, 2008 1:04 am Post subject: |
|
|
| I tried to, no success. Had to use the tool from www.lvllord.de |
|
|
 |
kharri1073 Senior Member
Joined: 17 Apr 2007 Posts: 436
|
Posted: Wed Dec 31, 2008 1:10 am Post subject: |
|
|
I think it worked nicely for me, and no reboot was needed either. I'm firing up a node on Vista to test; hopefully the router won't die.
| Code: | MJ12node : v1.6.7 (.NET 2.0)
Platform : Win32 specific running on Microsoft Windows NT 6.0.6000.0
Total URLs : 16,945 (100.0%)
Successes : 16,217 (95.7%)
Not found : 62 (0.4%)
Timed out : 9 (0.1%)
Disallowed : 216 (1.3%)
Banned : 1 (0.0%)
DNS errors : 370 (2.2%)
Conn errors : 10 (0.1%)
Forbidden (403): 0 (0.0%)
Other : 60 (0.4%)
Retries : 100 (8.0%)
Uptime : 14 mins 38 secs
Memory usage : 105 MB
GZIP requests : 4,419 (26.1% of successes)
GZIP saved data: 95 MB (19.1% of total) |
It's still kind of early to come to any conclusions, but the tool says I have gone above 10 half-open connections and nothing bad has happened yet. _________________ Kevin's Blog
 |
|
|
 |
nibbs Junior Member
Joined: 30 Dec 2008 Posts: 6 Location: Germany, Berlin
|
Posted: Wed Dec 31, 2008 1:18 am Post subject: |
|
|
Good luck then!  |
|
|
 |
alexc Site Admin

Joined: 17 Dec 2004 Posts: 25292 Location: Birmingham, UK
|
Posted: Wed Dec 31, 2008 2:10 am Post subject: Re: How to diagnose nodes that generate too many errors |
|
|
| kharri1073 wrote: | | It looks like the half-open fix tool has discontinued development. They only allow a download for an "undo" tool to patch your tcpip.sys file back to the original. |
You're right - I have not tried the new tool but will do. So far it seems the same thing was integrated into a shiny new product; perhaps that person bought the rights, perhaps not. I liked the original one, I have to say. |
|
|
 |
Grubee Senior Member

Joined: 15 Oct 2005 Posts: 957
|
Posted: Fri Apr 17, 2009 6:41 pm Post subject: Half-open limit fix (patch) for Windows TCP (Out) Connection |
|
|
Hi all,
An update to the Half-open Limit fix (patch) is available:
"Half-open limit fix is a program designed to change the maximum number of concurrent half-open outbound TCP connections (connection attempts) in the Windows system file tcpip.sys"
Website: http://half-open.com/home_en.htm
Download: http://half-open.com/download_en.htm
Cheers Grubee |
|
|
 |
|