Using pathping to detect network timeouts

Recently our BPC nightly job failed intermittently, and a few database mirror servers reported timeouts as well. I needed to prove to the network team that this was a network issue, not an application issue, so I set up a Windows scheduled task that runs pathping between those servers every minute. It did prove the network timeouts. Here are the script and a sample of the logs.

pingnetwork.bat :
echo %date% %time%
pathping server1
echo %date% %time%
pathping server2
echo %date% %time%
pathping server3
echo %date% %time%
pathping server4

The scheduled task runs the script and appends its output to a log file:
pingnetwork.bat >> e:\dropit\d.txt

The log d.txt looks like this:

"
Thu 05/19/2011 3:20:15.19
E:\dba>pathping agenwi034
Tracing route to agenwi034.corp.ctv.ca [10.1.38.134]
over a maximum of 30 hops:
  0  agends030.corp.ctv.ca [10.1.38.130]
  1  *  agenwi034.corp.ctv.ca [10.1.38.134]
Computing statistics for 25 seconds...
            Source to Here   This Node/Link
Hop  RTT    Lost/Sent = Pct  Lost/Sent = Pct  Address
  0                                           agends030.corp.ctv.ca [10.1.38.130]
                               68/ 100 = 68%   |
  1    1ms    68/ 100 = 68%     0/ 100 = 0%   agenwi034.corp.ctv.ca [10.1.38.134]
Trace complete.
"

I then FTP the log file to a Unix host and filter it down to the packet-loss lines, which makes it easier to read:
grep -E "\/ 100" d.txt > d1.txt

The new log d1.txt looks like this:
"
E:\dba>echo Thu 05/19/2011 3:20:15.19
Thu 05/19/2011 3:20:15.19
0/ 100 = 0% |
1 0ms 0/ 100 = 0% 0/ 100 = 0% agends030.corp.ctv.ca [10.1.38.130]
68/ 100 = 68% |
1 1ms 68/ 100 = 68% 0/ 100 = 0% agenwi034.corp.ctv.ca [10.1.38.134]
65/ 100 = 65% |
1 1ms 65/ 100 = 65% 0/ 100 = 0% agenwi031.corp.ctv.ca [10.1.38.131]
60/ 100 = 60% |
1 1ms 60/ 100 = 60% 0/ 100 = 0% agenwi032.corp.ctv.ca [10.1.38.132]
39/ 100 = 39% |
1 0ms 39/ 100 = 39% 0/ 100 = 0% agenwi033.corp.ctv.ca [10.1.38.133]
"
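The filtering step could also be written so that each loss figure stays attached to the timestamp of its run and only lossy links are reported. The following is a minimal sketch, not the script I actually ran: `report_loss` and its optional threshold argument are hypothetical names, and it assumes the log layout shown above (a timestamp line such as "Thu 05/19/2011 3:20:15.19" written by the echo command, followed by pathping hop lines ending in a bracketed IP address). A different regional date format would need a different timestamp pattern.

```shell
#!/bin/sh
# Sketch only: report_loss and its threshold argument are illustrative names,
# not part of the original script.
# Usage: report_loss LOGFILE [MIN_PCT]   (MIN_PCT defaults to 1)
report_loss() {
    awk -v min="${2:-1}" '
        # Timestamp line from "echo %date% %time%", e.g. Thu 05/19/2011 3:20:15.19
        /^[A-Z][a-z][a-z] [0-9][0-9]\/[0-9][0-9]\/[0-9][0-9][0-9][0-9] / {
            stamp = $0
            next
        }
        # Hop line: hop number, RTT in ms, loss figures, hostname, [ip]
        $2 ~ /^[0-9]+ms$/ && $NF ~ /^\[.*\]$/ {
            # The first percentage on the line is the "Source to Here" loss
            if (match($0, /[0-9]+%/)) {
                pct = substr($0, RSTART, RLENGTH - 1)
                if (pct + 0 >= min)
                    print stamp " | " $(NF-1) " " $NF " | loss " pct "%"
            }
        }
    ' "$1"
}
```

For example, `report_loss d.txt 10` would print one line per hop with at least 10% loss, each prefixed by the time the pathping run started.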

The network team is now convinced that this is a network issue and is working on it.
