miércoles, 14 de diciembre de 2011

Checking the status of an Oracle Instance in Unix

In many Unix script is very common to find the commands below to check the status of an Oracle Instance :

_status=`ps -fe | grep pmon_${ORACLE_SID} | grep -v grep | wc -l`

This usually is correct. If the database is Up then it will return 1, else it returns 0.

But what happens if there's two instances in the same server. One named ORCL and the other ORC. In this case the command will return 2 when both databases are Up.

A posible solution to avoid a mistaken behavior in the script when asking "if [ ${_status} -eq 1 ]; then", is to check the status of the process relative to the specific instance. It can be done this way:

_status=`ps -fe | grep pmon_${ORACLE_SID} | grep -v grep | awk '{ print $NF"*" }' | grep pmon_${ORACLE_SID}"*" | wc -l`

With the commands above it will always return 0 or 1.

sábado, 3 de diciembre de 2011

Problem with dinamic array in ksh script

The other day I was working on a unix script in which I use dynamic arrays. I found some strange behavior when trying to fill up the array. When I use the code with “while loop” the array didn’t fill up.

  n=0
  grep -i "fbackup(" logfile.log | while read line
  do
   errores[$a]="$line"
   (( n=n+1 ))
  done


When the above code was changed to use the “for loop”, it did worked.

  ORIGIFS=$IFS
  IFS=`echo -en "\n\b"`     # set $IFS to end-of-line
  n=0
  for linea in `grep -i "fbackup(" logfile.log`
  do
   errores[$n]=${linea}
   (( n=n+1 ))
  done
  IFS=$ORIGIFS      # set $IFS back



domingo, 13 de febrero de 2011

krvxerpt: Errors detected in process #, role LOGICAL STANDBY COORDINATOR.

While doing some tests in my lab, I have faced error ORA-01425 on the logical standby. This causes the logical standby process to stop. See the log files and trace, and the workaround from Oracle.

Error
in primary alert.log


Sun Feb 13 08:38:39 2011
Error 12170 received logging on to the standby
Sun Feb 13 08:38:39 2011
Errors in file /oracle/app/oracle/admin/lcssl/bdump/lcssl_arc0_6079.trc:
ORA-12170: TNS:Connect timeout occurred
PING[ARC0]: Heartbeat failed to connect to standby 'lcssl.bccasdsp'. Error is 12170.



Error in standby alert.log

Sun Feb 13 08:41:31 2011
krvxerpt: Errors detected in process 22, role LOGICAL STANDBY COORDINATOR.
Sun Feb 13 08:41:31 2011
krvsqn2s: unhandled failure 604
Sun Feb 13 08:41:31 2011
Errors in file /oracle/app/oracle/admin/lcssl/bdump/lcssl_lsp0_6571.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01425: escape character must be character string of length 1
LOGSTDBY status: ORA-00604: error occurred at recursive SQL level 1
ORA-01425: escape character must be character string of length 1



Trace file /oracle/app/oracle/admin/lcssl/bdump/lcssl_lsp0_6571.trc

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /oracle/app/oracle/10.2
System name: Linux
Node name: bccasdsp.labdomain
Release: 2.6.9-42.0.0.0.1.ELsmp
Version: #1 SMP Sun Oct 15 14:02:40 PDT 2006
Machine: i686
Instance name: lcssl
Redo thread mounted by this instance: 1
Oracle process number: 22
Unix process pid: 6571, image: oracle@bccasdsp.labdomain (LSP0)

*** SERVICE NAME:(SYS$BACKGROUND) 2011-02-13 08:41:30.818
*** SESSION ID:(147.1) 2011-02-13 08:41:30.818
ORA-00604: error occurred at recursive SQL level 1
ORA-01425: escape character must be character string of length 1
knahcapplymain: encountered error=604
*** 2011-02-13 08:41:30.821
ksedmp: internal or fatal error
ORA-00604: error occurred at recursive SQL level 1
ORA-01425: escape character must be character string of length 1
KNACDMP: *******************************************************
KNACDMP: Dumping apply coordinator's context at bfffe214
KNACDMP: Apply Engine # 0
KNACDMP: Apply Engine name
KNACDMP: Coordinator's Watermarks ------------------------------
KNACDMP: Apply High Watermark = 0x0000.0169d754
KNACDMP: Apply Low Watermark = 0x0000.0169d754
KNACDMP: Recovery Low Watermark = 0x0000.00000000
KNACDMP: Fetch Low Watermark = 0x0000.00000000
KNACDMP: Fetch Low Watermark Time = 0
KNACDMP: Oldest SCN = 0x0000.00000000
KNACDMP: Oldest XID = 0x0000.000.00000000
KNACDMP: Last replicant syncpoint SCN = 0x0000.00000000
KNACDMP: Last syncpoint at primary SCN = 0x0000.00000000
KNACDMP: First partition max SCN = 0x0000.00608092
KNACDMP: Last partition max SCN = 0x0000.00608092
KNACDMP: Last processed SCN = 0x0000.0169d755
KNACDMP: Conservative SCN = 0x0000.00000000
KNACDMP: Recovery start SCN = 0x0000.00000000
KNACDMP: Recovery high watermark = 0x0000.00000000
KNACDMP: Coordinator's constants -------------------------------
KNACDMP: number of apply slaves = 5
KNACDMP: safety level (K) = 1
KNACDMP: max txns in memory = 400
KNACDMP: max constraints per table = 246
KNACDMP: hash table size (in entries) = 40000
KNACDMP: Coordinator's intervals -------------------------------
KNACDMP: syncpoint interval (ms) = 0
KNACDMP: write low watermark interval(ms)= 1
KNACDMP: Coordinator's timers/counters -------------------------
KNACDMP: current time = 1297600890
KNACDMP: low watermark timer = 0
KNACDMP: syncpoint timer = 0
KNACDMP: Coordinator's txn counts -------------------------
KNACDMP: total txns applied = 0
KNACDMP: total applied at last plwm write= 0
KNACDMP: apply prog. entries below plwm = 0
KNACDMP: Coordinator's State/Flags -----------------------------
KNACDMP: Coordinator's State = KNACST_APPLY_UNTIL_END
KNACDMP: Coordinator's Flags = 0x0
KNACDMP: Slave counts ------------------------------------------
KNACDMP: number of reserved slaves = 1
KNACDMP: number of admin slaves = 0
KNACDMP: number of slaves in wait cmt = 0
KNACDMP: number of slaves suspended = 0
KNACDMP: number of safe slaves = 0
KNACDMP: Slave Lists -------------------------------------------
KNACDMP: Dumping All Slaves :-
Slave id = 0, State = 8, Flags = 0, Not Assigned
Slave id = 1, State = 0, Flags = 0, Not Assigned
Slave id = 2, State = 0, Flags = 0, Not Assigned
Slave id = 3, State = 0, Flags = 0, Not Assigned
Slave id = 4, State = 0, Flags = 0, Not Assigned
Slave id = 5, State = 0, Flags = 0, Not Assigned
KNACDMP: End dumping all slaves
KNACDMP: syncdep slaves = { }
KNACDMP: cont chunk slaves = { }
KNACDMP: cont slaves = { }
KNACDMP: exec txn slaves = { }
KNACDMP:Idle slaves (5) ={ 1 2 3 4 5 }
KNACDMP: Txn Lists ---------------------------------------------
KNACDMP: Dumping all txns :-
KNACDMP: End dumping all txns.
KNACDMP: Complete txns = { ** NO UNASS ** }
KNACDMP: Unassigned txns = { }
KNACDMP: *******************************************************
Warning: Apply error received: ORA-26714: User error encountered while applying. Clearing.
ORA-00604: error occurred at recursive SQL level 1
knahcapplymain: encountered error=604
*** 2011-02-13 08:41:31.457
ksedmp: internal or fatal error
ORA-00604: error occurred at recursive SQL level 1
ORA-01425: escape character must be character string of length 1
KNACDMP: *******************************************************
KNACDMP: Dumping apply coordinator's context at bfffe214
KNACDMP: Apply Engine # 0
KNACDMP: Apply Engine name
KNACDMP: Coordinator's Watermarks ------------------------------
KNACDMP: Apply High Watermark = 0x0000.0169d754
KNACDMP: Apply Low Watermark = 0x0000.0169d754
KNACDMP: Recovery Low Watermark = 0x0000.00000000
KNACDMP: Fetch Low Watermark = 0x0000.00000000
KNACDMP: Fetch Low Watermark Time = 0
KNACDMP: Oldest SCN = 0x0000.00000000
KNACDMP: Oldest XID = 0x0000.000.00000000
KNACDMP: Last replicant syncpoint SCN = 0x0000.00000000
KNACDMP: Last syncpoint at primary SCN = 0x0000.00000000
KNACDMP: First partition max SCN = 0x0000.00608092
KNACDMP: Last partition max SCN = 0x0000.00608092
KNACDMP: Last processed SCN = 0x0000.0169d755
KNACDMP: Conservative SCN = 0x0000.00000000
KNACDMP: Recovery start SCN = 0x0000.00000000
KNACDMP: Recovery high watermark = 0x0000.00000000
KNACDMP: Coordinator's constants -------------------------------
KNACDMP: number of apply slaves = 5
KNACDMP: safety level (K) = 1
KNACDMP: max txns in memory = 400
KNACDMP: max constraints per table = 246
KNACDMP: hash table size (in entries) = 40000
KNACDMP: Coordinator's intervals -------------------------------
KNACDMP: syncpoint interval (ms) = 0
KNACDMP: write low watermark interval(ms)= 1
KNACDMP: Coordinator's timers/counters -------------------------
KNACDMP: current time = 1297600891
KNACDMP: low watermark timer = 0
KNACDMP: syncpoint timer = 0
KNACDMP: Coordinator's txn counts -------------------------
KNACDMP: total txns applied = 0
KNACDMP: total applied at last plwm write= 0
KNACDMP: apply prog. entries below plwm = 0
KNACDMP: Coordinator's State/Flags -----------------------------
KNACDMP: Coordinator's State = KNACST_APPLY_UNTIL_END
KNACDMP: Coordinator's Flags = 0x200
KNACDMP: Slave counts ------------------------------------------
KNACDMP: number of reserved slaves = 1
KNACDMP: number of admin slaves = 0
KNACDMP: number of slaves in wait cmt = 0
KNACDMP: number of slaves suspended = 0
KNACDMP: number of safe slaves = 0
KNACDMP: Slave Lists -------------------------------------------
KNACDMP: Dumping All Slaves :-
Slave id = 0, State = 8, Flags = 0, Not Assigned
Slave id = 1, State = 0, Flags = 0, Not Assigned
Slave id = 2, State = 0, Flags = 0, Not Assigned
Slave id = 3, State = 0, Flags = 0, Not Assigned
Slave id = 4, State = 0, Flags = 0, Not Assigned
Slave id = 5, State = 0, Flags = 0, Not Assigned
KNACDMP: End dumping all slaves
KNACDMP: syncdep slaves = { }
KNACDMP: cont chunk slaves = { }
KNACDMP: cont slaves = { }
KNACDMP: exec txn slaves = { }
KNACDMP:Idle slaves (5) ={ 1 2 3 4 5 }
KNACDMP: Txn Lists ---------------------------------------------
KNACDMP: Dumping all txns :-
KNACDMP: End dumping all txns.
KNACDMP: Complete txns = { ** NO UNASS ** }
KNACDMP: Unassigned txns = { }
KNACDMP: *******************************************************
Warning: Apply error received: ORA-26714: User error encountered while applying. Clearing.

Solution :

According to note 748208.1 this is Bug 5108158 wich will be fix in 11g. Meanwhile there is a workaround :

1. Stop Logical Standby Apply :
alter database stop logical standby;

2. Execute :

set echo on
set pagesize 100
spool workaround.log
select * from system.logstdby$skip;
select distinct nvl(esc, 'NULL') from system.logstdby$skip;
select * from system.logstdby$skip where esc is null;
update system.logstdby$skip
set esc = '\'
where esc is NULL;

-- Following should return no rows (due to update above)
select * from system.logstdby$skip where esc is null;
-- should no longer see any NULL in output
select distinct nvl(esc, 'NULL') from system.logstdby$skip;
-- Capture a snapshot of the final results
select * from system.logstdby$skip;

-- commit changes
commit;


3. Start Logical Standby Apply :
alter database start logical standby apply immediate;