miércoles, 14 de diciembre de 2011

Checking the status of an Oracle Instance in Unix

In many Unix script is very common to find the commands below to check the status of an Oracle Instance :

_status=`ps -fe | grep pmon_${ORACLE_SID} | grep -v grep | wc -l`

This usually is correct. If the database is Up then it will return 1, else it returns 0.

But what happens if there's two instances in the same server. One named ORCL and the other ORC. In this case the command will return 2 when both databases are Up.

A posible solution to avoid a mistaken behavior in the script when asking "if [ ${_status} -eq 1 ]; then", is to check the status of the process relative to the specific instance. It can be done this way:

_status=`ps -fe | grep pmon_${ORACLE_SID} | grep -v grep | awk '{ print $NF"*" }' | grep pmon_${ORACLE_SID}"*" | wc -l`

With the commands above it will always return 0 or 1.

sábado, 3 de diciembre de 2011

Problem with dinamic array in ksh script

The other day I was working on a unix script in which I use dynamic arrays. I found some strange behavior when trying to fill up the array. When I use the code with “while loop” the array didn’t fill up.

  n=0
  grep -i "fbackup(" logfile.log | while read line
  do
   errores[$a]="$line"
   (( n=n+1 ))
  done


When the above code was changed to use the “for loop”, it did worked.

  ORIGIFS=$IFS
  IFS=`echo -en "\n\b"`     # set $IFS to end-of-line
  n=0
  for linea in `grep -i "fbackup(" logfile.log`
  do
   errores[$n]=${linea}
   (( n=n+1 ))
  done
  IFS=$ORIGIFS      # set $IFS back



domingo, 13 de febrero de 2011

krvxerpt: Errors detected in process #, role LOGICAL STANDBY COORDINATOR.

While doing some tests in my lab, I have faced error ORA-01425 on the logical standby. This causes the logical standby process to stop. See the log files and trace, and the workaround from Oracle.

Error
in primary alert.log


Sun Feb 13 08:38:39 2011
Error 12170 received logging on to the standby
Sun Feb 13 08:38:39 2011
Errors in file /oracle/app/oracle/admin/lcssl/bdump/lcssl_arc0_6079.trc:
ORA-12170: TNS:Connect timeout occurred
PING[ARC0]: Heartbeat failed to connect to standby 'lcssl.bccasdsp'. Error is 12170.



Error in standby alert.log

Sun Feb 13 08:41:31 2011
krvxerpt: Errors detected in process 22, role LOGICAL STANDBY COORDINATOR.
Sun Feb 13 08:41:31 2011
krvsqn2s: unhandled failure 604
Sun Feb 13 08:41:31 2011
Errors in file /oracle/app/oracle/admin/lcssl/bdump/lcssl_lsp0_6571.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01425: escape character must be character string of length 1
LOGSTDBY status: ORA-00604: error occurred at recursive SQL level 1
ORA-01425: escape character must be character string of length 1



Trace file /oracle/app/oracle/admin/lcssl/bdump/lcssl_lsp0_6571.trc

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /oracle/app/oracle/10.2
System name: Linux
Node name: bccasdsp.labdomain
Release: 2.6.9-42.0.0.0.1.ELsmp
Version: #1 SMP Sun Oct 15 14:02:40 PDT 2006
Machine: i686
Instance name: lcssl
Redo thread mounted by this instance: 1
Oracle process number: 22
Unix process pid: 6571, image: oracle@bccasdsp.labdomain (LSP0)

*** SERVICE NAME:(SYS$BACKGROUND) 2011-02-13 08:41:30.818
*** SESSION ID:(147.1) 2011-02-13 08:41:30.818
ORA-00604: error occurred at recursive SQL level 1
ORA-01425: escape character must be character string of length 1
knahcapplymain: encountered error=604
*** 2011-02-13 08:41:30.821
ksedmp: internal or fatal error
ORA-00604: error occurred at recursive SQL level 1
ORA-01425: escape character must be character string of length 1
KNACDMP: *******************************************************
KNACDMP: Dumping apply coordinator's context at bfffe214
KNACDMP: Apply Engine # 0
KNACDMP: Apply Engine name
KNACDMP: Coordinator's Watermarks ------------------------------
KNACDMP: Apply High Watermark = 0x0000.0169d754
KNACDMP: Apply Low Watermark = 0x0000.0169d754
KNACDMP: Recovery Low Watermark = 0x0000.00000000
KNACDMP: Fetch Low Watermark = 0x0000.00000000
KNACDMP: Fetch Low Watermark Time = 0
KNACDMP: Oldest SCN = 0x0000.00000000
KNACDMP: Oldest XID = 0x0000.000.00000000
KNACDMP: Last replicant syncpoint SCN = 0x0000.00000000
KNACDMP: Last syncpoint at primary SCN = 0x0000.00000000
KNACDMP: First partition max SCN = 0x0000.00608092
KNACDMP: Last partition max SCN = 0x0000.00608092
KNACDMP: Last processed SCN = 0x0000.0169d755
KNACDMP: Conservative SCN = 0x0000.00000000
KNACDMP: Recovery start SCN = 0x0000.00000000
KNACDMP: Recovery high watermark = 0x0000.00000000
KNACDMP: Coordinator's constants -------------------------------
KNACDMP: number of apply slaves = 5
KNACDMP: safety level (K) = 1
KNACDMP: max txns in memory = 400
KNACDMP: max constraints per table = 246
KNACDMP: hash table size (in entries) = 40000
KNACDMP: Coordinator's intervals -------------------------------
KNACDMP: syncpoint interval (ms) = 0
KNACDMP: write low watermark interval(ms)= 1
KNACDMP: Coordinator's timers/counters -------------------------
KNACDMP: current time = 1297600890
KNACDMP: low watermark timer = 0
KNACDMP: syncpoint timer = 0
KNACDMP: Coordinator's txn counts -------------------------
KNACDMP: total txns applied = 0
KNACDMP: total applied at last plwm write= 0
KNACDMP: apply prog. entries below plwm = 0
KNACDMP: Coordinator's State/Flags -----------------------------
KNACDMP: Coordinator's State = KNACST_APPLY_UNTIL_END
KNACDMP: Coordinator's Flags = 0x0
KNACDMP: Slave counts ------------------------------------------
KNACDMP: number of reserved slaves = 1
KNACDMP: number of admin slaves = 0
KNACDMP: number of slaves in wait cmt = 0
KNACDMP: number of slaves suspended = 0
KNACDMP: number of safe slaves = 0
KNACDMP: Slave Lists -------------------------------------------
KNACDMP: Dumping All Slaves :-
Slave id = 0, State = 8, Flags = 0, Not Assigned
Slave id = 1, State = 0, Flags = 0, Not Assigned
Slave id = 2, State = 0, Flags = 0, Not Assigned
Slave id = 3, State = 0, Flags = 0, Not Assigned
Slave id = 4, State = 0, Flags = 0, Not Assigned
Slave id = 5, State = 0, Flags = 0, Not Assigned
KNACDMP: End dumping all slaves
KNACDMP: syncdep slaves = { }
KNACDMP: cont chunk slaves = { }
KNACDMP: cont slaves = { }
KNACDMP: exec txn slaves = { }
KNACDMP:Idle slaves (5) ={ 1 2 3 4 5 }
KNACDMP: Txn Lists ---------------------------------------------
KNACDMP: Dumping all txns :-
KNACDMP: End dumping all txns.
KNACDMP: Complete txns = { ** NO UNASS ** }
KNACDMP: Unassigned txns = { }
KNACDMP: *******************************************************
Warning: Apply error received: ORA-26714: User error encountered while applying. Clearing.
ORA-00604: error occurred at recursive SQL level 1
knahcapplymain: encountered error=604
*** 2011-02-13 08:41:31.457
ksedmp: internal or fatal error
ORA-00604: error occurred at recursive SQL level 1
ORA-01425: escape character must be character string of length 1
KNACDMP: *******************************************************
KNACDMP: Dumping apply coordinator's context at bfffe214
KNACDMP: Apply Engine # 0
KNACDMP: Apply Engine name
KNACDMP: Coordinator's Watermarks ------------------------------
KNACDMP: Apply High Watermark = 0x0000.0169d754
KNACDMP: Apply Low Watermark = 0x0000.0169d754
KNACDMP: Recovery Low Watermark = 0x0000.00000000
KNACDMP: Fetch Low Watermark = 0x0000.00000000
KNACDMP: Fetch Low Watermark Time = 0
KNACDMP: Oldest SCN = 0x0000.00000000
KNACDMP: Oldest XID = 0x0000.000.00000000
KNACDMP: Last replicant syncpoint SCN = 0x0000.00000000
KNACDMP: Last syncpoint at primary SCN = 0x0000.00000000
KNACDMP: First partition max SCN = 0x0000.00608092
KNACDMP: Last partition max SCN = 0x0000.00608092
KNACDMP: Last processed SCN = 0x0000.0169d755
KNACDMP: Conservative SCN = 0x0000.00000000
KNACDMP: Recovery start SCN = 0x0000.00000000
KNACDMP: Recovery high watermark = 0x0000.00000000
KNACDMP: Coordinator's constants -------------------------------
KNACDMP: number of apply slaves = 5
KNACDMP: safety level (K) = 1
KNACDMP: max txns in memory = 400
KNACDMP: max constraints per table = 246
KNACDMP: hash table size (in entries) = 40000
KNACDMP: Coordinator's intervals -------------------------------
KNACDMP: syncpoint interval (ms) = 0
KNACDMP: write low watermark interval(ms)= 1
KNACDMP: Coordinator's timers/counters -------------------------
KNACDMP: current time = 1297600891
KNACDMP: low watermark timer = 0
KNACDMP: syncpoint timer = 0
KNACDMP: Coordinator's txn counts -------------------------
KNACDMP: total txns applied = 0
KNACDMP: total applied at last plwm write= 0
KNACDMP: apply prog. entries below plwm = 0
KNACDMP: Coordinator's State/Flags -----------------------------
KNACDMP: Coordinator's State = KNACST_APPLY_UNTIL_END
KNACDMP: Coordinator's Flags = 0x200
KNACDMP: Slave counts ------------------------------------------
KNACDMP: number of reserved slaves = 1
KNACDMP: number of admin slaves = 0
KNACDMP: number of slaves in wait cmt = 0
KNACDMP: number of slaves suspended = 0
KNACDMP: number of safe slaves = 0
KNACDMP: Slave Lists -------------------------------------------
KNACDMP: Dumping All Slaves :-
Slave id = 0, State = 8, Flags = 0, Not Assigned
Slave id = 1, State = 0, Flags = 0, Not Assigned
Slave id = 2, State = 0, Flags = 0, Not Assigned
Slave id = 3, State = 0, Flags = 0, Not Assigned
Slave id = 4, State = 0, Flags = 0, Not Assigned
Slave id = 5, State = 0, Flags = 0, Not Assigned
KNACDMP: End dumping all slaves
KNACDMP: syncdep slaves = { }
KNACDMP: cont chunk slaves = { }
KNACDMP: cont slaves = { }
KNACDMP: exec txn slaves = { }
KNACDMP:Idle slaves (5) ={ 1 2 3 4 5 }
KNACDMP: Txn Lists ---------------------------------------------
KNACDMP: Dumping all txns :-
KNACDMP: End dumping all txns.
KNACDMP: Complete txns = { ** NO UNASS ** }
KNACDMP: Unassigned txns = { }
KNACDMP: *******************************************************
Warning: Apply error received: ORA-26714: User error encountered while applying. Clearing.

Solution :

According to note 748208.1 this is Bug 5108158 wich will be fix in 11g. Meanwhile there is a workaround :

1. Stop Logical Standby Apply :
alter database stop logical standby;

2. Execute :

set echo on
set pagesize 100
spool workaround.log
select * from system.logstdby$skip;
select distinct nvl(esc, 'NULL') from system.logstdby$skip;
select * from system.logstdby$skip where esc is null;
update system.logstdby$skip
set esc = '\'
where esc is NULL;

-- Following should return no rows (due to update above)
select * from system.logstdby$skip where esc is null;
-- should no longer see any NULL in output
select distinct nvl(esc, 'NULL') from system.logstdby$skip;
-- Capture a snapshot of the final results
select * from system.logstdby$skip;

-- commit changes
commit;


3. Start Logical Standby Apply :
alter database start logical standby apply immediate;

martes, 8 de septiembre de 2009

Memory Notification: Library Cache Object Loaded Into Sga

Problem :

Found many notifications in the alert.log reporting that a threshold was exceeded on the heap size.

Solution :

Set the hidden parameter _kgl_large_heap_warning_threshold to a reasonable greater size. The default threshold in 10.2.0.1 is 2M. In 10.2.0.2, the threshold was increased to 50M. See Metalink Note

330239.1


domingo, 16 de agosto de 2009

PANIC - Error while reading file 5 - state.tgz

While I was doing some test on this VMware ESXi, a power outage happened. When I booted the server this error was displayed :

PANIC: Error while reading file: -5, state.tgz

Oopss ! That freak me a out because I had a lot work done in this machine wich includes several linux servers with Oracle dataguard and Grid Control installed, Windows 2003 SP1, Windows 7, etc.

I immediately googled the error and found a solution in this link http://communities.vmware.com/thread/168277

The error posted in the link was :

Dave.Mishchenko reply this three ways to solve the problem :

Since I did’nt have a backup of the state.tgz file, I choose to use the Recovery Install of ESXi. So what I did was :

Boot from the VMWare ESXi 4.0 cd and select the Recovery Option. Then after the server boot normally I had to reconfigure the password, the Ip Address (I use fixed Address) and the DNS with the old values.

Then I connected to the server using VSphere Client and found out that it is not register and none of my virtual machines were present.

To restore all my virtual machines configuration, this is what I did :