Sunday, September 20, 2009

Oracle 10g RAC Setup on Solaris X86-64 bit platform

I am uploading a detailed RAC Setup that I did on Solaris X86-64 bit platform. I got the chance to configure the SAN running in Windows 2003 server. For those DBA's who would know about SAN and what LUN is and how it is created, this is a very good document.

www.oracleracsig.org/pls/apex/RAC_SIG.download_my_file?p_file=1002771

Hang Analyze for Oracle Database

The following blog was created because of the difficulty in getting Oracle give a reason for performance issues that are not repeatable. Normally when an SR is raised, after quite some time Oracle would ask for a list of files that we should have taken during the performance issue window.

The following steps would be best to find the root cause of a performance issue:

1) Using Truss/ Trace on the DB Process in the server

- Attaching to an existing process
- Use 'ps' to identify the process you wish to trace.
Eg: ps -ef | grep oracleTEST

Log in as a user who has permission to control the process you want to trace. This means logging in as either ROOT or the process owner.

Find a writable directory which has lots of spare space. Eg: We assume here /tmp has lots of free space.

Enter the relevant command:

truss -o /tmp/mytruss -p

or:

trace -o /tmp/mytruss -p

To follow child processes:

truss -o /tmp/mytruss -fae -p

This will run until the process (and any traced children) exit. You can interrupt the capture of information if only a snapshot of the trace is required.

2) Use of Hang Analyze in the Database

- Run the following in the Database which is hanging or in which a performance
issue is faced:

sqlplus '/ as sysdba'
oradebug setmypid
oradebug unlimit
oradebug hanganalyze 4
oradebug dump systemstate 266
quit

NOTE: THE SYSTEMSTATE should be taken carefully as it would bring the Instance/ Database to a standstill.

3) Upload files to Oracle Support

- Upload the relevant trace files created in the /tmp and udump folder to My Oracle Support for analysis


A Sample Action plan that I got for an issue is given below. I would suggest that it is best to follow this so that we do not end up Oracle telling us to do this N days after the issue was reported, by when this would be too late:

ACTION PLAN
===========
Steps suggested by Oracle Support during a performance issue is as follows:

Provide the 10046 trace for the slow select/update.
• To gather 10046 trace at the session level:
alter session set tracefile_identifier='10046';

alter session set timed_statistics = true;
alter session set statistics_level=all;
alter session set max_dump_file_size = unlimited;

alter session set events '10046 trace name context forever,level 12';

-- Execute the queries or operations to be traced here --

select * from dual;
exit;

Also provide the errorstack on the hung session.

connect / as sysdba
oradebug setospid
oradebug unlimit
oradebug dump errorstack 3
----wait for 10 secs----
oradebug dump errorstack 3
----wait for 10 secs----
oradebug dump errorstack 3
----wait for 10 secs----

Upload the traces generated in udump to metalink.

Using SQL*Plus connect as SYSDBA using the following command:
sqlplus " / as sysdba"
Do this 2 times in 2 separate windows, creating 2 sqlplus sessions (SQL1 and SQL2)
In SQL1 gather the hanganalyze by executing the following:
SQL1> oradebug setmypid
SQL1> oradebug unlimit;
SQL1> oradebug hanganalyze 3

In SQL2 gather the systemstates by executing the following:
SQL2> oradebug setmypid
SQL2> oradebug unlimit;
SQL2> oradebug dump systemstate 266

Gather a second hang analyze having waited at least 1 minute to give time to identify process state change
s. In SQL1 execute the following:
SQL1> oradebug hanganalyze 3

In SQL2 execute the following to collect a second systemstate dump:
SQL2> oradebug setmypid
SQL2> oradebug unlimit;
SQL2> oradebug dump systemstate 266

Hang Analyze for Oracle Database

The following blog was created because of the difficulty in getting Oracle give a reason for performance issues that are not repeatable.

DB Control 10.2 for RAC Database

The steps for dropping and creating DB Control for all version of 10.2 RAC is as follows:

1) Drop DB Control and remove the repository from the Database:

Note: If this is carried on in a database in which Transactions are happening, the database will go to a quiese state i.e. no new connections to the database would be
permitted. So be very careful before this is done. The following command normally finishes in about 4-5 minutes. This is not dependant on the size of the database.

-bash-3.00$ emca -deconfig dbcontrol db -repos drop -cluster
STARTED EMCA at Apr 1, 2009 4:13:23 AM
EM Configuration Assistant, Version 10.2.0.1.0 Production
Copyright (c) 2003, 2005, Oracle. All rights reserved.
Enter the following information:
Database unique name: testing
Listener port number: 1521
Password for SYS user:
Password for SYSMAN user:
Do you wish to continue? [yes(Y)/no(N)]: Y
Apr 1, 2009 4:14:09 AM oracle.sysman.emcp.EMConfig perform
INFO: This operation is being logged at /project/oracle/product/10.2.0/db_1/cfgtoollogs/emca/testing/emca_2009-04-01_04-13-23-AM.log.
Apr 1, 2009 4:14:11 AM oracle.sysman.emcp.util.DBControlUtil stopOMS
INFO: Stopping Database Control (this may take a while) ...
Apr 1, 2009 4:14:22 AM oracle.sysman.emcp.EMReposConfig dropRepository
INFO: Dropping the EM repository (this may take a while) ...
Apr 1, 2009 4:15:41 AM oracle.sysman.emcp.EMReposConfig invoke
INFO: Repository successfully dropped
Enterprise Manager configuration completed successfully
FINISHED EMCA at Apr 1, 2009 4:15:48 AM

2)Re-Create DB Control
Note: The following command normally finishes in about 5-10 minutes.

-bash-3.00$ emca -config dbcontrol db -repos create -cluster

STARTED EMCA at Apr 1, 2009 4:21:23 AM
EM Configuration Assistant, Version 10.2.0.1.0 Production
Copyright (c) 2003, 2005, Oracle. All rights reserved.

Enter the following information:
Database unique name: testing
Listener port number: 1521
Cluster name: crs
Password for SYS user:
Password for DBSNMP user:
Invalid username/password.
Password for DBSNMP user:
Password for SYSMAN user:
Email address for notifications (optional):
Outgoing Mail (SMTP) server for notifications (optional):
ASM ORACLE_HOME [ /project/oracle/product/10.2.0/db_1 ]:
ASM port [ 1521 ]:
ASM user role [ SYSDBA ]:
ASM username [ SYS ]:
ASM user password:
-----------------------------------------------------------------
You have specified the following settings
Database ORACLE_HOME ................ /project/oracle/product/10.2.0/db_1
Database instance hostname ................ n200
Listener port number ................ 1521
Cluster name ................ crs
Database unique name ................ testing
Email address for notifications...............
Outgoing Mail (SMTP) server for notifications...............
ASM ORACLE_HOME ................ /project/oracle/product/10.2.0/db_1
ASM port ................ 1521
ASM user role................ SYSDBA
ASM username................ SYS
-----------------------------------------------------------------
Do you wish to continue? [yes(Y)/no(N)]: Y
Apr 1, 2009 4:23:21 AM oracle.sysman.emcp.EMConfig perform
INFO: This operation is being logged at /project/oracle/product/10.2.0/db_1/cfgtoollogs/emca/testing/emca_2009-04-01_04-21-23-AM.log.
Apr 1, 2009 4:23:24 AM oracle.sysman.emcp.EMReposConfig createRepository
INFO: Creating the EM repository (this may take a while) ...
Apr 1, 2009 4:27:33 AM oracle.sysman.emcp.EMReposConfig invoke
INFO: Repository successfully created
Apr 1, 2009 4:27:36 AM oracle.sysman.emcp.EMDBCConfig instantiateOC4JConfigFiles
INFO: Propagating /project/oracle/product/10.2.0/db_1/oc4j/j2ee/OC4J_DBConsole_n200_testing1 to remote nodes ...
Apr 1, 2009 4:27:37 AM oracle.sysman.emcp.EMDBCConfig instantiateOC4JConfigFiles
INFO: Propagating /project/oracle/product/10.2.0/db_1/oc4j/j2ee/OC4J_DBConsole_n100_testing2 to remote nodes ...
Apr 1, 2009 4:27:39 AM oracle.sysman.emcp.EMAgentConfig deployStateDirs
INFO: Propagating /project/oracle/product/10.2.0/db_1/n200_testing1 to remote nodes ...
Apr 1, 2009 4:27:41 AM oracle.sysman.emcp.EMAgentConfig deployStateDirs
INFO: Propagating /project/oracle/product/10.2.0/db_1/n100_testing2 to remote nodes ...
Apr 1, 2009 4:27:42 AM oracle.sysman.emcp.util.DBControlUtil startOMS
INFO: Starting Database Control (this may take a while) ...
Apr 1, 2009 4:29:25 AM oracle.sysman.emcp.EMDBPostConfig performConfiguration
INFO: Database Control started successfully
Apr 1, 2009 4:29:25 AM oracle.sysman.emcp.EMDBPostConfig performConfiguration
INFO: >>>>>>>>>>> The Database Control URL is http://n200:1158/em <<<<<<<<<<<
Apr 1, 2009 4:29:25 AM oracle.sysman.emcp.EMDBPostConfig showClusterDBCAgentMessage
INFO:
**************** Current Configuration ****************
INSTANCE NODE DBCONTROL_UPLOAD_HOST
---------- ---------- ---------------------
testing1 n200 n200
testing2 n100 n200
Enterprise Manager configuration completed successfully
FINISHED EMCA at Apr 1, 2009 4:29:25 AM

3) Verify DB Control is running

-bash-3.00$ emctl status dbconsole
Oracle Enterprise Manager 10g Database Control Release 10.2.0.3.0
Copyright (c) 1996, 2006 Oracle Corporation. All rights reserved.
http://n200:1158/em/console/aboutApplication
Oracle Enterprise Manager 10g is running.
------------------------------------------------------------------
Logs are generated in directory /project/oracle/product/10.2.0/db_1/n200_testing1/sysman/log

4) Verify DB Control is working as required

To ensure that the DB Control is working as required connects to the specified IP using the web browser:
e.g.: http://192.168.20.213:1158/em/

Note: Do Connect to either of the Instances in the RAC Cluster from the EM Console and ensure all the information (i.e. the DB Control page of each instance is displayed) is being displayed. Otherwise it indicates that the DB Control installation is not correct