Oracle 8i performance question - a cry for help!

  • Oracle 8i performance question - a cry for help!

    Hello Folks

    Got a problem, which is giving me major grief.

I have an Oracle database, for a live implementation of the software I have overall responsibility for (I am in no way an Oracle guru, but I know a few bits'n'pieces...)

    The problem is this....

The database is a fairly large one (sized at around 30GB) with about 200 registered users... We have the box configured (apparently) for 466 concurrent users.

When I connect > 50 users, the system starts to drag really badly... with 70 users, queries start taking about 90 seconds (where they took <<2 seconds before).

I have looked at the load on the processors (circa 60%), but page file writes are quite high.....

    anyone got any ideas?

I have trebled the size of the SGA, checked the fragmentation of the extents (they weren't bad) and checked for rogue processes, both in Oracle and at an OS level.

I intend to replace the box this instance is on - but it won't happen for at least another 2 months - and I am afraid that may just mask a configuration issue.....

The box is a Data General, triple-processor Intel Xeon, with 1GB RAM and a CLARiiON disk array....

    It is not, as far as I know running ANYTHING else...

    ANY help would be appreciated

    Regards
    RedRed
Don't just swallow the blue pill.

  • #2
RedRed, a while ago I was "chatting" with Oracle for info about a system I had to set up.

For our standard database, a bit less than 1GB, with 2 read/write users and about 30 read-only users, they recommended a cluster of 2 special PCs from Oracle, each one costing €50,000...


Dunno if the guy from Oracle was a bit "drunk" or what, but...
Sat on a pile of the dead, I enjoy my oysters.

    • #3
For a DB that size, you need a huge disk-array server or clustered servers. With that many users and that much data, it can't be helped. Some small things you can try:

      - Add more RAM.
      - Add a second NIC.
      - Move it to a faster network connection.
      - Move it to a big server with a disk array and multiple fast network connections.

      Jammrock
      “Inside every sane person there’s a madman struggling to get out”
      –The Light Fantastic, Terry Pratchett

      • #4
        Oracle

        RedRed,

I agree with Jammrock...

I maintain an Oracle server running on an IBM RS/6000/p6 server (AIX 4.3). It has an SSA disk array with bypass, with 16 9GB drives (2 for rootvg), and 4 533MHz processors with 512MB memory per processor. The server configuration also has 4 NICs per server (2 servers in this cabinet).

All users (150) connect through Cisco 2900 switches. As of now I have had no latency issues.

..knock...knock!

Do you have HACMP (High Availability Clustering for Multi-Processors) or an equivalent running?

        cc

        • #5
I'd suggest that you check whether the system is running out of memory. It sure sounds like it.

Your DB is (IMHO) fairly small, only 30GB, but that 30GB will not fit in memory, nor should it. That is what the caches are for. Of course, fast disks/disk arrays can be necessary. There are several basic mistakes people make when configuring the Oracle caches (SGA). Here are a few hints:

1. Configure SHARED_POOL_SIZE and DB_BLOCK_BUFFERS so that they are big enough. You want to keep as much data as possible in memory. The size of the shared pool depends on the application: if you use lots of PL/SQL or many different SQL statements, you might need a larger shared pool. Remember that DB_BLOCK_BUFFERS is configured NOT in bytes, but in blocks (block = DB_BLOCK_SIZE).

2. Do NOT allocate too much memory for the SGA! This will cause paging, which is obviously counter-productive! If you have 1GB RAM, then you can allocate maybe 700MB for database block buffers, 100MB for the shared pool, and leave 200MB for the OS and session-specific information. You can always show the size of the SGA in SQL*Plus with the command SHOW SGA.

3. Note that SORT_AREA_SIZE is allocated PER SESSION, not per instance. I have seen Oracle instances where people have configured a 10MB SORT_AREA_SIZE. With ~100 sessions that is 1GB. The right size for SORT_AREA_SIZE depends, of course, on the application.
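You can read the current per-session settings straight from SQL*Plus (v$parameter is a standard dynamic view; which parameters you care about is the only choice to make here):
Code:
select name, value
  from v$parameter
 where name in ('sort_area_size', 'sort_area_retained_size', 'hash_area_size');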

          4. There are other tricks, but those are the ones I have seen most.


You should also run the normal statistics scripts to see where the bottlenecks are. The order of optimization, however, is always from application to DB to OS to hardware (i.e. application optimization usually has the highest effect on performance). Without knowing the statistics for the DB, it is very hard to estimate where the problem actually is.

          You can also try to check for locks.
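A minimal sketch for that lock check (v$lock is a standard view; a row with BLOCK = 1 is a session holding a lock that other sessions are waiting on):
Code:
select sid, type, lmode, request
  from v$lock
 where block = 1;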

One thing: I believe that when CPU load is 60-70% the box is usually very heavily loaded, but I think your box should be able to cope with 70 users. It depends highly on the nature of your application, i.e.: type of application (OLTP or DW)? Types of users (% read-only, % read-write, etc. - e.g. some people only input data, some people run reports, some background processes load huge amounts of data)? How much data is inserted per day? Etc...

          If you can give more detailed information, I might be able to help more.
          "Time is fun when you are having flies"
          - Kermit the Frog

          "You know it's good when all your friends who know nothing about electronic music say 'what the hell is that crap!?'"
          -hotlizard@discogs.com

          • #6
            Hello There - thanks very much, one and all......

My main problem is that Oracle is 'not my bag', and currently the system is being maintained by the application developers (set up before my time as Project Manager). The problem is 'WHO WATCHES THE WATCHMEN?' I don't believe they have carried out any reasonable maintenance since they started! We have no formal resource within my organisation for this kind of thing....

BTW SHOW SGA returns:

            Total System Global Area 766792356 bytes
            Fixed Size 74404 bytes
            Variable Size 356945920 bytes
            Database Buffers 409600000 bytes
            Redo Buffers 172032 bytes

The setup parameters are as follows:
            background_dump_dest = /oracle8/admin/secomm/bdump
            compatible = 8.1.7
            control_files = "/app/ost/8.1.7/oradata/parist3/cntlparist302.ctl"
            control_files = "/app/ost/8.1.7/oradata/parist3/cntlparist301.ctl"
            control_files = "/app/ost/8.1.7/oradata/parist3/cntlparist303.ctl"
            core_dump_dest = /oracle8/admin/secomm/cdump
            db_block_buffers = 200000
            db_domain = world
            db_file_multiblock_read_count = 32
            db_files = 250
            db_name = parist3
            dml_locks = 500
            enqueue_resources = 1020
            global_names = TRUE
            hash_area_size = 24020000
            log_archive_dest_1 = "LOCATION=/user4/arch/parist3"
            log_archive_start = FALSE
            log_buffer = 163840
            log_checkpoint_interval = 10000
            max_dump_file_size = 10240
            open_cursors = 300
            optimizer_mode = CHOOSE
            processes = 500
            rollback_segments = ROLLBACK5
            rollback_segments = ROLLBACK1
            rollback_segments = ROLLBACK3
            rollback_segments = ROLLBACK7
            rollback_segments = ROLLBACK9
            rollback_segments = ROLLBACK2
            rollback_segments = ROLLBACK6
            rollback_segments = ROLLBACK8
            rollback_segments = ROLLBACK4
            shared_pool_size = 300000000
            sort_area_retained_size = 100000
            sort_area_size = 1020000
            user_dump_dest = /oracle8/admin/secomm/udump

I can't get the full init.ora as I don't have root access.... (I don't want it either - can't be blamed for any f*ups!)

I honestly don't know if these look reasonable -

I forgot to mention that the DB is sitting on a 10-disk CLARiiON array - I don't think there are any problems with its performance... This application is a DCOM three-tier application - with next to no usage on the client or the middle tier..... (probably a 'design issue' there!)

            RedRed
Don't just swallow the blue pill.

            • #7
1 gig does not sound like enough memory.
Our DBs are around that size and they have 4 procs and 4 gigs each.

But it could be your parse-to-execute ratio.
What language is the front end written in?
I am looking for a script to show your parse-to-execute ratio now.

              Here is a generic link to get you started.

              chuck

              [edit] well, I can't get the #$%^ links to work!?
              Just search the site for "hard parse"
              Last edited by cjolley; 17 February 2003, 06:54.
              Chuck
              秋音的爸爸

              • #8
                I'm not having much luck finding stuff here at home.
                When I get to work tomorrow I can get you more specific information.
                chuck
                Chuck
                秋音的爸爸

                • #9
Thanks, CJ - I am searching now....

Apparently, the Civil Service IT branch have just got on to our IT department - they have been looking at OS-level performance, and found a problem....

Apparently the CLARiiON array has up to 200 ms access time! OUCH! It's a nine-disk RAID 5 array - I was told by the software company it was adequate (though they didn't supply it).

That certainly won't be helpful......

I have now been tasked with replacing the box - I am going to outsource the advice - I don't know enough about the subject.....

I have either IBM or HP to choose from on our regional supply contract... I would have had to replace the box sooner or later anyway, as Oracle doesn't support DG/UX beyond Oracle 8i...

It will be interesting to find out more... It will probably take me 2 months to get it approved - nothing I can do there....

Can anyone explain why the system appears to perform OK until I hit a certain number of users - and then simply flops? It doesn't crash exactly, but its performance probably halves per 5 users beyond a particular threshold..... Before that it seems fine??? I thought page file usage - especially after hearing about the CLARiiON - but that wouldn't explain such a dramatic change....

Because I am tied into a support contract with the Data General mafia - (not my choice!) - adding more memory is going to cost me - perhaps a couple of thousand UK pounds for 4 gig.... and the box can't go beyond that anyway.....

                  Regards
                  RedRed
Don't just swallow the blue pill.

                  • #10
Actually, it is probably the paging that is the reason for your problem, but I cannot be sure since I don't have access to your server. (Remember that pagefile access from disk is something like 10,000 times slower than memory access!)

I went through your initialization parameters and came up with this: the SGA is mostly configured OK, BUT, if I recall correctly, the parameter HASH_AREA_SIZE is also per session, not per instance (I have to check this when I have more time. I only remember that it is badly documented...).

If I am right, this means that your HASH_AREA_SIZE will drain 24,020,000 bytes * 90 users = 2GB! Total memory for your instance is then nearly 3GB!

                    Try this:

Monitor your system performance throughout the day. Notice when the system starts to swap, and check how many active sessions are connected to the DB and what the response times for your queries are. If these match, it is the paging that is killing your performance (and I believe it is). You can also do this backwards: keep monitoring the system performance in the background, and immediately when you notice the queries starting to slow down, check the paging status and how many sessions are active in the DB. Also try to measure the memory size of the Oracle-related processes.
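For the active-session count in that comparison, something like this should do (v$session is a standard view; the USERNAME filter just excludes the background processes):
Code:
select count(*)
  from v$session
 where status = 'ACTIVE'
   and username is not null;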

If you think it is the paging (and I am right about HASH_AREA_SIZE), try reducing HASH_AREA_SIZE. Set it to e.g. 2MB. It will require a DB restart, but you can probably do this in the morning or at other off-hours. If it didn't help, you can put the value back.
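The init.ora change itself would look something like this (2MB is just a hedged starting value; the parameter is static in 8i, so it only takes effect after the instance is bounced):
Code:
# init.ora - cut per-session hash area from ~24MB to 2MB
hash_area_size = 2097152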

Hope this helps. If it doesn't, send more information and I'll try to think of something else. (One question: what kind of application are you running on your DB?)

                    Edited some typos
                    Last edited by Pyrophos; 17 February 2003, 11:42.
                    "Time is fun when you are having flies"
                    - Kermit the Frog

                    "You know it's good when all your friends who know nothing about electronic music say 'what the hell is that crap!?'"
                    -hotlizard@discogs.com

                    • #11
Another thing just came to my mind: RAID 5 configurations are REALLY bad for redo logs. The reasoning is that RAID 5 is OK for random access, such as normal datafiles, but very slow when writing lots of sequential data, such as redo logs.
There is an Oracle paper about this, but I was (what a surprise...) unable to find it just now.

In really performance-hungry situations, I have seen lots of systems using RAID 0+1 or RAID 1 configurations, so that e.g. one table is on one mirrored disk. Sure, they are a pain to manage, but the performance is excellent and you can tune the performance of the application by moving tables and tablespaces to different disks to reduce IO contention.

                      In most cases, however, I think that "plug-and-play" disk arrays are just fine.
                      "Time is fun when you are having flies"
                      - Kermit the Frog

                      "You know it's good when all your friends who know nothing about electronic music say 'what the hell is that crap!?'"
                      -hotlizard@discogs.com

                      • #12
I checked HASH_AREA_SIZE. It seems to be per session, although the explanation was not exact. Thus, 24MB * 90 users = 2GB of additional memory...

This would also explain the behavior of your system: as more users connect to the DB, they eat up all the memory. Then the system starts to swap. At first, the swapping will not slow the system down much, but as more users connect the swapping will rapidly kill the performance. As long as the system can keep all the SGA & PGA data in memory, performance suffers only a little per additional session. But after physical memory is exceeded, the performance degradation is almost exponential...

If I were the DBA of that instance, I would try reducing HASH_AREA_SIZE. HASH_AREA_SIZE defaults to 2 x SORT_AREA_SIZE, and I believe that the default value is adequate for most purposes.
                        "Time is fun when you are having flies"
                        - Kermit the Frog

                        "You know it's good when all your friends who know nothing about electronic music say 'what the hell is that crap!?'"
                        -hotlizard@discogs.com

                        • #13
Thanks guys - I think that between you, you've hit the nail on the head!

I am going through my SAR logs to check page file usage now,
and I've got TOAD to check the parse-to-hit ratio....

I was concerned that the box might not have sufficient RAM, but I can get the developer to tweak HASH_AREA_SIZE down for the time being (until I get my new server - and I won't make the same mistake!)

I REALLY, REALLY appreciate your time and effort.... anyone coming to Belfast can have lunch on me!

                          Regards
                          Red Red
Don't just swallow the blue pill.

                          • #14
                            Red,
                            I'm working up some info for you now.
                            But, I did want to ask: Do you have a development server?
It sounds like the developers have DBA rights on your production box.
This is very dangerous!
I am the designer of all of our databases and I write almost all of the SQL and PL/SQL that is used against them, and I don't have, or want, any write or compile rights in our production DB.
If you can, keep the box production is on now to accommodate a development instance of your DB when you get the new server.
Just an FYI warning.
                            chuck
                            Chuck
                            秋音的爸爸

                            • #15
                              Red,
                              Here is a way to find out if your app is crushing the SQL parser.

                              first:
                              Code:
-- Normalize a SQL statement so that literals no longer distinguish it:
-- string literals become '#, runs of digits become @, runs of spaces collapse.
function remove_constants( p_query in varchar2 ) return varchar2 as
    l_query long;
    l_char  varchar2(1);
    l_in_quotes boolean default FALSE;
begin
    for i in 1 .. length( p_query )
    loop
        l_char := substr(p_query,i,1);
        if ( l_char = '''' and l_in_quotes )
        then
            l_in_quotes := FALSE;
        elsif ( l_char = '''' and NOT l_in_quotes )
        then
            l_in_quotes := TRUE;
            l_query := l_query || '''#';
        end if;
        if ( NOT l_in_quotes ) then
            l_query := l_query || l_char;
        end if;
    end loop;
    -- replace every digit with @, then collapse repeated @s and repeated spaces
    l_query := translate( l_query, '0123456789', '@@@@@@@@@@' );
    for i in 0 .. 8 loop
        l_query := replace( l_query, lpad('@',10-i,'@'), '@' );
        l_query := replace( l_query, lpad(' ',10-i,' '), ' ' );
    end loop;
    return upper(l_query);
end;
                              Then:
                              Code:
                              create table t1 as select sql_text from v$sqlarea;
                               
                              alter table t1 add sql_text_wo_constants varchar2(1000);
                               
                              update t1 set sql_text_wo_constants = remove_constants(sql_text);
                              
                              select sql_text_wo_constants, count(*) 
                                from t1 
                               group by sql_text_wo_constants 
                              having count(*) > 100 
                               order by 2 desc;
The number of rows you get back shows how many copies of essentially identical SQL it is parsing. The ones with counts under 100 are no big deal.

Your developers can then fix this problem by identifying the program code that is producing these queries.

Here's how it works.
Each query to your DB is like a little program that the DB engine has to compile before it can be run.
This compile overhead (called parsing) usually takes more computing power than the actual retrieval of the information.
To allow the DB to service a high number of concurrent users, one of the tricks Oracle uses is to re-use parsed queries if it can.
However, the queries must be identical for this to work.
So there is a process called binding that allows for the use of variables in this small "program".
It goes like this inside the database:
1. bind 'Fred Blotnic' to variable v_var.
2. parse "select value from table where column = :v_var".
3. substitute 'Fred Blotnic' for v_var.
4. run query.
Next user to query "select value from table where column = :v_var":
1. bind 'John Doe' to v_var.
2. whoops! don't need to parse.
3. substitute 'John Doe' for v_var.
4. run query.
Next user queries with the variable hard coded:
1. oops! no bind.
2. parse "select value from table where column = 'Jane Roe'".
3. no bind to substitute.
4. run query.

                              Imagine the parse step as being as costly as having to re-compile Word in order to edit a document.

Some tools (particularly web-based ones) are really bad about submitting queries with the variables hard coded instead of bound.
This can also happen to apps that are written with a lot of dynamically (and poorly) generated SQL.
The symptoms are just what you have.
The app does great until active users are added to the point where the DB gets overloaded doing parsing.
It's as if you started opening lots of applications in Windows: it does fine until you reach a saturation point, then its performance collapses.
This is a very common and insidious bug in DB front-end programming against Oracle.
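To make the contrast concrete, here is a hedged PL/SQL sketch of the two styles (table t and column col are made up for illustration; 8i's native dynamic SQL takes binds through the USING clause):
Code:
declare
    l_value varchar2(100);
begin
    -- hard coded literal: the DB must parse this afresh for every distinct value
    execute immediate
        'select value from t where col = ''Jane Roe''' into l_value;

    -- bind variable: one shared parse, re-used no matter what value is supplied
    execute immediate
        'select value from t where col = :v_var' into l_value using 'Fred Blotnic';
end;
/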
                              Chuck

PS I'd be interested to hear what result you get if you run the code I gave you.
Our apps use bind variables a lot, so we only got two rows back, both of them off of the ASP stuff.
In fact, that's what I was working on just in the last couple of days.

                              [edit] for clarity
                              Last edited by cjolley; 18 February 2003, 14:00.
                              Chuck
                              秋音的爸爸
