Tuesday, January 26, 2010

ORA-4030 errors and swap sizing on DataWarehousing platforms

"ora-04030: out of process memory when trying to allocate %s bytes (%s,%s)"

Now, what would ORA-4030 errors have to do with swap sizing?

My perspective is limited to DataWarehouse systems, so this is probably relevant to such environments only. This is also on the Solaris platform, though a similar analogy could work on Linux too.

In DataWarehousing, PGA sizes can get fairly large and overall PGA consumption can vary dramatically depending on the type and nature of extracts/queries and the degree of parallelism.

I have seen PGA consumption as high as 2.5x PAT (PGA_AGGREGATE_TARGET).

Since PGA is anonymous memory (private to the process), it needs swap reservations in place. In other words, an amount of swap equal to the PGA is reserved to serve as a backing store, which is used if physical memory actually runs short. If using DISM, there would be swap reservations in place for the SGA as well. I use the words reservation and allocation interchangeably in this discussion.

The important point to keep in mind is that regardless of whether you have actual memory shortages resulting in swap usage, swap allocation/reservation would always occur.

Consider this scenario - (Assuming /tmp is empty)

Operating System - Solaris
Memory on system = 64GB
Swap Device on Disk = 25GB

Memory Requirements
OS + other apps = 2GB
SGA = 12GB (ISM and so no swap requirements)
PAT = 24GB

If all your processes are consuming 24GB of aggregated PGA, then the swap backing store would need to be 24GB in size. Including the OS requirements, the total swap backing store would need to be 26GB.

In Solaris, this swap reservation/allocation can be met by a combination of free physical memory and/or physical swap device.

In this case, free physical memory is 26GB (64GB - 2GB - 12GB - 24GB), so it is quite possible that the 26GB of swap backing store can be met entirely from free physical memory. If not, then sufficient space from the physical swap device (25GB in our case) would be used.

Now what happens when we start exceeding 24GB? Let us look at peak PGA usage.

Peak PGA usage (2X PAT) = 48GB
Total Peak Memory usage = 48GB + 12GB + 2GB = 62GB
Available free physical memory = 64GB - 62GB = 2GB

48GB of PGA would need 48GB of swap reservation. 2GB of OS requirements would also require swap reservations for a total of 50GB of swap reservations.

Since the peak total memory requirement is 62GB, only 2GB of free physical memory is available. The physical swap device is 25GB in size, thus making for only 27GB of possible available swap space. Obviously, the system cannot handle 48GB of PGA consumption. Swap is short by 23GB (50GB - 27GB).
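The arithmetic above can be sketched as a small Python function. This is only a model of the reasoning in this post, not of how the Solaris kernel accounts for swap. The function and parameter names are made up for illustration, and it assumes that reservations can be backed by free physical memory plus the swap device and that an ISM SGA needs no reservation.

```python
def swap_shortfall_gb(ram_gb, swap_device_gb, sga_ism_gb, os_gb, pga_gb):
    """Return the GB of swap reservation that cannot be satisfied (0 if it fits)."""
    # An ISM SGA needs no swap reservation, but it still consumes physical memory.
    used_ram = sga_ism_gb + os_gb + pga_gb
    free_ram = max(ram_gb - used_ram, 0)
    # Anonymous memory (PGA plus OS/other apps) needs a reservation.
    reservation_needed = pga_gb + os_gb
    # Simplified model from the post: free RAM + swap device back the reservations.
    available = free_ram + swap_device_gb
    return max(reservation_needed - available, 0)


print(swap_shortfall_gb(64, 25, 12, 2, 24))  # PGA at PAT: prints 0
print(swap_shortfall_gb(64, 25, 12, 2, 48))  # PGA at 2X PAT: prints 23
```

With the numbers from the scenario, the 24GB case fits and the 48GB case comes up 23GB short.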

Even though there is still 2GB of free memory, you will definitely encounter ORA-4030 errors. Along with the 4030 errors, you would also see messages like these in the /var/adm/messages file:

messages:Jan 20 20:20:56 oradb genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 9489 (oracle)
messages:Jan 20 20:23:15 oradb genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 9327 (oracle)

So how much can the PGA grow with a 25GB Swap Device (in the above scenario)? Somewhere around 36GB would be about right.

PGA = 36GB
SGA = 12GB
OS = 2GB

Total consumed physical memory = 50GB (36GB + 12GB + 2GB)
Available free physical memory = 14GB (64GB - 50GB)
Available swap = 14GB + 25GB = 39GB
Required swap reservation = 38GB (36GB + 2GB), which just fits within the 39GB available
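The 36GB figure can also be derived rather than found by trial. Under the same simplified model (reservations backed by free physical memory plus the swap device, ISM SGA needing none), the PGA fits as long as PGA + OS <= (RAM - PGA - SGA - OS) + swap device. The Python sketch below solves that for the PGA; the function name is hypothetical.

```python
def max_pga_gb(ram_gb, swap_device_gb, sga_ism_gb, os_gb):
    """Largest PGA (GB) whose swap reservation still fits, under the post's model."""
    # Solve pga + os <= (ram - pga - sga - os) + swap for pga.
    candidate = (ram_gb + swap_device_gb - sga_ism_gb - 2 * os_gb) / 2
    # That formula only holds while some physical memory is still free.
    ram_limit = ram_gb - sga_ism_gb - os_gb
    if candidate <= ram_limit:
        return candidate
    # Otherwise free RAM is zero and only the swap device backs reservations.
    return swap_device_gb - os_gb


print(max_pga_gb(64, 25, 12, 2))  # prints 36.5
```

For this scenario the limit works out to 36.5GB, which agrees with "somewhere around 36GB".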

So the recommendation would be to have physical swap devices of at least 3X PAT (assuming your peak PGA utilization is 2.5X PAT). This way you would not run into ORA-4030 errors due to insufficient swap space.
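As a rough check of the 3X rule against the example system, the Python sketch below estimates the swap device needed for a given PAT and peak multiplier. It uses the same simplified model and hypothetical names as before, so treat the result as a planning estimate only.

```python
def required_swap_device_gb(ram_gb, sga_ism_gb, os_gb, pat_gb, peak_multiplier):
    """Swap device size (GB) needed to back peak PGA plus OS reservations."""
    peak_pga = pat_gb * peak_multiplier
    # Free RAM left at peak; zero if physical memory is oversubscribed.
    free_ram = max(ram_gb - (sga_ism_gb + os_gb + peak_pga), 0)
    # Whatever free RAM cannot back must come from the swap device.
    return max(peak_pga + os_gb - free_ram, 0)


needed = required_swap_device_gb(64, 12, 2, 24, 2.5)
print(needed)   # prints 62.0
print(3 * 24)   # 3X PAT: prints 72
```

At 2.5X PAT the example needs about 62GB of swap device, so 3X PAT (72GB) leaves some headroom. Note that at that peak the total demand (74GB) exceeds the 64GB of physical memory, so some real paging would be expected in addition to the reservations.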


Jabulani said...

What if this occurs on Windows Server, since it has no swap space? How do I investigate the source or cause of the problem?

Gunnar "Nick" Bluth said...

"This is also on the Solaris platform, though a similar analogy could work on Linux too."

Linux doesn't know the concept of swap reservations (i.e., unless you tell it to), and additionally has a very sophisticated COW mechanism to share memory pages. I've personally seen Oracle servers (OLTP though) that committed > 200GB of anonymous RAM on a server with 16GB DIMMs + 16GB swap.

This is a real advantage with modern systems. You run into severe issues with those modern boxes with 512GB of DIMMs or so. To seriously use such an amount of space with Solaris, you need another 512GB (!) of swap space only for swap reservation, even if you never really dip into it. A Linux on the same server will happily go > 80% memory used w/out any swap space, and can be run with, say, 32GB of swap space (even with today's HDDs, your customer will be on the phone once you're in the area of 10GB swap used ;-).

Best regards,

Gunnar "Nick" Bluth