Introduction
ASO (Active System Optimizer) is a user-space daemon that improves system performance by optimizing hardware utilization. It uses a lightweight mechanism to find the most frequently accessed ("hot") shared memory pages with a base size of 4 KB or 64 KB and promote them to 16 MB pages. Backing hot shared memory with 16 MB pages reduces TLB misses for user applications, which improves performance.
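To make the page-size relationship concrete, the arithmetic below shows how many base pages a single 16 MB page replaces. It is a minimal sketch; the helper name base_pages_per_large_page is illustrative and is not part of ASO or AIX.

```c
#include <stdint.h>

/* Page sizes named in the text above, in bytes. */
#define PSIZE_4K   (4ULL * 1024)
#define PSIZE_64K  (64ULL * 1024)
#define PSIZE_16M  (16ULL * 1024 * 1024)

/*
 * Illustrative helper (not an ASO or AIX API): number of base pages
 * covered by one large page. Both sizes are assumed to be powers of
 * two, with large_psize a multiple of base_psize.
 */
static uint64_t
base_pages_per_large_page(uint64_t base_psize, uint64_t large_psize)
{
    return large_psize / base_psize;
}
```

One 16 MB page covers 4096 pages of 4 KB or 256 pages of 64 KB, so a single translation entry can map a range that would otherwise need that many entries.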
ASO employs a three-step procedure for optimization:
1. Identify Eligible Workload
To qualify for optimization, application processes must meet the following criteria:
- Have a shared memory (SHM) attachment of at least 16 GB.
- Have been running for at least 10 minutes.
- Utilize at least two CPU cores.
2. Profile to Identify TLB Misses
Eligible processes are analyzed to determine whether their TLB misses exceed a predefined threshold, making them suitable for optimization.
3. Profile to Identify Hot Pages
For processes experiencing high TLB misses, frequently accessed pages within the shared memory region are identified. These hot pages are then promoted to 16 MB to improve performance.
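The three steps above can be sketched as a chain of simple checks. This is only an illustration of the decision flow: the struct, the function names, the TLB-miss threshold, and the hot-page threshold are all hypothetical, since this document does not describe ASO's internal data structures or threshold values. Only the three eligibility limits come from the criteria listed above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Eligibility limits taken from the criteria listed above. */
#define MIN_SHM_BYTES     (16ULL * 1024 * 1024 * 1024)  /* 16 GB */
#define MIN_RUNTIME_SECS  (10ULL * 60)                   /* 10 minutes */
#define MIN_CPU_CORES     2U

/* Hypothetical per-process summary; not ASO's real data structure. */
struct workload {
    uint64_t shm_bytes;     /* size of the SHM attachment */
    uint64_t runtime_secs;  /* how long the process has been running */
    unsigned cpu_cores;     /* CPU cores the process is using */
    uint64_t tlb_misses;    /* TLB misses observed while profiling */
};

/* Step 1: does the process meet all three eligibility criteria? */
static bool
is_eligible(const struct workload *w)
{
    return w->shm_bytes >= MIN_SHM_BYTES &&
           w->runtime_secs >= MIN_RUNTIME_SECS &&
           w->cpu_cores >= MIN_CPU_CORES;
}

/* Step 2: the threshold is caller-supplied (an assumption of this sketch). */
static bool
needs_promotion(const struct workload *w, uint64_t tlb_miss_threshold)
{
    return is_eligible(w) && w->tlb_misses > tlb_miss_threshold;
}

/* Step 3: count pages whose access count reaches an assumed "hot" threshold. */
static size_t
count_hot_pages(const uint32_t *access_counts, size_t n, uint32_t hot_threshold)
{
    size_t hot = 0;
    for (size_t i = 0; i < n; i++) {
        if (access_counts[i] >= hot_threshold)
            hot++;
    }
    return hot;
}
```

Keeping each step as a separate predicate mirrors the order described above: a process that fails the eligibility check is never profiled for TLB misses, and hot-page profiling only matters for processes that exceed the TLB-miss threshold.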
Validation of Optimization
The effectiveness of the optimization can be validated using the following methods:
1. Checking ASO Logs
The ASO logs, located at /var/log/aso/aso_process.log, provide insights into the optimization process. Below is an example of the log output:
- Large Page Promotion Recommendation:
- "Recommending large page promotion, number of hot eas = 142, number of hot eas to promote = 35"
- Promotion Details:
- "THCP[1] After promotion of EA 0x7000203b1000000, psize=0x1000000, Soft64K=0x000, Soft16M=0x001, PID=23789832"
- Promotion Callback Summary:
- "Promote callback overall result: 1. Total attempted: 33, already promoted: 4, promoted to 16m: 29, failure: 0"
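When checking many log entries, the callback summary line can be parsed mechanically. The sketch below assumes the summary text appears in the log exactly as in the sample above, possibly preceded by a prefix such as a timestamp; the struct and function names are illustrative only.

```c
#include <stdio.h>
#include <string.h>

/* Fields of the "Promote callback overall result" line shown above. */
struct promote_summary {
    int result;
    int attempted;
    int already_promoted;
    int promoted_16m;
    int failure;
};

/*
 * Parse one log line into *out. Returns 0 on success, or -1 if the
 * line does not contain a complete summary in the sample format.
 */
static int
parse_promote_summary(const char *line, struct promote_summary *out)
{
    /* Skip any prefix (an assumption: real lines may carry a timestamp). */
    const char *p = strstr(line, "Promote callback overall result:");
    if (p == NULL)
        return -1;

    int n = sscanf(p,
        "Promote callback overall result: %d. Total attempted: %d, "
        "already promoted: %d, promoted to 16m: %d, failure: %d",
        &out->result, &out->attempted, &out->already_promoted,
        &out->promoted_16m, &out->failure);

    return (n == 5) ? 0 : -1;
}
```

In the sample line the counts are consistent with each other: 4 already promoted, 29 promoted to 16 MB, and 0 failures add up to the 33 attempted.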
2. Using 'pfhdata -16m' in kdb
Another way to validate the optimization is to run the pfhdata -16m subcommand in kdb and check the promotion counters.
Sample Output:
(0)> pfhdata -16m
pfhdata 16m mpss
Requested.......(pf_psmd_requested_16m)......4K 00000000 .. 64K 00000000
Pass1 fail......(pf_psmd_pass1_fail_16m).....4K 00000000 .. 64K 00000000
Pass2 fail......(pf_psmd_pass2_fail_16m).....4K 00000000 .. 64K 00000000
Donor fail2.....(pf_psmd_donor_dail2_16m)....4K 00000000 .. 64K 00000000
Isolated........(pf_psmd_isolated_16m).......4K 00000000 .. 64K 00000000
Populate lwmig..(pf_psmd_populate_lwmig_16m).4K 00000000 .. 64K 00000000
Populate hwmig..(pf_psmd_populate_hwmig_16m).4K 00000000 .. 64K 00000000
Populate fail...(pf_psmd_populate_fail_16m)..4K 00000000 .. 64K 00000000
Populated.......(pf_psmd_populated_16m)......4K 00000000 .. 64K 00000000
Request pending.(pf_psmd_pend_16m)...........4K 00000000 .. 64K 00000000
Request pending.(pf_psmd_pend_16m)...........4K 00000000 .. 64K 00000000
Request pending.(pf_psmd_pend_16m)...........4K 00000000 .. 64K 00000000
Donor fail......(pf_psmd_donor_fail_16m).....4K 00000000 .. 64K 00000000
Minfree fail....(pf_psmd_minfree_fail_16m)...4K 00000000 .. 64K 00000000
Promote fail....(pf_promote_fail_16m)........4K 00000000 .. 64K 00000000
Promote timeout.(pf_promote_timeout_16m).....4K 00000000 .. 64K 00000000
Promote poked...(pf_promote_poked_16m).......4K 00000000 .. 64K 00000000
Promoted........(pf_promoted_16m)............4K 00000000 .. 64K 00000021
Demote poked....(pf_demote_poked_16m)........4K 00000000 .. 64K 00000000
Demoted.........(pf_demoted_16m).............4K 00000000 .. 64K 00000000
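If the kdb output is captured to a file, one counter row can be split into its 4K and 64K values as sketched below. Two assumptions apply and are not confirmed by this document: that each row has the shape "label(name)....4K value .. 64K value", and that the eight-digit values are hexadecimal. The function name is illustrative.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one counter row of the sample pfhdata output.
 * ASSUMPTION: the counters are printed in hexadecimal.
 * Returns 0 on success, -1 if the row does not match the sample shape.
 */
static int
parse_pfhdata_row(const char *row, unsigned *count_4k, unsigned *count_64k)
{
    /* The counters follow the closing parenthesis of the counter name. */
    const char *p = strchr(row, ')');
    if (p == NULL)
        return -1;
    p++;

    /* Skip the dot padding between the name and the "4K" column. */
    while (*p == '.')
        p++;

    if (sscanf(p, "4K %x .. 64K %x", count_4k, count_64k) != 2)
        return -1;
    return 0;
}
```

Applied to the Promoted row of the sample, this yields 0 for the 4K column and 0x21 for the 64K column, which is 33 in decimal if the hexadecimal assumption holds.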
Ongoing Work on vmstat Integration
Efforts are underway to capture promotion statistics as part of the vmstat command for more accessible monitoring.
Example: promoting SHM pages with the vm_pattr() system call
Users can explicitly request promotion of SHM pages to 16 MB with the vm_pattr() system call. A sample code snippet is provided below. For further details, refer to the AIX documentation: https://www.ibm.com/docs/kk/aix/7.3?topic=v-vm-pattr-system-call-kvm-pattr-kernel-service
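#include <stdio.h>
#include <sys/types.h>
#include <sys/vminfo.h>   /* vm_pattr(), struct vm_pa_psize_extended */

/* Returns 0 on success, -1 if either vm_pattr() call fails. */
int
mpss_set_psize_extended(ptr64_t ptr, size_t size)
{
    int rc;
    int64_t counts[2] = {0, 0};   /* printed below as Soft64K and Soft16M */
    size64_t mask = PAGE_MASK;
    struct vm_pa_psize_extended t_pa_size = {0};

    /*
     * To promote the SHM page to 16MB
     */
    t_pa_size.pa_range.rng_start = (ptr64_t)((uint64_t)ptr & mask);
    t_pa_size.pa_range.rng_size = size;
    t_pa_size.pa_psize = -1;
    rc = vm_pattr(VM_PA_SET_PSIZE_EXTENDED, -1, (void *)&t_pa_size,
                  sizeof(struct vm_pa_psize_extended));
    if (rc != 0) {
        /* perror() appends ": <error text>" and a newline itself. */
        perror("vm_pattr(VM_PA_SET_PSIZE_EXTENDED) failed");
        return -1;
    }

    /*
     * To check the status of the promotion
     */
    t_pa_size.pa_info_size = sizeof(counts);
    t_pa_size.pa_info = (ptr64_t)&counts;
    rc = vm_pattr(VM_PA_GET_PSIZE_EXTENDED, -1, (void *)&t_pa_size,
                  sizeof(struct vm_pa_psize_extended));
    if (rc != 0) {
        perror("vm_pattr(VM_PA_GET_PSIZE_EXTENDED) failed");
        return -1;
    }

    /* Cast every argument so it matches its %llu / %llX conversion. */
    printf("On get - %lluM @ 0x%llX: psize=0x%07llX, Soft64K=0x%03llX Soft16M=0x%03llX\n",
           (unsigned long long)(size >> 20), (unsigned long long)ptr,
           (unsigned long long)t_pa_size.pa_psize,
           (unsigned long long)counts[0], (unsigned long long)counts[1]);
    return 0;
}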
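The snippet above masks the start address before calling vm_pattr(). As a portable illustration of that kind of alignment arithmetic (not of vm_pattr() itself), the sketch below rounds addresses to 16 MB boundaries and counts how many whole 16 MB pages fit in a range. The helper names are hypothetical, and this document does not state whether vm_pattr() requires the caller to pass 16 MB-aligned ranges.

```c
#include <stdint.h>

#define PSIZE_16M 0x1000000ULL  /* 16 MB, matching psize=0x1000000 in the log sample */

/* Round addr down to a multiple of psize (psize must be a power of two). */
static uint64_t
align_down(uint64_t addr, uint64_t psize)
{
    return addr & ~(psize - 1);
}

/* Round addr up to a multiple of psize (ignores overflow near UINT64_MAX). */
static uint64_t
align_up(uint64_t addr, uint64_t psize)
{
    return (addr + psize - 1) & ~(psize - 1);
}

/* Number of psize-aligned pages lying entirely inside [start, start + size). */
static uint64_t
whole_pages_in_range(uint64_t start, uint64_t size, uint64_t psize)
{
    uint64_t first = align_up(start, psize);
    uint64_t last = align_down(start + size, psize);

    return (last > first) ? (last - first) / psize : 0;
}
```

For example, the effective address 0x7000203b1000000 from the log sample is already a multiple of 0x1000000, so align_down() returns it unchanged, and a 64 MB range starting there contains four whole 16 MB pages.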