JFS2 horrible slow

1. JFS2 horrible slow

Archive User
Posted Thu November 07, 2013 10:07 AM
Originally posted by: XF07_Harald_Dunkel

Hi folks,

I've got two 8231 machines running AIX 6.1. 32 GB RAM, 2 SAS disks. No RAID. No virtual hosts. Problem: if logging is enabled, JFS2 on a local disk is slower than an NFS connection to a remote Linux PC. My colleagues are complaining about the poor performance.

To give you some numbers:

Extracting the Linux source tarball for 3.11.6 on a local JFS2 filesystem takes about 35 minutes. If I mount a remote filesystem via NFSv4 and use it for the same test, it takes just 2 minutes. If I run the test locally on the NFS server (Linux, amd64), it takes only a few seconds, including sync.

I understand that this is a special case, writing a ton of tiny files. In daily work the poor performance doesn't show that much, but it is bad enough that nobody likes to work on the AIX hosts.

Is there something misconfigured? AFAICS the documentation says that asynchronous IO is enabled by default. Using external or inline logging doesn't make a huge difference.

Every helpful comment would be highly appreciated.

Harri
2. Re: JFS2 horrible slow

Archive User
Posted Mon November 11, 2013 09:19 AM
Originally posted by: Wouter Liefting

Can you post the mount options (output of mount command or the /etc/filesystems stanza) and the attributes of the filesystem/LV/disk (lsattr -El command)?

Also, if this is an LPAR, some LPAR settings like cpu entitlement?

What do topas and/or nmon tell you while you are running the tarball extract? 100% CPU time, 100% disk time?

Do you have any disk errors reported in errpt?
3. Re: JFS2 horrible slow

Archive User
Posted Mon November 11, 2013 05:50 PM
Originally posted by: GarlandJoseph

Keep it simple... I would look at I/O before CPU performance. Is the journal file on the same disk? What exactly do you mean by "logging turned on"? What do lsvg -p and lsvg -l show? You should use iostat to drill down into the I/O statistics. Set up a test program and capture the I/O statistics before and after enabling logging.
4. Re: JFS2 horrible slow

Archive User
Posted Wed November 20, 2013 06:36 AM
Originally posted by: XF07_Harald_Dunkel

Sorry for the delay. My test host was not available for tests.

To answer the questions:

- the journal is on the same disk (hdisk1), using a dedicated partition for logging.

- "logging turned off" means "log = NULL" in the appropriate record in /etc/filesystems.

- lsvg:

# lsvg -p localvg
localvg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdisk1            active            558         37          00..00..00..00..37
# lsvg -l localvg
localvg:
LV NAME             TYPE       LPs     PPs     PVs LV STATE      MOUNT POINT
loglv00             jfs2log    1       1       1    open/syncd    N/A
fslv01              jfs2       512     512     1    open/syncd    /export
fslv02              jfs2       8       8       1    open/syncd    /sample

- iostat (1sec interval) showed me this for logging disabled:

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
:
:
hdisk1          98.0     976.0     167.0          0       976
hdisk1         100.0     1108.0     169.0          0      1108
hdisk1          99.0     1168.0     171.0          0      1168
hdisk1         100.0     2232.0     194.0          0      2232
hdisk1         100.0     1384.0     178.0          0      1384
hdisk1         100.0     1136.0     166.0          0      1136
hdisk1          99.0     1680.0     183.0          0      1680
hdisk1          98.0     1288.0     172.0          0      1288
hdisk1          99.0     1408.0     198.0          0      1408
hdisk1          98.0     1188.0     167.0          0      1188
hdisk1         100.0     1264.0     168.0          0      1264
hdisk1         100.0     1460.0     173.0          0      1460
hdisk1          99.0     976.0     165.0          0       976
hdisk1         100.0     920.0     167.0          0       920
hdisk1         100.0     1180.0     170.0          0      1180
hdisk1          92.0     2368.0     194.0          0      2368
hdisk1         108.0     1716.0     180.0          0      1716
hdisk1         100.0     2072.0     191.0          0      2072
:
:

With logging enabled I got this:

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
:
:
hdisk1         100.0     828.0      96.0          0       828
hdisk1         100.0     720.0     104.0          0       720
hdisk1         100.0     848.0      89.0          0       848
hdisk1          98.0     576.0      94.0          0       576
hdisk1          99.0     852.0      90.0          0       852
hdisk1         100.0     1292.0      98.0          0      1292
hdisk1          99.0     1068.0     100.0          0      1068
hdisk1         100.0     988.0     101.0          0       988
hdisk1          99.0     1224.0      99.0          0      1224
hdisk1          99.0     1228.0      98.0          0      1228
hdisk1          99.0     720.0     100.0          0       720
hdisk1          99.0     636.0      89.0          0       636
hdisk1          99.0     1568.0     142.0          0      1568
hdisk1          97.0     916.0     102.0          0       916
hdisk1         100.0     708.0      99.0          0       708
:
:

In both cases I extracted the Linux source tar file, i.e. about 40000 files in about 2500 directories. Without logging this took "just" about 2 minutes. With logging enabled it took more than 25 minutes (including sync).
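
To compare the two iostat runs side by side, the Kbps and tps columns above can be averaged. The following is a minimal Python sketch: the numbers are copied from the hdisk1 samples in this post, and the helper name `summarize` is only illustrative.

```python
from statistics import mean

# (Kbps, tps) pairs copied from the hdisk1 samples listed above.
NO_LOG = [(976, 167), (1108, 169), (1168, 171), (2232, 194), (1384, 178),
          (1136, 166), (1680, 183), (1288, 172), (1408, 198), (1188, 167),
          (1264, 168), (1460, 173), (976, 165), (920, 167), (1180, 170),
          (2368, 194), (1716, 180), (2072, 191)]
WITH_LOG = [(828, 96), (720, 104), (848, 89), (576, 94), (852, 90),
            (1292, 98), (1068, 100), (988, 101), (1224, 99), (1228, 98),
            (720, 100), (636, 89), (1568, 142), (916, 102), (708, 99)]


def summarize(samples):
    """Return mean Kbps, mean tps and the average KB moved per transfer."""
    kbps = mean(k for k, _ in samples)
    tps = mean(t for _, t in samples)
    return {"kbps": kbps, "tps": tps, "kb_per_xfer": kbps / tps}


if __name__ == "__main__":
    for label, samples in (("log=NULL", NO_LOG), ("logging", WITH_LOG)):
        s = summarize(samples)
        print(f"{label:9s} {s['tps']:6.1f} tps  {s['kbps']:7.1f} KB/s  "
              f"{s['kb_per_xfer']:4.1f} KB per transfer")
```

In both runs the disk is nearly 100% busy while completing only about 100 to 180 writes per second of under 10 KB each, which looks more like a latency-bound workload than a bandwidth limit.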
5. RE: JFS2 horrible slow

jack smith
Posted Thu September 12, 2024 04:15 PM

And more than 10 years later I ran into the same issue as described here. Dealing with folders with lots of small files is ridiculously slow. As an example, my test case with logging enabled took 116 seconds. Without the log it took 4.1 seconds.
Of course having a log takes its toll but this is way too much. Even more so since doing the very same thing with SUSE and XFS (with journaling) on the same machine with the same disk doesn't even take 3 seconds. So XFS with logging is even faster than JFS2 without logging? That seems strange.

I found similar reports on other websites but, just like here, none had a proper solution.

So which buttons do I have to push to get reasonable speeds with JFS2 and logging?

6. RE: JFS2 horrible slow

Alexey MARKOV
Posted Thu September 12, 2024 05:00 PM

Lower lokas of Samsara. I mean there should be hardcoded locks in firmware. I believe that the fix exists, just take a try.
----------------------------------------------------
Alexey Markov
Database Administrator
AUCHAN TECH
al.markov@auchan.ru
Tel.: +7 (800) 700 58 00 ext. 79224016
IP: 79224016 FMTN: *79224016
Mobile: +7 (903) 665 13 18

7. RE: JFS2 horrible slow

jack smith
Posted Thu September 12, 2024 05:49 PM

Sorry, I'm not sure what you mean.

8. RE: JFS2 horrible slow

Ralf Schmidt-Dannert
Posted Fri September 13, 2024 09:11 AM

I assume the JFS2 file system was not mounted with any special mount flags --> JFS2 file cache should be used.

How was the FS created?

Did you evaluate iostat data for "service queue full", IO service times in the case with logging?

SAS disks are slow and prone to seek time overhead - so wondering if the writing to the JFS2 internal logs resulted in LOTS of disk seeks == very poor IO service times.

Creating many new files == modify JFS2 meta data, will generate quite a bit of IO against the JFS2 redo log.

9. RE: JFS2 horrible slow

Grover Davidson
Posted Fri September 13, 2024 12:04 PM

You have to understand how the logging works and that the IOs to the log device are synchronous.

Any change to the file system metadata (not the data in the file, but the data that tells us about the file) gets logged.

So, you create a new file. We need to allocate an inode and (8) 4K pages of file system space. All of this needs log records. We queue the IO to the disk and WAIT for it to complete. Then we proceed to write the metadata - it may or may not flush to disk right away. As you then write to the file, we allocate blocks again and that also requires logging.

There is NO log activity for the data itself when we write to a file (say in the middle of it) and do not change the number of blocks allocated. Of course, writing does cause the file's modification timestamp to be updated, and that update is logged.

One option is to put the log on a super fast storage for the short period you need it.

I did a paper for the 2019 Las Vegas TechU on JFS2 Performance that will show you numbers.

Depending on the size of the files, tuning j2_nPagesPerRBNACluster may help you as well.

Let me know if you would like a copy of the paper. I am happy to share it.

Thanks!

Grover Davidson
AIX Screen Team
grover@us.ibm.com

Dick Stratton, who was held in the Hanoi Hilton for 2,251 days as a "prisoner at war," had taught me that a call from the field is not an interruption of the daily routine; it's the reason for the daily routine. - General James Mattis

Teamwork is built on individual contributions, each working independently but willingly helping others when needed. It is by sharing knowledge and skills that the abilities of the individual contributors is increased, and as a result, the whole team. - Grover Davidson

10. RE: JFS2 horrible slow

jack smith
Posted Fri September 13, 2024 12:41 PM
Edited by jack smith Fri September 13, 2024 12:42 PM

Moved down ...
11. RE: JFS2 horrible slow

Henrik Morsing
Posted Thu September 19, 2024 04:28 AM

Hi Jack,

Please raise it with IBM. They have tuning experts willing to help you. I think we are all interested in hearing the potential outcome.

Regards,

Henrik Morsing

12. RE: JFS2 horrible slow

Nigel Griffiths
IBM Champion
Posted Fri September 13, 2024 07:22 AM

You are running AIX 6.1, which is ancient history: it came out in 2007, i.e. 17 years ago, and has not had any improvements since its support ended in about 2017.

So this is a 7-year-old operating system at best. Your copy of AIX could be older still. I think it is funny that you say the performance is the same as 10 years ago when the AIX could easily be 10 years old!

IBM has since released AIX 7.1, AIX 7.2 and AIX 7.3.

I recommend you upgrade to AIX 7.4 and try testing again. There have been vast performance increases in these newer AIX versions.

Is the Linux and XFS you are comparing AIX to the same age?

If you are stuck on AIX 6.1 for some strange reason then you are running an unsupported OS.

Have you at least got to AIX 6.1 Technology Level 9 with Service Pack 12? From memory, this is the latest/last update.

How many is "a ton of small files" ? Millions?

The tar command will not be using Asynchronous I/O; that is a feature used by applications like Oracle RDBMS.

Your SAS disk is only going to manage about 200 disk I/Os per second, and it seems to be the limiting factor.

Normal tuning would involve adding extra disks to spread the I/O.

If the files are only temporarily required then you could use a RAM based file system.

I hope something here can help you, N

------------------------------
Nigel Griffiths - IBM retired
London, UK
@mr_nmon
------------------------------

13. RE: JFS2 horrible slow

jack smith
Posted Fri September 13, 2024 12:43 PM
Edited by jack smith Sat September 14, 2024 04:16 AM

> I assume the JFS2 file system was not mounted with any special mount flags
I always use "noatime" and for the tests without logging I used "log=NULL" in addition.

> How was the FS created?
mklv -y testlv testvg 100
crfs -v jfs2 -d testlv -m /mnt/test -A no -p rw -a options=noatime -a agblksize=4096 -a isnapshot=no

> Did you evaluate iostat data for "service queue full", IO service times in the case with logging?
Not yet.

> You are running AIX 6.1 which is ancient history
Sorry, I should have mentioned that I ran into the same problem but with current AIX versions. I encountered the same problem with: 7.2.5.7, 7.3.2.1 and 7.3.2.2.
The machine in question is a POWER8 and I used a single, original IBM 146gb 15k sas hdd.

> I recommend you upgrade AIX 7.4
7.4? I don't think that's available yet.

> Is the Linux and XFS you are comparing AIX too, the same age?
Yes, both the most recent versions.

> How many is "a ton of small files" ? Millions?
My test case had only ~10000.

> Your SAS disk is only going to manage 200 Disk I/O per second and is seems to be the limiting factor.
Well, XFS doesn't seem to think so given the results I had with that.

> Normal tuning would involve added extra disks to spread the I/O
I did that as well. I added an additional disk to the VG and used that for the log. It brought the duration down from 116 seconds to 98 seconds. Still a joke compared to the 4.1 seconds without logging and the XFS results.

> Let me know if you would like a copy of the paper. I am happy to share it.
Oh definitely!

To broaden the overall picture I ran the same test on different machines. One on OSX with a traditional hdd formatted with HFS+ with journaling and the result was 4.7 seconds.
And one more on Solaris with ZFS and a traditional hdd as well. The first run with sync set to "standard" took 4.3 seconds and with sync disabled 3.6. With sync set to "always" it even surpassed JFS2's log and took 135 seconds.
Considering all results, something around 4 seconds seems to be reasonable.

Apparently ZFS' forced sync and JFS2's log cause heavy strain on the disks. The others seem to handle logging/journaling very differently. At least as far as the results go.

Worth a note is that the Solaris docs explicitly say that setting sync to "disabled" does not increase the risk of corrupting the pool. JFS2 without logging on the other hand required an fsck after just 3 reboots.
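
The exact test case is not given in this thread. As a hypothetical stand-in, a "many small files" test could look like the Python sketch below. Only the file count (~10000) comes from the post above; the file size, the directory layout, and the function name `create_small_files` are assumptions.

```python
import os
import tempfile
import time


def create_small_files(root, count=10_000, size=512, per_dir=100, do_sync=True):
    """Create `count` files of `size` bytes under `root`; return elapsed seconds."""
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        subdir = os.path.join(root, f"d{i // per_dir:04d}")
        os.makedirs(subdir, exist_ok=True)
        with open(os.path.join(subdir, f"f{i:06d}"), "wb") as fh:
            fh.write(payload)
    if do_sync:
        # Ask the OS to flush cached writes so they count towards the timing.
        os.sync()
    return time.perf_counter() - start


if __name__ == "__main__":
    # To test a specific filesystem, pass its mount point,
    # e.g. tempfile.TemporaryDirectory(dir="/mnt/test").
    with tempfile.TemporaryDirectory() as root:
        elapsed = create_small_files(root)
        print(f"10000 files in {elapsed:.1f} s")
```

Running the same script against each filesystem keeps the workload identical, so differences in elapsed time can be attributed to the filesystem and its log settings rather than to the test tool.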
14. RE: JFS2 horrible slow

José Pina Coelho
Posted Mon September 16, 2024 04:33 AM

A 15K RPM disk has an average rotational latency of 2 ms, so there's no way to create 10000 files in 4 seconds and maintain the required semantics.

What you're comparing is JFS2, where the file creation is committed to disk before you get control back and XFS in a mode where you get control back before anything is written to the disk. (And yes, JFS2 is bad for micro-spooling)
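
The arithmetic behind this argument can be written out as a short Python sketch. The 15K RPM figure, the ~10000 files, and the 116 s and 4.1 s timings come from this thread; the assumption of exactly one synchronous on-disk commit per file, with no seek time, is a simplification made only for this estimate.

```python
RPM = 15_000
FILES = 10_000

# One revolution takes 60/RPM seconds; on average the head waits half of one.
avg_rotational_latency_s = 0.5 * 60 / RPM            # 2 ms

# Lower bound if every file creation waits for one on-disk commit
# (assumption: one synchronous write per file, seek time ignored).
min_sync_time_s = FILES * avg_rotational_latency_s   # 20 s

# Per-file cost implied by the timings reported in this thread.
per_file_logged_ms = 116 / FILES * 1000   # with the JFS2 log enabled
per_file_nolog_ms = 4.1 / FILES * 1000    # with log=NULL

if __name__ == "__main__":
    print(f"average rotational latency: {avg_rotational_latency_s * 1000:.1f} ms")
    print(f"minimum time for {FILES} synchronous creates: {min_sync_time_s:.0f} s")
    print(f"observed per file with log: {per_file_logged_ms:.2f} ms")
    print(f"observed per file, log=NULL: {per_file_nolog_ms:.2f} ms")
```

Under these assumptions a run of about 4 seconds works out to well under 2 ms per file, so those creations cannot each have waited for the platter, whereas 11.6 ms per file is in the range of a few seeks and rotations per create.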

------------------------------
José Pina Coelho
IT Specialist at Kyndryl
------------------------------

15. RE: JFS2 horrible slow

Alexander Pettitt
Posted Mon September 16, 2024 06:04 AM

@José Pina Coelho has a good point about physical disk limitations and there is a simple solution.

AIX supports JFS2 RAM disks; see "How to create a memory resident filesystem (ram disk)".

16. RE: JFS2 horrible slow

jack smith
Posted Sun September 15, 2024 11:56 AM
Edited by jack smith Sun September 15, 2024 05:25 PM

In the meantime I tried the following:

ioo -p -o j2_dynamicBufferPreallocation=128
ioo -p -o j2_maxRandomWrite=32
ioo -p -o j2_nBufferPerPagerDevice=2048
ioo -p -o j2_nPagesPerRBNACluster=512
ioo -p -o j2_nRandomCluster=4
ioo -p -o numfsbufs=1568
vmo -p -o maxperm%=90 -o maxclient%=90 -o minperm%=3

None had any noticeable effect on the log speed.
17. RE: JFS2 horrible slow

Phill Rowbottom
IBM Champion
Posted Mon September 16, 2024 04:37 AM

Is your Linux machine using SAN disk or local SAS disk? You might not be comparing apples to apples when it comes to the I/O performance of the disk devices here. Internal SAS will never be as fast as SAN (no massive write behind cache on the array).

Have you tried using an inline log for the jfs2 filesystem? From all reports and best practices this provides better performance than a log logical volume.

mklv -y testlv testvg 100 -> you haven't spread the LV over multiple disks; add "-e x" to spread the LV over the disks and improve I/O performance.
crfs -v jfs2 -d testlv -m /mnt/test -A no -p rw -a options=noatime -a agblksize=4096 -a isnapshot=no -> try with an inline log "-a log=INLINE".

Using both of these options will a) spread the work evenly over multiple disks b) INLINE log will spread the log over multiple disks to improve performance of the log.

18. RE: JFS2 horrible slow

Grover Davidson
Posted Mon September 16, 2024 10:13 AM

None of these tunables will affect the log. The log IO is all SYNCHRONOUS. We wait for the IO to the log to complete.

The only way to speed things up is to either disable logging OR make sure you are using an external log and migrate that log device to a super fast disk.

Please remember that the log is there to protect the file system structure in the event of a crash. You really do not want to run normal operations without it. If you are restoring an entire file system, I would do that without logging and when the restore is complete, unmount the file system and mount it again with logging enabled.

Thanks!

19. RE: JFS2 horrible slow

jack smith
Posted Mon September 16, 2024 08:01 AM
Edited by jack smith Mon September 16, 2024 08:56 AM

> What you're comparing is JFS2, where the file creation is committed to disk before you get control back and XFS in a mode where you get control back before anything is written to the disk
Well, after the command was done there was no disk led activity anymore.

> Is your Linux machine using SAN disk or local SAS disk?
As mentioned in my first post already: "same machine with the same disk"

> Have you tried using an inline log for the jfs2 filesystem?
Yes, in fact for all tries except the one time with a separate log lv as mentioned already as well.

> Using both of these options will a) spread the work evenly over multiple disks b) INLINE log will spread the log over multiple disks to improve performance of the log.
Of course using more disks will speed things up but that's neither the point nor the solution. The point is that in its default form the log, inline or on a separate lv, slows things down massively. Much more than any other filesystem I've ever used.

So my simple question is: can the JFS2 log be tuned or configured in a way to reduce the speed impact? To be usable in practice it should at least come near the performance levels of other filesystems.
20. RE: JFS2 horrible slow

Phill Rowbottom
IBM Champion
Posted Mon September 16, 2024 08:23 AM

> Is your Linux machine using SAN disk or local SAS disk?
As mentioned in my first post already: "same machine with the same disk"

- It can't be, you've mentioned that your Linux machine is amd64. Unless you've found a really good emulator, you won't be running AIX & Linux amd64/x86_64 on the same hardware.

> Using both of these options will a) spread the work evenly over multiple disks b) INLINE log will spread the log over multiple disks to improve performance of the log.
Of course using more disks will speed things up but that's neither the point nor the solution. The point is that in its default form the log, inline or on a separate lv, slows things down massively. Much more than any other filesystem I've ever used.

- Your example of how you created the LV doesn't show this. It shows a simple layout where the LV will NOT be spread over the disks. Spreading it is a technique we call PP striping; it's a coarse stripe and can make a massive difference to I/O performance vs. non-spread LVs.

>So my simple question is: can the JFS2 log be tuned or configured in a way to reduce the speed impact? To be usable in practice it should at least come near the performance levels of other filesystems.

Tuning the disk layout to provide a higher level of IOPS should help.

21. RE: JFS2 horrible slow

Russell Adams
Posted Mon September 16, 2024 09:36 AM

On Mon, Sep 16, 2024 at 12:01:03PM +0000, jack smith via IBM TechXchange Community wrote:
> So my simple question is: can the JFS2 log be tuned or configured in
> a way to reduce the speed impact? To be usable in practice it should
> at least come near the performance levels of other filesystems.

Jack,

In my experience many filesystems, not just JFS2, are slow to make
changes to their inode/FAT tables. My classic example is migrating
over a million small files between systems. It takes more time to
create the FAT entries than to copy the data, and so the migration
speed is abysmal.

Oftentimes inode/FAT table updates are single threaded and aren't
optimized by the disk geometry. Some filesystems may block additional
inode updates until the current one is committed. Some filesystems
play fast and loose with this critical data for speed. Performance and
reliability vary.

Perhaps the root cause here is the difference in implementation of
journaled inode data for the filesystems. I see XFS does something
they call "delayed logging" which may allow asynchronous inode
updates:

https://www.kernel.org/doc/html/v6.11-rc7/filesystems/xfs/xfs-delayed-logging-design.html

It sounds like they are allowing async updates as long as they write
in parallel to the log for replay to achieve consistency.
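
As a rough illustration of why batching log writes matters on a latency-bound disk, here is a toy Python model. It does not describe how XFS or JFS2 are implemented; the 10 ms write latency and the batch size of 256 are made-up assumptions, and both function names are hypothetical.

```python
def log_time_sync(n_ops, write_latency_s):
    """Every metadata operation waits for its own log write to complete."""
    return n_ops * write_latency_s


def log_time_batched(n_ops, write_latency_s, batch_size):
    """Operations are collected in memory and committed `batch_size` at a time."""
    n_flushes = -(-n_ops // batch_size)  # ceiling division
    return n_flushes * write_latency_s


if __name__ == "__main__":
    ops = 10_000
    latency = 0.010  # assumed: 10 ms per small synchronous write
    print(f"one log write per operation: {log_time_sync(ops, latency):.1f} s")
    print(f"batched, 256 per flush:      {log_time_batched(ops, latency, 256):.1f} s")
```

The model ignores that a batched write is larger than a single record, but for small records the per-write latency dominates, which is why the two strategies can differ by orders of magnitude on the same disk.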

I know JFS2 uses a dynamic b-tree for inode data, but I don't know if
it's sync or async, threaded, or under what conditions it blocks IO. I
would expect that IBM is more conservative in their approach placing
data safety first. Perhaps someone from IBM will explain more.

Please share if you find anything. Also if you could repeat your test
on XFS with the delayed logging disabled, you may find it is similar
in speed to JFS2.

Thanks.

------------------------------------------------------------------
Russell Adams Russell.Adams@AdamsSystems.nl
Principal Consultant Adams Systems Consultancy
https://adamssystems.nl/

22. RE: JFS2 horrible slow

Grover Davidson
Posted Tue September 17, 2024 03:40 PM

Your case is special in the sense that you are performing activities that will generate large amounts of j2 log entries. Once you have the files established and they are not holey files, reads and writes will generate very minimal amounts of log transactions.

Thanks!

23. RE: JFS2 horrible slow

jack smith
Posted Mon September 16, 2024 09:01 AM

> you've mentioned that your Linux machine is amd64
No, I never said that.

> It's showing a simple layout where the LV will NOT be spread over the disks
Again, just using more disks is not the point. Please read my previous reply again.

24. RE: JFS2 horrible slow

jack smith
Posted Mon September 16, 2024 09:52 AM

> if you could repeat your test on XFS with the delayed logging disabled
It seems that's not possible anymore. According to https://linux-xfs.oss.sgi.narkive.com/TPH8cSZ0/patch-01-27-xfs-update-mount-options-documentation the nodelaylog mount option has been removed more than 10 years ago.
But even if that would result in more similar results, it would still not solve the problem. Unless JFS2's log has something similar to XFS' delayed logging that can be enabled optionally.

25. RE: JFS2 horrible slow

Russell Adams
Posted Mon September 16, 2024 10:19 AM

On Mon, Sep 16, 2024 at 01:52:19PM +0000, jack smith via IBM TechXchange Community wrote:
> > if you could repeat your test on XFS with the delayed logging disabled
> It seems that's not possible anymore. According to https://linux-xfs.oss.sgi.narkive.com/TPH8cSZ0/patch-01-27-xfs-update-mount-options-documentation the nodelaylog mount option has been removed more than 10 years ago.
> But even if that would result in more similar results, it would still not solve the problem. Unless JFS2's log has something similar to XFS' delayed logging that can be enabled optionally.

Jack,

I know you've tested with log and with log NULL. With log NULL you had
better performance, almost as good as others.

You said this is a single local SAS disk?

The commits of inode information with logging enabled will be latency
bound on small writes to the inode table and log device. The lower the
write latency of your media, the faster it will go. Reasonably you
would expect SAN or SSD to be faster than an old SAS drive. Again this
is not potential throughput, it is write latency.

With log NULL disk latency won't matter, but your results demonstrated
as much.

The disk geometry really isn't relevant overall, but I'm pointing out
where the bottleneck is for this case.

I'm hoping perhaps an IBMer will shed some light on how JFS2 manages
IO to the log and inode table like the documentation for XFS. I
haven't found any.

Thanks.

26. RE: JFS2 horrible slow

jack smith
Posted Mon September 16, 2024 10:39 AM

> but I'm pointing out where the bottleneck is for this case
Oh I see where the bottleneck is. The question is what other filesystems do to seemingly ignore that. And in turn if and how JFS2 can be configured in a similar way.

I wouldn't mind if it was twice as slow as others. Even 3 or 4 times slower would still be okay. But 29 times slower compared to without logging and other filesystems even with logging enabled is just not usable.

27. RE: JFS2 horrible slow

Russell Adams
Posted Mon September 16, 2024 12:17 PM

On Mon, Sep 16, 2024 at 02:38:42PM +0000, jack smith via IBM TechXchange Community wrote:
> I wouldn't mind if it was twice as slow as others. Even 3 or 4 times
> slower would still be okay. But 29 times slower compared to without
> logging and other filesystems even with logging enabled is just not
> usable.

That's not entirely true.

Let's agree for a moment that JFS2 is quite a bit slower.

With a focus on data integrity and reliability, if that's what JFS2
needs, then that's the right speed. You'll have to weigh that against
your application and user expectations.

Linux filesystems are often not as robust and have in the past taken
shortcuts to sacrifice reliability for speed. Look at the problems
with Linux failing to honor flush to disk calls.

There's a variety of reasons other filesystems could be much
faster. However all the filesystems should be throttled by the same
write latency for the inode journals, and throughput for the data. If
one is suspiciously fast, I'd expect they aren't actually writing the
inode journal and are open to data loss.

After all, we can all write to the bitbucket with amazing speed!

28. RE: JFS2 horrible slow

Grover Davidson
Posted Wed September 18, 2024 09:49 AM

Russell,

You said: After all, we can all write to the bitbucket with amazing speed!

Well, not in all cases. Writes to /dev/null are still writes to a jfs2 inode (you ACCESS /dev/null through the j2 inode that is in turn a spec_node). And you are modifying the modification time. So, yes, we generate a j2 log transaction by default.

And we have had cases where the j2 log was so overwhelmed with IO to /dev/null that we brought IO to almost a complete stop. For this specific case, we introduced the raso tunable devnull_lazytime. By setting this to a value of 1 (raso -p -o devnull_lazytime=1) we only update the modification time about once a second.
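
The idea of updating a timestamp "about once a second" can be illustrated with a toy Python model. This is not the AIX implementation of devnull_lazytime; the class name `LazyMtime` and its interface are hypothetical and only show how coalescing reduces the number of logged updates.

```python
class LazyMtime:
    """Coalesce timestamp updates so at most one is logged per interval."""

    def __init__(self, interval_ms=1000):
        self.interval_ms = interval_ms
        self.last_logged_ms = None   # time of the last update that was logged
        self.logged_updates = 0

    def on_write(self, now_ms):
        """Record a write at `now_ms`; return True if the update was logged."""
        if (self.last_logged_ms is None
                or now_ms - self.last_logged_ms >= self.interval_ms):
            self.last_logged_ms = now_ms
            self.logged_updates += 1
            return True
        return False


if __name__ == "__main__":
    lazy = LazyMtime(interval_ms=1000)
    # 5000 writes, one per millisecond, spread over 5 seconds.
    for t in range(5000):
        lazy.on_write(t)
    print(f"writes: 5000, logged timestamp updates: {lazy.logged_updates}")
```

In this model, thousands of writes per second collapse into one logged update per interval, which is the kind of reduction in log traffic the tunable is described as providing.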

Thanks!

29. RE: JFS2 horrible slow

Russell Adams
Posted Wed September 18, 2024 10:20 AM

On Wed, Sep 18, 2024 at 01:48:36PM +0000, Grover Davidson via IBM TechXchange Community wrote:
> Russell,
> You said: After all, we can all write to the bitbucket with amazing speed!

> Well, not in all cases. Writes to /dev/null are still writes to a
> jfs2 inode (you ACCESS /dev/null through the j2 inode that is in
> turn a spec_node). And you are modifying the modification
> time. So, yes, we generate a j2 log transaction by default. And
> we have had cases where the j2 log was so overwhelmed with IO to
> /dev/null that we brought IO to almost a complete stop. In this
> specific case, we introduced the raso tunable
> devnull_lazytime. By setting this to a value of 1 (raso -p -o
> devnull_lazytime=1) we only update the modification time about
> once a second.

That's incredible!

It was really just a joke about discarding data. ;]

Thanks for the laugh.

------------------------------------------------------------------
Russell Adams Russell.Adams@AdamsSystems.nl
Principal Consultant Adams Systems Consultancy
https://adamssystems.nl/

Original Message
30. RE: JFS2 horrible slow

Like
Grover Davidson
Posted Wed September 18, 2024 09:48 AM

Reply
AIX is slower here because the IO is ALWAYS synchronous to the log device. There is no way to change that.

If we remove this, then the file system will have race conditions where the file system can become corrupt in the event of a crash. This in turn can result in fsck wiping the file system clean.

Thanks!

Grover Davidson
AIX Screen Team
grover@us.ibm.com

Original Message
31. RE: JFS2 horrible slow

Like
jack smith
Posted Mon September 16, 2024 02:37 PM

Reply
> You'll have to weigh that against your application and user expectations
No, I have to weigh that against everything else that's available in the same market segment and in comparison to that JFS2's log is simply abysmal.

> Linux filesystems are often not as robust and have in the past taken shortcuts to sacrifice reliability for speed
As mentioned already, I also compared it with HFS+ and ZFS so this is not a Linux vs. AIX matter. Rather JFS vs. everything else.

> If one is suspiciously fast
Not one but all others.

------------------------------
jack smith
------------------------------

Original Message
32. RE: JFS2 horrible slow

Like
Russell Adams
Posted Mon September 16, 2024 05:48 PM

Reply
On Mon, Sep 16, 2024 at 06:36:44PM +0000, jack smith via IBM TechXchange Community wrote:
> > You'll have to weigh that against your application and user
> > expectations
> No, I have to weigh that against everything else that's
> available in the same market segment and in comparison to that
> JFS2's log is simply abysmal.

So your Toyota can do a U-turn in 10 seconds but a cargo ship needs a
half hour. They can both carry some coal, so clearly they are the same
market segment. AIX carries workloads I would never trust to
Linux. You're comparing apples to oranges.

I agree your testing shows that JFS2's metadata changes are clearly
significantly slower than other filesystems on your hardware. Slow
enough to cause a problem? It doesn't seem to bother other workloads.

I'm inclined to think that JFS2 is more reliable than most other
filesystems. If that's the time it takes to do the job correctly and
reliably, that's acceptable.

Now that you know the bottleneck and some workarounds, are there still
problems with your process? Are you trying to frequently open lots of
tar files in bulk for your application?

------------------------------------------------------------------
Russell Adams Russell.Adams@AdamsSystems.nl
Principal Consultant Adams Systems Consultancy
https://adamssystems.nl/

Original Message
33. RE: JFS2 horrible slow

Like
jack smith
Posted Mon September 16, 2024 07:19 PM

Reply
> AIX carries workloads I would never trust to Linux
As I said twice now already: I did not only compare with Linux but also with Solaris (ZFS) and OSX (HFS+). Do I have to quote these previous posts?

> Slow enough to cause a problem?
Of course, that's the very reason why I posted in this thread in the first place.

> are there still problems with your process?
Of course, the situation hasn't changed at all. How should the problem magically be gone?

> Are you trying to frequently open lots of tar files in bulk for your application?
One of the envisaged purposes of the server in question is a fileserver. So yes, there's going to be exactly that workload all the time. If that was a one-off thing I wouldn't have asked here at all and just waited for it to finish. Whenever that might have happened.

------------------------------
jack smith
------------------------------

Original Message
34. RE: JFS2 horrible slow

Like
Phill Rowbottom

IBM Champion
Posted Tue September 17, 2024 07:15 AM

Reply
>Sorry, I should have mentioned that I ran into the same problem but with current AIX versions. I encountered the same problem with: 7.2.5.7, 7.3.2.1 and 7.3.2.2.

As you've tried this with current versions of AIX, you should be able to raise a call with IBM for support - I take it that you have a current SWMA agreement? The support centre would be able to engage the developers etc to look at the issue. You might get better answers from those who have access to the code that does the work.

------------------------------
Phill Rowbottom
------------------------------

Original Message
35. RE: JFS2 horrible slow

Like
jack smith
Posted Tue September 17, 2024 10:09 AM

Reply
Considering how old this issue is, I highly doubt that IBM will suddenly start enhancing the log just because I asked nicely.

So to summarize: there are no settings that change the way the log works, nor any that affect the log in some other way. Is that correct?

------------------------------
jack smith
------------------------------

Original Message
36. RE: JFS2 horrible slow

Like
Phill Rowbottom

IBM Champion
Posted Tue September 17, 2024 10:43 AM
Edited by Phill Rowbottom Tue September 17, 2024 10:43 AM

Reply
They may know something that we don't, no harm in asking. You pay for the support, use it!

------------------------------
Phill Rowbottom
------------------------------

Original Message
37. RE: JFS2 horrible slow

Like
Ralf Schmidt-Dannert
Posted Tue September 17, 2024 01:26 PM

Reply
As Grover pointed out, JFS2 logging is synchronous for every metadata change and depends on the write latency of the storage. Spinning disks will significantly impact that.

Linux is significantly lazier about writing actual updates to physical disk. I have seen IO on Linux where everything basically ended up only in memory and was then de-staged at a later point in time. You may want to look at your XFS test environment and validate if/when Linux actually writes anything to disk.

So the question here is, what happens in that scenario when "power goes out" ...

With AIX and synchronous redo logging, metadata changes can be replayed very quickly and synced up for a consistent file system. With Linux / XFS that will require significantly more time and potentially cause more issues, as everything stored in memory was lost.

FYI - sync is not a synchronous command, it just tells the OS to sync the buffer cache "at some time", but it returns before all IO has actually happened.
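One way to check how much of a small-file benchmark is really buffered writes is to run the same workload twice, once as-is and once forcing each file to stable storage with fsync. The Python sketch below is an illustration only: the function names, file count and payload size are arbitrary, and `base_dir` should point at the file system under test, since the default temporary directory may live on a RAM-backed file system.

```python
import os
import tempfile
import time


def create_small_files(directory, count, payload=b"x" * 1024, durable=False):
    """Create `count` small files and return the elapsed seconds.

    With durable=True each file is fsync'ed before it is closed, so the
    call does not return until the OS reports the data as written.
    """
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"f{i:05d}")
        with open(path, "wb") as fh:
            fh.write(payload)
            if durable:
                fh.flush()              # push Python's buffer to the OS
                os.fsync(fh.fileno())   # ask the OS to write it to disk
    return time.perf_counter() - start


def compare(count=200, base_dir=None):
    """Run the workload buffered and durable; return both timings and a file count."""
    with tempfile.TemporaryDirectory(dir=base_dir) as buffered_dir, \
            tempfile.TemporaryDirectory(dir=base_dir) as durable_dir:
        buffered = create_small_files(buffered_dir, count)
        durable = create_small_files(durable_dir, count, durable=True)
        created = len(os.listdir(durable_dir))
    return buffered, durable, created


if __name__ == "__main__":
    buffered_s, durable_s, n = compare()
    print(f"{n} files: buffered {buffered_s:.3f}s, with fsync {durable_s:.3f}s")
```

A large gap between the two timings suggests the faster run is mostly measuring memory, not the disk.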

To answer your tuning question, I'm not aware of an option to stop JFS2 from enforcing synchronous redo log writes when JFS2 logging is enabled. I would also question the purpose of "lazy logging" just to memory.

This means, as Grover already pointed out, you need to optimize the IO characteristics of your JFS2 redo log to meet your requirements.

With spinning disks you could think about a dedicated redo LV on a different hdisk / set of hdisks from the one you write your data to. That should reduce the number of seeks / seek time significantly, as the redo log writes are then much closer to each other.

Preferably, of course, you'd want storage with no seek penalty for the JFS2 redo logs and low IO latency.

I did a quick test in my lab with a file system spread over 5 disks on IBM FlashSystem:

# lslv -L datalv
LOGICAL VOLUME:     datalv                 VOLUME GROUP:   datavg
LV IDENTIFIER:      00c65dc700004c0000000191a444acb5.1 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       opened/syncd
TYPE:               jfs2                   WRITE VERIFY:   off
MAX LPs:            4785                   PP SIZE:        32 megabyte(s)
COPIES:             1                      SCHED POLICY:   striped
LPs:                4785                   PPs:            4785
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       maximum                RELOCATABLE:    no
INTRA-POLICY:       middle                 UPPER BOUND:    5
MOUNT POINT:        /oradata               LABEL:          /oradata
DEVICE UID:         0                      DEVICE GID:     0
DEVICE PERMISSIONS: 432
MIRROR WRITE CONSISTENCY: on/PASSIVE
EACH LP COPY ON A SEPARATE PV ?: yes (superstrict)
Serialize IO ?:     NO
INFINITE RETRY:     no                     PREFERRED READ: 0
STRIPE WIDTH:       5
STRIPE SIZE:        1m
DEVICESUBTYPE:      DS_LVZ
COPY 1 MIRROR POOL: None
COPY 2 MIRROR POOL: None
COPY 3 MIRROR POOL: None
ENCRYPTION:         no

File system mounted like this:

/dev/datalv /oradata jfs2 Aug 30 15:24 rw,noatime,log=INLINE

Downloaded the linux-6.11 tar ball and did a test extract with normal JFS2 logging enabled:

# time tar -xf /stage/linux-6.11.tar

real 1m21.22s
user 0m0.31s
sys 0m3.88s

# ls
linux-6.11 pax_global_header

# find . | wc -l
91460

# du -g .|tail
0.00 ./linux-6.11/tools/writeback
0.08 ./linux-6.11/tools
0.00 ./linux-6.11/usr/dummy-include
0.00 ./linux-6.11/usr/include
0.00 ./linux-6.11/usr
0.00 ./linux-6.11/virt/kvm
0.00 ./linux-6.11/virt/lib
0.00 ./linux-6.11/virt
1.57 ./linux-6.11
1.57 .
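For a rough sense of scale, the figures above can be turned into a per-entry cost and an effective data rate. This is back-of-the-envelope arithmetic only. It assumes that du -g reports units of 1024^3 bytes and that the find count (which includes directories and "." itself) is a fair proxy for the number of objects created.

```python
# Figures copied from the test output above.
elapsed_s = 1 * 60 + 21.22   # "real 1m21.22s"
entries = 91460              # "find . | wc -l"
size_gib = 1.57              # "du -g ." total (assumed to be GiB)

# Average wall-clock cost per created file or directory.
ms_per_entry = elapsed_s / entries * 1000

# Effective data rate for the whole extract.
mib_per_s = size_gib * 1024 / elapsed_s

print(f"{ms_per_entry:.2f} ms per entry, {mib_per_s:.1f} MiB/s")
```

That works out to a little under 1 ms per created entry and roughly 20 MiB/s overall. The low data rate points to the run being bound by per-file metadata latency, not by bandwidth.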

------------------------------
Ralf Schmidt-Dannert
------------------------------

Original Message
38. RE: JFS2 horrible slow

Like
jack smith
Posted Tue September 17, 2024 08:54 PM

Reply
Thanks for the confirmation. Obviously the log needs some sort of flash storage for competitive results.

Also since all replies only focused on Linux, maybe my edited post was not transmitted to those following by email only. So here is that part again:

"To broaden the overall picture I ran the same test on different machines. One on OSX with a traditional hdd formatted with HFS+ with journaling and the result was 4.7 seconds.
And one more on Solaris with ZFS and a traditional hdd as well. The first run with sync set to "standard" took 4.3 seconds and with sync disabled 3.6. With sync set to "always" it even surpassed JFS2's log and took 135 seconds.
Considering all results, something around 4 seconds seems to be reasonable.

Apparently ZFS' forced sync and JFS2's log cause heavy strain on the disks. The others seem to handle logging/journaling very differently. At least as far as the results go.

Worth a note is that the Solaris docs explicitly say that setting sync to "disabled" does not increase the risk of corrupting the pool."

------------------------------
jack smith
------------------------------

Original Message
39. RE: JFS2 horrible slow

Like
Ralf Schmidt-Dannert
Posted Wed September 18, 2024 09:24 AM

Reply
Doing everything "in memory" without any physical IO to disk will always be fast, but has a significantly increased risk of metadata corruption in case of an unplanned outage - regardless of what the Solaris documentation states (8;-)

So it is up to the end user to determine what is more important here - speed, or metadata consistency in case of unplanned outages. Running fsck on a large file system without JFS2 logging enabled will take quite some time and there is no guarantee that fsck will be able to fix everything. Replaying any pending actions from the JFS2 redo log is much faster.

------------------------------
Ralf Schmidt-Dannert
------------------------------

Original Message
40. RE: JFS2 horrible slow

Like
jack smith
Posted Wed September 18, 2024 05:51 PM

Reply
Well, I've been using XFS for more than 2 decades on IRIX and later Linux and I have no complaints. But maybe I was just lucky.

As to the Solaris docs: Oracle has lots of enterprise customers and if there was something in the docs that causes data loss or damage in general, these customers would sue the hell out of them.
The same goes for RedHat who declared XFS their official filesystem years ago.

Be that as it may, as far as the JFS2 log goes there doesn't seem to be a way of changing its behavior so thanks for all the replies!

------------------------------
jack smith
------------------------------
