AIX

  • 1.  JFS2 status ????

    Posted Mon March 20, 2006 07:38 AM

    Originally posted by: sreeninging


    Since the introduction of the JFS2 filesystem on AIX 5L we have experienced a lot of problems, such as file system corruption, system crashes during the creation of an mksysb, inability to restore mirrored images, and so on.
    The most disturbing errors are the ones which produce a corrupt filesystem.
    These errors occur with I/O-intensive applications using concurrent I/O, like Oracle.
    Every time we run into an error, a call is placed with the IBM support organization, and every time there is a patch available which should solve the issue.
    The problem here is that the patching never stops: every time there is a new issue with JFS2 there is a new patch. Since AIX 5.2.0 there has been no stable version of JFS2, and patching continues up to today ...

    We as a customer are very concerned about this because our data integrity is at stake here.
    The question I have is: are there more customers experiencing problems with JFS2, and what is the vision of the IBM development lab for JFS2?
    When will a stable version of JFS2 be produced? Will it ever be stable, or maybe it will be replaced by JFS3 ????

    I'd like to hear from IBM and other customers about this issue.

    Thanks in advance.
    R.E. van Holk
    The Netherlands.
    #AIX-Forum


  • 2.  Re: JFS2 status ????

    Posted Mon March 20, 2006 12:46 PM

    Originally posted by: SystemAdmin


    We've had a terrible time with data corruption on JFS2. We don't use JFS2 at all except for a handful of filesystems that have files bigger than 64 gigabytes.

    anker
    #AIX-Forum


  • 3.  Re: JFS2 status ????

    Posted Tue March 21, 2006 08:29 AM

    Originally posted by: SystemAdmin


    I find this discussion interesting, because I am seeing just the opposite outcome in my environment. We have a mix of jfs and jfs2, and all of the corruption comes from the jfs side. I do have a question: Are you using inline logging with your jfs2? I have found better performance and better/faster recovery after hard reboots with jfs2 w/inline logging.

    I would love to get more input into this discussion if nothing else to see what the experience of others has been.

    SG
    #AIX-Forum


  • 4.  Re: JFS2 status ????

    Posted Tue March 21, 2006 09:05 AM

    Originally posted by: SystemAdmin


    SG, I completely agree with you that JFS2 is a better filesystem...if it weren't for those niggling corruption problems. I can't recall the last time we had JFS corruption--and that's with well over 500 filesystems, only 29 of which are JFS2. I wonder what's different between your setup and mine. We apply maintenance quarterly and always apply the penultimate maintenance level. That is, if IBM's highest maintenance level is 5200-08, then we apply 5200-07. Is your maintenance more or less aggressive than ours?
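
    The "one level behind" rule above can be written down as a small helper. This is only a sketch: the function name `penultimate_level` is made up for illustration, and it assumes levels are always written as `release-NN`, as in the 5200-08 example.

```python
def penultimate_level(latest):
    """Return the maintenance level one step behind `latest`.

    Assumes the 'release-NN' form used in the post, e.g. '5200-08'.
    """
    release, level = latest.split("-")
    number = int(level)
    if number < 1:
        raise ValueError("no earlier maintenance level than " + latest)
    # Keep the zero padding of the original level field (e.g. '08' -> '07').
    return f"{release}-{number - 1:0{len(level)}d}"


if __name__ == "__main__":
    print(penultimate_level("5200-08"))
```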

    I've also run into problems without inline logging. When the JFS2 log volume fills, it requires unmounting all of the filesystems in the volume group. I now do all of my JFS2 logs inline which, if I remember correctly, automagically expands the log as necessary. (Correct me if I'm wrong about that.) I wish IBM had made 'inline' the default; I don't know any advantage to non-inline logs.

    This is a good discussion; it's great to hear the experiences of others.

    anker
    #AIX-Forum


  • 5.  Re: JFS2 status ????

    Posted Tue March 21, 2006 10:28 AM

    Originally posted by: SystemAdmin


    My maintenance schedule is bi-annually, and I just completed my latest round last month. All of my 5.3 boxen are running ML3, and 5.2 ML7. Whatever IBM has out I grab and test on our technical testbed, and on whatever other failed-project LPARs we may have sitting idle. By the time my next round is due, I have tested most everything about the latest ML and start moving it to test/dev servers. After a week on those, I then move it to production. The way it has been working out is I have been applying one ML behind as well, simply due to how my schedule has been falling.

    Personally, coming from NCR, SCO (excuse the profanity), and Solaris, I find this common log area in jfs/jfs2 to be high headache with little return, especially on servers with more I/O activity. When I figured out I could embed the journal log within the affected file system, I knew that is where I needed to go, and so far, it has reaped decent dividends, especially with full file system checks. What would take several minutes at times, now takes seconds. I can't explain that, but I know what I see. Maybe there is more overhead with the journal/log algorithm when going to a separate log? Naturally, multiple file systems using a common log would slow the process down, but I am even talking when I do checks one at a time. Go figure.

    Now that IBM is top dog in this market, maybe they can strike a deal with Symantec (Veritas), like NCR and HP have, to put a stripped-down version of VXFS into AIX.

    I wouldn't mind trying Reiser or ext3 on my guinea pig systems if it was possible. I can't see taking those to production, but if it was supported, then who knows?

    SG
    #AIX-Forum


  • 6.  Re: JFS2 status ????

    Posted Tue March 21, 2006 10:54 AM

    Originally posted by: SystemAdmin


    It sounds like you and I are within spitting distance on maintenance, so I don't know why you do so much better with JFS2 than I--virtuous living, probably. <grin>

    I've been doing AIX for about ten years and I've seldom seen IBM put out a major component with as many problems as JFS2 (and check out the number of APARs with "data loss" and "data corruption" if you don't think that others are having problems). I guess JFS2 got out the door too soon.

    Anyone else want to weigh in with orchids or raspberries for JFS2?
    #AIX-Forum


  • 7.  Re: JFS2 status ????

    Posted Mon May 01, 2006 05:23 PM

    Originally posted by: SystemAdmin


    I haven't personally encountered any significant problems with JFS2 under AIX 5.1, 5.2, or 5.3, but very few of the boxes I've dealt with are doing massive file system meta-data work. They're mostly dealing with (almost) statically allocated databases or growing a few files at a time...

    -Chris

    > It sounds like you and I are within spitting distance
    > on maintenance, so I don't know why you do so much
    > better with JFS2 than I--virtuous living, probably.
    > <grin>
    >
    > I've been doing AIX for about ten years and I've
    > seldom seen IBM put out a major component with as
    > many problems as JFS2 (and check out the number of
    > APARs with "data loss" and "data corruption" if you
    > don't think that others are having problems). I
    > guess JFS2 got out the door too soon.
    >
    > Anyone else want to weigh in with orchids or
    > raspberries for JFS2?

    #AIX-Forum


  • 8.  Re: JFS2 status ????

    Posted Tue March 21, 2006 11:07 AM

    Originally posted by: sreeninging


    The problem at our site is how to respond to this rapid release of APARs.
    We have had major outages because of the corruption several times, and today we are considering another update of AIX to prevent future problems.
    Why do we have a problem responding to this rapid release of patches? Well, we have about 800 AIX systems, and a lot of them run 24/7, so planned downtime is hard to get.
    Also, if we request downtime every few weeks, our customers do not get much confidence in our environment.
    Therefore we try to use fixed TL releases: every, let's say, six months we plan an upgrade on the systems, which is acceptable for our clients.
    Of course, if an emergency fix is needed on a system it will be applied, but we try to keep this to a minimum.
    The JFS2 corruption problem is a major problem for us because of our large environment and our SLAs.
    The question therefore is: will it be fixed and stable, and if so, when (or never?)

    #AIX-Forum


  • 9.  Re: JFS2 status ????

    Posted Wed March 22, 2006 10:38 AM

    Originally posted by: SystemAdmin


    IBM has put a considerable amount of effort into improving the stability of JFS2 over the past several years, including additional testing and improved development practices. As a result, the number of critical filesystem problems, including data corruption issues, decreased to a very low level in 2005.

    Most AIX customers are using JFS2 because it offers superior scalability and capability compared to JFS, including performance features such as concurrent and direct I/O.

    Filesystems are a key element of any information technology environment, and ensuring the validity of our clients' data is a responsibility that we take very seriously. Because of this, we tend to be very proactive in releasing APARs for any problems with filesystems, even if the prerequisite conditions that are required to expose the defect are unlikely to happen in most customer environments.
    #AIX-Forum


  • 10.  Re: JFS2 status ????

    Posted Tue April 04, 2006 11:34 AM

    Originally posted by: niella


    However...

    There is a known design flaw in JFS2 that comes down to readdir() duplicates - of which IBM is apparently aware. I read this in a newsgroup - here is the longer explanation:

    "In a situation where you have contention on a directory (that is, adding entries) whilst performing a readdir() operation (though note that the thread-safe variant readdir_r() also has the same problem), readdir() silently returns duplicate entries. This is unexpected (note that JFS is not affected, and behaves as a sensible filesystem would be expected to). In short, we contend that this breaks POSIX conformance for JFS2."
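
    For anyone worried about this in their own tools, here is a minimal defensive sketch in Python. It assumes only what the quote describes, namely that a directory listing may contain the same name more than once. The helper names `duplicate_names` and `list_unique` are made up for illustration, and I have not verified how Python behaves on JFS2 specifically. On POSIX systems `os.listdir()` is built on the C library's directory-reading calls, so repeated entries could plausibly show up there too.

```python
import os
from collections import Counter


def duplicate_names(names):
    """Return the sorted names that occur more than once in `names`."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


def list_unique(path):
    """List a directory, dropping repeated names but keeping first-seen order.

    This only hides duplicates from the caller; it does not fix whatever
    caused the filesystem to return them.
    """
    seen = set()
    unique = []
    for name in os.listdir(path):
        if name not in seen:
            seen.add(name)
            unique.append(name)
    return unique


if __name__ == "__main__":
    # Report any repeated names in the current directory listing.
    print(duplicate_names(os.listdir(".")))
```

    Running `duplicate_names()` over raw listings taken while another process is adding entries to the same directory would be one way to check whether a given system shows the behaviour.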

    Secondly, like you guys, I have preferred inline-logging, mostly because I cannot differentiate between different physical disks in my EMC environment (hdiskpower1 & hdiskpower2 may be on the same spindle). But, I have found the following interesting documentation (redpiece 0102):

    "There are internal and external logs. An internal log is the metadata log incorporated into the same physical location as the file system itself. An external log need not be tied to the same disk or physical location as the reserved areas of a file system. Administrators may want the external logs on standalone disks to help improve write throughput, because a write to a file system log must take effect before the actual data write to the file system. Performance is improved because the disk heads are not going to move between the log and the rest of the file system."

    I'd love to hear comments from anyone else who has experimented with comparing inline logs with separate logs....

    Regards,
    Niel
    #AIX-Forum