Informix

14.10 - direct flooding "Network driver cannot accept a connection on the port"

  • 1.  14.10 - direct flooding "Network driver cannot accept a connection on the port"

    Posted Sun November 21, 2021 06:15 AM
    Hi,

    I have a problem after rebooting the server and starting Informix.

    When Informix starts up, online.log is flooded with:

    12:12:54 listener-thread@soc_be.c:2701: err = -25573: oserr = 22: errstr = : Network driver cannot accept a connection on the port.
    System error = 22.
    12:12:54 listener-thread@soc_be.c:2701: err = -25573: oserr = 22: errstr = : Network driver cannot accept a connection on the port.
    System error = 22.
    12:12:54 listener-thread@soc_be.c:2701: err = -25573: oserr = 22: errstr = : Network driver cannot accept a connection on the port.
    System error = 22.
    12:12:54 listener-thread@soc_be.c:2701: err = -25573: oserr = 22: errstr = : Network driver cannot accept a connection on the port.
    System error = 22.
    12:12:54 listener-thread@soc_be.c:2701: err = -25573: oserr = 22: errstr = : Network driver cannot accept a connection on the port.
    System error = 22.
    12:12:54 listener-thread@soc_be.c:2701: err = -25573: oserr = 22: errstr = : Network driver cannot accept a connection on the port.
    System error = 22.
    12:12:54 listener-thread@soc_be.c:2701: err = -25573: oserr = 22: errstr = : Network driver cannot accept a connection on the port.
    System error = 22.
    12:12:54 listener-thread@soc_be.c:2701: err = -25573: oserr = 22: errstr = : Network driver cannot accept a connection on the port.
    System error = 22.
    12:12:54 listener-thread@soc_be.c:2701: err = -25573: oserr = 22: errstr = : Network driver cannot accept a connection on the port.
    System error = 22.
    12:12:54 listener-thread@soc_be.c:2701: err = -25573: oserr = 22: errstr = : Network driver cannot accept a connection on the port.
    System error = 22.

    Any ideas? It's Informix 14.10FC3 on RHEL 8.

    ------------------------------
    Marc Demhartner
    ------------------------------

    #Informix


  • 2.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    IBM Champion
    Posted Sun November 21, 2021 08:31 AM
    That happens during start up when sessions attempt to connect while the engine is still in recovery. HQ's agent could be one such culprit.

    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 3.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    Posted Mon November 22, 2021 03:50 AM
    @Art Kagel Thanks for your quick reply. It is very strange, because this happened after a fresh reboot of the server. It also keeps flooding (for over 5 minutes!) until I shut down Informix. Neither HQ nor its agent is running. When I restart the server, it disappears.

    What was "oserr = 22" again?

    ------------------------------
    Marc Demhartner
    ------------------------------



  • 4.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    IBM Champion
    Posted Mon November 22, 2021 06:37 AM
    Marc:

    That is strange. 

    errno 22 is "invalid argument to a system call"

    You get it from things like reading a pipe that is closed, etc.

    Art

    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 5.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    Posted Mon November 22, 2021 09:18 AM
    Since it's FC3, and the FC4W1 fix list contains:

    "sqlexec thread can get stuck spinning indefinitely in pfsc_add_or_upd() while in critical section and block checkpoints"

    @Art Kagel do you think that could be related?

    ------------------------------
    Marc Demhartner
    ------------------------------



  • 6.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    IBM Champion
    Posted Mon November 22, 2021 09:54 AM
    That's a better question for our HCL colleagues than for me, Marc. Guys?

    Art

    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 7.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    IBM Champion
    Posted Mon November 22, 2021 09:59 AM
    Turn off PFSC and see if the problem goes away?

    Cheers
    Paul

    Paul Watson
    Oninit LLC
    +1-913-387-7529
    www.oninit.com
    Oninit®️ is a registered trademark of Oninit LLC





  • 8.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    Posted Mon November 28, 2022 11:51 AM
    I'm getting this same problem on one server. It first happened last month after a reboot of the server, and it happened again after a restart of the database at 3am on Sunday (the 3am Sunday restarts are a cron job that has been running for a couple of years on this server!).

    It's running 14.10.FC4W1WE on CentOS 7.8.

    The error spammed to online.log is the same as the original poster's. The message is repeated 6 to 8 times a second (until we run out of disk space).

    listener-thread@soc_be.c:2701: err = -25573: oserr = 22: errstr = : Network driver cannot accept a connection on the port. System error = 22.
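
    To put a number on the repeat rate, the message can be counted per timestamp. This is only a rough sketch: it assumes each error line in online.log starts with an HH:MM:SS timestamp, as in the original poster's excerpt, and errors_per_second is a hypothetical helper name, not an Informix tool.

```shell
#!/bin/sh
# Sketch: count listener error -25573 lines per second in an online.log.
# Assumption: matching lines begin with an HH:MM:SS timestamp (8 characters).
# errors_per_second is a hypothetical helper name.

errors_per_second() {
    # $1 = path to the online.log to inspect
    grep -F 'err = -25573' "$1" |
        cut -c1-8 |
        sort |
        uniq -c
}
```

    Calling errors_per_second with the path to the log prints one line per second seen, with the count first and the timestamp second.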

    The workaround is to stop the instance, remove the log file and restart the instance (there is only one instance running on the server).

    Any suggestions on a possible cause or permanent solution?

    ------------------------------
    Neil Martin
    ------------------------------



  • 9.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    IBM Champion
    Posted Mon November 28, 2022 12:22 PM
    Neil, Marc:

    Thinking about this again, with MAX_FILL_DATA_PAGES and PFSC_BOOST enabled, the PFSC code has to build the cache list of partially free pages in each variable length table to minimize the overhead of finding the best page to place a new variable length row on during an insert.

    It may be that the cache build blocks connections until it completes, or that it has to complete before the server's ports are opened, and that some other process connected (perhaps via shared memory) before the PFSC_BOOST process and is blocking it from starting, leaving the server in a blocked state. Those messages indicate problems that clients are experiencing when trying to connect to the database server. I have not seen this problem myself; the PFSC cache seems to load for me after startup and produces no locks of any kind. But I only turned on PFSC_BOOST in v14.10FC8 and now in .FC9.

    One thing you can try is to make sure that no clients attempt to connect to the engine until it is fully online. You can do that by testing the status of the engine: (onstat - >/dev/null) returns an exit status of 5 when the engine is online and other values during startup. Alternatively, you could start the engine with the -w option so that oninit does not return until the engine is online. Your startup script can wait until the engine is online and then touch a file that can be queried before starting other applications that require the database.
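
    The wait-then-touch idea could look roughly like the sketch below. It relies only on the exit status of 5 from onstat - mentioned above; READY_FILE, MAX_TRIES, POLL_SECONDS and wait_until_online are hypothetical names chosen for illustration, and the defaults are arbitrary.

```shell
#!/bin/sh
# Sketch: poll "onstat -" until it reports On-Line (exit status 5),
# then create a marker file that dependent jobs can test for.
# READY_FILE, MAX_TRIES, POLL_SECONDS and wait_until_online are
# hypothetical names; adjust them to the local setup.

READY_FILE="${READY_FILE:-/tmp/informix.ready}"
MAX_TRIES="${MAX_TRIES:-60}"
POLL_SECONDS="${POLL_SECONDS:-5}"

wait_until_online() {
    rm -f "$READY_FILE"            # drop any stale marker from a previous run
    tries=0
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        rc=0
        onstat - >/dev/null 2>&1 || rc=$?
        if [ "$rc" -eq 5 ]; then   # 5 = engine is online
            touch "$READY_FILE"
            return 0
        fi
        tries=$((tries + 1))
        sleep "$POLL_SECONDS"
    done
    return 1                       # gave up; engine never came online
}
```

    A startup wrapper might call wait_until_online after starting the engine, and jobs that need the database could check [ -f "$READY_FILE" ] before connecting. This only helps for processes that can be made to honor the marker file.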

    Meanwhile, you should open a PMR to inquire whether this issue was corrected in a later release (14.10.FC9?).

    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 10.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    Posted Mon November 28, 2022 12:45 PM
    I have no clue about PFSC; I don't believe it's being used at all. We certainly haven't enabled it.

    We do start the engine with:

    /bin/sh -a -c 'source $ENV_FILE && oninit -w'

    The number and types of processes that might try to connect make it rather difficult for us to try the 'touched file' idea. There are web services, monitoring processes, customer cron tasks, etc.; most are not within our control as the server admin but are created by the end user of the server. So it's very possible that a process is trying to connect to the database immediately on startup.

    Part of the mystery is why it has happened twice in the last 2 months (on the same server) when we have 4 or 5 servers set up exactly the same way and it has never happened before; the servers have all been running for well over a year now. As I said before, the instances are restarted at 3am on a Sunday, so the potential for the problem has existed on multiple servers for a couple of years now.

    ------------------------------
    Neil Martin
    ------------------------------



  • 11.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    IBM Champion
    Posted Mon November 28, 2022 12:40 PM
    Looks like a lot of accept() calls are failing with EINVAL, in close succession and over some period of time, which would imply that this is not about a blocked or otherwise occupied oninit VP (so nothing like the PFSC problem above); otherwise the listener couldn't receive so many connection requests and report so many errors.

    'man accept' documents two possibilities for EINVAL:

    • EINVAL Socket is not listening for connections, or addrlen is invalid (e.g., is negative).
    • EINVAL (accept4()) invalid value in flags.

    I'd say neither looks applicable, but who knows ...

    I guess the high frequency of these errors is the result of application(s) retrying their connects very impatiently - do you see corresponding application side messages?

    Dull question:  what has changed?

    ------------------------------
    Andreas Legner
    ------------------------------



  • 12.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    Posted Mon November 28, 2022 12:55 PM
    As far as the end user and monitoring are concerned the database is up and running fine - all applications, services and users are unaffected.

    The only symptom of the problem is that the disk monitoring warns us we are running low on disk space due to the constant error line being written. The online.log was around 129 GB by the time I intervened, stopped the engine, removed the log and restarted the engine.

    I wish I knew what changed or what's different between that server and some of the other identical servers. 
    There are various application changes by the end user, and maybe they have a new process that's hitting the database more frequently at odd hours (normally there are no active users on the server at 3am on a Sunday!). From an OS and database point of view, nothing has changed.
    I will be asking the customer what they have changed recently.

    ------------------------------
    Neil Martin
    ------------------------------



  • 13.  RE: 14.10 - direct flooding "Network driver cannot accept a connection on the port"

    Posted Wed November 30, 2022 09:31 AM
    Edited by System Fri January 20, 2023 04:13 PM
    You should consider upgrading to at least FC7. Since we did the upgrade, we have had no more issues. FC3 was also very unstable before that.

    ------------------------------
    Marc Demhartner
    ------------------------------