Global AI and Data Science

Best approach for forecasting bank withdrawals and deposits using historical data

  • 1.  Best approach for forecasting bank withdrawals and deposits using historical data

    Posted Mon June 05, 2023 11:35 AM

    Hi everyone,

    I have historical data for a bank in two categories, withdrawals and deposits, and I want to fit a machine-learning model to it so that I can forecast withdrawals and deposits for each of its branches for the next month.

    Currently, I am using data from roughly the last two years, i.e. from Jan 2021 to Mar 2023.

    I have two basic questions:

    • I have discarded the data from before 2021 because the Covid pandemic heavily disrupted the banking system. Is that the right call, or should I include it in my dataset?
    • My time series consists of weekdays only (Monday to Friday) and has no data points for weekends, as banks are closed then. Should I train my model on this data as is, or should I insert zeros for the weekend dates?


    ------------------------------
    Gdin ABL
    ------------------------------

    #AIandDSSkills


  • 2.  RE: Best approach for forecasting bank withdrawals and deposits using historical data

    Posted Thu June 22, 2023 08:37 PM

    Well, it's hard to tell without testing the model. To go into a bit more detail:

    1. Training on weekdays-only data:

      Pros:

      • Focuses on the patterns and trends specific to weekdays when banking operations are active.
      • May capture the underlying dynamics and behaviors of withdrawals and deposits during weekdays more accurately.

      Cons:

      • Ignores potential weekend effects that may indirectly impact weekday behaviors.
      • May not capture any potential changes or trends specific to weekends that could influence weekday patterns.

      This approach is suitable if you believe that the weekend patterns have a minimal impact on the weekday forecasts and you want to focus solely on the active banking days.

    2. Including zeros for weekends:

      Pros:

      • Considers the full time series and captures the absence of activity on weekends.
      • Allows the model to learn the distinction between weekdays and weekends.

      Cons:

      • Assumes that weekends have no impact on weekday patterns, which may not always hold true.
      • Adds zeros for the weekends, which might introduce noise or skew the overall data distribution.
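
    To make the two options concrete, here is a minimal sketch using only the Python standard library. The data is made up (100.0 on every weekday for two weeks) and the helper names `weekdays_between` and `with_zero_weekends` are hypothetical, not part of any library. It shows how zero-filling changes the shape and the summary statistics of the series.

```python
from datetime import date, timedelta
from statistics import mean


def weekdays_between(start: date, end: date) -> list[date]:
    """All Monday-to-Friday dates from start to end, inclusive."""
    days = []
    d = start
    while d <= end:
        if d.weekday() < 5:  # 0 = Monday ... 4 = Friday
            days.append(d)
        d += timedelta(days=1)
    return days


def with_zero_weekends(series: dict[date, float]) -> dict[date, float]:
    """Copy of the series covering every calendar day, 0.0 where no value exists.

    Note: this also zero-fills any missing weekday (e.g. a public holiday).
    """
    start, end = min(series), max(series)
    filled = {}
    d = start
    while d <= end:
        filled[d] = series.get(d, 0.0)
        d += timedelta(days=1)
    return filled


# Made-up data: two business weeks, Monday 2023-03-06 to Friday 2023-03-17.
weekday_series = {
    d: 100.0 for d in weekdays_between(date(2023, 3, 6), date(2023, 3, 17))
}
filled_series = with_zero_weekends(weekday_series)

weekday_mean = mean(weekday_series.values())  # average over business days only
filled_mean = mean(filled_series.values())    # average pulled down by the zeros
```

    With the weekday-only series the model sees five observations per week. With the zero-filled series it sees seven, and the zeros lower the average, which is the kind of skew mentioned in the last bullet.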


    ------------------------------
    Youssef Sbai Idrissi
    Software Engineer
    ------------------------------



  • 3.  RE: Best approach for forecasting bank withdrawals and deposits using historical data

    Posted Mon June 26, 2023 11:07 AM

    Yeah, I'd be dropping the atypical Covid data.

    Dropping the weekends shouldn't be a problem, as there is no variability in the amount that could be withdrawn.

    Your next problem with training on Monday-to-Friday data alone is that you are assuming that all weeks are equal and that all days are independent. How would you expect the results of the training to differ from simply running an average or a linear regression over the data for each weekday? With an average, the Monday withdrawal is, say, $200,000. With linear regression, the Monday withdrawal is $190,000 plus $100 per week ($5,200/year) since Jan 1st 2021, so the forecast for next week would be roughly $203,000 (about 130 weeks at $100 per week).

    If you go with this, you'll need to develop a good understanding of the days when it will be wrong: public holidays and other times with atypical money requirements. You will probably want to eliminate those days from your average/regression, as they'll throw the values off. Linear regression may also work better over multiple shorter segments, as things like changes in interest rates can throw off the historical data (compare a regression over the last 6 months with one over the full 30 months).
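
    As a rough illustration of that baseline, the sketch below computes a per-weekday average and a per-weekday least-squares trend in plain Python. The function names `fit_line` and `weekday_baselines` are hypothetical, and the data is synthetic: Mondays that start at 190,000 and grow by exactly 100 per week, echoing the example numbers above.

```python
from collections import defaultdict
from datetime import date, timedelta


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def weekday_baselines(series, origin):
    """Per weekday (0 = Monday): (mean, intercept, slope per week).

    Assumes every weekday present has at least two distinct dates.
    """
    groups = defaultdict(list)
    for day, amount in series.items():
        weeks_since_origin = (day - origin).days / 7
        groups[day.weekday()].append((weeks_since_origin, amount))
    result = {}
    for weekday, points in groups.items():
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        intercept, slope = fit_line(xs, ys)
        result[weekday] = (sum(ys) / len(ys), intercept, slope)
    return result


# Synthetic data: 130 Mondays starting 2021-01-04, growing by 100 per week.
origin = date(2021, 1, 4)
mondays = {origin + timedelta(weeks=k): 190_000 + 100 * k for k in range(130)}

mean_monday, intercept, slope = weekday_baselines(mondays, origin)[0]
forecast_week_130 = intercept + slope * 130  # one week past the last sample
```

    On real data, any model worth training should beat both numbers (the plain mean and the trend-line forecast) on a held-out period. Holidays and other atypical days would need to be removed before fitting, as described above.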

    Running a Fourier analysis across the whole time series will improve your forecasts for fixed dates and month start/end (Xmas, Thanksgiving, etc.), but it will still struggle with floating days such as Chinese New Year and Easter.
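
    The post does not say which Fourier method is meant, so the following is only a sketch of the general idea: a naive discrete Fourier transform that finds the strongest repeating cycle in a series. The function name `dominant_period` is hypothetical, and the data is a synthetic weekday-only series with a pure 5-sample (one business week) cycle.

```python
import cmath
import math


def dominant_period(values):
    """Period, in samples, of the strongest non-constant DFT component."""
    n = len(values)
    avg = sum(values) / n
    centered = [v - avg for v in values]  # remove the constant (mean) term
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        # DFT coefficient at frequency k cycles per n samples.
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


# Synthetic data: 100 business days with a repeating 5-day pattern.
values = [200_000 + 20_000 * math.cos(2 * math.pi * t / 5) for t in range(100)]
period = dominant_period(values)
```

    A cycle that repeats at a fixed interval shows up as a clear peak. A holiday whose date moves from year to year has no single period, which is why this kind of analysis struggles with it.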

    If you want a better model, you need to add some correlation with these dates into your training data, but note that for some of them you have only 2 data points, and the model will still be thrown off by the lower relevance of older data. IZPCA has a Fractal Forecasting algorithm that does this, but it's not available outside of the product (it uses it to predict CPU usage on computer systems, which is a somewhat similar problem, as the computers are processing, amongst other things, the withdrawal transactions).
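
    One simple way to add such date information, sketched here as an assumption rather than as what IZPCA does, is a hand-maintained table of floating holidays turned into indicator features. The table `FLOATING_HOLIDAYS`, the function `holiday_features` and the 3-day window are all hypothetical choices for illustration.

```python
from datetime import date

# Hand-maintained table of floating holidays (illustrative, not exhaustive).
FLOATING_HOLIDAYS = {
    "easter": [date(2021, 4, 4), date(2022, 4, 17), date(2023, 4, 9)],
    "chinese_new_year": [date(2021, 2, 12), date(2022, 2, 1), date(2023, 1, 22)],
}


def holiday_features(day, holidays=FLOATING_HOLIDAYS, window=3):
    """Flag whether `day` is within `window` calendar days of each holiday."""
    features = {}
    for name, dates in holidays.items():
        nearest = min(abs((day - h).days) for h in dates)
        features[f"near_{name}"] = 1 if nearest <= window else 0
    return features


good_friday_2023 = holiday_features(date(2023, 4, 7))  # 2 days before Easter
ordinary_day = holiday_features(date(2023, 3, 15))     # far from both holidays
```

    These flags can be appended to each day's row as extra model inputs. With only two or three occurrences per holiday in the training window, the caveat above still applies: the model has very little evidence for each flag.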



    ------------------------------
    Mik Clarke
    ------------------------------