IBM Security QRadar


Calculate disk space /store

  • 1.  Calculate disk space /store

    Posted Fri March 05, 2021 10:19 AM
    We install a lot of QRadar deployments, and we have a small problem calculating the /store disk space needed to retain events and flows. This is very important for us because we cannot even give our customers an approximate estimate. We tried the formula from the tutorial https://www.ibm.com/support/pages/qradar-how-determine-average-event-payload-and-record-size-bytes-updated, but it does not take into account that QRadar additionally compresses data as it ages.
    Most often we need to calculate retention for 3, 6, and 12 months of events and flows. Help me! Can someone share a formula?

    ------------------------------
    Mykhailo Matsiuk
    ------------------------------


  • 2.  RE: Calculate disk space /store

    Posted Mon March 08, 2021 04:13 AM
    Are you looking for a formula to estimate the disk space usage, or a command to measure it?
    I can answer the latter. The disk usage data is not available through the API, so you need to use du. Events are stored in /store/ariel/events, with raw payloads under "payloads/" and extracted properties under "records/". Each of those directories contains an "aux" directory with one subdirectory per tenant, named with the tenant ID, and below that further subdirectories for year, month, day, and hour.
    Using du, you can therefore calculate disk usage per customer for any time range. I can provide some python/pandas code if you want.
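    The du-style walk over that directory layout can be sketched in Python (a minimal sketch, assuming the /store/ariel/events layout described above; on a real appliance you would point it at /store/ariel/events rather than a test directory):

```python
import os

def dir_size(path):
    """Sum the size in bytes of all files under path (similar to `du -sb`)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.isfile(fp):
                total += os.path.getsize(fp)
    return total

def usage_per_tenant(events_root, kind="payloads"):
    """Return {tenant_id: bytes} for the payloads/ or records/ tree.

    Assumes the layout described above: <events_root>/<kind>/aux/<tenant>/...
    with year/month/day/hour subdirectories below each tenant.
    """
    aux = os.path.join(events_root, kind, "aux")
    usage = {}
    if os.path.isdir(aux):
        for tenant in os.listdir(aux):
            usage[tenant] = dir_size(os.path.join(aux, tenant))
    return usage
```

    To restrict the result to a time range, join the year/month/day/hour path components onto the tenant directory before calling dir_size.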

    ------------------------------
    Raphaël Langella
    SIEM Architect
    IMS Networks
    ------------------------------



  • 3.  RE: Calculate disk space /store

    Posted Mon March 08, 2021 05:20 AM
    Adding to Raphaël's post, you can also take a look at this thread on the IBM support forum - there are a few useful references there for analyzing actual consumption.
    (I think the logs have been compressed immediately by default since v7.2.8 or so.)
    For advanced planning, there was an XLS referenced in the post linked above that unfortunately was not migrated from developerWorks to the new platform; if you are an IBM Business Partner, however, you should be able to get this (or an updated) workbook, which helps estimate disk space consumption from the estimated EPS/FPM rate.
    In general, you can only make a best-effort estimate of the average log size: not only do different log sources produce logs of different sizes, but the same source can differ between environments. Besides raw records, there are also normalized records (in my experience the /records folder can take up a substantial share of the space) and indexes to take into account. Don't forget that /store has a usage limit of just under 95% (the threshold at which collection and search stop), so that 5% margin must be factored in... and, of course, additional space is needed for backups (the usage limit for that mount point is just under 90%).
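    The 95%/90% thresholds above are easy to monitor with a few lines of Python (a sketch; the limits are the ones quoted in this thread, and on a real console you would pass /store or the backup mount point instead of a test path):

```python
import shutil

def mount_usage_pct(path):
    """Percentage of the filesystem holding `path` that is currently in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def headroom_ok(path, limit_pct=95.0):
    """True while usage stays below the limit (95% for /store, 90% for the backup mount)."""
    return mount_usage_pct(path) < limit_pct
```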

    ------------------------------
    Dusan VIDOVIC
    ------------------------------



  • 4.  RE: Calculate disk space /store

    Posted Mon March 08, 2021 06:14 AM
    I have tried using AQL, but it makes for a very poor estimate. It doesn't take compression into account, nor normalized records and indexing. I've also observed that normalized records usually take more disk space than payloads, which seems a bit counter-intuitive at first.
    Speaking of indexing, where are the indexes stored? Along with the records? That would explain the high disk space usage.

    ------------------------------
    Raphaël Langella
    SIEM Architect
    IMS Networks
    ------------------------------



  • 5.  RE: Calculate disk space /store

    Posted Mon March 08, 2021 06:43 AM
    I also keep noticing that /store/ariel/events/records is larger than the /payloads part.
    As I recall, within each /records/year/month/day/hour folder there are /lucene and /super folders. In /super you should see files named like SourceIP~0 or UserName~0; these are the (standard) indexes (the number after the ~ relates to the retention bucket, with 0 being the default bucket).
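    If you want to see which properties account for the index space, the Name~bucket files in a /super folder can be grouped with a short script (a sketch assuming the naming convention described above; the property names in the demo are made up):

```python
import os
from collections import defaultdict

def index_sizes(super_dir):
    """Group index file sizes in a /super folder by property name.

    Assumes the Name~bucket naming described above, e.g. SourceIP~0,
    where the number after ~ is the retention bucket.
    """
    sizes = defaultdict(int)
    for name in os.listdir(super_dir):
        prop, _, bucket = name.partition("~")
        if bucket:  # only Name~bucket files are index files
            sizes[prop] += os.path.getsize(os.path.join(super_dir, name))
    return dict(sizes)
```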

    ------------------------------
    Dusan VIDOVIC
    ------------------------------



  • 6.  RE: Calculate disk space /store

    Posted Mon August 08, 2022 06:39 AM
    Do you have the Excel sheet to calculate the expected storage?

    ------------------------------
    Tobin Mathew
    ------------------------------



  • 7.  RE: Calculate disk space /store

    IBM Champion
    Posted Tue August 09, 2022 10:52 AM
      |   view attached
    Hi,
    enclosed is the Excel sheet we use for our boot camp trainings and projects.
    Please work through it from left to right. Columns A, B, D and F contain measured or estimated values.
    Everything else is formula-based. Adapt it to your needs if necessary.
    EPD can easily be measured in a PoC by exporting CSV values at a 24h interval, grouped by log source type. Please use the Event count (SUM) column from Log Activity.
    The XLS SUM line will show the EPS needed based on the 24h values. An extra 25% is needed to cover EPS peaks.
    "GB for 90 days" will show the storage you need.
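    The workbook's arithmetic can be reproduced in a few lines (a sketch; the 25% peak margin and 90-day horizon are from the post above, while the stored bytes per event is a value you would have to measure yourself):

```python
def eps_from_epd(epd, peak_margin=0.25):
    """Sustained EPS from events per day (EPD), plus headroom for peaks."""
    return epd / 86400 * (1 + peak_margin)

def storage_gb(epd, bytes_per_event_stored, days=90):
    """Storage in GB for `days` of retention, given the average stored size per event."""
    return epd * bytes_per_event_stored * days / 1e9
```

    For example, 86,400,000 events per day at 100 stored bytes each needs roughly 1,250 EPS of licensed capacity and about 777 GB for 90 days.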

    ------------------------------
    [Karl] [Jaeger] [Business Partner]
    [QRadar Specialist]
    [pro4bizz]
    [Karlsruhe] [Germany]
    [4972190981722]
    ------------------------------

    Attachment(s)



  • 8.  RE: Calculate disk space /store

    IBM Champion
    Posted Tue August 09, 2022 11:47 AM
      |   view attached
    based on individual requests I have provided an English version for your convenience.
    BTW EPD are events per day (24h)

    ------------------------------
    [Karl] [Jaeger] [Business Partner]
    [QRadar Specialist]
    [pro4bizz]
    [Karlsruhe] [Germany]
    [4972190981722]
    ------------------------------

    Attachment(s)



  • 9.  RE: Calculate disk space /store

    Posted Tue August 09, 2022 01:25 PM
    I thought I'd throw in some personal views...
    A few points provide the major guidance:
    - Different systems generate logs at different rates
    - Different systems generate logs of different (average) sizes
    - QRadar compresses payloads by default
    - Use a PoC to assist you in planning
    The challenge is always to get a good daily sample of the logs so you can extrapolate, or at least make a well-educated guess about the expected rate/load. (For example, I've encountered large firewalls generating 150-200 EPS, but also over 5000 EPS, per single system.)
    For Windows, maybe you can have a look at this to support your evaluation.
    Space consumption per log source type can also vary considerably. For Windows, for example, I've seen events between 350 and 10000 bytes, with the average anywhere between 1000 and 1500 bytes; for some proxy systems, or e.g. Cisco ISE, the average can be around 2000 bytes.
    So, just to be on the safe side: for EPS calculation it is usually good to assume 300-400 bytes if you have, say, data about the daily stored volume; for storage calculation, however, I would personally assume an average size of 800+ bytes.
    (As was mentioned above, for storage sizing keep in mind that /store cannot be 100% full and keep things running, so add another 5% to 10% on top.)
    As for compression, 10:1 is the expected ratio, but it does not always hold; I'd say it varies between about 5:1 and 10:1 across different collected data.
    On top of the payload volume discussed here, there is (at least) the "records" part (i.e., what results from processing the raw payload). As mentioned above, for events I keep seeing this at roughly 3 times the size of the "payloads" part.
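    Putting these rules of thumb together gives a rough sizing formula (a sketch only: the 800-byte average, the 5:1-10:1 compression range, the records-at-3x-payloads observation, and the 10% headroom are all informal figures from this thread, not official numbers):

```python
def store_estimate_gb(eps, days, avg_event_bytes=800,
                      compression=7.0, records_factor=3.0, headroom=0.10):
    """Rough /store estimate in GB.

    raw payload volume is compressed by `compression`, the normalized
    records are assumed to take `records_factor` times the compressed
    payloads on disk, and `headroom` keeps usage below the /store limit.
    """
    raw = eps * 86400 * days * avg_event_bytes          # raw payload bytes
    payloads = raw / compression                        # compressed payloads on disk
    records = payloads * records_factor                 # normalized records + indexes
    return (payloads + records) * (1 + headroom) / 1e9
```

    Given the wide compression and size ranges quoted above, it is worth computing a best case and worst case (e.g. compression=10 vs. compression=5) rather than trusting a single number.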

    ------------------------------
    Dusan VIDOVIC
    ------------------------------



  • 10.  RE: Calculate disk space /store

    Posted Wed August 10, 2022 03:14 AM
      |   view attached
    Check out the attached file. It is a little old, but it will be of great help.

    ------------------------------
    rahul dhiman
    ------------------------------

    Attachment(s)