IBM Spectrum Computing Group

[LSF Explorer] Performance tuning in Elasticsearch

    Posted Tue December 29, 2020 02:13 AM

    There are several ways to improve the performance of LSF Explorer on the Elasticsearch side.

    Set a proper heap size for Elasticsearch

    By default, Elasticsearch uses a 1 GB heap, which is too small for most production deployments. Setting an appropriate heap size is important for performance. If the Elasticsearch log contains many garbage collection messages, the heap is probably too small for the workload. The appropriate heap size depends on the amount of RAM available on the host. As a general rule, follow these guidelines:

    • Set Xmx and Xms to no more than 50% of your physical RAM
    • Set Xmx and Xms to no more than the threshold that the JVM uses for compressed object pointers (compressed oops)
    • Ideally set Xmx and Xms to no more than the threshold for zero-based compressed oops

    Refer to https://www.elastic.co/guide/en/elasticsearch/reference/7.x/heap-size.html for more details.
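
    As a rough illustration of the three rules above, here is a small Python sketch that picks a heap size from the host's physical RAM. The function names and the 26 GB default for the zero-based compressed oops threshold are my own assumptions; the real threshold varies by system and JVM, so verify it on your own hosts before relying on it.

```python
def suggest_heap_gb(physical_ram_gb, zero_based_limit_gb=26):
    """Suggest a heap size in whole GB.

    Takes the smaller of half the physical RAM and an assumed
    zero-based compressed oops threshold (26 GB is an assumption,
    not a value from this post), with a floor of 1 GB.
    """
    half_ram = int(physical_ram_gb // 2)
    return max(1, min(half_ram, zero_based_limit_gb))


def jvm_options_lines(heap_gb):
    """Return the two JVM flags, with Xms and Xmx set to the same value."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]
```

    For example, on a 64 GB host suggest_heap_gb(64) returns 26, so the resulting lines for jvm.options would be -Xms26g and -Xmx26g.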

    Increase the replica count of existing indices

    Each primary shard can have zero or more replicas. A replica is a copy of the primary shard. The number of primary shards of an index can't be increased after the index is created. However, the number of replicas can be changed dynamically. Replica shards serve two purposes. The first is failover: a replica shard can be promoted to a primary shard if the primary fails.

    The second is performance: get and search requests can be handled by either primary or replica shards.

    Increasing the number of replicas of existing indices to a suitable value may improve query performance. Because a replica is never allocated on the same node as its primary, extra replicas only help if the cluster has enough nodes to host them. The optimal number depends on the characteristics of the index data and the queries run against it, and is best determined by experiment.

    The number of replicas of an index can be set with the following request.

    PUT /<index_name>/_settings
    {
      "index" : {
        "number_of_replicas" : 2
      }
    }
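
    If you want to script this while experimenting with different replica counts, the same request can be built with the Python standard library. This is only a sketch: the function name, the base URL and the index name in the usage note are placeholders I made up, and it assumes a cluster reachable without authentication.

```python
import json
import urllib.request


def build_replica_request(base_url, index_name, replicas):
    """Build (but do not send) a PUT /<index_name>/_settings request."""
    if replicas < 0:
        raise ValueError("replicas must be zero or greater")
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return urllib.request.Request(
        url=f"{base_url}/{index_name}/_settings",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

    To actually send it, pass the result to urllib.request.urlopen(), for example urllib.request.urlopen(build_replica_request("http://localhost:9200", "my-index", 2)).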

    Add nodes to the current Elasticsearch cluster

    Elasticsearch is a powerful search engine that scales horizontally. If the computing capacity of the Elasticsearch cluster has become a bottleneck and the hosts are overburdened, you can add nodes to the cluster to expand its capacity.

    Refer to "Add and remove nodes in your cluster" in the Elasticsearch Reference [7.10] for more details.
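
    Before adding nodes, it can help to confirm which hosts are actually overburdened. Below is a minimal sketch, assuming you have saved the output of GET /_cat/nodes?format=json&h=name,heap.percent,cpu as a string. The function name and the 85% / 90% thresholds are hypothetical choices of mine, not Elasticsearch recommendations.

```python
import json


def overloaded_nodes(cat_nodes_json, heap_limit=85, cpu_limit=90):
    """Return names of nodes whose heap or CPU usage reaches a limit.

    The _cat APIs return values as strings in JSON output, so the
    percentages are converted to int before comparing.
    """
    busy = []
    for node in json.loads(cat_nodes_json):
        heap = int(node["heap.percent"])
        cpu = int(node["cpu"])
        if heap >= heap_limit or cpu >= cpu_limit:
            busy.append(node["name"])
    return busy
```

    If most nodes show up in the result for a sustained period, that is a hint that the cluster as a whole is short of capacity rather than a single host being unlucky.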

    Close unused indices

    There may be many "old" indices that are hardly ever accessed but that you still want to keep in storage for some reason. These can be closed to save Elasticsearch cluster resources.

    A closed index is blocked for read/write operations and does not allow all of the operations that open indices allow. It is not possible to index documents or to search for documents in a closed index. As a result, closed indices do not have to maintain the internal data structures for indexing or searching documents, which reduces the overhead on the cluster.

    An index can be closed with the following request.

    POST /<index>/_close

    A closed index can be reopened with the following request.

    POST /<index>/_open

    Refer to https://www.elastic.co/guide/en/elasticsearch/reference/7.x/indices-close.html for more details.
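
    When there are many old indices, you may want to generate the close requests rather than type them one by one. The sketch below assumes index names end in a -YYYYMM suffix; that naming scheme and the example names are hypothetical, so adjust the pattern to match your own indices.

```python
import re
from datetime import date

# Assumed (hypothetical) naming scheme: "<prefix>-YYYYMM"
_SUFFIX = re.compile(r"-(\d{4})(\d{2})$")


def indices_older_than(index_names, cutoff):
    """Return the names whose YYYYMM suffix is before the cutoff month."""
    old = []
    for name in index_names:
        match = _SUFFIX.search(name)
        if match is None:
            continue  # skip names that do not follow the assumed scheme
        year, month = int(match.group(1)), int(match.group(2))
        if (year, month) < (cutoff.year, cutoff.month):
            old.append(name)
    return old


def close_requests(index_names):
    """Return (method, path) pairs for POST /<index>/_close."""
    return [("POST", f"/{name}/_close") for name in index_names]
```

    The output is just a list of method and path pairs, so you can review it before sending anything to the cluster.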

    Freeze rarely used indices

    Elasticsearch indices keep some data structures in memory to allow you to search them efficiently and to index into them. If you have a lot of indices, the memory required for these data structures can add up to a significant amount. For indices that are searched frequently, it is better to keep these structures in memory because it takes time to rebuild them. However, you might access some of your indices so rarely that you would prefer to release the corresponding memory and rebuild these data structures on each search. Such rarely used indices can be turned into frozen indices, which consume much less heap than normal indices. This allows for a much higher disk-to-heap ratio than would otherwise be possible.

    Note that frozen indices are read-only. Searches on frozen indices are expected to execute slowly. Frozen indices are not intended for high search load. It is possible that a search of a frozen index may take seconds or minutes to complete, even if the same searches completed in milliseconds when the indices were not frozen.

    You can freeze an index with the following request.

    POST /<index>/_freeze

    To make a frozen index writable again, use the following request.

    POST /<index>/_unfreeze

    Refer to https://www.elastic.co/guide/en/elasticsearch/reference/7.x/frozen-indices.html for more details.



    ------------------------------
    Edward Deng
    ------------------------------