Search and indexing problems

The search bar doesn't show the option to search all folders

For all users

If all users are affected, the issue probably occurs following an email migration performed at the file system level. This means that the Elasticsearch search index doesn't exist: you can run ConsolidateMailSpoolIndexJob to create the index and index all the messages on the server:

  • go to the admin console > System Management > Planning and run ConsolidateMailSpoolIndexJob

  • using our command line tool, run the consolidateIndex maintenance operation:

    bm-cli maintenance consolidateIndex domain.net

As this operation is very resource-intensive, it can be performed on groups of users at a time, using the --match option:

bm-cli maintenance consolidateIndex --match "a.*" domain.net
bm-cli maintenance consolidateIndex --match "[b-c].*" domain.net
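As the example patterns above suggest, the whole alphabet can be worked through in batches by scripting the same command; a minimal sketch (the letter groups are only an example, adjust them to your user base):

for range in "a.*" "[b-c].*" "[d-f].*" "[g-m].*" "[n-z].*"; do
    bm-cli maintenance consolidateIndex --match "$range" domain.net
done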

For a few users only

If the problem only concerns one or a few users, this means that their Elasticsearch index is missing or corrupted and needs to be recreated:

  • either by going into each user's admin page and running "Validate and repair user data", then "Consolidate mailbox index" and, if there is no improvement, "Reconstruct mailbox index"

  • or using our command line tool:

    bm-cli maintenance repair user@domain.net
    bm-cli maintenance consolidateIndex user@domain.net

In some cases "Consolidating index" is not enough and it is then necessary to restart a full indexing with the --reset flag:

bm-cli maintenance consolidateIndex --reset user@domain.tld
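Once the full reindexing has completed, you can verify the result with the index info command described in the "Global malfunction" section below; the ratio should climb back towards 100:

bm-cli index info user@domain.tld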

Some search results are missing

In the event of temporary problems with the indexing service, it is possible that some messages sent and received during this period have not been indexed. In this case, simply run the ConsolidateMailSpoolIndexJob task, which will calculate the difference between messages at IMAP level and in the index, then index only the missing messages.
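The same consolidation can also be launched from the command line for the whole domain, using the command shown at the beginning of this guide:

bm-cli maintenance consolidateIndex domain.net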

Missing results may also be caused by an inconsistency between the list of IMAP folders and the database. You can use the "check and repair" maintenance operation, available from the Maintenance tab in the user's admin page, to rebuild this list. Re-indexing the mailbox should then fix the issue.

If this doesn't solve the problem, the /var/log/bm-webmail/errors logs may reveal the cause.

Finally, missing results may be due to an indexing fault that occurred when a message was moved. You can use the "Consolidate mailbox index" maintenance operation, available from the Maintenance tab in the user's admin page, to update the search index.
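These maintenance operations (check and repair, index consolidation) can also be run from the command line, reusing the commands from the previous section (assuming the "check and repair" UI operation corresponds to the maintenance repair command):

bm-cli maintenance repair user@domain.net
bm-cli maintenance consolidateIndex user@domain.net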

Elasticsearch switches to "read-only" mode

Issue: Elasticsearch puts itself into "read-only" mode; the cluster is green, but a restart doesn't fix the issue.

Symptoms: Overall, BlueMind doesn't seem to work anymore. Besides search and display issues, users can't create events, reply to emails, etc.

Alerts containing bm-elasticsearch are raised on the Monitoring Console > General installation status page of the Administration Console, as well as in TICK.

You will find errors containing ClusterBlockException in /var/log/bm/core.log. For example:

... class java.util.concurrent.ExecutionException: ClusterBlockException[index [mailspool_pending] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];]

or

...  class net.bluemind.core.api.fault.ServerFault: java.util.concurrent.ExecutionException: ClusterBlockException[blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];]

Confirmation: the following command, to be executed on the Elasticsearch server, returns ERROR: At least one index is read-only:

curl -s 'http://localhost:9200/_all/_settings' | jq -r '
if any(.[]; .settings.index.blocks.read_only_allow_delete == "true")
then
"ERROR: At least one index is read-only"
else
"INFO: All indexes are read-write"
end'
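To see which indexes are actually blocked, the same settings endpoint can be filtered; a sketch using the same jq tooling as the check above:

curl -s 'http://localhost:9200/_all/_settings' | jq -r '
to_entries[]
| select(.value.settings.index.blocks.read_only_allow_delete == "true")
| .key'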

Cause: the file system containing the Elasticsearch index data is more than 80% full.

Resolution:

  1. Increase the size of the file system containing the Elasticsearch index data (expand the partition, or change disk then resize the file system).
  2. Check that the Elasticsearch index data partition utilization has fallen below 80% (run on the Elasticsearch server):
df -h /var/spool/bm-elasticsearch/{data,repo}
  3. Run the following command:
curl -X PUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d '{ "index.blocks.read_only_allow_delete": false }'
  4. Re-run the confirmation command above; it should return INFO: All indexes are read-write.
  5. Restart indexing of the mailboxes that need it (to be run on the core server):
bm-cli index coherency --workers=4 --run-consolidate all

Logs show esQuota and imapQuota errors

You will find messages such as the one below in /var/log/bm-webmail/errors:

[10-Nov-2019 17:37:38 UTC] [jdoe@bluemind.loc] esQuota < (imapQuota * 0.8). disable es search. esQuota: 4199171, imapQuota: 6123568

This means that, for the account shown, less than 80% of the mailbox is indexed (esQuota being the Elasticsearch-side quota, imapQuota the IMAP-side quota); the Elasticsearch advanced search is therefore disabled because it would return incomplete results. In the example above, 4199171 / 6123568 ≈ 69%, which is below the 80% threshold.

To fix this, you have to consolidate or reindex the account.

If only a few identified users are affected

If the problem only concerns one or a few users, this means that their Elasticsearch index is missing or corrupted and needs to be recreated:

  • either by going into each user's admin page and running "Validate and repair user data", then "Consolidate mailbox index" and, if there is no improvement, "Reconstruct mailbox index"

  • or using our command line tool:

    bm-cli maintenance repair user@domain.net
    bm-cli maintenance consolidateIndex user@domain.net

If all users are affected

To repair all accounts, you can:

  1. find the affected accounts by running a grep on the log file:
grep "disable es search. esQuota:" /var/log/bm-webmail/errors
  2. copy the logins found into a text file (e.g. /tmp/accountWithoutEsSearch.txt)
  3. use the following command to launch index consolidation for each of the logins in the file:
while read account; do bm-cli maintenance consolidateIndex "$account"; done < /tmp/accountWithoutEsSearch.txt
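Steps 1 and 2 can also be combined into a single pipeline; a sketch assuming the log format shown above (the sed expression extracts the bracketed address from each matching line):

grep "disable es search. esQuota:" /var/log/bm-webmail/errors \
  | sed -E 's/^.*\[([^]]+@[^]]+)\].*$/\1/' \
  | sort -u > /tmp/accountWithoutEsSearch.txt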

Global malfunction

Analysis

If you detect a search malfunction in BlueMind, you can check the status of the Elasticsearch cluster using the command:

$ curl -XGET --silent 'http://localhost:9200/_cluster/health'

If the cluster's status is 'green', everything is fine; if it is 'red', there is an issue with Elasticsearch. This information is also fed into the monitoring console.
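For a quick scripted check, the status field alone can be extracted, assuming jq is available as in the read-only check above:

curl -s 'http://localhost:9200/_cluster/health' | jq -r .status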

Using our command line admin tool, you can check a user's indexing status:

$ bm-cli index info jdoe@domain.loc
{"email":"jdoe@domain.loc","ESQuota":3056,"IMAPQuota":3058,"ratio":95}

Resolution

Several issues may stop Elasticsearch from working:

  • an insufficiently indexed mailbox: if, when you check the index (see above), you find a ratio below 80, this means that fewer than 80% of the user's emails are indexed and you must reindex the mailbox:

    • go to the BlueMind admin console
    • go to the user's admin page > Maintenance tab
    • run "Consolidate mailbox index"
  • corruption of the indexes: this is mainly caused by a lack of disk space; Elasticsearch needs at least 10% free disk space. If the disk containing the ES data (/var/spool/bm-elasticsearch) runs out of space, the search indexes may become corrupted. In the ES logs, this results in errors such as:

    [2017-01-26 20:06:54,764][WARN ][cluster.action.shard] [Bill Foster] [mailspool][0] received shard failed for [mailspool][0], node[PcC6eICxRAajmWioK1mhDA], [P], s[INITIALIZING], indexUUID 
    [IEJHQkOnTtOcdY0bMMIFRA], reason [master [Bill Foster][PcC6eICxRAajmWioK1mhDA][bluemind.exa.net.uk][inet[/82.219.13.101:9300]] marked shard as initializing, but shard is marked as failed, resend shard failure]
    [2016-01-26 20:06:55,828][WARN ][indices.cluster] [Bill Foster] [mailspool][0] failed to start shard
    org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [mailspool][0] failed to recover shard
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:287)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

    You must then:

    1. delete the index files and restart Elasticsearch:
    service bm-elasticsearch stop
    rm -rf /var/spool/bm-elasticsearch/data/nodes/0/indices/*
    service bm-elasticsearch start
    2. reset the indexes:
    bm-cli index reset mailspool
    bm-cli index reset mailspool_pending
    bm-cli index reset event
    bm-cli index reset contact
    bm-cli index reset todo
    bm-cli index reset im
    bm-cli index reset note
    3. then start indexing again from the scheduled jobs: Admin Console > System Management > Planning > select the domain "global.virt" and run all the *IndexJob tasks:
    • CalendarIndexJob

    • ContactIndexJob

    • ConsolidateMailSpoolIndexJob

      However, email indexing is an IO-intensive operation, so we recommend that you run this task in the evening or at the weekend. Batch indexing can be launched with bm-cli.

    • TodoListIndexJob

    • HSMIndexJob

  • corrupt translog: this can happen after a server crash or because of low memory. In this case, the general index is not corrupted; only documents not yet written to disk will be missing from the index. In the ES logs, this shows up as an error on service startup:

    [2017-09-04 19:24:38,340][WARN ][indices.cluster          ] [Hebe] [mailspool][1] failed to start shard
    org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [mailspool][1] failed to recover shard
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:287)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
    Caused by: org.elasticsearch.index.translog.TranslogCorruptedException: translog corruption while reading from stream
    at org.elasticsearch.index.translog.ChecksummedTranslogStream.read(ChecksummedTranslogStream.java:70)
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:257)
    ... 4 more

To delete the corrupted translogs:

service bm-elasticsearch stop
rm -rf /var/spool/bm-elasticsearch/data/nodes/0/indices/mailspool/*/translog
service bm-elasticsearch start

Running ConsolidateMailSpoolIndexJob will then reindex the missing messages.