Search and indexing problems

The search bar doesn't show the option to search all folders

For all users

If all users are affected, the issue probably occurs following an email migration at the folder system level. This means that the Elasticsearch index does not exist. You can run the ConsolidateMailSpoolIndexJob task to create the index and index all mails on the server:

either in the admin console: System Management > Planning > run ConsolidateMailSpoolIndexJob
or with the CLI Admin Client, by executing the consolidateIndex maintenance operation:
```
bm-cli maintenance consolidateIndex domain.net
```

As this operation is very resource-intensive, it can be run on one group of users at a time, using the --match option:

bm-cli maintenance consolidateIndex --match "a.*" domain.net
bm-cli maintenance consolidateIndex --match "[b-c].*" domain.net
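
The batches above can also be scripted. The following is a minimal sketch, not a BlueMind tool: the `consolidate_batches` function, the `RUN` variable and the letter ranges are illustrative assumptions. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Call the script with RUN= (empty) to really execute them.
RUN="${RUN-echo}"

# consolidate_batches DOMAIN RANGE...
# RANGE is a character range such as "a-c" (hypothetical split, adapt it to your logins).
consolidate_batches() {
    domain="$1"
    shift
    for range in "$@"; do
        # Builds the same kind of pattern as above, e.g. "[a-c].*"
        $RUN bm-cli maintenance consolidateIndex --match "[$range].*" "$domain" || return 1
    done
}

consolidate_batches domain.net a-c d-f
```

Running one batch after another keeps the load bounded. The loop stops at the first command that returns a non-zero status, assuming bm-cli reports failures that way.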

For a few users only

If the problem only concerns one or a few users, this means that the ElasticSearch index for them is non-existent or corrupted, and needs to be recreated:

either by going into each user's admin page and executing "Validate and repair user data" then "Consolidate mailbox index" then, if there's no improvement, "Reconstruct mailbox index"

or via CLI Admin Client:

bm-cli maintenance repair user@domain.net
bm-cli maintenance consolidateIndex user@domain.net 

In some cases "Consolidating index" is not enough and it is then necessary to restart a full indexing with the --reset flag:

bm-cli maintenance consolidateIndex --reset user@domain.tld
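
The two stages above (repair and consolidate first, full reset only if that is not enough) could be wrapped as follows. This is a sketch under assumptions: `reindex_user` and `RUN` are hypothetical names, and it assumes bm-cli returns a non-zero status on failure.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Call the script with RUN= (empty) to really execute them.
RUN="${RUN-echo}"

# reindex_user LOGIN          -> repair, then consolidate the index
# reindex_user LOGIN reset    -> full reindexing with --reset
reindex_user() {
    login="$1"
    if [ "${2:-}" = "reset" ]; then
        # Only when plain consolidation did not fix the search
        $RUN bm-cli maintenance consolidateIndex --reset "$login"
        return
    fi
    $RUN bm-cli maintenance repair "$login" &&
        $RUN bm-cli maintenance consolidateIndex "$login"
}

reindex_user user@domain.net
```

Deciding whether the first stage was enough remains a manual check, so the reset stage is only run when explicitly requested.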

Some search results are missing

In the event of temporary problems with the indexing service, it is possible that some messages sent and received during this period have not been indexed. In this case, simply run the ConsolidateMailSpoolIndexJob task, which will calculate the difference between messages at IMAP level and in the index, then index only the missing messages.

An error is displayed during a search

This may be caused by an inconsistency between the list of IMAP folders and the database. To rebuild this list, use the "check and repair" maintenance operation, available from the Maintenance tab in the user's admin page. Re-indexing the mailbox afterwards should fix the issue.

If it does not, the logs in /var/log/bm-webmail/errors may reveal the cause.

An error is displayed when accessing a message found by a search

This is probably due to an indexing fault when the message was moved. You can use the "Consolidate mailbox index" maintenance operation – which can be accessed from the Maintenance tab in the user's admin page – to update the search index.

ElasticSearch switches to "read only" mode

Issue: ES puts itself in "read-only" mode, the cluster is green but a restart doesn't fix the issue.

Symptoms: Overall, BlueMind doesn't seem to work anymore. Besides search and display issues, users can't create events, reply to emails, etc.

Alerts containing bm-elasticsearch are raised on the Monitoring Console->General installation status page of the Administration Console as well as in TICK.

You will find errors containing ClusterBlockException in /var/log/bm/core.log. For example:

... class java.util.concurrent.ExecutionException: ClusterBlockException[index [mailspool_pending] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];]

...  class net.bluemind.core.api.fault.ServerFault: java.util.concurrent.ExecutionException: ClusterBlockException[blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];]

Confirmation: The following command (to be run on the Elasticsearch server) returns ERROR: At least one index is read-only:

curl -s 'http://localhost:9200/_all/_settings' | jq -r '
if any(.[]; .settings.index.blocks.read_only_allow_delete == "true") 
then 
	"ERROR: At least one index is read-only" 
else 
	"INFO: All indexes are read-write" 
end'
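
To see which indexes are blocked, rather than only whether one is, the same settings output can be filtered per index. This is a sketch: the `blocked_indices` function name is an assumption, and it relies on the settings JSON being keyed by index name with string values, as the command above expects.

```shell
#!/bin/sh
# blocked_indices: reads the JSON returned by /_all/_settings on stdin and
# prints the name of every index carrying the read_only_allow_delete block.
blocked_indices() {
    jq -r 'to_entries[]
           | select(.value.settings.index.blocks.read_only_allow_delete == "true")
           | .key'
}
```

Usage, on the Elasticsearch server: `curl -s 'http://localhost:9200/_all/_settings' | blocked_indices`. No output means no index is blocked.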

Cause: The file system containing ElasticSearch index data is more than 80% full. See Minimum disk performance for more details.

Resolution:

Increase the size of the file system containing the ElasticSearch index data (expand the partition, or change the disk and then resize the file system)
Check that usage of the ElasticSearch index data partition has dropped below 80% (run on the Elasticsearch server):

df /var/spool/bm-elasticsearch/{data,repo} -h

Run the following command:

curl -X PUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d '{ "index.blocks.read_only_allow_delete": false }'

Re-run the command from the Confirmation step; it should return INFO: All indexes are read-write.
Restart indexing of the mailboxes that need it (to be run on the core server):

bm-cli index coherency --workers=4 --run-consolidate all
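
The disk usage check in the steps above can be made explicit before the read-only block is lifted. This is a minimal sketch using the 80% figure given under Cause; the `usage_pct` and `below_threshold` function names are assumptions.

```shell
#!/bin/sh
# usage_pct PATH: prints the usage percentage of the file system holding PATH.
usage_pct() {
    # -P forces the POSIX one-line-per-file-system format; column 5 is the capacity.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# below_threshold PATH LIMIT: succeeds if usage is strictly below LIMIT percent.
below_threshold() {
    pct=$(usage_pct "$1")
    [ -n "$pct" ] && [ "$pct" -lt "$2" ]
}

for dir in /var/spool/bm-elasticsearch/data /var/spool/bm-elasticsearch/repo; do
    [ -d "$dir" ] || continue   # skip paths that do not exist on this host
    if below_threshold "$dir" 80; then
        echo "OK: $dir is below 80%"
    else
        echo "STILL FULL: $dir is at 80% or more"
    fi
done
```

If a directory is reported as still full, removing the block is unlikely to help for long, since the same condition would presumably trigger it again.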

Logs show esQuota and imapQuota errors

You find messages such as the one below in /var/log/bm-webmail/errors:

[10-Nov-2019 17:37:38 UTC] [jdoe@bluemind.loc] esQuota < (imapQuota * 0.8). disable es search. esQuota: 4199171, imapQuota: 6123568

This means that, for the account shown, less than 80% of the mailbox is indexed (esQuota is the Elasticsearch quota). Elasticsearch search (the advanced search engine) is therefore disabled for this account, because it would return incomplete results.
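
As a worked example, the figures from the log line above can be checked with integer arithmetic. The function names are illustrative, and rounding the percentage down is an assumption made for display only; the comparison itself mirrors the logged condition esQuota < imapQuota * 0.8.

```shell
#!/bin/sh
# indexed_pct ES_QUOTA IMAP_QUOTA: indexed share of the mailbox, in percent (rounded down).
# IMAP_QUOTA must be non-zero.
indexed_pct() {
    echo $(( $1 * 100 / $2 ))
}

# search_enabled ES_QUOTA IMAP_QUOTA: succeeds unless esQuota < imapQuota * 0.8.
# The comparison is scaled by 10 to stay in integer arithmetic.
search_enabled() {
    [ $(( $1 * 10 )) -ge $(( $2 * 8 )) ]
}

indexed_pct 4199171 6123568   # prints 68
```

With esQuota at about 68% of imapQuota, the account falls below the 80% threshold, which is why search is disabled for it.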

To fix this, you have to consolidate or reindex the account.

If only a few identified users are affected

If the problem only concerns one or a few users, this means that the ElasticSearch index for them is non-existent or corrupted, and needs to be recreated:

either by going into each user's admin page and running "Validate and repair user data", then "Consolidate mailbox index" and, if there is no improvement, "Rebuild index"

or via the CLI Admin Client:

bm-cli maintenance repair user@domain.net
bm-cli maintenance consolidateIndex user@domain.net 

If all users are affected

To repair all accounts, you can:

find the accounts by running a grep on the log file:

grep "disable es search. esQuota:" /var/log/bm-webmail/errors

copy the logins found into a text file (e.g. /tmp/accountWithoutEsSearch.txt)
use the following command to launch index consolidation on each of the logins in the file:

while read -r account; do bm-cli maintenance consolidateIndex "$account"; done < /tmp/accountWithoutEsSearch.txt
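
The step of copying the logins by hand can be automated. The sketch below assumes that every affected line has the same layout as the sample log line shown earlier, with the login in square brackets just before "esQuota <". The `extract_logins` function name is an assumption.

```shell
#!/bin/sh
# extract_logins: reads webmail error log lines on stdin and prints
# each login that has an "esQuota <" message, once, in sorted order.
extract_logins() {
    # Capture the bracketed token containing "@" that directly precedes " esQuota <"
    sed -n 's/^.*\[\([^][]*@[^][]*\)] esQuota <.*$/\1/p' | sort -u
}
```

Usage: `extract_logins < /var/log/bm-webmail/errors > /tmp/accountWithoutEsSearch.txt`. It is worth reviewing the resulting file before launching the consolidation loop on it.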

Global malfunction

Analysis

If you detect a search malfunction in BlueMind, you can see the status of the ElasticSearch cluster using the command:

$ curl -XGET --silent 'http://localhost:9200/_cluster/health'

If the cluster's status is 'green', everything is fine. If it is 'red', there is an issue with Elasticsearch. This information is also fed into the monitoring console.
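
For use in a script or a monitoring probe, the status can be turned into a verdict and an exit code. This is a sketch: the `interpret_status` function and its exit codes are assumptions. The 'yellow' case is not covered by the text above; in Elasticsearch it generally means that some replica shards are unassigned.

```shell
#!/bin/sh
# interpret_status STATUS: prints a verdict for a cluster status string and
# returns 0 (green), 1 (yellow), 2 (red) or 3 (anything else).
interpret_status() {
    # Strip any whitespace or padding around the value
    status=$(printf '%s' "$1" | tr -d '[:space:]')
    case "$status" in
        green)  echo "OK: cluster is green" ;;
        yellow) echo "WARNING: cluster is yellow"; return 1 ;;
        red)    echo "ERROR: cluster is red"; return 2 ;;
        *)      echo "UNKNOWN: could not read cluster status"; return 3 ;;
    esac
}
```

Usage, on the Elasticsearch server: `interpret_status "$(curl -s 'http://localhost:9200/_cat/health?h=status')"`. The `_cat/health` endpoint with `h=status` returns the status as plain text, which avoids parsing JSON.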

With the CLI Admin Client, you can view the index status of a given user:

$ bm-cli index info jdoe@domain.loc
{"email":"jdoe@domain.loc","ESQuota":3056,"IMAPQuota":3058,"ratio":95}

Resolution

Several issues may stop ElasticSearch from working:

an insufficiently indexed mailbox: if the index (see above) shows a ratio of less than 80, this means that less than 80% of the user's emails are indexed ⇒ reindex the mailbox:
- go to the BlueMind admin console
- go to the user's admin page > Maintenance tab
- run "Consolidate mailbox index"

corruption of the indexes: mainly caused by a lack of disk space; at least 10% free disk space is required. If the disk containing the ES data (/var/spool/bm-elasticsearch) runs out of space, the search indexes may become corrupted. In the ES logs, this shows up as an error such as:

[2017-01-26 20:06:54,764][WARN ][cluster.action.shard] [Bill Foster] [mailspool][0] received shard failed for [mailspool][0], node[PcC6eICxRAajmWioK1mhDA], [P], s[INITIALIZING], indexUUID [IEJHQkOnTtOcdY0bMMIFRA], reason [master [Bill Foster][PcC6eICxRAajmWioK1mhDA][bluemind.exa.net.uk][inet[/82.219.13.101:9300]] marked shard as initializing, but shard is marked as failed, resend shard failure]
[2016-01-26 20:06:55,828][WARN ][indices.cluster] [Bill Foster] [mailspool][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [mailspool][0] failed to recover shard
      at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:287)
      at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)

You must then:

delete the index files and restart ElasticSearch:

service bm-elasticsearch stop
rm -rf /var/spool/bm-elasticsearch/data/nodes/0/indices/*
service bm-elasticsearch start

Reset the indexes:

bm-cli index reset mailspool
bm-cli index reset mailspool_pending
bm-cli index reset event
bm-cli index reset contact
bm-cli index reset todo
bm-cli index reset im
bm-cli index reset note

Then start indexing again from the scheduled jobs: Admin Console > System Management > Planning > select the domain "global.virt" and run all the *IndexJob tasks:

CalendarIndexJob
ContactIndexJob
ConsolidateMailSpoolIndexJob
TodoListIndexJob
HSMIndexJob

❗ Email indexing (ConsolidateMailSpoolIndexJob) is an IO-intensive operation, so we recommend that you run this task in the evening or at the weekend. Batch indexing can be launched with bm-cli.

corrupt translog: this can happen if the server has crashed or because of low memory. In this case, the index as a whole is not corrupted and only the indexing of documents not yet written to disk will be lost. In the ES logs, this shows up as an error on service startup:

[2017-09-04 19:24:38,340][WARN ][indices.cluster          ] [Hebe] [mailspool][1] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [mailspool][1] failed to recover shard
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:287)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.translog.TranslogCorruptedException: translog corruption while reading from stream
    at org.elasticsearch.index.translog.ChecksummedTranslogStream.read(ChecksummedTranslogStream.java:70)
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:257)
	... 4 more

To delete the corrupted translogs:

service bm-elasticsearch stop
rm -rf /var/spool/bm-elasticsearch/data/nodes/0/indices/mailspool/*/translog
service bm-elasticsearch start

Then run ConsolidateMailSpoolIndexJob to reindex the missing messages.
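
The index reset step listed earlier for corrupted indexes can also be written as a loop. This is a sketch: the index names are those listed above, while `reset_indexes` and `RUN` are hypothetical names. By default it only prints the commands.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Call the script with RUN= (empty) to really execute them.
RUN="${RUN-echo}"

reset_indexes() {
    for index in mailspool mailspool_pending event contact todo im note; do
        # Stop at the first failure (assumes bm-cli returns non-zero on error)
        $RUN bm-cli index reset "$index" || return 1
    done
}

reset_indexes
```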
