Search and indexing problems
The search bar doesn't show the option to search all folders
For all users
If all users are affected, the issue probably follows an email migration carried out at the file system level, which means that the Elasticsearch index does not exist.
You can run the ConsolidateMailSpoolIndexJob task to create the index and index all emails on the server:
- go to the admin console > System Management > Planning > run ConsolidateMailSpoolIndexJob
- or, with the CLI Admin Client, execute the consolidateIndex maintenance operation:
bm-cli maintenance consolidateIndex domain.net
As this operation is very resource-intensive, it can be performed by groups of users, using the --match option:
bm-cli maintenance consolidateIndex --match "a.*" domain.net
bm-cli maintenance consolidateIndex --match "[b-c].*" domain.net
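The batching above can be scripted. The sketch below is a dry run only: it prints one consolidateIndex command per group of logins so that the list can be reviewed before anything is executed. The letter ranges and the function name are assumptions chosen for illustration, not a BlueMind recommendation.

```shell
#!/bin/sh
# Dry run: print one consolidateIndex command per group of logins.
# The ranges below are illustrative; adapt them to your user base.
print_consolidate_batches() {
    domain=$1
    for range in a-f g-m n-s t-z 0-9; do
        printf 'bm-cli maintenance consolidateIndex --match "[%s].*" %s\n' "$range" "$domain"
    done
}

print_consolidate_batches domain.net
```

Running the printed commands one after the other, rather than in parallel, keeps the load on the server bounded.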
For a few users only
If the problem only concerns one or a few users, this means that the ElasticSearch index for them is non-existent or corrupted, and needs to be recreated:
- either by going into each user's admin page and executing "Validate and repair user data", then "Consolidate mailbox index" and, if there's no improvement, "Reconstruct mailbox index"
- or via the CLI Admin Client:
bm-cli maintenance repair user@domain.net
bm-cli maintenance consolidateIndex user@domain.net
In some cases consolidating the index is not enough, and a full re-indexing must be started with the --reset flag:
bm-cli maintenance consolidateIndex --reset user@domain.tld
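The per-user sequence above can be wrapped in a small helper. This is a minimal sketch: the function names and the DRY_RUN variable are hypothetical, and by default the commands are only printed, not executed.

```shell
#!/bin/sh
# Print (DRY_RUN=1, the default here) or execute a command.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1: user address; $2: pass "reset" to force a full re-indexing.
repair_user_index() {
    run bm-cli maintenance repair "$1" || return 1
    if [ "${2:-}" = reset ]; then
        run bm-cli maintenance consolidateIndex --reset "$1"
    else
        run bm-cli maintenance consolidateIndex "$1"
    fi
}

repair_user_index user@domain.net
```

Setting DRY_RUN=0 executes the commands; calling `repair_user_index user@domain.net reset` switches the second step to the --reset variant.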
Some search results are missing
In the event of temporary problems with the indexing service, it is possible that some messages sent and received during this period have not been indexed. In this case, simply run the ConsolidateMailSpoolIndexJob task, which will calculate the difference between messages at IMAP level and in the index, then index only the missing messages.
An error is displayed during a search
This may be caused by an inconsistency between the list of IMAP folders and the database. You can use the "check and repair" maintenance operation, which can be accessed from the Maintenance tab in the user's admin page, to rebuild this list. Re-indexing the mailbox afterwards should fix the issue.
If it does not, the logs in /var/log/bm-webmail/errors may show the cause.
An error is displayed when accessing a message found by a search
This is probably due to an indexing fault when the message was moved. You can use the "Consolidate mailbox index" maintenance operation – which can be accessed from the Maintenance tab in the user's admin page – to update the search index.
ElasticSearch switches to "read only" mode
Issue: ES puts itself in "read-only" mode, the cluster is green but a restart doesn't fix the issue.
Symptoms: Overall, BlueMind doesn't seem to work anymore. Besides search and display issues, users can't create events, reply to emails, etc.
Alerts containing bm-elasticsearch are raised on the Monitoring Console > General installation status page of the Administration Console as well as in TICK.
You will find errors containing ClusterBlockException in /var/log/bm/core.log. For example:
... class java.util.concurrent.ExecutionException: ClusterBlockException[index [mailspool_pending] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];]
or
... class net.bluemind.core.api.fault.ServerFault: java.util.concurrent.ExecutionException: ClusterBlockException[blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];]
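To check quickly whether such errors are present, the log can be searched for the exception name. The helper below is a minimal sketch; its name is hypothetical and it only counts matching lines.

```shell
#!/bin/sh
# Count log lines mentioning ClusterBlockException.
# $1: log file, defaulting to the core log named above.
count_cluster_blocks() {
    grep -c 'ClusterBlockException' "${1:-/var/log/bm/core.log}"
}

# Usage on the core server:
#   count_cluster_blocks
```

A non-zero count only shows that the errors occurred at some point; the confirmation command remains the reference check for the current state.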
Confirmation: the following command (to be executed on the Elasticsearch server) returns ERROR: At least one index is read-only:
curl -s 'http://localhost:9200/_all/_settings' | jq -r '
if any(.[]; .settings.index.blocks.read_only_allow_delete == "true")
then
"ERROR: At least one index is read-only"
else
"INFO: All indexes are read-write"
end'
Cause: The file system containing ElasticSearch index data is more than 80% full. See Minimum disk performance for more details.
Resolution:
- Increase the size of the file system containing the ElasticSearch index data (expand the partition, or change the disk and then resize the file system)
- Check that usage of the ElasticSearch index data partition has dropped below 80% (run on the Elasticsearch server):
df /var/spool/bm-elasticsearch/{data,repo} -h
- Run the following command:
curl -X PUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d '{ "index.blocks.read_only_allow_delete": false }'
- Re-run the Confirmation command; it should return INFO: All indexes are read-write.
- Restart indexing of the mailboxes that need it (to be run on the core server):
bm-cli index coherency --workers=4 --run-consolidate all
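The disk-usage check in the resolution steps above can be automated. The sketch below assumes the POSIX `df -P` output format (usage in the fifth column, mount point in the sixth, no spaces in the mount point); the function name is hypothetical. It prints every mount point at or above the threshold and returns a non-zero status if there is one.

```shell
#!/bin/sh
# Read `df -P` output on stdin and print "mountpoint usage%" for every
# file system whose usage is at or above the threshold (default 80).
# Exit status is 1 if at least one file system is over the threshold.
over_threshold() {
    awk -v max="${1:-80}" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)
            if (use + 0 >= max) { print $6, use "%"; bad = 1 }
        }
        END { exit bad + 0 }'
}

# Usage on the Elasticsearch server:
#   df -P /var/spool/bm-elasticsearch/data | over_threshold 80
```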
Logs show esQuota and imapQuota errors
You find messages such as the one below in /var/log/bm-webmail/errors:
[10-Nov-2019 17:37:38 UTC] [jdoe@bluemind.loc] esQuota < (imapQuota * 0.8). disable es search. esQuota: 4199171, imapQuota: 6123568
This means that, for the account shown, less than 80% of the mailbox is indexed (esQuota is the Elasticsearch quota). Elasticsearch search (the advanced search engine) is therefore disabled for this account because it would be ineffective.
To fix this, you have to consolidate or reindex the account.
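In the sample message above, 4199171 / 6123568 is about 0.69, which is below the 0.8 threshold. The sketch below computes the same percentage for every such line in the log. It assumes the log lines follow exactly the format of the sample; the function name is hypothetical.

```shell
#!/bin/sh
# Read webmail error lines on stdin and print "login percent" for each
# esQuota warning, where percent is the integer part of
# esQuota * 100 / imapQuota.
indexed_ratio() {
    sed -n 's/.*\[\([^]]*\)\] esQuota <.*esQuota: \([0-9]*\), imapQuota: \([0-9]*\).*/\1 \2 \3/p' |
        awk '$3 > 0 { printf "%s %d\n", $1, $2 * 100 / $3 }'
}

# Usage:
#   indexed_ratio < /var/log/bm-webmail/errors
```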
If only a few identified users are affected
If the problem only concerns one or a few users, this means that the ElasticSearch index for them is non-existent or corrupted, and needs to be recreated:
- either by going into each user's admin page and running "Validate and repair user data", then "Consolidate mailbox index" and, if there's no improvement, "Rebuild index"
- or via the CLI Admin Client:
bm-cli maintenance repair user@domain.net
bm-cli maintenance consolidateIndex user@domain.net
If all users are affected
To repair all accounts, you can:
- find the accounts by running a grep on the log file:
grep "disable es search. esQuota:" /var/log/bm-webmail/errors
- copy the logins found into a text file (e.g. /tmp/accountWithoutEsSearch.txt)
- use the following command to launch index consolidation on each of the logins in the file:
while read -r account; do bm-cli maintenance consolidateIndex "$account"; done < /tmp/accountWithoutEsSearch.txt
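The copy step can also be done without manual editing. The following sketch extracts each affected login once from the log, assuming the lines follow the format of the sample message shown earlier (login in the square brackets just before "esQuota <"); the function name is hypothetical.

```shell
#!/bin/sh
# Read webmail error lines on stdin and print each affected login once.
extract_accounts() {
    sed -n 's/.*\[\([^]]*\)\] esQuota <.*/\1/p' | sort -u
}

# Usage:
#   extract_accounts < /var/log/bm-webmail/errors > /tmp/accountWithoutEsSearch.txt
```

The resulting file has one login per line and can be fed directly to the consolidation loop.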
Global malfunction
Analysis
If you detect a search malfunction in BlueMind, you can see the status of the ElasticSearch cluster using the command:
$ curl -XGET --silent 'http://localhost:9200/_cluster/health'
If the cluster's status is 'green', everything is fine; if it is 'red', there is an issue with Elasticsearch. This information is also fed into the monitoring console.
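For use in a script, the status can be extracted from the JSON response. This is a minimal sketch using sed so that it has no extra dependency; the function name is hypothetical.

```shell
#!/bin/sh
# Read the JSON returned by /_cluster/health on stdin and print the
# value of its "status" field.
cluster_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Usage on the Elasticsearch server:
#   curl -XGET --silent 'http://localhost:9200/_cluster/health' | cluster_status
```

Where jq is installed, `jq -r .status` gives the same result.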
With the CLI Admin Client, you can view the index status of a given user:
$ bm-cli index info jdoe@domain.loc
{"email":"jdoe@domain.loc","ESQuota":3056,"IMAPQuota":3058,"ratio":95}
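The ratio field can be tested against the 80% threshold in a script. The sketch below assumes that each output line has the same fields, in the same order, as the sample above; the function name is hypothetical.

```shell
#!/bin/sh
# Read `bm-cli index info` output lines on stdin and print the email of
# every mailbox whose ratio is below the threshold (default 80).
under_indexed() {
    sed -n 's/.*"email":"\([^"]*\)".*"ratio":\([0-9]*\).*/\1 \2/p' |
        awk -v min="${1:-80}" '$2 + 0 < min { print $1 }'
}

# Usage:
#   bm-cli index info jdoe@domain.loc | under_indexed 80
```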
Resolution
Several issues may stop ElasticSearch from working:
- an insufficiently indexed mailbox: if the index info (see above) shows a ratio of less than 80, less than 80% of the user's emails are indexed ⇒ reindex the mailbox:
  - go to the BlueMind admin console
  - go to the user's admin page > Maintenance tab
  - run "Consolidate mailbox index"
- corruption of the indexes: mainly caused by a lack of disk space; at least 10% of the disk must remain free. If the disk containing the ES data (/var/spool/bm-elasticsearch) runs out of space, the search indexes may become corrupted. In the ES logs, this shows up as errors such as:
[2017-01-26 20:06:54,764][WARN ][cluster.action.shard] [Bill Foster] [mailspool][0] received shard failed for [mailspool][0], node[PcC6eICxRAajmWioK1mhDA], [P], s[INITIALIZING], indexUUID [IEJHQkOnTtOcdY0bMMIFRA], reason [master [Bill Foster][PcC6eICxRAajmWioK1mhDA][bluemind.exa.net.uk][inet[/82.219.13.101:9300]] marked shard as initializing, but shard is marked as failed, resend shard failure]
[2016-01-26 20:06:55,828][WARN ][indices.cluster] [Bill Foster] [mailspool][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [mailspool][0] failed to recover shard
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:287)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
You must then:
  - delete the index files and restart ElasticSearch:
service bm-elasticsearch stop
rm -rf /var/spool/bm-elasticsearch/data/nodes/0/indices/*
service bm-elasticsearch start
  - reset the indexes:
bm-cli index reset mailspool
bm-cli index reset mailspool_pending
bm-cli index reset event
bm-cli index reset contact
bm-cli index reset todo
bm-cli index reset im
bm-cli index reset note
  - then start indexing again from the scheduled jobs: Admin Console > System Management > Planning > select the domain "global.virt" and run all the *IndexJob tasks:
    - CalendarIndexJob
    - ContactIndexJob
    - ConsolidateMailSpoolIndexJob
      ❗ However, email indexing is an IO-intensive operation, so we recommend that you run this task in the evening or at the weekend. Batch indexing can be launched with bm-cli.
    - TodoListIndexJob
    - HSMIndexJob
corrupt
translog
: this can happen if the server has crashed or because of low memory. In this case, the general index is not corrupt and only the indexing of documents not written to the disk yet will be lost. In ES logs, this translates as an error on service start up:[2017-09-04 19:24:38,340][WARN ][indices.cluster ] [Hebe] [mailspool][1] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [mailspool][1] failed to recover shard
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:287)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.translog.TranslogCorruptedException: translog corruption while reading from stream
at org.elasticsearch.index.translog.ChecksummedTranslogStream.read(ChecksummedTranslogStream.java:70)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:257)
... 4 more
To delete corrupted translogs:
service bm-elasticsearch stop
rm -rf /var/spool/bm-elasticsearch/data/nodes/0/indices/mailspool/*/translog
service bm-elasticsearch start
Running ConsolidateMailSpoolIndexJob reindexes the missing messages.
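The index reset step listed under "corruption of the indexes" above can be written as a loop over the index names given there. This is a minimal sketch: the function name and the DRY_RUN variable are hypothetical, and by default the commands are only printed so that they can be reviewed first.

```shell
#!/bin/sh
# Print (DRY_RUN=1, the default here) or execute `bm-cli index reset`
# for each index name listed in this document.
reset_all_indexes() {
    for idx in mailspool mailspool_pending event contact todo im note; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            printf 'bm-cli index reset %s\n' "$idx"
        else
            bm-cli index reset "$idx" || return 1
        fi
    done
}

reset_all_indexes
```

With DRY_RUN=0 the loop stops at the first reset that fails, so a partial failure is not hidden by the following commands.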