Initially, your disk space is going to take at least the total number of documents multiplied by the average document size. And let's say the average document size is 2 KB.

Having multiple shards for one collection does not necessarily result in a more resilient Solr. If one shard has an issue, the other shards can still respond, but the overall response time (or blocker) is determined by the slowest shard. What multiple shards do give you is that the total number of documents is divided by the shard count, which reduces the per-node cache and disk size and improves the indexing process.

Is it possible that our index/update process is overkill? Given the results of our experience, it is not. I will leave the analysis of that question for another post; otherwise this one would get too extensive. We have reached ~210K updates per hour (peak traffic) on our key markets.

The only job Apache ZooKeeper has in this environment is keeping the cluster state available to all nodes, as accurately as possible. One common issue when replicas recover too frequently is that the cluster state in ZooKeeper gets out of sync. This produces inconsistent states among the running replicas, and the replica trying to recover can end up in a long loop that might last hours. ZooKeeper itself is very stable; it tends to fail only because of network resources, or rather the lack of them.

One of the biggest driving factors for Solr performance is RAM. Solr requires sufficient memory for the Java heap, plus free memory the OS can use for the disk cache.
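As a quick back-of-envelope check on the disk-space estimate at the top of this section, here is the arithmetic in a few lines of Python. The document count is a made-up example figure, not our real one; only the 2 KB average comes from the text.

```python
# Back-of-envelope disk estimate: total documents x average document size.
num_docs = 100_000_000          # hypothetical collection size, for illustration only
avg_doc_size_kb = 2             # average document size mentioned above
min_disk_gb = num_docs * avg_doc_size_kb / (1024 * 1024)
print(f"At least ~{min_disk_gb:.0f} GB of raw document data")  # ~191 GB
```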
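To illustrate the sharding point, here is a minimal sketch that creates a SolrCloud collection split across several shards with the Collections API. The host, collection name, and shard/replica counts are assumptions for the example, not values from our cluster.

```python
import requests

SOLR_URL = "http://localhost:8983/solr"  # assumed host/port for the example


def create_collection(name, num_shards, replication_factor):
    """Create a SolrCloud collection split across num_shards shards."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "wt": "json",
    }
    resp = requests.get(f"{SOLR_URL}/admin/collections", params=params)
    resp.raise_for_status()
    return resp.json()


# Example: 3 shards, 2 replicas each -> each shard holds roughly 1/3 of the
# documents, so the per-node cache and disk footprint shrink accordingly.
if __name__ == "__main__":
    print(create_collection("products", num_shards=3, replication_factor=2))
```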
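To catch the recovery loops described above early, it helps to watch replica states directly. The sketch below polls the Collections API CLUSTERSTATUS action and reports any replica that is not active (for example, one stuck in "recovering"). Again, the URL and collection name are assumptions for the example.

```python
import requests

SOLR_URL = "http://localhost:8983/solr"  # assumed host/port


def replicas_not_active(collection):
    """Return (shard, replica, state) for every replica that is not 'active'."""
    params = {"action": "CLUSTERSTATUS", "collection": collection, "wt": "json"}
    resp = requests.get(f"{SOLR_URL}/admin/collections", params=params)
    resp.raise_for_status()
    shards = resp.json()["cluster"]["collections"][collection]["shards"]
    problems = []
    for shard_name, shard in shards.items():
        for replica_name, replica in shard["replicas"].items():
            if replica["state"] != "active":
                problems.append((shard_name, replica_name, replica["state"]))
    return problems


if __name__ == "__main__":
    for shard, replica, state in replicas_not_active("products"):
        print(f"{shard}/{replica} is {state}")  # e.g. 'recovering' or 'down'
```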