Stream: containers

Topic: K8s: Solr vs SolrCloud


Oliver Bertuch (Nov 03 2025 at 13:47):

So Bitnami stopped offering their Helm charts openly and for free. Now I need to decide whether to use the Apache Solr Operator (which means having to use SolrCloud and making Dataverse work with it) or run something brewed locally.

Oliver Bertuch (Nov 03 2025 at 13:47):

Any opinions?

Philip Durbin 🚀 (Nov 03 2025 at 14:01):

Sounds like a tough choice. Are you saying we have a choice between switching to SolrCloud and becoming the maintainers of a Solr Operator?

Oliver Bertuch (Nov 03 2025 at 14:03):

Oh gosh, no! It's only that the Apache Solr folks maintain just two official things: a Helm chart for SolrCloud and their Solr Operator. Running single instances or "classic HA mode" is not something they support. So if you want that, you must use something else. And that means either building it yourself or reusing something.

Oliver Bertuch (Nov 03 2025 at 14:03):

I suppose these days it would be wise to start looking into SolrCloud. If running on K8s, that's the way to go with redundancy, load balancing etc.

Philip Durbin 🚀 (Nov 03 2025 at 14:06):

So you're saying that Solr maintains an operator for SolrCloud but not Solr the way we run it. We'd become the maintainers of a Solr Operator for single instances. Do I have that right?

Oliver Bertuch (Nov 03 2025 at 14:06):

Yeah, if we actually wanted to go that way, we'd need to create either an operator or a Helm chart of our own.

Oliver Bertuch (Nov 03 2025 at 14:07):

There are obviously already some charts out there and you could even fork some stuff, but as always: one needs to own one's dependencies unless there is a community that takes care of them.

Philip Durbin 🚀 (Nov 03 2025 at 14:08):

If we switch to SolrCloud, we'd want to switch everywhere, right? In dev, in a "classic" installation of Dataverse, etc.

Oliver Bertuch (Nov 03 2025 at 14:09):

Not necessarily. It may be possible to do a parallel approach.

Oliver Bertuch (Nov 03 2025 at 14:10):

I don't have much experience, so I am relying on good prompting with Claude 4.5 and testing later. It seems with what we have, we mostly would need to use a different client implementation. That's easy to switch depending on a setting.
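
A minimal sketch of what "switch depending on a setting" could look like, assuming a hypothetical setting whose value is either `standalone` or `cloud`. Neither the setting nor the class and method names below come from Dataverse. To stay dependency-free, the sketch only resolves the mode and the connection target; in real code each branch would presumably build the matching SolrJ client (for example `Http2SolrClient` for a single instance, `CloudSolrClient` for SolrCloud).

```java
import java.util.Locale;

public class SolrClientMode {
    /** The two deployment styles discussed in this thread. */
    public enum Mode { STANDALONE, CLOUD }

    /** Parses a hypothetical setting value; null or blank falls back to STANDALONE. */
    public static Mode fromSetting(String value) {
        if (value == null || value.isBlank()) {
            return Mode.STANDALONE;
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "cloud":
                return Mode.CLOUD;
            case "standalone":
                return Mode.STANDALONE;
            default:
                throw new IllegalArgumentException("Unknown Solr mode: " + value);
        }
    }

    /** Describes what a client would connect to in each mode (illustrative only). */
    public static String connectionTarget(Mode mode, String solrUrl, String zkHost) {
        // STANDALONE talks to one Solr base URL; CLOUD typically finds nodes via ZooKeeper.
        return mode == Mode.CLOUD ? "zookeeper:" + zkHost : "url:" + solrUrl;
    }
}
```

With this shape, existing installations that never set the value would keep the single-instance behaviour, which fits the "parallel approach" mentioned above.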

Philip Durbin 🚀 (Nov 03 2025 at 14:10):

interesting

Oliver Bertuch (Nov 03 2025 at 14:11):

I'm already asking for ideas about what else could be optimized, and it is mostly saying that we can index larger batches at a time because we can distribute the load better across more resources.

Philip Durbin 🚀 (Nov 03 2025 at 14:12):

The first step of https://solr.apache.org/guide/solr/latest/getting-started/tutorial-five-minutes.html is to launch SolrCloud. Maybe we should get with the program and start using it. :grimacing:

Oliver Bertuch (Nov 03 2025 at 14:13):

You can even run single-node SolrCloud instances. They use an embedded ZooKeeper.

Oliver Bertuch (Nov 03 2025 at 14:13):

Our current config files etc. can stay as is.

Oliver Bertuch (Nov 03 2025 at 14:14):

There is some room for improvement concerning the routing of documents to shards. It would make sense to have files etc. routed by dataset ID, so queries for one dataset happen on the same node.
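
One way to get this kind of co-location is SolrCloud's `compositeId` router: the part of a document ID before `!` is hashed to pick the shard, so documents sharing that prefix land together. A rough sketch, assuming a hypothetical `dataset<id>` prefix and illustrative document IDs (the class, method names and ID format are not taken from Dataverse):

```java
public class DatasetRouting {
    /** Routing key shared by a dataset and everything that belongs to it. */
    public static String routeKey(long datasetId) {
        return "dataset" + datasetId + "!";
    }

    /** Prefixes a document ID with the dataset's routing key, e.g. "dataset42!datafile_7". */
    public static String routedId(long datasetId, String docId) {
        if (docId == null || docId.isEmpty()) {
            throw new IllegalArgumentException("docId must not be empty");
        }
        if (docId.contains("!")) {
            // '!' is the compositeId separator; a second one would add another routing level.
            throw new IllegalArgumentException("docId must not contain '!': " + docId);
        }
        return routeKey(datasetId) + docId;
    }
}
```

At query time, the same key can be passed in Solr's `_route_` parameter (here `dataset42!`) so the request only goes to the shard holding that dataset. Note that changing document IDs this way would require a full reindex.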

Philip Durbin 🚀 (Nov 03 2025 at 14:14):

Gotcha. Well, I guess I'm leaning toward switching to SolrCloud.


Last updated: Jan 09 2026 at 14:18 UTC