"The MOC team wants to test a containerized Dataverse installation on the MOC infrastructure to support large data and computing." -- https://github.com/IQSS/dataverse-pm/issues/105
A nice update posted to Slack just now. The containers are working!
I'm hacking on OpenShift support. Making some progress, but it's really slow
Currently customizing the base image to be more OpenShift security policy friendly
Great news!
PROGRESS!
go go go!
https://github.com/IQSS/dataverse/pull/10314
Here's what I have so far
Not much yet, but a start. Next up: adding Solr dependency
At least I can start Dataverse and get to a successful deploy and bootstrap :grinning_face_with_smiling_eyes:
Ah, you even link to CRC. Great. Doing a million things but I'll try to get it installed after lunch. This looks great! Thanks, @Oliver Bertuch !
Technically, it _should_ also work with the K8s cluster in Docker Desktop
There even is an extension for it: https://developers.redhat.com/articles/2022/05/10/introducing-red-hat-openshift-extension-docker-desktop
Your mileage may vary!
Yes, I believe Leonid played around with that extension.
Looks kinda abandoned...
No commit for 2 years
Here are the docs for the normal K8s cluster in Docker Desktop: https://docs.docker.com/desktop/kubernetes/
I'm getting an error when I try to use PR #10314:
pdurbin@air dataverse % mvn -Pct k8s:resource k8s:deploy
[INFO] Scanning for projects...
Downloading from central: https://repo.maven.apache.org/maven2/org/codehaus/mojo/maven-metadata.xml
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-metadata.xml
Downloaded from central: https://repo.maven.apache.org/maven2/org/codehaus/mojo/maven-metadata.xml (21 kB at 108 kB/s)
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-metadata.xml (14 kB at 75 kB/s)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 0.879 s
[INFO] Finished at: 2024-02-10T11:18:08-05:00
[INFO] ------------------------------------------------------------------------
[ERROR] No plugin found for prefix 'k8s' in the current project and in the plugin groups [org.apache.maven.plugins, org.codehaus.mojo] available from the repositories [local (/Users/pdurbin/.m2/repository), central (https://repo.maven.apache.org/maven2)] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/NoPluginFoundForPrefixException
My money :money: is on you forgetting to cd into modules/container-k8s
Thanks, something is happening!
pdurbin@air container-k8s % eval $(crc oc-env)
pdurbin@air container-k8s % oc get pods -w
NAME READY STATUS RESTARTS AGE
dataverse-894d594c7-xdr2x 0/2 Pending 0 45s
postgres-6c788b5588-2gvn4 1/1 Running 0 45s
I had to do just crc start though because crc start -c 8 -d 64 -m 16384 (and lower amounts of memory and cpu) keeps failing on my laptop at home. :disappointed:
Nothing seems to be happening. :thinking:
I did crc stop and crc delete and tried again with more memory crc start -m 12288 (12 GB instead of 9). I have 16 GB on this laptop.
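Instead of passing flags on every start, CRC can persist these resource settings. This is just a sketch using the standard `crc config` keys (cpus, memory, disk-size); adjust the values to your hardware:

```shell
# Persist resource settings for the CRC VM so plain `crc start` picks them up.
# Keys below are standard `crc config` options; values are examples only.
crc config set cpus 4
crc config set memory 12288    # MiB; leaves headroom below a 16 GB host
crc config set disk-size 64    # GiB
crc start                      # uses the stored settings
```

`crc config view` shows what is currently set.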
Hmm, some kind of error, I guess:
pdurbin@air container-k8s % oc get pods -w
NAME READY STATUS RESTARTS AGE
dataverse-894d594c7-xm8tw 0/2 ContainerCreating 0 6s
postgres-6c788b5588-8vh2s 0/1 ContainerCreating 0 6s
postgres-6c788b5588-8vh2s 0/1 Running 0 14s
postgres-6c788b5588-8vh2s 1/1 Running 0 21s
dataverse-894d594c7-xm8tw 1/2 Running 0 39s
dataverse-894d594c7-xm8tw 0/2 Error 0 3m39s
dataverse-894d594c7-xm8tw 1/2 Running 1 (2s ago) 3m40s
Caused by: edu.harvard.iq.dataverse.settings.ConfigCheckService$ConfigurationError: Not all configuration checks passed successfully. See logs above.
Directory /dv/docroot for docroot space (see dataverse.files.docroot) exists, but is not writeable|#]
At work, where I have double the RAM (32 GB), crc start -c 8 -d 64 -m 16384 from the PR works, but I get the same "directory not writeable" error:
Screenshot-2024-02-12-at-1.13.59-PM.png
Great meeting, thanks all. I'll post the video as soon as I get it. For now here are the notes: https://docs.google.com/document/d/1eKZANop8IXgM2s6h_UQ14bAO2R0Xn_xo0wEE-WdfKVg/edit?usp=sharing
One thing that was bothering me: while we did see the ERROR StatusLogger message on my machine too, we determined it was non-fatal, but then we never went on to find the next, actually fatal error.
It turns out I can get a version from the API at least:
Screenshot-2024-02-15-at-10.50.58-AM.png
One thing I'm confused about is how to access Dataverse from the host, from my Mac. Under "details" I can see a Pod IP of 10.217.0.63 and a Host IP of 192.168.126.11 but they both time out when I try to reach them.
Oh, I should point out that this is with the suggestion from Danni that I showed in the meeting, a change to modules/container-k8s/src/main/jkube/dataverse-deployment.yaml:
replicas: 1
template:
spec:
+ securityContext:
+ runAsUser: 1000
+ fsGroup: 1000
containers:
- name: dataverse
image: ghcr.io/gdcc/dataverse:openshift-poc
It looks like the root collection was created, great.
$ kubectl exec --stdin --tty dataverse-5f6bb9f895-lrfvb -- bash
Defaulted container "dataverse" out of: dataverse, bootstrap
payara@dataverse-5f6bb9f895-lrfvb:~$ curl -s localhost:8080/api/dataverses/:root | jq .
{
"status": "OK",
"data": {
"id": 1,
"alias": "root",
"name": "Root",
"dataverseContacts": [
{
"displayOrder": 0,
"contactEmail": "root@mailinator.com"
}
],
"permissionRoot": true,
"description": "The root dataverse.",
"dataverseType": "UNCATEGORIZED",
"creationDate": "2024-02-15T14:29:34Z"
}
}
I'm exec'ing into the container because I'm not sure how to access these API endpoints from the host, my Mac. :thinking:
@Oliver Bertuch or @Danni Shi do you know?
Oh, my bad, it's right in the description of #10314:
$ oc port-forward pods/dataverse-5f6bb9f895-lrfvb 8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
It works!
Screenshot-2024-02-15-at-2.45.45-PM.png
Like Oliver said during the meeting, there's Solr work to do but this is great progress!
Shouldn't this go in a YAML file? I'd rather not run this command every time.
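For a persistent alternative to port-forwarding on OpenShift, a Route object could be declared in YAML. This is a hedged sketch: it assumes a Service named `dataverse` exposing port 8080 exists in the namespace (the jkube-generated Service name may differ):

```yaml
# Hypothetical OpenShift Route; assumes a Service named "dataverse" on port 8080.
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: dataverse
spec:
  to:
    kind: Service
    name: dataverse
  port:
    targetPort: 8080
```

With this applied, `oc get route dataverse` would show the externally reachable hostname.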
@Oliver Bertuch should we go ahead and add the commit above about securityContext/runAsUser to #10314? Maybe with a TODO to remove it some day?
Ok, I just merged #10314 BUT before I did, I narrowed the scope, as discussed with Oliver earlier today. We need the various fixes to get our images running under OpenShift. However, I removed the other stuff we don't need yet, the config files, etc.
That's why I changed the title of the PR to "OpenShift prep work". I think this better reflects what I merged.
Anyway, onwards with #10337 and an operator! Oliver already pushed some code for us to look at! https://github.com/IQSS/dataverse/compare/develop...10337-k8s-operator
And there's a dedicated topic: #containers > dvopa K8s Operator #10337
I successfully followed the demo in the Dataverse Guides and was able to run it with Docker on my local WSL setup for testing.
Now, I am attempting to replicate this setup on OpenShift within my organization. However, I am encountering issues with Solr not functioning as expected. I am currently using the GDCC image of Dataverse along with Solr 9, but I am uncertain if this configuration is compatible.
Additionally, I have come across the IQSS images for Dataverse and Solr. Would these be a better alternative?
Are there any specific guides or images available for deploying Dataverse on OpenShift? I would appreciate any guidance you can provide.
Thank you.
Hi! The IQSS images on Docker Hub are old. You shouldn't use them. I'm glad the GDCC images are working for you, the demo version at least.
We have a topic going at #containers > OpenShift but it's been a while since anyone has posted anything.
Actually, I think I'll just merge this topic into that one.
@Gursahib can you please say more about what's not working?
Hi Philip,
Thank you for your response.
I am encountering an issue where the dvinitializer job requires root privileges, causing it to fail. However, even without it, I was able to deploy Dataverse to the UI level and log in as an admin.
In this deployment, I can create datasets and upload files, but Solr is unable to perform searches, and creating a dataverse fails. I suspect that certain Solr actions require root privileges.
Below is the current pod log:
chown: /var/solr/data: Operation not permitted
chown: /var/solr/data: Operation not permitted
chown: /var/solr/logs/solr_gc.log: Operation not permitted
chown: /var/solr/logs/solr.log: Operation not permitted
chown: /var/solr/logs/solr_slow_requests.log: Operation not permitted
chown: /var/solr/logs/2025_02_25.request.log: Operation not permitted
chown: /var/solr/logs/2025_02_26.request.log: Operation not permitted
chown: /var/solr/logs/2025_02_27.request.log: Operation not permitted
chown: /var/solr/logs/2025_02_28.request.log: Operation not permitted
chown: /var/solr/logs: Operation not permitted
chown: /var/solr/logs: Operation not permitted
chown: /var/solr/log4j2.xml: Operation not permitted
chown: /var/solr: Operation not permitted
chown: /var/solr: Operation not permitted
Would you happen to have any Solr images that do not require root privileges? Any guidance on resolving this issue would be greatly appreciated.
Huh. I'm surprised Solr requires root. @Oliver Bertuch you've been messing around with Kubernetes lately. Any insight on this?
@zbenta you use Kubernetes, right?
Oh, wait. I think @zbenta is (or was?) using OpenStack, not OpenShift, based on these slides: https://osf.io/fjw9t
@Gursahib I would suggest asking at https://groups.google.com/g/dataverse-community if anyone is using OpenShift and has seen this problem with Solr. You'll reach more people there.
Technically, I am using RKE2, another distribution of K8s.
Ah, less strict about root?
Alright, Thank you Philip.
It seems the dv_initializer, which runs the script fix-fs-perms.sh in the image gdcc/configbaker:alpha, needs root.
8a7400b16cbd:/scripts# cat fix-fs-perms.sh
#!/bin/bash
# [INFO]: Fix folder permissions using 'chown' to be writeable by containers not running as root.
set -euo pipefail
if [[ "$(id -un)" != "root" ]]; then
echo "This script must be run as user root (not $(id -un)), otherwise no fix is possible."
fi
DEF_DV_PATH="/dv"
DEF_SOLR_PATH="/var/solr"
DEF_DV_UID="1000"
DEF_SOLR_UID="8983"
function usage() {
echo "Usage: $(basename "$0") (dv|solr|[1-9][0-9]{3,4}) [PATH [PATH [...]]]"
echo ""
echo "You may omit a path when using 'dv' or 'solr' as first argument:"
echo " - 'dv' will default to user $DEF_DV_UID and $DEF_DV_PATH"
echo " - 'solr' will default to user $DEF_SOLR_UID and $DEF_SOLR_PATH"
exit 1
}
# Get a target name or id
TARGET=${1:-help}
# Get the rest of the arguments as paths to apply the fix to
PATHS=( "${@:2}" )
ID=0
case "$TARGET" in
dv)
ID="$DEF_DV_UID"
# If there is no path, add the default for our app image
if [[ ${#PATHS[@]} -eq 0 ]]; then
PATHS=( "$DEF_DV_PATH" )
fi
;;
solr)
ID="$DEF_SOLR_UID"
# In case there is no path, add the default path for Solr images
if [[ ${#PATHS[@]} -eq 0 ]]; then
PATHS=( "$DEF_SOLR_PATH" )
fi
;;
# If there is a digit in the argument, check if this is a valid UID (>= 1000, ...)
*[[:digit:]]* )
echo "$TARGET" | grep -q "^[1-9][0-9]\{3,4\}$" || usage
ID="$TARGET"
;;
*)
usage
;;
esac
# Check that we actually have at least 1 path
if [[ ${#PATHS[@]} -eq 0 ]]; then
usage
fi
# Do what we came for
chown -R "$ID:$ID" "${PATHS[@]}"
But I'll check with the Dataverse community group as well.
And I think the Solr initializer also uses the same image to run the same script.
I suppose you will be running into problems because of copying the Docker Compose stuff 1:1 onto K8s. Depending on your OpenShift environment, you are probably forced to use a Pod Security Admission of "restricted" or "baseline". (K8s docs, OS docs)
You should not use these scripts in init containers on K8s, they are meant for Docker Compose usage.
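For context on the "restricted"/"baseline" enforcement mentioned above: on upstream K8s, Pod Security Admission is typically applied via namespace labels like these (a sketch with a hypothetical namespace name; OpenShift additionally layers its own Security Context Constraints on top):

```yaml
# Namespace-level Pod Security Admission labels (upstream K8s mechanism).
apiVersion: v1
kind: Namespace
metadata:
  name: dataverse          # hypothetical namespace name
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
```

Pods that don't satisfy the profile (e.g. ones that try to chown as root) are rejected at admission time.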
@Gursahib that fix perms script was written by @Oliver Bertuch so please listen to any advice he has! :big_smile:
Are you backing the storage locations with PVCs?
And are you (as you should, if someone is already imposing baseline or restricted on you) setting all the necessary security bits in your Deployment's .spec.template.spec.securityContext and .spec.template.spec.containers[].securityContext?
Another thought is that we had a call with a Red Hat engineer almost a year ago. He's really into Solr so I assume he has it working well on OpenShift. You can see the notes from the call, watch the recording, and check out his https://github.com/smartabyar-smartvillage/smartvillage-operator project, which uses Solr. Maybe we can even pull him in and ask questions.
First of all, I would like to clarify: I am just a technical designer exploring this for our next solution :sweat_smile:
and this OpenShift stuff is new to me.
Oliver Bertuch said:
Are you backing the storage locations with PVCs?
Yes, I have PVCs defined as well, and yes, the pod security is restricted.
@Gursahib are you using a Helm Chart to deploy your Solr or the Solr Operator?
Well, for exploration and evaluation I strongly suggest going with Docker Compose first. In the foreseeable future I will be working on better K8s support via Helm charts and/or operators for Dataverse, but there are other problems to address first.
If you want, you can post your K8s objects here and we can try to take a look.
Well, I have already explored Dataverse on Docker, and of course it works flawlessly there. Now I am tasked with replicating the same in our OpenShift; it's a big learning experience.
I believe I am using neither, as I am defining the service myself.
Here is the YAML I use:
apiVersion: apps/v1
kind: StatefulSet
# kind: Deployment
metadata:
name: solr
labels:
app: solr
spec:
serviceName: solr
replicas: 1
selector:
matchLabels:
app: solr
template:
metadata:
labels:
app: solr
spec:
securityContext:
fsGroup: 1000810000
containers:
- name: solr
image: solr:9.4.1
ports:
- containerPort: 8983
command: ["solr-precreate", "collection1", "/template"]
volumeMounts:
- name: solr-data
mountPath: /var/solr
- name: solr-template
mountPath: /template
securityContext:
runAsUser: 1000810000
runAsGroup: 1000810000
allowPrivilegeEscalation: false
volumes:
- name: solr-data
persistentVolumeClaim:
claimName: solr-data-pvc
- name: solr-template
persistentVolumeClaim:
claimName: solr-template-pvc
---
apiVersion: v1
kind: Service
metadata:
name: solr
labels:
app: solr
spec:
type: ClusterIP
ports:
- name: solr
port: 8983
targetPort: 8983
selector:
app: solr
Have you tried to verify in a running container whether the fsGroup is properly applied to the volumes and the user is in that group?
Yes, I have verified
Alright, let's see some logs. Can you share the error messages Solr gives you?
(I don't see the configbaker image anymore here as init container, as it should be :-) )
BTW - how do you propagate the /template PV? Solr will be having a hard time if there is no template in that folder...
Oliver Bertuch said:
BTW - how do you propagate the /template PV? Solr will be having a hard time if there is no template in that folder...
There is a Solr initializer, which runs this command:
command: ["sh", "-c", "fix-fs-perms.sh solr && cp -a /template/* /solr-template"]
which fails again due to root issues in the fix-fs-perms.sh script
Oliver Bertuch said:
(I don't see the configbaker image anymore here as init container, as it should be :-) )
It's used by the Solr initializer, which is a separate Job of its own:
apiVersion: batch/v1
kind: Job
metadata:
name: solr-initializer
spec:
template:
metadata:
name: solr-initializer
spec:
securityContext:
fsGroup: 1000810000
containers:
- name: solr-initializer
image: gdcc/configbaker:alpha
# command: ["sh", "-c", "mkdir -p /var/solr/data && chown -R 8983:8983 /var/solr && fix-fs-perms.sh solr && cp -a /template/* /solr-template"]
command: ["sh", "-c", "fix-fs-perms.sh solr && cp -a /template/* /solr-template"]
volumeMounts:
- name: solr-data
mountPath: /var/solr
- name: solr-template
mountPath: /solr-template
securityContext:
runAsUser: 1000810000
runAsGroup: 1000810000
allowPrivilegeEscalation: false
restartPolicy: Never
volumes:
- name: solr-data
persistentVolumeClaim:
claimName: solr-data-pvc
- name: solr-template
persistentVolumeClaim:
claimName: solr-template-pvc
Yikes! That sounds like we found the culprit!
is it a big yikes or a small one :sweat_smile:
A) Please use a proper init container within the solr Deployment object, not a separate deployment. The piece in the docker-compose is also only used as an init container. https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
B) As you are on K8s, you don't need to fix the permissions. This is taken care of by K8s for you, using the fsGroup trick that will ensure the volume is ready to go for you. So you won't need the first part of the shell command.
C) I suggest you use an emptyDir type volume for the template volume.
D) I also suggest moving the runAsUser and runAsGroup bits to .spec.template.spec.securityContext, so you only need to set it once for all containers in the pod.
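Putting suggestions A) through D) together, the Solr pod template might look roughly like this. A sketch, not a tested manifest: the UID and PVC names are taken from the YAML posted above, the template volume becomes an emptyDir, and the configbaker runs as a proper init container without the chown step:

```yaml
# Sketch: Solr pod template applying suggestions A-D above.
spec:
  securityContext:            # D) pod-level, applies to all containers
    runAsUser: 1000810000
    runAsGroup: 1000810000
    fsGroup: 1000810000       # B) K8s prepares volume ownership for us
  initContainers:
  - name: solr-initializer    # A) init container, not a separate Job
    image: gdcc/configbaker:alpha
    command: ["sh", "-c", "cp -a /template/* /solr-template"]  # no fix-fs-perms.sh needed
    volumeMounts:
    - name: solr-template
      mountPath: /solr-template
  containers:
  - name: solr
    image: solr:9.4.1
    command: ["solr-precreate", "collection1", "/template"]
    securityContext:
      allowPrivilegeEscalation: false
    volumeMounts:
    - name: solr-data
      mountPath: /var/solr
    - name: solr-template
      mountPath: /template
  volumes:
  - name: solr-data
    persistentVolumeClaim:
      claimName: solr-data-pvc
  - name: solr-template
    emptyDir: {}              # C) no PVC needed for the template
```

Because the init container and the solr container share the emptyDir, the template copied in the init phase is visible to Solr at /template.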
Thank you @Oliver Bertuch for your insights. As I mentioned earlier, this is a new area (we are shifting to DevOps) for both myself and most of our team, so we are continuously learning these concepts over time.
I appreciate your suggestions and will incorporate them accordingly.
Also, it is a common (but not good/best) practice to set fsGroup to something other than runAsGroup. But that's just a minor thing; you can probably ignore it.
@Gursahib once you have this working, would you want to add a page about OpenShift to the Container Guide? I'm happy to help.
@Gursahib I also VERY STRONGLY suggest to set limits for CPU, memory, and storage, drop the capabilities, set liveness and readiness probes, try to use read-only root fs, set a pull image policy.
Also, you seem to be running in the default namespace, which is a bad idea for production but might be OK for testing.
Also, the most common labels are missing, you are using very old standards here...
Here are some helpful tools: https://learnk8s.io/validating-kubernetes-yaml
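As a rough illustration of the hardening suggested above (limits, dropped capabilities, probes, read-only root FS, pull policy), a container spec fragment might look like this. The resource numbers and probe settings are placeholder assumptions, not recommendations:

```yaml
# Sketch of a hardened container spec; all values are illustrative placeholders.
containers:
- name: solr
  image: solr:9.4.1
  imagePullPolicy: IfNotPresent
  resources:
    requests: { cpu: "500m", memory: "1Gi" }
    limits:   { cpu: "2",    memory: "2Gi" }
  securityContext:
    allowPrivilegeEscalation: false
    readOnlyRootFilesystem: true     # requires writable emptyDirs for dirs Solr writes to
    capabilities:
      drop: ["ALL"]
  livenessProbe:
    tcpSocket: { port: 8983 }
    initialDelaySeconds: 30
  readinessProbe:
    tcpSocket: { port: 8983 }
    initialDelaySeconds: 10
```

Tools like the ones linked above can validate manifests against these best practices automatically.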
Philip Durbin said:
Gursahib once you have this working, would you want to add a page about OpenShift to the Container Guide? I'm happy to help.
Sure @Philip Durbin, once I have this running I'll share what I learned and can help document it as well.
Oliver Bertuch said:
Gursahib I also VERY STRONGLY suggest to set limits for CPU, memory, and storage, drop the capabilities, set liveness and readiness probes, try to use read-only root fs, set a pull image policy.
Also, you seem to be running in the default namespace, which is a bad idea for production but might be OK for testing. Also, the most common labels are missing; you are using very old standards here...
The environment i am working in is kind of a playground for testing so nothing to worry about here
Hey guys, I am back, and good news: after following Oliver's guidance, Dataverse is running in the OpenShift environment. The last thing left is the previewers, which are not working properly. I have tried everything but can't seem to find a solution. The previewer register is working fine. I am not sure what I am missing. If this works, then I think we can have complete documentation for deploying on OpenShift.
Here is the error I get from the logs of the previewers pod:
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: can not modify /etc/nginx/conf.d/default.conf (read-only file system?)
/docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
20-envsubst-on-templates.sh: ERROR: /etc/nginx/templates exists, but /etc/nginx/conf.d is not writable
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/init.sh
Using Provider Url=http://previewers-provider:9000 for versions v1.4,betatest
Downloading local copies of remote JavaScript libraries:
./localinstall.sh: line 22: replace_js.sh: Permission denied
./localinstall.sh: line 23: urls_js.txt: Permission denied
./localinstall.sh: line 24: replace_js.sh: No such file or directory
cat: urls_js.txt: No such file or directory
Downloading local copies of remote CSS files:
./localinstall.sh: line 28: replace_css.sh: Permission denied
./localinstall.sh: line 29: urls_css.txt: Permission denied
./localinstall.sh: line 30: replace_css.sh: No such file or directory
cat: urls_css.txt: No such file or directory
./localinstall.sh: line 39: ../urls_js.txt: No such file or directory
./localinstall.sh: line 49: ../urls_css.txt: No such file or directory
Changing example curl commands to use local URLs
sed: couldn't open temporary file ./sedl9f3ws: Permission denied
Done changing example curl commands to use local URLs
Cleaning Up...
rm: cannot remove 'urls_js.txt': No such file or directory
rm: cannot remove 'urls_css.txt': No such file or directory
rm: cannot remove 'replace_js.sh': No such file or directory
rm: cannot remove 'replace_css.sh': No such file or directory
Done
cp: cannot create regular file 'previewers/v1.4/5.2curlcommands.md': Permission denied
cp: cannot create regular file 'previewers/v1.4/6.1curlcommands.md': Permission denied
cp: cannot create regular file 'previewers/v1.4/CONTRIBUTING.md': Permission denied
cp: cannot create regular file 'previewers/v1.4/README.md': Permission denied
cp: cannot create regular file 'previewers/v1.4/pre5.2curlcommands.md': Permission denied
Downloading local copies of remote JavaScript libraries:
./localinstall.sh: line 22: replace_js.sh: Permission denied
./localinstall.sh: line 23: urls_js.txt: Permission denied
./localinstall.sh: line 24: replace_js.sh: No such file or directory
cat: urls_js.txt: No such file or directory
Downloading local copies of remote CSS files:
./localinstall.sh: line 28: replace_css.sh: Permission denied
./localinstall.sh: line 29: urls_css.txt: Permission denied
./localinstall.sh: line 30: replace_css.sh: No such file or directory
cat: urls_css.txt: No such file or directory
./localinstall.sh: line 39: ../urls_js.txt: No such file or directory
./localinstall.sh: line 49: ../urls_css.txt: No such file or directory
Changing example curl commands to use local URLs
sed: couldn't open temporary file ./sedPdDuuk: Permission denied
Done changing example curl commands to use local URLs
Cleaning Up...
rm: cannot remove 'urls_js.txt': No such file or directory
rm: cannot remove 'urls_css.txt': No such file or directory
rm: cannot remove 'replace_js.sh': No such file or directory
Done
rm: cannot remove 'replace_css.sh': No such file or directory
cp: cannot create regular file 'previewers/betatest/5.2curlcommands.md': Permission denied
cp: cannot create regular file 'previewers/betatest/6.1curlcommands.md': Permission denied
cp: cannot create regular file 'previewers/betatest/CONTRIBUTING.md': Permission denied
cp: cannot create regular file 'previewers/betatest/README.md': Permission denied
cp: cannot create regular file 'previewers/betatest/pre5.2curlcommands.md': Permission denied
/docker-entrypoint.sh: Configuration complete; ready for start up
2025/03/24 13:38:43 [warn] 1#1: the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
2025/03/24 13:38:43 [notice] 1#1: using the "epoll" event method
2025/03/24 13:38:43 [notice] 1#1: nginx/1.27.2
2025/03/24 13:38:43 [notice] 1#1: built by gcc 12.2.0 (Debian 12.2.0-14)
2025/03/24 13:38:43 [notice] 1#1: OS: Linux 4.18.0-372.131.1.el8_6.x86_64
2025/03/24 13:38:43 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2025/03/24 13:38:43 [notice] 1#1: start worker processes
And here is the YAML config:
# kustomize/base/register-previewers.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: previewers-provider
labels:
app: previewers-provider
spec:
replicas: 1
selector:
matchLabels:
app: previewers-provider
template:
metadata:
labels:
app: previewers-provider
spec:
securityContext:
fsGroup: 1000810000
containers:
- name: previewers-provider
image: dockerhub/trivadis/dataverse-previewers-provider:latest
securityContext:
runAsUser: 1000810000
runAsGroup: 1000810000
allowPrivilegeEscalation: false
readOnlyRootFilesystem: false # Allow writing to mounted dirs
ports:
- containerPort: 9000
env:
- name: NGINX_HTTP_PORT
value: "9000"
- name: PREVIEWERS_PROVIDER_URL
value: "http://previewers-provider:9000"
- name: VERSIONS
value: "v1.4,betatest"
volumeMounts:
- name: nginx-config
mountPath: /etc/nginx/conf.d/default.conf
subPath: default.conf
- name: previewers-storage
mountPath: /usr/share/nginx/html/previewers
- name: cache-volume
mountPath: /var/cache/nginx
- name: run-volume
mountPath: /var/run
volumes:
- name: nginx-config
configMap:
name: nginx-config # Using subPath for correct mount
- name: previewers-storage
emptyDir: {} # Allowing writing
- name: cache-volume
emptyDir: {}
- name: run-volume
emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
name: previewers-provider
labels:
app: previewers-provider
spec:
type: ClusterIP
ports:
- port: 9000
targetPort: 9000
selector:
app: previewers-provider
---
apiVersion: v1
kind: ConfigMap
metadata:
name: nginx-config
data:
default.conf: |
server {
listen 9000;
server_name localhost;
# NGINX configuration to handle the requests
location /previewers/ {
root /usr/share/nginx/html;
try_files $uri $uri/ =404;
}
}
That's amazing that Dataverse itself is working fine on OpenShift. :tada:
@Gursahib do you want to add something to the Container Guide about this? :big_smile:
As for the previewers, I'm not sure how to test this. Maybe using OpenShift Local? :thinking:
It looks like there's also something called MicroShift: https://developers.redhat.com/articles/2025/02/20/why-developers-should-use-microshift
Philip Durbin said:
That's amazing that Dataverse itself is working fine on OpenShift. :tada:
Gursahib do you want to add something to the Container Guide about this? :big_smile:
Sure @Philip Durbin, I can contribute a whole section on OpenShift, but right now I want to deliver this to the researchers I am working with. And I want it to be working 100%. I promise you I'll add a full-featured document on OpenShift installation.
Fantastic! No rush!
Yes, let's get this delivered to your researchers first. :sweat_smile:
Philip Durbin said:
It looks like there's also something called MicroShift: https://developers.redhat.com/articles/2025/02/20/why-developers-should-use-microshift
I have heard of both systems, and I am pretty sure these local deployments will work without a doubt, as there will be fewer security constraints. Still, I would give them a try if nothing else works.
@Gursahib as a starting point, could you please add an issue to https://github.com/IQSS/dataverse/issues about the trouble you're having with previewers in OpenShift?
@Gursahib these Trivadis Previewer Containers are not supported by the community. I would neither call them production ready nor well documented. Also, I wouldn't endorse using them at this point.
That said, judging from looking at the localinstall.sh script inside the container (the sources are nowhere to be found), the script wants to write to /app, which isn't backed by a volume in your Deployment.
Then what image do you suggest @Oliver Bertuch ?
I picked the Trivadis image from the demo Docker YAML.
Right, we did recently add those Trivadis images to the "demo or eval" use case in #11181.
We've talked about building our own images but for now the Trivadis images are what we have.
@Oliver Bertuch you're looking for the Dockerfile? It's here: https://github.com/gdcc/dataverse-previewers/compare/develop...gschmutz:dataverse-previewers:develop
Thanks @Philip Durbin for the link! I searched under the Trivadis repos (there is a previewer register container there) but came up empty-handed.
Indeed, the Trivadis previewer containers are currently the only thing that's out there @Gursahib, so they are probably fine for your use case even though they are not production ready.
It's alright, I will see what I can do. If it doesn't work, then that's how it is :upside_down:.
Thank you for the help @Oliver Bertuch and @Philip Durbin
I'll see when I can contribute the OpenShift deployment docs. Probably starting next week.
Fantastic! :heart:
BTW, I added a volume for /app as well, and as the errors show, the directory indeed seems to be empty now:
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: can not modify /etc/nginx/conf.d/default.conf (read-only file system?)
/docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
20-envsubst-on-templates.sh: ERROR: /etc/nginx/templates exists, but /etc/nginx/conf.d is not writable
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/init.sh
Using Provider Url=http://previewers-provider:9000 for versions v1.4,betatest
/docker-entrypoint.d/init.sh: line 15: ./localinstall.sh: No such file or directory
cp: cannot stat '*.md': No such file or directory
/docker-entrypoint.d/init.sh: line 19: ./localinstall.sh: No such file or directory
cp: cannot stat '*.md': No such file or directory
cp: cannot stat './previewers': No such file or directory
The errors are now limited to this.
Ooops, yeah obviously mounting it there will make the directory empty...
Let me take a look if we can trick the script into cooperating or if it needs to be extended
Urgh this is hard to do... The entrypoint changes the folder to the one where the previewers are located within the container.
And executes commands there
That's not gonna be easy to work around. You'd have to make a fork and create your own image based on what's there...
Would you be able to do a code contribution?
At some point I will look into this, but don't have the bandwidth at the moment.
exactly, this is quite challenging.
Oliver Bertuch said:
Would you be able to do a code contribution?
I will see if I can achieve it on my side; if so, sure. It will take time though.
Don't worry, it's not breaking Dataverse; it's just one feature which needs some polishing.
Anyways thank you for your help @Oliver Bertuch
Yeah well there's so many things that still require tweaking for a smooth container / K8s experience...
I'd be happy to receive feedback here on what your pain points are.
I'm about to start working on our migration from Dataverse 4.20 on K8s to the latest version and want to get some stuff done in a fork and upstream the changes later. Maybe some of these pain points align :smile_cat:
Where should the code contribution go? In https://github.com/gdcc/dataverse-previewers ? We'll want the Dockerfile to live there, eventually?
Yeah, that would be my guess (for both questions).
@Gursahib do you want to make a PR against that repo?
Oliver Bertuch said:
I'm about to start working on our migration from Dataverse 4.20 on K8s to the latest version and want to get some stuff done in a fork and upstream the changes later. Maybe some of these pain points align :smile_cat:
Haha, sure. I had to make a lot of changes in the components along with Dataverse, but here are a few major ones:
The classic example for point 2 is SMTP on port 25: since port 25 is not allowed, the connection request times out whenever an admin mail is created. During this timeout, Dataverse freezes on operations like creating a new dataverse or dataset and publishing a dataset/dataverse. Basically anything which needs to send an email notification.
I used port 2525, but the default script uses 25, so I had to add a variable for that too.
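For the record, in recent Dataverse container images the mail host/port can usually be overridden via MPCONFIG environment variables rather than editing scripts. This is a sketch under that assumption; the exact setting names (`DATAVERSE_MAIL_MTA_*`) and values should be verified against the docs for your Dataverse release:

```yaml
# Hypothetical env overrides for the dataverse container (MPCONFIG mapping);
# verify the exact variable names against your Dataverse release's docs.
env:
- name: DATAVERSE_MAIL_MTA_HOST
  value: "smtp.example.org"   # placeholder hostname
- name: DATAVERSE_MAIL_MTA_PORT
  value: "2525"               # non-privileged port allowed by the cluster
```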
These are a few pain points I can think of, and the major one is previewer functionality.
Ad 1) I've been using init containers for that reason and it's also why the base image contains wait4x
Oliver Bertuch said:
Ad 1) I've been using init containers for that reason and it's also why the base image contains wait4x
Guess I have a lot to learn.
Philip Durbin said:
Gursahib do you want to make a PR against that repo?
If I succeed, sure.
Here's an example how this worked in my old K8s stuff: https://github.com/gdcc/dataverse-kubernetes/blob/413cd89cc51ce4768dea831f0eb3280da60635f8/k8s/dataverse/deployment.yaml#L85
(Needs to be adapted)
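Since the base image ships wait4x, a dependency-ordering init container could look something like this sketch. Assumptions: a `postgres` Service reachable on 5432, and an image name that actually contains the wait4x binary (the tag below is a placeholder):

```yaml
# Sketch: init container that blocks until PostgreSQL accepts TCP connections,
# so the main dataverse container only starts once the database is reachable.
initContainers:
- name: wait-for-db
  image: gdcc/base:alpha          # assumption: any image that ships wait4x
  command: ["wait4x", "tcp", "postgres:5432", "--timeout", "2m"]
```

The same pattern works for Solr (`wait4x tcp solr:8983`) if boot ordering matters.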
Gursahib said:
These are few pain points I can think of and the major one is previewer functionality
It sounds like we're on the right track by focusing on previewers. :big_smile:
There's a dedicated topic for it, by the way: #containers > enable previewers
Bah. No one likes dedicated topics. </sarcasm>
ha
we all miss IRC :big_smile:
I don't. But yes please let's rope in a discussion around IRC into this thread... The good ol' days :smiling_devil: :see_no_evil: :crazy:
Alright guys, it's getting late on my side of the world. I'll come back with news soon, either good or bad, idk. Enjoy your day.
Have fun!
bye!
Last updated: Oct 30 2025 at 05:14 UTC