The ARM64 builds using QEMU on AMD64 GitHub runners are a pain...
@Don Sizemore @Philip Durbin how about we use an AWS T4g or M6g machine as a remote builder?
(Not as a Github runner!)
If you look at https://github.com/GlobalDataverseCommunityConsortium/dataverse-ansible/blob/2f605efb9ac22614bce7cc4b9acf718ba2d66b3b/ec2/ec2-create-instance.sh#L213C1-L213C315 you can see how we spin up images:
INSTANCE_ID=$(aws $PROFILE ec2 run-instances --image-id $AMI_ID --security-groups $AWS_SG $TAGARG --count 1 --instance-type $SIZE --key-name $KEY_NAME --query 'Instances[0].InstanceId' --block-device-mappings '[ { "DeviceName": "/dev/sda1", "Ebs": { "DeleteOnTermination": true, "VolumeSize": 20 } } ]' | tr -d \")
I see!
So maybe you could give us an AMI id, etc.
It looks like you folks already gave me AWS access back in the day...
oh dear!
The account is in the name of one Matthew Dunlap...
Does that seem right????
Hmm, we should update that. Thanks.
OK I created https://us-west-2.console.aws.amazon.com/ec2/home?region=us-west-2#InstanceDetails:instanceId=i-0d27e1b1fc1c0a8db. Named it "github-docker-buildx-arm64"
I'm heading out now, will play with it tomorrow morning :smile:
Wow this is really hacky... I need to manually give DMP a buildx instance configuration to make it pick up the remote builder...
I probably need to develop another extension for the plugin to push nodes config from the outside...
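For reference, a minimal sketch of how a remote ARM64 machine can be registered as a buildx builder over SSH. The builder name and SSH endpoint are hypothetical placeholders, and how DMP picks the builder up afterwards is not covered here. The function only prints the command so it can be reviewed before running it.

```shell
# Sketch only: BUILDER_NAME and REMOTE_HOST are hypothetical placeholders,
# not the values used on the real builder.
BUILDER_NAME="${BUILDER_NAME:-remote-arm64}"
REMOTE_HOST="${REMOTE_HOST:-ssh://builder@arm64-builder.example.org}"

buildx_create_cmd() {
  # Print the registration command instead of executing it.
  printf 'docker buildx create --name %s --driver docker-container --platform linux/arm64 %s\n' \
    "$BUILDER_NAME" "$REMOTE_HOST"
}

# To actually register the builder and select it:
#   eval "$(buildx_create_cmd)" && docker buildx use "$BUILDER_NAME"
```

This assumes passwordless SSH access to the remote host and a Docker daemon running there.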
It takes about 500 MB of RAM peak when the base image starts the Payara instance. Almost 2 CPU cores are used
Someone definitely needs to explain to me how the AWS flavor says t4g.micro has 1 GB of RAM and it ends up with 815 MB
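A possible explanation, not verified on this instance: the instance size is the nominal RAM, while tools like `free` report `MemTotal` from `/proc/meminfo`, which is what remains usable after the kernel image and boot-time reservations (for example a crash kernel, if one is configured) are subtracted. A small sketch to read that value in MiB, so it can be compared against the nominal 1024 MiB:

```shell
mem_total_mib() {
  # $1: path to a meminfo-style file; defaults to /proc/meminfo.
  # The "kB" unit in /proc/meminfo is 1024 bytes, so dividing by 1024 gives MiB.
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

Running `mem_total_mib` on the builder and `dmesg | grep -i reserved` next to it might show where the difference goes.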
Oh wow it looks like we can even scale up one more and get a machine for free till 2024-12-31: https://aws.amazon.com/ec2/faqs//#t4g-instances
I can definitely say that this is MUCH faster
Let me try injecting this into a CI config
Seems like we can cut the runtime at least in half:
image.png
(Neither v6.2 nor develop has the injected builder config: the feature branch isn't merged yet, and for the tag this will obviously never happen. For scheduled rebuilds I don't care about build time in the middle of the night anyway. This only matters for pushes to develop. The v6.0 and v6.1 base image builds are skipped on purpose.)
I changed the setup on purpose to hit the builder with 3 jobs at once...
That made it go cry for resources and two of three jobs failed (OOM killed the processes). So let's see what happens if we double the RAM
With a t4g.small this works perfectly for 3 parallel jobs!
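The numbers in this thread roughly add up. A back-of-the-envelope sketch, assuming every job hits its ~500 MB peak at the same time and nothing else on the machine needs RAM (both are simplifications):

```shell
max_parallel_jobs() {
  # $1: usable RAM in MB, $2: peak RAM per job in MB.
  # Integer division: how many jobs fit if all of them peak at once.
  echo $(( $1 / $2 ))
}

# t4g.micro with the 815 MB seen above and ~500 MB per job:
max_parallel_jobs 815 500    # prints 1
# t4g.small at its nominal 2 GiB (the usable amount will be somewhat lower):
max_parallel_jobs 2048 500   # prints 4
```

One job fitting on the micro matches two of the three jobs getting OOM-killed, and the small leaves headroom for three.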
We can still use the QEMU variant for scheduled builds (because we don't need to care about build time on Sunday nights), but there is a chance that multiple PRs come in at the same time and require the builder
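A rough sketch of that split as a shell step in the workflow. `GITHUB_EVENT_NAME` is set by GitHub Actions; the `qemu` and `remote` labels are made up for illustration, and wiring them into the actual build is left out.

```shell
select_builder() {
  # $1: event name; falls back to GITHUB_EVENT_NAME as set by GitHub Actions.
  case "${1:-${GITHUB_EVENT_NAME:-}}" in
    schedule) echo "qemu" ;;    # nightly/weekly rebuilds: build time doesn't matter
    *)        echo "remote" ;;  # pushes and PRs: use the ARM64 remote builder
  esac
}
```

A later step could then branch on `$(select_builder)` to decide whether to inject the remote builder config.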
Twice as fast? Great! :tada:
Even a bit more than twice :smile_cat:
TODO for the builder machine: add a daily cron job that stops the builder process and deletes the volumes (docker volume prune -a), as we don't have much disk space
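A minimal sketch of such a cleanup script, under a few assumptions: the BuildKit containers follow the `buildx_buildkit_` naming that the docker-container driver uses, Docker on the machine is 23.0 or newer (where `docker volume prune` has `-a`), and buildx recreates the container on the next build. The containers are removed rather than just stopped, because a volume that is still attached to a stopped container does not count as unused. `-f` skips the confirmation prompt, which cron cannot answer.

```shell
run() {
  # Print the command when DRY_RUN is set, execute it otherwise.
  if [ -n "${DRY_RUN:-}" ]; then echo "+ $*"; else "$@"; fi
}

cleanup_builder() {
  # Remove BuildKit containers (running or stopped) so their volumes become unused.
  run sh -c 'docker ps -aq --filter name=buildx_buildkit_ | xargs -r docker rm -f'
  # Delete all unused volumes, named ones included, without prompting.
  run docker volume prune -a -f
}
```

Saved as a script that calls `cleanup_builder` (the path `/usr/local/bin/cleanup-builder.sh` is hypothetical), a crontab entry like `0 3 * * * /usr/local/bin/cleanup-builder.sh` would run it every night at 03:00. `DRY_RUN=1` shows what would be executed.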
https://github.blog/2024-04-26-github-actions-arm64-and-the-future-of-automotive-software-development/ seems related at least. I didn't really dig into the details.
It's going on and on about how they now provide ARM based runners and GPU enabled runners in lots of industry/sales buzzwords.
That was my sense too. :grinning:
It's kind of unrelated to us unless we're willing to make the pipeline a lot more complex to ship the images built on ARM64 and glue them together manually
It would be an awesome feature to have a sidecar runner available that you can interact with from a job
Right. Sounds like we should stay the course. Sorry for the noise!
Last updated: Oct 30 2025 at 05:14 UTC