Stream: troubleshooting

Topic: S3 Errors after upgrade


Deirdre Kirmis (Dec 19 2024 at 14:03):

I attempted to do normal patching today on our production dataverse, and now I'm seeing the following errors in the log:

[2024-12-19T13:52:17.943+0000] [Payara 6.2024.7] [WARNING] [] [edu.harvard.iq.dataverse.dataaccess.S3AccessIO] [tid: _ThreadID=95 _ThreadName=http-thread-pool::jk-connector(5)] [timeMillis: 1734616337943] [levelValue: 900] [[
Caught an AmazonClientException in S3AccessIO.savePathAsAux(): User: arn:aws:sts::141031099449:assumed-role/AmazonSSMRoleForInstancesQuickSetup/i-098a0e852528a90ec is not authorized to perform: s3:PutObject on resource: "arn:aws:s3:::<dataset ID>/export_schema.org.cached" because no identity-based policy allows the s3:PutObject action (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 74AKGFFAECW74WVA; S3 Extended Request ID: cNxNGnkZz/jjXoBuQqL0osMOs5ybNqItPkScpz5k9ZAutveyqngvN7mjUHnTQzNhVD/dmk8UChw=; Proxy: null)]]

[2024-12-19T13:52:16.860+0000] [Payara 6.2024.7] [WARNING] [] [edu.harvard.iq.dataverse.dataaccess.S3AccessIO] [tid: _ThreadID=93 _ThreadName=http-thread-pool::jk-connector(3)] [timeMillis: 1734616336860] [levelValue: 900] [[
Caught an AmazonClientException in S3AccessIO.isAuxObjectCached: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: Q4J9MBT25S4NS90Z; S3 Extended Request ID: 5HsBIjXxuTsjOGZz7duteNynYJo9j5idbyHBLObZ5RVU2aPI8I+4BqyyVO7lUFFeqr4MN14faSc=; Proxy: null)]]

I restored the volume to before patching, but the error persists. I did not change anything related to my access keys or policies on the user having access. I still have my CORS policy in place, and the policy on the IAM user to allow basically everything on the s3 bucket. Access keys are the same and correct.

Any ideas what I'm missing?
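
The first log entry already names the identity that S3 rejected. A minimal Python sketch, assuming the message keeps the wording shown in that entry, pulls out the caller, action, and resource. The function `describe_denial` and its field names are illustrative only; they are not part of Dataverse or any AWS SDK.

```python
import re

# Matches the identity-based AccessDenied wording seen in the first log entry.
DENIED = re.compile(
    r'User: (?P<principal>\S+) is not authorized to perform: '
    r'(?P<action>\S+) on resource: "(?P<resource>[^"]+)"'
)


def describe_denial(message):
    """Return the caller, action and resource named in the message, or None."""
    match = DENIED.search(message)
    if match is None:
        return None
    info = match.groupdict()
    # The last colon-separated ARN field looks like
    # "assumed-role/<role name>/<session name>" or "user/<user name>".
    resource_part = info["principal"].split(":", 5)[-1]
    kind, _, rest = resource_part.partition("/")
    info["principal_kind"] = kind
    info["principal_name"] = rest.split("/")[0]
    return info


# Message text copied from the first log entry in this thread.
sample = (
    'User: arn:aws:sts::141031099449:assumed-role/'
    'AmazonSSMRoleForInstancesQuickSetup/i-098a0e852528a90ec '
    'is not authorized to perform: s3:PutObject on resource: '
    '"arn:aws:s3:::<dataset ID>/export_schema.org.cached" '
    'because no identity-based policy allows the s3:PutObject action'
)

info = describe_denial(sample)
print(info["principal_kind"], info["principal_name"], info["action"])
```

For this message the caller is an assumed role (the role attached to the EC2 instance) rather than an IAM user, which is worth comparing against the identity the access keys belong to.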

Philip Durbin 🚀 (Dec 19 2024 at 14:12):

Huh, my first thought is CORS policy but it sounds like you checked that already. :thinking:

Deirdre Kirmis (Dec 19 2024 at 14:14):

so strange .. this is happening on all of our installations .. the server can't see the bucket .. can't upload any files .. any idea why that would suddenly occur?

Deirdre Kirmis (Dec 19 2024 at 14:16):

this is our CORS policy
[
    {
        "AllowedHeaders": [
            "*"
        ],
        "AllowedMethods": [
            "GET",
            "HEAD",
            "PUT"
        ],
        "AllowedOrigins": [
            "*"
        ],
        "ExposeHeaders": [
            "Content-Range",
            "ETag",
            "Content-Encoding",
            "Accept-Ranges"
        ],
        "MaxAgeSeconds": 0
    }
]

Philip Durbin 🚀 (Dec 19 2024 at 14:17):

Can you go over what changed? It sounds like the version of Dataverse itself didn't change but you patched the underlying OS? Linux, I assume. Which distro?

Deirdre Kirmis (Dec 19 2024 at 14:21):

i patched the OS, but then restored back to a snapshot of the volume, so now i'm running on that .. so the OS is not upgraded

Deirdre Kirmis (Dec 19 2024 at 14:21):

rocky linux 8

Philip Durbin 🚀 (Dec 19 2024 at 14:25):

Strange that it's causing S3 trouble.

Deirdre Kirmis (Dec 19 2024 at 14:26):

i'm still using the aws credentials file for access .. could that be an issue?

Philip Durbin 🚀 (Dec 19 2024 at 14:27):

I mean, that should depend on the version of Dataverse, I think, which hasn't changed.

Philip Durbin 🚀 (Dec 19 2024 at 14:27):

Is it worth testing outside of Dataverse?

Deirdre Kirmis (Dec 19 2024 at 14:27):

on prod we are running v6.4
on qa i did upgrade to v6.5 recently

Philip Durbin 🚀 (Dec 19 2024 at 14:28):

Which one is having S3 trouble?

Deirdre Kirmis (Dec 19 2024 at 14:28):

both

Philip Durbin 🚀 (Dec 19 2024 at 14:28):

D'oh! :doh:

Deirdre Kirmis (Dec 19 2024 at 14:30):

yea fun week .. had some other unfun issues earlier

Deirdre Kirmis (Dec 19 2024 at 14:38):

when i use the cli to list the bucket contents it works

Deirdre Kirmis (Dec 19 2024 at 14:43):

well it does on the prod server, but not on qa .. i get "cannot import name 'PROTOCOL_TLS'"

Philip Durbin 🚀 (Dec 19 2024 at 14:44):

I wouldn't trust myself to test via cli but I'm glad you're trying!

Don Sizemore (Dec 19 2024 at 14:44):

what version of Dataverse? with 5.14+ Dataverse will prefer RBAC even if you give it access keys.

Don Sizemore (Dec 19 2024 at 14:45):

ah, v6.5. I have 6.5 running on two instances with S3 storage, one using RBAC, one still using keys.

Deirdre Kirmis (Dec 19 2024 at 14:45):

v6.4 and v6.5 .. but these have been working just fine for quite some time .. until today

Deirdre Kirmis (Dec 19 2024 at 14:45):

they are both still using keys

Don Sizemore (Dec 19 2024 at 14:46):

is there any AWS role assigned to the EC2 instance?

Don Sizemore (Dec 19 2024 at 14:46):

I have an open issue requesting the ability to disable the RBAC preference within Dataverse, but if there's a role assigned to the EC2 instance, that sets the S3 permission policy.

Deirdre Kirmis (Dec 19 2024 at 14:47):

we use aws backup so it has the "AmazonSSMRoleForInstancesQuickSetup" assigned to it

Deirdre Kirmis (Dec 19 2024 at 14:47):

it has been there for at least a year

Don Sizemore (Dec 19 2024 at 14:48):

what I did was create a meta-role for the instance (in my case it had the AWS SSM role assigned previously). the parent role included the SSM policy and an "inline" custom policy into which I copy-pasted the S3 permissions.

Deirdre Kirmis (Dec 19 2024 at 14:48):

sorry i mean systems manager

Don Sizemore (Dec 19 2024 at 14:48):

I'm not sure why things had been working after you upgraded to v5.14 - that's when mine broke.

Deirdre Kirmis (Dec 19 2024 at 14:50):

i'm not sure why we even have the systems manager role on this instance .. i'm not managing any of the rocky linux instances with SSM .. so i need to create a new role and add the SSM policy and the inline policy with the current s3 permissions that are on the IAM user?

Deirdre Kirmis (Dec 19 2024 at 14:50):

and attach that to the instance instead of the SSM one?

Don Sizemore (Dec 19 2024 at 14:52):

if you're not using SSM, you could create a role for the instance's S3 access and assign the bucket policy to that role
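
The inline policy described here is an ordinary IAM policy document. A rough Python sketch of its shape follows; the bucket name and the action list are placeholders (only s3:PutObject appears in the logs of this thread), and in practice the statements would be copied from the policy already attached to the IAM user, as Don describes. The helper `s3_inline_policy` is hypothetical.

```python
import json


def s3_inline_policy(bucket, actions):
    """Build an IAM policy document allowing `actions` on `bucket` and its objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(actions),
                "Resource": [
                    # Bucket ARN, used by bucket-level actions such as s3:ListBucket.
                    f"arn:aws:s3:::{bucket}",
                    # Object ARNs, used by object-level actions such as s3:PutObject.
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }


# Placeholder bucket name and actions; replace with the permissions
# currently granted to the IAM user.
policy = s3_inline_policy(
    "example-dataverse-bucket",
    ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket"],
)
print(json.dumps(policy, indent=2))
```

The printed JSON is what would be pasted into the role's inline policy editor. Listing both the bucket ARN and the object ARN pattern matters because bucket-level and object-level actions are evaluated against different resources.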

Don Sizemore (Dec 19 2024 at 14:53):

I'm still curious how/why you're just hitting this now. Does anyone else modify your AWS configuration?

Deirdre Kirmis (Dec 19 2024 at 14:55):

i wonder if it has to do with the recent upgrade of SSM .. if somewhere along the lines it assigned the instance policy to all the instances for some reason .. i have a distribution, host management, and patching config but none of them include any of the rocky linux instances

Deirdre Kirmis (Dec 19 2024 at 14:56):

i'm the only one that messes with our linux configs .. so it was likely something I did =)

Don Sizemore (Dec 19 2024 at 14:58):

I'm seeing now that I need to submit a pull request to flesh out the docs mentioned in https://github.com/IQSS/dataverse/issues/10707 and open a second issue for that disableRBAC boolean

Don Sizemore (Dec 19 2024 at 14:58):

but Dataverse will definitely prefer S3 permissions specified in the EC2 instance's assigned role over any access keys you offer it.
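
The behaviour Don describes can be pictured as an ordered list of credential sources in which the first one that yields credentials wins. The Python below is a toy model of that idea only; it is not Dataverse's or the AWS SDK's code, the ordering is taken from Don's description, and the user name "dataverse-s3-user" is made up.

```python
from typing import Callable, Optional, Sequence, Tuple

Credentials = Tuple[str, str]  # (source label, identity name)
Provider = Callable[[], Optional[Credentials]]


def instance_role(role_name: Optional[str]) -> Provider:
    """Yields credentials only when a role is attached to the instance."""
    return lambda: ("instance-role", role_name) if role_name else None


def access_keys(user_name: Optional[str]) -> Provider:
    """Yields credentials only when access keys are configured."""
    return lambda: ("access-keys", user_name) if user_name else None


def resolve(providers: Sequence[Provider]) -> Optional[Credentials]:
    """Return credentials from the first provider that has any."""
    for provider in providers:
        credentials = provider()
        if credentials is not None:
            return credentials
    return None


# Assumed order: the instance role is consulted before the access keys.
with_role = resolve([
    instance_role("AmazonSSMRoleForInstancesQuickSetup"),
    access_keys("dataverse-s3-user"),
])
without_role = resolve([
    instance_role(None),
    access_keys("dataverse-s3-user"),
])
print(with_role)
print(without_role)
```

Under this model, attaching any role to the instance changes which identity S3 sees, even though the access keys themselves are untouched, so the role needs the S3 permissions.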

Deirdre Kirmis (Dec 19 2024 at 15:07):

That was it!!!! OMG i'm breathing again. DON .. thank you!!!!

Deirdre Kirmis (Dec 19 2024 at 15:08):

Now to figure out why it changed ...

Deirdre Kirmis (Dec 19 2024 at 15:08):

Thanks to you both for responding so quickly to this thread.

Philip Durbin 🚀 (Dec 19 2024 at 15:08):

Really I was just stalling you until Don showed up. :sweat_smile:

Deirdre Kirmis (Dec 19 2024 at 15:09):

haha .. well thanks for that .. it calmed me down!

that change happened on both AWS accounts, so somehow the SSM role got assigned to the instance even though I'm not including it anywhere in the configs

Don Sizemore (Dec 19 2024 at 15:10):

I'm glad that fixed it! I'm submitting a PR for the docs now, that got buried in my to-do list.

Philip Durbin 🚀 (Dec 19 2024 at 15:11):

@Deirdre Kirmis maybe you can review the PR Don makes to improve the docs. Or you could make the PR yourself if you want!

Deirdre Kirmis (Dec 19 2024 at 15:11):

Happy to help with anything! LMK and I'll for sure look at it.

Deirdre Kirmis (Dec 19 2024 at 15:18):

Thank you, thank you, thank you! :tada: :tada:

Deirdre Kirmis (Dec 19 2024 at 15:24):

I just read through that PR and it seems pretty clear .. I just hadn't seen it. :expressionless: Sorry about that!

Philip Durbin 🚀 (Dec 19 2024 at 15:25):

The issue, you mean? https://github.com/IQSS/dataverse/issues/10707

Deirdre Kirmis (Dec 19 2024 at 15:27):

Yes, but now i read again i see you meant a PR for the docs! Sorry, my brain is kind of fried.

Philip Durbin 🚀 (Dec 19 2024 at 15:28):

No worries. You probably got up early today.

Deirdre Kirmis (Dec 19 2024 at 15:30):

suuuuuper early .. patching thursday! :big_smile:

Philip Durbin 🚀 (Dec 19 2024 at 15:30):

patching stuff, breaking stuff, fixing stuff

Deirdre Kirmis (Dec 19 2024 at 15:33):

haha that's my cycle! :sweat_smile:

Don Sizemore (Dec 19 2024 at 15:41):

NOW there's a PR: https://github.com/IQSS/dataverse/pull/11111/files
and I opened https://github.com/IQSS/dataverse/issues/11112

Philip Durbin 🚀 (Dec 19 2024 at 15:55):

@Deirdre Kirmis do you see an invite at https://github.com/IQSS ? I'm trying to add you to https://github.com/orgs/IQSS/teams/dataverse-readonly so I can request a review from you. :big_smile:

Deirdre Kirmis (Dec 19 2024 at 16:17):

yes .. i'm in!

Philip Durbin 🚀 (Dec 19 2024 at 16:34):

Review requested!


Last updated: Oct 30 2025 at 06:21 UTC