Stream: containers

Topic: gitattributes files and windows auto conversion


view this post on Zulip Juan Pablo Tosca Villanueva (Sep 13 2023 at 19:23):

Sorry but I was in a couple of workshops back-to-back today. I wanted to create the thread as @Philip Durbin suggested on #9905 @Oliver Bertuch, this is my last response on that thread:

I think that _* text=auto_ just defaults to the default behavior :rolling_on_the_floor_laughing: where people have to set the configuration on their machine with _git config --global core.autocrlf input_

Both solutions work, but the idea of this change was to avoid every person who pulls the project having to change their configuration to input or false to turn off the conversion. That is an additional step that can be omitted. Also, if someone pulled the project before reading the documentation (:rolling_eyes:) and changes the configuration afterwards, the files will already have been converted.

This doesn't affect Linux users since the conversion only happens on Windows and Docker runs a Linux instance. So basically on Linux it is linux (project) -> linux (dev) -> linux (docker), and on Windows it is linux (project) -> windows (dev) -> linux (docker).

Let me know your thoughts. As for additional files, we can add them as we find conflicts; mostly .sh files are the ones that give trouble since they are executed in Docker (same logic: linux (project) -> windows (dev) -> linux (docker)). I think that if we have more resources executed by the server, for example a JS file executed by Node.js, then they would need this too.

If I or someone else find more resources that need to be added to this file, we can add them to the .gitattributes

What do you guys think about it?

view this post on Zulip Philip Durbin 🚀 (Sep 13 2023 at 20:51):

Ideally the solution would "just work". :big_smile:

view this post on Zulip Philip Durbin 🚀 (Sep 13 2023 at 20:51):

With no extra config.

view this post on Zulip Philip Durbin 🚀 (Sep 13 2023 at 20:59):

The maintainer of libgit2 left a related comment back in 2018. I just copied it into the pull request: https://github.com/IQSS/dataverse/pull/9905#issuecomment-1718309113

view this post on Zulip Oliver Bertuch (Sep 13 2023 at 21:05):

text=auto actually enables autocrlf for all text based files. The setting is switched off by default, setting it this way is IMHO the way to go.

view this post on Zulip Oliver Bertuch (Sep 13 2023 at 21:07):

I don't see a necessity right now where we need to explicitly define a different behavior for some file/type.

view this post on Zulip Oliver Bertuch (Sep 13 2023 at 21:09):

Quoting from https://www.git-scm.com/docs/gitattributes on * text=auto:

If you want to ensure that text files that any contributor introduces to the repository have their line endings normalized, you can set the text attribute to "auto" for all files.

view this post on Zulip Juan Pablo Tosca Villanueva (Sep 13 2023 at 21:56):

Probably I am getting a bit confused but you may have a better understanding. From what I see, the normalization is the process of transforming from LF (linux) to CRLF (windows) which would be ok if this was an application running on the host (traditional developer setup).

With * text=auto we "ensure that text files that any contributor introduces to the repository have their line endings normalized" which would cause the normalization from LF to CRLF and would cause issues when the normalized files try to execute on Docker's Linux.

I will do some testing tomorrow morning (or maybe tonight) with the * text=auto setting and post the results, but I think this is the opposite of what the application requires to work in the container.

view this post on Zulip Oliver Bertuch (Sep 14 2023 at 06:18):

You are right @Juan Pablo Tosca Villanueva - I did not think as far, the CRLF will be a problem when building the Linux containers on Windows! Good catch! :heart:

I dug some more into this and here's a thought: how about we go for this (stolen from here):

* text=auto eol=lf

# Windows batch scripts need CRLF
*.{cmd,[cC][mM][dD]} text eol=crlf
*.{bat,[bB][aA][tT]} text eol=crlf

This is BTW also recommended by VSCode here.
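A quick way to sanity-check a proposal like the one above is to ask Git which attributes it actually resolves for a given path. The sketch below wraps `git check-attr`; the function name `show_eol_attrs` is made up for illustration, and it assumes it is run inside a Git work tree that contains the `.gitattributes` file. This seems especially worthwhile for the brace-style patterns, since `.gitattributes` uses gitignore-style matching and, as far as I know, that does not include shell brace expansion.

```shell
# show_eol_attrs PATH...: print the text/eol attributes Git resolves for each path.
# The paths do not have to exist; Git only matches them against .gitattributes.
show_eol_attrs() {
    git check-attr text eol -- "$@"
}
```

Calling it with a shell script path and a batch file path prints one `path: attribute: value` line per attribute, which makes it easy to see whether a pattern matched as intended.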

We probably should make our container build more bullet proof, too. Someone creating a new script or other file to be included in the container on Windows would create it with CRLF as it's the Windows default. (Not sure IDEs would resolve that in some magical way for you - IntelliJ has default as "system dependent".)
So maybe we need to run a dos2unix just to be sure during container builds (base, app, configbaker). Got the inspiration from here.

On a related note: @Philip Durbin we probably should cleanup CRLF files in the repo. Otherwise, they will be autoconverted with the lines above and show up as edited files in the checkout, probably causing great confusion. Yes, we could avoid that by doing the conversion for shell scripts only, but why would it be a good idea to have such an inconsistency in a codebase? (We simply screwed up by not taking care of it before)
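The cleanup of already-committed CRLF files mentioned above could look roughly like this. It is a sketch, not the project's actual procedure: `renormalize_line_endings` is a hypothetical helper name, and it assumes a Git version that supports `git add --renormalize` and a work tree that already contains the new `.gitattributes`.

```shell
# renormalize_line_endings: re-apply the .gitattributes line-ending rules
# to files that are already tracked, then show what changed.
renormalize_line_endings() {
    # Re-stage every tracked file so the index holds normalized content.
    git add --renormalize . &&
    # List the affected files so the cleanup can be reviewed before committing.
    git status --short
}
```

Doing this once in a dedicated commit keeps the whitespace-only changes separate from real edits.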

view this post on Zulip Philip Durbin 🚀 (Sep 14 2023 at 11:09):

You know, the diff is huge but it's easy enough to hide all the whitespace changes with https://github.com/IQSS/dataverse/pull/9905/files?diff=unified&w=1

If you say we need it, that's fine with me.

view this post on Zulip Philip Durbin 🚀 (Sep 14 2023 at 11:10):

One quick thought, should we put this best practice into place in a smaller repo first? One where we're free to merge at will?

view this post on Zulip Oliver Bertuch (Sep 14 2023 at 11:35):

should we put this best practice into place in a smaller repo first

Do you want to try sth first @Philip Durbin ? Something specific?

view this post on Zulip Philip Durbin 🚀 (Sep 14 2023 at 11:50):

Buh, maybe https://github.com/IQSS/dataverse-installations or https://github.com/IQSS/dataverse-metrics ? I'm not picky.

view this post on Zulip Philip Durbin 🚀 (Sep 14 2023 at 11:50):

The problem is that I don't have a way to test it. I don't have a Windows box.

view this post on Zulip Philip Durbin 🚀 (Sep 14 2023 at 11:52):

There's a reformat-all.sh script in the installations repo, if that helps.

view this post on Zulip Oliver Bertuch (Sep 14 2023 at 11:54):

If you want, we can have a Zoom and do it on my windows box together. If that helps

view this post on Zulip Philip Durbin 🚀 (Sep 14 2023 at 11:54):

This is what Zelig uses: https://github.com/IQSS/Zelig/blob/master/.gitattributes

view this post on Zulip Philip Durbin 🚀 (Sep 14 2023 at 11:56):

I gotta walk the dog and get to work. Container meeting in 90 minutes. Maybe we can touch on this then. https://ct.gdcc.io

view this post on Zulip Oliver Bertuch (Sep 14 2023 at 13:20):

@Juan Pablo Tosca Villanueva you're very welcome to join us and hang on Zoom! 1 precious hour of container nerds talking shop! :party_ball:

view this post on Zulip Juan Pablo Tosca Villanueva (Sep 14 2023 at 15:05):

I am very happy to help @Oliver Bertuch ! That second snippet starts to make sense: DOS files (cmd, bat) should still be normalized. Even if Dataverse probably doesn't have any Windows-native files (as far as I know), I don't think it would hurt. It seems we are going in a good direction.

I am really looking forward to the Zoom call tonight! :smile:

view this post on Zulip Oliver Bertuch (Sep 14 2023 at 15:06):

Oh no you just missed us! It's in UTC, so in the morning for ET/EST

view this post on Zulip Juan Pablo Tosca Villanueva (Sep 14 2023 at 15:08):

:upside_down:

view this post on Zulip Juan Pablo Tosca Villanueva (Sep 14 2023 at 15:13):

AM, PM, LF, CRLF, too many acronyms man... :rolling_on_the_floor_laughing: I apologize :face_exhaling: , I was really looking forward to it but will be there next week then. Also, if there are any tests on Windows, I can help, just let me know. As I commented to Philip, I am on the 73 line or I can connect over Zoom.

view this post on Zulip Sakshi Jain (Oct 17 2023 at 14:38):

Hi all, I tried running the mvn -Pct package docker:run command on my Windows machine, but in the container I'm getting the following error message:
[Entrypoint] running /opt/payara/scripts/init_2_configure.sh
2023-10-17T14:35:35.974842400Z /opt/payara/scripts/init_2_configure.sh: line 10: $'\r': command not found
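The `$'\r': command not found` message is what bash prints when a script line ends in a carriage return, which suggests the file was checked out with CRLF line endings. A minimal sketch for checking a script before building the containers, with `has_crlf` as a hypothetical helper name:

```shell
# has_crlf FILE: exit 0 if FILE contains at least one carriage return, 1 otherwise.
has_crlf() {
    # Command substitution only strips trailing newlines, so the CR survives.
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}
```

For example, `has_crlf init_2_configure.sh && echo "CRLF found"` would flag the script from the error above if it was converted on checkout.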

view this post on Zulip Sakshi Jain (Oct 17 2023 at 14:46):

can anyone help me out?

view this post on Zulip Notification Bot (Oct 17 2023 at 15:19):

2 messages were moved here from #dev > Error when running "mvn -Pct package docker:run" by Philip Durbin.

view this post on Zulip Philip Durbin 🚀 (Oct 17 2023 at 15:20):

@Sakshi Jain hi! I hope you don't mind, but because you're on Windows, I moved your question to this topic where we are discussing pull request #9905 by @Juan Pablo Tosca Villanueva (also on Windows) that I'm hoping will help you!

view this post on Zulip Philip Durbin 🚀 (Oct 17 2023 at 15:21):

Hmm, something happened to that PR so I'd actually recommend not looking at the files right now. :sweat_smile:

view this post on Zulip Philip Durbin 🚀 (Oct 17 2023 at 15:22):

At a high level, as a workaround, I believe you can either set up a global .gitconfig file or pass an argument when you clone the repo. I'm not sure about the details though.

view this post on Zulip Philip Durbin 🚀 (Oct 17 2023 at 15:30):

@Sakshi Jain do you use the git command line? If so, what do you get when you run this?
git config --global --get core.autocrlf

view this post on Zulip Oliver Bertuch (Oct 17 2023 at 18:04):

Not intentionally

view this post on Zulip Oliver Bertuch (Oct 17 2023 at 18:04):

I don't have my windows host ATM, had to use it for a temporary other purpose

view this post on Zulip Sakshi Jain (Oct 18 2023 at 12:59):

Philip Durbin said:

Sakshi Jain do you use the git command line? If so, what do you get when you run this?
git config --global --get core.autocrlf

Hi @Philip Durbin I tried running this command but didn't get anything.

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 13:02):

Ok, that's normal, I think.

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 13:02):

I'm looking at https://docs.github.com/en/get-started/getting-started-with-git/configuring-git-to-handle-line-endings (the Windows tab)

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 13:06):

@Sakshi Jain can you please try:

view this post on Zulip Sakshi Jain (Oct 18 2023 at 13:37):

Philip Durbin said:

Sakshi Jain can you please try:

@Philip Durbin Still getting the same error

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 13:37):

bah!

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 13:38):

Should we create a branch with @Juan Pablo Tosca Villanueva or @Oliver Bertuch 's fix for you to checkout? Maybe two branches? (The fixes are a little different.)

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 13:51):

@Sakshi Jain ok I just pushed a new branch where I cherry-picked @Juan Pablo Tosca Villanueva 's fix. Can you please try:

view this post on Zulip Juan Pablo Tosca Villanueva (Oct 18 2023 at 14:15):

I think the issue is somewhere else because the changes to the local git config should have the same effect as the proposed change (adding the .gitattributes file). @Philip Durbin

@Sakshi Jain be sure to delete all the previous content before checking out again or you could also check out in a different directory. I will try to set this up, but I am not home until next week but if somehow, I figure it out, I will let you know.

view this post on Zulip Sakshi Jain (Oct 18 2023 at 14:16):

@Juan Pablo Tosca Villanueva sure no worries
@Philip Durbin I'll try with this branch once

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 14:39):

@Sakshi Jain awesome. Thanks.

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 14:40):

@Oliver Bertuch as I mentioned above, PR #9905 is... not in a good state. If you'd like @Sakshi Jain to try your solution, please push a new branch.

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 14:48):

@Juan Pablo Tosca Villanueva yeah, it's weird. I was hoping the global config would be a good workaround. Oh well.

view this post on Zulip Sakshi Jain (Oct 18 2023 at 14:57):

Hi @Philip Durbin the cloning thing didn't work.
But I fixed the issue. It's a hacky solution but it worked :D
I opened the /init_2_configure.sh file in notepad++ and from there I was able to remove all the \r in the file
After that when I ran the docker the error didn't come again
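The manual Notepad++ fix above can also be scripted. The following is only a sketch of the same idea (essentially what `dos2unix` does), using a hypothetical helper named `strip_cr`; it assumes a POSIX shell such as Git Bash or WSL on Windows and removes every carriage return in the file, not just those at line ends.

```shell
# strip_cr FILE: remove all carriage returns from FILE in place.
strip_cr() {
    tmp_file=$(mktemp) || return 1
    # Write the cleaned content back with cat so FILE keeps its permissions.
    tr -d '\r' < "$1" > "$tmp_file" && cat "$tmp_file" > "$1"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

Note that this only repairs the working copy; unless the Git configuration or `.gitattributes` changes, a fresh checkout can bring the CRLF endings back.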

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 14:58):

Hmm, that's frustrating. Want me to push a new branch with the other fix? @Oliver Bertuch 's suggestion, I mean.

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 14:58):

It would be great to figure this out for Windows developers once and for all.

view this post on Zulip Sakshi Jain (Oct 18 2023 at 14:58):

true

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 15:23):

@Sakshi Jain ok, I just pushed @Oliver Bertuch 's fix to a new branch. If you'd like to try it:

git clone -b 9894-gitattributes-text-auto git@github.com:IQSS/dataverse.git

I really appreciate you helping out with this. I want to better support developers on Windows.

view this post on Zulip Sakshi Jain (Oct 18 2023 at 20:18):

Philip Durbin said:

Sakshi Jain ok, I just pushed Oliver Bertuch 's fix to a new branch. If you'd like to try it:

git clone -b 9894-gitattributes-text-auto git@github.com:IQSS/dataverse.git

I really appreciate you helping out with this. I want to better support developers on Windows.

Sure I'll check it out :+1:

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 20:20):

Thanks!

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 20:29):

Are you using git from the command line?

view this post on Zulip Sakshi Jain (Oct 18 2023 at 20:31):

Well, normally I clone repositories from within IntelliJ or Visual Studio itself. But for this issue I tried cloning through the command line and even through GitHub Desktop, and the same issue came up every time.

view this post on Zulip Sakshi Jain (Oct 18 2023 at 20:32):

and I can see that the autocrlf value is set as true

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 20:33):

Do you see a .gitattributes file in the root of the repo? It should have different content based on the two branches.

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 20:35):

view this post on Zulip Sakshi Jain (Oct 18 2023 at 20:37):

trying to clone the particular branches you mentioned is giving me the following error:
` git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists. `

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 20:38):

Ah, can you please try the https version instead of the git version? Do you know what I mean?

view this post on Zulip Sakshi Jain (Oct 18 2023 at 20:41):

I was thinking about adding the file manually in my git branch and cloning that
that way I'll be easily able to continue working in it as well if it works out

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 20:49):

Yes! Please try that!

view this post on Zulip Sakshi Jain (Oct 18 2023 at 20:54):

cloned the repo with .gitattributes file having the text auto but still getting the same error

view this post on Zulip Sakshi Jain (Oct 18 2023 at 21:01):

I found a stackoverflow for this same issue : https://stackoverflow.com/questions/2517190/how-do-i-force-git-to-use-lf-instead-of-crlf-under-windows
trying out a solution mentioned on this

view this post on Zulip Sakshi Jain (Oct 18 2023 at 21:06):

this resolved the issue!

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 21:06):

orly?

view this post on Zulip Sakshi Jain (Oct 18 2023 at 21:08):

I did the following:

  1. Added the .gitattributes file with the text auto change in the branch
  2. On my local I ran the following commands:
    git config --global core.autocrlf false
    git config --global core.eol lf

view this post on Zulip Sakshi Jain (Oct 18 2023 at 21:08):

after this I cloned the branch on my system and ran the maven package docker command and the containers started without any errors :smiley:

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 21:11):

Wow. Good job. So do we need something else in .gitattributes, I wonder. @Juan Pablo Tosca Villanueva and @Oliver Bertuch please take note!

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 21:12):

@Sakshi Jain for now, do you want to document what you did? Make a PR for this, I mean.

view this post on Zulip Sakshi Jain (Oct 18 2023 at 21:15):

well I can add the gitattributes file with the text auto change
can I add the steps I followed in the readme file so that if anyone works on windows can follow them?

view this post on Zulip Sakshi Jain (Oct 18 2023 at 21:19):

I see though that https://github.com/IQSS/dataverse/blob/9894-gitattributes-eol-lf/.gitattributes is having the eol lf command in it. I didn't try adding that in the gitattributes file in my branch so not sure if it works on my system or not.
Maybe I can try with that file once tomorrow on another windows system to see if that works. If it does then we could directly merge the 9894-gitattributes-eol-lf branch into develop

view this post on Zulip Philip Durbin 🚀 (Oct 18 2023 at 21:24):

Yes, it would be good to confirm the best approach. I think we're close! Thanks again!

view this post on Zulip Juan Pablo Tosca Villanueva (Oct 19 2023 at 05:25):

Philip Durbin said:

Sakshi Jain can you please try:

I was reading this again after looking at the stackoverflow that @Sakshi Jain posted and I just realized that you mentioned to set autocrlf to true, so this is exactly the behavior that we want to avoid when using windows. these two solutions should work:

git config --global core.autocrlf false
git config --global core.autocrlf input (the one mentioned in the official dataverse documentation)

Here is the difference between false and input on the git documentation, but basically both avoid auto-conversion at checkout, while input will still auto-convert on commit. Probably @Sakshi Jain can help us with a test by changing it from false to input.
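To see what a given setting actually did to a checkout, `git ls-files --eol` reports the line-ending style Git finds in the index and in the working tree for each tracked file. The sketch below counts tracked files whose working-tree copy has CRLF endings; `count_crlf_worktree` is a hypothetical helper name, and it assumes it is run inside the cloned repository.

```shell
# count_crlf_worktree: print how many tracked files have CRLF line endings
# in the working tree, based on the "w/crlf" column of `git ls-files --eol`.
count_crlf_worktree() {
    git ls-files --eol | grep -c '^i/[^ ]* *w/crlf'
}
```

Running it after a fresh clone with each candidate setting gives a direct comparison instead of waiting for the container to fail.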

What is weird to me is that this behavior should be overridden by setting up the .gitattributes, which is why I introduced this request in the first place.

Philip Durbin said:

Do you see a .gitattributes file in the root of the repo? It should have different content based on the two branches.

I think the .gitattributes file is not checked out, but I have to test this again.

Sorry, I wanted to get back to both of you earlier, but we are in a hotel right now and the project took forever to download :rolling_on_the_floor_laughing: then we had to leave and we just got back to the room.

view this post on Zulip Sakshi Jain (Oct 19 2023 at 10:18):

Hi all, I just tested with changes from https://github.com/IQSS/dataverse/blob/9894-gitattributes-eol-lf/.gitattributes on another Windows system and that solution seems to be working fine. I didn't need to make any other changes than just adding this same .gitattributes file.

view this post on Zulip Sakshi Jain (Oct 19 2023 at 10:19):

9894-gitattributes-eol-lf branch can be merged into develop now as it resolves the windows crlf issue :)

view this post on Zulip Philip Durbin 🚀 (Oct 19 2023 at 11:05):

@Sakshi Jain great news. Would you also be willing to test with https://github.com/IQSS/dataverse/blob/9894-gitattributes-text-auto/.gitattributes ?

view this post on Zulip Sakshi Jain (Oct 19 2023 at 20:25):

I tested with text auto attribute earlier but it didn't work

view this post on Zulip Philip Durbin 🚀 (Oct 19 2023 at 20:27):

Ok. Thanks. @Oliver Bertuch are you following this?

view this post on Zulip Philip Durbin 🚀 (Oct 19 2023 at 20:28):

@Sakshi Jain if you run git config --global --get core.autocrlf what do you get?

view this post on Zulip Sakshi Jain (Oct 19 2023 at 21:05):

as of now I've set it to false so that's what I get

view this post on Zulip Philip Durbin 🚀 (Oct 19 2023 at 21:08):

Ok. It's late for Oliver but maybe we can check in with him tomorrow on what he thinks. I appreciate all the testing!

view this post on Zulip Oliver Bertuch (Oct 20 2023 at 15:10):

I'm sorry folks but its crazy over here today all day. I might have a chance looking into this next week on my windows host, but no promises. Open Science Week keeps me pretty busy.

view this post on Zulip Philip Durbin 🚀 (Oct 20 2023 at 15:12):

No worries, Oliver! Have a good weekend!

view this post on Zulip kuhlaid (Dec 01 2023 at 14:43):

As a Windows user I have battled with Windows over end of line characters. Windows is especially problematic with regards to VSCode editor. I have been trying for ages to configure VSCode to STOP USING CRLF, but no matter what Git settings I set or VSCode setting are set, if I use the Clone Git Repository... option that VSCode prominently displays on the start up screen, IT ALWAYS clones with CRLF. DO NOT USE THIS OPTION... Instead clone a Git repo using the Git command git clone --config core.autocrlf=false [repo clone URL] (for example git clone --config core.autocrlf=false https://github.com/IQSS/dataverse.git). This ensures that Git will clone the repository with LF end of line characters.

view this post on Zulip Juan Pablo Tosca Villanueva (Dec 01 2023 at 14:48):

Hi @kuhlaid do you still have these issues? We recently added a .gitattributes file that should skip the conversion on *.sh files by default for everyone cloning the repo; all the other files should skip it as your configuration is set up. Do you have any idea what other files should be converted? Right now the project works for me on Windows 10-11 with some exceptions using file storage, but I haven't had a chance to investigate these.

view this post on Zulip kuhlaid (Dec 01 2023 at 15:19):

Hi @Juan Pablo Tosca Villanueva , this is just a problem with VSCode and not the Dataverse itself.

view this post on Zulip Juan Pablo Tosca Villanueva (Dec 01 2023 at 15:29):

Oh I see! Yeah, you can suggest that those projects add this file so that you and everyone else don't need to do additional configuration. I suggested it to other repos. Alternatively, you could also set up your Git installation with git config --global core.autocrlf input; this way Git converts CRLF to LF on commit but not the other way around.

Thanks for sharing this option!

view this post on Zulip kuhlaid (Dec 01 2023 at 17:09):

For Windows users who want a custom VsCode task to clone Git repositories with LF end of line characters (which is what you want if you are working with Dataverse code), I created some instructions that can be found at https://stackoverflow.com/a/77586955/10027828.


Last updated: Oct 30 2025 at 05:14 UTC