#7 nfsconvert.sh: Stop using the immutable bit
Opened 6 months ago by walters. Modified 5 months ago

file modified
+6 -11

@@ -11,14 +11,14 @@ 

  	exit 0

  fi

  

- # 

- # See if the conversion happen already

+ #

+ # See if the conversion happened already

  #

  grep "nfs.conf" /etc/sysconfig/nfs > /dev/null

  if [ $? -eq 0 ]; then

- 	# Make sure the file is immutable.

+ 	# Drop the write bit as a hint

  	if [ -w /etc/sysconfig/nfs ]; then

- 		chattr +i /etc/sysconfig/nfs

+ 		chmod ugo-w /etc/sysconfig/nfs || true

  	fi 

  	exit 0

  fi

@@ -41,13 +41,8 @@ 

  /usr/sbin/nfsconvert

  

  #

- # If successful, make the file immutable.

- # This is to ensure that configuration management 

- # software gets an error trying to modify it. 

- #

- # Run `chattr -i /etc/sysconfig/nfs` as root 

- # to make it mutable again.

+ # Drop the write bit as a hint not to modify it again.

  #

  if [ $? -eq 0 ]; then

- 	chattr +i /etc/sysconfig/nfs

+ 	chmod ugo-w /etc/sysconfig/nfs || true

  fi

The immutable bit is a bad idea; it's fairly obscure - many Linux
administrators won't know about it.

Further, it actively breaks ostree, which maintains multiple root
filesystems: https://discussion.fedoraproject.org/t/boot-partition-of-silverblue-is-without-space/771/12

Instead let's just chmod -w - however of course this doesn't
actually have an effect for root because of CAP_DAC_OVERRIDE
but we're doing what we can.

This all said; did you consider just doing mv /etc/sysconfig/nfs{,.migrated} or something?

Yet another option (possibly in addition to the rename above)

(echo "# MIGRATED TO /etc/nfs.conf DO NOT EDIT" &&
 cat /etc/sysconfig/nfs &&
 echo "# MIGRATED TO /etc/nfs.conf DO NOT EDIT") > /etc/sysconfig/nfs.new
mv /etc/sysconfig/nfs{.new,}
      The immutable bit is a bad idea; it's fairly obscure - many Linux

administrators won't know about it.
The point is /etc/sysconfig/nfs is no longer part of the NFS server configuration
/etc/nfs.conf is now the only way to configure the server.

Further, it actively breaks ostree, which maintains multiple root
filesystems: https://discussion.fedoraproject.org/t/boot-partition-of-silverblue-is-without-space/771/12
Replace /etc/sysconfig/nfs with /etc/nfs.conf should take care of this breakage
sysconfig/nfs should no longer be touched.

Instead let's just chmod -w - however of course this doesn't
actually have an effect for root because of CAP_DAC_OVERRIDE
but we're doing what we can.

The bit is used to cause auto configuration software (aka ansible ) to
fail when replace sysconfig/nfs, since it is no longer used... /etc/nfs.conf is.

Their configurations need to be updated to use /etc/nfs.conf instead of
sysconfig/nfs.

This all said; did you consider just doing mv /etc/sysconfig/nfs{,.migrated} or something?

Auto configuration software simple added their own sysconfig/nfs file,
not caring what current file exists.

That replacement has to fail to let admin know /etc/nfs.conf is now the
correct way to configure the NFS server.

[A better formatted response to the first question]

      The immutable bit is a bad idea; it's fairly obscure - many Linux

administrators won't know about it.

The point is /etc/sysconfig/nfs is no longer part of the NFS server configuration
/etc/nfs.conf is now the only way to configure the server.

Further, it actively breaks ostree, which maintains multiple root
filesystems: https://discussion.fedoraproject.org/t/boot-partition-of-silverblue-is-without-space/771/12

Replace /etc/sysconfig/nfs with /etc/nfs.conf should take care of this breakage
sysconfig/nfs should no longer be touched.

Instead let's just chmod -w - however of course this doesn't
actually have an effect for root because of CAP_DAC_OVERRIDE
but we're doing what we can.

The bit is used to cause auto configuration software (aka ansible ) to
fail when replace sysconfig/nfs, since it is no longer used... /etc/nfs.conf is.

Their configurations need to be updated to use /etc/nfs.conf instead of
sysconfig/nfs.

@walters So, a little background information. The selection of the immutable bit was very intentional. The purpose of it is to ensure that when configuration-management tools like Ansible or Puppet push out NFS configuration to systems that have been upgraded, they will fail noisily.

We want this because it's a major change to how configuration works and we want it to be extremely clear to config-management that they should not be writing to this file at all anymore. All configuration belongs in nfs.conf instead.

We can't update all possible config-management tools to know that this file location is special, so we need to do something on the client side that makes it impossible to modify.

The part that is complicated is that, if I remember correctly (@steved, please correct me), we can't actually modify NFS/nfs.service to entirely ignore /etc/sysconfig/nfs and we have no good way other than a config-management failure to adequately communicate back that the user isn't actually applying the configuration that they think they are. Even if we simply ignored this file, we now have no clues to provide the user about why they are setting certain options in their config that don't actually apply. (Yes, documentation will exist, but such issues will be very common and irritating for a lot of people.)

Personally, I think you are all overengineering this. Adding a header like "DO NOT EDIT" to the top of the file is going to be adequate.

But you have all so far not responded to the point about this breaking ostree. I guess I could change ostree's pruning logic to override the immutable bit, but...I find it troublesome given that nothing else does this.

If you want to go for more sophistication, embed a checksum of the migrated file at the top (or bottom), and on systemctl start nfs, validate it and if it's changed, error out. That will be impossible to miss.

Personally, I think you are all overengineering this. Adding a header like "DO NOT EDIT" to the top of the file is going to be adequate.

It's not, though. Because every time the config-management runs, it destructively replaces the file. That's going to cause problems.

But you have all so far not responded to the point about this breaking ostree. I guess I could change ostree's pruning logic to override the immutable bit, but...I find it troublesome given that nothing else does this.

You haven't actually explained how it breaks ostree. I thought ostree ignored the contents of /etc in general, since it's user-managed content.

If you want to go for more sophistication, embed a checksum of the migrated file at the top (or bottom), and on systemctl start nfs, validate it and if it's changed, error out. That will be impossible to miss.

That doesn't really work because it moves the failure away from the cause of it. It no longer fails when trying to change the configuration, it fails whenever the service is next restarted.

You haven't actually explained how it breaks ostree. I thought ostree ignored the contents of /etc in general, since it's user-managed content.

/etc is part of each deployment root. One of the unique features of OSTree is that it does a 3-way merge of /etc - some more details here.

(You're probably thinking more of /var which indeed ostree doesn't touch by design)

What's failing is the cleanup of unused deployments; see the above linked forum thread. This manifests as a disk space leak.

It no longer fails when trying to change the configuration, it fails whenever the service is next restarted.

"change the configuration" is almost always very closely tied to "apply the configuration". People using Ansible for example are definitely going to be using handlers to restart the NFS service.

On a semi-related topic...really all shell script should use bash strict mode.

diff --git a/nfsconvert.sh b/nfsconvert.sh
index 89c0672..2ace4c8 100644
--- a/nfsconvert.sh
+++ b/nfsconvert.sh
@@ -1,5 +1,7 @@
 #!/bin/bash

+set -euo pipefail
+
 #
 # Convert /etc/sysconfig/nfs values in /etc/nfs.conf valuse
 #

Now making this change would cause the script to fail if running in a regular podman/docker container which doesn't have CAP_LINUX_IMMUTABLE.

Also it'd be a bit cleaner if we did

diff --git a/nfs-convert.service b/nfs-convert.service
index 3185ff1..6720192 100644
--- a/nfs-convert.service
+++ b/nfs-convert.service
@@ -8,6 +8,8 @@ Before=rpc-statd-notify.service

 After=initrd-root-fs.target

+ConditionPathExists=/etc/sysconfig/nfs
+
 [Service]
 Type=oneshot
 ExecStart=/usr/libexec/nfs-utils/nfsconvert.sh

It no longer fails when trying to change the configuration, it fails whenever the service is next restarted.

"change the configuration" is almost always very closely tied to "apply the configuration". People using Ansible for example are definitely going to be using handlers to restart the NFS service.

That's true, but if I recall correctly, a major part of the problem here is that the configuration won't fail if this file is overwritten. It will either just be ignored or override the intended nfs.conf configuration. I forget which; @steved will need to remind me. It also might result in the nfsconvert.sh being run every time the ansible script is run, which seems wasteful.

The checksum idea is an interesting one, but if we went that route it would have to be in a separate, well-known location since otherwise the cfgmgt update would simply overwrite it, which doesn't help at all.

Also it'd be a bit cleaner if we did
diff --git a/nfs-convert.service b/nfs-convert.service
index 3185ff1..6720192 100644
--- a/nfs-convert.service
+++ b/nfs-convert.service
@@ -8,6 +8,8 @@ Before=rpc-statd-notify.service

After=initrd-root-fs.target

+ConditionPathExists=/etc/sysconfig/nfs
+
[Service]
Type=oneshot
ExecStart=/usr/libexec/nfs-utils/nfsconvert.sh

That doesn't seem unreasonable, though we know that /etc/sysconfig/nfs will be present, I'm pretty sure.

The part that is complicated is that, if I remember correctly (@steved, please correct me), we can't actually modify NFS/nfs.service to entirely ignore /etc/sysconfig/nfs and we have no good way other than a config-management failure to adequately communicate back that the user isn't actually applying the configuration that they think they are. Even if we simply ignored this file, we now have no clues to provide the user about why they are setting certain options in their config that don't actually apply. (Yes, documentation will exist, but such issues will be very common and irritating for a lot of people.)

This is the case... In the past the systemd scripts added command line arguments
to the process that were gotten from sysconfig/nfs. To be upstream compatible
as well as have one file that will configure NFS across all distros, the systemd
scripts were changed to ignore sysconfig/nfs. Each process (deamon/command)
now looks to /etc/nfs.conf for their configuration.

If we allow admins to override sysconfig/nfs their configs will be ignored...

Personally, I think you are all overengineering this. Adding a header like "DO NOT EDIT" to the top of the file is going to be adequate.

After the conversion done the following is added to sysconfig/nfs

This file is no longer used to configure NFS

ALL configuration values are in /etc/nfs.conf. See nfs.conf(5).

But you have all so far not responded to the point about this breaking ostree. I guess I could change ostree's pruning logic to override the immutable bit, but...I find it troublesome given that nothing else does this.

Noted... I'm sure this will not be the only push be on this...

On a semi-related topic...really all shell script should use bash strict mode.
diff --git a/nfsconvert.sh b/nfsconvert.sh
index 89c0672..2ace4c8 100644
--- a/nfsconvert.sh
+++ b/nfsconvert.sh
@@ -1,5 +1,7 @@
#!/bin/bash

+set -euo pipefail
+
#
# Convert /etc/sysconfig/nfs values in /etc/nfs.conf valuse
#

Now making this change would cause the script to fail if running in a regular podman/docker container which doesn't have CAP_LINUX_IMMUTABLE.

Is this something we want?? The failure that is...

Also it'd be a bit cleaner if we did
diff --git a/nfs-convert.service b/nfs-convert.service
index 3185ff1..6720192 100644
--- a/nfs-convert.service
+++ b/nfs-convert.service
@@ -8,6 +8,8 @@ Before=rpc-statd-notify.service

After=initrd-root-fs.target

+ConditionPathExists=/etc/sysconfig/nfs
+
[Service]
Type=oneshot
ExecStart=/usr/libexec/nfs-utils/nfsconvert.sh

This makes sense...

It no longer fails when trying to change the configuration, it fails whenever the service is next restarted.

"change the configuration" is almost always very closely tied to "apply the configuration". People using Ansible for example are definitely going to be using handlers to restart the NFS service.

That's true, but if I recall correctly, a major part of the problem here is that the configuration won't fail if this file is overwritten. It will either just be ignored or override the intended nfs.conf configuration. I forget which; @steved will need to remind me. It also might result in the nfsconvert.sh being run every time the ansible script is run, which seems wasteful.
The checksum idea is an interesting one, but if we went that route it would have to be in a separate, well-known location since otherwise the cfgmgt update would simply overwrite it, which doesn't help at all.

Any and all variables in sysconfig/nfs are ignored after the conversion

The checksum idea is an interesting one, but if we went that route it would have to be in a separate, well-known location since otherwise the cfgmgt update would simply overwrite it, which doesn't help at all.

If that was true, wouldn't the same cfgmgt also be removing the nfs.conf reference that the current migration is using to flag migration as having completed?

(Seems like the "did a migration" flag should actually canonically live in /etc/nfs.conf)

The checksum idea is an interesting one, but if we went that route it would have to be in a separate, well-known location since otherwise the cfgmgt update would simply overwrite it, which doesn't help at all.

If that was true, wouldn't the same cfgmgt also be removing the nfs.conf reference that the current migration is using to flag migration as having completed?

You are assuming the cfgmgt actually knows about nfs.conf... ;-) Forever all the
NFS configuration has gone through sysconfig/nfs so I doubt cfgmgt world
even knows about nfs.conf.... which is the reason we need to break it...

(Seems like the "did a migration" flag should actually canonically live in /etc/nfs.conf)

That flags is actually in sysconfg/nfs.

You are assuming the cfgmgt actually knows about nfs.conf... ;-) F

No, I meant this check:

# 
# See if the conversion happen already
#
grep "nfs.conf" /etc/sysconfig/nfs > /dev/null

You are assuming the cfgmgt actually knows about nfs.conf... ;-) F

No, I meant this check:

See if the conversion happen already

grep "nfs.conf" /etc/sysconfig/nfs > /dev/null

Hmm... That is the point of making the file immutable... So it can not change...
Maybe I'm missing your point?

Hmm... That is the point of making the file immutable... So it can not change... Maybe I'm missing your point?

Concretely, I'm suggesting that the migration script adds a comment like # Migrated from /etc/sysconfig/nfs to /etc/nfs.conf, and then the above grep changes to:

if [ -f /etc/nfs.conf ] && grep -q "Migrated from /etc/sysconfig/nfs"

In the short term I'm likely to apply this patch to the Fedora ostree package: https://github.com/ostreedev/ostree/pull/1796

But, I'm not happy about it. And I think this conversion script could be significantly improved in general based on some of the suggestions I'm making here, and also done in such a way to not rely on the immutable bit.

But, I'm not happy about it. And I think this conversion script could be significantly improved in general based on some of the suggestions I'm making here, and also done in such a way to not rely on the immutable bit.

If you would not mind..... Please documenting your suggestions in a bz.
That way can get more eyes on it and possibly come up with a solution that
might be more palatable.

Was looking at rebasing ostree in el8 and noticed this issue was fixed there, which lead me to https://bugzilla.redhat.com/show_bug.cgi?id=1673685.

The upstream version of that is https://bugzilla.redhat.com/show_bug.cgi?id=1668836. So AFAICT, once that's fixed, we can drop the OSTree patch we're carrying for this.

(Note that the OSTree case is similar to the container use case here: we never "uninstall" packages on an RPM-OSTree system in the librpm sense, nor do we "upgrade" them either.)

Metadata