On 9/8/25 05:59, Alyssa Ross wrote:
Demi Marie Obenour <demiobenour@gmail.com> writes:
The script will always get them right, whereas humans (the author of this commit included) generally will not.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
I like the idea!
--- Documentation/development/built-in-vms.adoc | 7 ++ host/rootfs/Makefile | 107 ++------------------------ host/rootfs/file-list.mk | 100 ++++++++++++++++++++++++ img/app/Makefile | 74 +++--------------- img/app/file-list.mk | 65 ++++++++++++++++ lib/common.mk | 1 + scripts/genfiles.awk | 115 ++++++++++++++++++++++++++++ vm/sys/net/Makefile | 51 +++--------- vm/sys/net/file-list.mk | 41 ++++++++++ 9 files changed, 357 insertions(+), 204 deletions(-)
diff --git a/Documentation/development/built-in-vms.adoc b/Documentation/development/built-in-vms.adoc index e90009ee5a3c2c254a7ae11e36121576b819eee7..82d78705a6020bbdb06fbc123a32dbdd6fd50085 100644 --- a/Documentation/development/built-in-vms.adoc +++ b/Documentation/development/built-in-vms.adoc @@ -44,6 +44,13 @@ NOTE: As a special convenience, it's not necessary to run `make clean` if the only change to the Nix files is modifying the packages installed in the VM.
+The list of files used for the VM image is stored in a separate file, +`file-lists.mk`. You can edit it manually when developing. However, +after each commit that adds or removes files from it, you should run +`make update-file-list`, which will regenerate it from the output of +`git ls-files`. Any changes you made will be lost. This ensures +that the file lists are always in sync with the git repository. +
TBH editing it manually and then losing your changes is probably going to be more of a footgun than anything else. You can get the same result by staging new files and rerunning the script, so I'd avoid mentioning the manual editing option, especially given you have a "DO NOT EDIT" comment.
-$(dest): ../../scripts/make-erofs.sh $(PACKAGES_FILE) $(addprefix image/,$(FILES)) $(BUILD_FILES) build/empty build/fifo +$(dest): ../../scripts/make-erofs.sh $(PACKAGES_FILE) $(addprefix image/,$(FILES)) $(BUILD_FILES) build/empty build/fifo file-list.mk
Given that we don't include Makefile as a dependency, it probably doesn't make sense to depend on other included Makefile fragments either?
If the user regenerates the file lists, it is more convenient for them to not need to call 'make clean', and having the dependency is harmless.
@@ -207,6 +111,11 @@ debug: $(VMLINUX) .PHONY: debug
+update-file-list: + ../../scripts/genfiles.awk image > file-list.mk + +.PHONY: update-file-list +
Given this doesn't use any features of Make, it probably makes more sense to just run the script directly. It could output into file-list.mk by default for ergonomics.
Makes sense.
run: build/live.img $(EXT_FS) build/rootfs.verity.roothash @set -x && \ ext="$$(mktemp build/spectrum-rootfs-extfs.XXXXXXXXXX.img)" && \ diff --git a/host/rootfs/file-list.mk b/host/rootfs/file-list.mk new file mode 100644 index 0000000000000000000000000000000000000000..0817887d0bb25ab47e777f6a130a3b6214b25f0f --- /dev/null +++ b/host/rootfs/file-list.mk @@ -0,0 +1,100 @@ +# SPDX-License-Identifier: CC0-1.0 +# SPDX-FileCopyRightText: Not Copyrightable (machine-written)
SPDX-FileCopy*r*ightText, and should probably say that you're the owner, for consistency with e.g. lib/nixpkgs.default.nix, which is also generated. You at least made the template.
Makes sense.
+# Generated by scripts/genfile.awk, DO NOT EDIT! +override FILES ::= \
Our Makefiles are POSIX. (Mostly because it's the only sensible way to draw the line, and keep all the really advanced easy to misuse GNU stuff out.)
This is going to make factoring out common logic more difficult. I'll go into detail in the reply to your review to the next patch.
diff --git a/lib/common.mk b/lib/common.mk index 277c3544036d9a9057f8ba4ad37fe2207548cc59..0a03ff440cc671264d2b859a2ae048db9252d047 100644 --- a/lib/common.mk +++ b/lib/common.mk @@ -1,5 +1,6 @@ # SPDX-License-Identifier: EUPL-1.2+ # SPDX-FileCopyrightText: 2021, 2023, 2025 Alyssa Ross <hi@alyssa.is> +# SPDX-FileCopyrightText: 2025 Demi Marie Obenour <demiobenour@gmail.com>
BACKGROUND = background CPIO = cpio
This change looks like an accident?
Should have gone in the next patch.
diff --git a/scripts/genfiles.awk b/scripts/genfiles.awk new file mode 100755 index 0000000000000000000000000000000000000000..62863e78f157f1d9a0f6dbdb0f4380db9c9d48cb --- /dev/null +++ b/scripts/genfiles.awk @@ -0,0 +1,115 @@ +#!/usr/bin/env -S LC_ALL=C LANGUAGE=C awk -E +# SPDX-License-Identifier: EUPL-1.2+ +# SPDX-FileCopyrightText: 2025 Demi Marie Obenour <demiobenour@gmail.com> +function check_status(status) { + if (status < 0) { + printf "FATAL: getline: %s\n", status > "/dev/stderr"; + exit 1; + } + return status; +} + +function check_close(value, status) { + status = check_status(close(value)); + if (status != 0) { + printf "FATAL: command exited with status %d\n", status > "/dev/stderr"; + exit status; + } +} + +function shell_quote(command) { + gsub(/'/, "'\\\\&'", command); + return ("'" command "'"); +} + +function get(command, line, path, array_index, inode_type, mode, modes, symlink_count, symlinks, file_count, files, rc_count, rc_files, is_license, is_rc) { + file_count = 0; + symlink_count = 0; + rc_count = 0; + modes["120000"] = "symlink"; + modes["040644"] = "directory"; + modes["040755"] = "directory"; + modes["100644"] = "regular"; + modes["100755"] = "regular"; + print "# SPDX-License-Identifier: CC0-1.0"; + print "# SPDX-FileCopyRightText: Not Copyrightable (machine-written)"; + print "# Generated by scripts/genfile.awk, DO NOT EDIT!"; + while (check_status(command | getline line)) { + if (line !~ /^[0-7]{6}\t/) { + # this is a git bug + print "FATAL: git ls-files output didn't start with a valid mode" > "/dev/stderr"; + exit 1; + } + path = substr(line, 8); + if (path !~ /^[ -~]+$/) { + # also a git bug + print "FATAL: git ls-files didn't quote properly" > "/dev/stderr"; + exit 1; + } + if (path ~ /^\/|((^|\/)\.{0,2}($|\/))/) { + # also a git bug + printf "FATAL: git ls-files output non-canonical path '%s'\n", path > "/dev/stderr"; + exit 1; + } + if (path !~ /^[[:alnum:]_.+@/-]+$/) { + printf "FATAL: filename '%s' has forbidden characters\n", path > "/dev/stderr"; + exit 1; + }
I feel like this could be a lot nicer if we ran git ls-files outside awk, and could then use its nice top-level matching syntax?
That would work, but would need a wrapper script.
+ mode = modes[substr(line, 1, 6)]; + is_license = path ~ /\.license$/; + is_rc = path ~ /^etc\/s6-rc\//; + if (mode == "regular") { + if (is_license) { + continue; + } + if (is_rc) { + rc_count += 1; + rc_files[rc_count] = path; + } else { + file_count += 1; + files[file_count] = path; + } + continue; + } + if (mode == "symlink") { + if (is_rc) { + printf "FATAL: symlink in s6-rc-compile input: %s\n", path; + exit 1; + } + symlink_count += 1; + symlinks[symlink_count] = path; + } else if (mode != "directory") { + printf "FATAL: file %s has unknown mode %s\n", path, substr(line, 1, 6) > "/dev/stderr"; + exit 1; + } + if (is_license) { + printf "FATAL: %s (type %s) ends in .license\n", path, mode > "/dev/stderr"; + exit 1; + } + } + check_close(command); + + printf "override FILES ::="; + for (array_index = 1; array_index <= file_count; array_index += 1) { + printf " \\\n\t%s", files[array_index]; + } + printf ("\n\n" \ +"# These are separate because they need to be included, but putting\n" \ +"# them as make dependencies would confuse make.\n" \ +"override LINKS ::="); + for (array_index = 1; array_index <= symlink_count; array_index += 1) { + printf " \\\n\t%s", symlinks[array_index]; + } + printf "\n\noverride S6_RC_FILES ::="; + for (array_index = 1; array_index <= rc_count; array_index += 1) { + printf " \\\n\t%s", rc_files[array_index]; + } + printf "\n" +} + +BEGIN { + RS = "\n"; + FS = "\t"; + get("set -euo pipefail && { git -c core.quotePath=true -C " shell_quote(ARGV[1]) " ls-files '--format=%(objectmode)\t%(path)' -- .|sort -t '\t' -k 2;}"); + exit 0; +} -- Sincerely, Demi Marie Obenour (she/her/hers)