Discussion:
Bug#1011343: WISHLIST: Offical ALL-IN-ONE images?
Zhang Boyang
2022-05-20 14:20:01 UTC
Package: debian-cd

Hello,

I suggest that Debian release a new variant of ISO image, the all-in-one
image. An all-in-one image contains ALL Debian packages in a single
ISO image (possibly with all source packages in another all-in-one ISO
image). Of course no optical medium can hold such a big image, but it
would be useful for virtual machines, remotely managed servers,
and archival purposes. The theoretical size limit of an ISO 9660
filesystem is about 8 TB, which is sufficient for including all Debian
packages.
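
As a quick check of the 8 TB figure: a minimal sketch, assuming 32-bit block addresses and the usual 2048-byte logical block size (both numbers are assumptions that the message does not state), and a shell with 64-bit arithmetic. The function name is made up for illustration.

```shell
#!/bin/sh
# Rough upper bound of an ISO 9660 filesystem in TiB, assuming 32-bit
# block addresses and 2048-byte logical blocks (assumptions, see above).
iso9660_max_tib() {
    blocks=4294967296          # 2^32 addressable blocks
    block_size=2048            # bytes per logical block
    tib=1099511627776          # 2^40 bytes
    echo $(( blocks * block_size / tib ))
}

iso9660_max_tib
```

Under these assumptions the result is 8, i.e. 8 TiB.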

For the name of this variant, I suggest 'everything', 'allinone',
'world', or 'virt'.

P.S. This is my personal interest, and I would appreciate it if you could
kindly consider my suggestion.


Best Regards,
Zhang Boyang
Andy Simpkins
2022-05-20 23:20:01 UTC
Post by Zhang Boyang
Package: debian-cd
Hello,
I suggest debian release a new variant of ISO images, the all-in-one images. These all-in-one image contains ALL debian packages in a single ISO image (possibly all source packages in another all-in-one ISO image). Of course there is no such optical media can hold such a big image, but it is useful for virtual-machines, remotely managed servers, and archival purposes. The theoretical size limit of an ISO9660 filesystem is about 8TB, which is sufficient for including all debian packages.
For the name of this variant, I suggest 'everything', 'allinone', 'world', 'virt'.
p.s. This is my personal interest, and I would appreciate if you can kindly consider my suggestion.
Best Regards,
Zhang Boyang
Sorry to put a dampener on your suggestion but why would you need that?

Why not just mirror the archive to a local disk instead?

Then you have your copy of everything and can just point a netinst at your local mirror so you can install from there.

I think that would deliver on every use case for which you would use your big ISO image, and more....
Zhang Boyang
2022-05-21 05:40:01 UTC
Hi,

Indeed, I admit super-big-iso is a crazy idea, and a local mirror is
more useful in most cases. I think there are a few special cases in
which a super-big-iso might be more useful.

1) Computers / virtual machines that are isolated from the public
internet or have no network at all. It is convenient to have such an ISO
to install software on demand. A single file is much more convenient
than setting up a local mirror. It's also easy to manage and to verify
its integrity, if frequent updates are not needed.

2) Archival purposes. If someone in the future (for example, in 2042)
wants to install a very old Debian system, he/she may grab the big ISO,
and all he/she needs is that single file. Although it may not be easy to
grab the file in the far future, I guess there is always someone crazy
enough to archive all files, isn't there? :P

I think setting up a new variant of image is not very costly for Debian
since there are already many variants, so why not give people more
choices :-)


Best Regards,
Zhang Boyang
Andrew M.A. Cater
2022-05-21 10:20:01 UTC
Post by Zhang Boyang
Hi,
Indeed, I admit super-big-iso is a crazy idea, and a local mirror is more
useful in most cases. I think there is a few special cases that a
super-big-iso might be more useful.
1) Computers / Virtual Machines isolated from public internet or have no
network at all. It is convenient to have such an ISO to install software on
demand. A single file is much more convenient than setting up a local
mirror. It's also easy to manage or verify integrity, if frequent updates
are not needed.
If you have a computer isolated from the internet / with no network connectivity
then you are essentially "set and forget" - because the only way to update this
is to hand carry packages in for security updates or whatever. For that, you
can use the DL-BD sized .iso - you'll need a computer that's connected to the
'Net to build it via jigdo / jigit - but you'd need a computer connected to
the internet to download the DVD or any other medium.

The double-layer Blu-Ray disk sized medium is 50GB or so - so you could write
that to a 64G USB flash disk. We - the debian images team that build and test
the images - don't routinely create all those full size images and put them in
the archive - because that would be terabytes with every point release.
They're there if you need them.

Actually, setting up a local mirror is potentially almost as easy a use case
as using gigantic media files. That's exactly what many hosting companies
do in their data centres for their own use (and it's also in some of those
data centres where some of the Debian country level mirrors are located).
So a large isolated network may find it useful to have a local mirror
updated periodically.
Post by Zhang Boyang
2) Archival purposes. If someone (in future, for example, in 2042) want to
install a very old debian system, he/she may grab the big ISO and all he/she
need is that single file. Although it's not easy to grab the file in far
future, but I guess there is always someone crazy enough to archive all
files, isn't it? :P
See, for example, snapshot.debian.org - which is growing. See also the
cdimage.debian.org archive directory where you can find most of the .iso
files for any release. Also, keeping large files around on disk for a long
time - there's some likelihood of data corruption. I'd hate a couple of
bit flips three quarters of the way through a 6TB file, say, to mean that
the whole thing is useless.
Post by Zhang Boyang
I think setting up a new variant of image is not very costly for debian
since there are already many variants, so why not give people more choices
:-)
Best Regards,
Zhang Boyang
Post by Andy Simpkins
Post by Zhang Boyang
Package: debian-cd
Hello,
I suggest debian release a new variant of ISO images, the all-in-one images. These all-in-one image contains ALL debian packages in a single ISO image (possibly all source packages in another all-in-one ISO image). Of course there is no such optical media can hold such a big image, but it is useful for virtual-machines, remotely managed servers, and archival purposes. The theoretical size limit of an ISO9660 filesystem is about 8TB, which is sufficient for including all debian packages.
For the name of this variant, I suggest 'everything', 'allinone', 'world', 'virt'.
p.s. This is my personal interest, and I would appreciate if you can kindly consider my suggestion.
Best Regards,
Zhang Boyang
Sorry to put a dampener on your suggestion but why would you need that?
Why not just mirror the archive to a local disk instead?
Then you have your copy of everything and can just point a netinst at your local mirror so you can install from there.
I think that would deliver on every use case that you would be able to use your big ISO image and more....
Andy is absolutely right, I think.

If it helps, I'm the "other" Andy in the team along with Steve McIntyre -
and yes, I know the problems of copying large images around, have a local
mirror here and routinely build at least the single layer BD disk with
every point release.

This is a topic that comes up fairly frequently in our informal discussions
as various people have argued for various sizes of medium - someone was
asking for 128G a short while ago - practically, the impact on storage
sizes and the pain of testing each size mean that we offer only a selection
out of all possible requests.

It's an open question as to whether we will ever stop making media in
physical medium sizes - there's no obvious reason why an iso file needs
to fit on a DVD, for example - and then someone turns up who is still using
single layer DVDs on a regular basis. The number of people buying
burnt physical media is smaller and smaller all the time, but people still
request this from Steve and others.

With every good wish, as ever,

Andy Cater
Thomas Schmitt
2022-05-21 12:00:01 UTC
Hi,

some technical nitpicking.
Post by Andrew M.A. Cater
Also, keeping large files around on disk for a long
time - there's some likelihood of data corruption.
The .jigdo and .template files of the DLBD ISOs are together smaller than
a netinst CD ISO. (Less than 90 MiB, see
https://cdimage.debian.org/debian-cd/current/amd64/jigdo-dlbd/ )

Building an ALL-IN-ONE would cost the computing time of another ISO set (*)
and 50 % more virtual memory than building DLBD1 (**).

I assume that the main workload with an additional set of jigdoized ISOs
is the need to once more shovel 75 GB of package files through libisofs
and libjte so that they can create .template and .jigdo files. The ISOs
themselves get directly piped into /dev/null, i guess.
Post by Andrew M.A. Cater
I'd hate a couple of
bit flips three quarters of the way through a 6TB file, say, to mean that
the whole thing is useless.
The vast majority of data is stored as packages on the worldwide mirrors.
I'd expect some quality of filesystem and backup which keeps damage
confined to a few packages.

There are a lot of outdated mirrors around where one could dig for those
which really got lost from all active mirrors. I remember the hunt for
a package which once was overwritten by a newer version with the same
.deb file name. It was needed for building an old ISO from .jigdo which
was created when the new version did not yet exist.
But even without finding the older version, the emerging ISO would have
been usable for all purposes which don't touch that one rogue package.

------------------------------------------------------------------------

Every time this wish pops up, i begin to ponder what is theoretically
needed to unite several pool trees from multiple ISOs into one, so that
it works like an official ALL-IN-ONE ISO.

Putting all files together and making the ISO bootable would be no
problem. But what does a neat pool have to offer as merged lists or other
meta-data so that it properly announces its content ?

------------------------------------------------------------------------

Footnotes:
(*) Computing time:
Maybe it needs a bit more than DLBD[12] together, because the insertion
algorithm of libisofs has a quadratic aspect in its personality. But
the branchy pool tree helps a lot to keep this problem under the cover.
(**) Virtual memory consumption:
The memory consumption of libisofs mainly depends on the number of
files and the length of their names. *11.3.0-amd64-DLBD-[12].jigdo
together list ~ 59,000 files with ~ 2.2 MiB of basename length.
My daily BD backups have ~ 70,000 files with ~ 1.4 MiB of basename
length and do not demand gigabytes of RAM.


Have a nice day :)

Thomas
Andrew M.A. Cater
2022-05-21 13:50:01 UTC
On Sat, May 21, 2022 at 01:57:59PM +0200, Thomas Schmitt wrote:

Hi Thomas,
Post by Thomas Schmitt
Hi,
some technical nitpicking.
Post by Andrew M.A. Cater
Also, keeping large files around on disk for a long
time - there's some likelihood of data corruption.
The .jigdo and .template files of the DLBD ISOs are together smaller than
a netinst CD ISO. (Less than 90 MiB, see
https://cdimage.debian.org/debian-cd/current/amd64/jigdo-dlbd/ )
Building an ALL-IN-ONE would cost the computing time of another ISO set (*)
and 50 % more virtual memory than building DLBD1 (**).
That's if you assume that building from jigdo and a mirror is always fine.
If the original poster wants one huge .iso as one file to download from
cdimage.debian.org - then 2 x double layer Blu-Ray (say) as one file
would be 100GB or so. Even on a good quality link, that's quite a time.

[If you've actually got a physical mirror near to you, the jigit scripts
work even better than jigdo-file - but they produce the .iso on the same
machine, which is not ideal for everyone.]


Then there's storage - generating multiples of those per architecture
per point release adds up to space on cdimage.debian.org.
Post by Thomas Schmitt
I assume that the main workload with an additional set of jigdoized ISOs
is the need to once more shovel 75 GB of package files through libisofs
and libjte so that they can create .template and .jigdo files. The ISOs
themselves get directly piped into /dev/null, i guess.
I don't actually know how long it takes to create the .jigdo and .template
files - I don't think it's as long as generating the full .iso files and
copying them around between machines as they're generated on each point
release day, for example, and they don't get generated multiple times.
Post by Thomas Schmitt
Post by Andrew M.A. Cater
I'd hate a couple of
bit flips three quarters of the way through a 6TB file, say, to mean that
the whole thing is useless.
This is the case for the hypothetical "all in one" .iso to contain all
architectures - be a "one source for Debian 10.12" .iso - which is feasible
but probably not sensible.
Post by Thomas Schmitt
The vast majority of data is stored as packages on the worldwide mirrors.
I'd expect some quality of filesystem and backup which keeps damage
confined to a few packages.
There are a lot of outdated mirrors around where one could dig for those
which really got lost from all active mirrors. I remember the hunt for
a package which once was overwritten by a newer version with the same
.deb file name. It was needed for building an old ISO from .jigdo which
was created when the new version did not yet exist.
But even without finding the older version, the emerging ISO would have
been usable for all purposes which don't touch that one rogue package.
------------------------------------------------------------------------
Every time this wish pops up, i begin to ponder what is theoretically
needed to unite several pool trees from multiple ISOs into one, so that
it works like an official ALL-IN-ONE ISO.
Putting all files together and making the ISO bootable would be no
problem. But what does a neat pool have to offer as merged lists or other
meta-data so that it properly announces its content ?
------------------------------------------------------------------------
Maybe it needs a bit more than DLBD[12] together, because the insertion
algorithm of libisofs has a quadratic aspect in its personality. But
the branchy pool tree helps a lot to keep this problem under the cover.
The memory consumption of libisofs mainly depends on the number of
files and the length of their names. *11.3.0-amd64-DLBD-[12].jigdo
together list ~ 59,000 files with ~ 2.2 MiB of basename length.
My daily BD backups have ~ 70,000 files with ~ 1.4 MiB of basename
length and do not demand gigabytes of RAM.
Have a nice day :)
Thomas
And you too - it's always good to have other people to think round a problem
with.

All best, as ever,

Andy Cater
Thomas Schmitt
2022-05-21 15:10:01 UTC
Hi,

i reply to ***@bugs.debian.org and ***@gmail.com because
of my question and sketch for a procedure to merge DLBD-1 and -2 after
download into the desired ALL-IN-ONE ISO.
Post by Andrew M.A. Cater
If the original poster wants one huge .iso as one file to download from
cdimage.debian.org - then 2 x double layer Blu-Ray (say) as one file
would be 100GB or so.
11.3.0 is at (48245475328+35798159360)/1073741824 = 78.272 GiB.

$ gunzip <debian-11.3.0-amd64-DLBD-1.jigdo | grep '# Image size'
# Image size 48245475328 bytes
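
The sum can also be computed from the .jigdo files themselves. A minimal sketch, assuming each file is gzip-compressed and carries a "# Image size N bytes" line as shown above; the function names jigdo_image_size and bytes_to_gib are made up for illustration.

```shell
#!/bin/sh
# Print the image size in bytes as recorded in a gzipped .jigdo file.
jigdo_image_size() {
    gunzip -c "$1" | awk '/^# Image size/ && !seen { print $4; seen = 1 }'
}

# Add up byte counts given as arguments and print the sum in GiB.
bytes_to_gib() {
    total=0
    for n in "$@"; do
        total=$(( total + n ))
    done
    LC_ALL=C awk -v b="$total" 'BEGIN { printf "%.3f\n", b / 1073741824 }'
}

# With the real files one would run:
#   bytes_to_gib "$(jigdo_image_size debian-11.3.0-amd64-DLBD-1.jigdo)" \
#                "$(jigdo_image_size debian-11.3.0-amd64-DLBD-2.jigdo)"
# Here the two sizes quoted in the message are used directly:
bytes_to_gib 48245475328 35798159360
```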
Post by Andrew M.A. Cater
Even on a good quality link, that's quite a time.
Yes. Especially since at least with jigdo-lite the bottleneck is the
latency of downloading the packages. A 78 GiB ISO might need a day to
become complete. jigdo-lite is graceful about being interrupted and resumed
the other day, though.

Running several jigdo-lite downloads simultaneously might mitigate the
latency problem so that bandwidth becomes the bottleneck.
Post by Andrew M.A. Cater
I don't actually know how long it takes to create the .jigdo and .template
files - I don't think it's as long as generating the full .iso files
They are created by libjte under control of libisofs under control of
xorriso as side effect of actual ISO production under control of debian-cd.
(I'm developer of libisofs and xorriso, and co-founder of libjte, which is
now back in the hands of Steve McIntyre from whose genisoimage code it
got large parts of its entrails.)

I was possibly wrong with guessing that the ISO is dumped into /dev/null,
although debian-cd seems to be smart enough to read the various checksums
from the .jigdo files, rather than calculating them from the .iso files.
The checksum code
https://sources.debian.org/src/debian-cd/3.1.35/tools/imagesums/#L57
reminded me that there are .torrent files made, which obviously need
the .iso files at least at build time.

-------------------------------------------------------------------------
Post by Andrew M.A. Cater
it's always good to have other people to think round a problem with.
Is there documentation from which i could learn how the stuff in (i guess)
/dists of DLBD-1 and DLBD-2 could be merged so that it properly describes
a merged pool tree ?

My rough idea would be:
- mount both ISOs
- derive merged /dists files
- run xorriso to let it
- load DLBD-1 with its boot equipment
- merge-in the pool tree of mounted DLBD-2
- overwrite the old /dists files by the newly derived ones
- automatically replay the commands for the loaded boot equipment
- store the result as new .iso file

The second step is where i would need info or advice.


Have a nice day :)

Thomas
Zhang Boyang
2022-05-21 17:00:01 UTC
Post by Thomas Schmitt
Is there documentation from which i could learn how the stuff in (i guess)
/dists of DLBD-1 and DLBD-2 could be merged so that it properly describes
a merged pool tree ?
- mount both ISOs
- derive merged /dists files
- run xorriso to let it
- load DLBD-1 with its boot equipment
- merge-in the pool tree of mounted DLBD-2
- overwrite the old /dists files by the newly derived ones
- automatically replay the commands for the loaded boot equipment
- store the result as new .iso file
The second step is where i would need info or advice.
Hi,

Sorry, but I'm not familiar with the internal details of ISOs, and I guess
there is no ready-made tool that can merge two ISOs together. After a quick
look at dists/, I think these files are in the same format as those on online
mirrors, so referring to apt's documentation might be useful. I also found
that these files are just plain text (or gzipped plain text), so the format
might be very simple, I think.


Best Regards,
Zhang Boyang
Thomas Schmitt
2022-05-21 18:40:01 UTC
Hi,
Post by Zhang Boyang
I guess there is no ready tool can merge two ISOs together.
Not directly. But xorriso can load the meta-data of an ISO, manipulate
that loaded model, and write the result as new ISO with the same boot
equipment as found in the loaded ISO.
As said, that would be no problem. Only enough disk space is needed, of
course, and a Debian version >= 9 or locally built GNU xorriso >= 1.4.2.
Post by Zhang Boyang
After a quick look on dists/,
I think these files are in same format with those on online mirrors, so
referring apt's documentation might be useful.
Searching "Debian Packages.gz" brought me to
https://wiki.debian.org/DebianRepository/Format
which gives some hope for enlightenment.
I will try to merge DVD-1 and DVD-2 but then need a way to verify that
the result is fully functional.

It looks like i have to merge these files from DVD-1:

./dists/bullseye/contrib/binary-amd64/Packages.gz
./dists/bullseye/main/binary-amd64/Packages.gz
./dists/bullseye/main/debian-installer/binary-amd64/Packages.gz

./dists/bullseye/contrib/binary-amd64/Release
./dists/bullseye/main/binary-amd64/Release
./dists/bullseye/main/debian-installer/binary-amd64/Release
./dists/bullseye/Release

with these from DVD-2:

./bullseye/contrib/binary-amd64/Packages.gz
./bullseye/main/binary-amd64/Packages.gz

./bullseye/contrib/binary-amd64/Release
./bullseye/main/binary-amd64/Release
./bullseye/main/debian-installer/binary-amd64/Release
./bullseye/Release

Bystanders: Do i miss something yet ?
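
Such a list can be double-checked by enumerating the metadata files of both mounted ISOs and comparing the results. A small sketch, assuming both ISOs are mounted and that only Packages.gz and Release files matter (an assumption; a medium may carry other index files). The function names are hypothetical.

```shell
#!/bin/sh
# List Packages.gz and Release files below <mountpoint>/dists as sorted
# paths relative to that dists directory.
list_dists_metadata() {
    ( cd "$1/dists" &&
      find . -type f \( -name 'Packages.gz' -o -name 'Release' \) |
      LC_ALL=C sort )
}

# Show the paths that exist in only one of the two trees
# ("<" = only in the first, ">" = only in the second).
compare_dists_metadata() {
    list_a=$(mktemp)
    list_b=$(mktemp)
    list_dists_metadata "$1" > "$list_a"
    list_dists_metadata "$2" > "$list_b"
    diff "$list_a" "$list_b" || true   # differences are the expected result
    rm -f "$list_a" "$list_b"
}

# Usage, with mount points chosen freely:
#   compare_dists_metadata /mnt/dvd_1 /mnt/dvd_2
```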

----------------------------------------------------------------------

To demonstrate the rest of my sketch i tried this with DVD instead of DLBD
(xorriso is GNU xorriso-1.5.2, equivalent to xorriso in Debian 11):

# The variables are named after DLBD but hold the DVD ISOs in this test.
DLBD_1=debian-11.2.0-amd64-DVD-1.iso
DLBD_1_MOUNT=/mnt/dlbd_1
DLBD_2=debian-11.2.0-amd64-DVD-2.iso
DLBD_2_MOUNT=/mnt/dlbd_2
RESULT=ALL_IN_ONE.iso

# Copy /dists of the first ISO to a writable local directory
sudo mkdir -p "$DLBD_1_MOUNT" "$DLBD_2_MOUNT"
sudo mount "$DLBD_1" "$DLBD_1_MOUNT"
cp -a "$DLBD_1_MOUNT/dists" merged_dists
sudo umount "$DLBD_1_MOUNT"
chmod -R u+w merged_dists
sudo mount "$DLBD_2" "$DLBD_2_MOUNT"

# TODO:
# Merge the files in merged_dists with those from "$DLBD_2_MOUNT"

# Load the first ISO, add the pool tree of the second one, replace /dists,
# and write the result with the boot equipment of the first ISO
xorriso -indev "$DLBD_1" \
        -outdev "$RESULT" \
        -map "$DLBD_2_MOUNT"/pool /pool \
        -map merged_dists /dists \
        -chown_r 0 /dists -- \
        -chgrp_r 0 /dists -- \
        -chmod_r a-w /dists -- \
        -boot_image any replay \
        -blank as_needed \
        -stdio_sync off \
        -padding included \
        -compliance no_emul_toc

sudo umount "$DLBD_2_MOUNT"

xorriso -indev "$RESULT" -report_el_torito plain -report_system_area plain

yields the typical boot jackalope (still escapes more coyotes than any
other ISO partition layout):

El Torito catalog : 4301 1
El Torito cat path : /isolinux/boot.cat
El Torito images : N Pltf B Emul Ld_seg Hdpt Ldsiz LBA
El Torito boot img : 1 BIOS y none 0x0000 0x00 4 5598
El Torito boot img : 2 UEFI y none 0x0000 0x00 5184 4302
El Torito img path : 1 /isolinux/isolinux.bin
El Torito img opts : 1 boot-info-table isohybrid-suitable
El Torito img path : 2 /boot/grub/efi.img
System area options: 0x00000202
System area summary: MBR isohybrid cyl-align-off GPT APM
ISO image size/512 : 16754652
Partition offset : 0
MBR heads per cyl : 0
MBR secs per head : 0
MBR partition table: N Status Type Start Blocks
MBR partition : 1 0x80 0x00 0 16754652
MBR partition : 2 0x00 0xef 17208 5184
MBR partition path : 2 /boot/grub/efi.img
GPT : N Info
GPT disk GUID : 657a86211710b54c8e25101781372d9f
GPT entry array : 12 208 overlapping
GPT lba range : 64 16754598 16754651
GPT partition name : 1 490053004f00480079006200720069006400
GPT partname local : 1 ISOHybrid
GPT partition GUID : 1 657a86211710b54c8e24101781372d9f
GPT type GUID : 1 a2a0d0ebe5b9334487c068b6b72699c7
GPT partition flags: 1 0x1000000000000001
GPT start and size : 1 0 16754596
GPT partition name : 2 490053004f004800790062007200690064003100
GPT partname local : 2 ISOHybrid1
GPT partition GUID : 2 657a86211710b54c8e27101781372d9f
GPT type GUID : 2 a2a0d0ebe5b9334487c068b6b72699c7
GPT partition flags: 2 0x1000000000000001
GPT start and size : 2 17208 5184
GPT partition path : 2 /boot/grub/efi.img
APM : N Info
APM block size : 2048
APM gap fillers : 0
APM partition name : 1 EFI
APM partition type : 1 Apple_HFS
APM start and size : 1 4302 1296
APM partition path : 1 /boot/grub/efi.img

For details of the xorriso run, see man xorriso.
For the report format, see
xorriso -report_el_torito help -report_system_area help | less


Have a nice day :)

Thomas
Steve McIntyre
2022-05-21 22:40:01 UTC
Hey folks,
Post by Thomas Schmitt
Post by Andrew M.A. Cater
I don't actually know how long it takes to create the .jigdo and .template
files - I don't think it's as long as generating the full .iso files
Whether you make the .iso or not, we have to run the full process. The
*only* part of the process we can win on (a little) is by not actually
writing the .iso file to disk. If we don't want the .iso, we output to
/dev/null and save a little bit of output I/O [1]. All the same I/O is
used for input, of course - we have to scan all the data to generate
checksums, etc.
Post by Thomas Schmitt
They are created by libjte under control of libisofs under control of
xorriso as side effect of actual ISO production under control of debian-cd.
(I'm developer of libisofs and xorriso, and co-founder of libjte, which is
now back in the hands of Steve McIntyre from whose genisoimage code it
got large parts of its entrails.)
I was possibly wrong with guessing that the ISO is dumped into /dev/null,
although debian-cd seems to be smart enough to read the various checksums
from the .jigdo files, rather than calculating them from the .iso files.
The checksum code
https://sources.debian.org/src/debian-cd/3.1.35/tools/imagesums/#L57
reminded me that there are .torrent files made, which obviously need
the .iso files at least at build time.
No, your initial guess is correct. We don't generate the .iso files
at all for the larger images [1]. This means we also don't have
torrent files for them [2]. There's no point generating the .torrent if
we don't have the full .iso available as well, after all.

[1] https://salsa.debian.org/images-team/debian-cd/-/blob/master/tools/make_image#L124
[2] https://cdimage.debian.org/cdimage/release/11.3.0/amd64/
Post by Thomas Schmitt
-------------------------------------------------------------------------
Post by Andrew M.A. Cater
it's always good to have other people to think round a problem with.
Is there documentation from which i could learn how the stuff in (i guess)
/dists of DLBD-1 and DLBD-2 could be merged so that it properly describes
a merged pool tree ?
- mount both ISOs
- derive merged /dists files
- run xorriso to let it
- load DLBD-1 with its boot equipment
- merge-in the pool tree of mounted DLBD-2
- overwrite the old /dists files by the newly derived ones
- automatically replay the commands for the loaded boot equipment
- store the result as new .iso file
The second step is where i would need info or advice.
The debian-cd code in tools/make_disc_trees.pl is not documentation
**as such**, but it's exactly how we create disc trees: all the
packages/sources and metadata of various flavours. It's basically just
making a self-contained apt repository on each medium.
--
Steve McIntyre, Cambridge, UK. ***@einval.com
“Why do people find DNS so difficult? It’s just cache invalidation and
naming things.”
-– Jeff Waugh (https://twitter.com/jdub)
Thomas Schmitt
2022-05-22 10:10:01 UTC
Hi,
Post by Steve McIntyre
your initial guess is correct. We don't generate the .iso files
at all for the larger images [1]. This means we also don't have
torrent files for them [2].
I began to ponder about a shortcut in libisofs which would trust the
checksum file (-checksum-list , -md5-list) enough to omit the reading of
all the package files' content.
Size, ownership, permissions, et al. would still be taken from the package
files on disk. No ISO image would emerge (because of no valid file content)
but .jigdo and .template would be created.
Probably libjte would need an API extension so that it knows that only
the count parameter of a libjte_show_data_chunk() call is valid.
Vice versa libisofs would have to ask libjte whether a particular file
is covered by the checksum list.

All tricky and probably not worth the risk of embarrassing failure.
Post by Steve McIntyre
The debian-cd code in tools/make_disc_trees.pl is not documentation
I am using it now for checking detail questions.
Post by Steve McIntyre
It's basically just making a self-contained apt repository on each medium.
So
https://wiki.debian.org/DebianRepository/Format
looks like the specs to follow.

Question (to everybody):

The description of Packages[.gz] files talks of "paragraphs" but does not
exactly define a paragraph's end delimiter. From Packages.gz in the ISO
i'd guess it is an empty line or the "Package" field of the next paragraph.

Is an empty line needed between paragraphs ?
Would more than one empty line between paragraphs damage the readability ?

Reason: I want to merge the Packages.gz files like

(gunzip <from_DLBD_1/Packages.gz ; gunzip <from_DLBD_2/Packages.gz ) \
| gzip >temp_file

but am not sure that the found Packages.gz will always end by an empty
line. So i could simply insert an echo between the gunzips, or simply
trust that the empty line is not needed as separator, or begin to think ...
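
One way to sidestep both questions is to normalise the separators while merging. A sketch only, assuming a POSIX awk whose paragraph mode (RS = "") treats any run of empty lines as a single separator; the function name merge_packages_gz is made up for illustration.

```shell
#!/bin/sh
# Concatenate two Packages.gz files so that paragraphs are separated by
# exactly one empty line, whatever the two inputs end with.
merge_packages_gz() {
    {
        gunzip -c "$1"
        printf '\n\n'      # forces a separator even if input 1 lacks a final newline
        gunzip -c "$2"
    } | awk 'BEGIN { RS = ""; ORS = "\n\n" } { print }' | gzip -n -c > "$3"
}

# Usage with the file names from the message above:
#   merge_packages_gz from_DLBD_1/Packages.gz from_DLBD_2/Packages.gz temp_file
```

The sketch does not remove paragraphs that occur in both inputs; it assumes that the two media describe disjoint sets of packages.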


Have a nice day :)

Thomas
Steve McIntyre
2022-05-22 16:40:01 UTC
Hey Thomas!
Post by Thomas Schmitt
Post by Steve McIntyre
your initial guess is correct. We don't generate the .iso files
at all for the larger images [1]. This means we also don't have
torrent files for them [2].
I began to ponder about a shortcut in libisofs which would trust the
checksum file (-checksum-list , -md5-list) enough to omit the reading of
all the package files' content.
Size, ownership, permissions, et al. would still be taken from the package
files on disk. No ISO image would emerge (because of no valid file content)
but .jigdo and .template would be created.
Probably libjte would need an API extension so that it knows that only
the count parameter of a libjte_show_data_chunk() call is valid.
Vice versa libisofs would have to ask libjte whether a particular file
is covered by the checksum list.
All tricky and probably not worth the risk of embarrassing failure.
Cute idea (grin!), but it's a non-starter - we wouldn't be able to
generate the various checksums for the whole image.
Post by Thomas Schmitt
Post by Steve McIntyre
The debian-cd code in tools/make_disc_trees.pl is not documentation
I am using it now for checking detail questions.
Post by Steve McIntyre
It's basically just making a self-contained apt repository on each medium.
So
https://wiki.debian.org/DebianRepository/Format
looks like the specs to follow.
The description of Packages[.gz] files talks of "paragraphs" but does not
exactly define a paragraph's end delimiter. From Packages.gz in the ISO
i'd guess it is an empty line or the "Package" field of the next paragraph.
Is an empty line needed between paragraphs ?
Yup, it's an empty line.
Post by Thomas Schmitt
Would more than one empty line between paragraphs damage the readability ?
Not sure, to be honest.
Post by Thomas Schmitt
Reason: I want to merge the Packages.gz files like
(gunzip <from_DLBD_1/Packages.gz ; gunzip <from_DLBD_2/Packages.gz ) \
| gzip >temp_file
but am not sure that the found Packages.gz will always end by an empty
line. So i could simply insert an echo between the gunzips, or simply
trust that the empty line is not needed as separator, or begin to think ...
Might just work, yeah!
--
Steve McIntyre, Cambridge, UK. ***@einval.com
< liw> everything I know about UK hotels I learned from "Fawlty Towers"
Thomas Schmitt
2022-05-22 17:30:01 UTC
Hi,
Post by Steve McIntyre
Post by Thomas Schmitt
I began to ponder about a shortcut in libisofs which would trust the
checksum file (-checksum-list , -md5-list) enough to omit the reading of
all the package files' content.
Cute idea (grin!), but it's a non-starter - we wouldn't be able to
generate the various checksums for the whole image.
Yeah. Ain't it sad ?
Post by Steve McIntyre
Post by Thomas Schmitt
Would more than one empty line between paragraphs damage the readability ?
Not sure, to be honest.
So i will have to let my script think more. (It's at 287 lines, meanwhile.)
Post by Steve McIntyre
Might just work, yeah!
My script does meanwhile:
- Take as arguments the paths of
boot_iso boot_mount_directory add_iso add_mount_directory result_iso
- Mount the bootable ISO (e.g. debian-11.3.0-amd64-DLBD-1.iso) and the
add-on ISO (e.g. debian-11.3.0-amd64-DLBD-2.iso).
- Determine the release name $dist of the bootable ISO (e.g. "bullseye").
- Copy /dists and /md5sum.txt from bootable ISO to local working directory.
- Merge the /md5sum.txt of both ISOs.
- Read the paths from first checksummed file lists of both
/dists/$dist/Release files, sort and uniq them. (I don't bet that both
have the same list, although DVD-1 and DVD-2 have the same.)
- Merge the namesake .gz files from that list (if both exist) or copy .gz
file from the add-on ISO (if not yet in bootable ISO).
- Create a new file /dists/$dist/Release with the header lines from the
file in the bootable ISO and freshly computed checksums of the merged
files.
- Produce the bootable result_iso, overwriting /dists and /md5sum.txt
by the altered copies from the local directory.
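
A minimal sketch of two steps from the list above: merging namesake
Packages.gz files, and computing a fresh checksum line for Release. It is an
illustration under assumptions, not code from the script. The function names
merge_packages_gz and release_line are hypothetical, the awk filter reduces
any run of empty lines to exactly one, and the 16 character wide size column
is only cosmetic.

```sh
#!/bin/sh
# Sketch only: names and layout are illustrative, not taken from the script.

# Concatenate the paragraphs of several Packages.gz files.
# Usage: merge_packages_gz result.gz input1.gz input2.gz ...
merge_packages_gz() {
  result=$1
  shift
  for gz in "$@"
  do
    gunzip -c "$gz"
    # Guarantee a separator even if the input lacks a trailing empty line.
    echo
  done \
  | awk '/^$/ { blank = 1; next }
         { if (blank && seen) print ""; print; seen = 1; blank = 0 }' \
  | gzip >"$result"
}

# Print one line for the SHA256 section of a Release file.
# Usage: release_line dists_dist_directory relative_path
release_line() {
  sum=$(sha256sum <"$1/$2" | awk '{ print $1 }')
  # The arithmetic expansion strips blanks which some wc versions emit.
  size=$(( $(wc -c <"$1/$2") ))
  printf ' %s %16d %s\n' "$sum" "$size" "$2"
}
```

With this, the question about a missing or doubled empty line no longer
matters for the merged file: the result has exactly one empty line between
paragraphs and none at the end.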

This can be repeated by using result_iso as new boot_iso and another add_iso
and a new result_iso name.

------------------------------------------------------------------------

How can i verify that the resulting ISO properly announces all its
packages ?
(If i install it to a VM, what must i do to challenge its completeness ?)


Have a nice day :)

Thomas
Zhang Boyang
2022-05-23 03:30:01 UTC
Hi,
Post by Thomas Schmitt
How can i verify that the resulting ISO properly announces all its
packages ?
(If i install it to a VM, what must i do to challenge its completeness ?)
I came up with an idea. Maybe you can use 'debian-cd' to create a DLBD
set, say disc A1 and A2, then create another ALL-IN-ONE set, say disc B.
Then compare if A1+A2==B. There might be small differences, like the
package order in Packages.gz, but I think if the overall format is OK,
then it will be OK.
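
One way to make the comparison A1+A2==B independent of the package order is
to compare only the sorted Filename fields of the package lists. The
following is a sketch under that assumption; the function names
announced_files and same_announcements are hypothetical and other fields of
the paragraphs are not compared.

```sh
#!/bin/sh
# Sketch only: compares which pool files are announced, nothing else.

# List the pool paths announced by one or more Packages.gz files,
# sorted and without duplicates.
announced_files() {
  for gz in "$@"
  do
    gunzip -c "$gz"
  done | sed -n 's/^Filename: //p' | LC_ALL=C sort -u
}

# Succeed if the all-in-one list and the split lists announce the same files.
# Usage: same_announcements all_in_one.gz part1.gz part2.gz ...
same_announcements() {
  reference=$1
  shift
  work=$(mktemp -d) || return 1
  announced_files "$reference" >"$work/reference"
  announced_files "$@" >"$work/parts"
  # diff prints the differing paths and returns non-zero if there are any.
  diff "$work/reference" "$work/parts"
  rc=$?
  rm -rf "$work"
  return "$rc"
}
```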


Best Regards,
Zhang Boyang
Thomas Schmitt
2022-05-23 19:50:01 UTC
Hi,

i uploaded the first version of my merger script as
https://dev.lovelyhq.com/libburnia/libisoburn/raw/branch/master/test/merge_2_debian_isos

Please download and give x-permission. The need for sudo can be avoided
by already mounting the two ISOs at the chosen mount points before running
the script.
Review and test reports are appreciated.

As motivation for tests by Debian installation experts:

If this works, then Debian could replace the DLBD sized ISOs by the
BD sized ISOs and instructions how to merge them to DLBD ISOs or a
QLBD/All-in-one ISO. (In that case the script should move into the hands
of debian-cd, i guess.)

=======================================================================
$ ./merge_2_debian_isos
usage: merge_2_debian_isos \
boot_iso boot_mount add_iso add_mount result_iso [for_dist]

Mounts by sudo the boot_iso at directory boot_mount and add_iso at
add_mount, if not already mounted that way. Then both Debian pools
and package lists get merged and a new ISO 9660 image result_iso
is produced, which must not yet exist.
If boot_iso is bootable then the new image will be bootable by the
same means.
This script creates the following temporary tree and files which
must not yet exist in the current working directory:
./merged_dists , ./merged_md5sum.txt , ./merged_REAMDE.txt
./temp_file
The optional sixth argument for_dist should only be given if
this script refuses to work and proposes to give this argument.
Exported non-empty variable MERGE_DATE enforces a particular
date string in the text which gets prepended to /README.txt .
Exported non-empty variable XORRISO overrides command xorriso,
which may be needed if installed xorriso is older than 1.4.2.
Example using GNU xorriso-1.5.4 instead of /usr/bin/xorriso:
export XORRISO=$HOME/xorriso-1.5.4/xorriso/xorriso
merge_2_debian_isos debian-11.2.0-amd64-DVD-1.iso /mnt/iso1 \
debian-11.2.0-amd64-DVD-2.iso /mnt/iso2 merged.iso
=======================================================================

The /README.txt of the result ISO gets a prefix text:
=======================================================================
Result of a run of merge_2_debian_isos at 20220523-20:09
Package pools and Packages lists were merged.
The other files stem from the first input ISO.

Input ISO: debian-11.2.0-amd64-DVD-1.iso
Debian GNU/Linux 11.2.0 "Bullseye" - Official amd64 DVD Binary-1
20211218-11:13

Input ISO: debian-11.2.0-amd64-DVD-2.iso
Debian GNU/Linux 11.2.0 "Bullseye" - Official amd64 DVD Binary-2
20211218-11:13

------------------------------------------------------------------------------
[... text of boot_iso's README.txt ...]

=======================================================================

The result of merging debian-11.2.0-amd64-DVD-[12].iso boots with
qemu-system-x86_64 -enable-kvm -m 512 -hda merged.iso
to a boot loader menu with Debian logo.

I did not go further with installation yet, mainly because i still lack
ideas and experience how i would verify that the ISO's main repo knows
6788 *.deb files instead of 4753 in DVD-1 and 2035 in DVD-2.
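
A possible check that needs no installation is to compare the Filename
fields of all Packages.gz files below /dists/$dist with the *.deb files
actually present in /pool of the mounted result ISO. This is a sketch under
assumptions: the function name unannounced_debs is hypothetical, the ISO is
assumed to be mounted, and $dist must be the real directory name (e.g.
bullseye), not a symbolic link like stable.

```sh
#!/bin/sh
# Sketch only: prints *.deb files in the pool which no Packages.gz announces.
# Usage: unannounced_debs mount_point dist
unannounced_debs() {
  root=$1
  dist=$2
  work=$(mktemp -d) || return 1
  # All pool paths which the package lists of this dist know about.
  find "$root/dists/$dist" -name Packages.gz -exec gunzip -c {} + \
    | sed -n 's/^Filename: //p' | LC_ALL=C sort -u >"$work/announced"
  # All *.deb files which are really in the pool.
  ( cd "$root" && find pool -type f -name '*.deb' ) \
    | LC_ALL=C sort >"$work/present"
  # comm -13 shows lines which are only in the second file.
  LC_ALL=C comm -13 "$work/announced" "$work/present"
  rm -rf "$work"
}
```

An empty output would mean that every *.deb in the pool is announced.
Counting the lines of the intermediate lists with wc -l would give numbers
comparable to the 6788, 4753, and 2035 mentioned above.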

To ease usage and to avoid quadratic behavior (actually triangular), i plan
to beef up the script so that it can merge more than 2 ISOs in one run.
E.g. for downloading all 19 11.3.0 amd64 DVDs and merging them.
(19 /dev/loop* should be no problem, i hope.)
I came up with an idea. Maybe you can use 'debian-cd' to create a DLBD set,
say disc A1 and A2, then create another ALL-IN-ONE set, say disc B. Then
compare if A1+A2==B. There might be small differences, like the package
order in Packages.gz, but I think if the overall format is OK, then it will
be OK.
I fear that several of the tasks in debian-cd are beyond my talents.
(I am about the third worst sysadmin which i know of.)

In the end it is about whether the merged ISO works fully or not.


Have a nice day :)

Thomas
Thomas Schmitt
2022-05-24 20:00:01 UTC
Hi,

for now it looks like the merged ISO works as fat DVD-1.

I installed a Debian 11.2.0 system from the merged DVD-1+ DVD-2 ISO
in a qemu VM via option -cdrom. Installation went smoothly.
During reboot i aborted the VM to next deface the El Torito boot sector
of the ISO. I removed the -net options from the qemu start command,
hopefully simulating a pulled network adapter.
Then i booted from the virtual -hda, having the -cdrom present too.
I removed from sources.list the deb and deb-src entries with http.

ping debian.org
gets no replies. So the network is effectively down, although
ip addr
shows some "ens3" with "altname enp0s3", which looks much like the
"nnp5s0" of the host machine except that the latter has no "altname".

To make sure that i can install stuff, i did as superuser

dpkg -s apt-file
dpkg -s lame
dpkg -s xorriso

which all three indicated that the packages are not yet installed.
Then i tried to install them:

# From pool of DVD-1
apt-get install apt-file

# From pool of DVD-2
apt-get install lame

# Not in the two DVD images
apt-get install xorriso

The first two succeeded and behaved as i know from apt-get.
The third did not succeed, as expected.

----------------------------------------------------------------------

But
apt-file update
fails with
E: The repository 'cdrom://[Debian GNU/Linux 11.2.0 _Bullseye_ - Official amd64 DVD Binary-1 20211218-11:13] bullseye Release' does not have a Release file
N: Update [...] can't be done securely [...] disabled by default.
N: See apt-secure(8) [...]

Suspecting a flaw with my merge procedure, i booted with the original
DVD-1 ISO as -cdrom.
I had to run
apt-cdrom add
before the repo was accepted (i'm riddling a lot) but then i could prove
that it works by
apt-get install needrestart
which is on DVD-1.
Nevertheless
apt-file update
tells the same "E:" and "N:" lines.

So the shortcoming seems to be in DVD-1 already.
Anything known about it ?

(I learned that it was not necessary to deface El torito on DVD-1+2.
DVD-1 with El Torito at -cdrom does not keep -hda from booting.)

----------------------------------------------------------------------

What shall i further test to challenge the merged ISO whether it has
any shortcomings compared to the original DVD-1 ?


Have a nice day :)

Thomas
Thomas Schmitt
2022-06-15 11:30:01 UTC
Hi,

although it was not the final solution of this bug report, i beefed up
my merger script for Debian ISOs so that it can combine an arbitrary
number of ISOs (within the limits of /dev/loop* and mount(8)).
Maybe it can serve as answer for the next time this wish comes up.

The script is uploaded as

https://dev.lovelyhq.com/libburnia/libisoburn/raw/branch/master/test/merge_debian_isos

with GPG detached signature
https://dev.lovelyhq.com/libburnia/libisoburn/raw/branch/master/test/merge_debian_isos.sig
for checking by gpg --verify.

After download, the user probably has to give x-permissions to the script.

When run without arguments it gives this help text:
-----------------------------------------------------------------------

usage: merge_debian_isos result_iso mount_template iso1 iso2 [... isoN]

Mounts by sudo the ISO 9660 images iso1 to isoN at directories
mount_template1 to mount_templateN, if not already mounted that way.
Then the Debian pools and package lists get merged and a new
ISO 9660 image result_iso is produced, which must not yet exist.
If iso1 is bootable then the new image will be bootable by the
same means.
At least the parent directory of mount_template must already exist.
All arguments must be single words without using quotation marks.
None of the isoN must be equal to another isoM.

This script creates and finally removes the following temporary tree
and files which must not yet exist in the current working directory:
./merged_dists , ./merged_md5sum.txt , ./merged_REAMDE.txt
./temp_file
Further it creates and finally removes directories mount_template*
if they are needed and do not exist when the script starts.
It depends on the following programs:
awk, basename, bash, cat, chmod, cp, dirname, expr, fgrep, grep,
gunzip, gzip, head, ls, mkdir, mount, mv, rm, rmdir, sha256sum,
sed, sort, stat, sudo, umount, xorriso
Recommended are: md5sum, sha1sum, sha512sum

Exported non-empty variable MERGE_DATE enforces a particular
date string in the text which gets prepended to /README.txt .
Exported non-empty variable MERGE_FOR_DIST enforces the use of a
particular directory in /dists of iso1. Normally only one
such directory is found and thus no need to set MERGE_FOR_DIST.
Exported non-empty variable XORRISO overrides command xorriso.
This may be needed if installed xorriso is older than 1.4.2.

Example using GNU xorriso-1.5.4 instead of /usr/bin/xorriso:
export XORRISO=$HOME/xorriso-1.5.4/xorriso/xorriso
mkdir merge_mount
merge_debian_isos merged.iso merge_mount/iso \
debian-11.2.0-amd64-DVD-[12345].iso
rmdir merge_mount

-----------------------------------------------------------------------

The old script merge_2_debian_isos still exists but will refuse to run,
unless the user removes an exit command near the start of the script.
In its downloaded form it tells the user:
-----------------------------------------------------------------------
THIS SCRIPT IS DEPRECATED ! USE ITS SUCCESSOR: merge_debian_isos

The script merge_2_debian_isos still exists only because it was mentioned
in Debian bug 1011343. The successor can merge more than two ISOs.
So do not edit this script to remove this warning and the 'exit 7' line.
-----------------------------------------------------------------------

I tested the new script by merging the first three ISOs of
debian-11.2.0-amd64-DVD-*.iso
and installing Debian from the result into a qemu-system-x86_64 VM with
4 GB RAM and 128 GB system disk.

After the installation was completed, i successfully installed packages
which are in DVDs 1, 2, and 3 and were not installed yet:
apt-get install apt-file
apt-get install lame
apt-get install xorriso

To verify that this success does not come from an unwanted network
connection i tried to install packages from DVD 4
apt-get install bash-doc
apt-get install devede
Both attempts failed with the message that the packages are not available.


Have a nice day :)

Thomas
Steve McIntyre
2022-06-15 16:30:01 UTC
Hey Thomas!
Post by Thomas Schmitt
Hi,
although it was not the final solution of this bug report, i beefed up
my merger script for Debian ISOs so that it can combine an arbitrary
number of ISOs (within the limits of /dev/loop* and mount(8)).
Maybe it can serve as answer for the next time this wish comes up.
The script is uploaded as
https://dev.lovelyhq.com/libburnia/libisoburn/raw/branch/master/test/merge_debian_isos
with GPG detached signature
https://dev.lovelyhq.com/libburnia/libisoburn/raw/branch/master/test/merge_debian_isos.sig
for checking by gpg --verify.
Cool. :-)

That might be a useful thing to include in a package. What do you think?
--
Steve McIntyre, Cambridge, UK. ***@einval.com
Armed with "Valor": "Centurion" represents quality of Discipline,
Honor, Integrity and Loyalty. Now you don't have to be a Caesar to
concord the digital world while feeling safe and proud.
Thomas Schmitt
2022-06-15 20:40:01 UTC
Hi,
Post by Steve McIntyre
Post by Thomas Schmitt
https://dev.lovelyhq.com/libburnia/libisoburn/raw/branch/master/test/merge_debian_isos
That might be a useful thing to include in a package. What do you think?
Best would be if debian-cd would take it, so that it can be adapted when
the repository format in the ISOs gets changed. It could be renamed to
e.g. "debian-cd-merge-isos" and become a separate "binary" package from
the debian-cd source package.
I would of course be willing to help with maintaining it.

But first it needs more testing, especially whether the resulting ISO
lacks anything that an ISO-1 from debian-cd with the same packages has.
Zhang Boyang posted a comparison in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1011343#115
https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=1011343;filename=diff.details.txt;msg=115
about which i ponder:

- Shall the script create a new /.disk/mkisofs ?
The xorriso run used here has little similarity with a xorriso run from
debian-cd. The pool content and package lists have no influence
on the xorriso run of debian-cd.
So it might be better to stay with the original.

- There are some /firmware files missing in the merged ISO. Like
/firmware/arm-trusted-firmware-tools_2.4+dfsg-2_amd64.deb
...
/firmware/gnome-firmware_3.36.0-1_amd64.deb
They are in the pool of DLBD-1. But how did they get into /firmware of
Zhang Boyang's CUSTOM(all-in-one) ? (Or: Why aren't they in /firmware
of DLBD-1 ?)


I hope that the script is sufficiently stable against bad arguments and
bad combinations of ISOs. At least i inserted a lot of error messages
and took care to trigger each of them.
Nevertheless, it should be tested independently of me whether it can be
tricked into destroying existing data on disk or leaving temporary files
on disk.


Have a nice day :)

Thomas
David
2022-06-16 02:40:01 UTC
Post by Thomas Schmitt
Post by Thomas Schmitt
https://dev.lovelyhq.com/libburnia/libisoburn/raw/branch/master/test/merge_debian_isos
Nevertheless, it should be tested independently of me whether it can be
tricked into destroying existing data on disk or leaving temporary files
on disk.
Hi Thomas,

Thank you for your many valued contributions to Debian community!

As a casual observer, can I politely suggest a couple of things
regarding this script:

1) Consider checking the script using the https://www.shellcheck.net/
tool and looking at the warnings it gives. In particular iterating
over the output of 'ls' is not advisable.

2) Consider whether you do want your script to depend on bash, or
perhaps remove any bash-dependent features and make it
posix-compatible so that it can run in for example Debian's default
shell, dash. This can be checked by changing the first line of the
script to '#!/bin/sh' and checking again with the shellcheck tool.
I tried that and it looks like there are no obstacles to doing that,
just small changes.
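
To illustrate point 1: a loop over a shell pattern gives the same file names
as a loop over the output of 'ls', but does not break on names with blanks.
This is a generic POSIX sh sketch, not a quote from the script; the function
name list_isos is hypothetical.

```sh
#!/bin/sh
# Sketch only: print each *.iso file in a directory, one per line.
# Usage: list_isos directory
list_isos() {
  for iso in "$1"/*.iso
  do
    # If nothing matches, the pattern stays literal. Skip that case.
    [ -e "$iso" ] || continue
    printf '%s\n' "$iso"
  done
}
```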
Thomas Schmitt
2022-06-17 09:50:01 UTC
Hi,

i followed David's advice to consider the warnings of shellcheck, to drop
the demand for bash, and to check early whether all needed programs are
available. xorriso gets now checked for being young enough for the job
of replaying the boot related commands as detected with input ISO 1.
(1.3.2 in Debian 8 is too old. 1.4.6 in Debian 9 is ok.)

The warnings yielded no real problem fixes but several improvements in
regard to general shell code quality.
The switch from /bin/bash to /bin/sh was not costly. A printf formatter
had to be changed and a use of "type -p" for a message was dropped.
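
The two early checks described above could look roughly like this in plain
POSIX sh. It is a sketch, not the code of the script: the function names
missing_programs and version_ge are hypothetical, only purely numeric
version components are handled, and how the version string is obtained from
xorriso is left out.

```sh
#!/bin/sh
# Sketch only: early checks before any work gets done.

# Print the names of those programs which are not found.
# Usage: missing_programs prog1 prog2 ...
missing_programs() {
  for prog in "$@"
  do
    command -v "$prog" >/dev/null 2>&1 || printf '%s\n' "$prog"
  done
}

# Succeed if dotted version $1 is at least dotted version $2.
# Missing components count as 0, e.g. 1.4 is treated like 1.4.0 .
version_ge() {
  have=$1
  want=$2
  while [ -n "$have" ] || [ -n "$want" ]
  do
    h=${have%%.*}
    w=${want%%.*}
    [ "${h:-0}" -gt "${w:-0}" ] && return 0
    [ "${h:-0}" -lt "${w:-0}" ] && return 1
    # Drop the compared component. An exhausted string becomes empty.
    case $have in *.*) have=${have#*.} ;; *) have= ;; esac
    case $want in *.*) want=${want#*.} ;; *) want= ;; esac
  done
  return 0
}
```

For example, version_ge 1.4.6 1.4.2 succeeds, whereas version_ge 1.3.2 1.4.2
fails.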

A test installation with the resulting ISO succeeded and was able to install
further packages from all three merged DVD ISOs.
(I could still use proposals for further checking the fitness of the ISO.)


As before, the script is available at
https://dev.lovelyhq.com/libburnia/libisoburn/raw/branch/master/test/merge_debian_isos
with
https://dev.lovelyhq.com/libburnia/libisoburn/raw/branch/master/test/merge_debian_isos.sig

The changes can be seen at
https://dev.lovelyhq.com/libburnia/libisoburn/commit/34981b1278610d3f31da90f57fdf1378d6012074


Have a nice day :)

Thomas
Zhang Boyang
2022-06-18 16:30:01 UTC
Hello,

Thanks for making this program! I will definitely try it as soon as I
finished my current work!

Thank you again :)

Best Regards,
Zhang Boyang
Post by Thomas Schmitt
Hi,
although it was not the final solution of this bug report, i beefed up
my merger script for Debian ISOs so that it can combine an arbitrary
number of ISOs (within the limits of /dev/loop* and mount(8)).
Maybe it can serve as answer for the next time this wish comes up.
Thomas Schmitt
2022-06-18 19:00:01 UTC
Hi,
Post by Zhang Boyang
I will definitely try it
Meanwhile i got some insight into the riddle about diffs between merged.iso
and CUSTOM-1.iso like

Only in /groundtruth/firmware: arm-trusted-firmware-tools_2.4+dfsg-2_amd64.deb
Only in /groundtruth/firmware: gnome-firmware_3.36.0-1_amd64.deb

Indeed arm-trusted-firmware-tools_2.4+dfsg-2_amd64.deb does not exist in
the /firmware directory of DLBD-1 and thus did not get into merged.iso.

But /firmware/gnome-firmware_3.36.0-1_amd64.deb is in DLBD-1 as symbolic
link and its target is in DLBD-1, too.

Please verify that it is really not in merged.iso.
If so, then please record the messages of the next experiment with the
merge script and send them to me in private.

I now uploaded a new version of the script which merges the /firmware
directories. Just a guess, until i know what's up with /firmware.

--------------------------------------------------------------------

To those who are familiar with debian-cd, especially Steve McIntyre:

/firmware is not mentioned in
https://wiki.debian.org/DebianRepository/Format
So i guess that it is specific to debian-cd or the installer along
https://wiki.debian.org/Firmware#Firmware_during_the_installation
and that i should merge the /firmware directories.
Correct ?

I am a bit confused by the fact that debian-11.2.0-amd64-DVD-2.iso has
no firmware directory at all. How come ?
Can this happen to a first ISO, too ?
(DVD-1 has a /firmware tree with only inhabitant dep11/README.txt.)


Have a nice day :)

Thomas
Thomas Schmitt
2022-06-19 09:30:12 UTC
Hi,

i tested merging of /firmware directories with barely sufficiently
complete
debian-11.0.0-amd64-DLBD-[12].iso.tmp
from aborted jigdo-lite runs.

All files which are reported as being only in CUSTOM(all-in-one) by
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1011343#115
are listed by xorriso as present in the emerging ISO, of which i
suppressed the actual production out of storage space reasons.

So if the next diff between self-made CUSTOM(all-in-one) and merged.iso
shows again missing files in /firmware, i have to ask for full listings
of the /firmware trees in DLBD-1, DLBD-2, CUSTOM(all-in-one), and
merged.iso .

-------------------------------------------------------------------
About my shortcuts with download and ISO non-production:

I waited with aborting jigdo-lite just long enough until the first
downloaded packages caused the inflation of the .template file
to .iso.tmp. The directory tree and the management files are then
already present, whereas the packages' content is mostly waiting for
being filled in.

Further i modified the xorriso run to end by
-find /firmware -- -rollback_end
so that no ISO emerges but only the content of the planned /firmware
directory is shown before the program ends.

(I would go further. But the production of CUSTOM(all-in-one) via
debian-cd is out of my reach, given the section "ABOUT MIRROR" in message
#115. So the effort to complete the download and to produce a merged.iso
would bring no benefit for now.)


Have a nice day :)

Thomas
Zhang Boyang
2022-06-20 08:10:01 UTC
Hi,
Post by Thomas Schmitt
Hi,
i tested merging of /firmware directories with barely sufficiently
complete
debian-11.0.0-amd64-DLBD-[12].iso.tmp
from aborted jigdo-lite runs.
All files which are reported as being only in CUSTOM(all-in-one) by
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1011343#115
are listed by xorriso as present in the emerging ISO, of which i
suppressed the actual production out of storage space reasons.
So if the next diff between self-made CUSTOM(all-in-one) and merged.iso
shows again missing files in /firmware, i have to ask for full listings
of the /firmware trees in DLBD-1, DLBD-2, CUSTOM(all-in-one), and
merged.iso .
I tested my self-baked ISOs again. To make things clear, I'd like to
explain my testing process in detail.

1) Create a private mirror. (= a local latest debian mirror)
2) Create my own version of my-DLBD1.iso and my-DLBD2.iso from my
private mirror.
3) Create my own version of my-CUSTOM.iso (all-in-one) from my private
mirror, as ground-truth.
4) Merge my-DLBD1.iso and my-DLBD2.iso, as merged.iso, using merge script.
5) Compare merged.iso and my-CUSTOM.iso.

Note that I'm not using official-11.0.0-DLBD1.iso and
official-11.0.0-DLBD2.iso to create merged.iso. The reason behind it is:

1) There is no official-11.0.0-CUSTOM.iso (all-in-one) as ground-truth
for comparing if merge(official-11.0.0-DLBD1.iso,
official-11.0.0-DLBD2.iso) == official-11.0.0-CUSTOM.iso
2) There will be a lot of differences when comparing
merge(official-11.0.0-DLBD1.iso, official-11.0.0-DLBD2.iso) ==
my-CUSTOM.iso, because my-CUSTOM.iso is created from my private mirror,
which contains updates from 11.0.0 to current time.
3) So, to minimize differences, I chose to create DLBD1.iso and DLBD2.iso
myself from my private mirror, then compare merge(my-DLBD1.iso,
my-DLBD2.iso) == my-CUSTOM.iso

Unfortunately I don't know how the official DLBDs are created, so I tried
my best to adjust CONF.sh in the `debian-cd' package, in order to minimize
the structural differences between my-DLBD1.iso and official-DLBD1.iso.

Then, back to the test results. This time the difference in /firmware is:

Only in /groundtruth/firmware:
arm-trusted-firmware-tools_2.4+dfsg-2_amd64.deb

I don't think this is the merge script's fault. This .deb does not exist
in either my-DLBD1.iso or my-DLBD2.iso. I think the reason might be a
misconfiguration in my CONF.sh or some bug in `debian-cd'.

# diff <((ls /dlbd1/firmware/; ls /dlbd2/firmware/)|sort|uniq) \
    <(ls /groundtruth/firmware|sort)
1a2
> arm-trusted-firmware-tools_2.4+dfsg-2_amd64.deb
There are other differences in the filesystem tree. Some of them seem
harmless, e.g. Disc-title differs. If you are interested in them, please
refer to the attachments. MD5 are stripped as usual due to size reasons.


Best Regards,
Zhang Boyang
Thomas Schmitt
2022-06-20 20:30:01 UTC
Hi,
2) Create my own version of my-DLBD1.iso and my-DLBD2.iso from my private
mirror.
This explains why a firmware package was missing which is in the official
DLBD-1 but obviously on your my-DLBD2.
arm-trusted-firmware-tools_2.4+dfsg-2_amd64.deb
I don't think this is the merge script's fault. This .deb does not exist in
either my-DLBD1.iso or my-DLBD2.iso.
Then it's fully ok for me. :))
There are other differences in filesystem-tree.
[attachment diff.details.txt]
.disk/mkisofs
This is just an unchanged copy of the file in your DLBD-1.
(As stated previously i see little reason to overwrite it with the xorriso
arguments which were used for merging.)
boot/grub/efi.img
Could be about timestamps in the FAT filesystem.

One could try whether exporting SOURCE_DATE_EPOCH=...seconds.since.1970...
with the same value for DLBD and CUSTOM would create the same efi.img
files.
( https://reproducible-builds.org/docs/source-date-epoch/ )
dists/bullseye/Release
This is a very central file. It would be interesting to see the full diff.
dists/bullseye/contrib/binary-amd64/Packages.gz
Could be the sequence of packages in this unsorted list of multi-line
records.
Critical are the last package record of DLBD-1 and the first of DLBD-2.
In the merged ISO's Packages.gz they have to be listed with a neat single
empty line between them. (In my tests this was the case.)
dists/bullseye/contrib/i18n/Translation-cs.gz
At least in the DVD set of 11.2.0 the i18n files which are mentioned in
Release are incomplete counterparts of the respective Packages.gz files
in various languages. Many packages are not listed in them.
dists/stable
isolinux/boot.cat
This file contains the block addresses of isolinux/isolinux.bin and of
boot/grub/efi.img. These addresses are subject to many influences during
ISO production. Somewhat random.
isolinux/f1.txt
Seems to be a production timestamp of d-i.
It would be interesting to see whether it obeys SOURCE_DATE_EPOCH.
isolinux/isolinux.bin
The -boot-info-table patching at the begin of ISO production writes the
block address of isolinux/isolinux.bin into isolinux/isolinux.bin.
md5sum.txt
Ouch. My script sorts the merged lines by the MD5 fields rather than by
the file paths.
Further this sorting is subject to locale settings, which is hardly
desirable, if the sequence of lines has a meaning at all.

I committed a new version which fixes both problems in my local tests:
https://dev.lovelyhq.com/libburnia/libisoburn/commit/87aab730582cb4268f50062361fec7f13e2b50ab
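
The two problems named above (sorting by the MD5 field, and dependence on
locale settings) can be illustrated by a small sketch. It is not necessarily
what the commit does; the function name sort_md5_list_by_path is
hypothetical. The key option -k 2 starts the comparison at the path field,
and LC_ALL=C enforces plain byte order regardless of the user's locale.

```sh
#!/bin/sh
# Sketch only: sort md5sum.txt style lines "<md5>  ./path" by their path.
# Usage: sort_md5_list_by_path file1 [file2 ...]
sort_md5_list_by_path() {
  LC_ALL=C sort -k 2 "$@"
}
```

In byte order, capital letters come before small letters, so ./B/file gets
sorted before ./a/file on every system.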


Have a nice day :)

Thomas
Zhang Boyang
2022-06-23 06:50:01 UTC
Hi
Post by Thomas Schmitt
Hi,
Thanks for the detailed explanation! I think we can ignore minor
differences, since creating a bit-for-bit reproducible image seems too
hard for us.
Post by Thomas Schmitt
Post by Thomas Schmitt
md5sum.txt
Ouch. My script sorts the merged lines by the MD5 fields rather than by
the file paths.
Further this sorting is subject to locale settings, which is hardly
desirable, if the sequence of lines has a meaning at all.
I think maybe we should just create the md5sum from scratch? (Or we can
just reuse the md5sums of the .deb files only.) Because some files have
definitely changed (like READMEs or files under /dists/...). A bad
md5sum.txt will cause cdrom-checker in d-i to fail. (Fortunately it's not a
standard step, but the user can invoke it under `Advanced options - Expert
install'.)

Link:
https://salsa.debian.org/installer-team/cdrom-checker/-/blob/master/main.c


Thank you again :)

Best Regards,
Zhang Boyang
Thomas Schmitt
2022-06-23 07:30:02 UTC
Hi,
Post by Zhang Boyang
Post by Thomas Schmitt
md5sum.txt
I think maybe we should just create the md5sum from scratch?
Well, not from scratch, because the paths would come from the merged
md5sum.txt.

But indeed the merged file needs polishing:
- Some files are listed multiple times because they appear in the
md5sum.txt files of multiple input ISOs.
- The files in ./dists are probably newly generated.

The bulk of packages and support files is not supposed to have changed.

So i will have to promote the md5sum program to hard dependency and
make a filtering run over the merged and sorted md5sum.txt file to
care for above cases.

I will give a note when this is ready for testing.


Have a nice day :)

Thomas
Thomas Schmitt
2022-06-23 17:00:01 UTC
Hi,

i explored two ways to get a correct and complete md5sum.txt after
merging and sorting:

- Generating the whole md5sum.txt from the emerging ISO's file tree is
possible by help of xorriso, although there is no file tree yet where
all regular files of the ISO can be found for checksumming.
But this would work only for MD5 and not for SHA256, because xorriso
has no API function to compute SHA256. (libjte has SHA256 but not as
API. There are many xorrisos out there which are not linked to libjte.)
Run time would be mediocre: 36 seconds for DVD-1+2+3 on a 500MB/s SSD.
(find+md5sum on mounted result ISO needs 27 seconds. But as said there
is no result ISO to mount when the script makes md5sum.txt.)

- Removing duplicates from md5sum.txt and identifying those files which
possibly have a changed MD5 after the merging activities is more error
prone but significantly faster.
Especially it can be easily modified when Debian decides to retire
md5sum.txt in favor of a sha256sum.txt.
Run time is still annoying: 8 seconds with dash, 12 seconds with bash.
As it is now it works with echo | grep.
I could reduce it to less than a second by using the bashism
${Var:Offset:Count} to obtain a substring of the file paths.
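
As a side note to the second approach: a test for a path prefix can also be
written without a bashism and without starting echo and grep for each line,
by means of parameter expansion and a case statement. This is only a sketch
of that alternative. The function name needs_new_md5 is hypothetical and the
list of suspect paths in the pattern is an assumption for illustration, not
the list which the script uses.

```sh
#!/bin/sh
# Sketch only: decide by pattern matching whether an md5sum.txt line
# refers to a file which might have changed during merging.
# Usage: needs_new_md5 "<md5>  ./path"
needs_new_md5() {
  # Remove the checksum and the two separator blanks.
  path=${1#*  }
  case $path in
    ./dists/*|./README.*|./md5sum.txt) return 0 ;;
    *) return 1 ;;
  esac
}
```

Both constructs are part of POSIX sh, so this would run on dash as well as
on bash. Whether it comes close to the speed of the substring expansion
would have to be measured.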

Run time for an All-in-one ISO is estimated about 6 to 7 times the time
of DVD-1+2+3.
So i expect ~230 seconds for full MD5 regeneration, ~ 50 seconds for
a loop that runs on dash, and ~6 seconds with a bashism.

For now i decided to take the 50 seconds with dash.

The merged md5sum.txt is not 100% complete. Files in ./firmware which
appear in more than one input ISO will not be listed, because it is not
100% clear from which ISO the one stems which survives the competition.
It appears that the check in
https://salsa.debian.org/installer-team/cdrom-checker/-/raw/master/main.c
does not insist on a complete list. It only demands that all listed files
exist and yield the listed MD5 when being checksummed.

I tested the correctness of the merged md5sum.txt of DVD-1+2+3 by mounting
the result as /mnt/iso and running what i deem equivalent to the MD5 check
in installer-team/cdrom-checker :

(cd /mnt/iso
 cat md5sum.txt | while read line
 do
   if echo -n "$line" | md5sum -c 1>/dev/null 2>&1
   then
     dummy=dummy
   else
     echo "BAD: $line"
   fi
 done ) 2>&1 | wc

Result from wc was "0 0 0" (after 47 seconds).

Regrettably i cannot check this with my dummy DLBD-1+2 ISO, because its
data files nearly all have fake content.

Committed changes:
https://dev.lovelyhq.com/libburnia/libisoburn/commit/0bc397c02c0ea7c960b59ce92daa267bed23fc07


Have a nice day :)

Thomas
Zhang Boyang
2022-06-25 14:30:01 UTC
Hi,

Some good news, I tested:

1) Build my unofficial DVD set (17 DVDs), and merge them using the merge
script on my Debian machine.

2) Build my unofficial DLBD set (2 DLBDs), and merge them using the
merge script on my Debian machine.

3) Merge 2 DLBDs using the merge script under an Alpine Linux environment
(everything is almost busybox).

4) Run d-i cdrom-check on merged.iso from 1), 2), and 3).

5) Do fresh install into virtual machine, then install random packages
selected from {SOME-ISO-NAME}.list.gz .

These experiments all succeeded. Thank you very much! Good Job! :)
Post by Thomas Schmitt
Run time for an All-in-one ISO is estimated about 6 to 7 times the time
of DVD-1+2+3.
So i expect ~230 seconds for full MD5 regeneration, ~ 50 seconds for
a loop that runs on dash, and ~6 seconds with a bashism.
For now i decided to take the 50 seconds with dash.
I think runtime is not an issue; 50 seconds is totally acceptable. But if
you really want to reduce runtime, I would suggest using `sort -s -u -k 2
merged_md5sum.txt' instead of processing each line by hand. With a
stable sort (`-s') and unique (`-u'), only the first record of each set of
duplicates will be output. So as long as the md5sum.txt of Disc1 comes
first, its records will definitely be in the final result. I saw there is
some other logic to process md5 records from different groups of files, so
we can use `grep' and `grep -v' to split them, process them separately, and
finally merge them. Unfortunately the option `-s' of `sort' is not standard
(although widely supported), and BusyBox has a bug with it, so one must use
`sort -s -k 2 | uniq -f 1' as a workaround (Link:
https://bugs.busybox.net/show_bug.cgi?id=14871).
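
A minimal sketch of both forms follows, on made-up data. The checksums, the function names `dedup_sort_u` and `dedup_uniq_f`, and the helper `sample` are purely illustrative. It assumes a sort that supports `-s`, and it pins the locale to C so that the ordering is predictable.

```shell
#!/bin/sh
# Keep only the first record for each path (field 2 to end of line).
# Records from Disc1 must come first in the input.
dedup_sort_u() {
  LC_ALL=C sort -s -u -k 2
}

# Workaround form for sort implementations where -s and -u do not combine:
# stable sort on the path, then drop repeats while ignoring field 1.
dedup_uniq_f() {
  LC_ALL=C sort -s -k 2 | LC_ALL=C uniq -f 1
}

# Made-up sample: Disc1 lines first, then Disc2 lines.
sample() {
  cat <<'EOF'
11111111111111111111111111111111  ./README.txt
22222222222222222222222222222222  ./pool/main/a/a.deb
99999999999999999999999999999999  ./README.txt
22222222222222222222222222222222  ./pool/main/a/a.deb
33333333333333333333333333333333  ./pool/main/b/b.deb
EOF
}

sample | dedup_sort_u
sample | dedup_uniq_f
```

In both outputs the ./README.txt record should be the one with the 1111... checksum, i.e. the one that came first in the input.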


Thank you again :)

Best Regards,
Zhang Boyang
Thomas Schmitt
2022-06-25 19:40:01 UTC
Permalink
Hi,
Post by Zhang Boyang
These experiments all succeeded. Thank you very much! Good Job! :)
Thank you for testing and challenging.
Post by Zhang Boyang
Post by Thomas Schmitt
For now i decided to take the 50 seconds with dash.
if you really want to reduce runtime I would suggest using
`sort -s -u -k 2 merged_md5sum.txt' instead of processing each line
by hand.
The task is to identify those which need newly computed MD5 because they
might have changed. Mostly i know which directories are suspects, because
they are on hard disk and get mapped back into the emerging ISO. Their
MD5s get recomputed from the files on hard disk.
Some other paths in the md5sum.txt may appear multiple times. In this case
it is clear that the data of the file in the emerging ISO stem from iso1.
But it is not clear which of the multiple lines in md5sum.txt stems from
iso1. So the MD5 has to be recomputed from the file in mounted iso1.
Post by Zhang Boyang
I saw there are some other logic to process md5 records from
different group of files, so we can use `grep' and `grep -v' to split them,
process them separately, then merge them finally.
That's a great idea.
The majority of files is in ./pool and surely needs no recomputing, even
if listed multiple times (due to overlapping ISO pools).

This here

( fgrep ' ./pool/' <merged_md5sum.txt | uniq
fgrep -v ' ./pool/' <merged_md5sum.txt | polish_md5sum_txt ) \
| sort -k 2 >temp_file

needs 1.9 seconds instead of 7.2 seconds with the old

polish_md5sum_txt >temp_file

Times were measured by date '+%s.%N' around the polishing commands.
polish_md5sum_txt and its subordinate were slightly modified for the new
method to read from stdin and to not expect any ./pool file.
The latter brought 0.9 seconds.
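
The measuring method can be sketched roughly as follows. `%N` (nanoseconds) is a GNU date extension, so GNU date is assumed. The function name `time_cmd` is made up, and `sleep 1` merely stands in for the polishing commands.

```shell
#!/bin/sh
# Sketch: wall-clock timing of a command with date '+%s.%N'.
# Usage: time_cmd COMMAND [ARGS...]   (prints elapsed seconds on stdout)
time_cmd() {
  start=$(date '+%s.%N')
  "$@" >/dev/null 2>&1
  end=$(date '+%s.%N')
  # awk does the floating point subtraction; millisecond resolution is
  # about what double precision leaves at current epoch values.
  LC_ALL=C awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}

time_cmd sleep 1
```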

The number of lines in md5sum.txt is then the same as with the old method.
My test loop with md5sum -c on the mounted result ISO reports no
mismatches. (It is annoying that gzip inserts a time stamp, so that the
Packages.gz files differ although they bear the same uncompressed
content. So the md5sum.txt file shows differences, too, from run to run.)
Post by Zhang Boyang
Unfortunately the option `-s' of `sort' is not standard
I understand that it is needed to keep sort -k 2 from distinguishing
lines with differences outside of -k 2 so that sort -u could throw out
surplus lines with duplicate paths.

But with above code sort -u is not needed.
The pool lines have to be identical even if duplicate paths appear at
all. (I only know of one old debian package which existed with different
content but same name, long ago.) So uniq can do its job.
The other lines are made unique by the shell function polish_md5sum_txt.
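
As an illustration only, the split can be tried out with a stand-in for polish_md5sum_txt. The stand-in `polish_stub` below is NOT the real function: it just keeps the first line per path and recomputes nothing. The sketch also assumes that duplicate ./pool lines are adjacent in the merged file (otherwise uniq would not catch them) and that paths contain no blanks. `grep -F` is the POSIX spelling of fgrep.

```shell
#!/bin/sh
# Stand-in for polish_md5sum_txt (hypothetical, for demonstration only):
# keeps the first line seen for each path, computes no MD5.
polish_stub() {
  awk '!seen[$2]++'
}

# Usage: split_and_merge MERGED_FILE
# Pool lines only need uniq; all other lines go through the polisher.
split_and_merge() {
  ( grep -F ' ./pool/' <"$1" | uniq
    grep -F -v ' ./pool/' <"$1" | polish_stub ) | LC_ALL=C sort -k 2
}

# Demo with made-up checksums.
work=$(mktemp -d)
cat >"$work/merged_md5sum.txt" <<'EOF'
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa  ./README.txt
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb  ./README.txt
cccccccccccccccccccccccccccccccc  ./pool/main/a/a.deb
cccccccccccccccccccccccccccccccc  ./pool/main/a/a.deb
dddddddddddddddddddddddddddddddd  ./pool/main/b/b.deb
EOF
split_and_merge "$work/merged_md5sum.txt"
rm -rf "$work"
```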

Complexity-wise this replaces a slow O(n) algorithm by a faster O(n) and
an additional O(n * log(n)) run. At some size of Debian the slow speed
of the linear loop will be compensated by the sorting complexity.
But there is still room: A sort of 11,000 lines lasts about 0.03 seconds.

I will probably commit this change tomorrow. Now it needs cleaning and
handling of the new dependency uniq.


Have a nice day :)

Thomas
Zhang Boyang
2022-06-26 06:20:01 UTC
Permalink
Hi,
Post by Thomas Schmitt
Complexity-wise this replaces a slow O(n) algorithm by a faster O(n) and
an additional O(n * log(n)) run. At some size of Debian the slow speed
of the linear loop will be compensated by the sorting complexity.
But there is still room: A sort of 11,000 lines lasts about 0.03 seconds.
Theoretically if both file is already sorted, we can use the `-m' option
(e.g. `sort -m -k 2 A.txt B.txt') to merge them in O(n) like mergesort.
However I don't think O(n * log(n)) is a bottleneck so we may just keep
it simple and stupid.
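
For completeness, a small sketch of the `-m' idea on made-up data. Both input files must already be sorted with the same key and locale. The file names A.txt and B.txt are the hypothetical ones from above, and `merge_sorted` is a name invented for this example.

```shell
#!/bin/sh
# Merge two files that are each already sorted on the path (field 2 onward).
# Usage: merge_sorted FILE_A FILE_B
merge_sorted() {
  LC_ALL=C sort -m -k 2 "$1" "$2"
}

# Demo with made-up checksums in a throwaway directory.
work=$(mktemp -d)
cat >"$work/A.txt" <<'EOF'
11111111111111111111111111111111  ./pool/main/a/a.deb
33333333333333333333333333333333  ./pool/main/c/c.deb
EOF
cat >"$work/B.txt" <<'EOF'
00000000000000000000000000000000  ./README.txt
22222222222222222222222222222222  ./pool/main/b/b.deb
EOF

merge_sorted "$work/A.txt" "$work/B.txt"
# Sanity check: sort -c exits non-zero if the result is not sorted on the key.
merge_sorted "$work/A.txt" "$work/B.txt" | LC_ALL=C sort -c -k 2
rm -rf "$work"
```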


Best Regards,
Zhang Boyang
Thomas Schmitt
2022-06-26 09:10:01 UTC
Permalink
Hi,
Post by Zhang Boyang
Theoretically if both file is already sorted, we can use the `-m' option
I like this idea. Just in case Debian grows to a million packages.
But i understand that i would need two separate files for sorting.

fgrep ' ./pool/' <merged_md5sum.txt | uniq >file1
fgrep -v ' ./pool/' <merged_md5sum.txt | polish_md5sum_txt >file2
sort -k 2 -m file1 file2 >temp_file

More temporary files means more need for pre-existence tests and more
cleanup effort.
Currently i just have to concatenate two stdout streams.
Post by Zhang Boyang
However I don't think O(n * log(n)) is a bottleneck so we may just keep it
simple and stupid.
Seems to be the best decision for now.
(Unless some shell wizard shows a way to pipe both streams separately
into sort -m while staying dash compatible and without the need for new
persistent file objects which have to be cleaned up afterwards.)

Actually we did not yet identify a use case where md5sum.txt needs to
be sorted at all. I only sort it because the input files are sorted.


New version:
https://dev.lovelyhq.com/libburnia/libisoburn/commit/0e8227e76ae4c4f24097cfac2f415ef8e25ae4e7


Have a nice day :)

Thomas
Zhang Boyang
2022-06-27 09:30:02 UTC
Permalink
Hi,
Post by Thomas Schmitt
https://dev.lovelyhq.com/libburnia/libisoburn/commit/0e8227e76ae4c4f24097cfac2f415ef8e25ae4e7
Tested with:

1) Merge 17 DVDs, on Debian and Alpine Linux environment.

2) Merge 2 DLBDs, on Debian and Alpine Linux environment.

3) Run d-i cdrom-checker, on 4 merged ISOs from 1) and 2).

4) Test install into virtual machine, using 4 merged ISOs.

5) Install some random packages.

All OK! Thank you again :)

Best Regards,
Zhang Boyang
Thomas Schmitt
2022-06-27 12:00:01 UTC
Permalink
Hi,
Post by Zhang Boyang
All OK! Thank you again :)
So as far as we two are concerned, it seems we have a candidate for
release.

Currently it only goes into the GNU xorriso tarball, which is not src of
any Debian package. I could put it into the libisoburn tarball. But as
it is specific to debian-cd ISOs (not even to debian-live ones) i think
it should go into debian-cd (with a copy in GNU xorriso, which already
has libjte as Debian specialty).

Nevertheless, most important would be to offer it as plain text download
for those who have a shell and downloaded ISOs. If it were only available
as Debian package then it would not help those who still have to become
Debian users.
A cool place would be the download pages
https://cdimage.debian.org/debian-cd/current/{amd64,...}/jigdo-{dvd,bd,dlbd}
and mentioning in their *SUMS files.


Have a nice day :)

Thomas
Thomas Schmitt
2022-07-15 17:40:01 UTC
Permalink
Hi,

i added to merge_debian_isos the capability to burn the resulting
ISO 9660 filesystem directly to optical media and to write it directly
to storage device files. This removes the need for having substantial
free disk space beyond the capacity to store the original ISO images.

While burning to optical media is quite a harmless operation for the
normal user, writing to USB stick devices is not. Therefore
merge_debian_isos can cooperate with xorriso-dd-target, my offer for
safe copying of image files onto USB sticks on GNU/Linux.

I created
https://wiki.debian.org/MergeDebianIsos
to describe installation, feeding, and usage of merge_debian_isos.
Review and testing are welcome.

(The same applies to the older
https://wiki.debian.org/XorrisoDdTarget
)


Have a nice day :)

Thomas

Zhang Boyang
2022-05-28 20:30:01 UTC
Permalink
Hello,

Finally I set up my own mirror and I'm able to build my unofficial discs
with debian-cd. Thanks Andy Cater and Linux-Fan for instructions!

================== ABOUT MIRROR ======================================

Currently I'm using the recommended ftpsync.tar.gz scripts to set up my
mirror. However, although I was able to use ARCH_INCLUDE=amd64 to mirror
amd64 binaries only, I found no official/supported way to
include only one release (e.g. a bullseye-only mirror; please correct me
if I'm wrong). Thus the mirror's disk usage is too high, because it
contains other older releases and the testing release. It now occupies
456GB of disk space. In contrast, the ISOs for a single arch should
occupy about 71GB. The wiki page says 'debmirror' may fit my
needs. I will try it later.
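
A sketch of what such a single-release debmirror call might look like is given below. It was not tested in this thread. The host, the section list, and the target directory are assumptions, and the function `single_release_mirror` is made up. It only prints the command line, so nothing is downloaded. debmirror normally verifies Release signatures, so a keyring setup is probably needed before a real run, and debian-cd may need further pieces of the archive (for example installer files) that this call does not request.

```shell
#!/bin/sh
# Sketch: print a debmirror command line for one release and one architecture.
# Usage: single_release_mirror DIST ARCH TARGET_DIR
# Remove the leading 'echo' to actually run debmirror.
single_release_mirror() {
  dist=$1 arch=$2 target=$3
  echo debmirror --method=http --host=deb.debian.org --root=debian \
       --dist="$dist" --arch="$arch" --section=main,contrib,non-free \
       --nosource --progress "$target"
}

single_release_mirror bullseye amd64 /srv/mirror/debian
```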

================== ABOUT BUILDING DISCs ==============================

I tried a DLBD (50G) build and a CUSTOM (all-in-one) build. The all-in-one
build time is slightly shorter than the DLBD build time. The test machine is
a virtual machine with 2 cores and 3GB memory (host is 4 cores / 8GB / SSD).

=== TIME RESULT FOR DLBD BUILD ===
real 32m18.505s
user 21m2.810s
sys 14m40.667s

=== TIME RESULT FOR ALL-IN-ONE BUILD ===
real 26m20.481s
user 19m29.255s
sys 10m14.450s

I used 'apt install debian-cd' to install, then I used the installed
scripts in /usr/share/debian-cd directly.

=== Changed configurations in 'CONF.sh' are ===

export MIRROR=/path/to/my/debian/mirror
export VARIANTS=xen
export CHECKSUMS="sha512 sha256"
export DISKTYPE=DLBD
or
export DISKTYPE=CUSTOM
export CUSTOMSIZE=4000000000

I tried my best to simulate an official build. I wasn't able to find the
configuration values for official build. Please correct me if I missed
something.

=== The build command is ===
***@mirror:/usr/share/debian-cd# time sh -c '. ./CONF.sh && export
COMPLETE=1 && make distclean && make status && make official_images &&
make imagesums && echo ALLOK'

=== About the time result ===
I think the all-in-one build is faster because it doesn't need
the "try to add one package, roll back if it doesn't fit" procedure. However,
because it produces only one ISO file, the build process can't be
parallelized. Thus with more CPU cores the overall build times may differ,
and the comparison may come out differently.


================== ABOUT MERGING DISCs ==============================

Then I tried Thomas Schmitt's merger script. Thanks Thomas Schmitt for
this script!

=== I ran this command to merge two DLBD isos ===
mkdir /cd1
mkdir /cd2
./merge_2_debian_isos debian-11.0.0-amd64-DLBD-1.iso /cd1 \
    debian-11.0.0-amd64-DLBD-2.iso /cd2 merged.iso

=== Then use custom all-in-one iso as ground truth to compare them ===

mount -o loop,ro merged.iso /mnt
mount -o loop,ro debian-11.0.0-amd64-CUSTOM-1.iso /groundtruth
diff -q -r /mnt /groundtruth > diff.txt 2>&1
diff -r /mnt /groundtruth > diff.details.txt 2>&1

=== The result is like this ===
Files /mnt/.disk/cd_type and /groundtruth/.disk/cd_type differ
Files /mnt/.disk/info and /groundtruth/.disk/info differ
Files /mnt/.disk/mkisofs and /groundtruth/.disk/mkisofs differ
Files /mnt/README.html and /groundtruth/README.html differ
Files /mnt/README.txt and /groundtruth/README.txt differ
Files /mnt/boot/grub/efi.img and /groundtruth/boot/grub/efi.img differ
diff: /mnt/debian: recursive directory loop
Files /mnt/dists/bullseye/Release and
/groundtruth/dists/bullseye/Release differ
Files /mnt/dists/bullseye/contrib/binary-amd64/Packages.gz and
/groundtruth/dists/bullseye/contrib/binary-amd64/Packages.gz differ
Files /mnt/dists/bullseye/contrib/i18n/Translation-cs.gz and
[balabala]
Files /mnt/dists/bullseye/main/binary-amd64/Packages.gz and
/groundtruth/dists/bullseye/main/binary-amd64/Packages.gz differ
Files /mnt/dists/bullseye/main/debian-installer/binary-amd64/Packages.gz
and
/groundtruth/dists/bullseye/main/debian-installer/binary-amd64/Packages.gz
differ
Files /mnt/dists/bullseye/main/i18n/Translation-ca.gz and
/groundtruth/dists/bullseye/main/i18n/Translation-ca.gz differ
[balabala]
Files /mnt/dists/stable/Release and /groundtruth/dists/stable/Release differ
Files /mnt/dists/stable/contrib/binary-amd64/Packages.gz and
/groundtruth/dists/stable/contrib/binary-amd64/Packages.gz differ
Files /mnt/dists/stable/contrib/i18n/Translation-cs.gz and
/groundtruth/dists/stable/contrib/i18n/Translation-cs.gz differ
[balabala]
Files /mnt/dists/stable/main/binary-amd64/Packages.gz and
/groundtruth/dists/stable/main/binary-amd64/Packages.gz differ
Files /mnt/dists/stable/main/debian-installer/binary-amd64/Packages.gz
and
/groundtruth/dists/stable/main/debian-installer/binary-amd64/Packages.gz
differ
Files /mnt/dists/stable/main/i18n/Translation-ca.gz and
/groundtruth/dists/stable/main/i18n/Translation-ca.gz differ
[balabala]
Only in /groundtruth/firmware:
arm-trusted-firmware-tools_2.4+dfsg-2_amd64.deb
Only in /groundtruth/firmware: firm-phoenix-ware_4.7.5+repack-1_all.deb
Only in /groundtruth/firmware:
firmware-microbit-micropython-dl_1.2.4+dfsg-8_all.deb
Only in /groundtruth/firmware:
firmware-microbit-micropython-doc_1.0.1-2_all.deb
Only in /groundtruth/firmware: firmware-microbit-micropython_1.0.1-2_all.deb
Only in /groundtruth/firmware: firmware-tomu_2.0~rc7-2_all.deb
Only in /groundtruth/firmware: gnome-firmware_3.36.0-1_amd64.deb
Files /mnt/isolinux/boot.cat and /groundtruth/isolinux/boot.cat differ
Files /mnt/isolinux/f1.txt and /groundtruth/isolinux/f1.txt differ
Files /mnt/isolinux/isolinux.bin and /groundtruth/isolinux/isolinux.bin
differ
Files /mnt/md5sum.txt and /groundtruth/md5sum.txt differ

Most differences come from the READMEs and the dists/ directory. (I haven't
tried the advanced features of the merger script yet; I will try them later.)

Further details are in the attached file. I will try to analyze it.
(For size reasons, lines containing an MD5 were filtered out by "sed -i -E -e
'/[a-f0-9]{32,32}/d' diff.details.txt".)
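
To show what that filter does, here is a tiny sketch on two sample lines. The first sample line and the function name `drop_md5_lines` are made up. Note that any line containing 32 or more consecutive lowercase hex characters is dropped, so lines with longer digests would be removed as well.

```shell
#!/bin/sh
# Drop every line that contains a run of 32 lowercase hex characters.
# Reads stdin, writes stdout (unlike the in-place 'sed -i' above).
drop_md5_lines() {
  sed -E -e '/[a-f0-9]{32}/d'
}

printf '%s\n' \
  '< d41d8cd98f00b204e9800998ecf8427e  ./pool/main/a/a.deb' \
  'Only in /groundtruth/firmware: firmware-tomu_2.0~rc7-2_all.deb' |
  drop_md5_lines
```

Only the "Only in ..." line is left in the output.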

Best Regards,
Zhang Boyang
Zhang Boyang
2022-05-21 15:10:01 UTC
Permalink
Hello Andy,
Post by Andrew M.A. Cater
If the original poster wants one huge .iso as one file to download from
cdimage.debian.org - then 2 x double layer Blu-Ray (say) as one file
would be 100GB or so.

Original poster here. IMO the ISO needn't be directly downloadable;
officially signed jigdo files and checksums are sufficient, just like
the Blu-Ray variants. :-)


Best Regards,
Zhang Boyang
Zhang Boyang
2022-05-21 12:10:01 UTC
Permalink
Hi,
Post by Andrew M.A. Cater
Hi,
Indeed, I admit super-big-iso is a crazy idea, and a local mirror is more
useful in most cases. I think there is a few special cases that a
super-big-iso might be more useful.
1) Computers / Virtual Machines isolated from public internet or have no
network at all. It is convenient to have such an ISO to install software on
demand. A single file is much more convenient than setting up a local
mirror. It's also easy to manage or verify integrity, if frequent updates
are not needed.
If you have a computer isolated from the internet / with no network connectivity
then you are essentially "set and forget" - because the only way to update this
is to hand carry packages in for security updates or whatever. For that, you
can use the DL-BD sized .iso - you'll need a computer that's connected to the
'Net to build it via jigdo / jigit - but you'd need a computer connected to
the internet to download the DVD or any other medium.
The double-layer Blu-Ray disk sized medium is 50GB or so - so you could write
that to a 64G USB flash disk. We - the debian images team that build and test
the images - don't routinely create all those full size images and put them in
the archive - because that would be terabytes with every point release.
They're there if you need them.
Yes, I found that two DLBD images are sufficient to contain the whole Debian
distribution for now. If my idea is not accepted, I will use those :-)
Post by Andrew M.A. Cater
Actually, setting up a local mirror is potentially almost as easy a use case
as using gigantic media files. That's exactly what many hosting companies
do in their data centres for their own use (and it's also in some of those
data centres where some of the Debian country level mirrors are located).
So a large isolated network may find it useful to have a local mirror
updated periodically.
I admit a local mirror is more suitable for large set of computers. But
for a small set of computers, for example, 1-5 computers, setting up a
local mirror might be too heavy.
Post by Andrew M.A. Cater
2) Archival purposes. If someone (in future, for example, in 2042) want to
install a very old debian system, he/she may grab the big ISO and all he/she
need is that single file. Although it's not easy to grab the file in far
future, but I guess there is always someone crazy enough to archive all
files, isn't it? :P
See, for example, snapshot.debian.org - which is growing. See also the
cdimage.debian.org archive directory where you can find most of the .iso
files for any release. Also, keeping large files around on disk for a long
time - there's some likelihood of data corruption. I'd hate a couple of
bit flips three quarters of the way through a 6TB file, say, to mean that
the whole thing is useless.
IMO snapshot.debian.org is a centralized platform; it might be lost if
something very bad happened. As for cdimage.debian.org, it's not sufficient
for archiving a full Debian distribution because most files are jigdos.

As for bit-flip corruption, I would recommend PAR2 (Parchive 2), which
uses Reed-Solomon codes to create a parity archive that can be
used to repair such corruption.
Post by Andrew M.A. Cater
I think setting up a new variant of image is not very costly for debian
since there are already many variants, so why not give people more choices
:-)
Best Regards,
Zhang Boyang
Post by Andy Simpkins
Post by Zhang Boyang
Package: debian-cd
Hello,
I suggest debian release a new variant of ISO images, the all-in-one images. These all-in-one image contains ALL debian packages in a single ISO image (possibly all source packages in another all-in-one ISO image). Of course there is no such optical media can hold such a big image, but it is useful for virtual-machines, remotely managed servers, and archival purposes. The theoretical size limit of an ISO9660 filesystem is about 8TB, which is sufficient for including all debian packages.
For the name of this variant, I suggest 'everything', 'allinone', 'world', 'virt'.
p.s. This is my personal interest, and I would appreciate if you can kindly consider my suggestion.
Best Regards,
Zhang Boyang
Sorry to put a dampener on your suggestion but why would you need that?
Why not just mirror the archive to a local disk instead?
Then you have your copy of everything and can just point a netinst at your local mirror so you can install from there.
I think that would deliver on every use case that you would be able to use your big ISO image and more....
Andy is absolutely right, I think.
If it helps, I'm the "other" Andy in the team along with Steve McIntyre -
and yes, I know the problems of copying large images around, have a local
mirror here and routinely build at least the single layer BD disk with
every point release.
This is a topic that comes up fairly frequently in our informal discussions
as various people have argued for various sizes of medium - someone was
asking for 128G a short while ago - practically, the impact on storage
sizes and the pain of testing each size means that we have a selection
of all possible requests.
Oh, I thought that "building a new variant is as easy as adding a
line to the build script", but it seems I was wrong. I apologize if I
offended you. If the Debian images team decides to refuse my suggestion, I
will respect the decision.
Post by Andrew M.A. Cater
It's an open question as to whether we will ever stop making media in
physical medium sizes - there's no obvious reason why an iso file needs
to fit on a DVD, for example - and then someone turns up who is still using
single layer DVDs on a regular basis. The number of people buying
burnt physical media is smaller and smaller all the time, but people still
request this from Steve and others.
With every good wish, as ever,
Andy Cater
Best Regards,
Zhang Boyang
s***@caiway.net
2022-05-23 12:10:02 UTC
Permalink
On Sat, 21 May 2022 13:33:02 +0800
Post by Zhang Boyang
Hi,
Indeed, I admit super-big-iso is a crazy idea, and a local mirror is
more useful in most cases. I think there is a few special cases that a
super-big-iso might be more useful.
1) Computers / Virtual Machines isolated from public internet or have no
network at all. It is convenient to have such an ISO to install software
on demand. A single file is much more convenient than setting up a local
mirror. It's also easy to manage or verify integrity, if frequent
updates are not needed.
2) Archival purposes. If someone (in future, for example, in 2042) want
to install a very old debian system, he/she may grab the big ISO and all
he/she need is that single file. Although it's not easy to grab the file
in far future, but I guess there is always someone crazy enough to
archive all files, isn't it? :P
Hi Zhang!

A very good idea!

I have local repositories mirrored with debmirror, all versions.
Suddenly I found my archives of the old distributions were (almost) empty.

bo is lost
potato is lost
sarge is lost
etch is lost
lenny is lost

Upstream was deleted and debmirror, well, just mirrored them.

I spent several hours trying to find intact repositories for them globally, but did not succeed.

So having another way of keeping debian history for future generations is a very good idea in my opinion!

Arne
Mike Hosken
2022-05-23 13:10:01 UTC
Permalink
Hi Arne,

Just FYI, archive.debian.org or one of its mirror sites has a copy of the distros you have lost, along with the rest of the official Debian releases.

Mike Hosken
Post by s***@caiway.net
On Sat, 21 May 2022 13:33:02 +0800
Post by Zhang Boyang
Hi,
Indeed, I admit super-big-iso is a crazy idea, and a local mirror is
more useful in most cases. I think there is a few special cases that a
super-big-iso might be more useful.
1) Computers / Virtual Machines isolated from public internet or have no
network at all. It is convenient to have such an ISO to install software
on demand. A single file is much more convenient than setting up a local
mirror. It's also easy to manage or verify integrity, if frequent
updates are not needed.
2) Archival purposes. If someone (in future, for example, in 2042) want
to install a very old debian system, he/she may grab the big ISO and all
he/she need is that single file. Although it's not easy to grab the file
in far future, but I guess there is always someone crazy enough to
archive all files, isn't it? :P
Hi Zhang!
A very good idea!
I have local repositories mirrored with debmirror, all versions.
Suddenly I found my archives of the old distributions were (almost) empty.
bo is lost
potato is lost
sarge is lost
etch is lost
lenny is lost
Upstream was deleted and debmirror, well, just mirrored them.
I spent several hours trying to find intact repositories for them globally, but did not succeed.
So having another way of keeping debian history for future generations is a very good idea in my opinion!
Arne
s***@caiway.net
2022-05-23 15:50:01 UTC
Permalink
On Tue, 24 May 2022 00:51:35 +1200
Post by Mike Hosken
Hi Arne,
Just FYI, archive.debian.org or one of its mirror sites has a copy of the distros you have lost, along with the rest of the official Debian releases.
Mike Hosken
Thanks a lot Mike!

Looks good!

Soon I will try to rebuild my lost mirrors.


Thanks again,

Arne
Linux-Fan
2022-05-21 15:00:01 UTC
Permalink
Hi,
Post by Andrew M.A. Cater
Hi,
Indeed, I admit super-big-iso is a crazy idea, and a local mirror is more
useful in most cases. I think there is a few special cases that a
[...]
Post by Andrew M.A. Cater
Actually, setting up a local mirror is potentially almost as easy a use case
as using gigantic media files. That's exactly what many hosting companies
do in their data centres for their own use (and it's also in some of those
data centres where some of the Debian country level mirrors are located).
So a large isolated network may find it useful to have a local mirror
updated periodically.
I admit a local mirror is more suitable for large set of computers. But for
a small set of computers, for example, 1-5 computers, setting up a local
mirror might be too heavy.
[...]

Actually I think this may be a misconception. Setting up a mirror for
internal use is (from my experience with the `ftpsync` script, cf.
https://www.debian.org/mirror/ftpmirror) pretty straight-forward. AFAIK
the minimal steps are as follows:

- Download and extract ftpsync to a location
- Configure distrib/etc/ftpsync.conf
- Setup a webserver to serve the mirror directory
- Invoke mirror script
- Then point clients to the webserver location
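
As a rough illustration of the configuration step, a fragment of distrib/etc/ftpsync.conf might look like the sketch below. The host names and the target path are placeholders, not recommendations. ARCH_INCLUDE is the setting mentioned elsewhere in this thread, and the other variable names are taken from memory of the sample file shipped with ftpsync, so please check them against that file.

```shell
# distrib/etc/ftpsync.conf (fragment, all values are placeholders)
MIRRORNAME="mirror.internal.example"      # hypothetical name of this mirror
TO="/srv/mirror/debian/"                  # where the mirror is stored locally
RSYNC_HOST="upstream-mirror.example.org"  # hypothetical upstream offering rsync
RSYNC_PATH="debian"                       # rsync module on the upstream
ARCH_INCLUDE="amd64"                      # mirror only amd64 binaries
```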

The advantages of using a mirror over iso images are probably worth noting,
too:

- No need to understand the working of jigdo
- Works well with all kinds of machines
(physical, virtual, remote etc.)
- Can span multiple architectures:
It should come as a huge advantage in storage requirements
if you ever need more than one architecture because unlike the
.iso-based approach only the architectures of interest will be
contained in the mirror and packages common for all architectures
will only be stored once.
- Can take advantage of better file system performance, load balancing
and caching. This would probably only affect large installations,
though.
- In case networking is really not wanted on client machines, a mirror
can also be rsync'ed to target storage media and then referenced by
`file://` entries in the client's `/etc/apt/sources.list`.
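
The last point can be sketched as follows. The mount point /media/usb/debian and the function name `print_file_source` are assumptions for illustration. The function only prints the line, and as root one would redirect it into /etc/apt/sources.list or a file under /etc/apt/sources.list.d/.

```shell
#!/bin/sh
# Print an APT source line for a mirror copied to local storage.
# Usage: print_file_source MIRROR_DIR SUITE [COMPONENT...]
print_file_source() {
  dir=$1 suite=$2
  shift 2
  [ $# -gt 0 ] || set -- main          # default component if none given
  printf 'deb file://%s %s %s\n' "$dir" "$suite" "$*"
}

print_file_source /media/usb/debian bullseye main contrib
# prints: deb file:///media/usb/debian bullseye main contrib
```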

HTH and YMMV
Linux-Fan

öö
Zhang Boyang
2022-05-21 16:30:01 UTC
Permalink
Post by Linux-Fan
Post by Zhang Boyang
I admit a local mirror is more suitable for large set of computers.
But for a small set of computers, for example, 1-5 computers, setting
up a local mirror might be too heavy.
Actually I think this may be a misconception. Setting up a mirror for
internal use is (from my experience with the `ftpsync` script, cf.
https://www.debian.org/mirror/ftpmirror) pretty straight-forward. AFAIK
the minimal steps are as follows:
- Download and extract ftpsync to a location
- Configure distrib/etc/ftpsync.conf
- Setup a webserver to serve the mirror directory
- Invoke mirror script
- Then point clients to the webserver location
Thanks for this information. I think I overestimated the difficulty of
creating a mirror.
Post by Linux-Fan
The advantages of using a mirror over iso images are probably worth noting,
too:
- No need to understand the working of jigdo
As a user, jigdo is easy :-) Feed 'jigdo-lite' with 'xxx.iso.jigdo' and
then go to sleep. When you wake up, the 'xxx.iso' file is ready.
Post by Linux-Fan
- Works well with all kinds of machines
  (physical, virtual, remote etc.)
- Can span multiple architectures:
  It should come as a huge advantage in storage requirements
  if you ever need more than one architecture because unlike the
  .iso-based approach only the architectures of interest will be
  contained in the mirror and packages common for all architectures
  will only be stored once.
- Can take advantage of better file system performance, load balancing
  and caching. This would probably only affect large installations,
  though.
- In case networking is really not wanted on client machines, a mirror
  can also be rsync'ed to target storage media and then referenced by
  `file://` entries in the client's `/etc/apt/sources.list`.
Indeed, a local mirror is more suitable in most cases. I can only come up
with a few cases where ISOs are more useful:

1) When installing packages from a mirror, the mirror machine must be
powered on. There is no need for a dedicated mirror machine when using
an ISO. (This doesn't apply to offline storage media, though.)

2) ISOs are faster to copy than a lot of small packages. (A mirror is
easier to update with rsync, though.)

3) The integrity of an ISO is easy to verify. Run 'sha512sum -c SHA512SUMS'
and you can be confident there is no problem in its contents.

4) ISOs are easy to archive or to protect with parity archives such as PAR2
archives. This is more robust than having lots of small files. (Creating
tarballs may provide the same advantages, though.)
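
Point 3) can be tried out with a small sketch like the one below. It assumes GNU sha512sum, whose `--ignore-missing` option skips listed files that were not downloaded. The function name `verify_against_sums` and the stand-in file are made up. For real images the checksum file itself should also be authenticated, e.g. via its detached signature, which is not shown here.

```shell
#!/bin/sh
# Verify every file that is listed in a checksum file and present next to it.
# Usage: verify_against_sums SUMS_FILE
verify_against_sums() {
  ( cd "$(dirname "$1")" &&
    LC_ALL=C sha512sum -c --ignore-missing "$(basename "$1")" )
}

# Demo with a small stand-in file instead of a real ISO.
work=$(mktemp -d)
printf 'not a real ISO\n' >"$work/merged.iso"
( cd "$work" && sha512sum merged.iso >SHA512SUMS )
verify_against_sums "$work/SHA512SUMS"     # prints "merged.iso: OK"
rm -rf "$work"
```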

After all, I admit the advantages of ISOs are minor. But I think it's
good to have an alternative :-)
Post by Linux-Fan
HTH and YMMV
Linux-Fan
öö
Best Regards,
Zhang Boyang
Andrew M.A. Cater
2022-05-21 19:00:01 UTC
Permalink
[Here for anyone else that may need it - probably offtopic for this list
after that.]
Post by Zhang Boyang
Post by Linux-Fan
Post by Zhang Boyang
I admit a local mirror is more suitable for large set of computers.
But for a small set of computers, for example, 1-5 computers,
setting up a local mirror might be too heavy.
Actually I think this may be a misconception. Setting up a mirror for
internal use is (from my experience with the `ftpsync` script, cf.
https://www.debian.org/mirror/ftpmirror) pretty straight-forward. AFAIK
the minimal steps are as follows:
- Download and extract ftpsync to a location
- Configure distrib/etc/ftpsync.conf
- Setup a webserver to serve the mirror directory
- Invoke mirror script
- Then point clients to the webserver location
Thanks for this information. I think I overestimated the difficulty of
creating a mirror.
Quoting myself - which might be bad form

http://flosslinuxblog.blogspot.com/2020/02/rebuilding-mirror-software-mirroring-of.html

gives the full steps to set up a mirror by editing one script, more or less.

You do need rsync and a mirror to pull from but this is really, really easy to do.

Setting up Apache is covered in another blog in that series at almost the
same time - it's essentially just uncommenting the stanza for /srv in the
default configuration. [My mirror directories are all under /srv ]

Hope this helps,

Andy Cater
Post by Zhang Boyang
Best Regards,
Zhang Boyang
Zhang Boyang
2022-05-22 14:10:01 UTC
Permalink
Hi,

Thanks for this information :-) I will try it.

Best Regards,
Zhang Boyang
Post by Andrew M.A. Cater
Quoting myself - which might be bad form
http://flosslinuxblog.blogspot.com/2020/02/rebuilding-mirror-software-mirroring-of.html
gives the full steps to set up a mirror by editing one script, more or less.
You do need rsync and a mirror to pull from but this is really, really easy to do.
Setting up Apache is covered in another blog in that series at almost the
same time - it's essentially just uncommenting the stanza for /srv in the
default configuration. [My mirror directories are all under /srv ]
Hope this helps,
Andy Cater