Haketilo/Hydrilla package building

Added by koszko over 1 year ago

The tool used to make Haketilo packages is Hydrilla builder. Right now Hydrilla builder does not handle program builds in the conventional sense. There's no compilation or minification, only computation of the files' SHA256 sums and (optionally) generation of an SPDX report.
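To make the current behaviour concrete, here is a minimal sketch of computing SHA256 sums for every file of a package. It is not Hydrilla builder's actual code; the function names and the choice of relative POSIX-style paths as keys are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_sums(root: Path) -> dict[str, str]:
    """Map every regular file under root (relative path) to its digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Calling sha256_sums(Path("some-package-dir")) would give a mapping that could be written into the package metadata.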

We want to give Hydrilla builder the ability to install specified dependencies and perform necessary software builds. This will be done as Roadmap task 16 (which, btw, is important enough that we should prioritize it over some of the lower-numbered tasks).

There are many possible ways to solve this. If I just pick one and jump straight to implementing it, I will most likely miss something and have to rewrite it from scratch later, as is usually the case. Hence I want to discuss the feature before implementing anything. I hope to see some comments and suggestions.

Requirements

  1. We should be able to specify the build procedure for a package.
    • Hydrilla builder should be able to automatically perform package build according to that procedure.
  2. We should be able to specify the tools needed to build a package.
    • Hydrilla builder should be able to automatically install those tools when performing a build (by default preferably to an isolated environment).
  3. It should be possible to build all Haketilo packages from source.
  4. We should leverage existing software repositories where possible without violating the requirements above (this applies to both the build tools and software to be packaged as Haketilo resources).

Considerations

Relying on a GNU/Linux distribution's repository for build tools

We could allow Haketilo packages to name their build dependencies from the repositories of a chosen distro. We'd then give Hydrilla builder the ability to install those for the duration of a build. We'd preferably perform builds inside a chroot, PRoot or a Linux namespace-based container, although UML (User-mode Linux) is also an option here [1].
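As a rough sketch of what an isolated build could look like with PRoot, the helper below only assembles the command line. It assumes a root filesystem with the build dependencies already installed; the function name, the /build guest directory and the example paths are hypothetical. The -r (guest root), -b (bind mount) and -w (working directory) options are the ones documented in the PRoot manual.

```python
import shlex
from pathlib import Path


def proot_command(
    rootfs: Path,
    source_dir: Path,
    build_cmd: list[str],
    workdir: str = "/build",
) -> list[str]:
    """Assemble a PRoot invocation that runs build_cmd inside rootfs."""
    return [
        "proot",
        "-r", str(rootfs),                # guest root filesystem
        "-b", f"{source_dir}:{workdir}",  # bind package sources into the guest
        "-w", workdir,                    # start in the bound directory
        *build_cmd,
    ]


if __name__ == "__main__":
    # Hypothetical paths; nothing is executed here, the command is only printed.
    cmd = proot_command(Path("/srv/rootfs"), Path("/home/user/pkg"), ["make", "all"])
    print(shlex.join(cmd))
```

The resulting list could be handed to subprocess.run() once the rootfs has been prepared. No root privileges are needed, which is the main attraction of PRoot over a plain chroot.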

Choosing a distro

At the beginning it's probably best to just tie Hydrilla builder to one chosen distro. With time we can add support for using build dependencies from other ones. This means right now we just need to choose which distro to start with. I'm considering the 3 described below and leaning towards Guix. I excluded Hyperbola due to it lacking some important packages (NodeJS) and Parabola because it's too unstable and in my view has suboptimal packaging hygiene.

Trisquel

Drawbacks:

  • It doesn't emphasize reproducibility.
  • It's an LTS distro. A move to a new stable release could cause problems.

Pros:

  • FSDG-compliance included.
  • Affiliated with GNU which we also affiliate with.

Guix

Drawbacks:

  • Doesn't have as many packages as Trisquel or NixOS.

Pros:

  • Highly reproducible.
  • FSDG-compliance included.
  • Affiliated with GNU which we also affiliate with.
  • Easy installation of a specific version of a tool.

NixOS

Drawbacks:

  • Not FSDG-compliant (although it filters out nonfree packages by default, which should be sufficient for our purposes).

Pros:

  • Highly reproducible.
  • Affiliated with NLnet and NGI0 Discovery which we also affiliate with.
  • Easy installation of a specific version of a tool.

Filling the missing parts

At some point we'll reach a situation where a certain build tool is not packaged for the distro or is present in a version that's inappropriate for building a certain Haketilo package. There's no chance we can get a distro to package a tool in a timely fashion just because we want it. That's when we'll need to create our own supplemental repository to provide the required build tools. At that point we'll also need to give Hydrilla builder the ability to use custom repositories together with the main one.

Luckily, this is a problem we don't have to address now.

Naming build dependencies

The same tool might be named differently under different distros. If we want to enable building a certain Haketilo package under many distros, we have to specify how the build tools are named under each of them. We could do that inside the Haketilo source package definition, but that'd cause repetition when many Haketilo packages rely on the same tool. It'd be good to instead create meta-packages that specify how a build tool is named under each of the supported distros. There'd be 1 meta-package for 1 build tool.

This is also something we need not be concerned about right now.
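For when it does become relevant, the meta-package idea could be sketched roughly as below. The table layout, the distro identifiers and the function name are all hypothetical, and the package names are illustrative examples that would need to be checked against each distro.

```python
# Hypothetical meta-package table: one entry per build tool, mapping a
# distro identifier to the package name used there. Illustrative only.
META_PACKAGES = {
    "nodejs": {"trisquel": "nodejs", "guix": "node"},
    "make": {"trisquel": "make", "guix": "gnu-make"},
}


def resolve_build_deps(tools, distro, meta_packages=META_PACKAGES):
    """Translate distro-independent tool names into one distro's package names."""
    resolved = []
    for tool in tools:
        names = meta_packages.get(tool, {})
        if distro not in names:
            raise LookupError(f"no {distro} package known for build tool {tool!r}")
        resolved.append(names[distro])
    return resolved
```

A Haketilo source package would then list only the tool names ("nodejs", "make") and leave the translation to the builder.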

Relying on NPM

Many common tools for working with Web technologies are installable from Node Package Manager. This is what most people use for building and distributing JavaScript programs, and integrating with it could make building Haketilo packages easier. Unfortunately, this is a bit more problematic than relying on distros. I explain why below.

Software freedom

The NPM repository is not FSDG-compliant. Even though it does require each project to declare a license from the SPDX list, AFAIK it does not require it to be a free software license, not to mention fulfilling the other requirements of the FSDG. There's also no requirement for complete source code to be included in an NPM package (although it is usually included by convention). This means relying on NPM could theoretically lead us to running tools that have a free software license but are only distributed as obfuscated code [2]. You might want to read Michael McMahon's article for some more insight into the problems of language-specific package managers.

Additionally, NPM allows anyone to register and upload a new package as long as its name is not yet used by another package. This by itself is a good thing because it fosters sharing. However, this also means NPM packages might have low hygiene.

In order to make NPM usable in freedom we need to spin up our own NPM repository with sanitized versions of packages from the official one. This might seem like overkill but there are 2 important reasons to consider it:

  1. A libre version of the NPM repo (and also of other language-specific package managers' repos) is what the Free Software Community desperately needs. It would benefit more than just Haketilo.
  2. We don't need to pull all free software packages from NPM. We should start with just the ones we need.

Missing tools

Not all tools related to web development are available through NPM. Some compiler toolchains, especially those targeting WebAssembly, are not in there and might never get included [3]. This means even with NPM integration Hydrilla builder would sometimes need to rely on some distro.

Work and experience needed

Regardless of the considerations above, I admit cloning NPM is not something we should do right now. That's especially true in my case, since I lack experience using NPM. Although tools to set up a private NPM registry are available, merely learning to use them and getting some packages served would cost too much precious time. Therefore, please consider all I wrote about NPM an idea we might want to implement sometime in the future.

Relying on existing repositories for Haketilo packages as well

So far I discussed how we could satisfy Haketilo packages' dependencies on build tools. However, many site fixes will not only require some build tools to be packaged but also have runtime dependencies on libraries. And it happens that the most common JS libraries are already packaged in distros like Trisquel and Guix. Instead of building these libs as part of Hydrilla builder's operation, we could arrange to use the artifacts provided by distros.

Let's consider JQuery as an example. Normally, we'd put JQuery sources in a Hydrilla source package. Hydrilla builder would perform any build steps necessary and would output the files of a JQuery Haketilo package ready to be served by Hydrilla.

Now, the alternative I am suggesting. Under Debian-based distros running apt install libjs-jquery places a non-minified JQuery script under /usr/share/javascript/jquery/jquery.js. Thus, to make a Haketilo JQuery package, we wouldn't even need to put JQuery sources inside our source package. We could instead make the source package use a "dummy" build procedure that simply copies over (or symlinks?) the JQuery script. The libjs-jquery APT package would of course be specified as a build dependency in the Hydrilla source package.
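A "dummy" build procedure of this kind could be as small as the sketch below. The system path is the one named above; the function name and the output directory layout are assumptions, and copying (rather than symlinking) is chosen here only so that the built package is self-contained.

```python
import shutil
from pathlib import Path

# Location where, per the text above, libjs-jquery installs the script.
JQUERY_SYSTEM_PATH = Path("/usr/share/javascript/jquery/jquery.js")


def dummy_build(source: Path, dest_dir: Path) -> Path:
    """Copy an already-installed script into the package output directory."""
    if not source.is_file():
        raise FileNotFoundError(
            f"{source} is missing; is the build dependency installed?"
        )
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / source.name
    shutil.copyfile(source, target)
    return target
```

With libjs-jquery installed in the build environment, dummy_build(JQUERY_SYSTEM_PATH, Path("out")) would produce out/jquery.js, ready to have its SHA256 sum computed like any other package file.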

The same idea could be applied to NPM as well - facilitate making NPM packages into Haketilo libraries. Given the number of libraries in NPM this probably explains why I mentioned it in the first place...

Benefits:

  • It's easy to quickly make some popular JS libs into Haketilo packages and unblock some other tasks.
  • It prevents us from duplicating the work distro packagers already did.
  • We could cooperate with distro maintainers helping each other.

Drawbacks:

  • It will be way harder or even impossible to make the build distro-agnostic.
  • Packages already present in distros are not always built the way we expect. For example, the JavaScript file from libjs-jquery mentioned above also includes a library called Sizzle.js [4]. In Haketilo, we'd rather have JQuery and Sizzle as separate packages with a dependency of one on the other. This means we'd still want to package these on our own (but we can temporarily accept what distros provide).
  • We will end up with some Haketilo packages being true packages and some (mostly libraries) being dummy packages. This looks like a severe inconsistency which I really dislike. We can either:
    • live with it or
    • require all Haketilo packages to be built as distro packages (interestingly, this idea affects other things as well [5]).
  • What if some JS library is licensed under GPLv2-only? When distributing a built version we're also obligated to distribute the sources. But the true sources are not in the dummy Haketilo source package but rather in the distro's source package. This and similar nuances would be fairly easy to overlook. Also, having to set up a mirror of (parts of) a distro's repo just to comply with a license seems like overkill. Hopefully, this won't be a real issue in practice.
  • How should we supply license information corresponding to the file(s) taken from the distro package we're piggybacking on?
    • For JQuery, utilizing a Debian-based distro, we could just have the Haketilo package include everything under /usr/share/doc/libjs-jquery/. However, in the case of other libs some of the files under /usr/share/common-licenses/ would need to be included as well.
    • I am yet to learn how this would look when utilizing Guix.

Keeping the source package format reasonable

How should we define the build procedure (if at all [5])? Debian packages use Makefiles. Arch packages use shell scripts.

Once we decide on the strategies we want to adopt, we could also talk at more length about how we want to change the source package format.

Request for comments

I started writing this with the goal of getting some feedback. It turns out I spent way more time on it than I planned, but it also helped me clear my mind. Now I kind of see what I personally prefer.

Nevertheless, I'd still find any suggestions extremely helpful. Please speak up :)


  1. Chroot has the drawback of requiring root, namespaces have the drawback of being non-functional in certain situations (e.g. inside a chroot), UML has the drawback of requiring a separate kernel build to be carried along. Additionally, all methods but chroot are Linux-specific. There are surely also performance differences between them, but they can probably be neglected. Right now I'm leaning towards PRoot and maybe adding support for chroot later on. Additionally, support for containers would be helpful if we get to provide a build service in the future.

  2. Even if the unobfuscated code of an NPM package is available somewhere else (e.g. in a public git repo), it is no longer possible to automatically rebuild a package from source. I personally want to stick to using packages built by either myself, a distro I trust [6] or an individual I trust (it seems my personal expectations are higher than those of the FSDG in this case). With NPM packages distributed this way, that is made difficult.

  3. Although this is theoretically possible if the entire toolchain gets compiled to WASM. I'm not sure what the current state of efforts towards that is.

  4. Interestingly, Debian also has a libjs-sizzle package. Perhaps a copy of Sizzle is included in JQuery build because too much other software using JQuery expects to be able to load it as a standalone script? Idk. 

  5. When requiring all Haketilo packages to be built as distro packages, we'd no longer need support for running any build procedure in Hydrilla builder. 

  6. But I consider it OK to use Ubuntu packages that are included unmodified in Trisquel. It's off-topic, but you can ask me about this in private. 


Replies (1)

RE: Haketilo/Hydrilla package building - Added by koszko over 1 year ago

Forwarded from an email from Michael McMahon:

Hi, Wojtek!

Thanks for the shout out in the post!

For distribution packaging, I am not super familiar with the node
ecosystem, but from what I gather they tend to use a lot of packages
and newer is better. The low number of packages in guix and nixos
put them at a significant disadvantage here generally. Trisquel's
slow development puts it also at a disadvantage as it is usually one
LTS version behind the latest Ubuntu LTS. If you want to stick with
FSDG distributions and need an up-to-date one, that leaves you with
Parabola and Hyperbola. I take it Hyperbola is stuck working on
their BSD switch-over so that leaves Parabola really. If you want
something similar but not FSDG, I would recommend Debian Sid without
the nonfree repos. I run a personal server for about a year using
Debian Sid and it has been extremely stable for a rolling release
distribution.

I am not really familiar with the packages that you will be building
so I cannot comment too deep. I assume that using a distribution for
packaging would be very limiting and even if you go that route, I
would expect that you would eventually need npm. I have only used a
handful of node things in production so I could be wrong.

Best,
Michael McMahon | Web Developer, Free Software Foundation
GPG Key: 4337 2794 C8AD D5CA 8FCF FA6C D037 59DA B600 E3C0
https://fsf.org

US government employee? Use CFC charity code 63210 to support us
through the Combined Federal Campaign. https://cfcgiving.opm.gov/
