Package Management - Identifiers
The other really big topic when it comes to packages is how to identify them. This is a lot more complicated because I view packages in a polyglot manner instead of within a single foreign ecosystem.
Series
This is going to be a series of posts, but I have no idea of how fast I'll be writing them out. I want to work out my ideas, maybe have a few conversations, and then start to move to more technical concepts.
- 2023-02-07 Package Management - Introduction
- 2023-02-08 Package Management - Versions
- 2023-02-12 Package Management - Identifiers
- 2023-02-13 Package Management - Dependencies
- 2023-09-20 Package Management - Identifiers 2
- 2023-11-30 Package Management - Formats and Registries
Previous Updates
While I got to thinking (this is a living series), I didn't like the upgrade format in the versions post of this series. In there, I suggested this for Bakfu:
{
content: {
version: "1.10.0",
majorUpdate: "[1.10.0, 1.15.0)",
minorUpdate: "[1.10.0, 1.11.0)",
},
package: {
version: "2.0.0",
}
}
I think putting the update rules in their own category makes sense since they relate to each other, and it also leaves room for adding other rules as needed. I thought about putting everything in a version, but then I would have {version: { current: "1.2.3" }}, which didn't feel right.
We also should have the package format version.
{
bakfu: {
version: "0.0.1",
},
content: {
version: "1.10.0",
update: {
major: "[1.10.0, 1.15.0)",
minor: "[1.10.0, 1.11.0)",
},
},
package: {
version: "2.0.0",
}
}
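As a rough illustration of how an update rule like "[1.10.0, 1.15.0)" could be checked, here is a minimal sketch. It assumes versions are plain dotted integers and that the brackets follow the usual interval convention (square means inclusive, round means exclusive); the function names are hypothetical and not part of Bakfu.

```python
import re


def parse_version(text):
    """Turn "1.10.0" into (1, 10, 0) so comparisons are numeric, not textual."""
    return tuple(int(part) for part in text.strip().split("."))


def parse_range(text):
    """Parse interval notation such as "[1.10.0, 1.15.0)" (assumed syntax)."""
    pattern = r"\s*([\[\(])\s*([\d.]+)\s*,\s*([\d.]+)\s*([\]\)])\s*"
    match = re.fullmatch(pattern, text)
    if match is None:
        raise ValueError(f"not a version range: {text!r}")
    opening, low, high, closing = match.groups()
    return opening == "[", parse_version(low), parse_version(high), closing == "]"


def in_range(version, range_text):
    """Return True when the version falls inside the given update range."""
    low_inclusive, low, high, high_inclusive = parse_range(range_text)
    candidate = parse_version(version)
    above = candidate >= low if low_inclusive else candidate > low
    below = candidate <= high if high_inclusive else candidate < high
    return above and below
```

With the structure above, in_range("1.12.3", "[1.10.0, 1.15.0)") would accept a candidate under the major rule, while the same candidate would be rejected by the minor rule "[1.10.0, 1.11.0)".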
Identifier Schemas
A package identifier is one of those things that varies greatly between different languages:
- NPM: markdowny, @mfgames-writing/format
- NuGet: MfGames.Nitride, System.Collections.Generic
- Minetest: beds, moreores
- Java: com.mfgames.ruby (an old game of mine)
- Go: github.com/callicoder/packer/numbers
- Cargo: serde
- Atom: spell-check
That gives a pretty good overview of how most languages identify packages. I don't think there are many other ways of referencing them. In addition to the ones above, I want to allow anyone to create an arbitrary package ecosystem: a game doesn't need to use my calendar's packages (unless it wants to), and the two ecosystems are distinct.
Package Groups and Organizations
I will be up front: I don't like flat package systems such as Node and Cargo. They have a significant problem with name squatting (where someone “reserves” a package name that is ideal), and many of the packages have generic names (Minetest's beds or Node's time) even when other packages are just as capable.
If you notice, almost all of my packages start with mfgames or MfGames. I do that to avoid conflicts with other packages. I thought I wouldn't for Nitride but when I finally created the packages, someone had put one up on nuget.org and I promptly turned it back to MfGames.Nitride. At this point, if someone is using MfGames. in their packages, it's intentional (and probably malicious).
Scoped packages are better, more so if there is a way of controlling who can use that scope. On nuget.org, you can reserve a prefix, which is how they prevent someone from uploading a Microsoft.HackYourComputer package. In theory, I could do the same with MfGames but it requires some dependencies that I don't have lying around (cash).
That said, I don't like NPM's @organization because they also use @1.3.2 for versions. So, then you have package and version identifiers as @mfgames-writing/format@3.4.0. It isn't “pretty” and I am frequently driven by what I think looks good. As such, I'm inclined to avoid the @ whenever possible.
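To make the double use of @ concrete, here is a small sketch of how a tool might split an NPM-style specifier into name and version. The function name is hypothetical; the only rule it relies on is that a leading @ marks a scope and the last @ after that marks the version.

```python
def split_npm_spec(spec):
    """Split "name@version", where the name may itself start with "@scope/"."""
    at = spec.rfind("@")
    # An "@" at index 0 is the scope marker, not a version separator,
    # and -1 means there is no "@" at all.
    if at <= 0:
        return spec, None
    return spec[:at], spec[at + 1:]
```

The special case for index 0 is exactly the kind of wrinkle that comes from one character carrying two meanings.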
Polyglot Identifiers
pol·y·glot /ˈpälēˌɡlät/ -> adj. knowing or using several languages.
When working with code, not only do we use different packages, we use packages that require entirely different systems. Probably one of the most common ones for me is the bane of my JavaScript and TypeScript development: node-gyp. Node-gyp requires a C compiler and Python to run, but it is for Node packages. So, that means that if we really want to identify a package (and its dependencies, a later post), we need to be able to address all of those dependencies in the same addressing scheme.
Uniform Resource Identifiers (URIs)
Since I want to be able to identify packages by something that jumps languages, it seems like a perfect use of a URI. In short, a URI looks like this:
URI = scheme ":" ["//" authority] path ["?" query] ["#" fragment]
The most common ones we see are URLs:
https://fedran.com/flight-of-the-scions/
In the above example, https is the scheme, //fedran.com is the authority, and flight-of-the-scions/ is the path. If we take the same idea, we could create a pseudo version of package identifiers that use the same thing.
(Side note, Humanizer is an awesome package for C# and everything Charm.sh is very pretty. Markdowny is my own tool for working with Markdown + YAML files, so I think it is fantastic.)
This works for well-known package systems, but what about arbitrary ones? Say I create something for Author Intrusion; I could use authorintrusion:spell-check but that would require registering a URI scheme if I want to avoid conflict. A more efficient way (that will also help with some future ideas) is to prefix them with bakfu: instead and then allow domain or scoped registrations for non-well-known ecosystems.
bakfu:npm:markdown
bakfu:go:github.com/charmbracelet/lipgloss
bakfu:authorintrusion.com/spell-check
(I really should do something about that website.)
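A minimal sketch of how the three identifiers above could be told apart, assuming a fixed set of well-known ecosystem names and treating anything else as a domain-scoped ecosystem. The WELL_KNOWN set, the function name, and the type/id keys are assumptions for illustration only.

```python
# Hypothetical set of well-known ecosystems; the real list would need a registry.
WELL_KNOWN = {"npm", "go", "nuget", "cargo"}


def parse_identifier(identifier):
    """Split a bakfu: identifier into an ecosystem ("type") and a package id."""
    scheme, _, rest = identifier.partition(":")
    if scheme != "bakfu" or not rest:
        raise ValueError(f"not a bakfu identifier: {identifier!r}")
    head, separator, tail = rest.partition(":")
    if separator and head in WELL_KNOWN:
        # bakfu:npm:markdown -> well-known alias followed by the package id
        return {"type": head, "id": tail}
    # bakfu:authorintrusion.com/spell-check -> domain, then the package id
    domain, _, name = rest.partition("/")
    return {"type": domain, "id": name}
```

Because the well-known form uses a second colon and the domain form uses a slash, the two shapes never collide as long as no domain is also registered as an alias.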
Well-Known Systems and Aliases
Sadly, this does assume a centralized repository for the well-known (e.g., npm, go, nuget) packages. It also leans into DNS instead of using other addressable schemas or pet names. I could also use dotnet.microsoft.com for nuget (since that is the general packaging system), nodejs.org, and go.dev for those languages.
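If both spellings were allowed, a tool would need to treat an alias and its domain as the same ecosystem. A tiny sketch of that normalization, assuming a hand-written alias table (the table contents mirror the domains mentioned above; the names ALIASES and canonical_ecosystem are hypothetical):

```python
# Hypothetical alias table: short well-known name -> domain that owns it.
ALIASES = {
    "npm": "nodejs.org",
    "nuget": "dotnet.microsoft.com",
    "go": "go.dev",
}


def canonical_ecosystem(name):
    """Map a well-known alias to its domain; unknown names pass through unchanged."""
    return ALIASES.get(name, name)


def same_ecosystem(left, right):
    """Compare two ecosystem names after resolving aliases."""
    return canonical_ecosystem(left) == canonical_ecosystem(right)
```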
If I do go with the well-known names, I would also have to establish an extension process that would allow communities to introduce new aliases or “register” their own to avoid conflict. While that isn't something I'd want to do, if this entire idea goes beyond me, it would be the first thing I would establish because I like to play well with others.
I would also call it BEEP (Bakfu Emerging Extension Process) so then I could have BEEP-0000 be the first one.
Qualified Identifiers
Using the URI, we can use URL-style query parameters to create key/value pairs for additional details about the package such as Cargo's features, .NET's frameworks, or whatever variants are important to include.
For keys used by Bakfu itself, an XML-style prefix would be clearer than either risking a name collision (version= could be common) or using an arbitrary character prefix (_). In these cases, bakfu:version would be useful and allow for further extensions.
For keys that need multiple values, such as a crate's features, the key would be suffixed by [] to indicate an array (“Ruby style” is what I've heard this called).
Rust has both platforms and features:
bakfu:cargo:serde?bakfu:version=1.0.152&feature[]=derive&feature[]=rc
bakfu:cargo:serde?bakfu:version=1.0.152&platform=x86_64-unknown-linux-gnu
NuGet has frameworks:
bakfu:nuget:Autofac?bakfu:version=6.5.0&framework=netstandard2.0
bakfu:nuget:Autofac?bakfu:version=6.5.0&framework=net6.0
Given that, and the future discussion on dependencies, I think it would make sense to make the version a query parameter instead of using the # fragment. Also, some foreign ecosystems don't have versions (Minetest, at least one of the Python systems), which means even a version is optional.
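The query rules above can be sketched with Python's standard urllib.parse.parse_qs, which already collects repeated keys into lists. This is only an illustration under the conventions described here (a [] suffix marks an array, everything else is a single value); parse_qualifiers is a hypothetical name.

```python
from urllib.parse import parse_qs


def parse_qualifiers(identifier):
    """Split an identifier into its base and a dictionary of qualifiers."""
    base, _, query = identifier.partition("?")
    qualifiers = {}
    for key, values in parse_qs(query).items():
        if key.endswith("[]"):
            # "feature[]=derive&feature[]=rc" -> {"feature": ["derive", "rc"]}
            qualifiers[key[:-2]] = values
        else:
            # Single-valued key; if it is repeated, the last value wins.
            qualifiers[key] = values[-1]
    return base, qualifiers
```

An identifier with no query string, such as one from an ecosystem without versions, simply comes back with an empty dictionary.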
Expanded Form
To simplify coding, the above URI could easily be broken down into a standard JSON structure for a less condensed format, with the “bakfu:” attributes pulled up a level.
{
// bakfu:cargo:serde?bakfu:version=1.0.152&feature[]=derive&feature[]=rc&platform=x86_64-unknown-linux-gnu
type: "cargo",
id: "serde",
version: "1.0.152",
attributes: {
"platform": "x86_64-unknown-linux-gnu",
"features": ["derive", "rc"],
},
}
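Going the other direction, a sketch of how the expanded structure might be condensed back into a URI. It assumes the key names shown above (type, id, version, attributes) and writes attribute keys exactly as given, so it does not attempt the singular/plural mapping between feature[] and features; to_uri is a hypothetical name.

```python
from urllib.parse import quote


def to_uri(package):
    """Condense an expanded package description into a bakfu: URI."""
    pairs = []
    if package.get("version") is not None:
        # Bakfu-level fields get the "bakfu:" prefix when pushed into the query.
        pairs.append(("bakfu:version", package["version"]))
    for key, value in package.get("attributes", {}).items():
        if isinstance(value, list):
            # Lists become repeated "key[]" parameters.
            pairs.extend((f"{key}[]", item) for item in value)
        else:
            pairs.append((key, value))
    query = "&".join(f"{key}={quote(str(value), safe='')}" for key, value in pairs)
    uri = f"bakfu:{package['type']}:{package['id']}"
    return f"{uri}?{query}" if query else uri
```

Since the version is optional, a package with neither version nor attributes condenses to just the bakfu:type:id part.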
At the moment, I'm going to use both formats but I lean toward the more expanded form in actual files with the condensed URI form when trying to explain things. I'm sure I will bounce back and forth between something that feels “right”. You may also notice that I spell out most things because I don't like abbreviations while communicating concepts.
Distributed Packages
This is for a future topic, but I am planning on addressing distributed packages that don't encourage using only a single source for the data. So, the URI above only addresses the framework-level components that identify a distinct package, regardless of whether you find it on nuget.org, npmjs.com, gitlab.com, or even src.mfgames.com.