Hydrilla on-disk data format » History » Version 14

koszko, 11/13/2021 04:30 PM
minor corrections

# Hydrilla on-disk data format

This page explains the format in which Hydrilla site content is stored in the filesystem. It describes the format planned for the upcoming Hydrilla 0.2 release.

{{toc}}

## How Hydrilla loads content

Hydrilla expects a content directory to be specified in its configuration file (under the key "content-dir"). It then processes each direct subdirectory of that directory. If a given subdirectory contains an `index.json` file, Hydrilla loads it (ignoring any "//" comments in it) and collects the definitions of site resources, pattern->payload mappings and licenses from it.
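
The comment handling described above can be illustrated with a minimal sketch. It is not Hydrilla's actual implementation; the function names `stripLineComments` and `parseIndexJson` are hypothetical and only show one way of removing "//" comments before parsing, while leaving "//" inside string values (such as URLs) untouched.

```javascript
// Hypothetical helper, not Hydrilla's actual code: remove '//' comments from
// JSON text while leaving '//' inside string literals (e.g. URLs) untouched.
function stripLineComments(text) {
    let result = "";
    let inString = false;
    let i = 0;
    while (i < text.length) {
        const ch = text[i];
        if (inString) {
            result += ch;
            if (ch === "\\" && i + 1 < text.length) {
                // Copy the escaped character verbatim so \" does not end the string.
                result += text[i + 1];
                i += 2;
                continue;
            }
            if (ch === '"')
                inString = false;
            i++;
        } else if (ch === '"') {
            inString = true;
            result += ch;
            i++;
        } else if (ch === "/" && text[i + 1] === "/") {
            // Skip everything up to (but not including) the end of the line.
            while (i < text.length && text[i] !== "\n")
                i++;
        } else {
            result += ch;
            i++;
        }
    }
    return result;
}

// Hypothetical wrapper: strip comments, then hand the rest to the JSON parser.
function parseIndexJson(text) {
    return JSON.parse(stripLineComments(text));
}

const example = `
// a comment
{
    "schema_version": [0, 2], // trailing comment
    "package_url": "https://example.org/hello"
}`;
const parsed = parseIndexJson(example);
console.log(parsed.package_url);
```

Tracking whether the scanner is inside a string matters because values such as "upstream_url" legitimately contain "//".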

## Format of an index.json

To understand the format, look at this example file with explanatory comments:

``` javascript
// SPDX-License-Identifier: CC0-1.0

// Copyright (C) 2021 Wojtek Kosior
// Available under the terms of Creative Commons Zero v1.0 Universal.

// This is an example index.json file describing Hydrilla site content. As you
// can see, for storing site content information Hydrilla utilizes JSON with an
// additional extension in the form of '//' comments support. Hydrilla shall
// look into each direct subdirectory of the content directory passed to it
// (via a config file option). If such a subdirectory contains an index.json
// file, Hydrilla shall process it.

// An index.json file conveys definitions of site resources, pattern->payload
// mappings and licenses thereof. The definitions may reference files under
// index.json's containing directory, using relative paths. This is how scripts,
// license texts, etc. are included. Unix paths (using '/' as separator) are
// assumed. It is not allowed for an index.json file to reference files outside
// its directory.

// Certain objects are allowed to contain a "comment" field. Although '//'
// comments can be used in index.json files, they will be stripped when the file
// is processed. If a comment should be included in the JSON definitions served
// by Hydrilla API, it should be put in a "comment" field of the proper object.

// Various kinds of objects contain version information. Version is always an
// array of integers, with major version number being the first array item. When
// applicable, a version is accompanied by a revision field which contains a
// positive integer. If versions specified by arrays of different length need to
// be compared, the shorter array gets padded with zeroes on the right. This
// means that for example version 1.3 could be given as both [1, 3] and
// [1, 3, 0, 0] (aka 1.3.0.0) and either would mean the same.

{
    // Once our json schema changes, this version will change. Our software will
    // be able to handle both current and older formats thanks to this
    // information present in every index.json file. Different schema versions
    // are always incompatible (e.g. a Hydrilla instance that understands schema
    // version 0.2.0.0 will not understand version 0.2.0.1). Schemas that are
    // backwards-compatible will be denoted by a different revision.
    // We will try to make schema version match the version of Hydrilla software
    // that introduced it.
    "schema_version": [0, 2],
    "schema_revision": 1,

    // Copyright of this json file. It's a list of copyright holder information
    // objects. Alternatively, "auto" can be used to make Hydrilla attempt to
    // extract copyright info from the comment at the beginning of the file.
    "copyright":  [
        // There can be multiple entries, one for each co-holder of the
        // copyright.
        {
            // There can also be multiple years, like ["2021","2023-2024"].
            "years": ["2021"],
            // Name of the copyright holder. Depending on the situation it can
            // be just the first name, name+surname, a company name, a
            // pseudonym, etc.
            "holder": "Wojtek Kosior"
        }
    ],

    // License of this json file. Identifier has to be known to Hydrilla. Can
    // be defined either in the same or another index.json file as a "license"
    // item. It is possible to specify license combinations, like:
    // [["Expat", "and", "Apache-2.0"], "or", "GPL-3.0-only"]
    // Alternatively, "auto" can be used to make Hydrilla attempt to extract
    // license info from this file's SPDX license identifier.
    "licenses": "CC0-1.0",

    // Where this software/work initially comes from. In some cases (i.e. when
    // the developer of content is also the one who packages it for Hydrilla)
    // this might be the same as "package_url".
    "upstream_url": "https://git.koszko.org/pydrilla/tree/example_content/hello",

    // Where sources for the packaging of this content can be found.
    "package_url": "https://git.koszko.org/pydrilla/tree/example_content/hello",

    // Additional "comment" field can be used if needed.
    // "comment": ""

    // List of actual site resources, pattern->payload mappings and licenses.
    // Each of them is represented by an object. Meta-sites and replacement site
    // interfaces will also belong here once they get implemented.
    "definitions": [
        {
            // Value of "type" can currently be one of: "resource", "license"
            // and "mapping". The one we have here, "resource", defines a list
            // of injectable scripts that can be used as a payload or as a
            // dependency of another "resource". In the future CSS style sheets
            // and WASM modules will also be composite parts of a "resource" as
            // scripts are now.
            "type": "resource",

            // Used when referring to this resource in "dependencies" list of
            // another resource or in "payload" field of a mapping. Should
            // be concise and can only use a restricted set of characters. It
            // has to match: [-0-9a-zA-Z]
            "identifier": "helloapple",

            // "long_name" should be used to specify a user-friendly alternative
            // to an identifier. It should generally not collide with a long
            // name of some resource with a different uuid and also shouldn't
            // change in-between versions of the same resource, although
            // exceptions to both rules might be considered. Long name is
            // allowed to contain arbitrary unicode characters (within reason!).
            "long_name": "Hello Apple",

            // Different versions (e.g. 1.0 and 1.3) of the same resource can be
            // defined in separate index.json files. This makes it easy to
            // accidentally cause an identifier clash. To help detect it, we
            // require that each resource has a uuid associated with it. An
            // attempt to define multiple resources with the same identifier and
            // different uuids will result in an error being reported. Defining
            // multiple resources with different identifiers and the same uuid
            // is disallowed for now (it may be later permitted if we consider
            // it good for some use-case).
            "uuid": "a6754dcb-58d8-4b7a-a245-24fd7ad4cd68",

            // Version should match the upstream version of the resource (e.g. a
            // version of javascript library). Revision number starts as 1 for
            // each new resource version and gets incremented by 1 each time a
            // modification to the packaging of this version is done. Hydrilla
            // will allow multiple definitions of the same resource to load, as
            // long as their versions differ. Thanks to the "version" and
            // "revision" fields, clients will know they have to update a
            // certain resource after it has been updated. If multiple
            // definitions of the same version of a given resource are provided,
            // an error is generated (even if those definitions differ by
            // revision number).
            "version": [2021, 11, 10],
            "revision": 1,

            // A short, meaningful description of what the resource is and/or
            // what it does.
            "description": "greets an apple",

            // If needed, a "comment" field can be added to provide some
            // additional information.
            // "comment": "this resource something something",

            // One should specify the copyright and licensing terms of the
            // entire package. The format is the same as when specifying these
            // for the index.json file, except "auto" cannot be used.
            "copyright": [{"years": ["2021"], "holder": "Wojtek Kosior"}],
            "licenses": "CC0-1.0",

            // Resource's "dependencies" array shall contain names of other
            // resources that (in case of scripts at least) should get evaluated
            // on a page before this resource's own scripts.
            "dependencies": ["hello-message"],

            // Array of javascript files that belong to this resource.
            "scripts": [
                {
                    // Script name. It should also be a valid file path.
                    "name": "hello.js",
                    // Copyright and license info of a script file can be
                    // specified using the same format as in the case of the
                    // index.json file itself. If "copyright" or "licenses" is
                    // not provided, Hydrilla assumes it to be the same as the
                    // value specified for the resource itself.
                    "copyright": "auto",
                    "licenses":  "auto"
                }, {
                    "name":   "bye.js"
                }
            ]
        }, {
            "type":       "resource",
            "identifier": "hello-message",
            "long_name":  "Hello Message",
            "uuid":       "1ec36229-298c-4b35-8105-c4f2e1b9811e",
            "version":     [2021, 11, 10],
            "revision":    2,
            "description": "define messages for saying hello and bye",
            "copyright":   [{"years": ["2021"], "holder": "Wojtek Kosior"}],
            "licenses":    "CC0-1.0",
            // If "dependencies" is empty, it can also be omitted.
            // "dependencies": [],
            "scripts": [{"name": "message.js"}]
        }, {
            "type": "mapping",

            // Has similar function to resource's identifier. Should be concise
            // and can only use a restricted set of characters. It has to match:
            // [-0-9a-zA-Z]
            // It can be the same as some resource identifier (those are
            // different entities and are treated separately).
            "identifier": "helloapple",

            // "long_name" and "uuid" have the same meaning as in the case of
            // resources. Uuids of a resource and a mapping can technically be
            // the same, but it is recommended to avoid even this kind of
            // repetition.
            "long_name": "Hello Apple",
            "uuid": "54d23bba-472e-42f5-9194-eaa24c0e3ee7",

            // "version" differs from its counterpart in resource in that it has
            // no accompanying revision number.
            "version": [2021, 11, 10],

            // A short, meaningful description of what the mapping does.
            "description": "causes apple to get greeted on Hydrillabugs issue tracker",

            // A comment, if necessary.
            // "comment": "blah blah because bleh"

            // The "payloads" array specifies which payloads are to be
            // applied to which URLs.
            "payloads": [
                {
                    // Should be a valid Haketilo URL pattern.
                    "pattern": "https://hydrillabugs.koszko.org/***",
                    // Should be the name of an existing resource. The resource
                    // may, but doesn't have to, be defined in the same
                    // index.json file.
                    "payload": "helloapple"
                },
                // More associations may follow.
                {
                    "pattern": "https://hachettebugs.koszko.org/***",
                    "payload": "helloapple"
                }
            ]
        }, {
            "type": "license",

            // Will be used to refer to this license in other places. Should
            // match the SPDX identifier if possible (despite that, please use
            // "Expat" instead of "MIT" where possible). Unlike other definition
            // types, "license" does not allow uuids to be used to avoid license
            // id clashes. Any attempt to define multiple licenses with the same
            // id will result in an error being reported.
            "identifier": "CC0-1.0",

            // This long name must also be unique among all license definitions.
            "long_name": "Creative Commons Zero v1.0 Universal",

            // We don't use "version" in license definitions. We do, however,
            // use "revision" to indicate changes to the packaging of a license.
            // Revision should be increased by 1 at each such change.
            "revision": 2,

            "legal_text": [
                // Legal text can be available in multiple forms. Usually just
                // a plain .txt file is enough, though.
                {
                    // "format" should match an agreed-upon MIME type if
                    // possible.
                    "format": "text/plain",
                    // Value of "file" should be a path relative to the
                    // directory of index.json file.
                    "file":   "cc0.txt"
                }
                // If a markdown version of CC0 was provided, we could add this:
                // {
                //     "format": "text/markdown",
                //     "file": "cc0.md"
                // }
            ]

            // If needed, a "comment" field can be added to clarify something.
            // For example, when defining "Expat" license we could add:
            //
            // "comment": "Expat license is the most common form of the license often called \"MIT\". Many other forms of \"MIT\" license exist. Here the name \"Expat\" is used to avoid ambiguity."

            // If applicable, a "notice" can be included. It shall then be a
            // path (relative to index.json) to a plain text file with that
            // notice.
            //
            // "notice": "license-notice.txt"
            //
            // This is needed for example in case of GNU licenses (both with and
            // without exceptions). For example,
            // "GPL-3.0-or-later-with-html-exception" could have the following
            // in its notice file:
            //
            // This program is free software: you can redistribute it and/or
            // modify it under the terms of the GNU General Public License as
            // published by the Free Software Foundation, either version 3 of
            // the License, or (at your option) any later version.
            //
            // This program is distributed in the hope that it will be useful,
            // but WITHOUT ANY WARRANTY; without even the implied warranty of
            // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
            // GNU General Public License for more details.
            //
            // As a special exception to the GPL, any HTML file which merely
            // makes function calls to this code, and for that purpose
            // includes it by reference shall be deemed a separate work for
            // copyright law purposes.  If you modify this code, you may extend
            // this exception to your version of the code, but you are not
            // obligated to do so.  If you do not wish to do so, delete this
            // exception statement from your version.
            //
            // You should have received a copy of the GNU General Public License
            // along with this program.  If not, see
            // <https://www.gnu.org/licenses/>.
        }
    ]
}
```
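
Several rules stated in the comments of the example file lend themselves to small checks: versions are compared with the shorter array padded with zeroes, identifiers are limited to the characters `[-0-9a-zA-Z]`, and referenced files must stay inside the directory of `index.json`. The sketch below illustrates these rules only. The function names are hypothetical, it is not taken from Hydrilla's code, and it assumes that the character class applies to every character of a non-empty identifier.

```javascript
// Hypothetical helpers illustrating rules from the example file's comments.

// Compare two version arrays; the shorter one is padded with zeroes on the
// right. Returns -1, 0 or 1.
function compareVersions(a, b) {
    const length = Math.max(a.length, b.length);
    for (let i = 0; i < length; i++) {
        const left = i < a.length ? a[i] : 0;
        const right = i < b.length ? b[i] : 0;
        if (left !== right)
            return left < right ? -1 : 1;
    }
    return 0;
}

// Assumption: the whole identifier must consist of one or more characters
// from the set [-0-9a-zA-Z].
function isValidIdentifier(identifier) {
    return /^[-0-9a-zA-Z]+$/.test(identifier);
}

// Check that a relative Unix path never leaves the directory it starts in.
function staysInsideDirectory(path) {
    if (path.startsWith("/"))
        return false;                   // absolute paths are rejected
    let depth = 0;
    for (const segment of path.split("/")) {
        if (segment === "" || segment === ".")
            continue;                   // no change of directory
        if (segment === "..")
            depth--;
        else
            depth++;
        if (depth < 0)
            return false;               // climbed above the starting directory
    }
    return true;
}

console.log(compareVersions([1, 3], [1, 3, 0, 0]));        // 0
console.log(compareVersions([0, 2, 0, 0], [0, 2, 0, 1]));  // -1
console.log(isValidIdentifier("hello-message"));           // true
console.log(staysInsideDirectory("../other/cc0.txt"));     // false
```

With such a comparison, `[1, 3]` and `[1, 3, 0, 0]` are treated as the same version, which matches the padding rule described at the top of the example file.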