LOGIWEB(7)                File formats and protocols                LOGIWEB(7)

NAME

       logiweb - Logiweb protocol

DESCRIPTION

       Logiweb is a system for storing, locating, and transmitting Logiweb
       pages. Logiweb pages may contain free text mixed with machine-
       intelligible objects like computer programs, test suites, and formal
       proofs.

       Logiweb defines a referencing scheme in which each Logiweb page has a
       unique Logiweb reference. A Logiweb reference is typically around 30
       bytes long. Among other things, a Logiweb reference contains a
       RIPEMD-160 hash key of the referenced page.

       The purpose of the Logiweb protocol is to locate a Logiweb page, given
       its reference.

       To maximize efficiency, the Logiweb protocol was originally intended to
       be a protocol of its own, using its own UDP port.

       Because applying for a UDP port turned out to be too complicated,
       however, the Logiweb protocol will be channeled through HTTP instead. A
       new protocol will be defined based on the protocol defined below, cf.
       logiweb(5). The present document is included as logiweb(7) until the
       new protocol becomes available.

PROTOCOL DEFINITION

       Internet Draft                                                   K. Grue
       <draft-grue-logiweb-protocol-1-00.txt>               Associate Professor
       Category: Experimental                                              DIKU
       Expires September 8, 2007                                  March 8, 2007

                          Logiweb Protocol Version 1
                    <draft-grue-logiweb-protocol-1-00.txt>

Status of this Memo
   Distribution of this memo is unlimited.

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task
   Force (IETF), its areas, and its working groups. Note that other groups
   may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference material
   or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   When publishing mechanically verified mathematics on the Internet,
   there is a need for referencing previously published documents. As an
   example, referenced documents may contain needed definitions, lemmas,
   and proofs. A reference from one mechanically verified document to
   another is much like any other Uniform Resource Locator, but there is
   a need to ensure that referenced documents do not change after
   publication. Otherwise, a change to, e.g., a definition in a
   referenced document could invalidate the correctness of a referencing
   document.

   The present document describes the protocol used by an experimental
   system named "Logiweb", which allows publishing mechanically
   verified, immutable mathematical documents.



Table of Contents

   1. Introduction
   2. Protocol
      2.1. Cardinals
      2.2. Identifiers
      2.3. Logiweb identifier
      2.4. Timestamps
      2.5. Ping requests
      2.6. Pong responses
      2.7. Event responses
      2.8. Nop requests
      2.9. Prefix messages
      2.10. Vectors
      2.11. Get messages
      2.12. Server states
      2.13. Server states are binary trees
      2.14. The type attribute
      2.15. The update attribute
      2.16. The left and right attributes
      2.17. The sibling attribute
      2.18. The url attribute
      2.19. The leap attribute
      2.20. Other attribute classes
      2.21. The initial state
      2.22. Got messages
      2.23. Put messages
   3. Security Considerations
      3.1. Unwanted outgoing information
      3.2. State corruption
      3.3. Incoming denial-of-service attacks
      3.4. Outgoing denial-of-service attacks
   4. IANA Considerations
      4.1. Well Known Port 332
      4.2. MIME type application/prs.logiweb
   5. References
      5.1. Normative References
      5.2. Informative References



1. Introduction

   This document defines the 'Logiweb protocol' version 1.

   Logiweb is a system for publication of immutable documents of high
   typographic quality which contain computer programs and mathematical
   definitions, theorems, and proofs [Logiweb].

   To understand the Logiweb protocol, only the following features of
   the Logiweb system are needed:

      o  A Logiweb document is a sequence of bytes.  A Logiweb document
         consists of a version number followed by a RIPEMD-160 hash key
         [RIPEMD] followed by a time stamp followed by a sequence of
         bytes.

      o  Any Logiweb document has a 'Logiweb reference'.  The reference
         is a sequence of bytes.  The reference of a document is the
         version number followed by the hash key followed by the time
         stamp of the document.

      o  It is assumed (cf. the section on security considerations
         later in this document) that any two Logiweb documents with the
         same reference are identical. This is ensured by the RIPEMD-160
         hash key in all probability.

      o  To be considered 'published', a Logiweb document must be
         accessible using the World Wide Web (WWW).  A published Logiweb
         document may be mirrored such that it is available under more
         than one Uniform Resource Locator (URL) [RFC3986].  A published
         Logiweb document may be moved and copies of it may be deleted
         such that the set of URLs associated with a Logiweb document
         may change with time.

      o  The Logiweb system comprises Logiweb 'servers' and Logiweb
         'clients'.

      o  A Logiweb server is a running computer program which
         communicates with Logiweb clients and other Logiweb servers
         using the Logiweb protocol, and which provides the services
         described in the following.

      o  A Logiweb client is a running computer program which
         communicates with Logiweb servers using the Logiweb protocol,
         and which uses the services described in the following.

   The main task of Logiweb servers is to keep track of the relationship
   between Logiweb references and their associated fluctuating set of
   URLs. The main service provided by Logiweb servers is to translate
   Logiweb references to URLs. All Logiweb servers on the Internet shall
   cooperate on this.

   As mentioned above, a Logiweb document must be accessible using the
   WWW to be considered 'published'. In addition, the URL of at least
   one copy of the document must be known to at least one of the
   cooperating Logiweb servers.

   As secondary services, a Logiweb server can identify itself as a
   Logiweb server, it can tell what time it is according to the server's
   clock, and it can tell what leap seconds have occurred.

   Logiweb servers are not supposed to deliver Logiweb documents. They
   are merely supposed to translate Logiweb references to URLs. The
   actual delivery of Logiweb documents is supposed to be performed by
   HTTP servers.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].



2.  Protocol

   The Logiweb protocol defines the syntax and semantics of 'Logiweb
   messages'.  Logiweb messages are the units of exchange when using the
   Logiweb protocol.

   The Logiweb protocol is an application layer protocol.  The Logiweb
   protocol can build on top of connection-based transport protocols
   like TCP [RFC0793] as well as datagram protocols like UDP [RFC0768].

   When using a datagram protocol, each datagram contains one and only
   one Logiweb message.  When using a connection-based protocol, Logiweb
   messages are transmitted back-to-back.

   The Logiweb protocol specifies that some messages require a response
   and some do not.  However, as an overall rule, whenever an
   application receives a datagram containing a Logiweb message, the
   application is allowed not to respond.  Furthermore, whenever an
   application receives Logiweb messages over a connection-based
   transport, the application is allowed to close the connection at any
   time.

   Applications should respond to messages which require a response
   unless they have a reason for not doing so.  Reasons for not
   responding to a datagram or closing a connection could be that the
   application is short of outgoing bandwidth, that the application
   thinks it is suffering a denial-of-service attack, or that the
   application thinks that the other end of the communication is broken
   or malicious.

   Furthermore, whenever an application receives a message which
   requires a response, the application is allowed to respond with a
   Logiweb 'Sorry' message.  A 'Sorry' message indicates that the
   application is unwilling to answer at the given time but may be
   willing to answer later if the same question is sent again.

   Whenever an application receives a message which requires a response
   via a connection-based protocol, the application is required to
   respond properly OR respond with a 'Sorry' message OR disconnect.
   Not responding is not an option when using a connection-based
   transport.

   The syntax of Logiweb messages is expressed in ABNF [RFC4234] in the
   following. The semantics are expressed in plain prose.



2.1.  Cardinals

         middle-septet = %d128-255
         end-septet    = %d000-127
         cardinal      = *middle-septet end-septet

   A middle-septet X represents the number X minus 128.  An end-septet X
   represents the number X.  A cardinal represents a non-negative
   integer using little endian base 128.  As an example, the cardinal

         129 002

   represents the non-negative integer 1 + 128 * 2 = 257. The cardinal

         129 130 000

   also represents 257, since a trailing zero septet does not change
   the value.
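
   The cardinal encoding can be sketched in Python as follows. This is
   a minimal illustration only; the function names encode_cardinal and
   decode_cardinal are assumptions made for this example and are not
   defined by the protocol. The encoder emits the shortest form, while
   the decoder also accepts longer forms padded with zero septets.

```python
def encode_cardinal(n: int) -> bytes:
    """Encode a non-negative integer as the shortest possible cardinal."""
    if n < 0:
        raise ValueError("cardinals are non-negative")
    out = bytearray()
    while n >= 128:
        out.append(128 + n % 128)  # middle-septet: low 7 bits plus 128
        n //= 128
    out.append(n)                  # end-septet: remaining value below 128
    return bytes(out)


def decode_cardinal(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one cardinal starting at pos; return (value, next position)."""
    value, weight = 0, 1
    while True:
        if pos >= len(data):
            raise ValueError("truncated cardinal")
        septet = data[pos]
        pos += 1
        value += (septet % 128) * weight  # little endian base 128
        weight *= 128
        if septet < 128:                  # an end-septet ends the cardinal
            return value, pos


# 300 = 44 + 128 * 2, so the shortest encoding is 172 002.
assert encode_cardinal(300) == bytes([172, 2])
# A padded form of the same number decodes to the same value.
assert decode_cardinal(bytes([172, 130, 0])) == (300, 3)
```

   Returning the next position lets a caller read several cardinals
   back-to-back from one message.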



2.2.  Identifiers

         x00           = %d000 / %d128 x00
         x01           = %d001 / %d129 x00
         x02           = %d002 / %d130 x00
         x03           = %d003 / %d131 x00
         x04           = %d004 / %d132 x00
         x05           = %d005 / %d133 x00
         x06           = %d006 / %d134 x00
         x07           = %d007 / %d135 x00

   The syntax class x03 covers all cardinals whose value is three. The
   other syntax classes are similar.



2.3.  Logiweb identifier

         L             = %d204
         o             = %d239
         g             = %d231
         i             = %d233
         w             = %d247
         e             = %d229
         b             = %d226
         id-version    = %d001
         id-Logiweb    = L o g i w e b id-version

   Logiweb applications use id-Logiweb for indicating that they use the
   Logiweb protocol. Note that 204 is a middle-septet which represents
   the number 204 - 128 = 76, which is the Unicode [Unicode] code point
   of a Latin capital letter L.
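
   As an illustration, the eight bytes of id-Logiweb can be derived
   from the ASCII string "Logiweb" by adding 128 to each character
   code and appending the version septet. The sketch below assumes a
   version number below 128; the function name make_id_logiweb is
   hypothetical.

```python
def make_id_logiweb(version: int = 1) -> bytes:
    """Build id-Logiweb: seven middle-septets spelling "Logiweb",
    followed by the version number as the end-septet."""
    if not 0 <= version < 128:
        raise ValueError("this sketch only handles one-septet versions")
    name = b"Logiweb"
    return bytes(code + 128 for code in name) + bytes([version])


ID_LOGIWEB = make_id_logiweb()
assert list(ID_LOGIWEB) == [204, 239, 231, 233, 247, 229, 226, 1]
```

   Because the first seven bytes are middle-septets and the last is an
   end-septet, id-Logiweb as a whole is also a syntactically valid
   cardinal.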



2.4.  Timestamps

         timestamp     = mantissa exponent
         mantissa      = cardinal
         exponent      = cardinal

   Logiweb measures the time at which an event occurred in 'Logiweb
   time'. Logiweb time measures the number of seconds that have elapsed
   according to International Atomic Time (TAI) since TAI 00:00:00 of
   Modified Julian Day (MJD) 0.

   For information, MJD 0 is November 17, 1858 in the Gregorian
   calendar. TAI 00:00:00 of MJD 0 equals Coordinated Universal Time
   (UTC) 00:00:10 of MJD 0 since, by convention, TAI and UTC were 10
   seconds apart before June 30, 1972. In short, UTC is a time scale in
   which it is noon when Greenwich is under the sun.

   A timestamp consists of two cardinals, M and E, and represents the
   number M*10^(-E), where u^v denotes u raised to power v. As an
   example,

         129 002 009

   denotes 257 nanoseconds past the Logiweb epoch, where the Logiweb
   epoch is TAI 00:00:00 of MJD 0.
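
   A timestamp can be decoded exactly with rational arithmetic. The
   following is a minimal sketch; the helper names are assumptions of
   this example, and the use of Python's Fraction type is a design
   choice made here to avoid floating-point rounding, not something
   the protocol prescribes.

```python
from fractions import Fraction


def decode_cardinal(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one little endian base 128 cardinal; return (value, next)."""
    value, weight = 0, 1
    while True:
        if pos >= len(data):
            raise ValueError("truncated cardinal")
        septet = data[pos]
        pos += 1
        value += (septet % 128) * weight
        weight *= 128
        if septet < 128:
            return value, pos


def decode_timestamp(data: bytes, pos: int = 0) -> tuple[Fraction, int]:
    """Decode mantissa M and exponent E; return (M * 10^(-E), next)."""
    mantissa, pos = decode_cardinal(data, pos)
    exponent, pos = decode_cardinal(data, pos)
    return Fraction(mantissa, 10 ** exponent), pos


# Mantissa 5 with exponent 3 is 5 milliseconds past the Logiweb epoch.
seconds, _ = decode_timestamp(bytes([5, 3]))
assert seconds == Fraction(5, 1000)
```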



2.5.  Ping requests

         message       = ping
         ping          = id-ping
         id-ping       = x02

   A ping request represents the question 'who are you, and what time is
   it'.



2.6.  Pong responses

         message       =/ pong
         pong          = id-pong id-Logiweb timestamp
         id-pong       = x03

   A Logiweb server which receives a ping request shall do one of the
   following:

      o  Respond by a pong message containing the current time.

      o  Respond by a 'Sorry' message.

      o  Avoid responding if the ping is transported by a datagram.

      o  Disconnect if the ping is transported by a connection-based
         transport.

   Logiweb servers are supposed to respond to ping requests. A Logiweb
   client should consider the other end of the connection broken if
   the client receives a ping request.

   Logiweb applications SHALL NOT respond to pong responses.



2.7.  Event responses

         message       =/ event
         event         = id-event notice
         id-event      = x01
         notice        = sorry / received / rejected
         sorry         = x00
         received      = x01
         rejected      = x02

   A 'sorry' response indicates that the sender has received a request
   which it is unwilling to answer at the given time but may be willing
   to answer later. Logiweb applications are allowed to send a 'sorry'
   response to any request which requires a response.

   A 'received' message indicates that the sender acknowledges the
   receipt of a request but is not going to give any further answer. A
   'received' message is the proper response to the 'put' request
   described later.

   A 'rejected' message indicates that the sender acknowledges the
   receipt of a request but will not and never will answer that
   particular request. Logiweb applications may respond by a 'rejected'
   message only when they receive a malformed request.

   Logiweb applications SHALL NOT respond to event responses.



2.8.  Nop requests

         message       =/ nop
         nop           = id-nop
         id-nop        = x00

   Logiweb applications SHALL NOT respond to nop requests. Nop requests
   may be used for padding when using connection-based transports. There
   is no point in sending nop datagrams. Applications are allowed to
   disconnect connection-based transports at any time, so even though
   applications are not allowed to respond to nop requests, they may
   still disconnect on a 'nop' without violating the protocol.



2.9.  Prefix messages

         message       =/ prefix
         prefix        = id-prefix code contents
         id-prefix     = x07
         code          = cardinal
         contents      = message

   Whenever a Logiweb application receives a prefix message, it shall
   process the contents of the message. If the application responds to
   the contents, it shall prefix the given code to the response.

   Example: Suppose an application receives a ping with two prefixes:

         007 100 007 101 002

   Furthermore, suppose the application decides to respond with a
   'sorry' message. Then the response should be:

         007 100 007 101 001 000

   Because of prefixes, messages can be arbitrarily long. Messages are
   typically less than 100 bytes in length. Applications are suggested
   to process messages that are up to 65536 bytes long. When receiving
   messages longer than that, applications are suggested to disconnect
   if the message is received over a connection-based transport and to
   discard the message if it is received as a mammoth datagram.
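
   The prefix rule can be sketched as follows: split the incoming
   message into its leading prefixes and the inner message, then
   prepend the same prefixes to the response. This is a minimal
   illustration under stated assumptions; the function names are
   hypothetical, only the 'sorry' response is shown, and the prefix
   bytes are echoed verbatim rather than re-encoded.

```python
ID_EVENT, ID_PREFIX, SORRY = 1, 7, 0


def read_cardinal(data: bytes, pos: int) -> tuple[int, int]:
    """Decode one little endian base 128 cardinal; return (value, next)."""
    value, weight = 0, 1
    while True:
        if pos >= len(data):
            raise ValueError("truncated message")
        septet = data[pos]
        pos += 1
        value += (septet % 128) * weight
        weight *= 128
        if septet < 128:
            return value, pos


def split_prefixes(message: bytes) -> tuple[bytes, bytes]:
    """Return (all leading 'id-prefix code' pairs, the inner message)."""
    pos = 0
    while True:
        ident, after_ident = read_cardinal(message, pos)
        if ident != ID_PREFIX:
            return message[:pos], message[pos:]
        _code, pos = read_cardinal(message, after_ident)


def sorry_response(request: bytes) -> bytes:
    """Answer any request with 'sorry', echoing the request's prefixes."""
    prefixes, _inner = split_prefixes(request)
    return prefixes + bytes([ID_EVENT, SORRY])


# The example from the text: a ping carrying prefix codes 100 and 101.
request = bytes([7, 100, 7, 101, 2])
assert sorry_response(request) == bytes([7, 100, 7, 101, 1, 0])
```

   A real implementation would also enforce a message size limit, such
   as the suggested 65536 bytes, before parsing.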



2.10.  Vectors

         vector        = length bytes
         length        = cardinal
         bytes         = *byte
         byte          = %d0-255

   A vector represents a list of bits. The given length is the number of
   bits in the list. The syntax of vectors is NOT context free since the
   number of bytes must be equal to the given length divided by eight
   and rounded up to the nearest integer. As an example,

         012 128 015

   represents a list comprising twelve bits. The length field occupies
   the first byte. Twelve divided by eight and rounded up equals two,
   indicating that the next two bytes are part of the vector.

   The vector 012 128 015 is translated to a list of bits as follows.
   First write the bytes in binary big endian:

            1000 0000  0000 1111

   Then bit swap each byte:

            0000 0001  1111 0000

   Then pick the first twelve bits:

            0000 0001 1111

   Sane programmers don't bit swap. Sane programmers realize and utilize
   that Logiweb is little endian.
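
   The little endian reading can be sketched without any bit swapping
   by taking bit i of the list from bit position (i mod 8) of byte
   (i div 8), counting from the least significant bit. The function
   names below are assumptions of this example, not part of the
   protocol.

```python
def decode_cardinal(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one little endian base 128 cardinal; return (value, next)."""
    value, weight = 0, 1
    while True:
        if pos >= len(data):
            raise ValueError("truncated cardinal")
        septet = data[pos]
        pos += 1
        value += (septet % 128) * weight
        weight *= 128
        if septet < 128:
            return value, pos


def decode_vector(data: bytes, pos: int = 0) -> tuple[list[int], int]:
    """Decode a vector into a list of bits; return (bits, next position)."""
    length, pos = decode_cardinal(data, pos)  # length counts bits
    nbytes = (length + 7) // 8                # round up to whole bytes
    body = data[pos:pos + nbytes]
    if len(body) != nbytes:
        raise ValueError("truncated vector")
    # Least significant bit of each byte comes first in the list.
    bits = [(body[i // 8] >> (i % 8)) & 1 for i in range(length)]
    return bits, pos + nbytes


# The example from the text: 012 128 015 yields 0000 0001 1111.
bits, _ = decode_vector(bytes([12, 128, 15]))
assert bits == [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```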



2.11.  Get messages

         message       =/ get
         get           = id-get address class index
         id-get        = x04
         address       = vector
         class         = update / type / left / right
         class         =/ sibling / url / leap
         update        = x00
         type          = x01
         left          = x02
         right         = x03
         sibling       = x04
         url           = x05
         leap          = x06
         index         = cardinal

   Logiweb servers are supposed to maintain a 'state' which is visible
   from the outside. Clients and other servers may query the state of a
   Logiweb server using get messages. A get message requests a Logiweb
   server to return the 'attribute' which the server associates to the
   given address, class, and index.

   A Logiweb server has no other visible state than what can be queried
   using get messages.
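
   Building a get message amounts to concatenating id-get, the address
   vector, the class, and the index. The sketch below is illustrative
   only; the names make_get, encode_vector, and CLASSES are assumptions
   of this example, and addresses are passed as Python lists of 0 and
   1.

```python
ID_GET = 4
CLASSES = {"update": 0, "type": 1, "left": 2, "right": 3,
           "sibling": 4, "url": 5, "leap": 6}


def encode_cardinal(n: int) -> bytes:
    """Encode a non-negative integer in little endian base 128."""
    if n < 0:
        raise ValueError("cardinals are non-negative")
    out = bytearray()
    while n >= 128:
        out.append(128 + n % 128)
        n //= 128
    out.append(n)
    return bytes(out)


def encode_vector(bits: list[int]) -> bytes:
    """Encode a list of bits: bit count, then bytes filled LSB first."""
    body = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            body[i // 8] |= 1 << (i % 8)
    return encode_cardinal(len(bits)) + bytes(body)


def make_get(address: list[int], class_name: str, index: int = 0) -> bytes:
    """Build a get message for the given address, class, and index."""
    return (bytes([ID_GET]) + encode_vector(address)
            + encode_cardinal(CLASSES[class_name]) + encode_cardinal(index))


# Query the empty address for a 'leap' attribute with index 1.
assert make_get([], "leap", 1) == bytes([4, 0, 6, 1])
```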



2.12.  Server states

   The state of a server is a function which, given an address and a
   class, returns a list of attributes. Addresses and classes were
   defined in the previous section. An attribute consists of a timestamp
   and a value where the value is a vector as defined in Section 2.10.

   Server states may change with time. When a server receives a 'get'
   message as described in the previous section, it responds with a
   'got' message as described later. The contents of the 'got' message
   reflect the server state at the time the 'get' is processed by the
   server.

   The server state may change at any time. Processing of each 'get'
   message is atomic, but the server state may change between any two
   'get' messages.

   The server state can only change in two ways: an attribute may be
   added or an attribute may be removed. Whenever an attribute is
   removed, it is removed from the list of attributes it belongs to
   without reordering the remaining attributes on that list. Whenever an
   attribute is added, it is added at the end of an attribute list. For
   that reason, all attribute lists are chronological with the oldest
   attribute first.

   Every attribute comprises a timestamp and a value. The value is an
   arbitrary vector. The timestamp indicates at what time the given
   attribute was added to the server state.

   A get message with address A, class C, and index I requests the I'th
   oldest attribute with address A and class C. The oldest attribute has
   index one. A get message with index zero or an index larger than the
   number of attributes with the given address and class requests the
   newest attribute with the given address and class.
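
   The index rule can be sketched as a lookup on a chronologically
   ordered list. This is a minimal illustration; the function name
   select_attribute is hypothetical, and returning None for an empty
   list is an assumption of this example, since the case of an empty
   attribute list is not covered by the rule above.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def select_attribute(attributes: Sequence[T], index: int) -> Optional[T]:
    """Pick the attribute a get message with the given index asks for.

    attributes is ordered oldest first. Index 1 is the oldest
    attribute; index 0 or an index past the end selects the newest.
    """
    if not attributes:
        return None               # assumption: nothing to return
    if index == 0 or index > len(attributes):
        return attributes[-1]     # newest attribute
    return attributes[index - 1]  # I'th oldest attribute


urls = ["first", "second", "third"]
assert select_attribute(urls, 1) == "first"
assert select_attribute(urls, 0) == "third"
assert select_attribute(urls, 99) == "third"
```

   One consequence of this rule is that a client can walk a list by
   asking for index 1, 2, 3, and so on, although it cannot tell from
   the rule alone when it has passed the end.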



2.13.  Server states are binary trees

   As mentioned, the state of a server is a function which, given an
   address and a class, returns a list of attributes. Addresses are bit
   vectors. We shall refer to all attributes with a given address on a
   given server as the 'node' at that server at that address.

   We shall refer to the empty list of bits as the 'root address' and to
   the node with that address as the 'root node'. For all addresses A,
   we refer to A with a zero or one bit added at the end as the 'left'
   and 'right subaddress', respectively. For non-empty addresses A, we
   refer to A with one bit removed at the end as the 'superaddress'
   of A.

      As an example,
      1110 is the left subaddress of 111
      1111 is the right subaddress of 111
      11   is the superaddress of 111

   We shall say that a server 'has a node with address A' if its state
   contains at least one attribute with address A.

   A server state is a binary tree in the sense that whenever a server
   has a node N1 with non-empty address A then it also has a node N2
   whose address is the superaddress of A. We shall refer to N2 as the
   supernode of N1.

   If a server has a node N with address A, then we shall refer to N as
   a 'leaf' node if the server has no nodes whose addresses are the left
   or right subaddresses of A. We shall refer to N as a 'branch' node if
   the server has nodes for both the left and the right subaddress of A.
   Server states only contain leaf and branch nodes. A server state
   cannot contain a node that has a left but not a right subnode or vice
   versa.
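
   The tree conditions can be sketched as a check on the set of
   addresses at which a server has nodes. In this illustration,
   addresses are written as strings of the characters 0 and 1, which
   is a representation chosen for readability here and not the wire
   format; all function names are hypothetical.

```python
def left_subaddress(address: str) -> str:
    return address + "0"


def right_subaddress(address: str) -> str:
    return address + "1"


def superaddress(address: str) -> str:
    if not address:
        raise ValueError("the root address has no superaddress")
    return address[:-1]


def is_binary_tree(addresses: set[str]) -> bool:
    """Check that every node has its supernode and that every node is
    either a leaf (no subnodes) or a branch (both subnodes)."""
    for address in addresses:
        if address and superaddress(address) not in addresses:
            return False
        has_left = left_subaddress(address) in addresses
        has_right = right_subaddress(address) in addresses
        if has_left != has_right:
            return False
    return True


assert left_subaddress("111") == "1110"
assert superaddress("111") == "11"
assert is_binary_tree({"", "0", "1"})
assert not is_binary_tree({"", "0"})  # left subnode without right
```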



2.14.  The type attribute

   Every node of a server contains exactly one attribute of class 'type'
   (i.e. of class 1). The value of that attribute is the empty bit
   vector if the node is a leaf node. The value is a one-element bit
   vector whose sole bit is a one-bit if the node is a branch node. The
   time stamp of the attribute equals the time at which the node was
   created or last changed type.



2.15.  The update attribute

   Every node of a server has six attributes of class 'update' (i.e. of
   class 0). The six update attributes have values 1, 10, 11, 100, 101,
   and 110, respectively. The timestamps of those attributes are as
   follows:

   1   Identical to the timestamp of the 'type' attribute.
   10  The time of the last change in the left subtree of the node.
   11  The time of the last change in the right subtree of the node.
   100 The time of the last change in the 'sibling' attribute list of
       the node.
   101 The time of the last change in the 'url' attribute list of the
       node.
   110 The time of the last change in the 'leap' attribute list of the
       node.

   The time stamps for the update attributes with value 10 and 11 equal
   the timestamp of the 'type' attribute for leaf nodes. The time stamps
   for the update attributes with value 100, 101, and 110 equal the
   timestamp of the 'type' attribute if the node never has had
   attributes of class 'sibling', 'url', or 'leap', respectively.

   Contrary to other attribute lists, update attribute lists may contain
   several attributes with identical timestamps. That occurs when a
   single addition or deletion of an attribute has consequential
   changes. Among other things, all update attributes are set to the
   current server time when a node is created.



2.16.  The left and right attributes

   Server states have no attributes of class 2 (left) or 3 (right).
   These two classes only occur as values in update attributes.



2.17.  The sibling attribute

   Two nodes with the same address on different servers are 'siblings'.
   A 'branch sibling' of a node is a sibling which is at the same time a
   branch node. Sibling attributes of a node are references to servers
   that store branch siblings of the given node.

   The value of a sibling attribute is a byte vector, i.e. a bit vector
   whose length is a multiple of 8. The bytes part of the bit vector may
   have a value like

         "udp/logiweb.eu/65535/http://logiweb.eu/logiweb/server/relay/"

   The string above contains 60 characters and, hence, 480 bits. For
   that reason its encoding is

         224 003 117 100 112 047 108 ...

   Above, the middle-septet 224 represents 224-128=96 and the length
   field 224 003 represents 96+128*3=480. The number 117 is a Latin
   small letter u as in "udp". The little-endian nature of bit vectors
   has no observable effect here.

   In general, sibling attributes have the form

         protocol "/" host "/" port "/" relay

   The protocol may be 'tcp' or 'udp'. The host and port identify the
   Logiweb server. The relay must be a URL [RFC3986].
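
   The encoding and the four-part form of a sibling attribute can be
   sketched as follows. This is an illustration under stated
   assumptions: the function names are hypothetical, the string is
   assumed to be ASCII, and the host is assumed not to contain a
   slash.

```python
def encode_cardinal(n: int) -> bytes:
    """Encode a non-negative integer in little endian base 128."""
    if n < 0:
        raise ValueError("cardinals are non-negative")
    out = bytearray()
    while n >= 128:
        out.append(128 + n % 128)
        n //= 128
    out.append(n)
    return bytes(out)


def encode_byte_vector(payload: bytes) -> bytes:
    """Encode whole bytes as a vector whose length is counted in bits."""
    return encode_cardinal(8 * len(payload)) + payload


def parse_sibling(value: str) -> tuple[str, str, int, str]:
    """Split 'protocol/host/port/relay' into its four parts."""
    protocol, host, port, relay = value.split("/", 3)
    if protocol not in ("tcp", "udp"):
        raise ValueError("protocol must be 'tcp' or 'udp'")
    return protocol, host, int(port), relay


sibling = "udp/logiweb.eu/65535/http://logiweb.eu/logiweb/server/relay/"
encoded = encode_byte_vector(sibling.encode("ascii"))
assert list(encoded[:7]) == [224, 3, 117, 100, 112, 47, 108]
assert parse_sibling(sibling) == (
    "udp", "logiweb.eu", 65535, "http://logiweb.eu/logiweb/server/relay/")
```

   Splitting on only the first three slashes keeps the relay URL
   intact even though it contains slashes of its own.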

   The purpose and function of a 'relay' is outside the scope of the
   present document. For information, however, a relay is a special
   Logiweb client which runs as a CGI-program [CGI]. If a relay is
   invoked with a path of '/64/...' or '/32/...' or '/16/...' where the
   dots express a Logiweb reference expressed base 64, 32, or 16, then
   the relay contacts a Logiweb server to get the reference translated
   to a URL and returns an indirection to that URL. As an example,
   looking up http://logiweb.eu/logiweb/server/relay/64/... in a web
   browser is supposed to open the Logiweb document with the given
   reference. Looking up e.g.
   http://logiweb.eu/logiweb/server/relay/64/.../2/index.html is
   supposed to do the same but then to back up 2 slashes and then add
   index.html.

   Logiweb relays typically have further facilities. At the time of
   writing, the relay at http://logiweb.eu/server/relay contains a self-
   documenting interface to a Logiweb server which allows any user to
   experiment with the protocol described in the present document. The
   given relay was the first Logiweb relay established on the Internet
   and is supposed to exist as long as Logiweb itself exists.

   Logiweb relays will not be mentioned any more in the present
   document.

   We shall refer to sibling attributes as sibling pointers. Sibling
   pointers are said to be 'valid' if they point to servers which store
   a branch sibling of the given node. A sibling pointer is said to be
   'dangling' otherwise. Hence, a sibling pointer is dangling if the
   server pointed to stores no sibling of the given node. Furthermore, a
   sibling pointer is dangling if the server pointed to does store a
   sibling but that sibling is a leaf node.

   A server SHALL try its best to avoid dangling pointers. No server can
   be perfect here because the state of other servers may change without
   notice. But a server is supposed to validate its sibling pointers
   regularly.

   Furthermore, each server SHALL try its best to populate all its nodes
   with sibling pointers. The only excuse for not populating a node with
   sibling pointers is if no Logiweb server in the world stores a branch
   sibling of the given node.

   Finally, each server SHALL do its best to ensure that all branch
   siblings in the world of each node of the server are reachable from
   the node by following sibling pointers. This is even more difficult
   to satisfy than the two previous requirements, however, since not
   only may other server states change without notice but, furthermore,
   no server has any control over any other server. So, servers are
   basically required to be reasonable and cooperative.



2.18.  The url attribute

   The address of a node is a bit vector. A Logiweb reference is also a
   bit vector. If the address of a node is a valid Logiweb reference
   then the url attributes of the node shall be Uniform Resource
   Locators (URLs) [RFC3986] of Logiweb documents with the given
   reference.

   Url attributes of nodes whose addresses are not valid Logiweb
   references are reserved for future extensions.
711
712
713
7142.19.  The leap attribute
715
716   Only root nodes have leap attributes. Each leap attribute indicates
717   the location of a leap second. Leap attributes are byte vectors, i.e.
718   bit vectors whose length is a multiple of eight. Leap attributes have
719   format
720
721         leap          = step mjd
722         step          = cardinal
723         mjd           = cardinal
724
725   Each leap second occurs at the end of a UTC day (i.e. at midnight in
726   Greenwich). The mjd field indicates which Modified Julian Day (MJD)
727   is affected by the leap. The step is 1 if that day is prolonged by
728   one second. The step is 2 if that day is shortened by one second.
729   Hence, step is 1 for a +1 leap and 2 for a -1 leap. If the
730   International Earth Rotation Service (IERS) ever decides to make
731   multiple leaps, the relationship is intended to be as follows:
732
733            step  0  1  2  3  4  5  6 ...
734            leap  0 +1 -1 +2 -2 +3 -3 ...
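
   The step/leap table above can be sketched as a pair of conversion
   functions. This is a minimal illustration only; the function names
   are hypothetical and not part of the protocol.

```python
def step_to_leap(step: int) -> int:
    """Map a step value to a signed leap in seconds (0, +1, -1, +2, -2, ...)."""
    if step < 0:
        raise ValueError("step is a cardinal and cannot be negative")
    # Odd steps encode positive leaps, even steps encode negative leaps.
    return (step + 1) // 2 if step % 2 else -(step // 2)


def leap_to_step(leap: int) -> int:
    """Inverse mapping: signed leap in seconds to step value."""
    return 2 * leap - 1 if leap > 0 else -2 * leap


# Reproduces the leap row of the table for steps 0 through 6.
table = [step_to_leap(s) for s in range(7)]
```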
735
736   IERS only intends to use leaps of +1 and -1. Leaps of -1 have never
737   occurred and maybe never will. IERS intends to let leaps occur at the
738   end of June 30 and December 31. IERS intends to announce leaps in
739   advance. Leaps affect the length of the last minute of the last hour
740   of the affected UTC day.
741
742   As for all other attributes, the timestamps of leap attributes
743   indicate the time at which the attribute entered the state of the
744   server. At startup, a server is likely to read leap second
745   information from a configuration file or fetch it from another
746   Logiweb server. Servers should arrange leaps chronologically with the
747   oldest leap first.
748
749   Leap attributes shall comprise all past leaps announced by the IERS.
750   Leap attributes should comprise all past and future leaps announced
751   by the IERS. In other words, newly announced leaps shall enter the
752   state before the leap occurs.
753
754
755
7562.20.  Other attribute classes
757
758   Only attributes of classes 0, 1, 4, 5, and 6 may occur in server
759   states. Attribute classes 2 and 3 will never occur in server states.
760   Attribute class 7 is reserved for information about which future
761   classes a server supports. Classes 8-15 are reserved for experiments.
762   Classes from 16 to 2^160-1 inclusive are reserved for first-come,
763   first-served classes. Classes from 2^160 and up are reserved for
764   classes based on the value of Logiweb references. Only classes 0, 1,
765   4, 5, and 6 are permitted according to the present document.
766
767
768
7692.21.  The initial state
770
771   When a server starts up, its state contains one node. That node is a
772   root node and it contains seven attributes: one 'type' attribute and
773   six 'update' attributes. The value of the 'type' attribute is the
774   empty bit vector indicating that the root node is a leaf. The values
775   of the update attributes are 1, 10, 11, 100, 101, and 110. All seven
776   timestamps are equal and indicate the time at which the root node was
777   created.
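
   As a rough sketch, the initial state described above might be built
   as follows. The in-memory layout (a dictionary keyed by address, with
   bit vectors written as strings of '0' and '1') and the class labels
   'type' and 'update' used as plain strings are assumptions made for
   illustration. The sketch also assumes that the root node has the
   empty address.

```python
def initial_state(now):
    """Return the state of a freshly started server.

    The state maps an address (a bit vector written as a '0'/'1' string)
    to an ordered attribute list of (class_name, value, timestamp) tuples.
    """
    root = ""  # assumption: the root node has the empty address
    attributes = [("type", "", now)]  # empty value: the root is a leaf
    # Update values 1, 10, 11, 100, 101, 110 are 1..6 written in binary.
    attributes += [("update", format(n, "b"), now) for n in range(1, 7)]
    return {root: attributes}
```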
778
779   We shall refer to sibling, url, and leap attributes as 'proper'
780   attributes.  After creation of the root node, the state is changed by
781   adding and removing proper attributes. Update and type attributes
782   only change as a consequence of adding and removing proper
783   attributes. At any time, the server must contain the least number of
784   nodes which are enough to contain the stored proper attributes. For
785   that reason, removing a proper attribute may cause an avalanche of
786   node deletions and adding a proper attribute may cause an avalanche
787   of node creations.
788
789   When adding a proper attribute, the timestamp of all consequential
790   changes must be equal to the timestamp of the new attribute which in
791   turn must reflect the time at which the attribute was added. When
792   removing a proper attribute, all consequential changes must have the
793   same timestamp and that timestamp must reflect the time at which the
794   attribute was removed. The timestamps of successive additions and
795   removals of proper attributes must be strictly increasing. If the
796   resolution of the server clock is insufficient for that, then the
797   server must fake a higher resolution.
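
   One way to fake a higher clock resolution is sketched below. The
   sketch assumes timestamps can be handled as plain integer ticks,
   which simplifies the timestamp format defined elsewhere in this
   document. The class name StrictClock is hypothetical.

```python
import time


class StrictClock:
    """Hand out strictly increasing integer timestamps."""

    def __init__(self, read_clock=time.time_ns):
        self._read = read_clock  # any callable returning an integer tick
        self._last = None

    def next(self):
        now = self._read()
        # If the clock has not advanced (or ran backwards), bump the
        # previous timestamp by one tick instead of reusing it.
        if self._last is not None and now <= self._last:
            now = self._last + 1
        self._last = now
        return now
```

   Each addition or removal of a proper attribute would call next()
   once and stamp all consequential changes with the returned value.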
798
799   Consequential changes may involve changing the value of update and
800   type attributes. Such changes shall be treated as a simultaneous
801   removal of the old attribute and addition of a new one such that the
802   new attribute appears at the end of its attribute list.
803
804
805
8062.22.  Got messages
807
808         message       =/ got
809         got           = id-got address class index
810                         norm count timestamp value
811         id-got        = x05
812         norm          = cardinal
813         count         = cardinal
814         value         = vector
815
816   A Logiweb server which receives a get request shall do one of the
817   following:
818
819      o  Respond by a got message as described later in this section.
820
821      o  Respond by a 'Sorry' message.
822
823      o  Avoid responding if the get is transported by a datagram.
824
825      o  Disconnect if the get is transported by a connection-based
826         transport.
827
828   Logiweb servers are supposed to respond to get requests. A Logiweb
829   client that receives a get request should consider the other end of
830   the connection broken.
831
832   Logiweb applications SHALL NOT respond to got responses.
833
834   If a Logiweb server responds with a 'got' response to a 'get'
835   request, then the 'got' response shall reflect the state of the
836   server at the time the 'get' is processed. The address, class, and
837   index of the 'got' response shall be identical to the address, class,
838   and index of the associated 'get' request. The norm, count,
839   timestamp, and value shall be as follows:
840
841   CASE 1: the state contains an attribute with the given address,
842   class, and index. The norm shall be the length of the address. The
843   count shall be the number of attributes in the state that have the
844   given address and class. The timestamp and value shall be the
845   timestamp and value, respectively, of the attribute with the given
846   address, class, and index.
847
848   CASE 2: the state contains an attribute with the given address and
849   class, but none with the given index. The norm shall be the length of
850   the address. The count shall be the number of attributes in the state
851   that have the given address and class. The timestamp and value shall
852   be the timestamp and value, respectively, of the attribute with the
853   largest index of the given address and class.
854
855   CASE 3: the state contains an attribute with the given address, but
856   none with the given class. The norm shall be the length of the
857   address. The count shall be zero. The timestamp shall be the current
858   server time. The value shall be the empty bit vector.
859
860   CASE 4: the state contains no attributes with the given address. In
861   this case, let A2 be the longest prefix of the given address for
862   which the state does contain an attribute.
863
864   CASE 4A: the state contains a sibling attribute with address A2. The
865   norm shall be the length of A2. The count shall be the number of
866   sibling attributes in the state that have address A2. The timestamp
867   and value shall be the timestamp and value, respectively, of a
868   randomly picked attribute with address A2 and class sibling.
869
870   CASE 4B: the state contains no sibling attributes with address A2.
871   The norm shall be the length of A2. The count shall be zero. The
872   timestamp shall be the current server time. The value shall be the
873   empty bit vector.
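
   The case analysis above can be sketched as a single lookup function.
   The sketch rests on several assumptions that are not stated in this
   section: the state is held as a dictionary from address to class to
   attribute list, attribute lists are ordered oldest first, indices
   start at 1 (so that a request for index 0 falls under CASE 2), and
   classes are named by placeholder strings rather than by their
   numeric codes. All names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    timestamp: int
    value: str  # bit vector written as a string of '0' and '1'


@dataclass(frozen=True)
class Got:
    norm: int
    count: int
    timestamp: int
    value: str


SIBLING = "sibling"  # placeholder label for the sibling class


def has_attributes(state, address):
    """True if the state holds at least one attribute at this address."""
    return any(state.get(address, {}).values())


def answer_get(state, address, cls, index, now, rng=random):
    """Compute norm, count, timestamp and value for a get request."""
    if has_attributes(state, address):
        attrs = state[address].get(cls, [])
        if not attrs:                                  # CASE 3
            return Got(len(address), 0, now, "")
        if 1 <= index <= len(attrs):                   # CASE 1
            hit = attrs[index - 1]
        else:                                          # CASE 2
            hit = attrs[-1]  # attribute with the largest index
        return Got(len(address), len(attrs), hit.timestamp, hit.value)

    # CASE 4: shorten the address until a populated prefix A2 is found.
    a2 = address
    while a2 and not has_attributes(state, a2):
        a2 = a2[:-1]
    siblings = state.get(a2, {}).get(SIBLING, [])
    if siblings:                                       # CASE 4A
        pick = rng.choice(siblings)
        return Got(len(a2), len(siblings), pick.timestamp, pick.value)
    return Got(len(a2), 0, now, "")                    # CASE 4B
```

   The rng parameter exists only so that the random pick of CASE 4A
   can be made deterministic in tests.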
874
875   CASE 4A covers the case where the given server is unable to answer
876   the given question (the one encoded in the get request), but is able
877   to refer to some other server which stores a branch node with address
878   A2. In other words, CASE 4A covers the case where a server can refer
879   to a server more knowledgeable on the given question.
880
881   CASE 4B covers the case where the given server is unable to answer
882   the given question and unable to refer to a server which stores a
883   branch node with address A2. Logiweb servers SHALL try their best to
884   avoid CASE 4B in cases where there exists a server which has a branch
885   node with address A2. No server can be perfect here, however, since
886   all states of all other servers may change without notice. But
887   servers are required to crawl Logiweb to ensure they have a plentiful
888   supply of sibling attributes for all their nodes.
889
890   Clients that need, e.g., to translate a Logiweb reference R into a URL
891   are supposed to issue a get message with address R, class URL, and
892   index 0. When the client receives a got message whose norm equals the
893   length of R, it uses the returned URL (if any). If the client
894   receives a got message whose norm is less than the length of R, it
895   resends the get request to the indicated sibling (if any). At each
896   redirection, the norm is supposed to increase. If the norm does not
897   increase, then the state of the penultimate server is outdated. In
898   this case, the client may as a courtesy send the penultimate server a
899   'put' message which tells the server to remove its dangling sibling
900   pointer. Put messages are described later.
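
   The redirection procedure above might be sketched as the following
   client loop. The transport is abstracted into a hypothetical
   callable send_get(server, address, class, index) that returns the
   decoded got message, or None if the server stays silent or answers
   'Sorry'. Decoding a sibling value into a server address is defined
   elsewhere and is treated as opaque here. The max_hops limit is an
   added safeguard, not a protocol requirement.

```python
from collections import namedtuple

Got = namedtuple("Got", "norm count timestamp value")


def resolve(reference, first_server, send_get, max_hops=16):
    """Follow sibling pointers until some server answers for `reference`.

    Returns the url value, or None if the reference cannot be resolved.
    """
    server = first_server
    last_norm = -1
    for _ in range(max_hops):
        got = send_get(server, reference, "url", 0)
        if got is None:
            return None  # no usable response from this server
        if got.norm == len(reference):
            # The server holds the node itself: use the URL, if any.
            return got.value if got.count > 0 else None
        if got.count == 0:
            return None  # no sibling to follow (CASE 4B)
        if got.norm <= last_norm:
            # The norm failed to increase: the previous server holds a
            # dangling sibling pointer. A courtesy put could be sent here.
            return None
        last_norm = got.norm
        server = got.value  # sibling value identifies the next server
    return None
```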
901
902   When a server or a client crawls Logiweb, it may do so iteratively.
903   As an example, a client may remember when it last visited a given
904   server. Next time the client visits the server, it may start querying
905   the server time with a ping request. Then the client may find out
906   what has changed using update attributes without wasting time on
907   attribute classes and subtrees unchanged since the last visit.
908   Finally, the client may set its time of last visit to the response
909   from the initial ping.
910
911   Whenever such a client reads a changed attribute list, it should read
912   it in reverse chronological order. To do so, it may start with index
913   0 to get the newest attribute and the number C of attributes. Then it
914   may query index C minus one, C minus two, and so on in that order. If
915   attributes are removed between queries, then the client may receive
916   the same attribute more than once, but it will never miss an
917   attribute. For attributes other than update attributes, distinct
918   attributes have distinct timestamps, so the client can eliminate
919   duplicates on the basis of timestamps.
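
   Reading an attribute list newest first, with duplicates dropped by
   timestamp, might look as follows. The sketch assumes a hypothetical
   callable send_get(address, class, index) that returns the decoded
   got message (or None on failure) and assumes that index 0 yields the
   newest attribute together with the count C, as described above.
   Since distinct update attributes may share a timestamp, this form of
   duplicate elimination is only suitable for other classes.

```python
from collections import namedtuple

Got = namedtuple("Got", "norm count timestamp value")


def read_newest_first(send_get, address, cls):
    """Return the attributes of one list in reverse chronological order."""
    newest = send_get(address, cls, 0)
    if newest is None or newest.count == 0:
        return []
    seen = {newest.timestamp}
    result = [newest]
    index = newest.count - 1  # continue with C-1, C-2, ..., 1
    while index >= 1:
        got = send_get(address, cls, index)
        if got is None or got.count == 0:
            break  # server unreachable or the list has become empty
        if got.timestamp not in seen:  # skip attributes already received
            seen.add(got.timestamp)
            result.append(got)
        index -= 1
    return result
```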
920
921
922
9232.23.  Put messages
924
925         message       =/ put
926         put           = id-put address class operation value
927         id-put        = x06
928         operation     = remove / add
929         remove        = x00
930         add           = x01
931
932   A Logiweb server which receives a put request shall do one of the
933   following:
934
935      o  Respond by a 'Received' message.
936
937      o  Respond by a 'Sorry' message.
938
939      o  Avoid responding if the put is transported by a datagram.
940
941      o  Disconnect if the put is transported by a connection-based
942         transport.
943
944   Logiweb servers are supposed to respond to put requests. A Logiweb
945   client that receives a put request should consider the other end of
946   the connection broken.
947
948   A server which receives a put message whose operation is 'remove' may
949   consider removing an attribute with the given address, class, and
950   value. The remove message contains no index since the index of an
951   attribute can decrease at any time because of removal of older
952   attributes on the same attribute list.
953
954   A server which receives a put message whose operation is 'add' may
955   consider adding an attribute with the given address, class, and
956   value. The add message contains no timestamp since the timestamp of
957   the new attribute should be set to the current server time rather
958   than being supplied.
959
960   A server should consider almost all put requests with almost infinite
961   suspicion. A put request could be forged to corrupt the state of a
962   server or could be forged to fool the server into participating in a
963   denial-of-service attack on some other Logiweb server or some other
964   service on the Internet. This is why a server only tells the sender
965   of a put request that the server has 'received' the request. It does
966   not reveal any information about what the server is going to do with
967   the request. It is perfectly legitimate for a server to ignore all
968   put requests.
969
970
971
9723. Security Considerations
973
974   3.1. Unwanted outgoing information
975
976   A Logiweb server provides information to the outside world through
977   pong responses, event responses, and got responses.
978
979   Pong responses identify the server as a Logiweb server and tell
980   what time it is. The owner of a Logiweb server must be prepared to
981   share this information with the world.
982
983   Event responses (received, rejected, and sorry responses) tell the
984   world about the mood of the server. The owner of the server must be
985   prepared to share that as well.
986
987   Got responses tell the world about the publicly available state of
988   the server. In principle, the owner of the server should be prepared
989   to share that as well.
990
991   A Logiweb server, however, typically indexes given subtrees of the
992   owner's web site. A Logiweb server typically does so by crawling the
993   file system of the host. In doing so, the server could find documents
994   whose existence the owner wants to keep secret, and then make the
995   existence of those documents publicly known. After that, the secret
996   documents may be retrieved from the owner's web server.
997
998   As a countermeasure for that, Logiweb servers should only index files
999   with extension 'lgw' ('lgw' for 'logiweb'). Among those files, the
1000   server should check that the first byte of the file contains the
1001   number 1, and that the next twenty bytes contain the RIPEMD-160 hash
1002   key of the remaining bytes of the file. That ensures with great
1003   likelihood that only genuine Logiweb documents are indexed, avoiding
1004   inadvertent indexing of other kinds of documents. Authors of Logiweb
1005   documents who want their Logiweb documents to remain secret should
1006   keep them out of reach of the local Logiweb server.
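
   The file check described above might be sketched as follows. The
   function names are hypothetical. Python's hashlib exposes RIPEMD-160
   only when the underlying OpenSSL build provides it, so the digest
   function is passed as a parameter and another RIPEMD-160
   implementation can be substituted where needed.

```python
import hashlib


def ripemd160(data: bytes) -> bytes:
    """RIPEMD-160 via hashlib; raises ValueError if OpenSSL lacks it."""
    return hashlib.new("ripemd160", data).digest()


def looks_like_logiweb_document(data: bytes, digest=ripemd160) -> bool:
    """Check the version byte and the 20-byte hash key of an 'lgw' file."""
    if len(data) < 21 or data[0] != 1:
        return False  # too short, or first byte is not the number 1
    # Bytes 1..20 must be the hash key of everything that follows them.
    return digest(data[21:]) == data[1:21]
```

   A crawler would apply this check only to files with extension 'lgw'
   and would index a file only if the check succeeds.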
1007
1008   As another use of got messages, an attacker may use got responses to
1009   figure out how the server reacts to put requests. Doing so, the
1010   attacker may be able to find a security hole which allows the
1011   attacker to fool the server into participating in a
1012   denial-of-service attack on some other service. The ultimate
1013   countermeasure to this is to let the server ignore all put messages.
1014   Otherwise, one must try to avoid security holes in the server.
1015
1016
1017
1018   3.2. State corruption
1019
1020   Using put messages, an attacker may try to persuade a server to place
1021   incorrect information in the server state. The ultimate
1022   countermeasure to this is to let the server ignore all put messages.
1023   Otherwise, a server should not react directly to put messages.
1024   Rather, the server should repeatedly crawl its host file system to
1025   keep its url attributes up to date and should repeatedly crawl
1026   Logiweb to keep its sibling attributes up to date. In doing so, a
1027   server could take a put message as a hint to crawl some particular
1028   area earlier than it would otherwise do.
1029
1030   One source of put messages is notifications from inside the owner's
1031   firewall that some Logiweb document has been added to or removed from
1032   the file system. To respond reasonably, servers are advised to
1033   classify sender IPs suitably in order to follow up more promptly on
1034   put requests from more trusted senders. This only works, of course,
1035   for sender IPs which an attacker cannot tamper with.
1036
1037   Even if a server is persuaded to place incorrect information in its
1038   state, this will at most prevent clients from finding Logiweb
1039   documents. If a server translates a reference into a URL, then the
1040   client is supposed to retrieve the associated Logiweb document and to
1041   verify using RIPEMD-160 [RIPEMD] that the retrieved document is the
1042   one requested.
1043
1044
1045
1046   3.3. Incoming denial-of-service attacks
1047
1048   If a large number of clients start sending requests to a single
1049   Logiweb server, the incoming bandwidth of the server may get
1050   saturated. To avoid saturating the outgoing bandwidth if this occurs,
1051   the 'sorry' message has been included in the protocol. The 'sorry'
1052   message allows the server to respond to incoming messages using
1053   little bandwidth and little computational resources. Furthermore, the
1054   protocol allows the server not to respond at all, which accounts for
1055   messages lost due to limited incoming bandwidth.
1056
1057   Logiweb clients should maintain a list of Logiweb servers, and if one
1058   server does not respond or responds with a 'sorry', then the client
1059   should switch to another Logiweb server.
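
   The client behaviour suggested above might be sketched as a simple
   failover loop. The callable send_request and the convention that it
   returns None for a missing response and the string "sorry" for a
   'sorry' message are assumptions made for illustration.

```python
def query_with_failover(servers, send_request):
    """Try each known server in turn until one gives a useful reply."""
    for server in servers:
        reply = send_request(server)
        if reply is None or reply == "sorry":
            continue  # silent or overloaded: switch to the next server
        return server, reply
    return None  # every server on the list failed
```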
1060
1061
1062
1063   3.4. Outgoing denial-of-service attacks
1064
1065   An attacker may launch an indirect denial-of-service attack by
1066   sending requests to a Logiweb server whose sender field contains the
1067   IP of the victim. To counter that, the Logiweb protocol specifies
1068   that each request can result in at most one response. In that way, an
1069   attacker cannot use a Logiweb server to 'amplify' the attack.
1070
1071   Logiweb servers are supposed to crawl Logiweb on their own
1072   initiative. Furthermore, put messages may suggest to Logiweb servers
1073   that they should promote crawling of particular servers. An attacker
1074   could use this to persuade a number of Logiweb servers to crawl one
1075   victim simultaneously. To counter that, the present document does
1076   not specify exactly what a Logiweb server is supposed to do with put
1077   messages. Furthermore, Logiweb servers should approach other servers
1078   gently, waiting for their responses to see that the contacted servers
1079   do respond and do not send out 'sorry' messages. Finally, Logiweb
1080   servers should check that they actually do talk with Logiweb servers
1081   and not with some innocent other service. Logiweb servers may do so
1082   by sending a ping request to services whose identity they are not
1083   sure of.
1084
1085
1086
10874. IANA Considerations
1088
1089   4.1.  Well Known Port 332
1090
1091   The format of sibling attributes allows Logiweb servers to run on
1092   arbitrary UDP and TCP ports. At present, Logiweb servers use UDP port
1093   65535 by default.
1094
1095   To avoid making the use of port 65535 permanent, udp and tcp Well
1096   Known Port 332 is requested to be registered.
1097
1098   Port number 332 is suggested because 332 = 256 + 76, where 76 is the
1099   Unicode code point of Latin capital letter L, the first letter of
1100   "Logiweb". On some occasions not covered in the present document, the
1101   Logiweb system represents strings by numbers, in which case the one
1102   character string "L" happens to be represented by the number 332.
1103   Furthermore, port 332 is unassigned and appears at the end of an
1104   interval of unassigned numbers so that assignment will not lead to
1105   fragmentation.
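
   The arithmetic behind the suggested port number can be checked
   directly. This covers only the sum 256 + 76 given above and does not
   reproduce the Logiweb string representation itself.

```python
# 76 is the Unicode code point of Latin capital letter L.
code_point = ord("L")
port = 256 + code_point
```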
1106
1107   Suggested port name: "Logiweb".
1108
1109
1110
11114.2.  MIME type application/prs.logiweb
1112
1113   As mentioned, the main purpose of Logiweb servers is to translate a
1114   Logiweb reference into a URL of an associated Logiweb document.
1115   When the URL of a Logiweb document is dereferenced, http servers
1116   currently deliver the document with MIME type
1117   application/x-logiweb.
1118
1119   To avoid making the use of MIME type application/x-logiweb permanent,
1120   MIME type application/prs.logiweb is requested to be registered.
1121
1122   The format of Logiweb documents is:
1123
1124         document      = id-version ripemd timestamp contents
1125         id-version    = %d001
1126         ripemd        = 20*20 byte
1127         contents      = *byte
1128         byte          = %d0-255
1129
1130   For the syntax of timestamps, see the section entitled "Timestamps".
1131
1132   The ripemd field of a document must be the RIPEMD-160 hash key
1133   [RIPEMD] of all bytes following the ripemd field (including the
1134   timestamp).
1135
1136   The reference of a Logiweb document comprises the document with the
1137   contents removed.
1138
1139   The description above of the contents as a sequence of bytes is
1140   sufficient as far as the Logiweb protocol is concerned. A more
1141   complete description may be found at
1142   http://logiweb.eu/logiweb/doc/server/protocol.html#Pages.
1143
1144
1145
11465.  References
1147
11485.1.  Normative References
1149
1150   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1151              Requirement Levels", BCP 14, RFC 2119, March 1997.
1152
1153   [RFC4234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
1154              Specifications: ABNF", RFC 4234, October 2005.
1155
1156
1157
11585.2.  Informative References
1159
1160   [Logiweb]  http://logiweb.eu/ (see also Grue, K., "Logiweb - A System
1161              for Web Publication of Mathematics", Mathematical Software
1162              - ICMS 2006, Lecture Notes in Computer Science,
1163              pp.343--353, vol.4151, Springer, 2006).
1164
1165   [CGI]      http://www.w3.org/CGI/
1166
1167   [RIPEMD]   Dobbertin, H., Bosselaers, A., and Preneel, B.,
1168              "RIPEMD-160: A Strengthened Version of RIPEMD", Fast
1169              Software Encryption, 71-82, 1996.
1170
1171   [Unicode]  http://www.unicode.org/
1172
1173   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
1174              August 1980.
1175
1176   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7, RFC
1177              793, September 1981.
1178
1179   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1180              Resource Identifier (URI): Generic Syntax", STD 66, RFC
1181              3986, January 2005.
1182
1183
1184
1185Author's Address
1186
1187
1188         Klaus Grue
1189         DIKU
1190         University of Copenhagen
1191         Universitetsparken 1
1192         DK-2100 Copenhagen
1193         Denmark
1194
1195         email - grue@diku.dk
1196
1197Full Copyright Statement
1198
1199   Copyright (C) The IETF Trust (2007).
1200
1201   This document is subject to the rights, licenses and restrictions
1202   contained in BCP 78, and except as set forth therein, the authors
1203   retain all their rights.
1204
1205   This document and the information contained herein are provided on an
1206   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1207   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST, AND
1208   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
1209   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1210   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1211   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
1212
1213Intellectual Property
1214
1215   The IETF takes no position regarding the validity or scope of any
1216   Intellectual Property Rights or other rights that might be claimed
1217   to pertain to the implementation or use of the technology
1218   described in this document or the extent to which any license
1219   under such rights might or might not be available; nor does it
1220   represent that it has made any independent effort to identify any
1221   such rights.  Information on the procedures with respect to
1222   rights in RFC documents can be found in BCP 78 and BCP 79.
1223
1224   Copies of IPR disclosures made to the IETF Secretariat and any
1225   assurances of licenses to be made available, or the result of an
1226   attempt made to obtain a general license or permission for the use
1227   of such proprietary rights by implementers or users of this
1228   specification can be obtained from the IETF on-line IPR repository
1229   at http://www.ietf.org/ipr.
1230
1231   The IETF invites any interested party to bring to its attention
1232   any copyrights, patents or patent applications, or other
1233   proprietary rights that may cover technology that may be required
1234   to implement this standard.  Please address the information to the
1235   IETF at ietf-ipr@ietf.org.
1236
1237
1238
1239
1240Logiweb                         JULY 2009                     LOGIWEB(7)