LOGIWEB(7) File formats and protocols LOGIWEB(7)
2
3
4
6 logiweb - Logiweb protocol
7
9 Logiweb is a system for storing, locating, and transmitting Logiweb
10 pages. Logiweb pages may contain free text mixed with machine intelli‐
11 gible objects like computer programs, testsuites, and formal proofs.
12
13 Logiweb defines a referencing scheme in which each Logiweb page has a
14 unique Logiweb reference. A Logiweb reference is typically around 30
 bytes long. Among other things, a Logiweb reference contains a
 RIPEMD-160 hash key of the referenced page.
17
18 The purpose of the Logiweb protocol is to locate a Logiweb page, given
19 its reference.
20
21 To maximize efficiency, the Logiweb protocol was originally intended to
22 be a protocol of its own, using its own UDP port.
23
 Because applying for a UDP port turned out to be too complicated,
 however, the Logiweb protocol will be channeled through HTTP instead. A
 new protocol will be defined based on the protocol defined below; cf.
 logiweb(5). The present document is included as logiweb(7) until the
 new protocol becomes available.
29
31 Internet Draft K. Grue
32 <draft-grue-logiweb-protocol-1-00.txt> Associate Professor
33 Category: Experimental DIKU
34 Expires September 8, 2007 March 8, 2007
35
36 Logiweb Protocol Version 1
37 <draft-grue-logiweb-protocol-1-00.txt>
38
Status of this Memo
40 Distribution of this memo is unlimited.
41
42 By submitting this Internet-Draft, each author represents that any
43 applicable patent or other IPR claims of which he or she is aware
44 have been or will be disclosed, and any of which he or she becomes
45 aware will be disclosed, in accordance with Section 6 of BCP 79.
46
47 Internet-Drafts are working documents of the Internet Engineering Task
48 Force (IETF), its areas, and its working groups. Note that other groups
49 may also distribute working documents as Internet-Drafts.
50
51 Internet-Drafts are draft documents valid for a maximum of six months
52 and may be updated, replaced, or obsoleted by other documents at any
53 time. It is inappropriate to use Internet-Drafts as reference material
54 or to cite them other than as "work in progress."
55
56 The list of current Internet-Drafts can be accessed at
57 http://www.ietf.org/1id-abstracts.html
58
59 The list of Internet-Draft Shadow Directories can be accessed at
60 http://www.ietf.org/shadow.html.
61
Abstract
63
64 When publishing mechanically verified mathematics on the Internet,
65 there is a need for referencing previously published documents. As an
66 example, referenced documents may contain needed definitions, lemmas,
 and proofs. A reference from one mechanically verified document to
 another is much like any other Uniform Resource Locator, but there is
69 a need to ensure that referenced documents do not change after
70 publication. This is so because otherwise a change of e.g. a
71 definition in a referenced document could invalidate the correctness
72 of a referencing document.
73
74 The present document describes the protocol used by an experimental
 system named "Logiweb", which allows the publication of mechanically verified,
76 immutable mathematical documents.
77
78
79
Table of Contents

 1. Introduction
 2. Protocol
    2.1. Cardinals
    2.2. Identifiers
    2.3. Logiweb identifier
    2.4. Timestamps
    2.5. Ping requests
    2.6. Pong responses
    2.7. Event responses
    2.8. Nop requests
    2.9. Prefix messages
    2.10. Vectors
    2.11. Get messages
    2.12. Server states
    2.13. Server states are binary trees
    2.14. The type attribute
    2.15. The update attribute
    2.16. The left and right attributes
    2.17. The sibling attribute
    2.18. The url attribute
    2.19. The leap attribute
    2.20. Other attribute classes
    2.21. The initial state
    2.22. Got messages
    2.23. Put messages
 3. Security Considerations
    3.1. Unwanted outgoing information
    3.2. State corruption
    3.3. Incoming denial-of-service attacks
    3.4. Outgoing denial-of-service attacks
 4. IANA Considerations
    4.1. Well Known Port 332
    4.2. MIME type application/prs.logiweb
 5. References
    5.1. Normative References
    5.2. Informative References
118
119
120
1. Introduction
122
123 This document defines the 'Logiweb protocol' version 1.
124
125 Logiweb is a system for publication of immutable documents of high
126 typographic quality which contain computer programs and mathematical
127 definitions, theorems, and proofs [Logiweb].
128
129 To understand the Logiweb protocol, only the following features of
130 the Logiweb system are needed:
131
132 o A Logiweb document is a sequence of bytes. A Logiweb document
133 consists of a version number followed by a RIPEMD-160 hash key
134 [RIPEMD] followed by a time stamp followed by a sequence of
135 bytes.
136
137 o Any Logiweb document has a 'Logiweb reference'. The reference
138 is a sequence of bytes. The reference of a document is the
139 version number followed by the hash key followed by the time
140 stamp of the document.
141
 o It is assumed (cf. the section on security considerations
 later in this document) that any two Logiweb documents with the
 same reference are identical. The RIPEMD-160 hash key makes this
 overwhelmingly probable.
146
147 o To be considered 'published', a Logiweb document must be
148 accessible using the World Wide Web (WWW). A published Logiweb
149 document may be mirrored such that it is available under more
150 than one Uniform Resource Locator (URL) [RFC3986]. A published
151 Logiweb document may be moved and copies of it may be deleted
152 such that the set of URLs associated with a Logiweb document
153 may change with time.
154
155 o The Logiweb system comprises Logiweb 'servers' and Logiweb
156 'clients'.
157
158 o A Logiweb server is a running computer program which
159 communicates with Logiweb clients and other Logiweb servers
160 using the Logiweb protocol, and which provides the services
161 described in the following.
162
163 o A Logiweb client is a running computer program which
164 communicates with Logiweb servers using the Logiweb protocol,
165 and which uses the services described in the following.
166
167 The main task of Logiweb servers is to keep track of the relationship
168 between Logiweb references and their associated fluctuating set of
169 URLs. The main service provided by Logiweb servers is to translate
170 Logiweb references to URLs. All Logiweb servers on the Internet shall
171 cooperate on this.
172
173 As mentioned above, a Logiweb document must be accessible using the
174 WWW to be considered 'published'. In addition, the URL of at least
175 one copy of the document must be known to at least one of the
176 cooperating Logiweb servers.
177
178 As secondary services, a Logiweb server can identify itself as a
 Logiweb server, it can tell what time it is according to the server's
180 clock, and it can tell what leap seconds have occurred.
181
182 Logiweb servers are not supposed to deliver Logiweb documents. They
183 are merely supposed to translate Logiweb references to URLs. The
184 actual delivery of Logiweb documents is supposed to be performed by
185 http servers.
186
187 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
188 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
189 document are to be interpreted as described in RFC 2119 [RFC2119].
190
191
192
2. Protocol
194
195 The Logiweb protocol defines the syntax and semantics of 'Logiweb
196 messages'. Logiweb messages are the units of exchange when using the
197 Logiweb protocol.
198
199 The Logiweb protocol is an application layer protocol. The Logiweb
200 protocol can build on top of connection-based transport protocols
201 like TCP [RFC0793] as well as datagram protocols like UDP [RFC0768].
202
203 When using a datagram protocol, each datagram contains one and only
204 one Logiweb message. When using a connection-based protocol, Logiweb
205 messages are transmitted back-to-back.
206
207 The Logiweb protocol specifies that some messages require a response
208 and some do not. However, as an overall rule, whenever an
209 application receives a datagram containing a Logiweb message, the
210 application is allowed not to respond. Furthermore, whenever an
211 application receives Logiweb messages over a connection-based
212 transport, the application is allowed to close the connection at any
213 time.
214
 Applications should respond to messages which require a response
 unless they have a reason for not doing so. Reasons for not
 responding to a datagram or for closing a connection could be that the
 application is short of outgoing bandwidth, that the application
 thinks it is suffering a denial-of-service attack, or that the
 application thinks that the other end of the communication is broken
 or malicious.
222
223 Furthermore, whenever an application receives a message which
224 requires a response, the application is allowed to respond with a
225 Logiweb 'Sorry' message. A 'Sorry' message indicates that the
226 application is unwilling to answer at the given time but may be
227 willing to answer later if the same question is sent again.
228
229 Whenever an application receives a message which requires a response
230 via a connection-based protocol, the application is required to
231 respond properly OR respond with a 'Sorry' message OR disconnect.
 Not responding is not an option when using a connection-based transport.
233
234 The syntax of Logiweb messages is expressed in ABNF [RFC4234] in the
 following. The semantics are expressed in plain prose.
236
237
238
2.1. Cardinals
240
241 middle-septet = %d128-255
242 end-septet = %d000-127
243 cardinal = *middle-septet end-septet
244
245 A middle-septet X represents the number X minus 128. An end-septet X
246 represents the number X. A cardinal represents a non-negative
247 integer using little endian base 128. As an example, the cardinal
248
249 129 002
250
 represents the non-negative integer 1 + 128 * 2 = 257. The cardinal

 129 130 000

 also represents 257, since 1 + 128 * 2 + 128 * 128 * 0 = 257.
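
The encoding above can be sketched in a few lines of Python. This is
an illustrative sketch only, not part of the protocol definition; the
function names encode_cardinal and decode_cardinal are hypothetical.
The encoder always emits the shortest form, while the decoder accepts
any valid form, including ones padded with zero-valued septets.

```python
def encode_cardinal(n: int) -> bytes:
    """Encode a non-negative integer in its shortest cardinal form."""
    if n < 0:
        raise ValueError("cardinals are non-negative")
    out = bytearray()
    while n >= 128:
        out.append(128 + n % 128)  # middle-septet: digit value plus 128
        n //= 128
    out.append(n)                  # end-septet: final digit, below 128
    return bytes(out)


def decode_cardinal(data: bytes, pos: int = 0):
    """Return (value, next position) for the cardinal at data[pos]."""
    value, weight = 0, 1
    while True:
        septet = data[pos]
        pos += 1
        if septet >= 128:          # middle-septet, more digits follow
            value += (septet - 128) * weight
            weight *= 128
        else:                      # end-septet terminates the cardinal
            return value + septet * weight, pos
```

With these helpers, the byte sequences 129 002 and 129 130 000 decode
to the same integer, 1 + 128 * 2, and encoding that integer yields the
two-byte form.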
256
257
258
2.2. Identifiers
260
261 x00 = %d000 / %d128 x00
262 x01 = %d001 / %d129 x00
263 x02 = %d002 / %d130 x00
264 x03 = %d003 / %d131 x00
265 x04 = %d004 / %d132 x00
266 x05 = %d005 / %d133 x00
267 x06 = %d006 / %d134 x00
268 x07 = %d007 / %d135 x00
269
270 The syntax class x03 covers all cardinals whose value is three. The
271 other syntax classes are similar.
272
273
274
2.3. Logiweb identifier
276
277 L = %d204
278 o = %d239
279 g = %d231
280 i = %d233
281 w = %d247
282 e = %d229
283 b = %d226
284 id-version = %d001
285 id-Logiweb = L o g i w e b id-version
286
 Logiweb applications use id-Logiweb for indicating that they use the
 Logiweb protocol. Note that 204 is a middle-septet which
 represents the number 204 - 128 = 76, which is the Unicode [Unicode]
 code point of the Latin capital letter L.
291
292
293
2.4. Timestamps
295
296 timestamp = mantissa exponent
297 mantissa = cardinal
298 exponent = cardinal
299
300 Logiweb measures the time at which an event occurred in 'Logiweb
301 time'. Logiweb time measures the number of seconds that have elapsed
302 according to International Atomic Time (TAI) since TAI 00:00:00 of
303 Modified Julian Day (MJD) 0.
304
 For information, MJD 0 is November 17, 1858 in the Gregorian
 calendar. TAI 00:00:00 of MJD 0 equals Coordinated Universal Time
 (UTC) 00:00:10 of MJD 0 since, by convention, TAI and UTC were 10
 seconds apart before June 30, 1972. In short, UTC is a time scale in
 which it is noon when Greenwich is under the sun.
310
 A timestamp consists of two cardinals, M and E, and represents the
 number M*10^(-E), where u^v denotes u raised to the power v. As an example,

 129 002 009

 denotes 257 nanoseconds past the Logiweb epoch (mantissa 129 002 = 257,
 exponent 9), where the Logiweb epoch is TAI 00:00:00 of MJD 0.
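
As an illustration of the timestamp format, the following Python
sketch decodes a mantissa/exponent pair into an exact number of
seconds. The function names are hypothetical, and returning a
fractions.Fraction is a choice made here to avoid floating point
rounding; the protocol itself does not prescribe any representation.

```python
from fractions import Fraction


def decode_cardinal(data: bytes, pos: int = 0):
    """Return (value, next position) for the cardinal at data[pos]."""
    value, weight = 0, 1
    while True:
        septet = data[pos]
        pos += 1
        if septet >= 128:
            value += (septet - 128) * weight
            weight *= 128
        else:
            return value + septet * weight, pos


def decode_timestamp(data: bytes, pos: int = 0):
    """Return (seconds since the Logiweb epoch, next position)."""
    mantissa, pos = decode_cardinal(data, pos)
    exponent, pos = decode_cardinal(data, pos)
    # A timestamp denotes mantissa * 10^(-exponent) seconds.
    return Fraction(mantissa, 10 ** exponent), pos
```

Applied to the example bytes 129 002 009, decode_timestamp returns a
fraction of a second with mantissa 1 + 128 * 2 and exponent 9, and
reports that three bytes were consumed.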
318
319
320
2.5. Ping requests
322
323 message = ping
324 ping = id-ping
325 id-ping = x02
326
327 A ping request represents the question 'who are you, and what time is
328 it'.
329
330
331
2.6. Pong responses
333
334 message =/ pong
335 pong = id-pong id-Logiweb timestamp
336 id-pong = x03
337
338 A Logiweb server which receives a ping request shall do one of the
339 following:
340
341 o Respond by a pong message containing the current time.
342
343 o Respond by a 'Sorry' message.
344
345 o Avoid responding if the ping is transported by a datagram.
346
347 o Disconnect if the ping is transported by a connection-based
348 transport.
349
 Logiweb servers are supposed to respond to ping requests. A Logiweb
 client should consider the other end of the connection broken if
 it receives a ping request.
353
354 Logiweb applications SHALL NOT respond to pong responses.
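
To make the layout of a pong concrete, here is a minimal Python
sketch that assembles one from the ABNF above. The names ID_LOGIWEB
and build_pong are hypothetical, and the caller is assumed to supply
the current Logiweb time as a mantissa and a decimal exponent; how a
server obtains TAI-based time is not covered here.

```python
def encode_cardinal(n: int) -> bytes:
    """Encode a non-negative integer in its shortest cardinal form."""
    out = bytearray()
    while n >= 128:
        out.append(128 + n % 128)
        n //= 128
    out.append(n)
    return bytes(out)


# id-Logiweb: the letters of "Logiweb" as middle-septets (code + 128),
# terminated by id-version, which is the end-septet 001.
ID_LOGIWEB = bytes(ord(c) + 128 for c in "Logiweb") + bytes([1])


def build_pong(mantissa: int, exponent: int) -> bytes:
    """Assemble: id-pong (x03), id-Logiweb, timestamp."""
    return (bytes([3]) + ID_LOGIWEB
            + encode_cardinal(mantissa) + encode_cardinal(exponent))
```

The matching ping request is the single byte 002 in its shortest
form.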
355
356
357
2.7. Event responses
359
360 message =/ event
361 event = id-event notice
362 id-event = x01
363 notice = sorry / received / rejected
364 sorry = x00
365 received = x01
366 rejected = x02
367
368 A 'sorry' response indicates that the sender has received a request
369 which it is unwilling to answer at the given time but may be willing
370 to answer later. Logiweb applications are allowed to send a 'sorry'
371 response to any request which requires a response.
372
373 A 'received' message indicates that the sender acknowledges the
374 receipt of a request but is not going to give any further answer. A
375 'received' message is the proper response to the 'put' request
376 described later.
377
 A 'rejected' message indicates that the sender acknowledges the
 receipt of a request but will never answer that
 particular request. Logiweb applications may respond with a 'rejected'
 message only when they receive a malformed request.
382
383 Logiweb applications SHALL NOT respond to event responses.
384
385
386
2.8. Nop requests
388
389 message =/ nop
390 nop = id-nop
391 id-nop = x00
392
393 Logiweb applications SHALL NOT respond to nop requests. Nop requests
394 may be used for padding when using connection-based transports. There
395 is no point in sending nop datagrams. Applications are allowed to
396 disconnect connection-based transports at any time, so even though
397 applications are not allowed to respond to nop requests, they may
398 still disconnect on a 'nop' without violating the protocol.
399
400
401
2.9. Prefix messages
403
404 message =/ prefix
405 prefix = id-prefix code contents
406 id-prefix = x07
407 code = cardinal
408 contents = message
409
410 Whenever a Logiweb application receives a prefix message, it shall
411 process the contents of the message. If the application responds to
412 the contents, it shall prefix the given code to the response.
413
414 Example: Suppose an application receives a ping with two prefixes:
415
416 007 100 007 101 002
417
418 Furthermore, suppose the application decides to respond with a
419 'sorry' message. Then the response should be:
420
421 007 100 007 101 001 000
422
 Because of prefixes, messages can be arbitrarily long. Messages are
 typically less than 100 bytes in length. Applications are advised
 to process messages that are up to 65536 bytes long. When receiving
 messages longer than that, applications are advised to disconnect
 if the message is received over a connection-based transport and to
 discard the message if it is received as a mammoth datagram.
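
The prefix rule in this section can be sketched as follows. This is
an illustrative Python sketch with hypothetical function names
(strip_prefixes, add_prefixes); it assumes the whole message is
already available as a byte string and does not enforce the suggested
65536-byte limit.

```python
def encode_cardinal(n: int) -> bytes:
    out = bytearray()
    while n >= 128:
        out.append(128 + n % 128)
        n //= 128
    out.append(n)
    return bytes(out)


def decode_cardinal(data: bytes, pos: int = 0):
    value, weight = 0, 1
    while True:
        septet = data[pos]
        pos += 1
        if septet >= 128:
            value += (septet - 128) * weight
            weight *= 128
        else:
            return value + septet * weight, pos


def strip_prefixes(message: bytes):
    """Split a message into (list of prefix codes, inner message)."""
    codes, pos = [], 0
    while pos < len(message):
        ident, after = decode_cardinal(message, pos)
        if ident != 7:             # id-prefix is any cardinal of value 7
            break
        code, pos = decode_cardinal(message, after)
        codes.append(code)
    return codes, message[pos:]


def add_prefixes(codes, response: bytes) -> bytes:
    """Put the given codes in front of a response, in original order."""
    for code in reversed(codes):
        response = bytes([7]) + encode_cardinal(code) + response
    return response
```

For the example in this section, strip_prefixes applied to
007 100 007 101 002 yields the codes 100 and 101 and the inner ping
002, and add_prefixes applied to those codes and the 'sorry' event
001 000 yields 007 100 007 101 001 000.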
429
430
431
2.10. Vectors
433
434 vector = length bytes
435 length = cardinal
436 bytes = *byte
437 byte = %d0-255
438
439 A vector represents a list of bits. The given length is the number of
440 bits in the list. The syntax of vectors is NOT context free since the
441 number of bytes must be equal to the given length divided by eight
442 and rounded up to the nearest integer. As an example,
443
444 012 128 015
445
446 represents a list comprising twelve bits. The length field occupies
447 the first byte. Twelve divided by eight and rounded up equals two,
448 indicating that the next two bytes are part of the vector.
449
 The vector 012 128 015 is translated to a list of bits as follows.
451 First write the bytes in binary big endian:
452
453 1000 0000 0000 1111
454
455 Then bit swap each byte:
456
457 0000 0001 1111 0000
458
459 Then pick the first twelve bits:
460
461 0000 0001 1111
462
463 Sane programmers don't bit swap. Sane programmers realize and utilize
464 that Logiweb is little endian.
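
In code, the little endian view means that bit number i of a vector
is bit (i mod 8) of byte (i div 8), counted from the least
significant bit, so no explicit bit swapping is needed. The Python
sketch below illustrates this; the function names are hypothetical
and bits are represented as a list of 0/1 integers purely for
readability.

```python
def encode_cardinal(n: int) -> bytes:
    out = bytearray()
    while n >= 128:
        out.append(128 + n % 128)
        n //= 128
    out.append(n)
    return bytes(out)


def decode_cardinal(data: bytes, pos: int = 0):
    value, weight = 0, 1
    while True:
        septet = data[pos]
        pos += 1
        if septet >= 128:
            value += (septet - 128) * weight
            weight *= 128
        else:
            return value + septet * weight, pos


def decode_vector(data: bytes, pos: int = 0):
    """Return (list of bits, next position) for the vector at data[pos]."""
    length, pos = decode_cardinal(data, pos)
    nbytes = (length + 7) // 8     # length in bits, rounded up to bytes
    body = data[pos:pos + nbytes]
    if len(body) < nbytes:
        raise ValueError("truncated vector")
    bits = [(body[i // 8] >> (i % 8)) & 1 for i in range(length)]
    return bits, pos + nbytes


def encode_vector(bits) -> bytes:
    """Encode a list of 0/1 values as length followed by packed bytes."""
    body = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            body[i // 8] |= 1 << (i % 8)
    return encode_cardinal(len(bits)) + bytes(body)
```

Decoding 012 128 015 this way gives seven zero bits followed by five
one bits, matching the twelve bits picked out in the walkthrough
above.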
465
466
467
2.11. Get messages
469
470 message =/ get
471 get = id-get address class index
472 id-get = x04
473 address = vector
474 class = update / type / left / right
475 class =/ sibling / url / leap
476 update = x00
477 type = x01
478 left = x02
479 right = x03
480 sibling = x04
481 url = x05
482 leap = x06
483 index = cardinal
484
485 Logiweb servers are supposed to maintain a 'state' which is visible
486 from the outside. Clients and other servers may query the state of a
487 Logiweb server using get messages. A get message requests a Logiweb
488 server to return the 'attribute' which the server associates to the
489 given address, class, and index.
490
491 A Logiweb server has no other visible state than what can be queried
492 using get messages.
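
As a sketch of how a client might assemble a get message from the
ABNF above, consider the following Python fragment. The names CLASSES
and build_get are hypothetical, and addresses are passed as lists of
0/1 integers for readability.

```python
def encode_cardinal(n: int) -> bytes:
    out = bytearray()
    while n >= 128:
        out.append(128 + n % 128)
        n //= 128
    out.append(n)
    return bytes(out)


def encode_vector(bits) -> bytes:
    """Bit i of the vector is stored in bit (i % 8) of byte (i // 8)."""
    body = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            body[i // 8] |= 1 << (i % 8)
    return encode_cardinal(len(bits)) + bytes(body)


# Class numbers as defined by the ABNF in this section.
CLASSES = {"update": 0, "type": 1, "left": 2, "right": 3,
           "sibling": 4, "url": 5, "leap": 6}


def build_get(address_bits, class_name: str, index: int) -> bytes:
    """Assemble: id-get (x04), address, class, index."""
    return (bytes([4]) + encode_vector(address_bits)
            + encode_cardinal(CLASSES[class_name])
            + encode_cardinal(index))
```

For instance, build_get([], "leap", 0) produces the four bytes
004 000 006 000: the id, an empty address vector, class 6, and
index 0.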
493
494
495
2.12. Server states
497
498 The state of a server is a function which, given an address and a
499 class, returns a list of attributes. Addresses and classes were
500 defined in the previous section. An attribute consists of a timestamp
501 and a value where the value is a vector as defined in Section 2.10.
502
503 Server states may change with time. When a server receives a 'get'
504 message as described in the previous section, it responds with a
505 'got' message as described later. The contents of the 'got' message
506 reflects the server state at the time the 'get' is processed by the
507 server.
508
509 The server state may change at any time. Processing of each 'get'
510 message is atomic, but the server state may change between any two
511 'get' messages.
512
513 The server state can only change in two ways: an attribute may be
514 added or an attribute may be removed. Whenever an attribute is
515 removed, it is removed from the list of attributes it belongs to
516 without reordering the remaining attributes on that list. Whenever an
517 attribute is added, it is added at the end of an attribute list. For
518 that reason, all attribute lists are chronological with the oldest
519 attribute first.
520
521 Every attribute comprises a timestamp and a value. The value is an
522 arbitrary vector. The timestamp indicates at what time the given
523 attribute was added to the server state.
524
525 A get message with address A, class C, and index I requests the I'th
526 oldest attribute with address A and class C. The oldest attribute has
527 index one. A get message with index zero or an index larger than the
528 number of attributes with the given address and class requests the
529 newest attribute with the given address and class.
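
The index rule can be sketched as a small Python function operating
on one chronological attribute list (oldest first). The function name
select_attribute is hypothetical, and returning None for an empty
list is an assumption made for this sketch; this section does not say
what a server reports when no attribute exists.

```python
def select_attribute(attributes, index: int):
    """Pick the attribute requested by a get message index.

    attributes is a chronological list, oldest first.
    """
    if not attributes:
        return None                # assumption: nothing to return
    if 1 <= index <= len(attributes):
        return attributes[index - 1]   # the index'th oldest attribute
    return attributes[-1]          # index 0 or too large: the newest
```

A client that wants to walk a whole list can therefore request index
1, 2, 3, and so on, while a client that only needs the latest entry
can request index 0.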
530
531
532
2.13. Server states are binary trees
534
535 As mentioned, the state of a server is a function which, given an
536 address and a class, returns a list of attributes. Addresses are bit
537 vectors. We shall refer to all attributes with a given address on a
538 given server as the 'node' at that server at that address.
539
540 We shall refer to the empty list of bits as the 'root address' and to
541 the node with that address as the 'root node'. For all addresses A,
 we refer to A with a zero or one bit added at the end as the 'left'
 and 'right subaddress' of A, respectively. For non-empty addresses A, we
 refer to A with its last bit removed as the 'superaddress'
 of A.
546
547 As an example,
548 1110 is the left subaddress of 111
549 1111 is the right subaddress of 111
550 11 is the superaddress of 111
551
 We shall say that a server 'has a node with address A' if its state
553 contains at least one attribute with address A.
554
555 A server state is a binary tree in the sense that whenever a server
556 has a node N1 with non-empty address A then it also has a node N2
557 whose address is the superaddress of A. We shall refer to N2 as the
558 supernode of N1.
559
560 If a server has a node N with address A, then we shall refer to N as
561 a 'leaf' node if the server has no nodes whose addresses are the left
562 or right subaddresses of A. We shall refer to N as a 'branch' node if
563 the server has nodes for both the left and the right subaddress of A.
564 Server states only contain leaf and branch nodes. A server state
565 cannot contain a node that has a left but not a right subnode or vice
566 versa.
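
The tree conditions of this section can be checked mechanically. The
Python sketch below represents an address as a tuple of 0/1 integers
and a server state as the set of addresses at which the server has a
node; both representations and all function names are assumptions
made for illustration.

```python
def left(address):
    """Left subaddress: the address with a zero bit appended."""
    return address + (0,)


def right(address):
    """Right subaddress: the address with a one bit appended."""
    return address + (1,)


def superaddress(address):
    """The address with its last bit removed."""
    if not address:
        raise ValueError("the root address has no superaddress")
    return address[:-1]


def is_binary_tree(nodes) -> bool:
    """Check that a set of node addresses satisfies both tree rules."""
    for address in nodes:
        # Every non-root node must have a supernode.
        if address and superaddress(address) not in nodes:
            return False
        # A node has both subnodes (branch) or neither (leaf).
        if (left(address) in nodes) != (right(address) in nodes):
            return False
    return True
```

For example, the set containing the root address and both of its
subaddresses passes the check, whereas a set containing the root and
only its left subaddress does not.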
567
568
569
2.14. The type attribute
571
572 Every node of a server contains exactly one attribute of class 'type'
573 (i.e. of class 1). The value of that attribute is the empty bit
574 vector if the node is a leaf node. The value is a one-element bit
575 vector whose sole bit is a one-bit if the node is a branch node. The
576 time stamp of the attribute equals the time at which the node was
577 created or last changed type.
578
579
580
2.15. The update attribute
582
583 Every node of a server has six attributes of class 'update' (i.e. of
584 class 0). The six update attributes have values 1, 10, 11, 100, 101,
585 and 110, respectively. The timestamps of those attributes are as
586 follows:
587
588 1 Identical to the timestamp of the 'type' attribute.
589 10 The time of the last change in the left subtree of the node.
590 11 The time of the last change in the right subtree of the node.
591 100 The time of the last change in the 'sibling' attribute list of
592 the node
593 101 The time of the last change in the 'url' attribute list of the
594 node
595 110 The time of the last change in the 'leap' attribute list of the
596 node
597
598 The time stamps for the update attributes with value 10 and 11 equal
599 the timestamp of the 'type' attribute for leaf nodes. The time stamps
600 for the update attributes with value 100, 101, and 110 equal the
601 timestamp of the 'type' attribute if the node never has had
602 attributes of class 'sibling', 'url', or 'leap', respectively.
603
604 Contrary to other attribute lists, update attribute lists may contain
605 several attributes with identical timestamps. That occurs when a
606 single addition or deletion of an attribute has consequential
 changes. In particular, all update attributes are set to the current
608 server time when a node is created.
609
610
611
2.16. The left and right attributes
613
614 Server states have no attributes of class 2 (left) or 3 (right).
615 These two classes only occur as values in update attributes.
616
617
618
2.17. The sibling attribute
620
621 Two nodes with the same address on different servers are 'siblings'.
622 A 'branch sibling' of a node is a sibling which is at the same time a
623 branch node. Sibling attributes of a node are references to servers
624 that store branch siblings of the given node.
625
626 The value of a sibling attribute is a byte vector, i.e. a bit vector
627 whose length is a multiple of 8. The bytes part of the bit vector may
628 have a value like
629
630 "udp/logiweb.eu/65535/http://logiweb.eu/logiweb/server/relay/"
631
632 The string above contains 60 characters and, hence, 480 bits. For
633 that reason its encoding is
634
635 224 003 117 100 112 047 108 ...
636
637 Above, the middle-septet 224 represents 224-128=96 and the length
638 field 224 003 represents 96+128*3=480. The number 117 is a Latin
639 small letter u as in "udp". The little-endian nature of bit vectors
640 has no observable effect here.
641
642 In general, sibling attributes have form
643
644 protocol "/" host "/" port "/" relay
645
646 The protocol may be 'tcp' or 'udp'. The host and port identify the
 Logiweb server. The relay must be a URL [RFC3986].
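
The following Python sketch shows one way to split a sibling value
into its four parts and to encode it as a byte vector. The function
names are hypothetical, and the sketch assumes that the value is
plain ASCII and that neither host nor port contains a slash; the text
above does not state a character encoding explicitly.

```python
def encode_cardinal(n: int) -> bytes:
    out = bytearray()
    while n >= 128:
        out.append(128 + n % 128)
        n //= 128
    out.append(n)
    return bytes(out)


def parse_sibling(value: str):
    """Split 'protocol/host/port/relay' at the first three slashes."""
    protocol, host, port, relay = value.split("/", 3)
    if protocol not in ("tcp", "udp"):
        raise ValueError("protocol must be 'tcp' or 'udp'")
    return protocol, host, int(port), relay


def encode_sibling(value: str) -> bytes:
    """Encode the value as a vector: bit length, then the raw bytes."""
    raw = value.encode("ascii")    # assumption: ASCII only
    return encode_cardinal(8 * len(raw)) + raw
```

Because the relay is itself a URL containing slashes, the split is
limited to three separators so that the relay is returned intact. For
the example value above, the encoded form starts with 224 003
followed by the character codes of "udp/l...".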
648
649 The purpose and function of a 'relay' is outside the scope of the
650 present document. For information, however, a relay is a special
651 Logiweb client which runs as a CGI-program [CGI]. If a relay is
652 invoked with a path of '/64/...' or '/32/...' or '/16/...' where the
653 dots express a Logiweb reference expressed base 64, 32, or 16, then
654 the relay contacts a Logiweb server to get the reference translated
 to a URL and returns an indirection to that URL. As an example,
656 looking up http://logiweb.eu/logiweb/server/relay/64/... in a web
657 browser is supposed to open the Logiweb document with the given
658 reference. Looking up e.g.
659 http://logiweb.eu/logiweb/server/relay/64/.../2/index.html is
660 supposed to do the same but then to back up 2 slashes and then add
661 index.html.
662
663 Logiweb relays typically have further facilities. At the time of
664 writing, the relay at http://logiweb.eu/server/relay contains a self-
665 documenting interface to a Logiweb server which allows any user to
666 experiment with the protocol described in the present document. The
667 given relay was the first Logiweb relay established on the Internet
668 and is supposed to exist as long as Logiweb itself exists.
669
670 Logiweb relays will not be mentioned any more in the present
671 document.
672
673 We shall refer to sibling attributes as sibling pointers. Sibling
674 pointers are said to be 'valid' if they point to servers which store
675 a branch sibling of the given node. A sibling pointer is said to be
676 'dangling' otherwise. Hence, a sibling pointer is dangling if the
677 server pointed to stores no sibling of the given node. Furthermore, a
678 sibling pointer is dangling if the server pointed to does store a
679 sibling but that sibling is a leaf node.
680
681 A server SHALL try its best to avoid dangling pointers. No server can
682 be perfect here because the state of other servers may change without
683 notice. But a server is supposed to validate its sibling pointers
684 regularly.
685
686 Furthermore, each server SHALL try its best to populate all its nodes
687 with sibling pointers. The only excuse for not populating a node with
688 sibling pointers is if no Logiweb server in the world stores a branch
689 sibling of the given node.
690
691 Finally, each server SHALL do its best to ensure that all branch
692 siblings in the world of each node of the server are reachable from
693 the node by following sibling pointers. This is even more difficult
694 to satisfy than the two previous requirements, however, since not
695 only may other server states change without notice but, furthermore,
696 no server has any control over any other server. So, servers are
 basically required to be reasonable and cooperative.
698
699
700
2.18. The url attribute
702
703 The address of a node is a bit vector. A Logiweb reference is also a
704 bit vector. If the address of a node is a valid Logiweb reference
705 then the url attributes of the node shall be Uniform Resource
706 Locators (URLs) [RFC3986] of Logiweb documents with the given
707 reference.
708
709 Url attributes of nodes whose addresses are not valid Logiweb
710 references are reserved for future extensions.
711
712
713
2.19. The leap attribute
715
716 Only root nodes have leap attributes. Each leap attribute indicates
717 the location of a leap second. Leap attributes are byte vectors, i.e.
718 bit vectors whose length is a multiple of eight. Leap attributes have
719 format
720
721 leap = step mjd
722 step = cardinal
723 mjd = cardinal
724
725 Each leap second occurs at the end of a UTC day (i.e. at midnight in
726 Greenwich). The mjd field indicates which Modified Julian Day (MJD)
727 is affected by the leap. The step is 1 if that day is prolonged by
728 one second. The step is 2 if that day is shortened by one second.
729 Hence, step is 1 for a +1 leap and 2 for a -1 leap. If the
730 International Earth Rotation Service (IERS) ever decides to make
731 multiple leaps, the relationship is intended to be as follows:
732
733 step 0 1 2 3 4 5 6 ...
734 leap 0 +1 -1 +2 -2 +3 -3 ...
735
736 IERS only intends to use leaps of +1 and -1. Leaps of -1 have never
737 occurred and maybe never will. IERS intends to let leaps occur at the
738 end of June 30 and December 31. IERS intends to announce leaps in
739 advance. Leaps affect the length of the last minute of the last hour
740 of the affected UTC day.
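
The step/leap table above follows a simple pattern: odd steps denote
positive leaps and even steps denote negative ones. The Python sketch
below captures that mapping and decodes the two cardinals of a leap
attribute value. The function names are hypothetical, and the MJD
used in the test data is an arbitrary number rather than the date of
any actual leap second.

```python
def decode_cardinal(data: bytes, pos: int = 0):
    value, weight = 0, 1
    while True:
        septet = data[pos]
        pos += 1
        if septet >= 128:
            value += (septet - 128) * weight
            weight *= 128
        else:
            return value + septet * weight, pos


def step_to_leap(step: int) -> int:
    """Map step 0, 1, 2, 3, 4, ... to leap 0, +1, -1, +2, -2, ..."""
    return (step + 1) // 2 if step % 2 else -(step // 2)


def leap_to_step(leap: int) -> int:
    """Inverse mapping: +1 -> 1, -1 -> 2, +2 -> 3, -2 -> 4, ..."""
    return 2 * leap - 1 if leap > 0 else -2 * leap


def decode_leap(value: bytes):
    """Decode the bytes of a leap attribute into (leap seconds, mjd)."""
    step, pos = decode_cardinal(value, 0)
    mjd, pos = decode_cardinal(value, pos)
    return step_to_leap(step), mjd
```

Here decode_leap expects the byte part of the attribute value, that
is, the vector contents without the leading length field.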
741
742 As for all other attributes, the timestamps of leap attributes
743 indicate the time at which the attribute entered the state of the
744 server. At startup, a server is likely to read leap second
745 information from a configuration file or fetch it from another
746 Logiweb server. Servers should arrange leaps chronologically with the
747 oldest leap first.
748
749 Leap attributes shall comprise all past leaps announced by the IERS.
750 Leap attributes should comprise all past and future leaps announced
751 by the IERS. In other words, newly announced leaps shall enter the
752 state before the leap occurs.
753
754
755
2.20. Other attribute classes
757
 Only attributes of classes 0, 1, 4, 5, and 6 may occur in server
 states. Attribute classes 2 and 3 will never occur in server states.
 Attribute class 7 is reserved for information about which future
 classes a server supports. Classes 8-15 are reserved for experiments.
 Classes from 16 to 2^160-1 inclusive are reserved for first come
 first served classes. Classes from 2^160 and up are reserved for
 classes based on the value of Logiweb references. Only classes 0, 1, 4,
 5, and 6 are permitted according to the present document.
766
767
768
2.21. The initial state
770
771 When a server starts up, its state contains one node. That node is a
772 root node and it contains seven attributes: one 'type' attribute and
773 six 'update' attributes. The value of the 'type' attribute is the
774 empty bit vector indicating that the root node is a leaf. The values
775 of the update attributes are 1, 10, 11, 100, 101, and 110. All seven
776 timestamps are equal and indicate the time at which the root node was
777 created.
778
   We shall refer to sibling, url, and leap attributes as 'proper'
   attributes. After creation of the root node, the state is changed by
   adding and removing proper attributes. Update and type attributes
   only change as a consequence of adding and removing proper
   attributes. At any time, the server's state must contain the
   smallest number of nodes sufficient to hold the stored proper
   attributes. For that reason, removing a proper attribute may cause
   an avalanche of node deletions and adding a proper attribute may
   cause an avalanche of node creations.
788
789 When adding a proper attribute, the timestamp of all consequential
790 changes must be equal to the timestamp of the new attribute which in
791 turn must reflect the time at which the attribute was added. When
792 removing a proper attribute, all consequential changes must have the
793 same timestamp and that timestamp must reflect the time at which the
794 attribute was removed. The timestamps of successive additions and
795 removals of proper attributes must be strictly increasing. If the
796 resolution of the server clock is insufficient for that, then the
797 server must fake a higher resolution.
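
   A minimal sketch of how a server might keep timestamps strictly
   increasing when its clock is too coarse. The class name StrictClock
   and the use of integer nanoseconds are assumptions made for
   illustration only; the actual timestamp encoding is the one given in
   the section entitled "Timestamps".

```python
import time


class StrictClock:
    """Hands out strictly increasing integer timestamps."""

    def __init__(self, now=time.time_ns):
        self._now = now      # clock source, replaceable for testing
        self._last = None    # last timestamp handed out

    def stamp(self):
        t = self._now()
        if self._last is not None and t <= self._last:
            # The clock did not advance: fake a higher resolution.
            t = self._last + 1
        self._last = t
        return t
```

   Each addition or removal of a proper attribute would call stamp()
   once and reuse the result for all consequential changes.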
798
799 Consequential changes may involve changing the value of update and
800 type attributes. Such changes shall be treated as a simultaneous
801 removal of the old attribute and addition of a new one such that the
802 new attribute appears at the end of its attribute list.
803
804
805
2.22. Got messages
807
      message    =/ got
      got        =  id-got address class index
                    norm count timestamp value
      id-got     =  x05
      norm       =  cardinal
      count      =  cardinal
      value      =  vector
815
816 A Logiweb server which receives a get request shall do one of the
817 following:
818
819 o Respond by a got message as described later in this section.
820
821 o Respond by a 'Sorry' message.
822
823 o Avoid responding if the get is transported by a datagram.
824
825 o Disconnect if the get is transported by a connection-based
826 transport.
827
   Logiweb servers are supposed to respond to get requests. A Logiweb
   client should consider the other end of the connection broken if
   the client receives a get request.
831
832 Logiweb applications SHALL NOT respond to got responses.
833
834 If a Logiweb server responds with a 'got' response to a 'get'
835 request, then the 'got' response shall reflect the state of the
836 server at the time the 'get' is processed. The address, class, and
837 index of the 'got' response shall be identical to the address, class,
838 and index of the associated 'get' request. The norm, count,
839 timestamp, and value shall be as follows:
840
841 CASE 1: the state contains an attribute with the given address,
842 class, and index. The norm shall be the length of the address. The
843 count shall be the number of attributes in the state that have the
844 given address and class. The timestamp and value shall be the time
845 stamp and value, respectively, of the attribute with the given
846 address, class, and index.
847
848 CASE 2: the state contains an attribute with the given address and
849 class, but none with the given index. The norm shall be the length of
850 the address. The count shall be the number of attributes in the state
851 that have the given address and class. The timestamp and value shall
852 be the time stamp and value, respectively, of the attribute with the
853 largest index of the given address and class.
854
855 CASE 3: the state contains an attribute with the given address, but
856 none with the given class. The norm shall be the length of the
857 address. The count shall be zero. The timestamp shall be the current
858 server time. The value shall be the empty bit vector.
859
860 CASE 4: the state contains no attributes with the given address. In
861 this case, let A2 be the longest prefix of the given address for
862 which the state does contain an attribute.
863
864 CASE 4A: the state contains a sibling attribute with address A2. The
865 norm shall be the length of A2. The count shall be the number of
866 sibling attributes in the state that have address A2. The timestamp
867 and value shall be the time stamp and value, respectively, of a
868 randomly picked attribute with address A2 and class sibling.
869
870 CASE 4B: the state contains no sibling attributes with address A2.
871 The norm shall be the length of A2. The count shall be zero. The
872 timestamp shall be the current server time. The value shall be the
873 empty bit vector.
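
   The case analysis above can be sketched as follows, under several
   assumptions that are not part of this document: addresses are
   modelled as strings of '0' and '1', the state as a dictionary from
   (address, class) to an attribute list ordered oldest first, indexes
   as starting at 1 (so that index 0 falls under CASE 2), and the
   sibling class by the placeholder name SIBLING. The names Attribute,
   Got, and answer_get are hypothetical.

```python
import random
from collections import namedtuple

Attribute = namedtuple("Attribute", "timestamp value")
Got = namedtuple("Got", "address cls index norm count timestamp value")

SIBLING = "sibling"  # placeholder; the real class number is defined elsewhere


def addresses(state):
    """Set of addresses that hold at least one attribute."""
    return {addr for (addr, _cls), attrs in state.items() if attrs}


def answer_get(state, address, cls, index, now, rng=random):
    """Build the got response for a get request; now is the server time."""
    attrs = state.get((address, cls), [])
    if attrs:
        # CASE 1 if the index exists, CASE 2 (largest index) otherwise.
        hit = attrs[index - 1] if 1 <= index <= len(attrs) else attrs[-1]
        return Got(address, cls, index, len(address), len(attrs),
                   hit.timestamp, hit.value)
    known = addresses(state)
    if address in known:
        # CASE 3: the address exists but holds nothing of this class.
        return Got(address, cls, index, len(address), 0, now, "")
    # CASE 4: fall back to the longest prefix A2 that holds an attribute.
    a2 = next((address[:n] for n in range(len(address) - 1, -1, -1)
               if address[:n] in known), "")
    siblings = state.get((a2, SIBLING), [])
    if siblings:
        # CASE 4A: refer the client to a randomly picked sibling.
        pick = rng.choice(siblings)
        return Got(address, cls, index, len(a2), len(siblings),
                   pick.timestamp, pick.value)
    # CASE 4B: no referral is available.
    return Got(address, cls, index, len(a2), 0, now, "")
```

   In every case the address, class, and index of the request are
   echoed back unchanged; only norm, count, timestamp, and value vary.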
874
875 CASE 4A covers the case where the given server is unable to answer
876 the given question (the one encoded in the get request), but is able
877 to refer to some other server which stores a branch node with address
878 A2. In other words, CASE 4A covers the case where a server can refer
879 to a server more knowledgeable on the given question.
880
881 CASE 4B covers the case where the given server is unable to answer
882 the given question and unable to refer to a server which stores a
883 branch node with address A2. Logiweb servers SHALL try their best to
884 avoid CASE 4B in cases where there exists a server which has a branch
885 node with address A2. No server can be perfect here, however, since
886 all states of all other servers may change without notice. But
887 servers are required to crawl Logiweb to ensure they have a plentiful
888 supply of sibling attributes for all their nodes.
889
   Clients that need, for example, to translate a Logiweb reference R
   into a URL are supposed to issue a get message with address R, class
   URL, and index 0. When the client receives a got message whose norm
   equals the length of R, it uses the returned URL (if any). If the
   client receives a got message whose norm is less than the length of
   R, it resends the get request to the indicated sibling (if any). At
   each redirection, the norm is supposed to increase. If the norm does
   not increase, then the state of the penultimate server is outdated.
   In this case, the client may as a courtesy send the penultimate
   server a 'put' message which tells the server to remove its dangling
   sibling pointer. Put messages are described later.
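
   The lookup loop described above might be sketched as follows. This
   is an illustration under assumptions, not a reference client: the
   callables send_get and decode_sibling, the class name "url", the
   hop limit, and the reduced Got record are all hypothetical, and the
   reference is modelled as a string whose length stands in for the
   length of R.

```python
from collections import namedtuple

Got = namedtuple("Got", "norm count value")  # only the fields used here


def resolve(reference, first_server, send_get, decode_sibling, max_hops=32):
    """Follow referrals until some server answers for the full reference.

    Returns the url value, or None if the lookup fails or stalls.
    """
    server, best_norm = first_server, -1
    for _ in range(max_hops):
        got = send_get(server, reference, "url", 0)
        if got is None:                  # no reply, or a 'sorry'
            return None
        if got.norm == len(reference):   # this server knows the address
            return got.value if got.count > 0 else None
        if got.count == 0 or got.norm <= best_norm:
            return None                  # no referral, or the norm did not grow
        best_norm = got.norm
        server = decode_sibling(got.value)
    return None
```

   A fuller client could, when the norm fails to grow, also send the
   courtesy 'put' message to the penultimate server.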
901
   When a server or a client crawls Logiweb, it may do so iteratively.
   As an example, a client may remember when it last visited a given
   server. The next time the client visits the server, it may start by
   querying the server time with a ping request. Then the client may
   find out what has changed using update attributes, without wasting
   time on attribute classes and subtrees that have not changed since
   the last visit. Finally, the client may set its time of last visit
   to the time returned by the initial ping.
910
   Whenever such a client reads a changed attribute list, it should read
   it in reverse chronological order. To do so, it may start with index
   0 to get the newest attribute and the number C of attributes. Then it
   may query index C minus one, C minus two, and so on in that order. If
   attributes are removed between queries, then the client may receive
   the same attribute more than once, but it will never miss an
   attribute. For attributes other than update attributes, distinct
   attributes have distinct timestamps, so the client can eliminate
   duplicates on the basis of timestamps.
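
   A small sketch of this reading order, assuming that indexes start
   at 1 so that a query for index 0 returns the newest attribute. The
   function name read_newest_first and the callable query, which
   stands for one get/got round trip and returns a (count, timestamp,
   value) triple, are hypothetical.

```python
def read_newest_first(query):
    """Read one attribute list, newest first, dropping duplicates."""
    count, ts, value = query(0)   # newest attribute and the count C
    if count == 0:
        return []
    seen = {ts}
    result = [(ts, value)]
    for index in range(count - 1, 0, -1):
        _, ts, value = query(index)
        if ts not in seen:        # same timestamp means same attribute
            seen.add(ts)
            result.append((ts, value))
    return result
```

   Deduplicating on timestamps is only sound for attributes other than
   update attributes, as noted above.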
920
921
922
2.23. Put messages
924
      message    =/ put
      put        =  id-put address class operation value
      id-put     =  x06
      operation  =  remove / add
      remove     =  x00
      add        =  x01
931
932 A Logiweb server which receives a put request shall do one of the
933 following:
934
935 o Respond by a 'Received' message.
936
937 o Respond by a 'Sorry' message.
938
939 o Avoid responding if the put is transported by a datagram.
940
941 o Disconnect if the put is transported by a connection-based
942 transport.
943
   Logiweb servers are supposed to respond to put requests. A Logiweb
   client should consider the other end of the connection broken if
   the client receives a put request.
947
   A server which receives a put message whose operation is 'remove' may
   consider removing an attribute with the given address, class, and
   value. The remove message contains no index since the index of an
   attribute can decrease at any time because of removal of older
   attributes on the same attribute list.

   A server which receives a put message whose operation is 'add' may
   consider adding an attribute with the given address, class, and
   value. The add message contains no timestamp since the timestamp of
   the new attribute should be set to the current server time rather
   than being supplied.
959
   A server should treat almost all put requests with almost infinite
   suspicion. A put request could be forged to corrupt the state of a
   server or could be forged to fool the server into participating in a
   denial-of-service attack on some other Logiweb server or some other
   service on the Internet. This is why a server only tells the sender
   of a put request that the server has 'received' the request. It does
   not reveal any information about what the server is going to do with
   the request. It is perfectly legitimate for a server to ignore all
   put requests.
969
970
971
3. Security Considerations
973
974 3.1. Unwanted outgoing information
975
976 A Logiweb server provides information to the outside world through
977 pong responses, event responses, and got responses.
978
   Pong responses identify the server as a Logiweb server and tell
   what time it is. The owner of a Logiweb server must be prepared to
   share this information with the world.

   Event responses (received, rejected, and sorry responses) tell the
   world about the mood of the server. The owner of the server must be
   prepared to share that as well.
986
987 Got responses tell the world about the publicly available state of
988 the server. In principle, the owner of the server should be prepared
989 to share that as well.
990
   A Logiweb server, however, typically indexes given subtrees of the
   owner's web site. A Logiweb server typically does so by crawling the
   file system of the host. In doing so, the server could find documents
   whose existence the owner wants to keep secret, and then make the
   existence of those documents publicly known. After that, the secret
   documents may be retrieved from the owner's web server.
997
998 As a countermeasure for that, Logiweb servers should only index files
999 with extension 'lgw' ('lgw' for 'logiweb'). Among those files, the
1000 server should check that the first byte of the file contains the
1001 number 1, and that the next twenty bytes contain the RIPEMD-160 hash
1002 key of the remaining bytes of the file. That ensures with great
1003 likelihood that only genuine Logiweb documents are indexed, avoiding
1004 inadvertent indexing of other kinds of documents. Authors of Logiweb
1005 documents who want their Logiweb documents to remain secret should
1006 keep them out of reach of the local Logiweb server.
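
   The indexing check above could look roughly like this. It is a
   sketch under assumptions: the function names should_index and
   ripemd160 and the hash20 parameter are hypothetical, and Python's
   hashlib only offers RIPEMD-160 when the underlying OpenSSL build
   provides it, which is why the hash function is left replaceable.

```python
import hashlib


def ripemd160(data):
    # Raises ValueError when the local OpenSSL build lacks RIPEMD-160.
    return hashlib.new("ripemd160", data).digest()


def should_index(filename, data, hash20=ripemd160):
    """Apply the extension, version byte, and hash checks described above."""
    if not filename.endswith(".lgw"):
        return False
    if len(data) < 21 or data[0] != 1:
        return False
    # Bytes 1..20 must be the hash key of everything that follows them.
    return data[1:21] == hash20(data[21:])
```

   A crawler would call should_index on each candidate file and skip
   any file for which it returns False.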
1007
   As another use of got messages, an attacker may use got responses to
   figure out how the server reacts to put requests. Doing so, the
   attacker may be able to find a security hole which allows the
   attacker to fool the server into participating in a denial-of-service
   attack on some other service. The ultimate countermeasure to this is
   to let the server ignore all put messages. Otherwise, one must try to
   avoid security holes in the server.
1015
1016
1017
1018 3.2. State corruption
1019
1020 Using put messages, an attacker may try to persuade a server to place
1021 incorrect information in the server state. The ultimate
1022 countermeasure to this is to let the server ignore all put messages.
1023 Otherwise, a server should not react directly to put messages.
1024 Rather, the server should repeatedly crawl its host file system to
1025 keep its url attributes up to date and should repeatedly crawl
1026 Logiweb to keep its sibling attributes up to date. In doing so, a
1027 server could take a put message as a hint to crawl some particular
1028 area earlier than it would otherwise do.
1029
   One source of put messages is notifications from inside the owner's
   firewall that some Logiweb document has been added to or removed from
   the file system. To respond reasonably, servers are advised to
   classify sender IP addresses suitably in order to follow up more
   promptly on put requests from more trusted senders. This only works,
   of course, for sender IP addresses which an attacker cannot tamper
   with.
1036
   Even if a server is persuaded to place incorrect information in its
   state, this will at most prevent clients from finding Logiweb
   documents. If a server translates a reference into a URL, then the
   client is supposed to retrieve the associated Logiweb document and to
   verify using RIPEMD-160 [RIPEMD] that the retrieved document is the
   one requested.
1043
1044
1045
1046 3.3. Incoming denial-of-service attacks
1047
   If a large number of clients start sending requests to a single
   Logiweb server, the incoming bandwidth of the server may get
   saturated. To avoid saturating the outgoing bandwidth if this occurs,
   the 'sorry' message has been included in the protocol. The 'sorry'
   message allows the server to respond to incoming messages using
   little bandwidth and little computational resources. Furthermore, the
   protocol allows the server not to respond at all, which accounts for
   messages lost due to limited incoming bandwidth.
1056
1057 Logiweb clients should maintain a list of Logiweb servers, and if one
1058 server does not respond or responds with a 'sorry', then the client
1059 should switch to another Logiweb server.
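
   A minimal sketch of that fallback behaviour. The function name
   ask_any and the callable ask are hypothetical; ask is assumed to
   return None when a server stays silent and the string "sorry" when
   the server answers with a 'sorry' message.

```python
def ask_any(servers, ask):
    """Try servers in order; skip those that stay silent or say 'sorry'."""
    for server in servers:
        reply = ask(server)
        if reply is not None and reply != "sorry":
            return server, reply
    return None  # every known server was silent or overloaded
```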
1060
1061
1062
1063 3.4. Outgoing denial-of-service attacks
1064
   An attacker may launch an indirect denial-of-service attack by
   sending requests to a Logiweb server whose sender field contains the
   IP address of the victim. To counter that, the Logiweb protocol
   specifies that each request can result in at most one response. In
   that way, an attacker cannot use a Logiweb server to 'amplify' the
   attack.
1070
   Logiweb servers are supposed to crawl Logiweb on their own
   initiative. Furthermore, put messages may suggest to Logiweb servers
   that they should prioritize crawling of particular servers. An
   attacker could use this to persuade a number of Logiweb servers to
   crawl one victim simultaneously. To counter that, the present
   document does not specify exactly what a Logiweb server is supposed
   to do with put messages. Furthermore, Logiweb servers should approach
   other servers gently, waiting for their responses to see that the
   contacted servers do respond and do not send out 'sorry' messages.
   Finally, Logiweb servers should check that they actually are talking
   to Logiweb servers and not to some innocent other service. Logiweb
   servers may do so by sending a ping request to services whose
   identity they are not sure of.
1084
1085
1086
4. IANA Considerations
1088
1089 4.1. Well Known Port 332
1090
1091 The format of sibling attributes allows Logiweb servers to run on
1092 arbitrary UDP and TCP ports. At present, Logiweb servers use UDP port
1093 65535 by default.
1094
   To avoid making the use of port 65535 permanent, registration of UDP
   and TCP Well Known Port 332 is requested.
1097
   Port number 332 is suggested because 332 = 256 + 76 where 76 is the
   Unicode code point of Latin capital letter L, which is the first
   letter in "Logiweb". On some occasions not covered in the present
   document, the Logiweb system represents strings by numbers, in which
   case the one character string "L" happens to be represented by the
   number 332. Furthermore, port 332 is unassigned and appears at the
   end of an interval of unassigned numbers so that assignment will not
   lead to fragmentation.
1106
1107 Suggested port name: "Logiweb".
1108
1109
1110
4.2. MIME type application/prs.logiweb
1112
   As mentioned, the main purpose of Logiweb servers is to translate
   Logiweb references into a URL of an associated Logiweb document.
   When the URL of a Logiweb document is looked up, HTTP servers
   currently deliver the Logiweb document with MIME type
   application/x-logiweb.
1118
1119 To avoid making the use of MIME type application/x-logiweb permanent,
1120 MIME type application/prs.logiweb is requested to be registered.
1121
1122 The format of Logiweb documents is:
1123
      document   =  id-version ripemd timestamp contents
      id-version =  %d001
      ripemd     =  20*20 byte
      contents   =  *byte
      byte       =  %d0-255
1129
1130 For the syntax of timestamps, see the section entitled "Timestamps".
1131
1132 The ripemd field of a document must be the RIPEMD-160 hash key
1133 [RIPEMD] of all bytes following the ripemd field (including the
1134 timestamp).
1135
1136 The reference of a Logiweb document comprises the document with the
1137 contents removed.
1138
1139 The description above of the contents as a sequence of bytes is
1140 sufficient as far as the Logiweb protocol is concerned. A more
1141 complete description may be found at
1142 http://logiweb.eu/logiweb/doc/server/protocol.html#Pages.
1143
1144
1145
5. References
1147
5.1. Normative References
1149
1150 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1151 Requirement Levels", BCP 14, RFC 2119, March 1997.
1152
1153 [RFC4234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
1154 Specifications: ABNF", RFC 4234, October 2005.
1155
1156
1157
5.2. Informative References
1159
1160 [Logiweb] http://logiweb.eu/ (see also Grue, K., "Logiweb - A System
1161 for Web Publication of Mathematics", Mathematical Software
1162 - ICMS 2006, Lecture Notes in Computer Science,
1163 pp.343--353, vol.4151, Springer, 2006).
1164
1165 [CGI] http://www.w3.org/CGI/
1166
1167 [RIPEMD] Dobbertin, H., Bosselaers, A., and Preneel, B.,
1168 "RIPEMD-160: A Strengthened Version of RIPEMD", Fast
1169 Software Encryption, 71-82, 1996
1170
1171 [Unicode] http://www.unicode.org/
1172
1173 [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
1174 August 1980.
1175
1176 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, RFC
1177 793, September 1981.
1178
1179 [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1180 Resource Identifier (URI): Generic Syntax", STD 66, RFC
1181 3986, January 2005.
1182
1183
1184
Authors' Address
1186
1187
1188 Klaus Grue
1189 DIKU
1190 University of Copenhagen
1191 Universitetsparken 1
1192 DK-2100 Copenhagen
1193 Denmark
1194
1195 email - grue@diku.dk
1196
Full Copyright Statement
1198
1199 Copyright (C) The IETF Trust (2007).
1200
1201 This document is subject to the rights, licenses and restrictions
1202 contained in BCP 78, and except as set forth therein, the authors
1203 retain all their rights.
1204
1205 This document and the information contained herein are provided on an
1206 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1207 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST, AND
1208 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
1209 IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1210 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1211 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
1212
Intellectual Property
1214
1215 The IETF takes no position regarding the validity or scope of any
1216 Intellectual Property Rights or other rights that might be claimed
1217 to pertain to the implementation or use of the technology
1218 described in this document or the extent to which any license
1219 under such rights might or might not be available; nor does it
1220 represent that it has made any independent effort to identify any
1221 such rights. Information on the procedures with respect to
1222 rights in RFC documents can be found in BCP 78 and BCP 79.
1223
1224 Copies of IPR disclosures made to the IETF Secretariat and any
1225 assurances of licenses to be made available, or the result of an
1226 attempt made to obtain a general license or permission for the use
1227 of such proprietary rights by implementers or users of this
1228 specification can be obtained from the IETF on-line IPR repository
1229 at http://www.ietf.org/ipr.
1230
1231 The IETF invites any interested party to bring to its attention
1232 any copyrights, patents or patent applications, or other
1233 proprietary rights that may cover technology that may be required
1234 to implement this standard. Please address the information to the
1235 IETF at ietf-ipr@ietf.org.
1236
1237
1238
1239
Logiweb                           JULY 2009                        LOGIWEB(7)