LGC(5)                           File Formats                           LGC(5)

NAME

       lgc - the lgs source file format for the lgc compiler

DESCRIPTION

       Source files of the Logiweb compiler lgc (lgc(1)) are expressed in the
       LoGiweb Source language (lgs). The lgs language makes it possible to
       express mathematics in a seminatural style.

       To learn lgs, simply read the Logiweb source of the 'base' page at
       http://logiweb.eu/1.0/doc/pages/base/source.lgs. The comments in there
       give many more details than could reasonably be included here. Then
       read the 'lgc' page found in the same place. It documents the lgc
       compiler, including many details on lgs.

       The following sections give an overview, however.

STANDARDIZATION ISSUES

       The lgc compiler translates lgs into Logiweb vectors, racks, and
       renderings. The Logiweb standard defines the format of Logiweb vectors
       and racks, and defines precisely how vectors are translated to racks.

       The Logiweb standard does not, however, define the lgs format. The lgc
       compiler is the compiler which happens to come with the Logiweb
       distribution, and the lgs format happens to be the input format of the
       lgc compiler. But Logiweb does not consider lgs to be part of the
       standard. Any compiler which produces vectors, racks, and renderings
       may be used in connection with Logiweb.

       The Logiweb standard partially defines what a rendering is: A rendering
       is a file tree rooted at a 'rendering directory'. The rendering
       directory is supposed to contain a file named vector.lgw which contains
       the page in vector format, a file named rack.lgr which contains the
       page in rack format, and a subdirectory named page which contains the
       rendering of the page. Compilers for Logiweb are free to produce
       additional contents of the rendering directory such as an index.html
       file.

       Logiweb compilers are only required (1) to produce a vector.lgw file in
       Logiweb vector format, (2) to produce an associated rack.lgr file which
       is derived from vector.lgw in exactly the same way as lgc does, and (3)
       to produce a 'page' subdirectory which is derived from rack.lgr in
       exactly the same way as lgc does.

CHARACTER SET

       Each lgs file is expressed in Unicode UTF-8. Lines may be terminated by
       LF (code 10), CR (code 13), CRLF (code 13 followed by code 10), or LFCR
       (code 10 followed by code 13).

       Internally, Logiweb uses LF for terminating lines. More specifically,
       plain text inside Logiweb vectors and Logiweb racks uses LF for
       terminating lines. The purpose of this is to ensure interoperability
       between different platforms.

       lgc translates to LF when reading lgs files and translates to the host
       newline convention when producing renderings.
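
       The translation applied when reading can be illustrated with a short
       Python sketch. It is not taken from lgc: the function name
       normalize_newlines is hypothetical, and pairing CR/LF and LF/CR
       greedily from left to right is an assumption.

```python
def normalize_newlines(text):
    """Translate LF, CR, CRLF, and LFCR line terminators to LF."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in "\r\n":
            # A CR/LF or LF/CR pair counts as a single line terminator.
            # Two equal characters in a row (e.g. LF LF) are two terminators.
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in ("\r", "\n") and nxt != ch:
                i += 1
            out.append("\n")
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

       With this sketch, a text that mixes all four conventions comes out
       with LF terminators only, and an empty line (two equal terminators in
       a row) is preserved.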

MULTIQUOTE AND DIRECTIVES

       The only reserved character in lgs is the double quote character. The
       lgs language uses double quote characters for many different purposes.

       We shall refer to a sequence of two or more double quote characters as
       a 'multiquote' and to an isolated double quote character as a 'lone
       quote'.

       We shall refer to a multiquote followed by a non-quote as a
       'directive'.
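
       The terminology can be made concrete with a small Python sketch that
       classifies every run of double quotes in a text. The function name
       classify_quotes is hypothetical, and reporting a directive together
       with the single non-quote character that follows it is an assumption
       made for illustration.

```python
import re


def classify_quotes(text):
    """Return a (kind, token) pair for every run of double quotes in text."""
    result = []
    for match in re.finditer(r'"+', text):
        run = match.group()
        if len(run) == 1:
            result.append(("lone quote", run))
            continue
        # A multiquote followed by a non-quote character is a directive.
        following = text[match.end():match.end() + 1]
        if following:
            result.append(("directive", run + following))
        else:
            result.append(("multiquote", run))
    return result
```

       Applied to the two lines ""P my page and " square, the sketch reports
       one directive and one lone quote.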

COMMENTS

       Comments start with ""{ or ""; directives (i.e. with two or more double
       quote characters followed by a left brace or a semicolon).

       Comments that start with ""; end at the end of the line.

       Comments that start with ""{ can span any number of lines. They end at
       the first ""} directive which has exactly the same number of double
       quote characters as the opening directive. This is an example of a
       comment:
           """{ A ""} ends a comment starting with ""{ """}
       Note that the comment is enclosed in brace directives with three double
       quotes. The brace directives with two double quotes are part of the
       comment.

       Comments may occur anywhere except directly after a double quote, since
       that double quote would then be considered to be part of the directive.
       In particular, comments may occur inside strings and in the middle of
       keywords.

       If the first four characters of a file constitute the magic code "";;
       then the first line of the file is considered to be a 'header'. All hex
       characters from the magic code and up to the first non-hex character
       suggest what the reference of the page might be. Whenever a source file
       with a header is translated, the suggested reference is used if it fits
       the contents. Otherwise, a new reference is generated and the compiler
       writes the new reference back into the header. To use this facility,
       let your source file start with a line containing nothing but "";;. At
       the first translation, a reference will be stored back in the header.
       After that, whenever you retranslate the source without having made
       changes to it, the page will get the same reference as the last time it
       was translated. Without a header, the page will get a new time stamp at
       each translation.
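
       The ""; and ""{ comment rules of this section can be sketched in Python
       as follows. This is an illustration, not the lgc implementation: the
       name strip_comments is hypothetical, a ""; comment is assumed to
       swallow the end of its line, and an unterminated ""{ comment is assumed
       to run to the end of the text.

```python
import re

# Two or more double quotes followed by a left brace or a semicolon.
COMMENT_START = re.compile(r'("{2,})([{;])')


def strip_comments(text):
    """Remove ""; and ""{ ... ""} comments from text."""
    out = []
    pos = 0
    while True:
        start = COMMENT_START.search(text, pos)
        if start is None:
            out.append(text[pos:])
            break
        out.append(text[pos:start.start()])
        quotes, kind = start.group(1), start.group(2)
        if kind == ";":
            # Line comment: skip up to and including the end of the line.
            end = text.find("\n", start.end())
            pos = len(text) if end == -1 else end + 1
        else:
            # Brace comment: the closing directive must have exactly the
            # same number of double quotes as the opening directive.
            closing = re.compile(r'(?<!")' + re.escape(quotes) + r'\}')
            end = closing.search(text, start.end())
            pos = len(text) if end is None else end.end()
    return "".join(out)
```

       Applied to the example comment above, the sketch returns the empty
       string, because the inner ""} has too few double quotes to close the
       comment.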

EXAMPLE

       The following is a well-formed lgs file:
           ""P my page
           ""R base
           ""D
           " square
           ""B
           "We have that "[[ 2 square ]]" is four."

PAGE NAME

       Each lgs file must contain one ""P directive which defines the name of
       the page being defined. The page name comprises all characters from the
       directive until the end of the line. One may use a newline directive
       (""n) instead of the end of the line to delimit the page name.

       Lone quotes after the ""P directive have a special meaning described in
       the section named QUALIFICATION below.

       Comments in page names are ignored. Note that if the line defining the
       page name ends with a ""; comment then the end of line is ignored and
       the page name effectively continues on the next line. A similar remark
       holds for ""{ comments which span several lines.

       By convention, the ""P directive of an lgs file should occur at the
       beginning of the file, possibly after a "";; header and a comment about
       copyright.

REFERENCES

       Each lgs file may contain zero, one, or more ""R directives. Each ""R
       directive names a page being referenced. The name of the referenced
       page comprises all characters from the ""R directive until the end of
       the line or until the first ""n directive, whichever comes first.

       The page named by the first ""R directive is reference number 1, the
       one named by the second is reference number 2, and so on. Implicitly,
       the page being defined is considered to be 'reference number 0'.

       Lone quotes after ""R directives have a special meaning described in
       the section named QUALIFICATION below.

       By convention, all ""R directives should come right after the ""P
       directive.

       Referenced pages may be pointed at in many different ways. Some
       examples read:
           ""R file:/usr/share/logiweb/name/base/vector.lgw
           ""R file:~/.logiweb/name/base/vector.lgw
           ""R file:../name/base/vector.lgw
           ""R http://logiweb.eu/1.0/doc/pages/base/vector.lgw
           ""R base
           ""R lgw:017451CF6643931035C71796AC493D382EC8357EE9A390D5D6DBCDAA0806
       The first three reference Logiweb vectors in the local file system,
       relative to the root directory, the home directory, and the current
       directory, respectively. The fourth one references a particular http
       url. The fifth makes a reference by name which is resolved by the
       'namepath' parameter of the lgc compiler. The last one uses a Logiweb
       reference which is resolved by the 'path' parameter of the lgc
       compiler.

       See the 'lgc' Logiweb page or http://logiweb.eu/ for more details on
       references.

DEFINITIONS

       Each lgs file may contain zero, one, or more ""D directives. Each ""D
       directive defines zero, one, or more syntactical constructs.

       Each line following a ""D directive and until the first ""P, ""R, ""D,
       or ""B directive defines one syntactical construct (blank lines are
       ignored, though).

       In construct definitions, lone quotes serve as placeholders. Three
       examples of constructs read:
           " square
           " < "
           if " then " else "
       The constructs above allow one to write expressions like
           if 2 square < 3 square then 4 else 5

       Each page has a Logiweb reference of about 30 bytes and each construct
       defined on a page has an index. The first construct defined has index
       1, the second has index 2, and so on. Implicitly, the page name is also
       considered to be a construct. The page name has index 0.

       When a page defines a construct, that page is considered to be the
       'home page' of the construct. Each Logiweb page is identified by its
       worldwide unique Logiweb reference. Each Logiweb construct is uniquely
       identified by its index together with the reference of its home page.

       By convention, ""D sections come after the ""R sections.

CHARGES

       One may assign a 'charge' to defined constructs. As an example, it is
       customary to assign a larger charge to addition than to multiplication
       such that e.g.
           2 * 3 + 4 * 5
       means
           ( 2 * 3 ) + ( 4 * 5 )
       A charge is the opposite of a priority, so constructs with high charge
       have low priority and vice versa.

       Charges are expressed as lists of integers, separated by dots. For
       example, 2.-3.4 is a charge.

       Charges are sorted lexicographically such that e.g.
           1.2.-1 < 1.2 < 1.2.2 < 2.1
       When comparing two charges of different length, the shorter one is
       padded with zeros at the end. As an example, 1.2 and 1.2.0 denote the
       same charge.

       One may include a charge between a ""D directive and the first newline
       character after it. The charge applies to all constructs introduced by
       the given ""D section. As an example, the following definitions assign
       charge 1.6 to multiplication and 1.8 to addition and subtraction:
           ""D 1.6
           " * "
           ""D 1.8
           " + "
           " - "
       One may also give a charge indirectly. As an example, the following
       assigns the charge of multiplication to division:
           ""D " * "
           " / "
       By convention, all constructs which neither start nor end with a lone
       quote should have charge zero. The page symbol always has charge zero.
       If no charge is given after a ""D directive then all constructs defined
       by the directive get charge zero.

       A charge is said to be odd/even if its last nonzero element is
       odd/even. As an example, 2.4.6.7.0.0 is odd. As a special case, charge
       zero is considered to be even.

       Constructs with even charge are preassociative. A preassociative
       construct is left associative in text written left to right, right
       associative in text written right to left, and counterclockwise
       associative in text written in clockwise spirals. Constructs with odd
       charge are postassociative. As an example, if subtraction has charge
       1.8 then subtraction is preassociative. man pages are written left to
       right so preassociative means left associative here. Hence,
           6 - 2 - 3
       means
           ( 6 - 2 ) - 3
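
       The ordering of charges and the odd/even rule can be sketched in
       Python. The function names parse_charge, compare_charges, and
       is_preassociative are hypothetical and not part of lgc; the sketch
       only handles charges written as dotted integers, not charges given
       indirectly through a construct.

```python
from itertools import zip_longest


def parse_charge(text):
    """Parse a charge such as '2.-3.4' into a list of integers."""
    return [int(part) for part in text.split(".")]


def compare_charges(a, b):
    """Return -1, 0, or 1 as charge a is below, equal to, or above b."""
    # The shorter charge is padded with zeros before the element-wise
    # (lexicographic) comparison.
    pairs = zip_longest(parse_charge(a), parse_charge(b), fillvalue=0)
    for x, y in pairs:
        if x != y:
            return -1 if x < y else 1
    return 0


def is_preassociative(charge):
    """True if the last nonzero element is even, or if the charge is zero."""
    nonzero = [element for element in parse_charge(charge) if element != 0]
    return not nonzero or nonzero[-1] % 2 == 0
```

       In this sketch compare_charges("1.2", "1.2.0") returns 0, and
       is_preassociative("1.8") is true, in agreement with the subtraction
       example above.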

BODY

       The body of a page comprises all of an lgs file except comments, page
       name, references, and definitions. By convention, the body comes after
       the ""D sections.

       The ""B directive may be used to terminate a ""D section. Terminating a
       ""D section, however, implicitly starts or resumes the body section, so
       one may think of ""B as a 'body directive'.

       The body of a page is made up of constructs, strings, and body
       directives.

       The constructs may be constructs defined on the page itself or
       constructs defined on directly referenced pages. Directly referenced
       pages are those mentioned in ""R directives, as opposed to transitively
       referenced pages which are the directly referenced pages plus the pages
       transitively referenced by directly referenced pages.

SPACES

       The lgs language treats all characters almost equally, the exceptions
       being the characters in the range 0 to 32 (inclusive). Characters with
       codes 0-8, 11, and 14-31 are ignored. In the body and outside strings,
       any sequence of spaces (code 32), horizontal tabs (code 9), line feeds
       (code 10), form feeds (code 12), and carriage returns (code 13) is
       treated as a single space character. Apart from that, space characters
       are treated like any other character.

       As an example, consider addition:
           ""D 1.6
           " + "
       The definition makes it possible to interpret
           2   +   3
       as the sum of 2 and 3 whereas
           2+3
       is unparseable due to missing spaces around the sum sign.
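
       The treatment of characters in the range 0 to 32 in the body can be
       sketched in Python. The name normalize_body_spaces is hypothetical,
       and removing the ignored characters before collapsing blank sequences
       is an assumption about the order of the two steps. The sketch does not
       distinguish strings from the rest of the body.

```python
import re

# Codes 0-8, 11, and 14-31 are ignored.
IGNORED = re.compile(r"[\x00-\x08\x0b\x0e-\x1f]")
# Space (32), tab (9), line feed (10), form feed (12), carriage return (13).
BLANKS = re.compile(r"[ \t\n\x0c\r]+")


def normalize_body_spaces(text):
    """Drop ignored control characters and collapse blank runs to one space."""
    return BLANKS.sub(" ", IGNORED.sub("", text))
```

       Under these assumptions, "2   +   3" and a version of it that is
       spread over several lines both normalize to "2 + 3", whereas "2+3" is
       left unchanged and so still lacks the spaces the construct requires.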

STRINGS

       Strings are arbitrary sequences of characters enclosed in string
       delimiters. A string can start with a lone quote or with a ""-
       directive. A string can end with a lone quote or with a "". directive.

       The empty string, however, cannot be enclosed in lone quotes since that
       would produce two double quotes in a row which counts as the beginning
       of a directive. The "". directive, however, may be used both for ending
       a string and for representing the empty string. One can always tell
       from context which meaning "". has. The following four lines all
       represent an empty string.
           "".
           ""-"
           ""-"".
           ""-""{Comment""}"

       The lgc compiler applies 'newline translation' to strings: CR, CRLF,
       LFCR, and FF are translated to LF, TAB characters are translated to
       space characters, and characters with codes below 32 (Space) other than
       TAB, LF, FF, and CR are removed. Each TAB character is translated to
       one and only one space character. To include characters like CR and TAB
       in strings, one has to use directives.

       Inside strings, one may use the following directives:
           ""- No character
           ""! Double quote
           ""f Form feed
           ""n Line feed
           ""r Carriage return
           ""t Horizontal tab
           ""x Characters given in hexadecimal (until period)

       As an example of the use of the ""x directive, "I""x4A4B4C.M" means
       "IJKLM".
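
       The expansion of directives inside a string can be sketched in Python.
       The name expand_string_directives is hypothetical. The sketch only
       recognizes directives written with exactly two double quotes, and it
       assumes that the hex digits of ""x come in pairs that encode UTF-8
       bytes; the text above does not specify the encoding.

```python
import re

# Directives that stand for a fixed replacement text.
SIMPLE = {"-": "", "!": '"', "f": "\f", "n": "\n", "r": "\r", "t": "\t"}

# Either a simple directive, or ""x followed by hex digits and a period.
DIRECTIVE = re.compile(r'""(?:([-!fnrt])|x([0-9A-Fa-f]*)\.)')


def expand_string_directives(body):
    """Expand directives in the text between the string delimiters."""
    def replace(match):
        if match.group(1) is not None:
            return SIMPLE[match.group(1)]
        # Assumption: the hex digits are pairs encoding UTF-8 bytes.
        return bytes.fromhex(match.group(2)).decode("utf-8")
    return DIRECTIVE.sub(replace, body)
```

       Given the string body I""x4A4B4C.M, the sketch yields IJKLM as in the
       example above. An odd number of hex digits makes bytes.fromhex raise
       ValueError in this sketch.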

BODY DIRECTIVES

       The directives that can be used in the body are:
           ""# (until lone quote) include given file verbatim as a string
           ""$ (until lone quote) same, but with newline processing
           ""S include the lgs source text itself as a string
           ""N include name definitions
           ""C include charge definitions

       For details on these directives, consult the lgc Logiweb page or
       http://logiweb.eu/. A short list of examples follows, however:
           ""#logiweb.png"
       Include the Logiweb icon as a string of raw bytes. Keep the bytes as
       they are.
           ""$README"
       Include the given README as a string and apply newline translation to
       it.
           ""S
       Include the lgs source file itself as a string. Inclusion is like ""#
       but with a twist: If the lgs file does not start with a header, a line
       containing nothing but "";; is prepended. And if the lgs file does
       start with a header then all hex digits in the header are removed. The
       latter ensures that an lgs file with a header gives the same result if
       translated twice. The former ensures that if the source.lgs file
       generated as part of the rendering is retranslated then the result is
       identical to the result of the first translation.

       A README consists of plain text, so it is reasonable to apply newline
       processing. A png file contains binary data, so translation of CR to LF
       could corrupt the file.

       It is debatable how e.g. an html file should be included. An html file
       is near-plain without being completely plain. Furthermore, the html
       standard specifies CRLF to be used as line terminator. One may choose
       to include it with newline processing, in which case one should
       remember to translate back to CRLF if writing it back to disk. Or one
       may choose to include it raw and consider the CRLFs to be part of the
       html format.

       Note that lgs has nothing which resembles #include of the C programming
       language: The three include directives of Logiweb only make it possible
       to include a file as a single string. Beta-test versions of Logiweb had
       a #include-like feature, but the feature has been removed.

       The ""N directive expands into a list of definitions which records the
       relationship between construct indexes and construct names. The ""C
       directive expands into a list of definitions which records the
       relationship between construct indexes and construct charges. The body
       of a page should include one ""N and one ""C directive placed in a
       suitable context. Otherwise, information about construct names and
       charges is lost in translation. Look at the lgs sources of the pages
       that come with Logiweb for examples of how to use ""N and ""C.

QUALIFICATION

       When referencing pages one may run into the problem that two distinct
       constructs may have the same name. To cope with that, ""R directives
       allow constructs to be qualified.

       Qualifiers modify constructs as they are imported. After the ""R
       directive, one may list an arbitrary number of qualifiers before the
       reference, separated by lone quotes.

       As an example, suppose the base page defines these constructs:
           if " then " else "
           " + "
       Furthermore, suppose a page references the base page using the
       following reference:
           ""R abc " def " base
       The reference is to the base page and has qualifiers abc and def.

       With the reference above, one may refer to the if-then-else and the
       addition constructs under these names:
           abc if " then " else "
           def if " then " else "
           " abc + "
           " def + "

       One may include the empty qualifier in the list of qualifiers. If the
       empty qualifier is included, it has to appear first. As an example, the
       reference
           ""R" abc " def " base
       makes it possible to reference the if-then-else construct under these
       names:
           if " then " else "
           abc if " then " else "
           def if " then " else "

       As can be seen, each construct may be known under more than one name
       and distinct constructs may have the same name. If a name belongs to
       more than one construct, then lgc will protest if that name is used in
       the body.

       For more on qualifiers, including handling of spaces, see the lgc
       Logiweb page or http://logiweb.eu.
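
       The names produced by qualifiers in the examples of this section can
       be reproduced with a small Python sketch. It is inferred from those
       examples only: the name qualified_names is hypothetical, and both the
       placement of the qualifier after a leading placeholder and the use of
       a single separating space are assumptions. The actual rules,
       including the handling of spaces, are those of the lgc Logiweb page.

```python
def qualified_names(construct, qualifiers):
    """List the names under which a construct is known, one per qualifier."""
    names = []
    for qualifier in qualifiers:
        if qualifier == "":
            # The empty qualifier leaves the name unchanged.
            names.append(construct)
        elif construct.startswith('" '):
            # Assumption: the qualifier goes after a leading placeholder.
            names.append('" ' + qualifier + " " + construct[2:])
        else:
            names.append(qualifier + " " + construct)
    return names
```

       For the qualifiers abc and def, the sketch gives the four names listed
       for the if-then-else and addition constructs above.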

VECTORS

       The frontend of the lgc compiler translates an lgs source text into a
       Logiweb vector. The Logiweb vector consists of a bibliography, a
       dictionary, and a body, cf. logiweb(5). The bibliography consists of
       the references of all referenced pages, starting with reference zero
       (the reference of the page itself). The dictionary records the
       relationship between construct indexes and construct arities. The arity
       of a construct equals the number of lone quotes in the construct. The
       body is no more than the parse tree of the body expressed in Polish
       prefix.

       The codifier of the lgc compiler translates the vector to a rack. The
       renderer of the lgc compiler then translates the rack to a rendering.
       These translations have little to do with the lgs format.

       See the lgc Logiweb page or http://logiweb.eu/ for more.
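
       The relationship between construct indexes and arities can be sketched
       in Python. The names arity and build_dictionary are hypothetical, and
       the result is a plain Python dict, not the binary dictionary format of
       a Logiweb vector. Index 0 (the page name) is left out of the sketch.

```python
def arity(construct):
    """The arity of a construct is its number of lone quote placeholders."""
    return construct.count('"')


def build_dictionary(constructs):
    """Map construct indexes (starting at 1) to construct arities."""
    return {index: arity(construct)
            for index, construct in enumerate(constructs, start=1)}
```

       For the three constructs " square, " < ", and if " then " else ",
       defined in that order, the sketch assigns indexes 1, 2, and 3 the
       arities 1, 2, and 3.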

AUTHOR

       Klaus Grue, http://logiweb.eu/

SEE ALSO

       lgc(1), logiweb(5), http://logiweb.eu/



Logiweb                            JULY 2009                            LGC(5)