ZEBRASRV(8)                        Commands                        ZEBRASRV(8)

NAME

       zebrasrv - Zebra Server

SYNOPSIS

       zebrasrv [-install] [-installa] [-remove] [-a file] [-v level]
                [-l file] [-u uid] [-c config] [-f vconfig] [-C fname]
                [-t minutes] [-k kilobytes] [-d daemon] [-w dir] [-p pidfile]
                [-ziDST1] [listener-spec...]

DESCRIPTION

       Zebra is a high-performance, general-purpose structured text indexing
       and retrieval engine. It reads structured records in a variety of input
       formats (e.g. email, XML, MARC) and allows access to them through exact
       boolean search expressions and relevance-ranked free-text queries.

       zebrasrv is the Z39.50 and SRU frontend server for the Zebra search
       engine and indexer.

       On Unix you can run the zebrasrv server from the command line and put
       it in the background. It may also operate under the inet daemon. On
       WIN32 you can run the server as a console application or as a WIN32
       Service.

OPTIONS

       The options for zebrasrv are the same as those for YAZ' yaz-ztest.
       Option -c specifies a Zebra configuration file; if omitted, zebra.cfg
       is read.

       -a file
           Specify a file for dumping PDUs (for diagnostic purposes). The
           special name - (dash) sends output to stderr.

       -S
           Don't fork or make threads on connection requests. This is good for
           debugging, but not recommended for real operation: although the
           server is asynchronous and non-blocking, it can be nice to keep a
           software malfunction (okay then, a crash) from affecting all
           current users. The server can only accept a single connection in
           this mode.

       -1
           Like -S, but the server exits after one session. This mode is for
           debugging only.

       -T
           Operate the server in threaded mode. The server creates a thread
           for each connection rather than forking a process. Only available
           on UNIX systems that offer POSIX threads.

       -s
           Use the SR protocol (obsolete).

       -z
           Use the Z39.50 protocol (default). This option and -s complement
           each other. You can use both multiple times on the same command
           line, between listener specifications (see below). This way, you
           can set up the server to listen for connections in both protocols
           concurrently, on different local ports.

       -l file
           Specify an output file for the diagnostic messages. The default is
           to write this information to stderr.

       -c config-file
           Read configuration information from config-file. The default
           configuration is ./zebra.cfg.

       -f vconfig
           This specifies an XML file that describes one or more YAZ frontend
           virtual servers. See the section YAZ SERVER VIRTUAL HOSTS for
           details.

       -C fname
           Sets the SSL certificate file name for the server (PEM).

       -v level
           The log level. Use a comma-separated list of members of the set
           {fatal,debug,warn,log,malloc,all,none}.

       -u uid
           Set user ID. Sets the real UID of the server process to that of the
           given user. It's useful if you aren't comfortable with having the
           server run as root, but you need to start it as such to bind a
           privileged port.

       -w working-directory
           The server changes to this working directory before listening for
           incoming connections. This option is useful when the server is
           operating from the inetd daemon (see -i).

       -p pidfile
           Specifies that the server should write its process ID to the file
           given by pidfile. A typical location would be
           /var/run/zebrasrv.pid.

       -i
           Use this to make the server run from the inetd server (UNIX
           only). Make sure you use the logfile option -l in conjunction with
           this mode and specify the -l option before any other options.

       -D
           Use this to make the server put itself in the background and run as
           a daemon. If neither -i nor -D is given, the server starts in the
           foreground.

       -install
           Use this to install the server as an NT service (Windows NT/2000/XP
           only). Control the server by going to the Services in the Control
           Panel.

       -installa
           Use this to install and activate the server as an NT service
           (Windows NT/2000/XP only). Control the server by going to the
           Services in the Control Panel.

       -remove
           Use this to remove the server from the NT services (Windows
           NT/2000/XP only).

       -t minutes
           Idle session timeout, in minutes. Default is 60 minutes.

       -k size
           Maximum record size/message size, in kilobytes. Default is 1024 KB
           (1 MB).

       -d daemon
           Set name of daemon to be used in hosts access file. See
           hosts_access(5) and tcpd(8).

       A listener specification consists of an optional transport mode
       followed by a colon (:) and a listener address. The transport mode is
       one of unix (a file system socket), ssl (an SSL TCP/IP socket), or tcp
       (a plain TCP/IP socket, the default).

       For TCP, an address has the form

               hostname | IP-number [: portnumber]

       The port number defaults to 210 (standard Z39.50 port) for privileged
       users (root), and 9999 for normal users. The special hostname "@" is
       mapped to the address INADDR_ANY, which causes the server to listen on
       any local interface.

       The default behavior for zebrasrv, if started as a non-privileged
       user, is to establish a single TCP/IP listener, for the Z39.50
       protocol, on port 9999.

               zebrasrv @
               zebrasrv tcp:some.server.name.org:1234
               zebrasrv ssl:@:3000

       To start the server listening on the registered port for Z39.50, or on
       a filesystem socket, and to drop root privileges once the ports are
       bound, execute the server like this from a root shell:

               zebrasrv -u daemon @
               zebrasrv -u daemon tcp:@:210
               zebrasrv -u daemon unix:/some/file/system/socket

       Here daemon is an existing user account, and the unix socket
       /some/file/system/socket is readable and writable for the daemon
       account.

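       To make the listener rules above concrete, here is a minimal Python
       sketch that splits a listener specification into transport mode,
       address and port, applying the documented port defaults. It is an
       illustration only, not the parser zebrasrv or YAZ actually uses; the
       names Listener and parse_listener are hypothetical, and IPv6 literal
       addresses are deliberately not handled.

```python
from typing import NamedTuple, Optional

# Transport modes named in the text above.
KNOWN_MODES = ("tcp", "ssl", "unix")


class Listener(NamedTuple):
    mode: str
    address: str
    port: Optional[int]  # None for unix sockets


def parse_listener(spec: str, privileged: bool = False) -> Listener:
    """Split a listener specification into mode, address and port.

    Hypothetical helper for illustration; not part of zebrasrv or YAZ.
    """
    mode, sep, rest = spec.partition(":")
    if not sep or mode not in KNOWN_MODES:
        # No recognised mode prefix: treat the whole string as a TCP address.
        mode, rest = "tcp", spec
    if mode == "unix":
        # The remainder is a file system path; there is no port.
        return Listener(mode, rest, None)
    host, sep, port = rest.partition(":")
    if sep:
        return Listener(mode, host, int(port))
    # Documented defaults: 210 for privileged users, 9999 otherwise.
    return Listener(mode, host, 210 if privileged else 9999)


for example in ("@", "tcp:some.server.name.org:1234", "ssl:@:3000",
                "unix:/some/file/system/socket"):
    print(example, "->", parse_listener(example))
```

       Passing privileged=True models a server started by root, in which case
       a bare "@" resolves to port 210 instead of 9999.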

Z39.50 PROTOCOL SUPPORT AND BEHAVIOR

   Z39.50 Initialization
       During initialization, the server will negotiate to version 3 of the
       Z39.50 protocol, and the option bits for Search, Present, Scan,
       NamedResultSets, and concurrentOperations will be set, if requested by
       the client. The maximum PDU size is negotiated down to a maximum of 1
       MB by default.

   Z39.50 Search
       The supported query types are 1 and 101. All operators are currently
       supported, with the restriction that only proximity units of type
       "word" are supported for the proximity operator. Queries can be
       arbitrarily complex. Named result sets are supported, and result sets
       can be used as operands without limitations. Searches may span
       multiple databases.

       The server has full support for piggy-backed retrieval (see also the
       following section).

   Z39.50 Present
       The present facility is supported in a standard fashion. The requested
       record syntax is matched against the ones supported by the profile of
       each record retrieved. If no record syntax is given, SUTRS is the
       default. The requested element set name, again, is matched against any
       provided by the relevant record profiles.

   Z39.50 Scan
       The attribute combinations provided with the termListAndStartPoint are
       processed in the same way as operands in a query (see above).
       Currently, only the term and the globalOccurrences are returned with
       the termInfo structure.

   Z39.50 Sort
       Z39.50 specifies three different types of sort criteria. Of these,
       Zebra supports the attribute specification type, in which case the use
       attribute specifies the "Sort register". Sort registers are created for
       those fields that are of type "sort" in the default.idx file. The
       corresponding character mapping file in default.idx specifies the
       ordinal of each character used in the actual sort.

       Z39.50 allows the client to specify sorting on one or more input result
       sets and one output result set. Zebra supports sorting on one result
       set only, which may or may not be the same as the output result set.

   Z39.50 Close
       If a Close PDU is received, the server will respond with a Close PDU
       with reason=FINISHED, no matter which protocol version was negotiated
       during initialization. If the protocol version is 3 or more, the server
       will generate a Close PDU under certain circumstances, including a
       session timeout (60 minutes by default), and certain kinds of protocol
       errors. Once a Close PDU has been sent, the protocol association is
       considered broken, and the transport connection will be closed
       immediately upon receipt of further data, or following a short timeout.

   Z39.50 Explain
       Zebra maintains a "classic" Z39.50 Explain[1] database on the side.
       This database is called IR-Explain-1 and can be searched using the
       attribute set exp-1.

       The records in the explain database are of type grs.sgml. The root
       element for the Explain grs.sgml records is explain, thus explain.abs
       is used for indexing.

           Note
           Zebra must be able to locate explain.abs in order to index the
           Explain records properly. Zebra will work without it, but the
           information will not be searchable.

THE SRU SERVER

       In addition to Z39.50, Zebra supports the more recent and web-friendly
       IR protocol SRU[2]. SRU can be carried over SOAP or a REST-like
       protocol that uses HTTP GET or POST to request search responses. The
       request itself is made of parameters such as query, startRecord,
       maximumRecords and recordSchema; the response is an XML document
       containing hit-count, result-set records, diagnostics, etc. SRU can be
       thought of as a re-casting of Z39.50 semantics in web-friendly terms;
       or as a standardisation of the ad-hoc query parameters used by search
       engines such as Google and AltaVista; or as a superset of A9's
       OpenSearch (which it predates).

       Zebra supports Z39.50, SRU GET, SRU POST and SRU SOAP (SRW) on the same
       port, recognising which protocol is used by each incoming request and
       handling it accordingly. This is achieved through the use of Deep
       Magic; civilians are warned not to stand too close.

   Running zebrasrv as an SRU Server
       Because Zebra supports all protocols on one port, it would seem to
       follow that the SRU server is run in the same way as the Z39.50 server,
       as described above. This is true, but only in an uninterestingly
       vacuous way: a Zebra server run in this manner will indeed recognise
       and accept SRU requests; but since it doesn't know how to handle the
       CQL queries that these protocols use, all it can do is send failure
       responses.

           Note
           It is possible to cheat, by having SRU search Zebra with a PQF
           query instead of CQL, using the x-pquery parameter instead of
           query. This is a non-standard extension of CQL, and a very naughty
           thing to do, but it does give you a way to see Zebra serving SRU
           ``right out of the box''. If you start your favourite Zebra server
           in the usual way, on port 9999, then you can send your web browser
           to:

                    http://localhost:9999/Default?version=1.1
                    &operation=searchRetrieve
                    &x-pquery=mineral
                    &startRecord=1
                    &maximumRecords=1

           This will display the XML-formatted SRU response that includes the
           first record in the result-set found by the query mineral. (For
           clarity, the SRU URL is shown here broken across lines, but the
           lines should be joined together to make a single-line URL for the
           browser to submit.)

       In order to turn on Zebra's support for CQL queries, it's necessary to
       have the YAZ generic front-end (which Zebra uses) translate them into
       the Z39.50 Type-1 query format that is used internally. To do this,
       the generic front-end's own configuration file must be used. See the
       section called “YAZ SERVER VIRTUAL HOSTS”; the salient point for SRU
       support is that zebrasrv must be started with the -f frontendConfigFile
       option rather than the -c zebraConfigFile option, and that the
       front-end configuration file must include references to both the Zebra
       configuration file and the CQL-to-PQF translator configuration file.

       A minimal front-end configuration file that does this would read as
       follows:

               <yazgfs>
                <server>
                 <config>zebra.cfg</config>
                 <cql2rpn>../../tab/pqf.properties</cql2rpn>
                </server>
               </yazgfs>

       The <config> element contains the name of the Zebra configuration file
       that was previously specified by the -c command-line argument, and the
       <cql2rpn> element contains the name of the CQL properties file
       specifying how various CQL indexes, relations, etc. are translated into
       Type-1 queries.

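       If the front-end file is generated by a deployment script rather than
       written by hand, the same two-element structure can be produced
       programmatically. The following Python sketch uses only the standard
       library; the function name minimal_frontend_config is hypothetical,
       and the two paths are simply the ones from the example above.

```python
import xml.etree.ElementTree as ET


def minimal_frontend_config(zebra_cfg: str, cql2rpn: str) -> str:
    """Return a yazgfs document with one server, as in the example above.

    Hypothetical helper for illustration; not shipped with Zebra or YAZ.
    """
    root = ET.Element("yazgfs")
    server = ET.SubElement(root, "server")
    # <config> names the Zebra configuration file (what -c used to specify).
    ET.SubElement(server, "config").text = zebra_cfg
    # <cql2rpn> names the CQL-to-Type-1 translation properties file.
    ET.SubElement(server, "cql2rpn").text = cql2rpn
    return ET.tostring(root, encoding="unicode")


print(minimal_frontend_config("zebra.cfg", "../../tab/pqf.properties"))
```

       The returned string can be written to a file of your choosing and
       passed to zebrasrv with the -f option.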
       A Zebra server running with such a configuration can then be queried
       using proper, conformant SRU URLs with CQL queries:

               http://localhost:9999/Default?version=1.1
               &operation=searchRetrieve
               &query=title=utah and description=epicent*
               &startRecord=1
               &maximumRecords=1

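       The URL above is shown unencoded and split across lines for
       readability. When a request is built by a program, the CQL query needs
       percent-encoding. A minimal Python sketch, assuming the server from the
       example is listening on localhost:9999 (the helper name sru_url is
       hypothetical):

```python
from urllib.parse import quote, urlencode


def sru_url(base: str, **params: str) -> str:
    """Join SRU parameters onto a base URL as a single-line GET request.

    Hypothetical helper for illustration only.
    """
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return base + "?" + urlencode(params, quote_via=quote)


url = sru_url(
    "http://localhost:9999/Default",
    version="1.1",
    operation="searchRetrieve",
    query="title=utah and description=epicent*",
    startRecord="1",
    maximumRecords="1",
)
print(url)
```

       The same helper works for the x-pquery variant: Python keyword
       arguments cannot contain a hyphen, so pass such parameters with
       dictionary unpacking, for example sru_url(base, **{"x-pquery":
       "mineral"}).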

SRU PROTOCOL SUPPORT AND BEHAVIOR

       Zebra running as an SRU server supports SRU version 1.1, including CQL
       version 1.1. In particular, it provides support for the following
       elements of the protocol.

   SRU Search and Retrieval
       Zebra supports the searchRetrieve operation.

       One of the great strengths of SRU is that it mandates a standard query
       language, CQL, and that all conforming implementations can therefore be
       trusted to correctly interpret the same queries. It is with some shame,
       then, that we admit that Zebra also supports an additional query
       language, our own Prefix Query Format (PQF[3]). A PQF query is
       submitted by using the extension parameter x-pquery, in which case the
       query parameter must be omitted. Please feel free to use this facility
       within your own applications; but be aware that such a request is not
       only non-standard SRU but not even syntactically valid, since it omits
       the mandatory query parameter.

   SRU Scan
       Zebra supports the scan operation. Scanning using CQL syntax is the
       default, where the standard scanClause parameter is used.

       In addition, a mutant form of SRU scan is supported, using the
       non-standard x-pScanClause parameter in place of the standard
       scanClause to scan on a PQF query clause.

   SRU Explain
       Zebra supports explain.

       The ZeeRex record explaining a database may be requested either with a
       fully fledged SRU request (with operation=explain and version-number
       specified) or with a simple HTTP GET at the server's basename. The
       ZeeRex record returned in response is the one embedded in the YAZ
       Frontend Server configuration file that is described in the section
       called “YAZ SERVER VIRTUAL HOSTS”.

       Unfortunately, the data found in the CQL-to-PQF text file must be added
       by hand to the explain section of the YAZ Frontend Server
       configuration file to be able to provide a suitable explain record. Too
       bad, but this is all extremely new alpha stuff, and a lot of work has
       yet to be done.

       There is no linkage whatsoever between the Z39.50 explain model and the
       SRU explain response (well, at least none implemented in Zebra, that
       is). Zebra does not provide a means of obtaining the ZeeRex record
       using Z39.50.

   Other SRU operations
       In the Z39.50 protocol, Initialization, Present, Sort and Close are
       separate operations. In SRU, however, these operations do not exist.

       ·   SRU has no explicit initialization handshake phase, but commences
           immediately with searching, scanning and explain operations.

       ·   Neither does SRU have a close operation, since the protocol is
           stateless and each request is self-contained. (It is true that
           multiple SRU request/response pairs may be implemented as multiple
           HTTP request/response pairs over a single persistent TCP/IP
           connection; but the closure of that connection is not a
           protocol-level operation.)

       ·   Retrieval in SRU is part of the searchRetrieve operation, in which
           a search is submitted and the response includes a subset of the
           records in the result set. There is no direct analogue of Z39.50's
           Present operation, which requests records from an established
           result set. In SRU, this is achieved by sending a subsequent
           searchRetrieve request with the query cql.resultSetId=id, where id
           is the identifier of the previously generated result-set.

       ·   Sorting in SRU is done within the searchRetrieve operation: in
           v1.1 by an explicit sort parameter, but the forthcoming v1.2 or
           v2.0 will most likely use an extension of the query language, CQL
           sorting[4].

       It can be seen, then, that while Zebra operating as an SRU server does
       not provide the same set of operations as when operating as a Z39.50
       server, it does provide equivalent functionality.

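       Because retrieval is folded into searchRetrieve, a client that wants
       to walk through a whole result set issues a series of requests that
       differ only in startRecord and maximumRecords. The following Python
       sketch computes those parameter pairs; it makes no network calls, and
       the name page_windows is hypothetical. Each pair would accompany either
       the original query or a cql.resultSetId=id query, as described in the
       list above.

```python
from typing import Iterator, Tuple


def page_windows(total_hits: int, page_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (startRecord, maximumRecords) pairs covering total_hits records.

    Hypothetical helper for illustration only.
    """
    if page_size < 1:
        raise ValueError("page_size must be positive")
    start = 1  # SRU record positions are 1-based
    while start <= total_hits:
        # The last window may be shorter than page_size.
        yield start, min(page_size, total_hits - start + 1)
        start += page_size


for start_record, maximum_records in page_windows(7, 3):
    print(f"startRecord={start_record}&maximumRecords={maximum_records}")
```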

SRU EXAMPLES

       Browse to http://localhost:9999 to get an explain response, or use

               http://localhost:9999/?version=1.1&operation=explain

       See the number of hits for a query:

               http://localhost:9999/?version=1.1&operation=searchRetrieve
               &query=text=(plant%20and%20soil)

       Fetch records 5-7 in Dublin Core format:

               http://localhost:9999/?version=1.1&operation=searchRetrieve
               &query=text=(plant%20and%20soil)
               &startRecord=5&maximumRecords=3&recordSchema=dc

       Even search with PQF queries, using the extended naughty parameter
       x-pquery:

               http://localhost:9999/?version=1.1&operation=searchRetrieve
               &x-pquery=@attr%201=text%20@and%20plant%20soil

       Or scan indexes using the extended extremely naughty parameter
       x-pScanClause:

               http://localhost:9999/?version=1.1&operation=scan
               &x-pScanClause=@attr%201=text%20something

       Don't do this in production code! But it's a great fast debugging aid.

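       For the hit-count example above, the number of interest sits in the
       numberOfRecords element of the XML response. A minimal Python sketch
       for extracting it follows. The SAMPLE document is hand-written for
       illustration and is not captured Zebra output; it assumes the SRU 1.1
       response namespace http://www.loc.gov/zing/srw/, and the function name
       hit_count is hypothetical.

```python
import xml.etree.ElementTree as ET

# SRU 1.1 response namespace (assumed; check against your server's output).
SRW_NS = {"zs": "http://www.loc.gov/zing/srw/"}

# Hand-written sample response, trimmed to the elements used here.
SAMPLE = """<zs:searchRetrieveResponse xmlns:zs="http://www.loc.gov/zing/srw/">
  <zs:version>1.1</zs:version>
  <zs:numberOfRecords>42</zs:numberOfRecords>
</zs:searchRetrieveResponse>"""


def hit_count(response_xml: str) -> int:
    """Return the hit count reported in a searchRetrieve response."""
    root = ET.fromstring(response_xml)
    node = root.find("zs:numberOfRecords", SRW_NS)
    if node is None or node.text is None:
        raise ValueError("no numberOfRecords element in response")
    return int(node.text)


print(hit_count(SAMPLE))
```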

YAZ SERVER VIRTUAL HOSTS

       The virtual hosts mechanism allows a YAZ frontend server to support
       multiple backends. A backend is selected on the basis of the TCP/IP
       binding (port+listening address) and/or the virtual host.

       A backend can be configured to execute in a particular working
       directory. The YAZ frontend may also perform CQL[5] to RPN conversion,
       thus allowing traditional Z39.50 backends to be offered as an SRU[2]
       service. SRU Explain information for a particular backend may also be
       specified.

       For the HTTP protocol, the virtual host is specified in the Host
       header. For the Z39.50 protocol, the virtual host is specified in the
       Initialize Request, in OtherInfo with OID 1.2.840.10003.10.1000.81.1.

           Note
           Not all Z39.50 clients allow the VHOST information to be set. For
           those, the selection of the backend must rely on the TCP/IP
           information alone (port and address).

       The YAZ frontend server uses XML to describe the backend
       configurations. Command-line option -f specifies the filename of the
       XML configuration.

       The configuration uses the root element yazgfs. This element includes a
       list of listen elements, followed by one or more server elements.

       The listen element describes a listener (transport end point), such as
       a TCP/IP, Unix file socket or SSL server. Content for a listener:

       CDATA (required)
           The CDATA for the listen element holds the listener string, such as
           tcp:@:210, tcp:server1:2100, etc.

       attribute id (optional)
           Identifier for this listener. This may be referred to from server
           sections.

           Note
           We expect more information to be added for the listen section in a
           future version, such as CERT file for SSL servers.

       The server element describes a server and the parameters for this
       server type. Content for a server:

       attribute id (optional)
           Identifier for this server. Currently not used for anything, but it
           might be for logging purposes.

       attribute listenref (optional)
           Specifies the listener for this server. If this attribute is not
           given, the server is accessible from all listeners. In order for
           the server to be used for real, however, the virtual host must
           match (if specified in the configuration).

       element config (optional)
           Specifies the server configuration. This is equivalent to the
           config specified using command line option -c.

       element directory (optional)
           Specifies a working directory for this backend server. If
           specified, the YAZ frontend changes current working directory to
           this directory whenever a backend of this type is started (backend
           handler bend_start), stopped (backend handler bend_stop) and
           initialized (bend_init).

       element host (optional)
           Specifies the virtual host for this server. If this is specified, a
           client must specify this host string in order to use this backend.

       element cql2rpn (optional)
           Specifies a filename that includes CQL[5] to RPN conversion for
           this backend server. See the CQL[5] section in the YAZ manual. If
           given, the backend server will only "see" a Type-1/RPN query.

       element explain (optional)
           Specifies SRU[2] ZeeRex content for this server, copied verbatim
           to the client. As things are now, some of the Explain content seems
           redundant because host information, etc. is also stored elsewhere.

           The format of the Explain record is described in detail, with
           examples, at the ZeeRex[6] web site.

       The XML below configures a server that accepts connections on two
       listeners: TCP/IP port 9900 and a local UNIX file socket. We name the
       TCP/IP listener public and the other listener internal.

            <yazgfs>
             <listen id="public">tcp:@:9900</listen>
             <listen id="internal">unix:/var/tmp/socket</listen>
             <server id="server1">
               <host>server1.mydomain</host>
               <directory>/var/www/s1</directory>
               <config>config.cfg</config>
             </server>
             <server id="server2">
               <host>server2.mydomain</host>
               <directory>/var/www/s2</directory>
               <config>config.cfg</config>
               <cql2rpn>../etc/pqf.properties</cql2rpn>
               <explain xmlns="http://explain.z3950.org/dtd/2.0/">
                 <serverInfo>
                   <host>server2.mydomain</host>
                   <port>9900</port>
                   <database>a</database>
                 </serverInfo>
               </explain>
             </server>
             <server id="server3" listenref="internal">
               <directory>/var/www/s3</directory>
               <config>config.cfg</config>
             </server>
            </yazgfs>

       There are three configured backend servers. The first two servers,
       "server1" and "server2", can be reached by both listener addresses,
       since no listenref attribute is specified. In order to distinguish
       between the two, a virtual host has been specified for each server in
       the host elements.

       For "server2", CQL[5] to RPN conversion is enabled and explain
       information has been added (a short one here to keep the example
       small).

       The third server, "server3", can only be reached via listener
       "internal".

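       The listenref rule described above (no attribute means every listener)
       can be checked mechanically before deploying a configuration. The
       Python sketch below reads a trimmed copy of the example and reports
       which listener strings each server is reachable on. It is an
       illustration of the rule as documented here, not the YAZ frontend's
       own logic; the name reachable_listeners is hypothetical, and a
       listenref naming an unknown listener simply raises KeyError.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example configuration above.
CONFIG = """
<yazgfs>
  <listen id="public">tcp:@:9900</listen>
  <listen id="internal">unix:/var/tmp/socket</listen>
  <server id="server1"><host>server1.mydomain</host></server>
  <server id="server2"><host>server2.mydomain</host></server>
  <server id="server3" listenref="internal"/>
</yazgfs>
"""


def reachable_listeners(config_xml: str) -> dict:
    """Map each server id to the listener strings it can be reached on."""
    root = ET.fromstring(config_xml.strip())
    listeners = {
        node.get("id"): (node.text or "").strip()
        for node in root.findall("listen")
    }
    result = {}
    for server in root.findall("server"):
        ref = server.get("listenref")
        # Without listenref the server is accessible from all listeners.
        ids = [ref] if ref is not None else list(listeners)
        result[server.get("id")] = [listeners[i] for i in ids]
    return result


for server_id, addresses in reachable_listeners(CONFIG).items():
    print(server_id, addresses)
```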

SEE ALSO

       zebraidx(1)

AUTHORS

       Index Data

NOTES

        1. Z39.50 Explain
           http://www.loc.gov/z3950/agency/markup/07.html

        2. SRU
           http://www.loc.gov/standards/sru/

        3. PQF
           http://www.indexdata.com/yaz/doc/tools.html#PQF

        4. CQL sorting
           http://zing.z3950.org/cql/sorting.html

        5. CQL
           http://www.loc.gov/standards/sru/cql/

        6. ZeeRex
           http://explain.z3950.org/

zebra 2.1.4                       08/01/2018                       ZEBRASRV(8)