Boulder::Unigene(3)   User Contributed Perl Documentation  Boulder::Unigene(3)

NAME

       Boulder::Unigene - Fetch Unigene data records as parsed Boulder Stones

SYNOPSIS

         use Boulder::Unigene;

         # parse a file of Unigene records
         my $ug = Boulder::Unigene->new(-accessor => 'File',
                                        -param    => '/data/unigene/Hs.dat');
         while (my $s = $ug->get) {
           print $s->Identifier;
           print $s->Gene;
         }

         # parse flatfile records yourself
         open(UG, "/data/unigene/Hs.dat") or die "can't open Hs.dat: $!";
         local $/ = "*RECORD*";
         while (<UG>) {
            my $s = Boulder::Unigene->parse($_);
            # etc.
         }

DESCRIPTION

       Boulder::Unigene provides retrieval and parsing services for NCBI
       Unigene records.  It returns Unigene entries in Stone format, allowing
       easy access to the various fields and values.  Boulder::Unigene is a
       descendant of Boulder::Stream, and provides a stream-like interface to
       a series of Stone objects.

       Access to Unigene is provided by a single accessor, which gives access
       to a local Unigene database.  When you create a new Boulder::Unigene
       stream, you provide the accessor, along with accessor-specific
       parameters that control what entries to fetch.  The accessor is:

       File
         This provides access to local Unigene entries by reading from a flat
         file (typically the Hs.dat file downloadable from NCBI's FTP site).
         The stream will return a Stone corresponding to each of the entries
         in the file, starting from the top of the file and working downward.
         The parameter is the path to the local file.

       It is also possible to parse a single Unigene entry from a text string
       stored in a scalar variable, returning a Stone object.  This is done
       with the parse() method shown in the SYNOPSIS.

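       As a minimal sketch of working from a scalar rather than a file, the
       text can be cut into single-entry strings with Perl's input record
       separator before each piece is handed to parse().  The helper
       split_records() and the sample text are hypothetical and are not part
       of Boulder::Unigene; the "*RECORD*" separator is the one used in the
       SYNOPSIS, and each chunk keeps its trailing separator, as in the
       SYNOPSIS loop.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Boulder::Unigene): split a scalar that
# holds several flatfile entries into one string per entry.
sub split_records {
    my ($text, $separator) = @_;
    $separator = '*RECORD*' unless defined $separator;

    # Open an in-memory filehandle on the scalar (core Perl 5.8 and later).
    open(my $fh, '<', \$text) or die "cannot open in-memory handle: $!";

    local $/ = $separator;            # each <$fh> now returns one entry
    my @records;
    while (my $chunk = <$fh>) {
        next unless $chunk =~ /\S/;   # skip chunks that are only whitespace
        push @records, $chunk;        # separator is kept, as in the SYNOPSIS
    }
    close $fh;
    return @records;
}

# Made-up placeholder text, not real Unigene data.
my $text = "first entry\n*RECORD*second entry\n*RECORD*";
my @records = split_records($text);
print scalar(@records), " entries found\n";

# With the module installed, each element could then be parsed:
#   my $s = Boulder::Unigene->parse($records[0]);
```

       Reading through an in-memory filehandle keeps the splitting logic
       identical to the file-based loop, so the same code path can serve
       both cases.
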
       Boulder::Unigene methods

       This section lists the public methods that the Boulder::Unigene class
       makes available.

       new()
              # Local fetch via File
              my $ug = Boulder::Unigene->new(-accessor => 'File',
                                             -param    => '/data/unigene/Hs.dat');

           The new() method creates a new Boulder::Unigene stream on the
           accessor provided.  The only possible accessor is File.  If
           successful, the method returns the stream object.  Otherwise it
           returns undef.

           new() takes the following arguments:

                   -accessor       Name of the accessor to use
                   -param          Parameters to pass to the accessor

           Specify the accessor to use with the -accessor argument.  If not
           specified, it defaults to File.

           -param is an accessor-specific argument.  The only possibility is:

           For File, the -param argument must be a string-valued scalar, which
           will be interpreted as the path to the file to read Unigene entries
           from.

       get()
           The get() method is inherited from Boulder::Stream, and simply
           returns the next parsed Unigene Stone, or undef if there is nothing
           more to fetch.  It has the same semantics as the parent class,
           including the ability to restrict access to certain top-level tags.

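           The "call get() until it returns undef" contract described above
           can be sketched as a small draining loop.  The helper
           collect_from_stream() and the My::FakeStream class are
           hypothetical: the stand-in class exists only so the sketch runs
           without a Unigene file, and merely mimics a get() that returns
           undef once the data is exhausted.

```perl
use strict;
use warnings;

# Hypothetical helper: call get() until it returns undef and collect
# whatever $callback extracts from each record.
sub collect_from_stream {
    my ($stream, $callback) = @_;
    my @results;
    while (defined(my $record = $stream->get)) {
        push @results, $callback->($record);
    }
    return @results;
}

# Stand-in stream, NOT part of Boulder.  It hands back its items one by
# one and then undef, which is all the loop above relies on.
package My::FakeStream;
sub new { my ($class, @items) = @_; return bless { items => [@items] }, $class }
sub get { my ($self) = @_; return shift @{ $self->{items} } }

package main;

my $stream = My::FakeStream->new('first', 'second', 'third');
my @seen   = collect_from_stream($stream, sub { return $_[0] });
print join(',', @seen), "\n";
```

           With a real Boulder::Unigene stream in place of the stand-in, the
           callback could be, for instance, sub { $_[0]->Identifier }.
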
       put()
           The put() method is inherited from the parent Boulder::Stream
           class, and will write the passed Stone to standard output in
           Boulder format.  This means that it is currently not possible to
           write a Boulder::Unigene object back into Unigene flatfile form.

OUTPUT TAGS

       The tags returned by the parsing operation are taken from the names
       used in the flat file Hs.dat, since the database producer does not yet
       provide a better description of them.

       Top-Level Tags

       These are tags that appear at the top level of the parsed Unigene
       entry.

       Identifier
           The Unigene identifier of this entry.  Identifier is a single-value
           tag.

           Example:
                 my $identifierNo = $s->Identifier;

       Title
           The Unigene title for this entry.

           Example:
                 my $titledef = $s->Title;

       Gene
           The gene associated with this Unigene entry.

           Example:
                 my $thegene = $s->Gene;

       Cytoband
           The cytological band position of this entry.

           Example:
                 my $thecytoband = $s->Cytoband;

       Counts
           The number of ESTs in this record.

           Example:
                 my $thecounts = $s->Counts;

       LocusLink
           The ID of the LocusLink entry associated with this record.

           Example:
                 my $thelocuslink = $s->LocusLink;

       Chromosome
           A list of the chromosome numbers to which this entry has been
           linked.

           Example:
                 my @theChromosome = $s->Chromosome;

       STS
           Multiple records in the form

                 ^STS     ACC=XXXXXX NAME=YYYYYY

           Subtags:

           ACC
           NAME

       TXMAP
           Multiple records in the form

                 ^TXMAP  XXXXXXX; MARKER=YYYYY; RHPANEL=ZZZZ

           The TXMAP tag points to a Stone record that contains multiple
           subtags.  Each subtag is the name of a feature which points, in
           turn, to a Stone that describes the feature's location and other
           attributes.

           Each feature will contain one or more of the following subtags:

           MARKER
           RHPANEL

       PROTSIM
           Multiple records in the form

                 ^PROTSIM ORG=XXX; PROTID=DBID:YYY; PCT=ZZZ; ALN=QQQQ

           where DBID is PID to indicate a GenPept identifier, SP to indicate
           a SWISSPROT identifier, PIR to indicate a PIR identifier, or PRF to
           indicate ???

           Subtags:

           ORG
           PROTID
           PCT
           ALN

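           To make the PROTSIM layout concrete, the following sketch splits
           the "KEY=VALUE; KEY=VALUE" part of such a line into a hash and
           separates PROTID into its DBID prefix and identifier.  It only
           illustrates the flatfile format described above and is not the
           parser that Boulder::Unigene uses; the helper
           parse_protsim_fields() and the PROTID_DB and PROTID_ID keys are
           hypothetical names, and the sample values are the placeholders
           from the format line.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "KEY=VALUE; KEY=VALUE" into a hash reference.
sub parse_protsim_fields {
    my ($fields) = @_;
    my %subtag;
    for my $pair (split /\s*;\s*/, $fields) {
        # Skip anything that is not of the form KEY=VALUE.
        my ($key, $value) = $pair =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/ or next;
        $subtag{$key} = $value;
    }

    # PROTID has the form DBID:identifier; split on the first colon only.
    if (defined $subtag{PROTID}) {
        my ($dbid, $id) = split /:/, $subtag{PROTID}, 2;
        $subtag{PROTID_DB} = $dbid;   # e.g. PID, SP, PIR or PRF
        $subtag{PROTID_ID} = $id;
    }
    return \%subtag;
}

# Placeholder values taken from the format description, with DBID set to SP.
my $fields = parse_protsim_fields('ORG=XXX; PROTID=SP:YYY; PCT=ZZZ; ALN=QQQQ');
print "$fields->{PROTID_DB} $fields->{PROTID_ID}\n";
```
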
       SEQUENCE
           Multiple records in the form

                 ^SEQUENCE ACC=XXX; NID=YYYY; PID= CLONE= END= LID=

           Subtags:

           ACC
           NID
           PID
           CLONE
           END
           LID

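           Because tags such as STS and SEQUENCE occur several times per
           entry, it can help to see how the repeated lines might be gathered
           from raw entry text.  The sketch below assumes that the leading
           "^" in the format descriptions denotes the start of a line; the
           helper tag_lines() is hypothetical, the sample text is built from
           the placeholder forms shown above, and none of it reflects how
           Boulder::Unigene itself is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: return the payload of every line in $record that
# starts with $tag followed by whitespace.
sub tag_lines {
    my ($record, $tag) = @_;
    my @payloads;
    for my $line (split /\n/, $record) {
        # \Q...\E quotes the tag name; (.*\S) drops trailing whitespace.
        push @payloads, $1 if $line =~ /^\Q$tag\E\s+(.*\S)/;
    }
    return @payloads;
}

# Placeholder entry text, not real Unigene data.
my $record = join "\n",
    'SEQUENCE    ACC=XXX; NID=YYYY',
    'STS         ACC=XXXXXX NAME=YYYYYY',
    'SEQUENCE    ACC=AAA; NID=BBBB',
    '';

my @sequences = tag_lines($record, 'SEQUENCE');
print scalar(@sequences), " SEQUENCE lines\n";
```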

SEE ALSO

       Boulder, Boulder::Blast, Boulder::Genbank

AUTHOR

       Lincoln Stein <lstein@cshl.org>.  Luca I.G. Toldo <luca.toldo@merck.de>

       Copyright (c) 1997 Lincoln D. Stein
       Copyright (c) 1999 Luca I.G. Toldo

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.  See DISCLAIMER.txt for
       disclaimers of warranty.

perl v5.8.8                       2000-06-08               Boulder::Unigene(3)