Boulder::Unigene(3)    User Contributed Perl Documentation    Boulder::Unigene(3)

NAME
Boulder::Unigene - Fetch Unigene data records as parsed Boulder Stones

SYNOPSIS
    use Boulder::Unigene;

    # parse a file of Unigene records
    my $ug = Boulder::Unigene->new(-accessor => 'File',
                                   -param    => '/data/unigene/Hs.dat');
    while (my $s = $ug->get) {
        print $s->Identifier, "\n";
        print $s->Gene, "\n";
    }

    # parse flatfile records yourself
    open(my $fh, '<', '/data/unigene/Hs.dat')
        or die "Cannot open /data/unigene/Hs.dat: $!";
    {
        local $/ = "*RECORD*";    # read one record per iteration
        while (my $text = <$fh>) {
            my $s = Boulder::Unigene->parse($text);
            # etc.
        }
    }
    close $fh;

DESCRIPTION
Boulder::Unigene provides retrieval and parsing services for NCBI
Unigene records. It returns Unigene entries in Stone format, allowing
easy access to the various fields and values. Boulder::Unigene is a
descendant of Boulder::Stream and provides a stream-like interface to
a series of Stone objects.

Access to Unigene is provided by a single accessor, which gives access
to a local Unigene database. When you create a new Boulder::Unigene
stream, you provide the accessor, along with accessor-specific
parameters that control which entries to fetch. The accessor is:

File
    This provides access to local Unigene entries by reading from a flat
    file (typically the Hs.dat file downloadable from NCBI's FTP site).
    The stream returns a Stone for each entry in the file, starting from
    the top of the file and working downward. The parameter is the path
    to the local file.

It is also possible to parse a single Unigene entry from a text string
stored in a scalar variable, returning a Stone object.
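
When records are read by hand, as in the second synopsis example, the
input can be cut into one chunk per record before each chunk is passed
to parse(). The following is a minimal sketch, not part of
Boulder::Unigene: split_records() is a hypothetical helper, and it
assumes that "*RECORD*" is the separator, as in the synopsis. Chunks are
returned unmodified so that parse() sees the same text it would see in
the synopsis loop.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Boulder::Unigene): read a filehandle in
# "*RECORD*"-delimited chunks, as the synopsis does, and return the chunks
# that hold more than the separator and whitespace.
sub split_records {
    my ($fh) = @_;
    local $/ = "*RECORD*";
    my @chunks;
    while (my $chunk = <$fh>) {
        (my $body = $chunk) =~ s/\Q*RECORD*\E\z//;   # look past the separator
        push @chunks, $chunk if $body =~ /\S/;       # keep the chunk as read
    }
    return @chunks;
}

# Run against a real file only when a path is given on the command line.
if (@ARGV) {
    require Boulder::Unigene;
    open(my $fh, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    for my $text (split_records($fh)) {
        my $s = Boulder::Unigene->parse($text) or next;
        print scalar($s->Identifier), "\n";
    }
    close $fh;
}
```

Skipping chunks that contain only the separator guards against an empty
first or last record, whichever side of the entry the separator sits on.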

Boulder::Unigene methods

This section lists the public methods that the Boulder::Unigene class
makes available.

new()
        # Local fetch via File
        $ug = Boulder::Unigene->new(-accessor => 'File',
                                    -param    => '/data/unigene/Hs.dat');

    The new() method creates a new Boulder::Unigene stream on the
    accessor provided. The only available accessor is File. If
    successful, the method returns the stream object. Otherwise it
    returns undef.

    new() takes the following arguments:

        -accessor  Name of the accessor to use
        -param     Parameters to pass to the accessor

    Specify the accessor to use with the -accessor argument. If not
    specified, it defaults to File.

    -param is an accessor-specific argument. The only possibility is:

    For File, the -param argument must be a string-valued scalar, which
    will be interpreted as the path to the file to read Unigene entries
    from.

get()
    The get() method is inherited from Boulder::Stream, and simply
    returns the next parsed Unigene Stone, or undef if there is nothing
    more to fetch. It has the same semantics as the parent class,
    including the ability to restrict access to certain top-level tags.
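
The calling pattern described for new() and get() can be sketched as
follows. This is an illustration under stated assumptions:
first_identifiers() is a hypothetical helper, the file path is taken
from the command line, and the Identifier tag is read in scalar context
on the assumption that a Stone then returns a single value.

```perl
use strict;
use warnings;

# Hypothetical helper: pull up to $limit entries from any object that has
# a get() method and return their Identifier values.
sub first_identifiers {
    my ($stream, $limit) = @_;
    my @ids;
    while (@ids < $limit and defined(my $s = $stream->get)) {
        my $id = $s->Identifier;    # scalar context: one value expected
        push @ids, $id;
    }
    return @ids;
}

# Run against a real file only when a path is given on the command line.
if (@ARGV) {
    require Boulder::Unigene;
    # new() returns undef on failure, so check before calling get()
    my $ug = Boulder::Unigene->new(-accessor => 'File', -param => $ARGV[0])
        or die "Could not open a Unigene stream on $ARGV[0]\n";
    print "$_\n" for first_identifiers($ug, 10);
}
```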

put()
    The put() method is inherited from the parent Boulder::Stream class,
    and will write the passed Stone to standard output in Boulder
    format. This means that it is currently not possible to write a
    Boulder::Unigene object back into Unigene flatfile form.
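
Because put() writes Boulder format to standard output, a stream can act
as a simple filter. The following is a minimal sketch, assuming a
hypothetical helper put_matching() and a caller-supplied test; in the
usage shown, the test keeps entries that carry a Cytoband tag.

```perl
use strict;
use warnings;

# Hypothetical helper: read every entry from $stream and pass the ones
# accepted by $keep back to put(). Returns the number of entries written.
sub put_matching {
    my ($stream, $keep) = @_;
    my $written = 0;
    while (defined(my $s = $stream->get)) {
        next unless $keep->($s);
        $stream->put($s);           # Boulder format, on standard output
        $written++;
    }
    return $written;
}

# Run against a real file only when a path is given on the command line.
if (@ARGV) {
    require Boulder::Unigene;
    my $ug = Boulder::Unigene->new(-accessor => 'File', -param => $ARGV[0])
        or die "Could not open a Unigene stream on $ARGV[0]\n";
    my $n = put_matching($ug, sub { my $band = $_[0]->Cytoband; defined $band });
    print STDERR "$n entries written\n";
}
```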

OUTPUT TAGS
The tags returned by the parsing operation are taken from the names
shown in the flat file Hs.dat, since the database producer does not yet
provide a better description of them.

Top-Level Tags

These are tags that appear at the top level of the parsed Unigene
entry.

Identifier
    The Unigene identifier of this entry. Identifier is a single-value
    tag.

    Example:

        my $identifierNo = $s->Identifier;

Title
    The Unigene title for this entry.

    Example:

        my $titledef = $s->Title;

Gene
    The gene associated with this Unigene entry.

    Example:

        my $thegene = $s->Gene;

Cytoband
    The cytological band position of this entry.

    Example:

        my $thecytoband = $s->Cytoband;

Counts
    The number of ESTs in this record.

    Example:

        my $thecounts = $s->Counts;

LocusLink
    The ID of the LocusLink entry associated with this record.

    Example:

        my $thelocuslink = $s->LocusLink;

Chromosome
    A list of the chromosome numbers to which this entry has been
    linked.

    Example:

        my @theChromosome = $s->Chromosome;

STS
    Multiple records in the form ^STS ACC=XXXXXX NAME=YYYYYY. Each
    record contains the following subtags:

        ACC
        NAME
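
Multi-record tags such as STS can be walked by fetching the tag in list
context and reading the subtags of each record. The following is a
minimal sketch, assuming that each STS value is itself a Stone whose ACC
and NAME subtags are reachable as methods in the same way as the
top-level tags; sts_pairs() is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: return one [ACC, NAME] pair per STS record.
sub sts_pairs {
    my ($s) = @_;
    my @pairs;
    for my $sts ($s->STS) {         # list context: every STS record
        my $acc  = $sts->ACC;       # scalar context: one value expected
        my $name = $sts->NAME;
        push @pairs, [$acc, $name];
    }
    return @pairs;
}

# Run against a real file only when a path is given on the command line.
if (@ARGV) {
    require Boulder::Unigene;
    my $ug = Boulder::Unigene->new(-accessor => 'File', -param => $ARGV[0])
        or die "Could not open a Unigene stream on $ARGV[0]\n";
    while (defined(my $s = $ug->get)) {
        my $id = $s->Identifier;
        for my $pair (sts_pairs($s)) {
            print join("\t", $id, map { $_ // '' } @$pair), "\n";
        }
    }
}
```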

TXMAP
    Multiple records in the form ^TXMAP XXXXXXX; MARKER=YYYYY;
    RHPANEL=ZZZZ

    The TXMAP tag points to a Stone record that contains multiple
    subtags. Each subtag is the name of a feature which points, in turn,
    to a Stone that describes the feature's location and other
    attributes.

    Each feature will contain one or more of the following subtags:

        MARKER
        RHPANEL

PROTSIM
    Multiple records in the form ^PROTSIM ORG=XXX; PROTID=DBID:YYY;
    PCT=ZZZ; ALN=QQQQ, where DBID is PID to indicate a GenPept
    identifier, SP to indicate a SWISSPROT identifier, PIR to indicate a
    PIR identifier, or PRF to indicate ???

        ORG
        PROTID
        PCT
        ALN
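
The DBID:YYY layout of a PROTID value can be taken apart with a plain
split. The following is a minimal sketch in core Perl, independent of
Boulder::Unigene: split_protid() is a hypothetical helper, and the
sample value 'SP:YYY' uses the placeholder from the record form above
rather than a real identifier.

```perl
use strict;
use warnings;

# Database prefixes named in the PROTSIM description. PRF is listed there
# without an explanation, so it is left unlabeled here.
my %PROTID_DB = (
    PID => 'GenPept',
    SP  => 'SWISSPROT',
    PIR => 'PIR',
);

# Hypothetical helper: split a PROTID value of the form DBID:YYY.
# Returns (prefix, identifier, database name or undef if unlabeled).
sub split_protid {
    my ($protid) = @_;
    my ($db, $id) = split /:/, $protid, 2;
    return ($db, $id, $PROTID_DB{$db});
}

my ($db, $id, $label) = split_protid('SP:YYY');   # 'YYY' is a placeholder
print "$db $id ", ($label // 'unknown'), "\n";
```

The limit of 2 on split keeps any further colons inside the identifier
part instead of dropping them.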

SEQUENCE
    Multiple records in the form ^SEQUENCE ACC=XXX; NID=YYYY; PID=
    CLONE= END= LID=

        ACC
        NID
        PID
        CLONE
        END
        LID
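
Subtags can also be read by name through the Stone get() method, which
sidesteps any doubt about writing the END subtag as a method call. The
following is a minimal sketch, assuming that each SEQUENCE value is a
Stone that answers get() with a single value in scalar context;
sequence_rows() is a hypothetical helper.

```perl
use strict;
use warnings;

my @SEQUENCE_SUBTAGS = qw(ACC NID PID CLONE END LID);

# Hypothetical helper: return one hash reference per SEQUENCE record,
# holding only the subtags that are present in that record.
sub sequence_rows {
    my ($s) = @_;
    my @rows;
    for my $seq ($s->get('SEQUENCE')) {
        next unless ref $seq;       # tag absent or not a nested record
        my %row;
        for my $tag (@SEQUENCE_SUBTAGS) {
            my $value = $seq->get($tag);
            $row{$tag} = $value if defined $value;
        }
        push @rows, \%row;
    }
    return @rows;
}

# Run against a real file only when a path is given on the command line.
if (@ARGV) {
    require Boulder::Unigene;
    my $ug = Boulder::Unigene->new(-accessor => 'File', -param => $ARGV[0])
        or die "Could not open a Unigene stream on $ARGV[0]\n";
    while (defined(my $s = $ug->get)) {
        for my $row (sequence_rows($s)) {
            print join('; ', map  { "$_=$row->{$_}" }
                             grep { exists $row->{$_} } @SEQUENCE_SUBTAGS), "\n";
        }
    }
}
```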

SEE ALSO
Boulder, Boulder::Blast, Boulder::Genbank

AUTHOR
Lincoln Stein <lstein@cshl.org>. Luca I.G. Toldo <luca.toldo@merck.de>

Copyright (c) 1997 Lincoln D. Stein
Copyright (c) 1999 Luca I.G. Toldo

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself. See DISCLAIMER.txt for disclaimers
of warranty.

perl v5.8.8    2000-06-08    Boulder::Unigene(3)