Boulder::Unigene(3)   User Contributed Perl Documentation   Boulder::Unigene(3)

NAME
    Boulder::Unigene - Fetch Unigene data records as parsed Boulder Stones

SYNOPSIS
        use Boulder::Unigene;

        # parse a file of Unigene records
        $ug = Boulder::Unigene->new(-accessor => 'File',
                                    -param    => '/data/unigene/Hs.dat');
        while (my $s = $ug->get) {
            print $s->Identifier;
            print $s->Gene;
        }

        # parse flatfile records yourself
        open(UG, "/data/unigene/Hs.dat") or die "Cannot open Hs.dat: $!";
        local $/ = "*RECORD*";
        while (<UG>) {
            my $s = Boulder::Unigene->parse($_);
            # etc.
        }

DESCRIPTION
    Boulder::Unigene provides retrieval and parsing services for NCBI
    Unigene records. It returns Unigene entries in Stone format, allowing
    easy access to the various fields and values. Boulder::Unigene is a
    descendant of Boulder::Stream and provides a stream-like interface to
    a series of Stone objects.

    Access to Unigene is provided by a single accessor, which reads a
    local Unigene database. When you create a new Boulder::Unigene stream,
    you provide the accessor, along with accessor-specific parameters that
    control what entries to fetch. The accessor is:

    File
        This provides access to local Unigene entries by reading from a
        flat file (typically the Hs.dat file downloadable from NCBI's FTP
        site). The stream will return a Stone corresponding to each of the
        entries in the file, starting from the top of the file and working
        downward. The parameter is the path to the local file.

    It is also possible to parse a single Unigene entry from a text string
    stored in a scalar variable, returning a Stone object.
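
    As a minimal sketch of how such text strings can be obtained, the
    following reads an in-memory file in chunks by setting $/ to the
    "*RECORD*" separator used in the SYNOPSIS. The sample text and its
    ID and TITLE lines are made-up placeholders, not the real Hs.dat
    layout, and the call to parse() is only mentioned in a comment so
    that the sketch runs without the module installed.

```perl
use strict;
use warnings;

# Hypothetical two-record sample; real entries come from Hs.dat.
my $text = "*RECORD*\nID Hs.1\nTITLE first\n"
         . "*RECORD*\nID Hs.2\nTITLE second\n";

my @chunks;
{
    local $/ = "*RECORD*";    # same separator as in the SYNOPSIS
    open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
    while (my $chunk = <$fh>) {
        chomp $chunk;                  # chomp removes a trailing $/ string
        next unless $chunk =~ /\S/;    # skip the empty chunk before the first separator
        push @chunks, $chunk;
        # With the module installed, each chunk could be handed to
        # Boulder::Unigene->parse($chunk) to obtain a Stone.
    }
    close $fh;
}

printf "%d records\n", scalar @chunks;
```

    The chomp is a choice made for this sketch; the SYNOPSIS passes each
    chunk to parse() unchanged.
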

  Boulder::Unigene methods
    This section lists the public methods that the Boulder::Unigene class
    makes available.

    new()
            # Local fetch via File
            $ug = Boulder::Unigene->new(-accessor => 'File',
                                        -param    => '/data/unigene/Hs.dat');

        The new() method creates a new Boulder::Unigene stream on the
        accessor provided. The only available accessor is File. If
        successful, the method returns the stream object. Otherwise it
        returns undef.

        new() takes the following arguments:

            -accessor   Name of the accessor to use
            -param      Parameters to pass to the accessor

        Specify the accessor to use with the -accessor argument. If not
        specified, it defaults to File.

        -param is an accessor-specific argument. The possibilities are:

        For File, the -param argument must point to a string-valued
        scalar, which will be interpreted as the path to the file to read
        Unigene entries from.
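
        Because new() is documented to return undef on failure, callers
        may want to check the result before reading from the stream. The
        sketch below is one way to do that; the path is an assumed
        location, and the module is loaded at run time inside eval so
        that the example still runs where Boulder is not installed.

```perl
use strict;
use warnings;

# Assumed location of the flat file; adjust to the local installation.
my $path = '/data/unigene/Hs.dat';

# Load the module at run time so this sketch still runs where Boulder
# is not installed.
my $have_module = eval { require Boulder::Unigene; 1 };

my $ug;
if ($have_module && -r $path) {
    # -accessor could be omitted here, since it defaults to File.
    $ug = Boulder::Unigene->new(-accessor => 'File', -param => $path);
}

if (defined $ug) {
    # Walk the stream one Stone at a time.
    while (my $s = $ug->get) {
        print $s->Identifier, "\n";
    }
}
else {
    print "No Unigene stream available for $path\n";
}
```
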

    get()
        The get() method is inherited from Boulder::Stream, and simply
        returns the next parsed Unigene Stone, or undef if there is
        nothing more to fetch. It has the same semantics as the parent
        class, including the ability to restrict access to certain
        top-level tags.

    put()
        The put() method is inherited from the parent Boulder::Stream
        class, and will write the passed Stone to standard output in
        Boulder format. This means that it is currently not possible to
        write a Boulder::Unigene object back into Unigene flatfile form.

OUTPUT TAGS
    The tags returned by the parsing operation are taken from the names
    shown in the flat file Hs.dat, since the database producer does not
    yet provide a better description of them.

  Top-Level Tags
    These are tags that appear at the top level of the parsed Unigene
    entry.

    Identifier
        The Unigene identifier of this entry. Identifier is a single-value
        tag.

        Example:

            my $identifierNo = $s->Identifier;

    Title
        The Unigene title for this entry.

        Example:

            my $titledef = $s->Title;

    Gene
        The gene associated with this Unigene entry.

        Example:

            my $thegene = $s->Gene;

    Cytoband
        The cytological band position of this entry.

        Example:

            my $thecytoband = $s->Cytoband;

    Counts
        The number of ESTs in this record.

        Example:

            my $thecounts = $s->Counts;

    LocusLink
        The ID of the LocusLink entry associated with this record.

        Example:

            my $thelocuslink = $s->LocusLink;

    Chromosome
        This field contains a list of the chromosome numbers to which this
        entry has been linked.

        Example:

            my @theChromosome = $s->Chromosome;

    STS
        Multiple records in the form ^STS ACC=XXXXXX NAME=YYYYYY. Each
        record contains the following subtags:

            ACC
            NAME

    TXMAP
        Multiple records in the form ^TXMAP XXXXXXX; MARKER=YYYYY;
        RHPANEL=ZZZZ.

        The TXMAP tag points to a Stone record that contains multiple
        subtags. Each subtag is the name of a feature which points, in
        turn, to a Stone that describes the feature's location and other
        attributes.

        Each feature will contain one or more of the following subtags:

            MARKER
            RHPANEL

    PROTSIM
        Multiple records in the form ^PROTSIM ORG=XXX; PROTID=DBID:YYY;
        PCT=ZZZ; ALN=QQQQ, where DBID is PID to indicate a GenPept
        identifier, SP to indicate a SWISSPROT identifier, PIR to indicate
        a PIR identifier, and PRF to indicate ???. Each record contains
        the following subtags:

            ORG
            PROTID
            PCT
            ALN
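
        To make the PROTSIM layout concrete, here is a rough sketch that
        splits one line of the form shown above into its subtags and
        separates the DBID prefix from the identifier. parse_protsim and
        the DB and DB_ID keys are hypothetical names introduced for this
        illustration; this is not the module's own parser, and the
        placeholder values are the ones used in the description.

```perl
use strict;
use warnings;

# Database prefixes named in the PROTSIM description.
# PRF is left out because the text does not say what it denotes.
my %DB_NAME = (PID => 'GenPept', SP => 'SWISSPROT', PIR => 'PIR');

# Hypothetical helper: split one PROTSIM line into its subtags.
sub parse_protsim {
    my ($line) = @_;
    $line =~ s/^\^?PROTSIM\s+//;    # drop the leading tag, if present
    my %rec;
    for my $field (split /\s*;\s*/, $line) {
        # Keep only well-formed KEY=VALUE fields.
        my ($key, $value) = $field =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/ or next;
        $rec{$key} = $value;
    }
    if (defined $rec{PROTID} && $rec{PROTID} =~ /^(\w+):(.+)$/) {
        $rec{DB}    = $DB_NAME{$1} // $1;    # fall back to the raw prefix
        $rec{DB_ID} = $2;
    }
    return \%rec;
}

my $rec = parse_protsim('PROTSIM ORG=XXX; PROTID=SP:YYY; PCT=ZZZ; ALN=QQQQ');
print "$rec->{ORG} $rec->{DB} $rec->{DB_ID}\n";
```

        An unknown prefix such as PRF is passed through unchanged rather
        than guessed at.
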

    SEQUENCE
        Multiple records in the form ^SEQUENCE ACC=XXX; NID=YYYY; PID =
        CLONE= END= LID=. Each record contains the following subtags:

            ACC
            NID
            PID
            CLONE
            END
            LID

SEE ALSO
    Boulder, Boulder::Blast, Boulder::Genbank

AUTHOR
    Lincoln Stein <lstein@cshl.org>. Luca I.G. Toldo <luca.toldo@merck.de>

    Copyright (c) 1997 Lincoln D. Stein. Copyright (c) 1999 Luca I.G. Toldo.

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself. See DISCLAIMER.txt for
    disclaimers of warranty.

perl v5.32.1                      2021-01-26                Boulder::Unigene(3)