Boulder::Unigene(3)   User Contributed Perl Documentation  Boulder::Unigene(3)

NAME

       Boulder::Unigene - Fetch Unigene data records as parsed Boulder Stones

SYNOPSIS

         use Boulder::Unigene;

         # parse a file of Unigene records
         my $ug = Boulder::Unigene->new(-accessor => 'File',
                                        -param    => '/data/unigene/Hs.dat');
         while (my $s = $ug->get) {
           print $s->Identifier, "\n";
           print $s->Gene, "\n";
         }

         # parse flatfile records yourself
         open(my $fh, '<', '/data/unigene/Hs.dat')
           or die "Cannot open /data/unigene/Hs.dat: $!";
         local $/ = "*RECORD*";    # read one record at a time
         while (<$fh>) {
            my $s = Boulder::Unigene->parse($_);
            # etc.
         }
         close $fh;

DESCRIPTION

       Boulder::Unigene provides retrieval and parsing services for NCBI
       Unigene records.  It returns Unigene entries in Stone format, allowing
       easy access to the various fields and values.  Boulder::Unigene is a
       descendant of Boulder::Stream, and provides a stream-like interface to
       a series of Stone objects.

       Access to Unigene is provided by a single accessor, which reads a
       local Unigene database.  When you create a new Boulder::Unigene stream,
       you provide the accessor, along with accessor-specific parameters that
       control what entries to fetch.  The accessor is:

       File
         This provides access to local Unigene entries by reading from a flat
         file (typically the Hs.dat file downloadable from NCBI's FTP site).
         The stream will return a Stone corresponding to each of the entries
         in the file, starting from the top of the file and working downward.
         The parameter is the path to the local file.

       It is also possible to parse a single Unigene entry from a text string
       stored in a scalar variable, returning a Stone object.

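       Before a single entry can be handed to the parser as a string, the
       flat file has to be cut into one string per record.  The following is
       a minimal sketch of that splitting step only, using core Perl and an
       in-memory filehandle.  The helper name split_records and the sample
       text are made up for illustration; the "*RECORD*" separator is the one
       shown in the SYNOPSIS.

```perl
use strict;
use warnings;

# Placeholder text standing in for two flat-file entries (not real data).
my $text = "first entry text\n*RECORD*second entry text\n*RECORD*";

# Hypothetical helper: return one string per record.
sub split_records {
    my ($data, $separator) = @_;
    open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";
    local $/ = $separator;            # read up to each separator
    my @records;
    while (my $chunk = <$fh>) {
        chomp $chunk;                 # chomp strips the current value of $/
        next unless $chunk =~ /\S/;   # skip chunks with no content
        push @records, $chunk;
    }
    close $fh;
    return @records;
}

my @records = split_records($text, "*RECORD*");
print scalar(@records), " records found\n";
```

       Each element of @records could then be passed to
       Boulder::Unigene->parse() as in the SYNOPSIS.
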
   Boulder::Unigene methods
       This section lists the public methods that the Boulder::Unigene class
       makes available.

       new()
              # Local fetch via File
              $ug = Boulder::Unigene->new(-accessor => 'File',
                                          -param    => '/data/unigene/Hs.dat');

           The new() method creates a new Boulder::Unigene stream on the
           accessor provided.  The only available accessor is File.  If
           successful, the method returns the stream object.  Otherwise it
           returns undef.

           new() takes the following arguments:

                   -accessor       Name of the accessor to use
                   -param          Parameters to pass to the accessor

           Specify the accessor to use with the -accessor argument.  If not
           specified, it defaults to File.

           -param is an accessor-specific argument.  For File, the -param
           argument must be a string-valued scalar, which will be interpreted
           as the path to the file to read Unigene entries from.

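           Because new() returns undef on failure, callers may want to check
           the result before calling get().  The following is a sketch of one
           way to do that.  The wrapper open_unigene_stream is hypothetical
           and not part of Boulder::Unigene; it also checks that the file is
           readable and that the module can be loaded.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not part of Boulder::Unigene): returns a stream
# object, or undef with a warning if anything is missing.
sub open_unigene_stream {
    my ($path) = @_;
    unless (-r $path) {
        warn "Cannot read Unigene file '$path'\n";
        return undef;
    }
    unless (eval { require Boulder::Unigene; 1 }) {
        warn "Boulder::Unigene could not be loaded\n";
        return undef;
    }
    my $ug = Boulder::Unigene->new(-accessor => 'File',
                                   -param    => $path);
    warn "Could not create a stream for '$path'\n" unless $ug;
    return $ug;
}

my $ug = open_unigene_stream('/data/unigene/Hs.dat');
if ($ug) {
    while (my $s = $ug->get) {
        print $s->Identifier, "\n";
    }
}
```
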
       get()
           The get() method is inherited from Boulder::Stream, and simply
           returns the next parsed Unigene Stone, or undef if there is nothing
           more to fetch.  It has the same semantics as the parent class,
           including the ability to restrict access to certain top-level tags.

       put()
           The put() method is inherited from the parent Boulder::Stream
           class, and will write the passed Stone to standard output in
           Boulder format.  This means that it is currently not possible to
           write a Boulder::Unigene object back into Unigene flatfile form.

OUTPUT TAGS

       The tags returned by the parsing operation are taken from the field
       names used in the Hs.dat flat file, since the database producer has
       not yet provided a better description of them.

   Top-Level Tags
       These are tags that appear at the top level of the parsed Unigene
       entry.

       Identifier
           The Unigene identifier of this entry.  Identifier is a single-value
           tag.

           Example:

                 my $identifierNo = $s->Identifier;

       Title
           The Unigene title for this entry.

           Example:

                 my $titledef = $s->Title;

       Gene
           The gene associated with this Unigene entry.

           Example:

                 my $thegene = $s->Gene;

       Cytoband
           The cytological band position of this entry.

           Example:

                 my $thecytoband = $s->Cytoband;

       Counts
           The number of ESTs in this record.

           Example:

                 my $thecounts = $s->Counts;

       LocusLink
           The ID of the LocusLink entry associated with this record.

           Example:

                 my $thelocuslink = $s->LocusLink;

       Chromosome
           A list of the chromosome numbers to which this entry has been
           linked.

           Example:

                 my @theChromosome = $s->Chromosome;

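       The top-level tags above can be combined into a one-line summary per
       entry.  The following is a sketch under stated assumptions: the helper
       unigene_summary is hypothetical, and $s is assumed to be an object
       returned by get() or parse() that answers to the tag names listed
       above.  Missing single-value tags are printed as empty fields.

```perl
use strict;
use warnings;

# Hypothetical helper: build a tab-separated line from the top-level tags.
sub unigene_summary {
    my ($s) = @_;
    my @fields;
    for my $tag (qw(Identifier Title Gene Cytoband Counts LocusLink)) {
        my $value = $s->$tag;                 # called in scalar context
        push @fields, defined $value ? $value : '';
    }
    my @chromosomes = $s->Chromosome;         # list context: all values
    push @fields, join(',', @chromosomes);
    return join("\t", @fields);
}

# Usage, assuming $ug is a Boulder::Unigene stream:
#   while (my $s = $ug->get) { print unigene_summary($s), "\n"; }
```
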
   STS
       Multiple records in the form ^STS ACC=XXXXXX NAME=YYYYYY.  The subtags
       are:

       ACC
       NAME

   TXMAP
       Multiple records in the form ^TXMAP XXXXXXX; MARKER=YYYYY;
       RHPANEL=ZZZZ.

       The TXMAP tag points to a Stone record that contains multiple subtags.
       Each subtag is the name of a feature which points, in turn, to a Stone
       that describes the feature's location and other attributes.

       Each feature will contain one or more of the following subtags:

       MARKER
       RHPANEL

   PROTSIM
       Multiple records in the form ^PROTSIM ORG=XXX; PROTID=DBID:YYY;
       PCT=ZZZ; ALN=QQQQ, where DBID is PID to indicate a GenPept identifier,
       SP to indicate a SWISSPROT identifier, PIR to indicate a PIR
       identifier, and PRF to indicate ???.  The subtags are:

       ORG
       PROTID
       PCT
       ALN

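       To make the PROTSIM line format concrete, the following sketch splits
       such a line into its KEY=value fields and separates the DBID prefix
       from PROTID.  It is an illustration of the format described above, not
       the parser used by Boulder::Unigene.  The helper parse_protsim and the
       extra keys DBID, DBNAME and PROTID_VALUE are hypothetical; PRF is not
       mapped because its meaning is not given here.

```perl
use strict;
use warnings;

# DBID prefixes named in the text above.
my %DB_NAME = (PID => 'GenPept', SP => 'SWISSPROT', PIR => 'PIR');

# Hypothetical helper: turn "KEY=value; KEY=value" into a hash reference.
sub parse_protsim {
    my ($line) = @_;
    $line =~ s/^\s*PROTSIM\s+//;              # drop the leading tag name
    my %field;
    for my $pair (split /\s*;\s*/, $line) {
        my ($key, $value) = $pair =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/ or next;
        $field{$key} = $value;
    }
    # Split PROTID=DBID:YYY into the database prefix and the identifier.
    if (defined $field{PROTID} && $field{PROTID} =~ /^(\w+):(.+)$/) {
        $field{DBID}         = $1;
        $field{PROTID_VALUE} = $2;
        $field{DBNAME}       = exists $DB_NAME{$1} ? $DB_NAME{$1} : 'unknown';
    }
    return \%field;
}

my $f = parse_protsim('PROTSIM ORG=XXX; PROTID=SP:YYY; PCT=ZZZ; ALN=QQQQ');
print join(' ', @{$f}{qw(ORG DBNAME PROTID_VALUE PCT ALN)}), "\n";
```
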
   SEQUENCE
       Multiple records in the form ^SEQUENCE ACC=XXX; NID=YYYY; PID=
       CLONE= END= LID=.  The subtags are:

       ACC
       NID
       PID
       CLONE
       END
       LID

SEE ALSO

       Boulder, Boulder::Blast, Boulder::Genbank

AUTHOR

       Lincoln Stein <lstein@cshl.org>.  Luca I.G. Toldo <luca.toldo@merck.de>

       Copyright (c) 1997 Lincoln D. Stein
       Copyright (c) 1999 Luca I.G. Toldo

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.  See DISCLAIMER.txt for
       disclaimers of warranty.

perl v5.34.0                      2022-01-20               Boulder::Unigene(3)