Boulder::LocusLink(3)  User Contributed Perl Documentation  Boulder::LocusLink(3)

NAME
    Boulder::LocusLink - Fetch LocusLink data records as parsed Boulder
    Stones

SYNOPSIS
        use Boulder::LocusLink;

        # parse a file of LocusLink records
        my $ll = Boulder::LocusLink->new(
            -accessor => 'File',
            -param    => '/home/data/LocusLink/LL_tmpl',
        );
        while (my $s = $ll->get) {
            print $s->Identifier;
            print $s->Gene;
        }

        # parse flatfile records yourself
        open(my $fh, '<', '/home/data/LocusLink/LL_tmpl')
            or die "cannot open LL_tmpl: $!";
        local $/ = "*RECORD*";    # read one record per iteration
        while (<$fh>) {
            my $s = Boulder::LocusLink->parse($_);
            # etc.
        }
        close $fh;

DESCRIPTION
    Boulder::LocusLink provides retrieval and parsing services for NCBI
    LocusLink records. It returns LocusLink entries in Stone format,
    allowing easy access to the various fields and values.
    Boulder::LocusLink is a descendant of Boulder::Stream, and provides a
    stream-like interface to a series of Stone objects.

    Access to LocusLink is provided by a single accessor, which gives
    access to a local LocusLink database. When you create a new
    Boulder::LocusLink stream, you provide the accessor, along with
    accessor-specific parameters that control what entries to fetch. The
    accessor is:

    File
        This provides access to local LocusLink entries by reading from a
        flat file (typically the LL_tmpl file downloadable from NCBI's FTP
        site). The stream will return a Stone corresponding to each of the
        entries in the file, starting from the top of the file and working
        downward. The parameter is the path to the local file.

    It is also possible to parse a single LocusLink entry from a text
    string stored in a scalar variable, returning a Stone object.
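
    The following is a minimal sketch of the record splitting that this
    do-it-yourself approach relies on, written in core Perl so that it runs
    without the Boulder modules. It assumes records are terminated by the
    "*RECORD*" separator used in the example at the top of this page. The
    sample text and the split_records() helper are made up for illustration
    and are not part of Boulder::LocusLink.

```perl
use strict;
use warnings;

# Split a block of flat-file text into one string per record.
# Assumption: every record ends with the literal separator "*RECORD*".
sub split_records {
    my ($data) = @_;
    open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";
    local $/ = "*RECORD*";            # read up to each separator
    my @records;
    while (my $chunk = <$fh>) {
        chomp $chunk;                 # chomp strips the current value of $/
        next unless $chunk =~ /\S/;   # skip chunks that are only whitespace
        push @records, $chunk;
    }
    close $fh;
    return @records;
}

# Made-up sample text; real LocusLink records carry many more fields.
my $text = "LOCUSID: 1\nORGANISM: Homo sapiens\n*RECORD*"
         . "LOCUSID: 2\nORGANISM: Homo sapiens\n*RECORD*";

my @records = split_records($text);
printf "%d records\n", scalar @records;
```

    Each chunk returned this way could then be passed to
    Boulder::LocusLink->parse() to obtain a Stone.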

  Boulder::LocusLink methods
    This section lists the public methods that the Boulder::LocusLink class
    makes available.

    new()
            # Local fetch via File
            my $ll = Boulder::LocusLink->new(
                -accessor => 'File',
                -param    => '/data/LocusLink/LL_tmpl',
            );

        The new() method creates a new Boulder::LocusLink stream on the
        accessor provided. The only possible accessor is File. If
        successful, the method returns the stream object. Otherwise it
        returns undef.

        new() takes the following arguments:

            -accessor    Name of the accessor to use
            -param       Parameters to pass to the accessor

        Specify the accessor to use with the -accessor argument. If not
        specified, it defaults to File.

        -param is an accessor-specific argument. The only possibility is:

        For File, the -param argument must be a string-valued scalar, which
        will be interpreted as the path to the file to read LocusLink
        entries from.

    get()
        The get() method is inherited from Boulder::Stream, and simply
        returns the next parsed LocusLink Stone, or undef if there is
        nothing more to fetch. It has the same semantics as the parent
        class, including the ability to restrict access to certain
        top-level tags.

    put()
        The put() method is inherited from the parent Boulder::Stream
        class, and will write the passed Stone to standard output in
        Boulder format. This means that it is currently not possible to
        write a Boulder::LocusLink object back into LocusLink flatfile
        form.
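
    As a hedged illustration of how new() and get() fit together, the
    sketch below collects the Identifier of every record in a file and
    checks for the undef that new() is documented to return on failure. The
    locuslink_identifiers() helper is hypothetical, and the path is the one
    used in the example at the top of this page. The helper returns an
    empty list when the file is unreadable or Boulder::LocusLink is not
    installed, so the script runs either way.

```perl
use strict;
use warnings;

# Hypothetical helper: return the Identifier of each record in $path.
# Returns an empty list if the file is unreadable, the module cannot be
# loaded, or the stream cannot be created.
sub locuslink_identifiers {
    my ($path) = @_;
    return () unless defined $path && -r $path;
    return () unless eval { require Boulder::LocusLink; 1 };

    my $stream = eval {
        Boulder::LocusLink->new(-accessor => 'File', -param => $path);
    };
    return () unless $stream;    # new() is documented to return undef on failure

    my @ids;
    while (my $stone = $stream->get) {    # get() returns undef at end of input
        push @ids, scalar $stone->Identifier;
    }
    return @ids;
}

my @ids = locuslink_identifiers('/home/data/LocusLink/LL_tmpl');
print scalar(@ids), " records read\n";
```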

OUTPUT TAGS
    The tags returned by the parsing operation are taken from the names
    shown in the flat file (LL_tmpl), since the database producer does not
    yet provide a better description of them.

  Top-Level Tags
    These are tags that appear at the top level of the parsed LocusLink
    entry.

    Identifier
        The LocusLink identifier of this entry. Identifier is a
        single-value tag.

        Example:

            my $identifierNo = $s->Identifier;

    Current_locusid
        If a locus has been merged with another, Current_locusid contains
        the previous LOCUSID line. (The name is a bit confusing; it would
        be clearer as "previous_locusid", but this is how the NCBI README
        file defines it.)

        Example:

            my $prevlocusid = $s->Current_locusid;

    Organism
        Source species, based on NCBI's Taxonomy.

        Example:

            my $theorganism = $s->Organism;

    Status
        Type of reference sequence record. "PROVISIONAL" means that the
        record was generated automatically from an existing GenBank record
        and information stored in the LocusLink database, with no
        curation. "REVIEWED" means that it was generated from the most
        representative complete GenBank sequence, or a merge of GenBank
        sequences, and from information stored in the LocusLink database.

        Example:

            my $thestatus = $s->Status;

    LocAss
        A complex record made up of the following subtags:

            LOCUS_STRING
            NM           The value in the LOCUS field of the RefSeq record
            NP           The RefSeq accession number for an mRNA record
            PRODUCT      The name of the product of this transcript
            TRANSVAR     A variant-specific description
            ASSEMBLY     The GenBank accession used to assemble the RefSeq
                         record

        Example:

            my $theprod = $s->LocAss->Product;

    AccProt
        A complex record made up of the following subtags:

            ACCNUM    Nucleotide sequence accession number
            TYPE      e=EST, m=mRNA, g=Genomic
            PROT      Set of PID values for the coding region or regions
                      annotated on the nucleotide record. The first value
                      is the PID (an integer or null), then either MMDB or
                      na, separated from the PID by a |. If MMDB is
                      present, it indicates there are structure data
                      available for a protein related to the protein
                      referenced by the PID.

        Example:

            my $theprot = $s->AccProt->Prot;

    OFFICIAL_SYMBOL
        The symbol used for gene reports, validated by the appropriate
        nomenclature committee.

    PREFERRED_SYMBOL
        Interim symbol used for display.

    OFFICIAL_GENE_NAME
        The gene description used for gene reports, validated by the
        appropriate nomenclature committee. If the symbol is official,
        the gene name will be official. No records will have both official
        and interim nomenclature.

    PREFERRED_GENE_NAME
        Interim gene name used for display.

    PREFERRED_PRODUCT
        The name of the product used in the RefSeq record.

    ALIAS_SYMBOL
        Other symbols associated with this gene.

    ALIAS_PROT
        Other protein names associated with this gene.

    PhenoTable
        A complex record made up of Phenotype and Phenotype_ID.

    SUmmary
    Unigene
    Omim
    Chr
    Map
    STS
    ECNUM

    ButTable
        Made up of BUTTON and LINK.

    DBTable
        Made up of DB_DESCR and DB_LINK.

    PMID
        A subset of publications associated with this locus, given as a
        comma-separated list of PubMed unique identifiers.
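
    Two of the values described above pack several items into one string:
    PROT (a PID and a structure flag separated by "|") and PMID (a
    comma-separated list). The sketch below shows one way such strings
    could be unpacked once they have been read from a Stone. The helper
    names and sample values are made up, and treating an empty or
    non-numeric PID as null is an assumption, since the source does not say
    how a null PID is written.

```perl
use strict;
use warnings;

# Split one PROT value such as "4501843|MMDB" into its two parts.
# Assumption: a null PID appears as an empty or non-numeric string.
sub parse_prot {
    my ($value) = @_;
    my ($pid, $flag) = split /\|/, (defined $value ? $value : ''), 2;
    $pid = undef unless defined $pid && $pid =~ /^\d+$/;
    my $has_structure = (defined $flag && $flag eq 'MMDB') ? 1 : 0;
    return { pid => $pid, has_structure => $has_structure };
}

# Split a comma-separated PMID value into a list of identifiers.
sub parse_pmids {
    my ($value) = @_;
    return () unless defined $value;
    my @ids = split /\s*,\s*/, $value;
    s/^\s+|\s+$//g for @ids;          # trim stray whitespace at the ends
    return grep { length } @ids;      # drop empty items
}

# Made-up sample values, for illustration only.
my $prot = parse_prot('4501843|MMDB');
printf "PID %s, structure data: %s\n",
    $prot->{pid}, $prot->{has_structure} ? 'yes' : 'no';
print join(' ', parse_pmids('10049767,9806828')), "\n";
```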

SEE ALSO
    Boulder, Boulder::Blast, Boulder::Genbank

AUTHOR
    Lincoln Stein <lstein@cshl.org>, Luca I.G. Toldo <luca.toldo@merck.de>

    Copyright (c) 1997 Lincoln D. Stein. Copyright (c) 1999 Luca I.G. Toldo.

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself. See DISCLAIMER.txt for
    disclaimers of warranty.


perl v5.8.8                      2000-06-08             Boulder::LocusLink(3)