Boulder::LocusLink(3)    User Contributed Perl Documentation    Boulder::LocusLink(3)

NAME
Boulder::LocusLink - Fetch LocusLink data records as parsed Boulder
Stones

SYNOPSIS
    # parse a file of LocusLink records
    my $ll = Boulder::LocusLink->new(-accessor => 'File',
                                     -param    => '/home/data/LocusLink/LL_tmpl');
    while (my $s = $ll->get) {
        print $s->Identifier;
        print $s->Gene;
    }

    # parse flatfile records yourself
    open(my $fh, '<', '/home/data/LocusLink/LL_tmpl')
        or die "cannot open file: $!";
    local $/ = "*RECORD*";    # read one record at a time
    while (<$fh>) {
        my $s = Boulder::LocusLink->parse($_);
        # etc.
    }

DESCRIPTION
Boulder::LocusLink provides retrieval and parsing services for NCBI
LocusLink records. It returns LocusLink entries in Stone format,
allowing easy access to the various fields and values.
Boulder::LocusLink is a descendant of Boulder::Stream, and provides a
stream-like interface to a series of Stone objects.

Access to LocusLink is provided by a single accessor, which gives
access to a local LocusLink database. When you create a new
Boulder::LocusLink stream, you provide the accessor, along with
accessor-specific parameters that control what entries to fetch. The
accessor is:

File
    This provides access to local LocusLink entries by reading from a
    flat file (typically the Hs.dat file downloadable from NCBI's FTP
    site). The stream will return a Stone corresponding to each of the
    entries in the file, starting from the top of the file and working
    downward. The parameter is the path to the local file.

It is also possible to parse a single LocusLink entry from a text
string stored in a scalar variable, returning a Stone object.

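As a rough illustration of cutting a block of text into per-record strings
before parsing, the sketch below reads from an in-memory filehandle using
the same "*RECORD*" input record separator that appears in the synopsis.
The split_records() helper and the sample text are hypothetical; the field
layout is invented for the example and is not the real LocusLink format.

```perl
use strict;
use warnings;

# Hypothetical sample text: the field names are invented for illustration
# and do not reproduce the real LocusLink flat-file layout.
my $text = "LOCUSID: 1\n*RECORD*\nLOCUSID: 2\n*RECORD*\n";

# split_records() is a hypothetical helper, not part of Boulder.
# It returns one string per record found in $data.
sub split_records {
    my ($data) = @_;
    open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";
    local $/ = "*RECORD*";           # each read returns one record
    my @records;
    while (my $chunk = <$fh>) {
        chomp $chunk;                # strips a trailing "*RECORD*"
        $chunk =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        next unless length $chunk;   # skip empty chunks
        push @records, $chunk;
    }
    close $fh;
    return @records;
}

my @records = split_records($text);
print scalar(@records), " records found\n";
```

Each returned string could then be handed to Boulder::LocusLink->parse, as
in the synopsis.
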
Boulder::LocusLink methods
This section lists the public methods that the Boulder::LocusLink class
makes available.

new()
        # Local fetch via File
        $ll = Boulder::LocusLink->new(-accessor => 'File',
                                      -param    => '/data/LocusLink/Hs.dat');

    The new() method creates a new Boulder::LocusLink stream on the
    accessor provided. The only possible accessor is File. If
    successful, the method returns the stream object. Otherwise it
    returns undef.

    new() takes the following arguments:

        -accessor    Name of the accessor to use
        -param       Parameters to pass to the accessor

    Specify the accessor to use with the -accessor argument. If not
    specified, it defaults to File.

    -param is an accessor-specific argument. The only possibility is:

    For File, the -param argument must be a string-valued scalar,
    which will be interpreted as the path to the file to read LocusLink
    entries from.

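How new() behaves when the file is missing is not spelled out above, so a
cautious caller might check the path before creating the stream. The
following is a minimal sketch under that assumption: open_stream() is a
hypothetical helper, not part of Boulder, and the path is only an example.

```perl
use strict;
use warnings;

# open_stream() is a hypothetical helper, not part of Boulder.
# It returns a Boulder::LocusLink stream, or undef when the file is not
# readable or the module cannot be loaded.
sub open_stream {
    my ($path) = @_;
    return unless defined $path && -r $path;                # file must be readable
    return unless eval { require Boulder::LocusLink; 1 };   # module must load
    return Boulder::LocusLink->new(-accessor => 'File',
                                   -param    => $path);
}

# Example path only; adjust for your own system.
my $ll = open_stream('/data/LocusLink/Hs.dat');
if (defined $ll) {
    print "stream opened\n";
} else {
    print "no stream (missing file or module)\n";
}
```
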
get()
    The get() method is inherited from Boulder::Stream, and simply
    returns the next parsed LocusLink Stone, or undef if there is
    nothing more to fetch. It has the same semantics as the parent
    class, including the ability to restrict access to certain
    top-level tags.

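The "next Stone, or undef when exhausted" contract can be sketched as a
simple loop. To keep the example runnable without LocusLink data, it uses
Local::FakeStream, a hypothetical stand-in that only mimics get(); with a
real stream the same drain() helper (also hypothetical) would be passed the
object returned by new().

```perl
use strict;
use warnings;

# Local::FakeStream is a hypothetical stand-in for a Boulder::LocusLink
# stream. It only mimics get(): next item, or undef when nothing is left.
package Local::FakeStream;

sub new {
    my ($class, @items) = @_;
    return bless { items => [@items] }, $class;
}

sub get {
    my ($self) = @_;
    return shift @{ $self->{items} };   # undef once the list is empty
}

package main;

# drain() is a hypothetical helper: it collects everything the stream
# yields until get() returns undef.
sub drain {
    my ($stream) = @_;
    my @stones;
    while (defined(my $s = $stream->get)) {
        push @stones, $s;
    }
    return @stones;
}

my @all = drain(Local::FakeStream->new('first', 'second'));
print scalar(@all), " items read\n";
```
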
put()
    The put() method is inherited from the parent Boulder::Stream
    class, and will write the passed Stone to standard output in
    Boulder format. This means that it is currently not possible to
    write a Boulder::LocusLink object back into LocusLink flatfile
    form.

OUTPUT TAGS
The tags returned by the parsing operation are taken from the names
shown in the flat file Hs.dat, since the database source producer does
not yet provide a better description of them.

Top-Level Tags
These are tags that appear at the top level of the parsed LocusLink
entry.

Identifier
    The LocusLink identifier of this entry. Identifier is a
    single-value tag.

    Example:

        my $identifierNo = $s->Identifier;

Current_locusid
    If a locus has been merged with another, Current_locusid contains
    the previous LOCUSID line. (The name is a bit confusing, since
    "previous_locusid" would describe it better, but this is how it is
    defined in the NCBI README file.)

    Example:

        my $prevlocusid = $s->Current_locusid;

Organism
    Source species, based on NCBI's Taxonomy.

    Example:

        my $theorganism = $s->Organism;

Status
    Type of reference sequence record. "PROVISIONAL" means that the
    record is generated automatically from an existing GenBank record
    and information stored in the LocusLink database, with no
    curation. "REVIEWED" means that it is generated from the most
    representative complete GenBank sequence, or a merge of GenBank
    sequences, and from information stored in the LocusLink database.

    Example:

        my $thestatus = $s->Status;

LocAss
    A complex record made up of:

        LOCUS_STRING
        NM          The value in the LOCUS field of the RefSeq record
        NP          The RefSeq accession number for an mRNA record
        PRODUCT     The name of the product of this transcript
        TRANSVAR    A variant-specific description
        ASSEMBLY    The GenBank accession used to assemble the RefSeq
                    record

    Example:

        my $theprod = $s->LocAss->Product;

AccProt
    A complex record made up of:

        ACCNUM    Nucleotide sequence accession number
        TYPE      e=EST, m=mRNA, g=Genomic
        PROT      Set of PID values for the coding region or regions
                  annotated on the nucleotide record. The first value
                  is the PID (an integer or null), then either MMDB or
                  na, separated from the PID by a |. If MMDB is
                  present, it indicates there are structure data
                  available for a protein related to the protein
                  referenced by the PID.

    Example:

        my $theprot = $s->AccProt->Prot;

OFFICIAL_SYMBOL
    The symbol used for gene reports, validated by the appropriate
    nomenclature committee.

PREFERRED_SYMBOL
    Interim symbol used for display.

OFFICIAL_GENE_NAME
    The gene description used for gene reports, validated by the
    appropriate nomenclature committee. If the symbol is official,
    the gene name will be official. No records will have both official
    and interim nomenclature.

PREFERRED_GENE_NAME
    Interim gene name used for display.

PREFERRED_PRODUCT
    The name of the product used in the RefSeq record.

ALIAS_SYMBOL
    Other symbols associated with this gene.

ALIAS_PROT
    Other protein names associated with this gene.

PhenoTable
    A complex record made up of Phenotype and Phenotype_ID.

SUmmary
Unigene
Omim
Chr
Map
STS
ECNUM

ButTable
    Made up of BUTTON and LINK.

DBTable
    Made up of DB_DESCR and DB_LINK.

PMID
    A subset of publications associated with this locus, given as a
    comma-separated list of PubMed unique identifiers.

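Two of the tag values described above pack several pieces into one string:
PMID is comma separated, and PROT pairs a PID with MMDB or na using a |.
The sketch below shows one way such values might be unpacked in plain
Perl. split_pmids() and split_prot() are hypothetical helpers, not part of
Boulder, and the sample values are invented for illustration.

```perl
use strict;
use warnings;

# split_pmids() is a hypothetical helper: it turns a comma-separated
# PMID value into a list of identifiers.
sub split_pmids {
    my ($value) = @_;
    return () unless defined $value;
    $value =~ s/^\s+|\s+$//g;                    # trim the whole string
    return grep { /\S/ } split /\s*,\s*/, $value;
}

# split_prot() is a hypothetical helper: it splits a "PID|MMDB" or
# "PID|na" value. A PID that is not an integer is reported as undef.
sub split_prot {
    my ($value) = @_;
    my ($pid, $flag) = split /\|/, $value, 2;
    my $is_int = defined $pid && $pid =~ /^\d+$/;
    return {
        pid           => $is_int ? $pid : undef,
        has_structure => (defined $flag && $flag eq 'MMDB') ? 1 : 0,
    };
}

# Invented sample values, for illustration only.
my @pmids = split_pmids('10000001, 10000002,10000003');
my $prot  = split_prot('12345|MMDB');
print scalar(@pmids), " PMIDs; structure data flag: $prot->{has_structure}\n";
```
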
SEE ALSO
Boulder, Boulder::Blast, Boulder::Genbank

AUTHOR
Lincoln Stein <lstein@cshl.org>, Luca I.G. Toldo <luca.toldo@merck.de>

Copyright (c) 1997 Lincoln D. Stein. Copyright (c) 1999 Luca I.G. Toldo.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself. See DISCLAIMER.txt for
disclaimers of warranty.

perl v5.28.1    2002-12-14    Boulder::LocusLink(3)