MONGOC_AGGREGATE(3) libmongoc MONGOC_AGGREGATE(3)

This document provides a number of practical examples that demonstrate the
capabilities of the aggregation framework.

The examples in Aggregations using the Zip Codes Data Set use a publicly
available data set of all zip codes and populations in the United States.
These data are available at: zips.json.

Let's check that everything is installed.

Use the following command to load the zips.json data set into a mongod
instance:

    $ mongoimport --drop -d test -c zipcodes zips.json

Let's use the MongoDB shell to verify that everything was imported
successfully.

    $ mongo test
    connecting to: test
    > db.zipcodes.count()
    29467
    > db.zipcodes.findOne()
    {
        "_id" : "35004",
        "city" : "ACMAR",
        "loc" : [
            -86.51557,
            33.584132
        ],
        "pop" : 6055,
        "state" : "AL"
    }

Each document in this collection has the following form:

    {
        "_id" : "35004",
        "city" : "ACMAR",
        "state" : "AL",
        "pop" : 6055,
        "loc" : [-86.51557, 33.584132]
    }

In these documents:

• The _id field holds the zip code as a string.

• The city field holds the city name.

• The state field holds the two-letter state abbreviation.

• The pop field holds the population.

• The loc field holds the location as a [longitude, latitude] array.

To get all states with a population of at least 10 million, use the
following aggregation pipeline:

aggregation1.c

    #include <mongoc/mongoc.h>
    #include <stdio.h>
    #include <stdlib.h> /* EXIT_SUCCESS, EXIT_FAILURE */

    static void
    print_pipeline (mongoc_collection_t *collection)
    {
       mongoc_cursor_t *cursor;
       bson_error_t error;
       const bson_t *doc;
       bson_t *pipeline;
       char *str;

       /* Stage 1 ($group): sum pop per state.
        * Stage 2 ($match): keep states whose total is >= 10 million. */
       pipeline = BCON_NEW ("pipeline",
                            "[",
                            "{",
                            "$group",
                            "{",
                            "_id",
                            "$state",
                            "total_pop",
                            "{",
                            "$sum",
                            "$pop",
                            "}",
                            "}",
                            "}",
                            "{",
                            "$match",
                            "{",
                            "total_pop",
                            "{",
                            "$gte",
                            BCON_INT32 (10000000),
                            "}",
                            "}",
                            "}",
                            "]");

       cursor = mongoc_collection_aggregate (
          collection, MONGOC_QUERY_NONE, pipeline, NULL, NULL);

       /* Canonical extended JSON wraps numbers in type wrappers such as
        * $numberInt, so the printed form is more verbose than shell output. */
       while (mongoc_cursor_next (cursor, &doc)) {
          str = bson_as_canonical_extended_json (doc, NULL);
          printf ("%s\n", str);
          bson_free (str);
       }

       /* mongoc_cursor_next returns false on both exhaustion and error. */
       if (mongoc_cursor_error (cursor, &error)) {
          fprintf (stderr, "Cursor Failure: %s\n", error.message);
       }

       mongoc_cursor_destroy (cursor);
       bson_destroy (pipeline);
    }

    int
    main (void)
    {
       mongoc_client_t *client;
       mongoc_collection_t *collection;
       const char *uri_string =
          "mongodb://localhost:27017/?appname=aggregation-example";
       mongoc_uri_t *uri;
       bson_error_t error;

       mongoc_init ();

       uri = mongoc_uri_new_with_error (uri_string, &error);
       if (!uri) {
          fprintf (stderr,
                   "failed to parse URI: %s\n"
                   "error message: %s\n",
                   uri_string,
                   error.message);
          return EXIT_FAILURE;
       }

       client = mongoc_client_new_from_uri (uri);
       if (!client) {
          return EXIT_FAILURE;
       }

       mongoc_client_set_error_api (client, 2);
       collection = mongoc_client_get_collection (client, "test", "zipcodes");

       print_pipeline (collection);

       mongoc_uri_destroy (uri);
       mongoc_collection_destroy (collection);
       mongoc_client_destroy (client);

       mongoc_cleanup ();

       return EXIT_SUCCESS;
    }

You should see a result like the following:

    { "_id" : "PA", "total_pop" : 11881643 }
    { "_id" : "OH", "total_pop" : 10847115 }
    { "_id" : "NY", "total_pop" : 17990455 }
    { "_id" : "FL", "total_pop" : 12937284 }
    { "_id" : "TX", "total_pop" : 16986510 }
    { "_id" : "IL", "total_pop" : 11430472 }
    { "_id" : "CA", "total_pop" : 29760021 }

The above aggregation pipeline is built from two pipeline operators:
$group and $match.

The $group pipeline operator requires an _id field that specifies the
grouping key. The remaining fields specify how to generate each composite
value and must use one of the group aggregation functions: $addToSet,
$first, $last, $max, $min, $avg, $push, $sum. The $match pipeline operator
uses the same syntax as a read operation query.

The $group stage reads all documents and creates a separate document for
each state, for example:

    { "_id" : "WA", "total_pop" : 4866692 }

The total_pop field uses the $sum aggregation function to sum the values
of all pop fields in the source documents.

Documents created by $group are piped to the $match pipeline operator,
which returns the documents whose total_pop value is greater than or equal
to 10 million.

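To make the data flow through $group and $match concrete, the following
plain C sketch mimics the two stages on a small in-memory array. It does not
call libmongoc or contact a server; the zip_row_t and state_total_t types and
both function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a zipcodes document. */
typedef struct {
   const char *state;
   long pop;
} zip_row_t;

/* Hypothetical stand-in for a document emitted by $group. */
typedef struct {
   const char *id; /* plays the role of "_id" (the state) */
   long total_pop; /* plays the role of "total_pop" */
} state_total_t;

/* Mimics { $group: { _id: "$state", total_pop: { $sum: "$pop" } } }.
 * Writes one entry per distinct state to out, which must have room for
 * n entries, and returns the number of groups created. */
size_t
group_by_state (const zip_row_t *rows, size_t n, state_total_t *out)
{
   size_t groups = 0;

   assert (rows != NULL && out != NULL);

   for (size_t i = 0; i < n; i++) {
      size_t g = 0;

      /* Linear search for an existing group with the same key. */
      while (g < groups && strcmp (out[g].id, rows[i].state) != 0) {
         g++;
      }
      if (g == groups) {
         /* First time this state is seen: start a new group. */
         out[g].id = rows[i].state;
         out[g].total_pop = 0;
         groups++;
      }
      out[g].total_pop += rows[i].pop;
   }
   return groups;
}

/* Mimics { $match: { total_pop: { $gte: min_pop } } }.
 * Compacts docs in place and returns the number of entries kept. */
size_t
match_min_pop (state_total_t *docs, size_t n, long min_pop)
{
   size_t kept = 0;

   for (size_t i = 0; i < n; i++) {
      if (docs[i].total_pop >= min_pop) {
         docs[kept++] = docs[i];
      }
   }
   return kept;
}
```

For example, given made-up rows for a state "AA" with populations 6000000 and
5000000 and a state "BB" with 3000000 and 1000000, group_by_state produces two
entries, and match_min_pop with a threshold of 10000000 keeps only "AA" with a
total of 11000000.
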
To get the first three states with the greatest average population per
city, use the following aggregation:

    pipeline = BCON_NEW ("pipeline", "[",
       "{", "$group", "{", "_id", "{", "state", "$state", "city", "$city", "}", "pop", "{", "$sum", "$pop", "}", "}", "}",
       "{", "$group", "{", "_id", "$_id.state", "avg_city_pop", "{", "$avg", "$pop", "}", "}", "}",
       "{", "$sort", "{", "avg_city_pop", BCON_INT32 (-1), "}", "}",
       "{", "$limit", BCON_INT32 (3), "}",
    "]");

This aggregate pipeline produces:

    { "_id" : "DC", "avg_city_pop" : 303450.0 }
    { "_id" : "FL", "avg_city_pop" : 27942.29805615551 }
    { "_id" : "CA", "avg_city_pop" : 27735.341099720412 }

The above aggregation pipeline is built from three pipeline operators:
$group, $sort and $limit.

The first $group operator creates the following documents:

    { "_id" : { "state" : "WY", "city" : "Smoot" }, "pop" : 414 }

Note that the $group operator cannot use nested documents except in the
_id field.

The second $group uses these documents to create the following documents:

    { "_id" : "FL", "avg_city_pop" : 27942.29805615551 }

These documents are sorted by the avg_city_pop field in descending order.
Finally, the $limit pipeline operator returns the first 3 documents from
the sorted set.

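The four stages described above can be sketched in plain C on a small
in-memory array. This is only an illustration of what each stage computes; it
does not use libmongoc, and the types and function names (zip_row_t,
state_avg_t, group_by_state_city, average_by_state, sort_and_limit) are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a zipcodes document. */
typedef struct {
   const char *state;
   const char *city;
   long pop;
} zip_row_t;

/* Hypothetical stand-in for a document emitted by the second $group. */
typedef struct {
   const char *state; /* plays the role of "_id" */
   double avg_city_pop;
} state_avg_t;

/* Mimics the first $group: one row per (state, city) pair with the pop
 * values of that pair summed. out must have room for n rows. */
size_t
group_by_state_city (const zip_row_t *rows, size_t n, zip_row_t *out)
{
   size_t groups = 0;

   assert (rows != NULL && out != NULL);

   for (size_t i = 0; i < n; i++) {
      size_t g = 0;

      while (g < groups && (strcmp (out[g].state, rows[i].state) != 0 ||
                            strcmp (out[g].city, rows[i].city) != 0)) {
         g++;
      }
      if (g == groups) {
         out[g] = rows[i];
         out[g].pop = 0;
         groups++;
      }
      out[g].pop += rows[i].pop;
   }
   return groups;
}

/* Mimics the second $group: one row per state holding the mean of that
 * state's per-city totals. out must have room for n rows. */
size_t
average_by_state (const zip_row_t *cities, size_t n, state_avg_t *out)
{
   size_t groups = 0;

   assert (cities != NULL && out != NULL);

   for (size_t i = 0; i < n; i++) {
      size_t g = 0;
      size_t count = 0;
      double sum = 0.0;

      while (g < groups && strcmp (out[g].state, cities[i].state) != 0) {
         g++;
      }
      if (g < groups) {
         continue; /* this state was already averaged */
      }
      /* First occurrence of the state: scan the rest of the input. */
      for (size_t j = i; j < n; j++) {
         if (strcmp (cities[j].state, cities[i].state) == 0) {
            sum += (double) cities[j].pop;
            count++;
         }
      }
      out[groups].state = cities[i].state;
      out[groups].avg_city_pop = sum / (double) count;
      groups++;
   }
   return groups;
}

/* qsort comparator for descending avg_city_pop, like a $sort of -1. */
static int
cmp_avg_desc (const void *a, const void *b)
{
   double x = ((const state_avg_t *) a)->avg_city_pop;
   double y = ((const state_avg_t *) b)->avg_city_pop;

   return (x < y) - (x > y);
}

/* Mimics $sort followed by $limit; returns the number of rows kept. */
size_t
sort_and_limit (state_avg_t *docs, size_t n, size_t limit)
{
   qsort (docs, n, sizeof *docs, cmp_avg_desc);
   return n < limit ? n : limit;
}
```

Calling the three functions in order on the raw rows, with a limit of 3,
leaves the three states with the highest average per-city population at the
front of the state_avg_t array, in descending order.
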
MongoDB, Inc

2017-present, MongoDB, Inc

1.25.1 Nov 08, 2023 MONGOC_AGGREGATE(3)