OVSDB(5)                          Open vSwitch                          OVSDB(5)

ovsdb - Open vSwitch Database (File Formats)

OVSDB, the Open vSwitch Database, is a database system whose network
protocol is specified by RFC 7047. The RFC does not specify an on-disk
storage format. The OVSDB implementation in Open vSwitch implements
two storage formats: one for standalone (and active-backup) databases,
and the other for clustered databases. This manpage documents both of
these formats.

Most users do not need to be concerned with this specification.
Instead, to manipulate OVSDB files, refer to ovsdb-tool(1). For an
introduction to OVSDB as a whole, read ovsdb(7).

OVSDB files explicitly record changes that are implied by the database
schema. For example, the OVSDB “garbage collection” feature means that
when a client removes the last reference to a garbage-collected row,
the database server automatically removes that row. The database file
explicitly records the deletion of the garbage-collected row, so that
the reader does not need to infer it.

OVSDB files do not include the values of ephemeral columns.

Standalone and clustered database files share the common structure
described here. They are text files encoded in UTF-8 with LF (U+000A)
line ends, organized as append-only series of records. Each record
consists of 2 lines of text.

The first line in each record has the format OVSDB <magic> <length>
<hash>, where <magic> is JSON for standalone databases or CLUSTER for
clustered databases, <length> is a positive decimal integer, and <hash>
is a SHA-1 checksum expressed as 40 hexadecimal digits. Words in the
first line must be separated by exactly one space.

The second line must be exactly <length> bytes long (including the LF)
and its SHA-1 checksum (including the LF) must match <hash> exactly.
The line’s contents must be a valid JSON object as specified by RFC
4627. Strings in the JSON object must be valid UTF-8. To ensure that
the second line is exactly one line of text, the OVSDB implementation
expresses any LF characters within a JSON string as \n. For the same
reason, and to save space, the OVSDB implementation does not “pretty
print” the JSON object with spaces and LFs. (The OVSDB implementation
tolerates LFs when reading an OVSDB database file, as long as <length>
and <hash> are correct.)

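As an illustration only, the two-line record structure described above
can be sketched in Python. The function names encode_record and
read_records are hypothetical and are not part of Open vSwitch; the
sketch assumes the whole file fits in memory and that <hash> is written
in lowercase or uppercase hexadecimal.

```python
import hashlib
import json


def encode_record(obj, magic="JSON"):
    """Serialize one record: a header line followed by one JSON line."""
    # Compact separators avoid pretty-printing; json.dumps writes any LF
    # inside a string as the two characters \n, so the body is one line.
    body = (json.dumps(obj, separators=(",", ":")) + "\n").encode("utf-8")
    digest = hashlib.sha1(body).hexdigest()
    header = "OVSDB %s %d %s\n" % (magic, len(body), digest)
    return header.encode("ascii") + body


def read_records(data):
    """Yield (magic, obj) for each record in the bytes of a database file."""
    pos = 0
    while pos < len(data):
        eol = data.index(b"\n", pos)
        # Words are separated by exactly one space, so a plain split works.
        words = data[pos:eol].decode("ascii").split(" ")
        if len(words) != 4 or words[0] != "OVSDB":
            raise ValueError("bad record header at offset %d" % pos)
        magic, length, expected = words[1], int(words[2]), words[3]
        # <length> counts bytes, including the trailing LF.
        body = data[eol + 1:eol + 1 + length]
        if len(body) != length:
            raise ValueError("truncated record at offset %d" % pos)
        if hashlib.sha1(body).hexdigest() != expected.lower():
            raise ValueError("checksum mismatch at offset %d" % pos)
        yield magic, json.loads(body.decode("utf-8"))
        pos = eol + 1 + length
```

Because the reader slices by <length> rather than searching for the
next LF, it also accepts a body that contains extra LFs, as long as the
length and checksum agree.
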
JSON Notation
We use notation from RFC 7047 here to describe the JSON data in
records. In addition to the notation defined there, we add the
following:

<raw-uuid>
       A 36-character JSON string that contains a UUID in the format
       described by RFC 4122, e.g.
       "550e8400-e29b-41d4-a716-446655440000"

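A reader that wants to check a <raw-uuid> could do so roughly as
follows. This is a sketch; is_raw_uuid is a hypothetical helper name.
A regular expression is used instead of Python's uuid.UUID constructor
because the latter also accepts forms without hyphens or with braces,
which are not 36-character strings in the format shown above.

```python
import re

# 8-4-4-4-12 hexadecimal digits separated by hyphens: 36 characters.
RAW_UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def is_raw_uuid(value):
    """Return True if value is a string in the <raw-uuid> form."""
    return isinstance(value, str) and RAW_UUID_RE.fullmatch(value) is not None
```
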
Standalone Format
The first record in a standalone database contains the JSON schema for
the database, as specified in RFC 7047. Only this record is mandatory
(a standalone file that contains only a schema represents an empty
database).

The second and subsequent records in a standalone database are
transaction records. Each record may have the following optional
special members, which do not have any semantics but are often useful
to administrators looking through a database log with ovsdb-tool
show-log:

"_date": <integer>
       The time at which the transaction was committed, as an integer
       number of milliseconds since the Unix epoch. Early versions of
       OVSDB counted seconds instead of milliseconds; these can be
       detected by noticing that their values are less than 2**32.

       OVSDB always writes a _date member.

"_comment": <string>
       A JSON string that specifies the comment provided in a
       transaction comment operation. If a transaction has multiple
       comment operations, OVSDB concatenates them into a single
       _comment member, separated by a new-line.

       OVSDB only writes a _comment member if it would be a nonempty
       string.

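The two special members might be interpreted as in the following
sketch. The helper names commit_time and comments are hypothetical.
Note that splitting _comment on new-lines cannot distinguish several
comment operations from a single comment that itself contains a
new-line.

```python
from datetime import datetime, timezone


def commit_time(record):
    """Return the commit time of a transaction record in UTC, or None."""
    date = record.get("_date")
    if date is None:
        return None
    # Values below 2**32 come from early versions that counted seconds;
    # anything larger is a count of milliseconds.
    seconds = date if date < 2**32 else date / 1000.0
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def comments(record):
    """Return the list of new-line separated parts of _comment."""
    text = record.get("_comment", "")
    return text.split("\n") if text else []
```
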
Each of these records also has one or more additional members, each of
which maps from the name of a database table to a <table-txn>:

<table-txn>
       A JSON object that describes the effects of a transaction on a
       database table. Its names are <raw-uuid>s for rows in the table
       and its values are <row-txn>s.

<row-txn>
       Either null, which indicates that the transaction deleted this
       row, or a JSON object that describes how the transaction
       inserted or modified the row, whose names are the names of
       columns and whose values are <value>s that give the column’s new
       value.

       For new rows, the OVSDB implementation omits columns whose
       values have the default values for their types defined in RFC
       7047 section 5.2.1; for modified rows, the OVSDB implementation
       omits columns whose values are unchanged.

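To make the <table-txn> and <row-txn> semantics concrete, the sketch
below replays one transaction record onto an in-memory dictionary. The
function name apply_txn and the layout of db are assumptions made for
this example. The sketch does not fill in default values for omitted
columns of new rows, since that requires the schema.

```python
SPECIAL_MEMBERS = {"_date", "_comment"}  # carry no semantics


def apply_txn(db, txn):
    """Apply one standalone transaction record to db, in place.

    db maps table name -> {row UUID -> {column name -> value}}.
    """
    for table, table_txn in txn.items():
        if table in SPECIAL_MEMBERS:
            continue
        rows = db.setdefault(table, {})
        for row_uuid, row_txn in table_txn.items():
            if row_txn is None:
                # null: the transaction deleted this row.
                rows.pop(row_uuid, None)
            else:
                # Insert a new row or overwrite only the listed columns;
                # unchanged columns are omitted from the record.
                rows.setdefault(row_uuid, {}).update(row_txn)
    return db
```

Replaying every transaction record of a standalone file in order, after
reading the schema from the first record, yields the current contents
of the database under these assumptions.
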
Clustered Format
The clustered format has the following additional notation:

<uint64>
       A JSON integer that represents a 64-bit unsigned integer. The
       OVS JSON implementation only supports integers in the range
       -2**63 through 2**63-1, so 64-bit unsigned integer values from
       2**63 through 2**64-1 are expressed as negative numbers.

<address>
       A JSON string that represents a network address to support
       clustering, in the <protocol>:<ip>:<port> syntax described in
       ovsdb-tool(1).

<servers>
       A JSON object whose names are <raw-uuid>s that identify servers
       and whose values are <address>es that specify those servers’
       addresses.

<cluster-txn>
       A JSON array with two elements:

       1. The first element is either a <database-schema> or null. A
          <database-schema> element is always present in the first
          record of a clustered database to indicate the database’s
          initial schema. If it is not null in a later record, it
          indicates a change of schema for the database.

       2. The second element is either a transaction record in the
          format described under Standalone Format above, or null.

       When a schema is present, the transaction record is relative to
       an empty database. That is, a schema change effectively resets
       the database to empty and the transaction record represents the
       full database contents. This allows readers to be ignorant of
       the full semantics of schema change.

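The following sketch illustrates two of the notations above. It assumes
that the negative representation of a <uint64> is the usual two's
complement mapping, which the text above does not state explicitly. All
function names are hypothetical.

```python
def decode_uint64(n):
    """Map a JSON integer in -2**63..2**63-1 to a value in 0..2**64-1."""
    # Python's % always returns a non-negative result for a positive
    # modulus, so negative inputs wrap to 2**63..2**64-1.
    return n % 2**64


def encode_uint64(u):
    """Inverse of decode_uint64 (assumed two's complement mapping)."""
    return u - 2**64 if u >= 2**63 else u


def apply_cluster_txn(schema, db, cluster_txn):
    """Return the (schema, db) pair that results from one <cluster-txn>."""
    new_schema, txn = cluster_txn
    if new_schema is not None:
        # A schema change resets the database to empty; the transaction
        # record then carries the full database contents.
        schema, db = new_schema, {}
    for table, table_txn in (txn or {}).items():
        if table in ("_date", "_comment"):
            continue
        rows = db.setdefault(table, {})
        for row_uuid, row_txn in table_txn.items():
            if row_txn is None:
                rows.pop(row_uuid, None)
            else:
                rows.setdefault(row_uuid, {}).update(row_txn)
    return schema, db
```
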
The first record in a clustered database contains the following
members, all of which are required, except prev_election_timer:

"server_id": <raw-uuid>
       The server’s own UUID, which must be unique within the cluster.

"local_address": <address>
       The address on which the server listens for connections from
       other servers in the cluster.

"name": <id>
       The database schema name. It is only important when a server is
       in the process of joining a cluster: a server will only join a
       cluster if the name matches. (If the database schema name were
       unique, then we would not also need a cluster ID.)

"cluster_id": <raw-uuid>
       The cluster’s UUID. The all-zeros UUID is not a valid cluster
       ID.

"prev_term": <uint64> and "prev_index": <uint64>
       The Raft term and index just before the beginning of the log.

"prev_servers": <servers>
       The set of one or more servers in the cluster at index
       “prev_index” and term “prev_term”. It might not include this
       server, if it was not the initial server in the cluster.

"prev_election_timer": <uint64>
       The election base time before the beginning of the log. If this
       member is absent, the default value of 1000 ms is used, as if
       it were present in this record.

"prev_data": <json-value> and "prev_eid": <raw-uuid>
       A snapshot of the data in the database at index “prev_index” and
       term “prev_term”, and the entry ID for that data. The snapshot
       must contain a schema.

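A reader might check the first record of a clustered database along the
lines of this sketch. The name check_cluster_header is hypothetical,
and only the constraints stated above are checked; the types of the
individual members are not validated.

```python
REQUIRED_MEMBERS = (
    "server_id", "local_address", "name", "cluster_id",
    "prev_term", "prev_index", "prev_servers", "prev_data", "prev_eid",
)
DEFAULT_ELECTION_TIMER_MS = 1000
ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def check_cluster_header(record):
    """Validate the first record; return the election base time in ms."""
    missing = [m for m in REQUIRED_MEMBERS if m not in record]
    if missing:
        raise ValueError("missing members: %s" % ", ".join(missing))
    if record["cluster_id"] == ZERO_UUID:
        raise ValueError("the all-zeros UUID is not a valid cluster ID")
    if not record["prev_servers"]:
        raise ValueError("prev_servers must name at least one server")
    # prev_election_timer is the only optional member.
    return record.get("prev_election_timer", DEFAULT_ELECTION_TIMER_MS)
```
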
The second and subsequent records, if present, in a clustered database
represent changes to the database, to the cluster state, or both.
There are several types of these records. The most important types of
records directly represent persistent state described in the Raft
specification:

Entry  A Raft log entry.

Term   The start of a new term.

Vote   The server’s vote for a leader in the current term.

The following additional types of records aid debugging and
troubleshooting, but they do not affect correctness.

Leader Identifies a newly elected leader for the current term.

Commit Index
       An update to the server’s commit_index.

Note   A human-readable description of some event.

The table below identifies the members that each type of record
contains. “yes” indicates that a member is required, “?” that it is
optional, blank that it is forbidden, and [1] that data and eid must be
either both present or both absent.

┌────────────────┬───────┬──────┬──────┬────────┬──────────────┬──────┐
│ member         │ Entry │ Term │ Vote │ Leader │ Commit Index │ Note │
├────────────────┼───────┼──────┼──────┼────────┼──────────────┼──────┤
│ comment        │   ?   │  ?   │  ?   │   ?    │      ?       │  ?   │
│ term           │  yes  │ yes  │ yes  │  yes   │              │      │
│ index          │  yes  │      │      │        │              │      │
│ servers        │   ?   │      │      │        │              │      │
│ election_timer │   ?   │      │      │        │              │      │
│ data           │  [1]  │      │      │        │              │      │
│ eid            │  [1]  │      │      │        │              │      │
│ vote           │       │      │ yes  │        │              │      │
│ leader         │       │      │      │  yes   │              │      │
│ commit_index   │       │      │      │        │     yes      │      │
│ note           │       │      │      │        │              │ yes  │
└────────────────┴───────┴──────┴──────┴────────┴──────────────┴──────┘

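The table can be turned into a small classifier, sketched below. The
name record_type and the RECORD_TYPES mapping are hypothetical; each
entry lists the required and optional members for one record type, with
the universally optional comment member handled separately. The sketch
applies only to the second and subsequent records of a clustered
database.

```python
# type name -> (required members, optional members), excluding "comment"
RECORD_TYPES = {
    "Entry": ({"term", "index"},
              {"servers", "election_timer", "data", "eid"}),
    "Term": ({"term"}, set()),
    "Vote": ({"term", "vote"}, set()),
    "Leader": ({"term", "leader"}, set()),
    "Commit Index": ({"commit_index"}, set()),
    "Note": ({"note"}, set()),
}


def record_type(record):
    """Return the type name of a clustered record, or raise ValueError."""
    members = set(record) - {"comment"}
    for name, (required, optional) in RECORD_TYPES.items():
        # All required members present, and nothing forbidden present.
        if required <= members <= required | optional:
            if name == "Entry" and ("data" in members) != ("eid" in members):
                raise ValueError(
                    "data and eid must be both present or both absent")
            return name
    raise ValueError("unrecognized members: %s" % sorted(members))
```
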
The members are:

"comment": <string>
       A human-readable string giving an administrator more information
       about the reason a record was emitted.

"term": <uint64>
       The term in which the activity occurred.

"index": <uint64>
       The index of a log entry.

"servers": <servers>
       Server configuration in a log entry.

"election_timer": <uint64>
       Leader election timeout base value in a log entry.

"data": <json-value>
       The data in a log entry.

"eid": <raw-uuid>
       Entry ID in a log entry.

"vote": <raw-uuid>
       The server ID for which this server voted.

"leader": <raw-uuid>
       The server ID of the new leader. Emitted by both leaders and
       followers when a leader is elected.

"commit_index": <uint64>
       Updated commit_index value.

"note": <string>
       One of a few special strings indicating important events. The
       currently defined strings are:

       "transfer leadership"
              This server transferred leadership to a different server
              (with details included in comment).

       "left" This server finished leaving the cluster. (This lets
              subsequent readers know that the server is not part of
              the cluster and should not attempt to connect to it.)

Joining a Cluster
In addition to the general format for a clustered database, there is
also a special case for a database file created by ovsdb-tool
join-cluster. Such a file contains exactly one record, which conveys
the information passed to the join-cluster command. It has the
following members:

"server_id": <raw-uuid> and "local_address": <address> and "name": <id>
       These have the same semantics described above in the general
       description of the format.

"cluster_id": <raw-uuid>
       This is provided only if the user gave the --cid option to
       join-cluster. It has the same semantics described above.

"remote_addresses": [<address>*]
       One or more remote servers to contact for joining the cluster.

When the server successfully joins the cluster, the database file is
replaced by one described in Clustered Format.

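A tool that reads clustered files might recognize this special case as
in the sketch below. The name is_join_file is hypothetical, and records
is assumed to be the list of JSON objects already parsed from the file.

```python
JOIN_REQUIRED = {"server_id", "local_address", "name", "remote_addresses"}
JOIN_OPTIONAL = {"cluster_id"}  # present only if --cid was given


def is_join_file(records):
    """Return True if records look like the output of join-cluster."""
    if len(records) != 1:
        return False
    members = set(records[0])
    return (JOIN_REQUIRED <= members <= JOIN_REQUIRED | JOIN_OPTIONAL
            and len(records[0]["remote_addresses"]) >= 1)
```
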
The Open vSwitch Development Community

2021, The Open vSwitch Development Community

2.15                              Feb 17, 2021                          OVSDB(5)