OVSDB(5)                        Open vSwitch                        OVSDB(5)

NAME
ovsdb - Open vSwitch Database (File Formats)

DESCRIPTION
OVSDB, the Open vSwitch Database, is a database system whose network
protocol is specified by RFC 7047. The RFC does not specify an on-disk
storage format. The OVSDB implementation in Open vSwitch implements
two storage formats: one for standalone (and active-backup) databases,
and the other for clustered databases. This manpage documents both of
these formats.

Most users do not need to be concerned with this specification.
Instead, to manipulate OVSDB files, refer to ovsdb-tool(1). For an
introduction to OVSDB as a whole, read ovsdb(7).

OVSDB files explicitly record changes that are implied by the database
schema. For example, the OVSDB “garbage collection” feature means that
when a client removes the last reference to a garbage-collected row,
the database server automatically removes that row. The database file
explicitly records the deletion of the garbage-collected row, so that
the reader does not need to infer it.

OVSDB files do not include the values of ephemeral columns.

Standalone and clustered database files share the common structure
described here. They are text files encoded in UTF-8 with LF (U+000A)
line endings, organized as an append-only series of records. Each
record consists of two lines of text.

The first line in each record has the format OVSDB <magic> <length>
<hash>, where <magic> is JSON for standalone databases or CLUSTER for
clustered databases, <length> is a positive decimal integer, and <hash>
is a SHA-1 checksum expressed as 40 hexadecimal digits. Words in the
first line must be separated by exactly one space.

The second line must be exactly <length> bytes long (including the LF)
and its SHA-1 checksum (including the LF) must match <hash> exactly.
The line’s contents must be a valid JSON object as specified by RFC
4627. Strings in the JSON object must be valid UTF-8. To ensure that
the second line is exactly one line of text, the OVSDB implementation
expresses any LF characters within a JSON string as \n. For the same
reason, and to save space, the OVSDB implementation does not “pretty
print” the JSON object with spaces and LFs. (The OVSDB implementation
tolerates LFs when reading an OVSDB database file, as long as <length>
and <hash> are correct.)
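
The record framing described above can be illustrated with a short
Python sketch. It is not the Open vSwitch implementation: the function
names read_records and format_record are hypothetical, and the sketch
only checks the properties stated in this section (header words,
length, SHA-1 checksum, and that the body is a JSON object).

```python
import hashlib
import io
import json


def read_records(stream):
    """Yield (magic, json_object) for each record in a binary file object.

    Raises ValueError if a header line, length, or checksum is invalid.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # End of file: no more records.
        # Words are separated by exactly one space, so a plain split on a
        # single space must produce exactly four words.
        words = header.decode("utf-8").rstrip("\n").split(" ")
        if len(words) != 4 or words[0] != "OVSDB":
            raise ValueError("malformed record header: %r" % header)
        _, magic, length_text, digest = words
        if magic not in ("JSON", "CLUSTER"):
            raise ValueError("unknown magic: %r" % magic)
        if not length_text.isdigit() or int(length_text) == 0:
            raise ValueError("length must be a positive decimal integer")
        length = int(length_text)
        body = stream.read(length)  # The length includes the trailing LF.
        if len(body) != length:
            raise ValueError("record body is truncated")
        if hashlib.sha1(body).hexdigest() != digest.lower():
            raise ValueError("SHA-1 checksum mismatch")
        obj = json.loads(body.decode("utf-8"))
        if not isinstance(obj, dict):
            raise ValueError("record body is not a JSON object")
        yield magic, obj


def format_record(magic, obj):
    """Return the two lines of one record as bytes."""
    # Compact separators avoid pretty-printing, and json.dumps escapes an LF
    # inside a string as \n, so the body stays on a single line.
    body = json.dumps(obj, separators=(",", ":")).encode("utf-8") + b"\n"
    digest = hashlib.sha1(body).hexdigest()
    header = "OVSDB %s %d %s\n" % (magic, len(body), digest)
    return header.encode("utf-8") + body


if __name__ == "__main__":
    data = format_record("JSON", {"note": "line1\nline2"})
    for magic, obj in read_records(io.BytesIO(data)):
        print(magic, obj)
```

Because the checksum covers the body including its LF, a reader can
detect a partially written final record and stop there, which suits an
append-only log.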

JSON Notation
We use notation from RFC 7047 here to describe the JSON data in
records. In addition to the notation defined there, we add the
following:

<raw-uuid>
    A 36-character JSON string that contains a UUID in the format
    described by RFC 4122, e.g.
    "550e8400-e29b-41d4-a716-446655440000"
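
As an illustration of the <raw-uuid> notation, the following Python
sketch checks that a value is a 36-character string in the hyphenated
8-4-4-4-12 hexadecimal layout. The name is_raw_uuid is hypothetical,
and accepting uppercase hexadecimal digits is an assumption of this
sketch rather than something this manpage states.

```python
import re

# 8-4-4-4-12 hexadecimal digits separated by hyphens: 36 characters total.
# Accepting A-F as well as a-f is an assumption of this sketch.
RAW_UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_raw_uuid(value):
    """Return True if value is a string in the <raw-uuid> layout."""
    return isinstance(value, str) and RAW_UUID_RE.fullmatch(value) is not None
```

A strict pattern is used here instead of Python's uuid.UUID
constructor, which also accepts forms without hyphens or with braces.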

Standalone Format
The first record in a standalone database contains the JSON schema for
the database, as specified in RFC 7047. Only this record is mandatory
(a standalone file that contains only a schema represents an empty
database).

The second and subsequent records in a standalone database are
transaction records. Each record may have the following optional
special members, which do not have any semantics but are often useful
to administrators looking through a database log with ovsdb-tool
show-log:

"_date": <integer>
    The time at which the transaction was committed, as an integer
    number of milliseconds since the Unix epoch. Early versions of
    OVSDB counted seconds instead of milliseconds; these can be
    detected by noticing that their values are less than 2**32.

    OVSDB always writes a _date member.

"_comment": <string>
    A JSON string that specifies the comment provided in a
    transaction comment operation. If a transaction has multiple
    comment operations, OVSDB concatenates them into a single
    _comment member, separated by a new-line.

    OVSDB only writes a _comment member if it would be a nonempty
    string.
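
The two special members can be interpreted as in the Python sketch
below. The function names commit_time and comment_lines are
hypothetical; the sketch applies the 2**32 rule from the _date
description to tell seconds from milliseconds, and splits _comment on
the new-line separator.

```python
from datetime import datetime, timezone


def commit_time(record):
    """Return the commit time of a transaction record as a UTC datetime.

    Returns None if the record has no _date member.
    """
    date = record.get("_date")
    if date is None:
        return None
    # Values below 2**32 come from early versions that counted seconds;
    # anything else is a count of milliseconds.
    seconds = date if date < 2**32 else date / 1000.0
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def comment_lines(record):
    """Return the individual comments of a record as a list of strings."""
    comment = record.get("_comment")
    if not comment:
        return []
    return comment.split("\n")
```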

Each of these records also has one or more additional members, each of
which maps from the name of a database table to a <table-txn>:

<table-txn>
    A JSON object that describes the effects of a transaction on a
    database table. Its names are <raw-uuid>s for rows in the table
    and its values are <row-txn>s.

<row-txn>
    Either null, which indicates that the transaction deleted this
    row, or a JSON object that describes how the transaction
    inserted or modified the row, whose names are the names of
    columns and whose values are <value>s that give the column’s
    new value.

    For new rows, the OVSDB implementation omits columns whose
    values have the default values for their types defined in RFC
    7047 section 5.2.1; for modified rows, the OVSDB implementation
    omits columns whose values are unchanged.
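
The <table-txn> and <row-txn> semantics above amount to a simple replay
procedure, sketched here in Python. The function name apply_txn and the
nested-dictionary representation of the database are assumptions made
for illustration. The sketch does not consult the schema, so columns
omitted from a new row (because they hold their type's default value)
are simply left absent.

```python
def apply_txn(db, record):
    """Apply one standalone transaction record to an in-memory database.

    db maps table name -> row UUID -> column name -> value and is
    modified in place.
    """
    for name, table_txn in record.items():
        if name in ("_date", "_comment"):
            continue  # Special members carry no database contents.
        rows = db.setdefault(name, {})
        for uuid, row_txn in table_txn.items():
            if row_txn is None:
                # null: the transaction deleted this row.
                rows.pop(uuid, None)
            else:
                # Insert the row, or overwrite only the changed columns.
                rows.setdefault(uuid, {}).update(row_txn)
    return db
```

Replaying every record after the schema record, in file order, on an
initially empty dictionary yields the current contents of the database
under these assumptions.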

Clustered Format
The clustered format has the following additional notation:

<uint64>
    A JSON integer that represents a 64-bit unsigned integer. The
    OVS JSON implementation only supports integers in the range
    -2**63 through 2**63-1, so 64-bit unsigned integer values from
    2**63 through 2**64-1 are expressed as negative numbers.

<address>
    A JSON string that represents a network address to support
    clustering, in the <protocol>:<ip>:<port> syntax described in
    ovsdb-tool(1).

<servers>
    A JSON object whose names are <raw-uuid>s that identify servers
    and whose values are <address>es that specify those servers’
    addresses.

<cluster-txn>
    A JSON array with two elements:

    1. The first element is either a <database-schema> or null. A
       <database-schema> element is always present in the first
       record of a clustered database to indicate the database’s
       initial schema. If it is not null in a later record, it
       indicates a change of schema for the database.

    2. The second element is either a transaction record in the
       format described under Standalone Format above, or null.

    When a schema is present, the transaction record is relative to
    an empty database. That is, a schema change effectively resets
    the database to empty and the transaction record represents the
    full database contents. This allows readers to be ignorant of
    the full semantics of schema change.
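
Two pieces of the notation above lend themselves to a short Python
sketch: the <uint64> mapping between unsigned values and the signed
integers stored in the file, and the reset-on-schema-change rule for
<cluster-txn>. All function names and the shape of the state dictionary
are hypothetical, chosen for illustration only.

```python
def decode_uint64(n):
    """Map a stored JSON integer to the unsigned 64-bit value it represents."""
    if not -2**63 <= n <= 2**63 - 1:
        raise ValueError("outside the range supported by OVS JSON")
    # Negative numbers stand for values from 2**63 through 2**64-1.
    return n + 2**64 if n < 0 else n


def encode_uint64(u):
    """Map an unsigned 64-bit value to the JSON integer stored in the file."""
    if not 0 <= u <= 2**64 - 1:
        raise ValueError("not a 64-bit unsigned integer")
    return u - 2**64 if u >= 2**63 else u


def apply_cluster_txn(state, cluster_txn):
    """Apply a <cluster-txn> and return the resulting state.

    state is {"schema": ..., "tables": {...}}; it may be None only when
    the <cluster-txn> carries a schema, as the first one always does.
    """
    schema, txn = cluster_txn
    if schema is not None:
        # A schema change resets the database to empty; the transaction
        # record that follows then holds the full database contents.
        state = {"schema": schema, "tables": {}}
    if txn is not None:
        for name, table_txn in txn.items():
            if name in ("_date", "_comment"):
                continue
            rows = state["tables"].setdefault(name, {})
            for uuid, row_txn in table_txn.items():
                if row_txn is None:
                    rows.pop(uuid, None)
                else:
                    rows.setdefault(uuid, {}).update(row_txn)
    return state
```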

The first record in a clustered database contains the following
members, all of which are required, except prev_election_timer:

"server_id": <raw-uuid>
    The server’s own UUID, which must be unique within the cluster.

"local_address": <address>
    The address on which the server listens for connections from
    other servers in the cluster.

"name": <id>
    The database schema name. It is only important when a server is
    in the process of joining a cluster: a server will only join a
    cluster if the name matches. (If the database schema name were
    unique, then we would not also need a cluster ID.)

"cluster_id": <raw-uuid>
    The cluster’s UUID. The all-zeros UUID is not a valid cluster
    ID.

"prev_term": <uint64> and "prev_index": <uint64>
    The Raft term and index just before the beginning of the log.

"prev_servers": <servers>
    The set of one or more servers in the cluster at index
    “prev_index” and term “prev_term”. It might not include this
    server, if it was not the initial server in the cluster.

"prev_election_timer": <uint64>
    The election base time before the beginning of the log. If this
    member is absent, the default value of 1000 ms is used, as if
    the member were present in this record.

"prev_data": <json-value> and "prev_eid": <raw-uuid>
    A snapshot of the data in the database at index “prev_index” and
    term “prev_term”, and the entry ID for that data. The snapshot
    must contain a schema.
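
The requirements on the first record can be checked mechanically. The
Python sketch below is a hypothetical validator (check_cluster_header
is not an Open vSwitch function) that enforces only what the member
descriptions above state: the required members, the invalid all-zeros
cluster ID, a nonempty prev_servers, and the 1000 ms default for
prev_election_timer.

```python
REQUIRED_HEADER_MEMBERS = frozenset([
    "server_id", "local_address", "name", "cluster_id",
    "prev_term", "prev_index", "prev_servers", "prev_data", "prev_eid",
])
ZERO_UUID = "00000000-0000-0000-0000-000000000000"
DEFAULT_ELECTION_TIMER_MS = 1000


def check_cluster_header(record):
    """Validate the first record of a clustered database.

    Returns the election timer base value in milliseconds, falling back
    to the default when prev_election_timer is absent.
    """
    missing = REQUIRED_HEADER_MEMBERS - set(record)
    if missing:
        raise ValueError("missing members: %s" % ", ".join(sorted(missing)))
    if record["cluster_id"] == ZERO_UUID:
        raise ValueError("the all-zeros UUID is not a valid cluster ID")
    if not record["prev_servers"]:
        raise ValueError("prev_servers must name at least one server")
    return record.get("prev_election_timer", DEFAULT_ELECTION_TIMER_MS)
```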

The second and subsequent records, if present, in a clustered database
represent changes to the database, to the cluster state, or both.
There are several types of these records. The most important types of
records directly represent persistent state described in the Raft
specification:

Entry
    A Raft log entry.

Term
    The start of a new term.

Vote
    The server’s vote for a leader in the current term.

The following additional types of records aid debugging and
troubleshooting, but they do not affect correctness.

Leader
    Identifies a newly elected leader for the current term.

Commit Index
    An update to the server’s commit_index.

Note
    A human-readable description of some event.

The table below identifies the members that each type of record
contains. “yes” indicates that a member is required, “?” that it is
optional, blank that it is forbidden, and [1] that data and eid must be
either both present or both absent.

┌────────────────┬───────┬──────┬──────┬────────┬──────────────┬──────┐
│ member         │ Entry │ Term │ Vote │ Leader │ Commit Index │ Note │
├────────────────┼───────┼──────┼──────┼────────┼──────────────┼──────┤
│ comment        │ ?     │ ?    │ ?    │ ?      │ ?            │ ?    │
│ term           │ yes   │ yes  │ yes  │ yes    │              │      │
│ index          │ yes   │      │      │        │              │      │
│ servers        │ ?     │      │      │        │              │      │
│ election_timer │ ?     │      │      │        │              │      │
│ data           │ [1]   │      │      │        │              │      │
│ eid            │ [1]   │      │      │        │              │      │
│ vote           │       │      │ yes  │        │              │      │
│ leader         │       │      │      │ yes    │              │      │
│ commit_index   │       │      │      │        │ yes          │      │
│ note           │       │      │      │        │              │ yes  │
└────────────────┴───────┴──────┴──────┴────────┴──────────────┴──────┘

The members are:

"comment": <string>
    A human-readable string giving an administrator more information
    about the reason a record was emitted.

"term": <uint64>
    The term in which the activity occurred.

"index": <uint64>
    The index of a log entry.

"servers": <servers>
    Server configuration in a log entry.

"election_timer": <uint64>
    Leader election timeout base value in a log entry.

"data": <json-value>
    The data in a log entry.

"eid": <raw-uuid>
    Entry ID in a log entry.

"vote": <raw-uuid>
    The server ID for which this server voted.

"leader": <raw-uuid>
    The server ID of the newly elected leader. Emitted by both
    leaders and followers when a leader is elected.

"commit_index": <uint64>
    Updated commit_index value.

"note": <string>
    One of a few special strings indicating important events. The
    currently defined strings are:

    "transfer leadership"
        This server transferred leadership to a different server
        (with details included in comment).

    "left"
        This server finished leaving the cluster. (This lets
        subsequent readers know that the server is not part of
        the cluster and should not attempt to connect to it.)
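
The member table for the record types can be turned into a small
classifier. The Python sketch below encodes each column of the table as
a pair of sets (required members, optional members) and treats every
other member as forbidden. The names RECORD_TYPES and classify_record
are hypothetical, and the sketch reflects only the table in this
manpage, not the checks performed by the Open vSwitch implementation.

```python
# Record type -> (required members, optional members), from the table of
# members. Any member in neither set is forbidden for that type.
RECORD_TYPES = {
    "Entry": ({"term", "index"},
              {"comment", "servers", "election_timer", "data", "eid"}),
    "Term": ({"term"}, {"comment"}),
    "Vote": ({"term", "vote"}, {"comment"}),
    "Leader": ({"term", "leader"}, {"comment"}),
    "Commit Index": ({"commit_index"}, {"comment"}),
    "Note": ({"note"}, {"comment"}),
}


def classify_record(record):
    """Return the type name of a clustered-format record after the first.

    Raises ValueError if the members match no type in the table.
    """
    names = set(record)
    matches = [
        kind for kind, (required, optional) in RECORD_TYPES.items()
        if required <= names <= required | optional
    ]
    if len(matches) != 1:
        raise ValueError("members match no record type: %s" % sorted(names))
    kind = matches[0]
    # Footnote [1]: data and eid are either both present or both absent.
    if kind == "Entry" and ("data" in names) != ("eid" in names):
        raise ValueError("data and eid must appear together")
    return kind
```

Each type has a distinct set of required members, so at most one row of
RECORD_TYPES can match a given record.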

Joining a Cluster
In addition to the general format for a clustered database, there is
also a special case for a database file created by ovsdb-tool
join-cluster. Such a file contains exactly one record, which conveys
the information passed to the join-cluster command. It has the
following members:

"server_id": <raw-uuid> and "local_address": <address> and "name": <id>
    These have the same semantics described above in the general
    description of the format.

"cluster_id": <raw-uuid>
    This is provided only if the user gave the --cid option to
    join-cluster. It has the same semantics described above.

"remote_addresses": [<address>*]
    One or more remote servers to contact for joining the cluster.

When the server successfully joins the cluster, the database file is
replaced by one described in Clustered Format.
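
A reader can tell the join-cluster special case apart from the general
clustered format by looking at the members of the single record. The
Python sketch below keys on the presence of remote_addresses, which
appears only in this special case according to the descriptions above.
The function names are hypothetical and the checks are limited to what
this section states.

```python
def is_join_record(record):
    """Return True if a first record looks like a join-cluster record."""
    return "remote_addresses" in record


def parse_join_record(record):
    """Validate a join-cluster record and return its members as a dict.

    cluster_id is None when the user did not give --cid to join-cluster.
    """
    for name in ("server_id", "local_address", "name", "remote_addresses"):
        if name not in record:
            raise ValueError("missing member: %s" % name)
    remotes = record["remote_addresses"]
    if not isinstance(remotes, list) or not remotes:
        raise ValueError("remote_addresses must list at least one address")
    return {
        "server_id": record["server_id"],
        "local_address": record["local_address"],
        "name": record["name"],
        "cluster_id": record.get("cluster_id"),
        "remote_addresses": list(remotes),
    }
```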

AUTHOR
The Open vSwitch Development Community

COPYRIGHT
2016-2023, The Open vSwitch Development Community

3.1                             Jun 08, 2023                        OVSDB(5)