LIBCXL(3)                                                     LIBCXL(3)

NAME
libcxl - A library to interact with CXL devices through sysfs(5) and
ioctl(2) interfaces

SYNOPSIS
#include <cxl/libcxl.h>
cc ... -lcxl

DESCRIPTION
libcxl provides interfaces to interact with CXL devices in Linux, using
sysfs interfaces for most kernel interactions, and the ioctl()
interface for command submission.

The starting point for all library interfaces is a cxl_ctx object,
returned by cxl_new(3). CXL Type 3 memory devices and other CXL device
objects are descendants of the cxl_ctx object, and can be iterated via
an iterator API of the form
cxl_<object>_foreach(<parent object>, <object iterator>).

MEMDEVS
The object representing a CXL memory expander (Type 3 device) is struct
cxl_memdev. Library interfaces related to these devices have the prefix
cxl_memdev_. These interfaces are mostly associated with sysfs
interactions (unless otherwise noted in their respective documentation
sections). They are typically used to retrieve data published by the
kernel, or to send data or trigger kernel operations for a given
device.

MEMDEV: Enumeration
struct cxl_memdev *cxl_memdev_get_first(struct cxl_ctx *ctx);
struct cxl_memdev *cxl_memdev_get_next(struct cxl_memdev *memdev);
struct cxl_ctx *cxl_memdev_get_ctx(struct cxl_memdev *memdev);
const char *cxl_memdev_get_host(struct cxl_memdev *memdev);
struct cxl_memdev *cxl_endpoint_get_memdev(struct cxl_endpoint *endpoint);

#define cxl_memdev_foreach(ctx, memdev) \
        for (memdev = cxl_memdev_get_first(ctx); \
             memdev != NULL; \
             memdev = cxl_memdev_get_next(memdev))

CXL memdev instances are enumerated from the global library context
struct cxl_ctx. By default a memdev only offers a portal to submit
memory device commands. See the port, decoder, and endpoint APIs to
determine what, if any, CXL Memory Resources are reachable given a
specific memdev.

The host of a memdev is the PCIe Endpoint device that registered its
CXL capabilities with the Linux CXL core.

MEMDEV: Attributes
int cxl_memdev_get_id(struct cxl_memdev *memdev);
unsigned long long cxl_memdev_get_serial(struct cxl_memdev *memdev);
const char *cxl_memdev_get_devname(struct cxl_memdev *memdev);
int cxl_memdev_get_major(struct cxl_memdev *memdev);
int cxl_memdev_get_minor(struct cxl_memdev *memdev);
unsigned long long cxl_memdev_get_pmem_size(struct cxl_memdev *memdev);
unsigned long long cxl_memdev_get_ram_size(struct cxl_memdev *memdev);
const char *cxl_memdev_get_firmware_verison(struct cxl_memdev *memdev);
size_t cxl_memdev_get_label_size(struct cxl_memdev *memdev);
int cxl_memdev_nvdimm_bridge_active(struct cxl_memdev *memdev);
int cxl_memdev_get_numa_node(struct cxl_memdev *memdev);

A memdev is given a kernel device name of the form "mem%d" where an id
(cxl_memdev_get_id()) is dynamically allocated as devices are
discovered. Note that there are no guarantees that ids / kernel device
names for memdevs are stable from one boot to the next, because devices
are enumerated asynchronously. If a stable identifier is needed, use
cxl_memdev_get_serial(), which returns a value according to the Device
Serial Number Extended Capability in the PCIe 5.0 Base Specification.

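The difference between the unstable id and the stable serial can be
sketched without any hardware. The record type and helper below are
hypothetical and not part of libcxl; in a real program the fields would
be filled from cxl_memdev_get_id() and cxl_memdev_get_serial() while
iterating with cxl_memdev_foreach().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of one memdev, not a libcxl type. */
struct demo_memdev_info {
	int id;                    /* kernel-assigned, may differ per boot */
	unsigned long long serial; /* PCIe Device Serial Number, stable */
};

/* Return the current id of the device with this serial, or -1. */
int demo_id_for_serial(const struct demo_memdev_info *devs, size_t n,
		       unsigned long long serial)
{
	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (devs[i].serial == serial)
			return devs[i].id;
	return -1;
}
```

A configuration file that records the serial rather than "mem0" keeps
pointing at the same physical device even if discovery order changes.
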
The character device node for command submission can be found by
default at /dev/cxl/mem%d, or created with a major / minor returned
from cxl_memdev_get_{major,minor}().

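As a minimal sketch of both options, the helpers below format the
default node path and combine a major / minor pair into a dev_t. The
helper names are hypothetical; the integer arguments stand in for the
values returned by cxl_memdev_get_id(), cxl_memdev_get_major() and
cxl_memdev_get_minor().

```c
#include <assert.h>
#include <stdio.h>
#include <sys/sysmacros.h> /* makedev(), Linux / glibc */
#include <sys/types.h>

/*
 * Format the default node path for a memdev id, e.g. "/dev/cxl/mem3".
 * Returns the number of characters written, or -1 if buf is too small.
 */
int demo_memdev_node_path(char *buf, size_t len, int id)
{
	assert(buf != NULL && id >= 0);
	int n = snprintf(buf, len, "/dev/cxl/mem%d", id);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Combine a major / minor pair into a device number. */
dev_t demo_memdev_devnum(int dev_major, int dev_minor)
{
	assert(dev_major >= 0 && dev_minor >= 0);
	return makedev((unsigned int)dev_major, (unsigned int)dev_minor);
}
```

A sufficiently privileged caller could pass the resulting device number
to mknod(2) together with S_IFCHR to create the node at a non-default
location.
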
The pmem_size and ram_size attributes return the current provisioning
of DPA (Device Physical Address / local capacity) in the device.

cxl_memdev_get_numa_node() returns the affinitized CPU node number if
available or -1 otherwise.

MEMDEV: Control
int cxl_memdev_disable_invalidate(struct cxl_memdev *memdev);
int cxl_memdev_enable(struct cxl_memdev *memdev);

When a memory device is disabled it unregisters its associated
endpoints and potentially intervening switch ports if there are no
other memdevs pinning that port active. That means that any existing
port objects that the library has previously returned are invalid and
need to be re-read. Callers must be careful to re-retrieve port objects
after cxl_memdev_disable_invalidate(). Any usage of a previously
obtained port object after a cxl_memdev_disable_invalidate() call is a
use-after-free programming error. It follows that after
cxl_memdev_enable() new ports may appear in the topology that were not
previously enumerable.

Note
cxl_memdev_disable_invalidate() will force disable the memdev
regardless of whether the memory provided by the device is in
active use by the operating system. Callers take responsibility for
ensuring that it is safe to disable the memory device. Otherwise,
this call can be as destructive as ripping a DIMM out of a running
system. Like all other libcxl calls that mutate the system state or
divulge security sensitive information, this call requires root /
CAP_SYS_ADMIN.

MEMDEV: Commands
struct cxl_cmd *cxl_cmd_new_raw(struct cxl_memdev *memdev, int opcode);
struct cxl_cmd *cxl_cmd_new_identify(struct cxl_memdev *memdev);
struct cxl_cmd *cxl_cmd_new_get_health_info(struct cxl_memdev *memdev);
struct cxl_cmd *cxl_cmd_new_read_label(struct cxl_memdev *memdev,
                unsigned int offset, unsigned int length);
struct cxl_cmd *cxl_cmd_new_write_label(struct cxl_memdev *memdev, void *buf,
                unsigned int offset, unsigned int length);
int cxl_memdev_zero_label(struct cxl_memdev *memdev, size_t length,
                size_t offset);
int cxl_memdev_read_label(struct cxl_memdev *memdev, void *buf, size_t length,
                size_t offset);
int cxl_memdev_write_label(struct cxl_memdev *memdev, void *buf, size_t length,
                size_t offset);
struct cxl_cmd *cxl_cmd_new_get_partition(struct cxl_memdev *memdev);
struct cxl_cmd *cxl_cmd_new_set_partition(struct cxl_memdev *memdev,
                unsigned long long volatile_size);

A cxl_cmd is a reference counted object which is used to perform
Mailbox commands as described in the CXL Specification. A cxl_cmd
object is tied to a cxl_memdev. Associated library interfaces have the
prefix cxl_cmd_. Within this sub-class of interfaces, there are:

• cxl_cmd_new_*() interfaces that allocate a new cxl_cmd object for a
  given command type targeted at a given memdev. As part of the
  command instantiation process the library validates that the
  command is supported by the memory device, otherwise it returns
  NULL to indicate no support. The libcxl command id is translated by
  the kernel into a CXL standard opcode. See the potential command
  ids in /usr/include/linux/cxl_mem.h.

• cxl_cmd_<name>_set_<field> interfaces that set specific fields in a
  cxl_cmd

• cxl_cmd_submit which submits the command via ioctl()

• cxl_cmd_<name>_get_<field> interfaces that get specific fields out
  of the command response

• cxl_cmd_get_* interfaces to get general command related
  information.

cxl_cmd_new_raw() supports so-called RAW commands where the command id
is RAW and it carries an unmodified CXL memory device command payload
associated with the opcode argument. Given that the kernel does minimal
input validation on these commands, raw commands are typically not
supported by the kernel outside debug build scenarios. libcxl is
limited to supporting commands that appear in the CXL standard / public
specifications.

cxl_memdev_{read,write,zero}_label() are helpers for marshaling
multiple label access commands over an arbitrary extent of the device’s
label area.

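To illustrate what marshaling an extent into multiple commands
involves, the sketch below splits a transfer into payload-sized chunks.
It is one plausible scheme, not libcxl's actual implementation; the
helper names and the max_payload parameter (a per-command byte limit)
are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of label commands needed to move length bytes when a single
 * command carries at most max_payload bytes (hypothetical limit).
 */
size_t demo_label_cmd_count(size_t length, size_t max_payload)
{
	assert(max_payload > 0);
	return length / max_payload + (length % max_payload != 0);
}

/* Size in bytes of the i-th chunk (0-based), or 0 past the end. */
size_t demo_label_chunk_len(size_t length, size_t max_payload, size_t i)
{
	assert(max_payload > 0);
	size_t start = i * max_payload;

	if (start >= length)
		return 0;
	size_t remaining = length - start;
	return remaining < max_payload ? remaining : max_payload;
}
```

Each chunk i would then map to one read or write label command at
offset + i * max_payload, with the final chunk possibly shorter than
the rest.
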
cxl_cmd_partition_set_mode() supports selecting NEXTBOOT or IMMEDIATE
mode. When CXL_SETPART_IMMEDIATE mode is set, it is the caller’s
responsibility to avoid immediate changes to partitioning when the
device is in use. When CXL_SETPART_NEXTBOOT mode is set, the change in
partitioning shall become the “next” configuration, to become active on
the next device reset.

BUSES
The CXL Memory space is CPU and Device coherent. The address ranges
that support coherent access are described by platform firmware and
communicated to the operating system via a CXL root object struct
cxl_bus.

BUS: Enumeration
struct cxl_bus *cxl_bus_get_first(struct cxl_ctx *ctx);
struct cxl_bus *cxl_bus_get_next(struct cxl_bus *bus);
struct cxl_ctx *cxl_bus_get_ctx(struct cxl_bus *bus);
struct cxl_bus *cxl_memdev_get_bus(struct cxl_memdev *memdev);
struct cxl_bus *cxl_port_get_bus(struct cxl_port *port);
struct cxl_bus *cxl_endpoint_get_bus(struct cxl_endpoint *endpoint);

#define cxl_bus_foreach(ctx, bus) \
        for (bus = cxl_bus_get_first(ctx); bus != NULL; \
             bus = cxl_bus_get_next(bus))

When a memdev is active it has established a CXL port hierarchy between
itself and the root of its associated CXL topology. The
cxl_{memdev,endpoint}_get_bus() helpers walk that topology to retrieve
the associated bus object.

BUS: Attributes
const char *cxl_bus_get_provider(struct cxl_bus *bus);
const char *cxl_bus_get_devname(struct cxl_bus *bus);
int cxl_bus_get_id(struct cxl_bus *bus);

The provider name of a bus is a persistent name that is independent of
discovery order. The possible provider names are ACPI.CXL and cxl_test.
The devname and id attributes, like other objects, are just the kernel
device names that are subject to change based on discovery order.

PORTS
CXL ports track the PCIe hierarchy between a platform firmware CXL root
object, through CXL / PCIe Host Bridges, CXL / PCIe Root Ports, and CXL
/ PCIe Switch Ports.

PORT: Enumeration
struct cxl_port *cxl_bus_get_port(struct cxl_bus *bus);
struct cxl_port *cxl_port_get_first(struct cxl_port *parent);
struct cxl_port *cxl_port_get_next(struct cxl_port *port);
struct cxl_port *cxl_port_get_parent(struct cxl_port *port);
struct cxl_ctx *cxl_port_get_ctx(struct cxl_port *port);
const char *cxl_port_get_host(struct cxl_port *port);
struct cxl_port *cxl_decoder_get_port(struct cxl_decoder *decoder);
struct cxl_port *cxl_port_get_next_all(struct cxl_port *port,
                                       const struct cxl_port *top);
struct cxl_port *cxl_dport_get_port(struct cxl_dport *dport);

#define cxl_port_foreach(parent, port) \
        for (port = cxl_port_get_first(parent); port != NULL; \
             port = cxl_port_get_next(port))

#define cxl_port_foreach_all(top, port) \
        for (port = cxl_port_get_first(top); port != NULL; \
             port = cxl_port_get_next_all(port, top))

A bus object encapsulates a CXL port object. Use cxl_bus_get_port() to
use generic port APIs on root objects.

Ports are hierarchical. All but the root object have another CXL port
as a parent object retrievable via cxl_port_get_parent().

The root port of a hierarchy can be retrieved via any port instance in
that hierarchy via cxl_port_get_bus().

The host of a port is the corresponding device name of the PCIe Root
Port, or Switch Upstream Port with CXL capabilities.

The cxl_port_foreach_all() helper does a depth first iteration of all
ports beneath the top port argument.

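The visiting order of a depth first walk bounded by a top port can be
shown on a mock tree. Everything below is hypothetical: struct
demo_port and its link fields stand in for the opaque struct cxl_port,
and demo_port_next_all() is an illustration of a pre-order successor
function, not libcxl's implementation of cxl_port_get_next_all().

```c
#include <assert.h>
#include <stddef.h>

/* Mock port node with explicit tree links. */
struct demo_port {
	int id;
	struct demo_port *parent;
	struct demo_port *first_child;
	struct demo_port *next_sibling;
};

/* Pre-order successor of port within the sub-tree beneath top. */
struct demo_port *demo_port_next_all(struct demo_port *port,
				     const struct demo_port *top)
{
	assert(port != NULL && top != NULL);
	if (port->first_child)
		return port->first_child; /* descend first */
	while (!port->next_sibling && port->parent && port->parent != top)
		port = port->parent;      /* climb until a sibling exists */
	return port->next_sibling;        /* NULL once the walk is done */
}

/*
 * Build a fixed tree and return the id of the n-th port visited, or -1
 * if the walk ends earlier:
 *
 *   top(0) +- a(1) +- c(3) -- e(5)
 *          |       +- d(4)
 *          +- b(2)
 */
int demo_nth_visited(int n)
{
	struct demo_port top = { 0, NULL, NULL, NULL };
	struct demo_port a = { 1, &top, NULL, NULL };
	struct demo_port b = { 2, &top, NULL, NULL };
	struct demo_port c = { 3, &a, NULL, NULL };
	struct demo_port d = { 4, &a, NULL, NULL };
	struct demo_port e = { 5, &c, NULL, NULL };
	struct demo_port *port;
	int i = 0;

	top.first_child = &a;
	a.next_sibling = &b;
	a.first_child = &c;
	c.next_sibling = &d;
	c.first_child = &e;

	/* Same loop shape as cxl_port_foreach_all(top, port). */
	for (port = top.first_child; port != NULL;
	     port = demo_port_next_all(port, &top), i++)
		if (i == n)
			return port->id;
	return -1;
}
```

On this tree the walk visits ids 1, 3, 5, 4, 2: each port is reported
before its children, and the top port itself is not visited.
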
PORT: Control
int cxl_port_disable_invalidate(struct cxl_port *port);
int cxl_port_enable(struct cxl_port *port);

cxl_port_disable_invalidate() is a violent operation that disables an
entire sub-tree of CXL Memory Devices and Ports. Only use it for test /
debug scenarios, or after ensuring that all impacted devices are
deactivated first.

PORT: Attributes
const char *cxl_port_get_devname(struct cxl_port *port);
int cxl_port_get_id(struct cxl_port *port);
int cxl_port_is_enabled(struct cxl_port *port);
bool cxl_port_is_root(struct cxl_port *port);
bool cxl_port_is_switch(struct cxl_port *port);
bool cxl_port_is_endpoint(struct cxl_port *port);
bool cxl_port_hosts_memdev(struct cxl_port *port, struct cxl_memdev *memdev);
int cxl_port_get_nr_dports(struct cxl_port *port);

The port type is communicated via cxl_port_is_<type>(). An enabled port
is one that has succeeded in discovering the CXL component registers in
the host device and has enumerated its downstream ports. In order for a
memdev to be enabled for CXL memory operation all CXL ports in its
ancestry must also be enabled including a root port, an arbitrary
number of intervening switch ports, and a terminal endpoint port.

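The ancestry requirement amounts to walking parent links until the
root and checking each port on the way. The mock type below is
hypothetical; a real program would follow cxl_port_get_parent() and
test cxl_port_is_enabled() on each opaque struct cxl_port instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock port state, not a libcxl type. */
struct demo_port_state {
	const struct demo_port_state *parent; /* NULL at the root */
	bool enabled;
};

/* True only if the port and every ancestor up to the root are enabled. */
bool demo_ancestry_enabled(const struct demo_port_state *port)
{
	assert(port != NULL);
	for (; port != NULL; port = port->parent)
		if (!port->enabled)
			return false;
	return true;
}
```

A single disabled switch port anywhere between the endpoint and the
root is enough to make the check fail for every memdev beneath it.
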
cxl_port_hosts_memdev() returns true if the port’s host appears in the
memdev host’s device topology ancestry.

DPORTS
A CXL dport object represents a CXL / PCIe Switch Downstream Port,
or a CXL / PCIe host bridge.

struct cxl_dport *cxl_dport_get_first(struct cxl_port *port);
struct cxl_dport *cxl_dport_get_next(struct cxl_dport *dport);
struct cxl_dport *cxl_port_get_dport_by_memdev(struct cxl_port *port,
                                               struct cxl_memdev *memdev);

#define cxl_dport_foreach(port, dport) \
        for (dport = cxl_dport_get_first(port); dport != NULL; \
             dport = cxl_dport_get_next(dport))

const char *cxl_dport_get_devname(struct cxl_dport *dport);
const char *cxl_dport_get_physical_node(struct cxl_dport *dport);
int cxl_dport_get_id(struct cxl_dport *dport);
bool cxl_dport_maps_memdev(struct cxl_dport *dport, struct cxl_memdev *memdev);

The id of a dport is the hardware identifier used by an upstream
port to reference a downstream port. The physical node of a dport
is only available for platform firmware defined downstream ports
and aliases the companion object, like a PCI host bridge, in the PCI
device hierarchy.

The cxl_dport_maps_memdev() helper checks if a dport is an ancestor
of a given memdev.

ENDPOINTS
CXL endpoint objects encapsulate the set of host-managed device-memory
(HDM) decoders in a physical memory device. The endpoint is the last
hop in a decoder chain that translates SPA to DPA
(system-physical-address to device-local-physical-address).

ENDPOINT: Enumeration
struct cxl_endpoint *cxl_endpoint_get_first(struct cxl_port *parent);
struct cxl_endpoint *cxl_endpoint_get_next(struct cxl_endpoint *endpoint);
struct cxl_ctx *cxl_endpoint_get_ctx(struct cxl_endpoint *endpoint);
struct cxl_port *cxl_endpoint_get_parent(struct cxl_endpoint *endpoint);
struct cxl_port *cxl_endpoint_get_port(struct cxl_endpoint *endpoint);
const char *cxl_endpoint_get_host(struct cxl_endpoint *endpoint);
struct cxl_endpoint *cxl_memdev_get_endpoint(struct cxl_memdev *memdev);
struct cxl_endpoint *cxl_port_to_endpoint(struct cxl_port *port);

#define cxl_endpoint_foreach(port, endpoint) \
        for (endpoint = cxl_endpoint_get_first(port); endpoint != NULL; \
             endpoint = cxl_endpoint_get_next(endpoint))

ENDPOINT: Attributes
const char *cxl_endpoint_get_devname(struct cxl_endpoint *endpoint);
int cxl_endpoint_get_id(struct cxl_endpoint *endpoint);
int cxl_endpoint_is_enabled(struct cxl_endpoint *endpoint);

DECODERS
Decoder objects are associated with the "HDM Decoder Capability"
published in Port devices and CXL capable PCIe endpoints. The kernel
additionally models platform firmware described CXL memory ranges (like
the ACPI CEDT.CFMWS) as static decoder objects. They route System
Physical Addresses through a port topology to an endpoint decoder that
does the final translation from SPA to DPA (system-physical-address to
device-local-physical-address).

DECODER: Enumeration
struct cxl_decoder *cxl_decoder_get_first(struct cxl_port *port);
struct cxl_decoder *cxl_decoder_get_next(struct cxl_decoder *decoder);
struct cxl_ctx *cxl_decoder_get_ctx(struct cxl_decoder *decoder);
struct cxl_decoder *cxl_target_get_decoder(struct cxl_target *target);

#define cxl_decoder_foreach(port, decoder) \
        for (decoder = cxl_decoder_get_first(port); decoder != NULL; \
             decoder = cxl_decoder_get_next(decoder))

The definition of a CXL port in libcxl is an object that hosts one or
more CXL decoder objects.

DECODER: Attributes
unsigned long long cxl_decoder_get_resource(struct cxl_decoder *decoder);
unsigned long long cxl_decoder_get_size(struct cxl_decoder *decoder);
const char *cxl_decoder_get_devname(struct cxl_decoder *decoder);
int cxl_decoder_get_id(struct cxl_decoder *decoder);
int cxl_decoder_get_nr_targets(struct cxl_decoder *decoder);

enum cxl_decoder_target_type {
        CXL_DECODER_TTYPE_UNKNOWN,
        CXL_DECODER_TTYPE_EXPANDER,
        CXL_DECODER_TTYPE_ACCELERATOR,
};

enum cxl_decoder_target_type
cxl_decoder_get_target_type(struct cxl_decoder *decoder);
bool cxl_decoder_is_pmem_capable(struct cxl_decoder *decoder);
bool cxl_decoder_is_volatile_capable(struct cxl_decoder *decoder);
bool cxl_decoder_is_mem_capable(struct cxl_decoder *decoder);
bool cxl_decoder_is_accelmem_capable(struct cxl_decoder *decoder);
bool cxl_decoder_is_locked(struct cxl_decoder *decoder);

The kernel protects the enumeration of the physical address layout of
the system. Without CAP_SYS_ADMIN cxl_decoder_get_resource() returns
ULLONG_MAX to indicate that the address information was not
retrievable. Otherwise, cxl_decoder_get_resource() returns the
currently programmed value of the base of the decoder’s decode range. A
zero-sized decoder indicates a disabled decoder.

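A caller typically has to tell these cases apart before using the
range. The sketch below does so on plain integers; the enum and helper
names are hypothetical, and the arguments stand in for the values
returned by cxl_decoder_get_resource() and cxl_decoder_get_size().
Checking the ULLONG_MAX sentinel before the size is a choice made for
this example.

```c
#include <assert.h>
#include <limits.h>

enum demo_decoder_state {
	DEMO_DECODER_NO_ACCESS, /* address information not retrievable */
	DEMO_DECODER_DISABLED,  /* zero-sized decode range */
	DEMO_DECODER_ACTIVE,    /* programmed base and non-zero size */
};

/* Classify a decoder from its resource (base) and size values. */
enum demo_decoder_state demo_decoder_classify(unsigned long long resource,
					      unsigned long long size)
{
	if (resource == ULLONG_MAX)
		return DEMO_DECODER_NO_ACCESS;
	if (size == 0)
		return DEMO_DECODER_DISABLED;
	return DEMO_DECODER_ACTIVE;
}

/* Last address covered by an active decoder's range (inclusive). */
unsigned long long demo_decoder_end(unsigned long long resource,
				    unsigned long long size)
{
	assert(resource != ULLONG_MAX && size > 0);
	return resource + size - 1;
}
```

For example, a base of 0x100000000 with a size of 0x40000000 (1 GiB)
covers addresses up to and including 0x13fffffff.
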
Root level decoders support only a limited set of memory types in their
address range. The cxl_decoder_is_<memtype>_capable() helpers identify
what is supported. Switch level decoders, in contrast, are capable of
routing any memory type, i.e. they just forward along the memory type
support from their parent port. Endpoint decoders follow the
capabilities of their host memory device.

The capabilities of a decoder are not to be confused with their type /
mode. The type ultimately depends on the endpoint. For example, an
accelerator requires all decoders in its ancestry to be set to
CXL_DECODER_TTYPE_ACCELERATOR, and conversely plain memory expander
devices require CXL_DECODER_TTYPE_EXPANDER.

Platform firmware may set up the CXL decode hierarchy before the OS
boots, and may additionally require that the OS not change the decode
settings. This property is indicated by the cxl_decoder_is_locked()
API.

TARGETS
A root or switch level decoder takes an SPA
(system-physical-address) as input and routes it to a downstream
port. Which downstream port depends on the downstream port’s
position in the interleave. A struct cxl_target object represents
the properties of a given downstream port relative to its
interleave configuration.

struct cxl_target *cxl_decoder_get_target_by_memdev(struct cxl_decoder *decoder,
                                                    struct cxl_memdev *memdev);
struct cxl_target *
cxl_decoder_get_target_by_position(struct cxl_decoder *decoder, int position);
struct cxl_target *cxl_target_get_first(struct cxl_decoder *decoder);
struct cxl_target *cxl_target_get_next(struct cxl_target *target);

#define cxl_target_foreach(decoder, target) \
        for (target = cxl_target_get_first(decoder); target != NULL; \
             target = cxl_target_get_next(target))

For switch decoders, target objects can only be enumerated if the
decoder has been configured. For root decoders they are always
available since the root decoder target mapping is static. The
cxl_decoder_get_target_by_memdev() helper walks the topology to
validate if the given memory device is capable of receiving cycles
from this upstream decoder. It does not validate if the memory
device is currently configured to participate in that decode.

int cxl_target_get_position(struct cxl_target *target);
unsigned long cxl_target_get_id(struct cxl_target *target);
const char *cxl_target_get_devname(struct cxl_target *target);
bool cxl_target_maps_memdev(struct cxl_target *target,
                            struct cxl_memdev *memdev);
const char *cxl_target_get_physical_node(struct cxl_target *target);

The position of a target, along with the interleave granularity,
dictates which addresses in the decoder’s resource range map to which
port.

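As a worked example of that relationship, the sketch below uses a
simplified modulo model in which consecutive granularity-sized blocks
rotate across the targets in position order. The helper name and its
parameters are assumptions for illustration; the selection actually
performed by hardware is defined by the CXL specification and may
differ, for instance when a platform applies address hashing.

```c
#include <assert.h>

/*
 * Interleave position that decodes address spa, given the base of the
 * decoder's resource range, the interleave granularity in bytes, and
 * the number of interleave ways (targets).
 */
unsigned int demo_target_position(unsigned long long spa,
				  unsigned long long base,
				  unsigned int granularity,
				  unsigned int ways)
{
	assert(granularity > 0 && ways > 0 && spa >= base);
	return (unsigned int)(((spa - base) / granularity) % ways);
}
```

With a granularity of 256 bytes and 4 ways, offsets 0-255 from the base
map to position 0, offsets 256-511 to position 1, and offset 1024 wraps
back to position 0. The resulting position could then be passed to
cxl_decoder_get_target_by_position() to find the matching target.
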
The target id is an identifier that the CXL port uses to reference
this downstream port. For CXL / PCIe downstream switch ports the id
is defined by the PCIe Link Capability Port Number field. For root
decoders the id is specified by a platform firmware specific
mechanism. For ACPI.CXL defined root ports the id comes from the
CEDT.CHBS / ACPI0016 _UID.

The device name of a target is the name of the host device for the
downstream port. For CXL / PCIe downstream ports the devname is the
downstream switch port PCI device. For CXL root ports the devname
is a platform firmware object for the host bridge, like an ACPI0016
device instance.

The cxl_target_maps_memdev() helper is the companion of
cxl_decoder_get_target_by_memdev() to determine which downstream
ports / targets are capable of mapping which memdevs.

Some platform firmware implementations define an alias / companion
device to represent the root of a PCI device hierarchy. The
cxl_target_get_physical_node() helper returns the device name of
that companion object in the PCI hierarchy.

COPYRIGHT
Copyright © 2016 - 2022, Intel Corporation. License GPLv2: GNU GPL
version 2 http://gnu.org/licenses/gpl.html. This is free software: you
are free to change and redistribute it. There is NO WARRANTY, to the
extent permitted by law.

SEE ALSO
cxl(1)

03/08/2022                                                    LIBCXL(3)