HBWMALLOC(3)                       HBWMALLOC                      HBWMALLOC(3)

NAME
       hbwmalloc - The high bandwidth memory interface

       Note: The hbwmalloc.h functionality is considered a stable API
       (STANDARD API).

SYNOPSIS
       #include <hbwmalloc.h>

       Link with -lmemkind

       int hbw_check_available(void);
       void* hbw_malloc(size_t size);
       void* hbw_calloc(size_t nmemb, size_t size);
       void* hbw_realloc(void *ptr, size_t size);
       void hbw_free(void *ptr);
       size_t hbw_malloc_usable_size(void *ptr);
       int hbw_posix_memalign(void **memptr, size_t alignment, size_t size);
       int hbw_posix_memalign_psize(void **memptr, size_t alignment, size_t size, hbw_pagesize_t pagesize);
       hbw_policy_t hbw_get_policy(void);
       int hbw_set_policy(hbw_policy_t mode);
       int hbw_verify_memory_region(void *addr, size_t size, int flags);

DESCRIPTION
       hbw_check_available() returns zero if high bandwidth memory is
       available, or an error code described in the ERRORS section if not.

       hbw_malloc() allocates size bytes of uninitialized high bandwidth
       memory. The allocated space is suitably aligned (after possible
       pointer coercion) for storage of any type of object. If size is
       zero, then hbw_malloc() returns NULL.

       hbw_calloc() allocates space for nmemb objects in high bandwidth
       memory, each size bytes in length. The result is identical to
       calling hbw_malloc() with an argument of nmemb * size, with the
       exception that the allocated memory is explicitly initialized to
       zero bytes. If nmemb or size is 0, then hbw_calloc() returns NULL.

       hbw_realloc() changes the size of the previously allocated high
       bandwidth memory referenced by ptr to size bytes. The contents of
       the memory remain unchanged up to the lesser of the new and old
       sizes. If the new size is larger, the contents of the newly
       allocated portion of the memory are undefined. Upon success, the
       memory referenced by ptr is freed and a pointer to the newly
       allocated high bandwidth memory is returned.

       Note: hbw_realloc() may move the memory allocation, resulting in a
       return value different from ptr.

       If ptr is NULL, hbw_realloc() behaves identically to hbw_malloc()
       for the specified size. If size is equal to zero and ptr is not
       NULL, then the call is equivalent to hbw_free(ptr) and NULL is
       returned. The address ptr, if not NULL, must have been returned by
       a previous call to hbw_malloc(), hbw_calloc(), hbw_realloc() or
       hbw_posix_memalign(). Otherwise, or if hbw_free(ptr) was called
       before, undefined behavior occurs.

       Note: hbw_realloc() cannot be used with a pointer returned by
       hbw_posix_memalign_psize().
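
       The hbw_realloc() rules above suggest a common calling pattern:
       keep the old pointer until the call succeeds, and treat a zero
       size separately because it frees the block. The following is a
       minimal sketch of that pattern, not part of the library. The names
       resize_buffer() and realloc_fn are hypothetical. The reallocation
       function is passed in as a parameter because hbw_realloc() has the
       same signature as realloc(3), so the helper can be exercised
       without memkind installed.

```c
#include <stddef.h>
#include <stdlib.h> /* realloc(), free(): only needed when trying this without memkind */

/* Signature shared by realloc() and hbw_realloc(). Hypothetical typedef. */
typedef void *(*realloc_fn)(void *ptr, size_t size);

/*
 * Resize *buf to new_size bytes using the supplied reallocation function.
 * Returns 0 on success and -1 on failure. On failure *buf is left
 * untouched and is still owned by the caller.
 */
static int resize_buffer(realloc_fn do_realloc, void **buf, size_t new_size)
{
    void *tmp;

    /* A zero size would free the block and return NULL, which is not an
     * allocation failure. This sketch rejects it so the two cases are never
     * confused; callers that want to release memory call the free function. */
    if (buf == NULL || new_size == 0)
        return -1;

    tmp = do_realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;   /* the old block is still valid */

    *buf = tmp;      /* the block may have moved */
    return 0;
}
```

       With memkind linked in, the call would be
       resize_buffer(hbw_realloc, &buf, new_size), and the buffer would
       later be released with hbw_free().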

       hbw_free() causes the allocated memory referenced by ptr to be made
       available for future allocations. If ptr is NULL, no action occurs.
       The address ptr, if not NULL, must have been returned by a previous
       call to hbw_malloc(), hbw_calloc(), hbw_realloc(),
       hbw_posix_memalign() or hbw_posix_memalign_psize(). Otherwise, or
       if hbw_free(ptr) was called before, undefined behavior occurs.

       hbw_malloc_usable_size() returns the number of usable bytes in the
       block pointed to by ptr, a pointer to a block of memory allocated
       by hbw_malloc(), hbw_calloc(), hbw_realloc(), hbw_posix_memalign(),
       or hbw_posix_memalign_psize().

       hbw_posix_memalign() allocates size bytes of high bandwidth memory
       such that the allocation's base address is a multiple of alignment,
       and returns the allocation in the value pointed to by memptr. The
       requested alignment must be a power of 2 at least as large as
       sizeof(void*). If size is 0, then hbw_posix_memalign() returns 0,
       with a NULL returned in memptr.
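
       The alignment requirement can be checked before calling
       hbw_posix_memalign() or hbw_posix_memalign_psize(). The helper
       below is an illustrative sketch only; the name
       is_valid_hbw_alignment() is hypothetical and not part of
       hbwmalloc.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if alignment is a power of two and at least sizeof(void *). */
static bool is_valid_hbw_alignment(size_t alignment)
{
    /* A power of two has exactly one bit set, so clearing the lowest set
     * bit with x & (x - 1) must leave zero. Zero itself is excluded. */
    bool power_of_two = alignment != 0 && (alignment & (alignment - 1)) == 0;

    return power_of_two && alignment >= sizeof(void *);
}
```

       A power of two that is at least sizeof(void*) is also a multiple of
       sizeof(void*), which is the condition named in the ERRORS section.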

       hbw_posix_memalign_psize() allocates size bytes of high bandwidth
       memory such that the allocation's base address is a multiple of
       alignment, and returns the allocation in the value pointed to by
       memptr. The requested alignment must be a power of 2 at least as
       large as sizeof(void*). The memory will be allocated using pages
       determined by the pagesize variable, which may be one of the
       following enumerated values:

       HBW_PAGESIZE_4KB
              The four kilobyte page size option. Note that with
              transparent huge pages enabled these allocations may be
              promoted by the operating system to two megabyte pages.

       HBW_PAGESIZE_2MB
              The two megabyte page size option. Note: This page size
              requires the huge pages configuration described in the
              SYSTEM CONFIGURATION section.

       HBW_PAGESIZE_1GB (DEPRECATED)
              This option allows the user to specify arbitrary sizes
              backed by 1GB chunks of huge pages. Huge pages are allocated
              even if the size is not a multiple of 1GB. Note: This page
              size requires the huge pages configuration described in the
              SYSTEM CONFIGURATION section.

       HBW_PAGESIZE_1GB_STRICT (DEPRECATED)
              The total size of the allocation must be a multiple of 1GB
              with this option, otherwise the allocation will fail. Note:
              This page size requires the huge pages configuration
              described in the SYSTEM CONFIGURATION section.

       Note: The HBW_PAGESIZE_2MB, HBW_PAGESIZE_1GB and
       HBW_PAGESIZE_1GB_STRICT options are not supported with the
       HBW_POLICY_INTERLEAVE policy.

       hbw_get_policy() returns the current fallback policy when
       insufficient high bandwidth memory is available.

       hbw_set_policy() sets the current fallback policy. The policy can
       be modified only once in the lifetime of an application and only
       before calling hbw_malloc(), hbw_calloc(), hbw_realloc(),
       hbw_posix_memalign(), or hbw_posix_memalign_psize().
       Note: If the policy is not set, then HBW_POLICY_PREFERRED will be
       used by default.

       HBW_POLICY_BIND
              If insufficient high bandwidth memory from the nearest NUMA
              node is available to satisfy a request, the allocated
              pointer is set to NULL and errno is set to ENOMEM. If
              insufficient high bandwidth memory pages are available at
              fault time, the Out Of Memory (OOM) Killer is triggered.
              Note that pages are faulted exclusively from the high
              bandwidth NUMA node nearest at time of allocation, not at
              time of fault.

       HBW_POLICY_BIND_ALL
              If insufficient high bandwidth memory is available to
              satisfy a request, the allocated pointer is set to NULL and
              errno is set to ENOMEM. If insufficient high bandwidth
              memory pages are available at fault time, the Out Of Memory
              (OOM) Killer is triggered. Note that pages are faulted from
              the high bandwidth NUMA nodes. The nearest NUMA node is
              selected at time of page fault.

       HBW_POLICY_PREFERRED
              If insufficient memory is available from the high bandwidth
              NUMA node closest at allocation time, fall back to standard
              memory (default) with the smallest NUMA distance.

       HBW_POLICY_INTERLEAVE
              Interleave faulted pages from across all high bandwidth NUMA
              nodes using standard size pages (the Transparent Huge Page
              feature is disabled).

       hbw_verify_memory_region() verifies whether a memory region falls
       fully into high bandwidth memory. It returns 0 if the memory
       address range from addr to addr + size is allocated in high
       bandwidth memory, -1 if any fragment of the memory was not backed
       by high bandwidth memory (e.g. when memory is not initialized), or
       one of the error codes described in the ERRORS section.

       Using this function in production code may result in a serious
       performance penalty.

       The flags argument may include optional flags that modify the
       function's behavior:

       HBW_TOUCH_PAGES
              Before checking pages, the function will touch the first
              byte of all pages in the address range from addr to
              addr + size by a read and a write (so the content will be
              overwritten by the same data as was read). Using this option
              may trigger the Out Of Memory Killer.
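
       The read-then-write touch described for HBW_TOUCH_PAGES can be
       pictured with the sketch below. It is an illustration of the idea
       only and makes no claim about how the library implements the flag.
       The name touch_pages() and the explicit page_size parameter are
       assumptions of this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Touch one byte in every page overlapping [addr, addr + size): the first
 * byte of the range, then the first byte of each following page.
 * page_size must be a power of two. Returns the number of bytes touched.
 */
static size_t touch_pages(void *addr, size_t size, size_t page_size)
{
    uintptr_t cur = (uintptr_t)addr;
    uintptr_t end = cur + size;
    size_t touched = 0;

    if (addr == NULL || size == 0 || page_size == 0)
        return 0;

    while (cur < end) {
        /* volatile keeps the compiler from dropping the read and the write */
        volatile unsigned char *byte = (volatile unsigned char *)cur;
        unsigned char value = *byte; /* read faults the page in */
        *byte = value;               /* write back the same data */
        touched++;

        /* advance to the start of the next page */
        cur = (cur & ~(uintptr_t)(page_size - 1)) + page_size;
    }
    return touched;
}
```

       Because each touch can fault in a page that was never backed by
       physical memory, this also shows why the option may trigger the
       Out Of Memory Killer.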

RETURN VALUE
       hbw_get_policy() returns HBW_POLICY_BIND, HBW_POLICY_BIND_ALL,
       HBW_POLICY_PREFERRED or HBW_POLICY_INTERLEAVE, which represents the
       current high bandwidth policy. hbw_free() does not return a value.
       hbw_malloc(), hbw_calloc() and hbw_realloc() return the pointer to
       the allocated memory, or NULL if the request fails.
       hbw_posix_memalign(), hbw_posix_memalign_psize() and
       hbw_set_policy() return zero on success and return an error code as
       described in the ERRORS section below on failure.

ERRORS
       Error codes described here are the POSIX standard error codes as
       defined in <errno.h>.

       hbw_check_available()
              returns ENODEV if high bandwidth memory is unavailable.

       hbw_posix_memalign() and hbw_posix_memalign_psize()
              If the alignment parameter is not a power of two, or is not
              a multiple of sizeof(void*), then EINVAL is returned. If the
              policy and pagesize combination is unsupported, then EINVAL
              is returned. If there is insufficient memory to satisfy the
              request, then ENOMEM is returned.

       hbw_set_policy()
              returns EPERM if hbw_set_policy() was called more than once,
              or EINVAL if the mode argument is not one of
              HBW_POLICY_PREFERRED, HBW_POLICY_BIND, HBW_POLICY_BIND_ALL
              or HBW_POLICY_INTERLEAVE.

       hbw_verify_memory_region()
              returns EINVAL if addr is NULL, size equals 0 or flags
              contains an unsupported bit. If the memory pointed to by
              addr could not be verified, then EFAULT is returned.
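
       The functions listed above hand the error code back as their
       return value, so a caller can translate that value directly. The
       helper below is a hedged sketch that restates the meanings given in
       this section; hbw_status_string() is a hypothetical name and is not
       provided by hbwmalloc.h.

```c
#include <errno.h>
#include <string.h>

/* Map a return code from the int-returning hbw_*() functions to text. */
static const char *hbw_status_string(int rc)
{
    switch (rc) {
    case 0:      return "success";
    case ENODEV: return "high bandwidth memory is unavailable";
    case EINVAL: return "invalid argument or unsupported policy and page size combination";
    case ENOMEM: return "insufficient memory to satisfy the request";
    case EPERM:  return "policy was already set";
    case EFAULT: return "memory region could not be verified";
    default:     return strerror(rc); /* any other code: system description */
    }
}
```

       The -1 result of hbw_verify_memory_region() is not an errno value
       and would need a separate check before calling such a helper.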

NOTES
       The <hbwmalloc.h> file defines the external functions and
       enumerations for the hbwmalloc library. These interfaces define a
       heap manager that targets high bandwidth memory NUMA nodes.

FILES
       /usr/bin/memkind-hbw-nodes
              Prints a comma-separated list of high bandwidth nodes.

ENVIRONMENT
       MEMKIND_HBW_NODES
              This environment variable is a comma-separated list of NUMA
              nodes that are treated as high bandwidth. It is parsed with
              the libnuma routine numa_parse_nodestring(), so the syntax
              described in the numa(3) man page for this routine applies;
              for example, 1-3,5 is a valid setting.
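
       To make the 1-3,5 example concrete, the sketch below expands such a
       list into a bit mask. It is not the libnuma parser and handles only
       comma-separated numbers and inclusive ranges below 64, whereas
       numa_parse_nodestring() accepts additional syntax described in
       numa(3). The name parse_node_list() and the 64-node limit are
       assumptions of this example.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a node list such as "1-3,5" into a bit mask (bit n set = node n).
 * Returns 0 on success and -1 on malformed input; *mask is written only
 * on success.
 */
static int parse_node_list(const char *s, uint64_t *mask)
{
    uint64_t result = 0;

    if (s == NULL || mask == NULL || *s == '\0')
        return -1;

    while (*s != '\0') {
        char *end;
        unsigned long first, last;

        if (!isdigit((unsigned char)*s))
            return -1;
        first = strtoul(s, &end, 10);
        last = first;
        s = end;

        if (*s == '-') {            /* inclusive range "first-last" */
            s++;
            if (!isdigit((unsigned char)*s))
                return -1;
            last = strtoul(s, &end, 10);
            s = end;
        }
        if (first > last || last >= 64)
            return -1;

        for (unsigned long n = first; n <= last; n++)
            result |= UINT64_C(1) << n;

        if (*s == ',') {
            s++;
            if (*s == '\0')
                return -1;          /* trailing comma */
        } else if (*s != '\0') {
            return -1;              /* unexpected character */
        }
    }
    *mask = result;
    return 0;
}
```

       For the string 1-3,5 the resulting mask has bits 1, 2, 3 and 5 set.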

       MEMKIND_ARENA_NUM_PER_KIND
              This environment variable sets the number of arenas per kind
              through an internal mechanism of the library. The value
              should be a positive integer (not greater than INT_MAX
              defined in <limits.h>). The user should set the value based
              on the characteristics of the application that is using the
              library. A higher value can provide better performance in
              extremely multithreaded applications at the cost of memory
              overhead. See the IMPLEMENTATION NOTES section of
              jemalloc(3) for more details about arenas.

       MEMKIND_HEAP_MANAGER
              Controls heap management behavior in the memkind library by
              switching to one of the available heap managers.
              Values:
                  JEMALLOC - sets the jemalloc heap manager
                  TBB - sets the Intel Threading Building Blocks heap
                  manager. This option requires the Intel Threading
                  Building Blocks library to be installed.

       Note: If MEMKIND_HEAP_MANAGER is not set, then the jemalloc heap
       manager will be used by default.

SYSTEM CONFIGURATION
       Interfaces for obtaining 2MB (HUGETLB) memory need allocated huge
       pages in the kernel's huge page pool.

       HUGETLB (huge pages)
              The current number of "persistent" huge pages can be read
              from the /proc/sys/vm/nr_hugepages file. The proposed way of
              setting huge pages is:
              sudo sysctl vm.nr_hugepages=<number_of_hugepages>.
              More information can be found here:
              ⟨https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt⟩

KNOWN ISSUES
       HUGETLB (huge pages)
              There might be some overhead in huge page consumption caused
              by heap management. If your allocation fails because of OOM,
              please try to allocate extra huge pages (e.g. 8 huge pages).
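
       As a rough aid for choosing <number_of_hugepages>, the sketch below
       rounds an allocation size up to whole 2MB pages and adds headroom
       for heap management. It is illustrative arithmetic only. The names
       huge_pages_to_reserve() and HUGE_PAGE_2MB are hypothetical, and the
       headroom of 8 pages used in the usage note is the example figure
       from the text above, not a guaranteed bound.

```c
#include <stddef.h>

#define HUGE_PAGE_2MB ((size_t)2 * 1024 * 1024)

/* Number of 2MB huge pages needed to back size bytes, plus headroom pages. */
static size_t huge_pages_to_reserve(size_t size, size_t headroom_pages)
{
    size_t pages = size / HUGE_PAGE_2MB;

    if (size % HUGE_PAGE_2MB != 0)
        pages++; /* a partial page still consumes a whole huge page */

    return pages + headroom_pages;
}
```

       For a single 2MB allocation with 8 pages of headroom this gives 9,
       which would be the value passed to sysctl vm.nr_hugepages.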

COPYRIGHT
       Copyright (C) 2014 - 2020 Intel Corporation. All rights reserved.

SEE ALSO
       malloc(3), numa(3), numactl(8), mbind(2), mmap(2), move_pages(2),
       jemalloc(3), memkind(3)

Intel Corporation                  2015-03-31                     HBWMALLOC(3)