HBWMALLOC(3)                      HBWMALLOC                      HBWMALLOC(3)

NAME
hbwmalloc - The high bandwidth memory interface
Note: The hbwmalloc.h functionality is considered a stable API
(STANDARD API).

SYNOPSIS
#include <hbwmalloc.h>

Link with -lmemkind

int hbw_check_available(void);
void* hbw_malloc(size_t size);
void* hbw_calloc(size_t nmemb, size_t size);
void* hbw_realloc(void *ptr, size_t size);
void hbw_free(void *ptr);
int hbw_posix_memalign(void **memptr, size_t alignment, size_t size);
int hbw_posix_memalign_psize(void **memptr, size_t alignment, size_t size, hbw_pagesize_t pagesize);
hbw_policy_t hbw_get_policy(void);
int hbw_set_policy(hbw_policy_t mode);
int hbw_verify_memory_region(void *addr, size_t size, int flags);

DESCRIPTION
hbw_check_available() returns 0 if high bandwidth memory is available
and an error code described in the ERRORS section if not.

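A program can use the result of hbw_check_available() to decide once, at
startup, which allocation routines to call. The following is a minimal
sketch of that idea, not part of the library: struct allocator and
choose_allocator() are hypothetical names, and the caller is assumed to
pass the value returned by hbw_check_available() together with a pair
holding hbw_malloc and hbw_free.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pair of allocation routines chosen at startup. */
struct allocator {
    void *(*alloc)(size_t size);
    void (*release)(void *ptr);
};

/*
 * Return the high bandwidth pair when check_result is 0 (the success
 * value of hbw_check_available()), otherwise fall back to the standard
 * malloc/free pair.
 */
struct allocator choose_allocator(int check_result, struct allocator hbw)
{
    struct allocator standard = { malloc, free };

    return check_result == 0 ? hbw : standard;
}
```

With libmemkind linked in, the call would look like
choose_allocator(hbw_check_available(), hbw_pair), where hbw_pair is
initialized with { hbw_malloc, hbw_free }.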
hbw_malloc() allocates size bytes of uninitialized high bandwidth
memory. The allocated space is suitably aligned (after possible pointer
coercion) for storage of any type of object. If size is zero then
hbw_malloc() returns NULL.

hbw_calloc() allocates space for nmemb objects in high bandwidth
memory, each size bytes in length. The result is identical to calling
hbw_malloc() with an argument of nmemb*size, with the exception that
the allocated memory is explicitly initialized to zero bytes. If nmemb
or size is 0, then hbw_calloc() returns NULL.

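When a caller computes nmemb*size by hand before calling hbw_malloc(),
the product can overflow size_t. The sketch below guards against that.
alloc_array() is a hypothetical helper, and the allocator is passed as a
function pointer so the code can be tried with malloc; with libmemkind
the argument would be hbw_malloc. Unlike hbw_calloc(), it does not zero
the memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for nmemb objects of size bytes each through alloc_fn.
 * Zero-sized requests return NULL without calling alloc_fn, and so do
 * requests whose total byte count would not fit in size_t.
 */
void *alloc_array(void *(*alloc_fn)(size_t), size_t nmemb, size_t size)
{
    if (alloc_fn == NULL || nmemb == 0 || size == 0)
        return NULL;
    if (nmemb > SIZE_MAX / size)
        return NULL;                /* nmemb * size would overflow */
    return alloc_fn(nmemb * size);
}
```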
hbw_realloc() changes the size of the previously allocated high
bandwidth memory referenced by ptr to size bytes. The contents of the
memory are unchanged up to the lesser of the new and old sizes. If the
new size is larger, the contents of the newly allocated portion of the
memory are undefined. Upon success, the memory referenced by ptr is
freed and a pointer to the newly allocated high bandwidth memory is
returned.

Note: hbw_realloc() may move the memory allocation, resulting in a
different return value than ptr.

If ptr is NULL, the hbw_realloc() function behaves identically to
hbw_malloc() for the specified size. The address ptr, if not NULL, must
have been returned by a previous call to hbw_malloc(), hbw_calloc(),
hbw_realloc(), or hbw_posix_memalign(). Otherwise, or if hbw_free(ptr)
was called before, undefined behavior occurs.

Note: hbw_realloc() cannot be used with a pointer returned by
hbw_posix_memalign_psize().

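Because hbw_realloc() may move the allocation and returns NULL on
failure, assigning its result straight back to the only copy of the
pointer can lose the original block. The sketch below keeps the old
pointer until the call succeeds. resize_buffer() is a hypothetical
helper, and it assumes (as with realloc(3); this page does not state it)
that the original block stays valid when the call fails. The reallocator
is a function pointer so the code can be tried with realloc; with
libmemkind it would be hbw_realloc.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Resize *buf to new_size bytes through realloc_fn. On success *buf is
 * updated (the block may have moved) and 0 is returned. On failure *buf
 * is left unchanged and -1 is returned. A new_size of 0 is rejected.
 */
int resize_buffer(void *(*realloc_fn)(void *, size_t), void **buf,
                  size_t new_size)
{
    void *tmp;

    if (realloc_fn == NULL || buf == NULL || new_size == 0)
        return -1;
    tmp = realloc_fn(*buf, new_size);
    if (tmp == NULL)
        return -1;                  /* *buf still holds the old block */
    *buf = tmp;
    return 0;
}
```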
hbw_free() causes the allocated memory referenced by ptr to be made
available for future allocations. If ptr is NULL, no action occurs.
The address ptr, if not NULL, must have been returned by a previous
call to hbw_malloc(), hbw_calloc(), hbw_realloc(),
hbw_posix_memalign(), or hbw_posix_memalign_psize(). Otherwise, or if
hbw_free(ptr) was called before, undefined behavior occurs.

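One common way to avoid the double-free case described above is to reset
the pointer right after releasing it, so that a repeated call falls into
the harmless NULL case. A minimal sketch, with release_and_clear() as a
hypothetical helper and the free routine passed as a function pointer
(free here, hbw_free with libmemkind):

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Release *ptr through free_fn and reset it to NULL. A second call on
 * the same variable then passes NULL to free_fn, which is a no-op.
 */
void release_and_clear(void (*free_fn)(void *), void **ptr)
{
    if (free_fn == NULL || ptr == NULL)
        return;
    free_fn(*ptr);
    *ptr = NULL;
}
```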
hbw_posix_memalign() allocates size bytes of high bandwidth memory such
that the allocation's base address is a multiple of alignment, and
returns the allocation in the value pointed to by memptr. The requested
alignment must be a power of 2 at least as large as sizeof(void *).

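The alignment rule can be checked before the call is made. The helper
below is a small sketch of that check; alignment_is_valid() is a
hypothetical name and is not provided by hbwmalloc.h.

```c
#include <stddef.h>

/*
 * Return 1 when alignment is a power of two that is at least
 * sizeof(void *), and 0 otherwise.
 */
int alignment_is_valid(size_t alignment)
{
    if (alignment < sizeof(void *))
        return 0;
    /* A power of two has exactly one bit set. */
    return (alignment & (alignment - 1)) == 0;
}
```

A caller could test the requested alignment with this helper and skip
the allocation call when it returns 0.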
hbw_posix_memalign_psize() allocates size bytes of high bandwidth
memory such that the allocation's base address is a multiple of
alignment, and returns the allocation in the value pointed to by
memptr. The requested alignment must be a power of 2 at least as large
as sizeof(void *). The memory will be allocated using pages determined
by the pagesize variable, which may be one of the following enumerated
values:

HBW_PAGESIZE_4KB
The four kilobyte page size option. Note that with transparent huge
pages enabled these allocations may be promoted by the operating system
to two megabyte pages.

HBW_PAGESIZE_2MB
The two megabyte page size option. Note: This page size requires the
huge pages configuration described in the SYSTEM CONFIGURATION section.

HBW_PAGESIZE_1GB (DEPRECATED)
This option allows the user to specify arbitrary sizes backed by 1GB
chunks of huge pages. Huge pages are allocated even if the size is not
a multiple of 1GB. Note: This page size requires the huge pages
configuration described in the SYSTEM CONFIGURATION section.

HBW_PAGESIZE_1GB_STRICT (DEPRECATED)
The total size of the allocation must be a multiple of 1GB with this
option, otherwise the allocation will fail. Note: This page size
requires the huge pages configuration described in the SYSTEM
CONFIGURATION section.

The HBW_PAGESIZE_2MB, HBW_PAGESIZE_1GB and HBW_PAGESIZE_1GB_STRICT
options are not supported with the HBW_POLICY_INTERLEAVE policy.

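The difference between the two 1GB options comes down to simple
arithmetic on the requested size. The sketch below shows that
arithmetic only; ONE_GB, size_is_1gb_multiple() and pages_1gb_needed()
are hypothetical names, and the code does not call the library.

```c
#include <stddef.h>

#define ONE_GB ((size_t)1 << 30)

/* Return 1 when size is a non-zero whole multiple of 1GB, else 0. */
int size_is_1gb_multiple(size_t size)
{
    return size != 0 && size % ONE_GB == 0;
}

/* Number of 1GB chunks needed to back size bytes, rounding up. */
size_t pages_1gb_needed(size_t size)
{
    return size / ONE_GB + (size % ONE_GB != 0);
}
```

A size for which size_is_1gb_multiple() returns 0 is the kind of request
the strict option rejects, while the non-strict option would back it
with pages_1gb_needed() chunks.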
hbw_get_policy() returns the fallback policy that applies when
insufficient high bandwidth memory is available.

hbw_set_policy() sets the current fallback policy. The policy can be
modified only once in the lifetime of an application, and only before
the first call to an hbw_*alloc() or hbw_posix_memalign*() function.
Note: If the policy is not set, then HBW_POLICY_PREFERRED will be used
by default.

HBW_POLICY_BIND
If insufficient high bandwidth memory from the nearest NUMA node is
available to satisfy a request, the allocated pointer is set to NULL
and errno is set to ENOMEM. If insufficient high bandwidth memory pages
are available at fault time the Out Of Memory (OOM) killer is
triggered. Note that pages are faulted exclusively from the high
bandwidth NUMA node nearest at time of allocation, not at time of
fault.

HBW_POLICY_BIND_ALL
If insufficient high bandwidth memory is available to satisfy a
request, the allocated pointer is set to NULL and errno is set to
ENOMEM. If insufficient high bandwidth memory pages are available at
fault time the Out Of Memory (OOM) killer is triggered. Note that pages
are faulted from the high bandwidth NUMA nodes. The nearest NUMA node
is selected at time of page fault.

HBW_POLICY_PREFERRED
If insufficient memory is available from the high bandwidth NUMA node
closest at allocation time, fall back to standard memory (default) with
the smallest NUMA distance.

HBW_POLICY_INTERLEAVE
Interleave faulted pages from across all high bandwidth NUMA nodes
using standard size pages (the Transparent Huge Page feature is
disabled).

hbw_verify_memory_region() verifies whether a memory region falls
entirely within high bandwidth memory. It returns 0 if the memory in
the address range from addr to addr + size is allocated in high
bandwidth memory, -1 if any fragment of the memory is not backed by
high bandwidth memory (e.g. when the memory is not initialized), or one
of the error codes described in the ERRORS section.

Using this function in production code may result in a serious
performance penalty.

The flags argument may include optional flags that modify the
function's behavior:

HBW_TOUCH_PAGES
Before checking pages, the function touches the first byte of every
page in the address range from addr to addr + size by reading and
writing it (so the content is overwritten with the same data that was
read). Using this option may trigger the Out Of Memory killer.

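The read-and-write-back touch described for HBW_TOUCH_PAGES can be
pictured with the sketch below. It is an illustration of the idea only,
not the library's implementation: touch_pages() is a hypothetical
helper, the page size is supplied by the caller and assumed to be a
power of two, and in the first page the byte touched is the one at addr
itself so that nothing outside the range is accessed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Read and write back one byte in every page that overlaps the range
 * [addr, addr + size). The data is left unchanged; the access only
 * forces each page to be faulted in. Returns the number of pages
 * touched, or 0 for invalid arguments.
 */
size_t touch_pages(void *addr, size_t size, size_t page_size)
{
    uintptr_t start, end, pos;
    size_t touched = 0;

    if (addr == NULL || size == 0 || page_size == 0)
        return 0;

    start = (uintptr_t)addr;
    end = start + size;
    for (pos = start; pos < end;
         pos = (pos & ~(uintptr_t)(page_size - 1)) + page_size) {
        volatile unsigned char *p = (volatile unsigned char *)pos;
        *p = *p;                    /* same value written back */
        touched++;
    }
    return touched;
}
```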
RETURN VALUE
hbw_get_policy() returns HBW_POLICY_BIND, HBW_POLICY_BIND_ALL,
HBW_POLICY_PREFERRED or HBW_POLICY_INTERLEAVE, which represents the
current high bandwidth policy. hbw_free() does not return a value.
hbw_malloc(), hbw_calloc(), and hbw_realloc() return a pointer to the
allocated memory, or NULL if the request fails. hbw_posix_memalign(),
hbw_posix_memalign_psize() and hbw_set_policy() return zero on success
and an error code as described in the ERRORS section below on failure.

ERRORS
Error codes described here are the POSIX standard error codes as
defined in <errno.h>.

hbw_check_available()
Returns ENODEV if high bandwidth memory is unavailable.

hbw_posix_memalign() and hbw_posix_memalign_psize()
If the alignment parameter is not a power of two, or is not a multiple
of sizeof(void *), then EINVAL is returned. If the policy and pagesize
combination is unsupported then EINVAL is returned. If there is
insufficient memory to satisfy the request then ENOMEM is returned.

hbw_set_policy()
Returns EPERM if hbw_set_policy() was called more than once, or EINVAL
if the mode argument is not one of HBW_POLICY_PREFERRED,
HBW_POLICY_BIND, HBW_POLICY_BIND_ALL or HBW_POLICY_INTERLEAVE.

hbw_verify_memory_region()
Returns EINVAL if addr is NULL, size equals 0, or flags contains an
unsupported bit. If the memory pointed to by addr could not be verified
then EFAULT is returned.

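Since these functions return the error code directly, a caller can map
the value to a message without consulting errno. The sketch below
covers the codes listed in this section; describe_hbw_error() is a
hypothetical helper and the message texts are illustrative wording, not
strings produced by the library.

```c
#include <errno.h>

/*
 * Map a return value from the functions listed in the ERRORS section
 * to a short description. Values not covered there yield a generic
 * message.
 */
const char *describe_hbw_error(int err)
{
    switch (err) {
    case 0:
        return "success";
    case ENODEV:
        return "high bandwidth memory is unavailable";
    case EINVAL:
        return "invalid argument";
    case ENOMEM:
        return "insufficient memory";
    case EPERM:
        return "policy was already set";
    case EFAULT:
        return "memory region could not be verified";
    default:
        return "unexpected error code";
    }
}
```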
NOTES
The hbwmalloc.h file defines the external functions and enumerations
for the hbwmalloc library. These interfaces define a heap manager that
targets high bandwidth memory NUMA nodes.

FILES
/usr/bin/memkind-hbw-nodes
Prints a comma separated list of high bandwidth nodes.

ENVIRONMENT
MEMKIND_HBW_NODES
This environment variable is a comma separated list of NUMA nodes that
are treated as high bandwidth. It uses the libnuma routine
numa_parse_nodestring() for parsing, so the syntax described in the
numa(3) man page for this routine applies. For example, 1-3,5 is a
valid setting.

MEMKIND_ARENA_NUM_PER_KIND
This environment variable allows leveraging an internal mechanism of
the library for setting the number of arenas per kind. The value should
be a positive integer (not greater than INT_MAX defined in limits.h).
The user should set the value based on the characteristics of the
application that is using the library. A higher value can provide
better performance in extremely multithreaded applications at the cost
of memory overhead. See the section "IMPLEMENTATION NOTES" of
jemalloc(3) for more details about arenas.

MEMKIND_HEAP_MANAGER
Controls heap management behavior in the memkind library by switching
to one of the available heap managers.
Values:
JEMALLOC - sets the jemalloc heap manager
TBB - sets the Intel Threading Building Blocks heap manager. This
option requires the Intel Threading Building Blocks library to be
installed.
If MEMKIND_HEAP_MANAGER is not set, then the jemalloc heap manager will
be used by default.

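To make the MEMKIND_HBW_NODES example concrete, the sketch below parses
a value such as 1-3,5 into a bit mask of node numbers. It is a
simplified stand-in written for illustration: parse_node_list() is a
hypothetical helper, it handles only plain numbers and ranges for nodes
0 to 63, and it does not cover the full syntax that
numa_parse_nodestring() accepts.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse a list such as "1-3,5" into a bit mask in which bit n stands
 * for NUMA node n. Returns 0 on success, or -1 on malformed input or
 * node numbers above 63. *mask is written only on success.
 */
int parse_node_list(const char *text, unsigned long long *mask)
{
    unsigned long long result = 0;
    const char *p = text;

    if (text == NULL || mask == NULL || *text == '\0')
        return -1;

    while (*p != '\0') {
        char *end;
        long first, last, n;

        if (!isdigit((unsigned char)*p))
            return -1;
        first = strtol(p, &end, 10);
        last = first;
        p = end;
        if (*p == '-') {            /* range such as 1-3 */
            p++;
            if (!isdigit((unsigned char)*p))
                return -1;
            last = strtol(p, &end, 10);
            p = end;
        }
        if (first > last || last > 63)
            return -1;
        for (n = first; n <= last; n++)
            result |= 1ULL << n;
        if (*p == ',') {
            p++;
            if (*p == '\0')
                return -1;          /* trailing comma */
        } else if (*p != '\0') {
            return -1;              /* unexpected character */
        }
    }
    *mask = result;
    return 0;
}
```

For the value 1-3,5 the resulting mask has bits 1, 2, 3 and 5 set.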
SYSTEM CONFIGURATION
Interfaces for obtaining 2MB (HUGETLB) memory need allocated huge pages
in the kernel's huge page pool.

HUGETLB (huge pages)
The current number of "persistent" huge pages can be read from the
/proc/sys/vm/nr_hugepages file. The proposed way of setting huge pages
is: "sudo sysctl vm.nr_hugepages=<number_of_hugepages>". More
information can be found here:
https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt

KNOWN ISSUES
HUGETLB (huge pages)
There might be some overhead in huge pages consumption caused by heap
management. If your allocation fails because of OOM, please try to
allocate extra huge pages (e.g. 8 huge pages).

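A rough way to pick a value for vm.nr_hugepages is to divide the
expected allocation volume by the 2MB page size, round up, and add some
headroom for the heap management overhead mentioned above. The sketch
below shows that arithmetic; huge_pages_to_reserve() and HUGE_PAGE_2MB
are hypothetical names, and the headroom is left to the caller (the
text above suggests 8 pages as an example).

```c
#include <stddef.h>

#define HUGE_PAGE_2MB ((size_t)2 << 20)

/*
 * Number of 2MB huge pages to reserve for total_bytes of allocations,
 * rounded up to whole pages, plus extra_pages of headroom.
 */
size_t huge_pages_to_reserve(size_t total_bytes, size_t extra_pages)
{
    size_t pages = total_bytes / HUGE_PAGE_2MB
                 + (total_bytes % HUGE_PAGE_2MB != 0);

    return pages + extra_pages;
}
```

For 10MB of allocations and 8 pages of headroom this yields 13, which
would be the value passed to sysctl.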
COPYRIGHT
Copyright (C) 2014 - 2016 Intel Corporation. All rights reserved.

SEE ALSO
malloc(3), numa(3), numactl(8), mbind(2), mmap(2), move_pages(2),
jemalloc(3), memkind(3)

Intel Corporation                 2015-03-31                 HBWMALLOC(3)