HBWMALLOC(3)                       HBWMALLOC                      HBWMALLOC(3)

NAME

       hbwmalloc - The high bandwidth memory interface
       Note: hbwmalloc.h functionality is considered a stable API (STANDARD
       API).

SYNOPSIS

       #include <hbwmalloc.h>

       Link with -lmemkind

       int hbw_check_available(void);
       void* hbw_malloc(size_t size);
       void* hbw_calloc(size_t nmemb, size_t size);
       void* hbw_realloc(void *ptr, size_t size);
       void hbw_free(void *ptr);
       size_t hbw_malloc_usable_size(void *ptr);
       int hbw_posix_memalign(void **memptr, size_t alignment, size_t size);
       int hbw_posix_memalign_psize(void **memptr, size_t alignment, size_t size, hbw_pagesize_t pagesize);
       hbw_policy_t hbw_get_policy(void);
       int hbw_set_policy(hbw_policy_t mode);
       int hbw_verify_memory_region(void *addr, size_t size, int flags);

DESCRIPTION

       hbw_check_available() returns zero if high bandwidth memory is
       available, or an error code described in the ERRORS section if not.

       hbw_malloc() allocates size bytes of uninitialized high bandwidth
       memory. The allocated space is suitably aligned (after possible
       pointer coercion) for storage of any type of object. If size is zero,
       hbw_malloc() returns NULL.

       hbw_calloc() allocates space for nmemb objects in high bandwidth
       memory, each size bytes in length. The result is identical to calling
       hbw_malloc() with an argument of nmemb * size, except that the
       allocated memory is explicitly initialized to zero bytes. If nmemb or
       size is 0, hbw_calloc() returns NULL.

       hbw_realloc() changes the size of the previously allocated high
       bandwidth memory referenced by ptr to size bytes. The contents of the
       memory remain unchanged up to the lesser of the new and old sizes. If
       the new size is larger, the contents of the newly allocated portion of
       the memory are undefined. Upon success, the memory referenced by ptr
       is freed and a pointer to the newly allocated high bandwidth memory is
       returned.

       Note: hbw_realloc() may move the memory allocation, resulting in a
       return value different from ptr.

       If ptr is NULL, hbw_realloc() behaves identically to hbw_malloc() for
       the specified size. If size is zero and ptr is not NULL, the call is
       equivalent to hbw_free(ptr) and NULL is returned. The address ptr, if
       not NULL, must have been returned by a previous call to hbw_malloc(),
       hbw_calloc(), hbw_realloc() or hbw_posix_memalign(). Otherwise, or if
       hbw_free(ptr) was called before, undefined behavior occurs.

       Note: hbw_realloc() cannot be used with a pointer returned by
       hbw_posix_memalign_psize().
       
       hbw_free() causes the allocated memory referenced by ptr to be made
       available for future allocations. If ptr is NULL, no action occurs.
       The address ptr, if not NULL, must have been returned by a previous
       call to hbw_malloc(), hbw_calloc(), hbw_realloc(),
       hbw_posix_memalign() or hbw_posix_memalign_psize(). Otherwise, or if
       hbw_free(ptr) was called before, undefined behavior occurs.

       hbw_malloc_usable_size() returns the number of usable bytes in the
       block pointed to by ptr, a pointer to a block of memory allocated by
       hbw_malloc(), hbw_calloc(), hbw_realloc(), hbw_posix_memalign(), or
       hbw_posix_memalign_psize().

       hbw_posix_memalign() allocates size bytes of high bandwidth memory
       such that the allocation's base address is a multiple of alignment,
       and returns the allocation in the value pointed to by memptr. The
       requested alignment must be a power of 2 at least as large as
       sizeof(void*). If size is 0, hbw_posix_memalign() returns 0 and
       stores NULL in memptr.
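
       The alignment rule above can be checked before calling the allocator.
       The following is a minimal sketch in plain C; the helper names
       is_valid_hbw_alignment and is_aligned are hypothetical and are not
       provided by hbwmalloc.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if alignment is a power of 2 that is at least
 * as large as sizeof(void *), which is the requirement stated above.
 */
bool is_valid_hbw_alignment(size_t alignment)
{
    bool power_of_two = alignment != 0 && (alignment & (alignment - 1)) == 0;
    return power_of_two && alignment >= sizeof(void *);
}

/*
 * Hypothetical helper: true if the address held in ptr is a multiple of
 * alignment. alignment must be non-zero.
 */
bool is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}
```

       A caller could reject a bad alignment with the first helper and, after
       a successful allocation, confirm the returned address with the second.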
       
       hbw_posix_memalign_psize() allocates size bytes of high bandwidth
       memory such that the allocation's base address is a multiple of
       alignment, and returns the allocation in the value pointed to by
       memptr. The requested alignment must be a power of 2 at least as large
       as sizeof(void*). The memory is allocated using pages determined by
       the pagesize argument, which may be one of the following enumerated
       values:

       HBW_PAGESIZE_4KB
              The four kilobyte page size option. Note that with transparent
              huge pages enabled these allocations may be promoted by the
              operating system to two megabyte pages.

       HBW_PAGESIZE_2MB
              The two megabyte page size option. Note: This page size
              requires the huge pages configuration described in the SYSTEM
              CONFIGURATION section.

       HBW_PAGESIZE_1GB (DEPRECATED)
              This option allows the user to specify arbitrary sizes backed by
              1GB chunks of huge pages. Huge pages are allocated even if the
              size is not a multiple of 1GB. Note: This page size requires
              the huge pages configuration described in the SYSTEM
              CONFIGURATION section.

       HBW_PAGESIZE_1GB_STRICT (DEPRECATED)
              The total size of the allocation must be a multiple of 1GB with
              this option, otherwise the allocation will fail. Note: This
              page size requires the huge pages configuration described in
              the SYSTEM CONFIGURATION section.

       Note: The HBW_PAGESIZE_2MB, HBW_PAGESIZE_1GB and
       HBW_PAGESIZE_1GB_STRICT options are not supported with the
       HBW_POLICY_INTERLEAVE policy.
       
       hbw_get_policy() returns the current fallback policy, which applies
       when insufficient high bandwidth memory is available.

       hbw_set_policy() sets the current fallback policy. The policy can be
       modified only once in the lifetime of an application, and only before
       the first call to hbw_malloc(), hbw_calloc(), hbw_realloc(),
       hbw_posix_memalign(), or hbw_posix_memalign_psize().
       Note: If the policy is not set, then HBW_POLICY_PREFERRED is used by
       default.

       HBW_POLICY_BIND
              If insufficient high bandwidth memory from the nearest NUMA node
              is available to satisfy a request, the allocated pointer is set
              to NULL and errno is set to ENOMEM. If insufficient high
              bandwidth memory pages are available at fault time, the Out Of
              Memory (OOM) Killer is triggered. Note that pages are faulted
              exclusively from the high bandwidth NUMA node nearest at time
              of allocation, not at time of fault.

       HBW_POLICY_BIND_ALL
              If insufficient high bandwidth memory is available to satisfy a
              request, the allocated pointer is set to NULL and errno is set
              to ENOMEM. If insufficient high bandwidth memory pages are
              available at fault time, the Out Of Memory (OOM) Killer is
              triggered. Note that pages are faulted from the high bandwidth
              NUMA nodes. The nearest NUMA node is selected at time of page
              fault.

       HBW_POLICY_PREFERRED
              If insufficient memory is available from the high bandwidth NUMA
              node closest at allocation time, fall back to standard memory
              (default) with the smallest NUMA distance.

       HBW_POLICY_INTERLEAVE
              Interleave faulted pages from across all high bandwidth NUMA
              nodes using standard size pages (the Transparent Huge Page
              feature is disabled).

       hbw_verify_memory_region() verifies whether a memory region falls
       entirely within high bandwidth memory. It returns 0 if the address
       range from addr to addr + size is allocated in high bandwidth memory,
       -1 if any fragment of the memory is not backed by high bandwidth
       memory (e.g. when the memory is not initialized), or one of the error
       codes described in the ERRORS section.

       Using this function in production code may result in a serious
       performance penalty.

       The flags argument may include optional flags that modify the
       function's behavior:

       HBW_TOUCH_PAGES
              Before checking pages, the function touches the first byte of
              every page in the address range from addr to addr + size by
              reading and writing it (so the content is overwritten with the
              same data that was read). Using this option may trigger the
              Out Of Memory Killer.
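
       To make the HBW_TOUCH_PAGES description concrete, the sketch below
       reads and writes back one byte in every page that overlaps a range.
       It is an illustration in plain C, not memkind's implementation; the
       function name touch_pages and the explicit page_size parameter are
       assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative sketch only: read and write back one byte in every page that
 * overlaps [addr, addr + size). page_size must be a power of 2.
 * Returns the number of pages touched.
 */
size_t touch_pages(void *addr, size_t size, size_t page_size)
{
    if (addr == NULL || size == 0 || page_size == 0)
        return 0;

    volatile unsigned char *p = addr;
    volatile unsigned char *end = p + size;
    size_t touched = 0;

    for (;;) {
        unsigned char value = *p;   /* the read forces the page to be mapped */
        *p = value;                 /* write back the same data */
        touched++;

        /* Distance from p to the first byte of the next page. */
        size_t offset = (size_t)((uintptr_t)p & (page_size - 1));
        size_t step = page_size - offset;
        if (step >= (size_t)(end - p))
            break;                  /* the next page starts past the range */
        p += step;
    }
    return touched;
}
```

       The volatile qualifier keeps the compiler from removing the accesses,
       since each write stores the value that was just read.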

RETURN VALUE

       hbw_get_policy() returns HBW_POLICY_BIND, HBW_POLICY_BIND_ALL,
       HBW_POLICY_PREFERRED or HBW_POLICY_INTERLEAVE, which represents the
       current high bandwidth policy. hbw_free() does not return a value.
       hbw_malloc(), hbw_calloc() and hbw_realloc() return a pointer to the
       allocated memory, or NULL if the request fails. hbw_posix_memalign(),
       hbw_posix_memalign_psize() and hbw_set_policy() return zero on success
       and an error code described in the ERRORS section below on failure.

ERRORS

       The error codes described here are the POSIX standard error codes
       defined in <errno.h>.

       hbw_check_available()
              returns ENODEV if high bandwidth memory is unavailable.

       hbw_posix_memalign() and hbw_posix_memalign_psize()
              If the alignment parameter is not a power of two, or is not a
              multiple of sizeof(void*), then EINVAL is returned. If the
              policy and pagesize combination is unsupported, then EINVAL is
              returned. If there is insufficient memory to satisfy the
              request, then ENOMEM is returned.

       hbw_set_policy()
              returns EPERM if hbw_set_policy() was called more than once, or
              EINVAL if the mode argument is none of HBW_POLICY_PREFERRED,
              HBW_POLICY_BIND, HBW_POLICY_BIND_ALL or HBW_POLICY_INTERLEAVE.

       hbw_verify_memory_region()
              returns EINVAL if addr is NULL, size equals 0, or flags has an
              unsupported bit set. If the memory pointed to by addr cannot be
              verified, then EFAULT is returned.
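
       The functions listed above return the error code directly, so the
       return value itself can be inspected. The sketch below maps each code
       named in this section to a short description. The function name
       hbw_error_meaning and the wording of the strings are hypothetical;
       only the codes come from this page.

```c
#include <errno.h>

/*
 * Hypothetical helper: describe a code returned by one of the hbw_* calls
 * listed in the ERRORS section. The strings paraphrase this page.
 */
const char *hbw_error_meaning(int code)
{
    switch (code) {
    case 0:      return "success";
    case ENODEV: return "high bandwidth memory is unavailable";
    case EINVAL: return "invalid argument";
    case ENOMEM: return "insufficient memory";
    case EPERM:  return "policy was already set";
    case EFAULT: return "memory region could not be verified";
    default:     return "unknown error code";
    }
}
```

       The standard strerror() function from <string.h> is an alternative
       when the generic system message for the code is sufficient.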

NOTES

       The <hbwmalloc.h> file defines the external functions and enumerations
       for the hbwmalloc library. These interfaces define a heap manager that
       targets high bandwidth memory NUMA nodes.

FILES

       /usr/bin/memkind-hbw-nodes
              Prints a comma-separated list of high bandwidth nodes.

ENVIRONMENT

       MEMKIND_HBW_NODES
              This environment variable is a comma-separated list of NUMA
              nodes that are treated as high bandwidth. The library uses the
              libnuma routine numa_parse_nodestring() for parsing, so the
              syntax described in the numa(3) man page for this routine
              applies. For example, 1-3,5 is a valid setting.
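
       To illustrate what a setting such as 1-3,5 selects, the sketch below
       expands a list of numbers and ranges into per-node flags. It is a
       simplified stand-in written for this example, not the libnuma parser:
       it handles only plain numbers and ranges, and the name
       expand_node_list and the MAX_NODES limit are assumptions.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_NODES 64    /* arbitrary limit chosen for this sketch */

/*
 * Illustrative only: expand text such as "1-3,5" into nodes[], where
 * nodes[n] is true if node n is listed. Returns 0 on success and -1 on
 * malformed input, in which case nodes[] may be partially filled.
 */
int expand_node_list(const char *text, bool nodes[MAX_NODES])
{
    for (int i = 0; i < MAX_NODES; i++)
        nodes[i] = false;
    if (text == NULL || *text == '\0')
        return -1;

    const char *p = text;
    for (;;) {
        char *end;
        if (!isdigit((unsigned char)*p))
            return -1;
        long first = strtol(p, &end, 10);
        long last = first;
        p = end;

        if (*p == '-') {            /* a range such as 1-3 */
            p++;
            if (!isdigit((unsigned char)*p))
                return -1;
            last = strtol(p, &end, 10);
            p = end;
        }
        if (first > last || last >= MAX_NODES)
            return -1;
        for (long n = first; n <= last; n++)
            nodes[n] = true;

        if (*p == '\0')
            return 0;
        if (*p != ',')
            return -1;
        p++;                        /* skip the comma */
    }
}
```

       For the input 1-3,5 this sets the flags for nodes 1, 2, 3 and 5 and
       leaves all others clear.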
       
       MEMKIND_ARENA_NUM_PER_KIND
              This environment variable uses an internal mechanism of the
              library to set the number of arenas per kind. The value should
              be a positive integer (not greater than INT_MAX defined in
              <limits.h>). The user should set the value based on the
              characteristics of the application that uses the library. A
              higher value can provide better performance in extremely
              multithreaded applications at the cost of memory overhead. See
              section IMPLEMENTATION NOTES of jemalloc(3) for more details
              about arenas.
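
       A launcher or test script may want to check a candidate value against
       the constraint above before exporting it. The sketch below is one way
       to do so in plain C. The function name parse_arena_count is
       hypothetical, and this is not how the library itself reads the
       variable.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a candidate MEMKIND_ARENA_NUM_PER_KIND value.
 * Returns the value if it is a positive integer not greater than INT_MAX,
 * or -1 otherwise.
 */
int parse_arena_count(const char *text)
{
    if (text == NULL || *text == '\0')
        return -1;

    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;                  /* out of range or trailing characters */
    if (value <= 0 || value > INT_MAX)
        return -1;
    return (int)value;
}
```

       A value that passes this check could then be exported with setenv()
       before the first allocation is made.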
       
       MEMKIND_HEAP_MANAGER
              Controls heap management behavior in the memkind library by
              switching to one of the available heap managers.
              Values:
                  JEMALLOC - sets the jemalloc heap manager
                  TBB - sets the Intel Threading Building Blocks heap manager.
                  This option requires the Intel Threading Building Blocks
                  library to be installed.

       Note: If MEMKIND_HEAP_MANAGER is not set, the jemalloc heap manager is
       used by default.

SYSTEM CONFIGURATION

       Interfaces for obtaining 2MB (HUGETLB) memory require huge pages to be
       allocated in the kernel's huge page pool.

       HUGETLB (huge pages)
              The current number of "persistent" huge pages can be read from
              the /proc/sys/vm/nr_hugepages file. The suggested way to set
              the number of huge pages is:
              sudo sysctl vm.nr_hugepages=<number_of_hugepages>
              More information can be found here:
              https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt

KNOWN ISSUES

       HUGETLB (huge pages)
              Heap management may add some overhead to huge page consumption.
              If an allocation fails because of OOM, try allocating extra
              huge pages (e.g. 8 huge pages).
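
       The advice above can be turned into a small calculation: round the
       planned allocation size up to whole 2MB pages and add some headroom.
       The sketch below does this in plain C. The name huge_pages_to_reserve
       is hypothetical, and the headroom is left to the caller because the
       figure of 8 pages above is only an example.

```c
#include <stddef.h>

#define HUGE_PAGE_2MB ((size_t)2 * 1024 * 1024)

/*
 * Hypothetical helper: number of 2MB huge pages needed to back `bytes` of
 * allocations, plus `extra_pages` of headroom for heap management overhead.
 */
size_t huge_pages_to_reserve(size_t bytes, size_t extra_pages)
{
    size_t pages = bytes / HUGE_PAGE_2MB;
    if (bytes % HUGE_PAGE_2MB != 0)
        pages++;                    /* a partial page still needs a full one */
    return pages + extra_pages;
}
```

       The result is the kind of number that would be passed to the sysctl
       command described in the SYSTEM CONFIGURATION section.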
       
COPYRIGHT

       Copyright (C) 2014 - 2020 Intel Corporation. All rights reserved.

SEE ALSO

       malloc(3), numa(3), numactl(8), mbind(2), mmap(2), move_pages(2),
       jemalloc(3), memkind(3)

Intel Corporation                 2015-03-31                      HBWMALLOC(3)