hbwmalloc - The high bandwidth memory interface
Note: hbwmalloc.h functionality is considered a stable API (STANDARD API).
Link with -lmemkind
void* hbw_malloc(size_t size);
void* hbw_calloc(size_t nmemb, size_t size);
void* hbw_realloc(void *ptr, size_t size);
void hbw_free(void *ptr);
int hbw_posix_memalign(void **memptr, size_t alignment, size_t size);
int hbw_posix_memalign_psize(void **memptr, size_t alignment, size_t size, hbw_pagesize_t pagesize);
int hbw_set_policy(hbw_policy_t mode);
int hbw_verify_memory_region(void *addr, size_t size, int flags);
hbw_check_available() returns 0 if high bandwidth memory is available and an error code described in the ERRORS section if not.
hbw_malloc() allocates size bytes of uninitialized high bandwidth memory. The allocated space is suitably aligned (after possible pointer coercion) for storage of any type of object. If size is zero then hbw_malloc() returns NULL.
hbw_calloc() allocates space for nmemb objects in high bandwidth memory, each size bytes in length. The result is identical to calling hbw_malloc() with an argument of nmemb*size, with the exception that the allocated memory is explicitly initialized to zero bytes. If nmemb or size is 0, then hbw_calloc() returns NULL.
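Since the total request is nmemb*size, a caller that computes this product itself may want to guard against overflow first. The following is a minimal sketch in C; the helper name checked_total_size is hypothetical, it is not part of hbwmalloc.h, and it says nothing about how hbw_calloc() treats overflow internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of hbwmalloc.h): stores nmemb * size in
 * *total and returns 0, or returns -1 if the product would overflow size_t. */
static int checked_total_size(size_t nmemb, size_t size, size_t *total)
{
    if (size != 0 && nmemb > SIZE_MAX / size)
        return -1;          /* nmemb * size does not fit in size_t */
    *total = nmemb * size;
    return 0;
}
```

A caller could run this check before passing the product to hbw_malloc(), and treat a return value of -1 as an allocation failure.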
hbw_realloc() changes the size of the previously allocated high bandwidth memory referenced by ptr to size bytes. The contents of the memory are unchanged up to the lesser of the new and old sizes. If the new size is larger, the contents of the newly allocated portion of the memory are undefined. Upon success, the memory referenced by ptr is freed and a pointer to the newly allocated high bandwidth memory is returned.
Note: hbw_realloc() may move the memory allocation, resulting in a different return value than ptr.
If ptr is NULL, the hbw_realloc() function behaves identically to hbw_malloc() for the specified size. The address ptr, if not NULL, must have been returned by a previous call to hbw_malloc(), hbw_calloc(), hbw_realloc(), or hbw_posix_memalign(). Otherwise, or if hbw_free(ptr) was called before, undefined behavior occurs.
Note: hbw_realloc() cannot be used with a pointer returned by hbw_posix_memalign_psize().
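Because hbw_realloc() may move the allocation and returns NULL on failure, callers usually keep the old pointer until the call has succeeded. The sketch below shows that pattern in C. The helper resize_buffer and the realloc_fn typedef are hypothetical names, and the code assumes that, as with realloc(3), a failed call leaves the original block valid. The reallocation function is passed in as a parameter so the sketch can be compiled and tried with the standard realloc().

```c
#include <stdlib.h>   /* size_t, NULL */

/* Signature shared by realloc() and hbw_realloc(). */
typedef void *(*realloc_fn)(void *ptr, size_t size);

/* Hypothetical helper: resizes *buf to new_size bytes using do_realloc.
 * Returns 0 on success and -1 on failure. On failure *buf is left
 * unchanged, so the caller still owns the original block. */
static int resize_buffer(void **buf, size_t new_size, realloc_fn do_realloc)
{
    void *tmp;

    if (buf == NULL || new_size == 0 || do_realloc == NULL)
        return -1;
    tmp = do_realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;      /* keep using the old pointer */
    *buf = tmp;         /* the block may have moved; use the new address */
    return 0;
}
```

With memkind installed, passing hbw_realloc as the third argument should apply the same pattern to high bandwidth memory, since it has the same signature.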
hbw_free() causes the allocated memory referenced by ptr to be made available for future allocations. If ptr is NULL, no action occurs. The address ptr, if not NULL, must have been returned by a previous call to hbw_malloc(), hbw_calloc(), hbw_realloc(), hbw_posix_memalign(), or hbw_posix_memalign_psize(). Otherwise, or if hbw_free(ptr) was called before, undefined behavior occurs.
hbw_posix_memalign() allocates size bytes of high bandwidth memory such that the allocation’s base address is a multiple of alignment, and returns the allocation in the value pointed to by memptr. The requested alignment must be a power of 2 at least as large as sizeof(void *).
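The alignment rule can be checked before calling the allocator. The following C sketch uses a hypothetical helper, alignment_is_valid, which is not part of hbwmalloc.h. It only mirrors the documented requirement (a power of 2 that is at least sizeof(void *)) and is not taken from the library's implementation.

```c
#include <stddef.h>

/* Hypothetical pre-check: returns 1 if alignment is a power of two and a
 * multiple of sizeof(void *), and 0 otherwise. */
static int alignment_is_valid(size_t alignment)
{
    int power_of_two = alignment != 0 && (alignment & (alignment - 1)) == 0;

    return power_of_two && alignment % sizeof(void *) == 0;
}
```

For example, 64 passes this check, while 0 and 24 do not.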
hbw_posix_memalign_psize() allocates size bytes of high bandwidth memory such that the allocation’s base address is a multiple of alignment, and returns the allocation in the value pointed to by memptr. The requested alignment must be a power of 2 at least as large as sizeof(void *). The memory will be allocated using pages determined by the pagesize variable, which may be one of the following enumerated values:
The four kilobyte page size option. Note that with transparent huge pages enabled these allocations may be promoted by the operating system to two megabyte pages.
The two megabyte page size option. Note: This page size requires the huge pages configuration described in the SYSTEM CONFIGURATION section.
This option allows the user to specify arbitrary sizes backed by one gigabyte pages. Gigabyte pages are allocated even if the size is not a multiple of 1GB. A good example of using this feature with realloc is shown in gb_realloc_example.c. Note: This page size requires the gigabyte pages configuration described in the SYSTEM CONFIGURATION section.
The one gigabyte page size option. The total size of the allocation must be a multiple of 1GB with this option, otherwise the allocation will fail. Note: This page size requires the gigabyte pages configuration described in the SYSTEM CONFIGURATION section.
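For the page size option that requires the total size to be a multiple of 1GB, a caller may want to round the request up first. The C sketch below shows one way to do that arithmetic. The macro ONE_GB and the helper round_up_to_1gb are hypothetical names and are not part of hbwmalloc.h.

```c
#include <stddef.h>
#include <stdint.h>

#define ONE_GB ((size_t)1 << 30)

/* Hypothetical helper: rounds size up to the next multiple of 1GB.
 * Returns 0 if size is 0 or if rounding up would overflow size_t. */
static size_t round_up_to_1gb(size_t size)
{
    if (size == 0 || size > SIZE_MAX - (ONE_GB - 1))
        return 0;
    return (size + ONE_GB - 1) & ~(ONE_GB - 1);
}
```

A request of 1 byte rounds up to one gigabyte, and a request that is already a multiple of 1GB is returned unchanged.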
hbw_get_policy() returns the current fallback policy, which applies when insufficient high bandwidth memory is available.
hbw_set_policy() sets the current fallback policy. The policy can be modified only once in the lifetime of an application, and only before calling any hbw_*alloc() or hbw_posix_memalign*() function.
Note: If the policy is not set, then HBW_POLICY_PREFERRED will be used by default.
HBW_POLICY_BIND
If insufficient high bandwidth memory from the nearest NUMA node is available to satisfy a request, the allocated pointer is set to NULL and errno is set to ENOMEM. If insufficient high bandwidth memory pages are available at fault time, the Out Of Memory (OOM) killer is triggered. Note that pages are faulted exclusively from the high bandwidth NUMA node that was nearest at the time of allocation, not at the time of fault.
HBW_POLICY_PREFERRED
If insufficient memory is available from the high bandwidth NUMA node closest at allocation time, fall back to standard memory with the smallest NUMA distance. This is the default policy.
HBW_POLICY_INTERLEAVE
Interleave faulted pages from across all high bandwidth NUMA nodes using standard size pages (the Transparent Huge Page feature is disabled).
hbw_verify_memory_region() verifies whether a memory region falls entirely within high bandwidth memory. It returns 0 if the memory in the address range from addr to addr + size is allocated in high bandwidth memory, -1 if any fragment of the memory is not backed by high bandwidth memory (for example, when the memory is not initialized), or one of the error codes described in the ERRORS section.
Using this function in production code may result in a serious performance penalty.
The flags argument may include optional flags that modify the function's behavior:
Before checking pages, the function touches the first byte of every page in the address range from addr to addr + size with a read followed by a write (so the content is overwritten with the same data that was read). Using this option may trigger the Out Of Memory killer.
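The page-touching step described above can be pictured with a short C sketch. This is only an illustration of reading and rewriting one byte per page inside a buffer, not the library's implementation. The helper name touch_pages is hypothetical, and the page size is passed in by the caller as an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reads and rewrites one byte in every page that
 * overlaps [addr, addr + size), staying inside the buffer. The first touch
 * is at addr itself, later touches are at page boundaries. Returns the
 * number of bytes touched. The buffer contents do not change. */
static size_t touch_pages(void *addr, size_t size, size_t page_size)
{
    volatile unsigned char *bytes = addr;
    size_t touched = 0;
    size_t offset = 0;

    if (addr == NULL || size == 0 || page_size == 0)
        return 0;
    while (offset < size) {
        size_t in_page = (size_t)(((uintptr_t)addr + offset) % page_size);

        bytes[offset] = bytes[offset];  /* read, then write the same value */
        touched++;
        offset += page_size - in_page;  /* jump to the start of the next page */
    }
    return touched;
}
```

Touching a page forces the operating system to back it with physical memory, which is why doing this on a large range may consume memory quickly.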
hbw_get_policy() returns HBW_POLICY_BIND or HBW_POLICY_PREFERRED, which represents the current high bandwidth policy. hbw_free() does not have a return value. hbw_malloc(), hbw_calloc(), and hbw_realloc() return a pointer to the allocated memory, or NULL if the request fails. hbw_posix_memalign(), hbw_posix_memalign_psize(), and hbw_set_policy() return zero on success and an error code, as described in the ERRORS section below, on failure.
The error codes described here are the POSIX standard error codes as defined in <errno.h>.
hbw_check_available() returns ENODEV if high bandwidth memory is unavailable.
hbw_posix_memalign() and hbw_posix_memalign_psize()
If the alignment parameter is not a power of two, or is not a multiple of sizeof(void *), then EINVAL is returned. If there is insufficient memory to satisfy the request, then ENOMEM is returned.
hbw_set_policy() returns EPERM if it was called more than once, or EINVAL if the mode argument was not one of HBW_POLICY_PREFERRED, HBW_POLICY_BIND, or HBW_POLICY_INTERLEAVE.
hbw_verify_memory_region() returns EINVAL if addr is NULL, size equals 0, or flags contains an unsupported bit. If the memory pointed to by addr could not be verified, then EFAULT is returned.
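The error codes listed in this section can be turned into short diagnostic messages. The following C sketch uses a hypothetical helper, describe_hbw_error, which is not part of hbwmalloc.h. The message texts are paraphrased from the descriptions above.

```c
#include <errno.h>

/* Hypothetical helper: maps an error code returned by the hbw_* functions
 * to a short description based on the ERRORS section. */
static const char *describe_hbw_error(int err)
{
    switch (err) {
    case 0:
        return "success";
    case ENODEV:
        return "high bandwidth memory is unavailable";
    case EINVAL:
        return "invalid argument (alignment, mode, addr, size, or flags)";
    case ENOMEM:
        return "insufficient memory to satisfy the request";
    case EPERM:
        return "policy was already set";
    case EFAULT:
        return "memory region could not be verified";
    default:
        return "unexpected error code";
    }
}
```

A caller could pass the return value of hbw_posix_memalign() or hbw_set_policy() to this helper when logging a failure.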
The hbwmalloc.h file defines the external functions and enumerations for the hbwmalloc library. These interfaces define a heap manager that targets high bandwidth memory NUMA nodes.
Prints a comma separated list of high bandwidth nodes.
This environment variable is a comma separated list of NUMA nodes that are treated as high bandwidth. The libnuma routine numa_parse_nodestring() is used for parsing, so the syntax described in the numa(3) man page for this routine applies. For example, 1-3,5 is a valid setting.
This environment variable uses an internal mechanism of the library to set the number of arenas per kind. The value should be a positive integer (not greater than INT_MAX as defined in limits.h). The user should set the value based on the characteristics of the application that is using the library. A higher value can provide better performance in heavily multithreaded applications at the cost of memory overhead. See the "IMPLEMENTATION NOTES" section of jemalloc(3) for more details about arenas.
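To make the node list syntax mentioned above (for example 1-3,5) concrete, the C sketch below parses a simplified form of it into a bitmask. It is a stand-in for illustration only and is not numa_parse_nodestring(): it accepts only decimal node numbers, ranges, and commas, and assumes node numbers 0 to 63. The helper name parse_node_list is hypothetical.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical simplified parser (NOT numa_parse_nodestring()): sets bit n
 * of *mask for every node n named in s. Returns 0 on success, or -1 on a
 * syntax error, a reversed range, or a node number above 63. */
static int parse_node_list(const char *s, uint64_t *mask)
{
    uint64_t result = 0;

    if (s == NULL || mask == NULL || *s == '\0')
        return -1;
    for (;;) {
        char *end;
        unsigned long lo, hi, n;

        if (!isdigit((unsigned char)*s))
            return -1;
        lo = hi = strtoul(s, &end, 10);
        s = end;
        if (*s == '-') {                    /* range such as 1-3 */
            s++;
            if (!isdigit((unsigned char)*s))
                return -1;
            hi = strtoul(s, &end, 10);
            s = end;
        }
        if (lo > hi || hi > 63)
            return -1;
        for (n = lo; n <= hi; n++)
            result |= (uint64_t)1 << n;
        if (*s == '\0')
            break;
        if (*s != ',')
            return -1;
        s++;                                /* skip the comma */
    }
    *mask = result;
    return 0;
}
```

With this sketch, the string 1-3,5 selects nodes 1, 2, 3, and 5. The real libnuma routine accepts additional syntax described in numa(3).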
Interfaces for obtaining 2MB and 1GB pages (HUGETLB and GBTLB) require huge pages to be allocated in the kernel’s huge page pool.
HUGETLB (huge pages)
The current number of "persistent" huge pages can be read from the /proc/sys/vm/nr_hugepages file. The recommended way of setting the number of huge pages is: "sudo sysctl vm.nr_hugepages=<number_of_hugepages>". More information can be found here: https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
GBTLB (gigabyte pages)
The number of preallocated gigabyte pages can be read from /proc/cmdline (hugepagesz=1G nr_hugepages=N). Gigabyte huge pages can be configured on the kernel command line. On kernels 3.16 and later, users can also allocate gigabyte pages in the same way as 2MB pages.
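An application can check the huge page pool before requesting 2MB pages by reading the /proc/sys/vm/nr_hugepages file mentioned above. The C sketch below shows one way to read such a counter. The helper name read_hugepage_count is hypothetical, and the path is passed in by the caller.

```c
#include <stdio.h>

/* Hypothetical helper: reads a single non-negative integer from the file
 * at path, for example "/proc/sys/vm/nr_hugepages". Returns the value, or
 * -1 if the file cannot be opened or does not start with a valid number. */
static long read_hugepage_count(const char *path)
{
    FILE *f = fopen(path, "r");
    long count = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &count) != 1 || count < 0)
        count = -1;
    fclose(f);
    return count;
}
```

A result of 0 would suggest that no persistent huge pages are configured, in which case allocations that need 2MB pages are likely to fail.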
HUGETLB (huge pages)
There might be some overhead in huge pages consumption caused by heap management. If your allocation fails because of OOM, please try to allocate extra huge pages (e.g. 8 huge pages).
Copyright (C) 2014 - 2016 Intel Corporation. All rights reserved.
malloc(3), numa(3), numactl(8), mbind(2), mmap(2), move_pages(2), jemalloc(3), memkind(3)