Heap manager that enables allocations to memory with different properties.
This header exposes an EXPERIMENTAL API, except for the STANDARD API listed in the LIBRARY VERSION section. The API standards are described later in this man page.
Link with -lmemkind
void memkind_error_message(int err, char *msg, size_t size);
void *memkind_malloc(memkind_t kind, size_t size);
void *memkind_calloc(memkind_t kind, size_t num, size_t size);
void *memkind_realloc(memkind_t kind, void *ptr, size_t size);
int memkind_posix_memalign(memkind_t kind, void **memptr, size_t alignment, size_t size);
void memkind_free(memkind_t kind, void *ptr);
void *memkind_partition_mmap(int partition, void *addr, size_t size);(DEPRECATED)
int memkind_create(const struct memkind_ops *ops, const char *name, memkind_t *kind);(DEPRECATED)
int memkind_create_pmem(const char *dir, size_t max_size, memkind_t *kind);
int memkind_get_num_kind(int *num_kind);(DEPRECATED)
int memkind_get_kind_by_partition(int partition, memkind_t *kind);(DEPRECATED)
int memkind_get_kind_by_name(const char *name, memkind_t *kind);(DEPRECATED)
int memkind_get_size(memkind_t kind, size_t *total, size_t *free);(DEPRECATED)
int memkind_check_available(memkind_t kind);
void memkind_malloc_pre(memkind_t *kind, size_t *size);
void memkind_malloc_post(memkind_t kind, size_t size, void **result);
void memkind_calloc_pre(memkind_t *kind, size_t *nmemb, size_t *size);
void memkind_calloc_post(memkind_t kind, size_t nmemb, size_t size, void **result);
void memkind_posix_memalign_pre(memkind_t *kind, void **memptr, size_t *alignment, size_t *size);
void memkind_posix_memalign_post(memkind_t kind, void **memptr, size_t alignment, size_t size, int *err);
void memkind_realloc_pre(memkind_t *kind, void **ptr, size_t *size);
void memkind_realloc_post(memkind_t kind, void *ptr, size_t size, void **result);
void memkind_free_pre(memkind_t *kind, void **ptr);
void memkind_free_post(memkind_t kind, void *ptr);
memkind_error_message() converts an error number err returned by a member of the memkind interface to an error message msg where the maximum size of the message is passed by the size parameter.
The functions described in this section define a heap manager with an interface modeled on the ISO C standard APIs, except that the user must specify the kind of memory with the first argument to each function. See the KINDS section below for a full description of the implemented kinds.
memkind_malloc() allocates size bytes of uninitialized memory of the specified kind. The allocated space is suitably aligned (after possible pointer coercion) for storage of any type of object. If size is 0, then memkind_malloc() returns NULL.
memkind_calloc() allocates space for num objects each size bytes in length in memory of the specified kind. The result is identical to calling memkind_malloc() with an argument of num*size, with the exception that the allocated memory is explicitly initialized to zero bytes. If num or size is 0, then memkind_calloc() returns NULL.
memkind_realloc() changes the size of the previously allocated memory referenced by ptr to size bytes of the specified kind. The contents of the memory are unchanged up to the lesser of the new and old sizes. If the new size is larger, the contents of the newly allocated portion of the memory are undefined. Upon success, the memory referenced by ptr is freed and a pointer to the newly allocated memory of the specified kind is returned.
Note: memkind_realloc() may move the memory allocation, resulting in a different return value than ptr.
If ptr is NULL, the memkind_realloc() function behaves identically to memkind_malloc() for the specified size. The address ptr, if not NULL, must have been returned by a previous call to memkind_malloc(), memkind_calloc(), memkind_realloc(), or memkind_posix_memalign() with the same kind as specified to the call to memkind_realloc(). If it was not, or if memkind_free(kind, ptr) has already been called, the behavior is undefined.
memkind_posix_memalign() allocates size bytes of memory of a specified kind such that the allocation’s base address is a multiple of alignment, and returns the allocation in the value pointed to by memptr. The requested alignment must be a power of 2 at least as large as sizeof(void *). If size is 0, then memkind_posix_memalign() stores NULL in the value pointed to by memptr.
memkind_free() causes the allocated memory referenced by ptr to be made available for future allocations. This pointer must have been returned by a previous call to memkind_malloc(), memkind_calloc(), memkind_realloc(), or memkind_posix_memalign(). If it was not, or if memkind_free(kind, ptr) has already been called, the behavior is undefined. If ptr is NULL, no operation is performed. The value of MEMKIND_DEFAULT can be given as the kind for all buffers allocated by a kind that leverages the jemalloc allocator. This includes all internally defined kinds other than those that use gigabyte pages. When the kind is unknown in the context of the call to memkind_free(), 0 can be given as the kind, but this requires a lookup that can be bypassed by specifying a non-zero value.
This section describes the memkind interface which is used internally by the heap manager. For this API the kind is determined by the partition index. This enables the underlying heap manager to call routines with standard type arguments, and allows the heap manager implementation to be independent of the specifics of the memkind_t implementation. Currently there is only one callback function.
memkind_partition_mmap() is a wrapper around the mmap(2) system call. The hint address addr and the length in bytes of the buffer to be allocated, size, are passed through to mmap(2). The other mmap(2) parameters are determined by the kind operations.
There are built-in kinds that are always available, and these are enumerated in the KINDS section. The user can also create their own kinds of memory. This section describes the APIs that enable the tracking of the different kinds of memory and determining their properties.
memkind_create()(DEPRECATED) is used to create a new kind of memory. It takes as input a pointer to a vtable called ops that determines the operations that define the kind of memory. These operations can be taken from one of the memkind built-in implementations defined in the memkind headers, or they can be user-implemented. The requirements for each operation are defined in the MEMKIND OPERATIONS section. See the SEE ALSO section for the list of memkind header files that implement the built-in operations: each header has a man page.
memkind_create_pmem() is a convenience function used to create a file-backed kind of memory. It allocates a temporary file in the given directory dir. The file is created in a fashion similar to tmpfile(3), so that the file name does not appear when the directory is listed and the space is automatically freed when the program terminates. The file is truncated to a size of max_size bytes and the resulting space is memory-mapped. Then, the actual pmem kind is created by calling memkind_create() with a unique name string ("pmemXXXXXXXX") and the pointer to MEMKIND_PMEM_OPS as ops.
Note that the actual file system space is not allocated immediately, but only on a call to memkind_pmem_mmap() (see memkind_pmem(3)). This makes it possible to create a pmem kind of a very large size without reserving in advance the corresponding file system space for the entire heap. The minimum max_size value allowed by the library is defined in <memkind_pmem.h> as MEMKIND_PMEM_MIN_SIZE. Calling memkind_create_pmem() with a smaller size returns an error. The maximum allowed size is not limited by memkind, but by the file system specified by the dir argument. The max_size passed in is the raw size of the memory pool, and jemalloc will use some of that space for its own metadata.
memkind_finalize() releases all resources associated with the memkind library including the resources used by all of the kinds that were created, but it does not free memory allocated with the HEAP MANAGEMENT interface. This must be the last call to the memkind library before application termination, but it can be called more than once.
memkind_get_num_kind() sets num_kind to the number of available kinds of memory. This accounts for the built-in static kinds and any dynamically created kinds. Since there is a one-to-one mapping between partitions and kinds, this is also the number of partitions.
memkind_get_kind_by_partition() sets kind to the memory kind associated with the partition index which must be in the range [0, num_kind - 1] where num_kind can be retrieved with the memkind_get_num_kind() function.
memkind_get_kind_by_name() sets kind to the memory kind associated with the name string specified. All of the built-in kinds have name strings that are the lower-case version of the name given in the KINDS section (for example: MEMKIND_DEFAULT has name "memkind_default").
memkind_get_size()(DEPRECATED) sets total to the number of bytes on the system which can be allocated with the specified kind, and sets free to the number of unallocated bytes available of memory with the specified kind.
Note: These numbers may be specific to the CPU of the calling thread if the kind binds memory to NUMA nodes associated with the CPU.
memkind_check_available() returns zero if the specified kind is available, or an error code from the ERRORS section if it is not.
The memkind library enables the user to define decorator functions that can be called before and after each memkind heap management API. The decorators that are called at the beginning of a function are named after that function with _pre appended to the name, and those that are called at the end of the function are named after that function with _post appended to the name. These are weak symbols, and if they are not present at link time they are not called. The memkind library does not define these symbols, which are reserved for user definition. These decorators can be used to track calls to the heap management interface or to modify parameters. The decorators that are called at the beginning of the allocator pass all inputs by reference, and the decorators that are called at the end of the allocator pass the output by reference. This enables the modification of the input and output of each heap management function by the decorators.
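The weak-symbol mechanism described above can be illustrated with a small stand-alone sketch. It does not use memkind itself: demo_malloc(), demo_malloc_pre() and demo_malloc_post() are hypothetical names chosen for illustration, the wrapper simply forwards to malloc(3), and the weak attribute assumes a GCC- or Clang-compatible compiler.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hooks (not memkind symbols). Declared weak so that the
 * wrapper below still links when the user defines neither of them. */
void demo_malloc_pre(size_t *size) __attribute__((weak));
void demo_malloc_post(size_t size, void **result) __attribute__((weak));

/* Stand-in for a heap management function: calls the hooks only if
 * they were provided, passing inputs and output by reference. */
void *demo_malloc(size_t size)
{
    void *result;

    if (demo_malloc_pre)
        demo_malloc_pre(&size);          /* hook may modify the input */
    result = malloc(size);
    if (demo_malloc_post)
        demo_malloc_post(size, &result); /* hook may modify the output */
    return result;
}

/* User-supplied decorators: count calls and round each request up
 * to a multiple of 64 bytes. */
size_t demo_calls;
size_t demo_last_size;

void demo_malloc_pre(size_t *size)
{
    demo_calls++;
    *size = (*size + 63) & ~(size_t)63;
}

void demo_malloc_post(size_t size, void **result)
{
    (void)result;                        /* output left unchanged */
    demo_last_size = size;
}
```

Removing the two hook definitions leaves demo_malloc() working as a plain pass-through, which mirrors the statement that decorators absent at link time are not called.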
The memkind library version scheme consists of major, minor and patch numbers separated by dots. Combining those numbers gives the representation major.minor.patch, where:
-major number is incremented whenever API is changed (loss of backward compatibility),
-minor number is incremented whenever additional extensions are introduced, or behavior has been changed,
-patch number is incremented whenever small bug fixes are added.
The memkind library provides a numeric representation of the version by exposing the following API:
int memkind_get_version() returns the version number as a single integer, obtained from the formula:
major * 1000000 + minor * 1000 + patch
Note: major < 1 means unstable API.
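As a worked example of the formula above, the following sketch splits a version integer back into its three components. The names demo_version, demo_decode_version() and demo_format_version() are hypothetical helpers, not part of memkind, and the decoding assumes that minor and patch are each below 1000.

```c
#include <stdio.h>

/* Hypothetical helper type, not part of memkind. */
struct demo_version {
    int major;
    int minor;
    int patch;
};

/* Invert value = major * 1000000 + minor * 1000 + patch.
 * Assumes minor < 1000 and patch < 1000. */
struct demo_version demo_decode_version(int value)
{
    struct demo_version v;

    v.major = value / 1000000;
    v.minor = (value / 1000) % 1000;
    v.patch = value % 1000;
    return v;
}

/* Write "major.minor.patch" into buf; returns the snprintf() result. */
int demo_format_version(int value, char *buf, size_t size)
{
    struct demo_version v = demo_decode_version(value);

    return snprintf(buf, size, "%d.%d.%d", v.major, v.minor, v.patch);
}
```

For instance, the value 1002003 decodes to major 1, minor 2, patch 3. In a program linked with -lmemkind, the input would be the value returned by memkind_get_version().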
API standards:
-STANDARD API, API is considered stable
-NON-STANDARD API, API is considered stable, however this is not a standard way to use memkind
-EXPERIMENTAL API, API is considered unstable and subject to change
The memkind_ops structure is a vtable that defines the operations which determine the kind of memory. This design pattern is modeled after the "mix-in" pattern used in the Linux kernel to enable some of the features of an object oriented language in C. This section defines the inputs, outputs and responsibilities of each function pointer enumerated in the memkind_ops structure. Each of these methods takes a memkind_t argument as its first parameter which shall be self referencing. In this documentation the function pointers in the memkind_ops structure will be prepended with "ops." and should be considered the operation associated with the kind.
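The vtable pattern described above can be sketched in a few lines of plain C. This is only an illustration under stated assumptions: demo_ops, demo_kind and the demo_ functions are hypothetical and far smaller than the real memkind_ops structure, and the operations simply forward to malloc(3) and free(3).

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_kind;

/* Hypothetical vtable: each operation takes the kind as its first
 * (self-referencing) parameter. */
struct demo_ops {
    void *(*alloc)(struct demo_kind *kind, size_t size);
    void (*release)(struct demo_kind *kind, void *ptr);
};

/* A kind is a pointer to its operations plus per-kind state. */
struct demo_kind {
    const struct demo_ops *ops;
    const char *name;
    size_t live;                 /* number of outstanding allocations */
};

/* One possible implementation of the operations. */
static void *demo_default_alloc(struct demo_kind *kind, size_t size)
{
    void *p = malloc(size);

    if (p)
        kind->live++;
    return p;
}

static void demo_default_release(struct demo_kind *kind, void *ptr)
{
    if (ptr) {
        kind->live--;
        free(ptr);
    }
}

const struct demo_ops DEMO_DEFAULT_OPS = {
    demo_default_alloc,
    demo_default_release
};

/* Public entry points dispatch through the vtable of the kind. */
void *demo_kind_malloc(struct demo_kind *kind, size_t size)
{
    return kind->ops->alloc(kind, size);
}

void demo_kind_free(struct demo_kind *kind, void *ptr)
{
    kind->ops->release(kind, ptr);
}
```

A caller would initialize a struct demo_kind with &DEMO_DEFAULT_OPS and then use only demo_kind_malloc() and demo_kind_free(). Swapping the ops pointer changes the behavior without touching the call sites.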
int ops.create(memkind_t kind, const struct memkind_ops *ops, const char *name); shall instantiate all of the dynamic resources associated with the kind. It takes a pointer to the vtable structure ops, which has a function pointer for each of the methods defined in this section of the man page. If any methods are unnecessary to the implementation of the kind, these function pointers shall be set to NULL. The name string is an input parameter that identifies the kind of memory so that it can be fetched with the memkind_get_kind_by_name() function. Typically this method is either a pointer to the function memkind_default_create() defined in <memkind_default.h>, or a function that calls memkind_default_create() before performing other setup.
int ops.destroy(memkind_t kind); shall free all of the dynamic resources reserved by the ops.create() method. If no dynamic resources were explicitly allocated in the ops.create() method, this pointer can be set to NULL.
void *ops.malloc(memkind_t kind, size_t size); shall implement memkind_malloc(), as described above.
void *ops.calloc(memkind_t kind, size_t num, size_t size); shall implement memkind_calloc(), as described above.
int ops.posix_memalign(memkind_t kind, void **memptr, size_t alignment, size_t size); shall implement memkind_posix_memalign(), as described above.
void *ops.realloc(memkind_t kind, void *ptr, size_t size); shall implement memkind_realloc(), as described above.
void ops.free(memkind_t kind, void *ptr); shall implement memkind_free(), as described above.
void *ops.mmap(memkind_t kind, void *addr, size_t size); shall wrap the mmap(2), mbind(2), and madvise(2) system calls while passing addr and size through and determining all other parameters for mmap(2), mbind(2), and madvise(2) by calling other functions resolved by the kind.ops vtable. This function shall return a virtual address to the memory mapped, or MAP_FAILED, which is defined in <sys/mman.h> as (void *) -1.
int ops.mbind(memkind_t kind, void *ptr, size_t size); shall wrap the mbind(2) system call and pass through the start address ptr to be bound, and the number of bytes size from that address to be bound. The other parameters to mbind(2) shall be determined by calling other functions resolved by the kind.ops vtable.
int ops.madvise(memkind_t kind, void *addr, size_t size); shall wrap the madvise(2) system call and pass through the start address addr to be advised, and the number of bytes size from that address to be advised. This may call madvise(2) multiple times with different advice.
int ops.get_mmap_flags(memkind_t kind, int *flags); shall set flags to a value appropriate for passing to the mmap(2) system call for the kind.
int ops.get_mbind_mode(memkind_t kind, int *mode); shall set mode to a value appropriate for passing to the mbind(2) system call for the kind.
int ops.get_mbind_nodemask(memkind_t kind, unsigned long *nodemask, unsigned long maxnode); shall set the nodemask of length maxnode bits to a value appropriate for passing to the mbind(2) system call for the kind.
int ops.get_arena(memkind_t kind, unsigned int *arena, size_t size); shall set arena to an index appropriate for the kind, allocation size, and CPU when using the jemalloc arena allocation through the jemk_mallocx() API. The size parameter is currently unused, but will be needed for integration with jemalloc 4.0.x (planned for a future release).
int ops.get_size(memkind_t kind, size_t *total, size_t *free); shall implement memkind_get_size(), as described above.
int ops.check_available(memkind_t kind); shall return 0 if the kind is available on the system, and an error code if not.
int ops.check_addr(memkind_t kind, void *addr); shall return 0 if the addr can be freed with the specified kind and an error code otherwise. If the memory cannot be freed with jemk_free(), then at least one of the instantiated kinds must return 0 to enable freeing.
This function pointer shall be set to NULL for any kind that is not built-in. The method is used to allocate dynamic resources for built-in kinds without requiring an initialization routine.
memkind_calloc(), memkind_malloc(), and memkind_realloc() return the pointer to the allocated memory, or NULL if the request fails. memkind_free() and memkind_error_message() do not have return values. All other memkind APIs return 0 upon success, and an error code defined in the ERRORS section upon failure. The memkind library avoids setting errno directly, but calls to underlying libraries and system calls may set errno.
The available kinds of memory:
MEMKIND_DEFAULT
Default allocation using standard memory and default page size.
MEMKIND_HUGETLB
Allocate from standard memory using huge pages. Note: This kind requires the huge pages configuration described in the SYSTEM CONFIGURATION section.
MEMKIND_GBTLB
Allocate from standard memory using gigabyte huge pages. Note: This kind requires the gigabyte pages configuration described in the SYSTEM CONFIGURATION section.
MEMKIND_INTERLEAVE
Allocate pages interleaved across all NUMA nodes with transparent huge pages disabled.
MEMKIND_HBW
Allocate from the closest high bandwidth memory NUMA node at time of allocation. If there is not enough high bandwidth memory to satisfy the request, errno is set to ENOMEM and the allocated pointer is set to NULL.
MEMKIND_HBW_HUGETLB
Same as MEMKIND_HBW except the allocation is backed by huge pages. Note: This kind requires the huge pages configuration described in the SYSTEM CONFIGURATION section.
MEMKIND_HBW_PREFERRED
Same as MEMKIND_HBW except that if there is not enough high bandwidth memory to satisfy the request, the allocation will fall back on standard memory.
MEMKIND_HBW_PREFERRED_HUGETLB
Same as MEMKIND_HBW_PREFERRED except the allocation is backed by huge pages. Note: This kind requires the huge pages configuration described in the SYSTEM CONFIGURATION section.
MEMKIND_HBW_GBTLB
Same as MEMKIND_HBW except the allocation is backed by one gigabyte huge pages. Note that size can take on any value, but full gigabyte pages will be allocated for each request, so the remainder of the last page will be wasted. A good use case is growing a buffer over the course of an application with reallocs. In this case, if there is enough memory available within an already allocated gigabyte page, new pages are not fetched. This is demonstrated in the examples directory with gb_realloc_example.c. Note: This kind requires the gigabyte pages configuration described in the SYSTEM CONFIGURATION section.
MEMKIND_HBW_PREFERRED_GBTLB
Same as MEMKIND_HBW_GBTLB except that if there is not enough high bandwidth memory to satisfy the request, the allocation will fall back on standard memory. Note: This kind requires the gigabyte pages configuration described in the SYSTEM CONFIGURATION section.
MEMKIND_HBW_INTERLEAVE
Same as MEMKIND_HBW except that the pages that support the allocation are interleaved across all high bandwidth nodes and transparent huge pages are disabled.
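For the gigabyte-page kinds, the rounding of each request to whole pages can be made concrete with a little arithmetic. The sketch below is not memkind code: demo_gb_pages(), demo_gb_slack() and demo_gb_grow_needs_pages() are hypothetical helpers that only model the statement that full gigabyte pages back each request, and they assume size_t is wide enough to hold the sizes involved.

```c
#include <stddef.h>

#define DEMO_GB ((size_t)1 << 30)   /* one gigabyte page, in bytes */

/* Number of whole gigabyte pages needed to back size bytes. */
size_t demo_gb_pages(size_t size)
{
    return size / DEMO_GB + (size % DEMO_GB != 0);
}

/* Bytes left unused at the end of the last page. */
size_t demo_gb_slack(size_t size)
{
    return demo_gb_pages(size) * DEMO_GB - size;
}

/* Nonzero if growing a buffer from old_size to new_size bytes
 * requires more pages than are already backing it. */
int demo_gb_grow_needs_pages(size_t old_size, size_t new_size)
{
    return demo_gb_pages(new_size) > demo_gb_pages(old_size);
}
```

Under this model a request of one byte more than a gigabyte occupies two pages and leaves almost a full page unused, while growing a small buffer up to one gigabyte stays within the page it already occupies.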
memkind_posix_memalign() returns one of the POSIX standard error codes EINVAL or ENOMEM as defined in <errno.h> if an error occurs (these have positive values). If the alignment parameter is not a power of two, or is not a multiple of sizeof(void *), then EINVAL is returned. If there is insufficient memory to satisfy the request then ENOMEM is returned.
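The alignment rule above can be expressed as a short check. This is a sketch of the rule as stated, not the library's implementation, and demo_check_alignment() is a hypothetical name.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: returns 0 if alignment satisfies the rule
 * described above, or EINVAL (a positive value) otherwise. */
int demo_check_alignment(size_t alignment)
{
    /* A power of two is nonzero and has exactly one bit set. */
    if (alignment == 0 || (alignment & (alignment - 1)) != 0)
        return EINVAL;

    /* It must also be a multiple of the pointer size. */
    if (alignment % sizeof(void *) != 0)
        return EINVAL;

    return 0;
}
```

With this check, 64 and sizeof(void *) are accepted, while 0, 1 and 3 * sizeof(void *) are rejected.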
All functions other than memkind_posix_memalign() that have an integer return type return one of the negative error codes defined in <memkind.h> and described below.
Requested memory kind is not available
Call to mbind(2) failed
Call to mmap(2) failed
Call to jemk_posix_memalign() failed
Call to jemk_mallctl() failed
Call to jemk_malloc() failed
Call to sched_getcpu() returned out of range
Two NUMA memory nodes are equidistant from target cpu node
Alignment must be a power of two and larger than sizeof(void *)
Call to jemk_allocm() failed
Error parsing environment variable (MEMKIND_*)
Invalid input arguments to memkind routine
Prints a comma separated list of high bandwidth nodes.
MEMKIND_HBW_NODES
This environment variable is a comma-separated list of NUMA nodes that are treated as high bandwidth. It uses the libnuma routine numa_parse_nodestring() for parsing, so the syntax described in the numa(3) man page for this routine applies: e.g. 1-3,5 is a valid setting.
MEMKIND_ARENA_NUM_PER_KIND
This environment variable sets the number of arenas per kind through an internal mechanism of the library. The value should be a positive integer (not greater than INT_MAX defined in limits.h). The user should set the value based on the characteristics of the application that is using the library. A higher value can provide better performance in extremely multithreaded applications at the cost of memory overhead. See section "IMPLEMENTATION NOTES" of jemalloc(3) for more details about arenas.
MEMKIND_DEBUG
Controls the logging mechanism in memkind. Setting MEMKIND_DEBUG to "1" enables printing messages such as errors and general information about the environment to stderr.
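To make the node list syntax in the example setting 1-3,5 concrete, the sketch below parses such a string into a bitmask. It is a deliberately simplified stand-in and not the libnuma routine: demo_parse_nodes() is a hypothetical function that accepts only comma-separated node numbers and inclusive ranges from 0 to 63, and none of the other forms that numa_parse_nodestring() may accept.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical simplified parser: sets bit n of *mask for every node n
 * named in s. Returns 0 on success, -1 on malformed input. */
int demo_parse_nodes(const char *s, uint64_t *mask)
{
    uint64_t out = 0;

    for (;;) {
        char *end;
        long lo, hi;

        if (*s < '0' || *s > '9')
            return -1;                 /* each entry starts with a digit */
        errno = 0;
        lo = strtol(s, &end, 10);
        if (errno != 0)
            return -1;
        hi = lo;
        s = end;

        if (*s == '-') {               /* optional "-M" range end */
            s++;
            if (*s < '0' || *s > '9')
                return -1;
            hi = strtol(s, &end, 10);
            if (errno != 0)
                return -1;
            s = end;
        }

        if (lo > hi || hi > 63)
            return -1;
        for (long n = lo; n <= hi; n++)
            out |= (uint64_t)1 << n;

        if (*s == '\0')
            break;                     /* end of list */
        if (*s != ',')
            return -1;
        s++;                           /* skip the comma */
    }

    *mask = out;
    return 0;
}
```

Parsing the string 1-3,5 with this sketch sets bits 1, 2, 3 and 5 of the mask. A real application would rely on numa_parse_nodestring() as the library does.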
Obtaining 2MB and 1GB pages (HUGETLB and GBTLB) requires huge pages to be allocated in the kernel’s huge page pool.
HUGETLB (huge pages)
The current number of "persistent" huge pages can be read from the /proc/sys/vm/nr_hugepages file. The recommended way to set the number of huge pages is: "sudo sysctl vm.nr_hugepages=<number_of_hugepages>". More information can be found here: https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
GBTLB (gigabyte pages)
The number of preallocated gigabyte pages can be read from /proc/cmdline (hugepagesz=1G nr_hugepages=N). Gigabyte huge pages can be configured on the kernel command line. With kernel 3.16 and later, users can also allocate gigabyte pages in the same way as 2MB pages.
HUGETLB (huge pages)
There might be some overhead in huge pages consumption caused by heap management. If your allocation fails because of OOM, please try to allocate extra huge pages (e.g. 8 huge pages).
Copyright (C) 2014 - 2016 Intel Corporation. All rights reserved.
malloc(3), numa(3), numactl(8), mbind(2), mmap(2), move_pages(2), jemalloc(3), memkind_default(3), memkind_arena(3), memkind_hbw(3), memkind_hugetlb(3), memkind_gbtlb(3), memkind_pmem(3)