Lightweight performance analysis toolkit and suite of tools
Lightweight, cross-language utility for recording timing, memory, resource usage, and hardware counters on the CPU and GPU. Timemory provides 40+ metrics for C, C++, CUDA, and/or Python codes that can be arbitrarily composed into distinct toolsets, which can be interwoven without nesting restrictions.
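To illustrate what composing metrics into toolsets and interleaving them means, here is a minimal conceptual sketch that uses only the Python standard library. It is not timemory's API: the names `WallClock`, `CpuClock`, `PeakPythonMemory`, and `Toolset` are hypothetical and were invented for this example. Each toolset bundles a chosen set of metrics, and because start and stop are explicit calls, two measurement regions can overlap without one having to enclose the other.

```python
import time
import tracemalloc


class WallClock:
    """Hypothetical metric: elapsed wall-clock seconds."""
    name = "wall_clock"

    def start(self):
        self._t0 = time.perf_counter()

    def stop(self):
        return time.perf_counter() - self._t0


class CpuClock:
    """Hypothetical metric: CPU seconds used by this process."""
    name = "cpu_clock"

    def start(self):
        self._t0 = time.process_time()

    def stop(self):
        return time.process_time() - self._t0


class PeakPythonMemory:
    """Hypothetical metric: peak bytes allocated by Python since tracing began."""
    name = "peak_py_memory"

    def start(self):
        # Only start (and later stop) tracing if nobody else already did.
        self._owns_tracing = not tracemalloc.is_tracing()
        if self._owns_tracing:
            tracemalloc.start()

    def stop(self):
        _, peak = tracemalloc.get_traced_memory()
        if self._owns_tracing:
            tracemalloc.stop()
        return peak


class Toolset:
    """Hypothetical bundle of metrics that are started and stopped together."""

    def __init__(self, label, *components):
        self.label = label
        self.components = components

    def start(self):
        for component in self.components:
            component.start()
        return self

    def stop(self):
        # Collect one reading per metric, keyed by the metric's name.
        return {c.name: c.stop() for c in self.components}


# Two distinct toolsets built from the same pool of metrics.
timing = Toolset("timing", WallClock(), CpuClock())
memory = Toolset("memory", WallClock(), PeakPythonMemory())

timing.start()
data = [i * i for i in range(100_000)]
memory.start()                      # begins inside the timing region
total = sum(data)
timing_results = timing.stop()      # ends first: the regions overlap, not nest
buffer = bytearray(1_000_000)
memory_results = memory.stop()

print(timing.label, timing_results)
print(memory.label, memory_results)
```

The explicit `start()`/`stop()` pair is what allows the overlap here; a design based only on context managers would force strict nesting of the measured regions.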