taichi.lang.misc

Module Contents

Functions

print_kernel_profile_info(mode='count')

Print the profiling results of Taichi kernels.

query_kernel_profile_info(name)

Query the elapsed time (min, avg, max) of a kernel on devices using the kernel name.

clear_kernel_profile_info()

Clear all KernelProfiler records.

kernel_profiler_total_time()

Get elapsed time of all kernels recorded in KernelProfiler.

set_kernel_profiler_toolkit(toolkit_name='default')

Set the toolkit used by KernelProfiler.

set_kernel_profile_metrics(metric_list=default_cupti_metrics)

Set metrics that will be collected by the CUPTI toolkit.

collect_kernel_profile_metrics(metric_list=default_cupti_metrics)

Set temporary metrics that will be collected by the CUPTI toolkit within this context.

print_memory_profile_info()

Memory profiling tool for LLVM backends with full sparse support.

reset()

Resets Taichi to its initial state.

init(arch=None, default_fp=None, default_ip=None, _test_mode=False, enable_fallback=True, **kwargs)

Initializes the Taichi runtime.

no_activate(*args)

block_local(*args)

Hints Taichi to cache the fields and to enable the BLS optimization.

mesh_local(*args)

cache_read_only(*args)

assume_in_range(val, base, low, high)

Tape(loss, clear_gradients=True)

Return a context manager of TapeImpl.

clear_all_gradients()

Set all fields' gradients to 0.

benchmark(_func, repeat=300, args=())

benchmark_plot(fn=None, cases=None, columns=None, column_titles=None, archs=None, title=None, bars='sync_vs_async', bar_width=0.4, bar_distance=0, left_margin=0, size=(12, 8))

Attributes

i

j

k

l

ij

ik

il

jk

jl

kl

ijk

ijl

ikl

jkl

ijkl

cfg

x86_64

The x64 CPU backend.

x64

The x64 CPU backend.

arm64

The ARM CPU backend.

cuda

The CUDA backend.

metal

The Apple Metal backend.

opengl

The OpenGL backend. OpenGL 4.3 required.

cc

wasm

The WebAssembly backend.

vulkan

The Vulkan backend.

dx11

The DX11 backend.

gpu

A list of GPU backends supported on the current system.

cpu

A list of CPU backends supported on the current system.

extension

parallelize

block_dim

global_thread_idx

taichi.lang.misc.i
taichi.lang.misc.j
taichi.lang.misc.k
taichi.lang.misc.l
taichi.lang.misc.ij
taichi.lang.misc.ik
taichi.lang.misc.il
taichi.lang.misc.jk
taichi.lang.misc.jl
taichi.lang.misc.kl
taichi.lang.misc.ijk
taichi.lang.misc.ijl
taichi.lang.misc.ikl
taichi.lang.misc.jkl
taichi.lang.misc.ijkl
taichi.lang.misc.cfg
taichi.lang.misc.x86_64

The x64 CPU backend.

taichi.lang.misc.x64

The x64 CPU backend.

taichi.lang.misc.arm64

The ARM CPU backend.

taichi.lang.misc.cuda

The CUDA backend.

taichi.lang.misc.metal

The Apple Metal backend.

taichi.lang.misc.opengl

The OpenGL backend. OpenGL 4.3 required.

taichi.lang.misc.cc
taichi.lang.misc.wasm

The WebAssembly backend.

taichi.lang.misc.vulkan

The Vulkan backend.

taichi.lang.misc.dx11

The DX11 backend.

taichi.lang.misc.gpu

A list of GPU backends supported on the current system.

When this is used, Taichi automatically picks the matching GPU backend. If no GPU is detected, Taichi falls back to the CPU backend.
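The selection rule described above can be pictured with a small sketch that does not depend on Taichi. The function `pick_backend`, the candidate list, and the `is_available` predicate are hypothetical names used for illustration, not Taichi internals. The sketch only assumes that candidates are tried in order and that the CPU is the fallback.

```python
from typing import Callable, Sequence


def pick_backend(candidates: Sequence[str],
                 is_available: Callable[[str], bool],
                 fallback: str = "cpu") -> str:
    """Return the first available candidate, or the fallback if none is."""
    for name in candidates:
        if is_available(name):
            return name
    return fallback


# Hypothetical candidate order; the real order is decided by Taichi.
gpu_candidates = ["cuda", "metal", "vulkan", "opengl"]

# A machine where only Vulkan is usable: the matching GPU backend is chosen.
chosen = pick_backend(gpu_candidates, lambda name: name == "vulkan")

# A machine without any usable GPU backend: selection falls back to the CPU.
chosen_without_gpu = pick_backend(gpu_candidates, lambda name: False)

print(chosen, chosen_without_gpu)
```

In a real program this logic is hidden behind a single call such as ti.init(arch=ti.gpu).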

taichi.lang.misc.cpu

A list of CPU backends supported on the current system.

When this is used, Taichi automatically picks the matching CPU backend.

taichi.lang.misc.print_kernel_profile_info(mode='count')

Print the profiling results of Taichi kernels.

To enable this profiler, set kernel_profiler=True in ti.init(). In 'count' mode, the statistics (min, max, and avg time) of launched kernels are printed. In 'trace' mode, the records of launched kernels are printed with specific profiling metrics (time, memory load/store, core utilization, etc.). The default is 'count'.

Parameters

mode (str) – the way to print profiling results.

Example:

>>> import taichi as ti

>>> ti.init(ti.cpu, kernel_profiler=True)
>>> var = ti.field(ti.f32, shape=1)

>>> @ti.kernel
>>> def compute():
>>>     var[0] = 1.0

>>> compute()
>>> ti.print_kernel_profile_info()
>>> # equivalent calls :
>>> # ti.print_kernel_profile_info('count')

>>> ti.print_kernel_profile_info('trace')

Note

Currently, KernelProfiler results may be incorrect on the OpenGL backend because it lacks support for ti.sync().

For the advanced mode of KernelProfiler, see https://docs.taichi.graphics/docs/lang/articles/misc/profiler#advanced-mode.
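To make the difference between the two modes concrete, the following sketch derives 'count'-style statistics from a list of 'trace'-style records in plain Python. The record layout (kernel name, elapsed time in ms), the sample values, and the `summarize` helper are assumptions for illustration only. They do not reflect KernelProfiler's internal data structures or its printed format.

```python
from collections import defaultdict

# Hypothetical trace: one (kernel_name, elapsed_ms) record per launch.
trace_records = [
    ("fill", 0.50),
    ("fill", 0.25),
    ("compute", 1.00),
    ("fill", 0.75),
]


def summarize(records):
    """Group per-launch records by kernel and reduce them to count/min/max/avg."""
    times_by_kernel = defaultdict(list)
    for name, elapsed_ms in records:
        times_by_kernel[name].append(elapsed_ms)
    return {
        name: {
            "count": len(times),
            "min": min(times),
            "max": max(times),
            "avg": sum(times) / len(times),
        }
        for name, times in times_by_kernel.items()
    }


stats = summarize(trace_records)
for name, s in stats.items():
    print(f"{name}: count={s['count']} min={s['min']:.2f} "
          f"max={s['max']:.2f} avg={s['avg']:.2f} (ms)")
```

A 'trace'-style view keeps every launch visible, while a 'count'-style view trades that detail for one line per kernel.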

taichi.lang.misc.query_kernel_profile_info(name)

Query the elapsed time (min, avg, max) of a kernel on devices using the kernel name.

To enable this profiler, set kernel_profiler=True in ti.init().

Parameters

name (str) – kernel name.

Returns

An object with the member variables counter, min, max, and avg.

Return type

KernelProfilerQueryResult (class)

Example:

>>> import taichi as ti

>>> ti.init(ti.cpu, kernel_profiler=True)
>>> n = 1024*1024
>>> var = ti.field(ti.f32, shape=n)

>>> @ti.kernel
>>> def fill():
>>>     for i in range(n):
>>>         var[i] = 0.1

>>> fill()
>>> ti.clear_kernel_profile_info() #[1]
>>> for i in range(100):
>>>     fill()
>>> query_result = ti.query_kernel_profile_info(fill.__name__) #[2]
>>> print("kernel executed times =",query_result.counter)
>>> print("kernel elapsed time(min_in_ms) =",query_result.min)
>>> print("kernel elapsed time(max_in_ms) =",query_result.max)
>>> print("kernel elapsed time(avg_in_ms) =",query_result.avg)

Note

[1] To get correct results, query_kernel_profile_info() must be used together with clear_kernel_profile_info().

[2] Currently, KernelProfiler results may be incorrect on the OpenGL backend because it lacks support for ti.sync().

taichi.lang.misc.clear_kernel_profile_info()

Clear all KernelProfiler records.

taichi.lang.misc.kernel_profiler_total_time()

Get elapsed time of all kernels recorded in KernelProfiler.

Returns

Total time in seconds.

Return type

time (float)

taichi.lang.misc.set_kernel_profiler_toolkit(toolkit_name='default')

Set the toolkit used by KernelProfiler.

Currently, only the 'default' and 'cupti' toolkits are supported.

Parameters

toolkit_name (str) – string of toolkit name.

Returns

whether the setting is successful or not.

Return type

status (bool)

Example:

>>> import taichi as ti

>>> ti.init(arch=ti.cuda, kernel_profiler=True)
>>> x = ti.field(ti.f32, shape=1024*1024)

>>> @ti.kernel
>>> def fill():
>>>     for i in x:
>>>         x[i] = i

>>> ti.set_kernel_profiler_toolkit('cupti')
>>> for i in range(100):
>>>     fill()
>>> ti.print_kernel_profile_info()

>>> ti.set_kernel_profiler_toolkit('default')
>>> for i in range(100):
>>>     fill()
>>> ti.print_kernel_profile_info()

taichi.lang.misc.set_kernel_profile_metrics(metric_list=default_cupti_metrics)

Set metrics that will be collected by the CUPTI toolkit.

Parameters

metric_list (list) – a list of CuptiMetric() instances, default value: default_cupti_metrics.

Example:

>>> import taichi as ti

>>> ti.init(kernel_profiler=True, arch=ti.cuda)
>>> ti.set_kernel_profiler_toolkit('cupti')
>>> num_elements = 128*1024*1024

>>> x = ti.field(ti.f32, shape=num_elements)
>>> y = ti.field(ti.f32, shape=())
>>> y[None] = 0

>>> @ti.kernel
>>> def reduction():
>>>     for i in x:
>>>         y[None] += x[i]

>>> # When called without a parameter, Taichi prints its predefined metrics list
>>> ti.get_predefined_cupti_metrics()
>>> # get Taichi pre-defined metrics
>>> profiling_metrics = ti.get_predefined_cupti_metrics('shared_access')

>>> global_op_atom = ti.CuptiMetric(
>>>     name='l1tex__t_set_accesses_pipe_lsu_mem_global_op_atom.sum',
>>>     header=' global.atom ',
>>>     format='    {:8.0f} ')
>>> # add user defined metrics
>>> profiling_metrics += [global_op_atom]

>>> # metrics setting will be retained until the next configuration
>>> ti.set_kernel_profile_metrics(profiling_metrics)
>>> for i in range(16):
>>>     reduction()
>>> ti.print_kernel_profile_info('trace')

Note

Metrics setting will be retained until the next configuration.

taichi.lang.misc.collect_kernel_profile_metrics(metric_list=default_cupti_metrics)

Set temporary metrics that will be collected by the CUPTI toolkit within this context.

Parameters

metric_list (list) – a list of CuptiMetric() instances, default value: default_cupti_metrics.

Example:

>>> import taichi as ti

>>> ti.init(kernel_profiler=True, arch=ti.cuda)
>>> ti.set_kernel_profiler_toolkit('cupti')
>>> num_elements = 128*1024*1024

>>> x = ti.field(ti.f32, shape=num_elements)
>>> y = ti.field(ti.f32, shape=())
>>> y[None] = 0

>>> @ti.kernel
>>> def reduction():
>>>     for i in x:
>>>         y[None] += x[i]

>>> # When called without a parameter, Taichi prints its predefined metrics list
>>> ti.get_predefined_cupti_metrics()
>>> # get Taichi pre-defined metrics
>>> profiling_metrics = ti.get_predefined_cupti_metrics('device_utilization')

>>> global_op_atom = ti.CuptiMetric(
>>>     name='l1tex__t_set_accesses_pipe_lsu_mem_global_op_atom.sum',
>>>     header=' global.atom ',
>>>     format='    {:8.0f} ')
>>> # add user defined metrics
>>> profiling_metrics += [global_op_atom]

>>> # the metrics setting is temporary and is cleared when this context exits
>>> with ti.collect_kernel_profile_metrics(profiling_metrics):
>>>     for i in range(16):
>>>         reduction()
>>>     ti.print_kernel_profile_info('trace')

Note

The metric_list configuration is cleared when this context exits.
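The contrast between a retained setting (set_kernel_profile_metrics) and a temporary one (collect_kernel_profile_metrics) can be sketched with a generic context manager. `MetricConfig`, `DEFAULT_METRICS`, and the string metric names below are hypothetical stand-ins, not Taichi objects. The sketch also assumes that "cleared" means reverting to a default list.

```python
from contextlib import contextmanager

DEFAULT_METRICS = ["default"]  # stand-in for default_cupti_metrics


class MetricConfig:
    """Toy holder for the currently active metric list."""

    def __init__(self):
        self.active = list(DEFAULT_METRICS)

    def set_metrics(self, metric_list=None):
        # Persistent setting: stays in effect until the next call.
        chosen = DEFAULT_METRICS if metric_list is None else metric_list
        self.active = list(chosen)

    @contextmanager
    def collect_metrics(self, metric_list=None):
        # Temporary setting: applies only inside the `with` body.
        self.set_metrics(metric_list)
        try:
            yield self
        finally:
            # Assumption: "cleared" means going back to the default list.
            self.set_metrics()


config = MetricConfig()

with config.collect_metrics(["shared_access"]):
    inside = list(config.active)       # temporary list is active here
after_context = list(config.active)    # back to the default list

config.set_metrics(["device_utilization"])
after_set = list(config.active)        # retained until the next set_metrics()

print(inside, after_context, after_set)
```

The try/finally in the sketch means the temporary list is dropped even if the body raises, which is the usual reason to prefer a context manager for short-lived profiling configuration.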

taichi.lang.misc.print_memory_profile_info()

Memory profiling tool for LLVM backends with full sparse support.

This profiler is automatically on.

taichi.lang.misc.extension
taichi.lang.misc.reset()

Resets Taichi to its initial state.

This destroys all fields and kernels.

taichi.lang.misc.init(arch=None, default_fp=None, default_ip=None, _test_mode=False, enable_fallback=True, **kwargs)

Initializes the Taichi runtime.

This should always be the entry point of your Taichi program. Most importantly, it sets the backend used throughout the program.

Parameters
  • arch – Backend to use. This is usually cpu or gpu.

  • default_fp (Optional[type]) – Default floating-point type.

  • default_ip (Optional[type]) – Default integral type.

  • **kwargs

    Taichi provides highly customizable compilation through kwargs, which allows for fine-grained control of Taichi compiler behavior. Below we list some of the most frequently used ones. For a complete list, please check out https://github.com/taichi-dev/taichi/blob/master/taichi/program/compile_config.h.

    • cpu_max_num_threads (int): Sets the number of threads used by the CPU thread pool.

    • debug (bool): Enables the debug mode, under which Taichi does a few more things like boundary checks.

    • print_ir (bool): Prints the CHI IR of the Taichi kernels.

    • packed (bool): Enables the packed memory layout. See https://docs.taichi.graphics/lang/articles/advanced/layout.

taichi.lang.misc.no_activate(*args)
taichi.lang.misc.block_local(*args)

Hints Taichi to cache the fields and to enable the BLS optimization.

Please visit https://docs.taichi.graphics/lang/articles/advanced/performance for how BLS is used.

Parameters

*args (List[Field]) – A list of sparse Taichi fields.

taichi.lang.misc.mesh_local(*args)
taichi.lang.misc.cache_read_only(*args)
taichi.lang.misc.assume_in_range(val, base, low, high)
taichi.lang.misc.parallelize
taichi.lang.misc.block_dim
taichi.lang.misc.global_thread_idx
taichi.lang.misc.Tape(loss, clear_gradients=True)

Return a context manager of TapeImpl. Inside the with statement, the context manager records every call to a function decorated by kernel() or grad_replaced(). When the with statement ends, it computes the partial gradients of the given loss variable by calling the gradient functions of the recorded calls in reverse order.

See also kernel() and grad_replaced() for gradient functions.

Parameters
  • loss (Expr) – The loss field, whose shape should be ().

  • clear_gradients (Bool) – Whether to clear all gradients before the with body starts.

Returns

The context manager.

Return type

TapeImpl

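Example:

>>> import taichi as ti
>>>
>>> ti.init(arch=ti.cpu)
>>> # fields that take part in differentiation need needs_grad=True
>>> x = ti.field(ti.f32, shape=4, needs_grad=True)
>>> y = ti.field(ti.f32, shape=(), needs_grad=True)
>>>
>>> @ti.kernel
>>> def sum(a: ti.float32):
>>>     for I in ti.grouped(x):
>>>         y[None] += x[I] ** a
>>>
>>> with ti.Tape(loss = y):
>>>     sum(2)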
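The record-then-replay behavior described for the tape can be illustrated with a toy class in plain Python. This is a conceptual sketch only, not the implementation of TapeImpl: `ToyTape`, its `run` method, and the hand-written gradient functions are hypothetical, whereas Taichi generates gradient kernels itself.

```python
class ToyTape:
    """Toy tape: record calls in the with body, replay gradients in reverse."""

    def __init__(self):
        self.grad_funcs = []  # gradient functions in forward call order

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Backward pass: gradient functions run in reverse call order.
            for grad_func in reversed(self.grad_funcs):
                grad_func()
        return False

    def run(self, func, grad_func):
        """Run `func` now and remember its gradient function for later."""
        self.grad_funcs.append(grad_func)
        func()


# Forward computation y = (3 * x) ** 2, split into two recorded steps.
value = {"x": 2.0, "u": 0.0, "y": 0.0}
grad = {"x": 0.0, "u": 0.0, "y": 1.0}  # seed: d(y)/d(y) = 1
order = []  # records the order in which gradient functions ran


def scale():
    value["u"] = 3.0 * value["x"]


def scale_grad():
    order.append("scale_grad")
    grad["x"] += 3.0 * grad["u"]


def square():
    value["y"] = value["u"] ** 2


def square_grad():
    order.append("square_grad")
    grad["u"] += 2.0 * value["u"] * grad["y"]


with ToyTape() as tape:
    tape.run(scale, scale_grad)
    tape.run(square, square_grad)

print(value["y"], grad["x"], order)
```

The reverse order matters: `square_grad` must fill in the gradient of `u` before `scale_grad` can propagate it to `x`. Analytically, dy/dx = 18x, which is 36 at x = 2.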
taichi.lang.misc.clear_all_gradients()

Set all fields’ gradients to 0.

taichi.lang.misc.benchmark(_func, repeat=300, args=())
taichi.lang.misc.benchmark_plot(fn=None, cases=None, columns=None, column_titles=None, archs=None, title=None, bars='sync_vs_async', bar_width=0.4, bar_distance=0, left_margin=0, size=(12, 8))