The Life of a Tensor - the torch.rand edition
- Preface
- torch/\_C/_VariableFunctions.pyi
- backtrace
- Python bindings
- torch::autograd::THPVariable_rand
- operator()
- C++ API
- torch::rand_symint
- at::rand_symint
- dispatch
- at::_ops::rand
- at::_ops::rand::call
- native_functions.yaml
- at::(anonymous namespace)::rand
- redispatch
- at::_ops::rand::redispatch
- native_functions.yaml
- CPU kernel
- at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand
- at::native::rand
- at::native::rand
- c10::optional
- c10::OptionalBase
- c10::constexpr_optional_base
- constexpr_storage_t
- at::native::rand
Preface
The article Life of a Tensor traced how a torch.rand call travels from the Python API all the way down into the C++ internals. It was written in July 2019, against PyTorch 1.1.0. Following that article, this post walks through the same end-to-end call flow, but against the more recent PyTorch 2.0.
First, write a torch_rand.py as follows:
import torch
torch.rand(3, 4)
If we jump to the definition of torch.rand, we land in torch/_C/_VariableFunctions.pyi.
torch/_C/_VariableFunctions.pyi
torch/_C/_VariableFunctions.pyi
This file contains eight rand overloads with different signatures:
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, generator: Optional[Generator], names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(*size: _int, generator: Optional[Generator], names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, generator: Optional[Generator], out: Optional[Tensor] = None, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(*size: _int, generator: Optional[Generator], out: Optional[Tensor] = None, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, out: Optional[Tensor] = None, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(*size: _int, out: Optional[Tensor] = None, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(*size: _int, names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
torch.rand(3, 4) resolves to the third-to-last of the rand overloads above.
Note 1: For the types that appear here, such as *size, _int, and Sequence, and for the meaning of the ", *," keyword-only marker, see "The Python typing library" and "torch.types".
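As a quick, hypothetical illustration (not PyTorch code) of the *size form versus the Sequence form with a bare *, consider:
from typing import Optional, Sequence

# Hypothetical toy signatures mimicking the two stub styles above.
def toy_rand_varargs(*size: int, generator: Optional[object] = None):
    # generator is keyword-only because it comes after the *size catch-all
    return size, generator

def toy_rand_sequence(size: Sequence[int], *, generator: Optional[object] = None):
    # here the bare "*" is what makes generator keyword-only
    return tuple(size), generator

print(toy_rand_varargs(3, 4))                    # ((3, 4), None)
print(toy_rand_sequence((3, 4), generator="g"))  # ((3, 4), 'g')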
Note 2: torch/_C/_VariableFunctions.pyi is generated by gen_pyi.py from the template torch/_C/_VariableFunctions.pyi.in and native_functions.yaml. For the details of how it is produced, see "The pyi file generation mechanism in PyTorch".
If we align the similar parts and elide what is completely identical:
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, generator: Optional[Generator], names: Optional[Sequence[Union[str, ellipsis, None]]], ...
@overload
def rand(*size: _int, generator: Optional[Generator], names: Optional[Sequence[Union[str, ellipsis, None]]], ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, generator: Optional[Generator], out: Optional[Tensor] = None, ...
@overload
def rand(*size: _int, generator: Optional[Generator], out: Optional[Tensor] = None, ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, out: Optional[Tensor] = None, ...
@overload
def rand(*size: _int, out: Optional[Tensor] = None, ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, names: Optional[Sequence[Union[str, ellipsis, None]]], ...
@overload
def rand(*size: _int, names: Optional[Sequence[Union[str, ellipsis, None]]], ...
Based on their parameters, they can be grouped into four categories:
- size, generator, names
- size, generator (out optional)
- size (out optional)
- size, names
Each of these four categories comes in two versions, depending on whether the first (shape) parameter accepts _int values or a Sequence.
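For reference, each of the four categories can be exercised from a Python session as follows; note that the names variants belong to the named-tensor prototype and emit a warning in PyTorch 2.0:
import torch

g = torch.Generator().manual_seed(0)
buf = torch.empty(3, 4)

torch.rand(3, 4)                                   # size only
torch.rand(3, 4, out=buf)                          # size, with the optional out
torch.rand(3, 4, generator=g)                      # size + generator
torch.rand((3, 4), names=("N", "C"))               # size + names
torch.rand((3, 4), generator=g, names=("N", "C"))  # size + generator + names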
If we change the code in the Python script to:
torch.rand((3,4))
or:
torch.rand([3,4])
the call resolves instead to the fourth-to-last, Sequence-based rand overload (the counterpart of the third-to-last, _int-based one we just saw):
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, out: Optional[Tensor] = None, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
backtrace
Next, create a file named gdbrand with the following contents:
python sys.path.append("/usr/share/gcc/python");
set logging file rand.txt
set logging on
set breakpoint pending on
break at::empty
info breakpoints
run torch_rand.py
bt
Launch gdb python and execute the script above with the source gdbrand command; this produces the following backtrace:
#0 at::empty (size=..., options=..., memory_format=memory_format@entry=...) at /root/Documents/pytorch/build/aten/src/ATen/Functions.h:2652
#1 0x00007f04620d60b1 in at::native::rand (size=..., generator=..., dtype=..., layout=..., device=..., pin_memory=...)at /root/Documents/pytorch/c10/util/Optional.h:204
#2 0x00007f04620d61fc in at::native::rand (size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...)at /root/Documents/pytorch/aten/src/ATen/native/TensorFactories.cpp:781
#3 0x00007f0462dd2c28 in at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand (pin_memory=..., device=..., layout=..., dtype=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/RegisterCompositeExplicitAutograd.cpp:2214
#4 c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>), at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand>, at::Tensor, c10::guts::typelist::typelist<c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > >::operator() (args#4=..., args#3=..., args#2=..., args#1=..., args#0=..., this=<optimized out>)at /root/Documents/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13
#5 c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>), at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand>, at::Tensor, c10::guts::typelist::typelist<c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > >, at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)>::call(c10::OperatorKernel *, c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) (functor=<optimized out>, args#0=..., args#1=..., args#2=..., args#3=..., args#4=...) at /root/Documents/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:463
#6 0x00007f0462889309 in c10::callUnboxedKernelFunction<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > (dispatchKeySet=..., functor=<optimized out>, unboxed_kernel_func=<optimized out>)at /root/Documents/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
#7 c10::KernelFunction::call<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > (dispatchKeySet=..., opHandle=..., this=0x2075db8) at /root/Documents/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:91
#8 c10::Dispatcher::redispatch<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> >(c10::TypedOperatorHandle<at::Tensor (c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)> const&, c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) const (this=<optimized out>, currentDispatchKeySet=..., op=...)at /root/Documents/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:656
#9 c10::TypedOperatorHandle<at::Tensor (c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)>::redispatch(c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) const (args#4=..., args#3=..., args#2=..., args#1=..., args#0=..., currentDispatchKeySet=..., this=0x7f046a780150 <at::_ops::rand::redispatch(c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)::op>) at /root/Documents/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:492
#10 at::_ops::rand::redispatch (dispatchKeySet=..., size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...) at /root/Documents/pytorch/build/aten/src/ATen/Operators_2.cpp:5220
#11 0x00007f0462c05329 in at::(anonymous namespace)::rand (pin_memory=..., device=..., layout=..., dtype=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:365
#12 c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>), at::(anonymous namespace)::rand>, at::Tensor, c10::guts::typelist::typelist<c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > >::operator() (args#4=..., args#3=..., args#2=..., args#1=..., args#0=..., this=<optimized out>) at /root/Documents/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13
#13 c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>), at::(anonymous namespace)::rand>, at::Tensor, c10::guts::typelist::typelist<c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > >, at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)>::call(c10::OperatorKernel *, c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) (functor=<optimized out>, args#0=..., args#1=..., args#2=..., args#3=..., args#4=...) at /root/Documents/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:463
#14 0x00007f04628e3422 in c10::callUnboxedKernelFunction<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > (dispatchKeySet=..., functor=<optimized out>, unboxed_kernel_func=<optimized out>) at /root/Documents/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
#15 c10::KernelFunction::call<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > (dispatchKeySet=..., opHandle=..., this=0x2076638) at /root/Documents/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:91
#16 c10::Dispatcher::call<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> >(c10::TypedOperatorHandle<at::Tensor (c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)> const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) const (op=..., this=<optimized out>) at /root/Documents/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:639
#17 c10::TypedOperatorHandle<at::Tensor (c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)>::call(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) const (args#4=..., args#3=..., args#2=..., args#1=..., args#0=..., this=0x7f046a780170 <at::_ops::rand::call(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)::op>) at /root/Documents/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:487
#18 at::_ops::rand::call (size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...) at /root/Documents/pytorch/build/aten/src/ATen/Operators_2.cpp:5213
#19 0x00007f046abdd595 in at::rand_symint (options=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/Functions.h:5770
#20 torch::rand_symint (options=..., size=...) at /root/Documents/pytorch/torch/csrc/autograd/generated/variable_factories.h:418
#21 operator() (__closure=<synthetic pointer>, options=..., size=...) at /root/Documents/pytorch/torch/csrc/autograd/generated/python_torch_functions_0.cpp:5256
#22 torch::autograd::THPVariable_rand (self_=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at /root/Documents/pytorch/torch/csrc/autograd/generated/python_torch_functions_0.cpp:5258
#23 0x0000000000508127 in cfunction_call (func=0x7f048932aa90, args=<optimized out>, kwargs=<optimized out>) at /usr/local/src/conda/python-3.9.13/Objects/methodobject.c:543
#24 0x00000000004f0edc in _PyObject_MakeTpCall (tstate=0x1a610a0, callable=0x7f048932aa90, args=<optimized out>, nargs=<optimized out>, keywords=0x0) at /usr/local/src/conda/python-3.9.13/Objects/call.c:191
#25 0x00000000004ed255 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x1abc308, callable=0x7f048932aa90, tstate=<optimized out>) at /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:116
#26 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x1abc308, callable=0x7f048932aa90, tstate=<optimized out>) at /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:103
#27 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x1abc308, callable=0x7f048932aa90) at /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:127
#28 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x1a610a0) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:5077
#29 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x1abc190, throwflag=<optimized out>) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:3489
#30 0x00000000004e70ca in _PyEval_EvalFrame (throwflag=0, f=0x1abc190, tstate=0x1a610a0) at /usr/local/src/conda/python-3.9.13/Include/internal/pycore_ceval.h:40
#31 _PyEval_EvalCode (tstate=<optimized out>, _co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=0x0, kwargs=0x0, kwcount=<optimized out>, kwstep=2, defs=0x0, defcount=<optimized out>, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:4329
#32 0x00000000004e6d57 in _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:4361
#33 0x00000000004e6d09 in PyEval_EvalCodeEx (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kws=<optimized out>, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:4377
#34 0x0000000000594e7b in PyEval_EvalCode (co=co@entry=0x7f049ec05870, globals=globals@entry=0x7f049ebfd780, locals=locals@entry=0x7f049ebfd780) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:828
#35 0x00000000005c2307 in run_eval_code_obj (tstate=0x1a610a0, co=0x7f049ec05870, globals=0x7f049ebfd780, locals=0x7f049ebfd780) at /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:1221
#36 0x00000000005be270 in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0x7f049ebfd780, locals=0x7f049ebfd780, flags=<optimized out>, arena=<optimized out>) at /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:1242
#37 0x00000000004563ed in pyrun_file (fp=0x1a5d440, filename=0x7f049eb2d930, start=<optimized out>, globals=0x7f049ebfd780, locals=0x7f049ebfd780, closeit=1, flags=0x7ffecfd9a7f8) at /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:1140
#38 0x00000000005b8062 in pyrun_simple_file (flags=0x7ffecfd9a7f8, closeit=1, filename=0x7f049eb2d930, fp=0x1a5d440) at /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:450
#39 PyRun_SimpleFileExFlags (fp=0x1a5d440, filename=<optimized out>, closeit=1, flags=0x7ffecfd9a7f8) at /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:483
#40 0x00000000005b55ce in pymain_run_file (cf=0x7ffecfd9a7f8, config=0x1a5faa0) at /usr/local/src/conda/python-3.9.13/Modules/main.c:379
#41 pymain_run_python (exitcode=0x7ffecfd9a7f0) at /usr/local/src/conda/python-3.9.13/Modules/main.c:604
#42 Py_RunMain () at /usr/local/src/conda/python-3.9.13/Modules/main.c:683
#43 0x0000000000588ff9 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at /usr/local/src/conda/python-3.9.13/Modules/main.c:1129
#44 0x00007f049ed56d90 in __libc_start_call_main (main=main@entry=0x588fb0 <main>, argc=argc@entry=2, argv=argv@entry=0x7ffecfd9aa28) at ../sysdeps/nptl/libc_start_call_main.h:58
#45 0x00007f049ed56e40 in __libc_start_main_impl (main=0x588fb0 <main>, argc=2, argv=0x7ffecfd9aa28, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffecfd9aa18) at ../csu/libc-start.c:392
#46 0x0000000000588eae in _start ()
A backtrace is read from the bottom up: it starts at _start, passes through a dozen or so functions inside Python itself, and the first PyTorch-related function it reaches is frame #22: torch::autograd::THPVariable_rand.
Python bindings
torch::autograd::THPVariable_rand
#22 torch::autograd::THPVariable_rand (self_=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at /root/Documents/pytorch/torch/csrc/autograd/generated/python_torch_functions_0.cpp:5258
torch::autograd::THPVariable_rand is defined in torch/csrc/autograd/generated/python_torch_functions_0.cpp; note that the same directory contains further .cpp files with similar names but different suffixes. These files (python_torch_functions_i.cpp) are generated while building PyTorch by the gen.py script, from native_functions.yaml and the template tools/autograd/templates/python_torch_functions.cpp; see "The python_torch_functions_i.cpp file generation mechanism in PyTorch" for details.
THPVariable_rand is declared as follows:
static PyObject * THPVariable_rand(PyObject* self_, PyObject* args, PyObject* kwargs);
A PyMethodDef array is defined that maps the Python-level rand function to the C++ THPVariable_rand:
static PyMethodDef torch_functions_shard[] = {
  //...
  {"rand", castPyCFunctionWithKeywords(THPVariable_rand), METH_VARARGS | METH_KEYWORDS | METH_STATIC, NULL},
  //...
};
The implementation of THPVariable_rand looks like this:
// generated methods start here
// ...
// rand
static PyObject * THPVariable_rand(PyObject* self_, PyObject* args, PyObject* kwargs)
{HANDLE_TH_ERRORSstatic PythonArgParser parser({"rand(SymIntArrayRef size, *, Generator? generator, DimnameList? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)","rand(SymIntArrayRef size, *, Generator? generator, Tensor out=None, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)","rand(SymIntArrayRef size, *, Tensor out=None, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)","rand(SymIntArrayRef size, *, DimnameList? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",}, /*traceable=*/true);ParsedArgs<8> parsed_args;auto _r = parser.parse(nullptr, args, kwargs, parsed_args);if(_r.has_torch_function()) {return handle_torch_function(_r, nullptr, args, kwargs, THPVariableFunctionsModule, "torch");}switch (_r.idx) {case 0: {// aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensorauto __names = _r.toDimnameListOptional(2);c10::optional<DimnameList> names = __names ? c10::make_optional(DimnameList(__names.value())) : c10::nullopt;const auto options = TensorOptions().dtype(_r.scalartypeOptional(3)).device(_r.deviceWithDefault(5, torch::tensors::get_default_device())).layout(_r.layoutOptional(4)).requires_grad(_r.toBool(7)).pinned_memory(_r.toBool(6));torch::utils::maybe_initialize_cuda(options);auto dispatch_rand = [](c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options) -> at::Tensor {pybind11::gil_scoped_release no_gil;return torch::rand_symint(size, generator, names, options);};return wrap(dispatch_rand(_r.symintlist(0), _r.generator(1), names, options));}case 1: {if (_r.isNone(2)) {// aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensorconst auto options = TensorOptions().dtype(_r.scalartypeOptional(3)).device(_r.deviceWithDefault(5, torch::tensors::get_default_device())).layout(_r.layoutOptional(4)).requires_grad(_r.toBool(7)).pinned_memory(_r.toBool(6));torch::utils::maybe_initialize_cuda(options);auto dispatch_rand = [](c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options) -> at::Tensor {pybind11::gil_scoped_release no_gil;return torch::rand_symint(size, generator, options);};return wrap(dispatch_rand(_r.symintlist(0), _r.generator(1), options));} else {// aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)check_out_type_matches(_r.tensor(2), _r.scalartypeOptional(3),_r.isNone(3), _r.layoutOptional(4),_r.deviceWithDefault(5, torch::tensors::get_default_device()), _r.isNone(5));auto dispatch_rand_out = [](at::Tensor out, c10::SymIntArrayRef size, c10::optional<at::Generator> generator) -> at::Tensor {pybind11::gil_scoped_release no_gil;return at::rand_symint_out(out, size, generator);};return wrap(dispatch_rand_out(_r.tensor(2), _r.symintlist(0), _r.generator(1)).set_requires_grad(_r.toBool(7)));}}case 2: {if (_r.isNone(1)) {// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? 
pin_memory=None) -> Tensorconst auto options = TensorOptions().dtype(_r.scalartypeOptional(2)).device(_r.deviceWithDefault(4, torch::tensors::get_default_device())).layout(_r.layoutOptional(3)).requires_grad(_r.toBool(6)).pinned_memory(_r.toBool(5));torch::utils::maybe_initialize_cuda(options);auto dispatch_rand = [](c10::SymIntArrayRef size, at::TensorOptions options) -> at::Tensor {pybind11::gil_scoped_release no_gil;return torch::rand_symint(size, options);};return wrap(dispatch_rand(_r.symintlist(0), options));} else {// aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)check_out_type_matches(_r.tensor(1), _r.scalartypeOptional(2),_r.isNone(2), _r.layoutOptional(3),_r.deviceWithDefault(4, torch::tensors::get_default_device()), _r.isNone(4));auto dispatch_rand_out = [](at::Tensor out, c10::SymIntArrayRef size) -> at::Tensor {pybind11::gil_scoped_release no_gil;return at::rand_symint_out(out, size);};return wrap(dispatch_rand_out(_r.tensor(1), _r.symintlist(0)).set_requires_grad(_r.toBool(6)));}}case 3: {// aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensorauto __names = _r.toDimnameListOptional(1);c10::optional<DimnameList> names = __names ? c10::make_optional(DimnameList(__names.value())) : c10::nullopt;const auto options = TensorOptions().dtype(_r.scalartypeOptional(2)).device(_r.deviceWithDefault(4, torch::tensors::get_default_device())).layout(_r.layoutOptional(3)).requires_grad(_r.toBool(6)).pinned_memory(_r.toBool(5));torch::utils::maybe_initialize_cuda(options);auto dispatch_rand = [](c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options) -> at::Tensor {pybind11::gil_scoped_release no_gil;return torch::rand_symint(size, names, options);};return wrap(dispatch_rand(_r.symintlist(0), names, options));}}Py_RETURN_NONE;END_HANDLE_TH_ERRORS
}
In the parser definition, the rand overloads are grouped by signature into the following four classes, covering six functions in total:
Signature 0 requires size, generator, and names:
"rand(SymIntArrayRef size, *, Generator? generator, DimnameList? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",
// aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
Signature 1 requires size and generator:
"rand(SymIntArrayRef size, *, Generator? generator, Tensor out=None, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",
Together with the comments, we can see that signature 1 is itself subdivided into two APIs; variant 1-1 does not take out:
// aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
Variant 1-2 is the form that does take out:
// aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
Signature 2 requires only size:
"rand(SymIntArrayRef size, *, Tensor out=None, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",
Again from the comments, signature 2 is subdivided into two APIs; variant 2-1 does not take out:
// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
Variant 2-2 is the form that does take out:
// aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
Signature 3 requires size and names:
"rand(SymIntArrayRef size, *, DimnameList? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",
// aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
Since we call torch.rand(3, 4) and pass only the size, execution enters case 2 of the switch.
Next, _r.isNone(1) checks whether argument 1 (0-based), i.e. the out argument, is empty. We did not pass out, so we take the if branch, which corresponds to the size-only aten::rand:
// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
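Seen from the Python side, the two branches of case 2 look like this: a call without out takes the if branch (aten::rand), while passing out makes _r.isNone(1) false and takes the else branch (aten::rand.out):
import torch

t1 = torch.rand(3, 4)                    # no out: the if branch, i.e. aten::rand
buf = torch.empty(3, 4)
t2 = torch.rand(3, 4, out=buf)           # out given: the else branch, i.e. aten::rand.out
print(t2.data_ptr() == buf.data_ptr())   # True: the result was written into buf's storage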
operator()
#21 operator() (__closure=<synthetic pointer>, options=..., size=...) at /root/Documents/pytorch/torch/csrc/autograd/generated/python_torch_functions_0.cpp:5256
torch/csrc/autograd/generated/python_torch_functions_0.cpp
This frame is simply the lambda defined inside the if branch of case 2 in torch::autograd::THPVariable_rand:
auto dispatch_rand = [](c10::SymIntArrayRef size, at::TensorOptions options) -> at::Tensor {
  pybind11::gil_scoped_release no_gil;
  return torch::rand_symint(size, options);
};
Here pybind11::gil_scoped_release no_gil; releases the GIL for the duration of the call. This does not spawn a new thread by itself; it simply means that other Python threads are free to run while torch::rand_symint executes in C++, which is what lets Python code around tensor ops actually run concurrently. See "The Python GIL and its release/acquire functions" for details.
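As a minimal sketch of what releasing the GIL buys (assuming tensors large enough that the C++ kernel dominates), several Python threads can be inside torch.rand at the same time; whether this shows up as a wall-clock speedup depends on the kernel and on intra-op threading, so treat the timing purely as an illustration:
import threading
import time
import torch

def worker():
    # The GIL is released inside torch.rand, so several of these calls
    # can be inside the C++ kernel at the same time.
    for _ in range(10):
        torch.rand(2000, 2000)

threads = [threading.Thread(target=worker) for _ in range(4)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"4 threads finished in {time.time() - start:.3f}s")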
C++ API
torch::rand_symint
#20 torch::rand_symint (options=..., size=...) at /root/Documents/pytorch/torch/csrc/autograd/generated/variable_factories.h:418
torch/csrc/autograd/generated/variable_factories.h
inline at::Tensor rand_symint(c10::SymIntArrayRef size, at::TensorOptions options = {}) {
  at::AutoDispatchBelowADInplaceOrView guard;
  return autograd::make_variable(at::rand_symint(size, at::TensorOptions(options).requires_grad(c10::nullopt)),
                                 /*requires_grad=*/options.requires_grad());
}
torch::rand_symint again takes only the size (plus TensorOptions). Its body first obtains an at::Tensor from at::rand_symint and then wraps it with autograd::make_variable before returning it.
at::rand_symint returns a plain tensor without autograd support; autograd::make_variable then attaches the autograd metadata to that tensor, and only after that does the tensor take part in automatic differentiation. For the difference between torch::rand_symint and at::rand_symint, see "The difference between torch:: and at:: factory functions".
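From the Python side this division of labor is only visible indirectly: the requires_grad flag we pass ends up on the wrapper that autograd::make_variable produces, while the tensor itself is a freshly created leaf.
import torch

t = torch.rand(3, 4, requires_grad=True)
print(t.requires_grad)  # True: set on the variable produced via autograd::make_variable
print(t.grad_fn)        # None: factory results are leaf tensors with no grad_fn
print(t.is_leaf)        # True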
Below are the other rand-family factory functions that appear in the same file:
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options = {}) {
  at::AutoDispatchBelowADInplaceOrView guard;
  return autograd::make_variable(at::rand(size, names, at::TensorOptions(options).requires_grad(c10::nullopt)),
                                 /*requires_grad=*/options.requires_grad());
}
//...
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options = {}) {
  at::AutoDispatchBelowADInplaceOrView guard;
  return autograd::make_variable(at::rand(size, generator, names, at::TensorOptions(options).requires_grad(c10::nullopt)),
                                 /*requires_grad=*/options.requires_grad());
}
//...
inline at::Tensor rand(at::IntArrayRef size, at::TensorOptions options = {}) {
  at::AutoDispatchBelowADInplaceOrView guard;
  return autograd::make_variable(at::rand(size, at::TensorOptions(options).requires_grad(c10::nullopt)),
                                 /*requires_grad=*/options.requires_grad());
}
//...
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options = {}) {
  at::AutoDispatchBelowADInplaceOrView guard;
  return autograd::make_variable(at::rand(size, generator, at::TensorOptions(options).requires_grad(c10::nullopt)),
                                 /*requires_grad=*/options.requires_grad());
}
They come in four flavors:
- size, names
- size, generator, names
- size
- size, generator
These match the four groups in torch/_C/_VariableFunctions.pyi and torch::autograd::THPVariable_rand one-to-one.
at::rand_symint
#19 0x00007f046abdd595 in at::rand_symint (options=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/Functions.h:5770
build/aten/src/ATen/Functions.h
// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, at::TensorOptions options={}) {
  return at::_ops::rand::call(size, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
The at::rand_symint function again takes only the size (plus options).
The comment makes the relationship between at::rand_symint and aten::rand explicit; we will see aten::rand again shortly in native_functions.yaml.
Just below it, an at::symint::rand function is defined whose implementation is identical to at::rand_symint:
namespace symint {
  template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>
  at::Tensor rand(c10::SymIntArrayRef size, at::TensorOptions options={}) {
    return at::_ops::rand::call(size, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
  }
}
Below are the rand-family functions that appear in the same file. They fall into six groups (aten::rand.names, aten::rand.generator_with_names, aten::rand, aten::rand.generator, aten::rand.out, and aten::rand.generator_out), which correspond one-to-one with the six functions in the comments of torch::autograd::THPVariable_rand. Each group splits in two depending on whether the first parameter is an at::IntArrayRef or a c10::SymIntArrayRef, and in two again depending on whether the remaining parameters are passed individually as dtype, layout, device, and pin_memory or bundled into an at::TensorOptions, for a total of 6 * 2 * 2 = 24 functions.
// aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_names::call(c10::fromIntArrayRefSlow(size), names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_names::call(c10::fromIntArrayRefSlow(size), names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_names::call(c10::fromIntArrayRefSlow(size), names, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_names::call(c10::fromIntArrayRefSlow(size), names, dtype, layout, device, pin_memory);}
}// aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_names::call(size, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_names::call(size, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_names::call(size, names, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_names::call(size, names, dtype, layout, device, pin_memory);}
}// aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_generator_with_names::call(c10::fromIntArrayRefSlow(size), generator, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_generator_with_names::call(c10::fromIntArrayRefSlow(size), generator, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator_with_names::call(c10::fromIntArrayRefSlow(size), generator, names, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator_with_names::call(c10::fromIntArrayRefSlow(size), generator, names, dtype, layout, device, pin_memory);}
}// aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_generator_with_names::call(size, generator, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_generator_with_names::call(size, generator, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator_with_names::call(size, generator, names, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator_with_names::call(size, generator, names, dtype, layout, device, pin_memory);}
}// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, at::TensorOptions options={}) {return at::_ops::rand::call(c10::fromIntArrayRefSlow(size), optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, at::TensorOptions options={}) {return at::_ops::rand::call(c10::fromIntArrayRefSlow(size), optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand::call(c10::fromIntArrayRefSlow(size), dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand::call(c10::fromIntArrayRefSlow(size), dtype, layout, device, pin_memory);}
}// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, at::TensorOptions options={}) {return at::_ops::rand::call(size, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, at::TensorOptions options={}) {return at::_ops::rand::call(size, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand::call(size, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand::call(size, dtype, layout, device, pin_memory);}
}// aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options={}) {return at::_ops::rand_generator::call(c10::fromIntArrayRefSlow(size), generator, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options={}) {return at::_ops::rand_generator::call(c10::fromIntArrayRefSlow(size), generator, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator::call(c10::fromIntArrayRefSlow(size), generator, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator::call(c10::fromIntArrayRefSlow(size), generator, dtype, layout, device, pin_memory);}
}// aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options={}) {return at::_ops::rand_generator::call(size, generator, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options={}) {return at::_ops::rand_generator::call(size, generator, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator::call(size, generator, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator::call(size, generator, dtype, layout, device, pin_memory);}
}// aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_out(at::Tensor & out, at::IntArrayRef size) {return at::_ops::rand_out::call(c10::fromIntArrayRefSlow(size), out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor & rand_out(at::Tensor & out, at::IntArrayRef size) {return at::_ops::rand_out::call(c10::fromIntArrayRefSlow(size), out);}
}// aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_outf(at::IntArrayRef size, at::Tensor & out) {return at::_ops::rand_out::call(c10::fromIntArrayRefSlow(size), out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor & rand_outf(at::IntArrayRef size, at::Tensor & out) {return at::_ops::rand_out::call(c10::fromIntArrayRefSlow(size), out);}
}// aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_symint_out(at::Tensor & out, c10::SymIntArrayRef size) {return at::_ops::rand_out::call(size, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor & rand_out(at::Tensor & out, c10::SymIntArrayRef size) {return at::_ops::rand_out::call(size, out);}
}// aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_symint_outf(c10::SymIntArrayRef size, at::Tensor & out) {return at::_ops::rand_out::call(size, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor & rand_outf(c10::SymIntArrayRef size, at::Tensor & out) {return at::_ops::rand_out::call(size, out);}
}// aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_out(at::Tensor & out, at::IntArrayRef size, c10::optional<at::Generator> generator) {return at::_ops::rand_generator_out::call(c10::fromIntArrayRefSlow(size), generator, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor & rand_out(at::Tensor & out, at::IntArrayRef size, c10::optional<at::Generator> generator) {return at::_ops::rand_generator_out::call(c10::fromIntArrayRefSlow(size), generator, out);}
}// aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_outf(at::IntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out) {return at::_ops::rand_generator_out::call(c10::fromIntArrayRefSlow(size), generator, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor & rand_outf(at::IntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out) {return at::_ops::rand_generator_out::call(c10::fromIntArrayRefSlow(size), generator, out);}
}// aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_symint_out(at::Tensor & out, c10::SymIntArrayRef size, c10::optional<at::Generator> generator) {return at::_ops::rand_generator_out::call(size, generator, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor & rand_out(at::Tensor & out, c10::SymIntArrayRef size, c10::optional<at::Generator> generator) {return at::_ops::rand_generator_out::call(size, generator, out);}
}// aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_symint_outf(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out) {return at::_ops::rand_generator_out::call(size, generator, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor & rand_outf(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out) {return at::_ops::rand_generator_out::call(size, generator, out);}
}
Next, at::_ops::rand::call is invoked to perform the dispatch, but before stepping into that function, let's look at how at::_ops::rand is defined.
dispatch
at::_ops::rand
build/aten/src/ATen/Operators.h
rand is a struct with two static member functions, call and redispatch, both of which we will use shortly.
struct TORCH_API rand {
  using schema = at::Tensor (c10::SymIntArrayRef, c10::optional<at::ScalarType>, c10::optional<at::Layout>, c10::optional<at::Device>, c10::optional<bool>);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")
  static at::Tensor call(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
  static at::Tensor redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
};
STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA declares three static string members, named name, overload_name, and schema_str, on the at::_ops::rand struct, and their initial values are assigned either here or, as we will see in a moment, in build/aten/src/ATen/Operators_2.cpp.
The same file contains six rand-family structs (rand_names, rand_generator_with_names, rand, rand_generator, rand_out, rand_generator_out), corresponding one-to-one with the six functions in the comments of torch::autograd::THPVariable_rand.
struct TORCH_API rand_names {using schema = at::Tensor (c10::SymIntArrayRef, c10::optional<at::DimnameList>, c10::optional<at::ScalarType>, c10::optional<at::Layout>, c10::optional<at::Device>, c10::optional<bool>);using ptr_schema = schema*;// See Note [static constexpr char* members for windows NVCC]STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "names")STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")static at::Tensor call(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);static at::Tensor redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
};

struct TORCH_API rand_generator_with_names {
  using schema = at::Tensor (c10::SymIntArrayRef, c10::optional<at::Generator>, c10::optional<at::DimnameList>, c10::optional<at::ScalarType>, c10::optional<at::Layout>, c10::optional<at::Device>, c10::optional<bool>);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "generator_with_names")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")
  static at::Tensor call(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
  static at::Tensor redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
};

struct TORCH_API rand {
  using schema = at::Tensor (c10::SymIntArrayRef, c10::optional<at::ScalarType>, c10::optional<at::Layout>, c10::optional<at::Device>, c10::optional<bool>);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")
  static at::Tensor call(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
  static at::Tensor redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
};

struct TORCH_API rand_generator {
  using schema = at::Tensor (c10::SymIntArrayRef, c10::optional<at::Generator>, c10::optional<at::ScalarType>, c10::optional<at::Layout>, c10::optional<at::Device>, c10::optional<bool>);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "generator")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")
  static at::Tensor call(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
  static at::Tensor redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
};

struct TORCH_API rand_out {
  using schema = at::Tensor & (c10::SymIntArrayRef, at::Tensor &);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "out")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)")
  static at::Tensor & call(c10::SymIntArrayRef size, at::Tensor & out);
  static at::Tensor & redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, at::Tensor & out);
};

struct TORCH_API rand_generator_out {
  using schema = at::Tensor & (c10::SymIntArrayRef, c10::optional<at::Generator>, at::Tensor &);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "generator_out")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)")
  static at::Tensor & call(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out);
  static at::Tensor & redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out);
};
at::_ops::rand::call
#18 at::_ops::rand::call (size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...) at /root/Documents/pytorch/build/aten/src/ATen/Operators_2.cpp:5213
build/aten/src/ATen/Operators_2.cpp
STATIC_CONST_STR_OUT_OF_LINE_FOR_WIN_CUDA(rand, name, "aten::rand")
STATIC_CONST_STR_OUT_OF_LINE_FOR_WIN_CUDA(rand, overload_name, "")
STATIC_CONST_STR_OUT_OF_LINE_FOR_WIN_CUDA(rand, schema_str, "rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")
In at::_ops::rand, the three member variables name, overload_name, and schema_str were declared with STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA. Here, STATIC_CONST_STR_OUT_OF_LINE_FOR_WIN_CUDA sets rand's name member to "aten::rand", its overload_name member to "", and its schema_str member to "rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor". If the initial values were already assigned in build/aten/src/ATen/Operators.h, the code here has no effect.
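As an aside, here is a minimal sketch of the pattern these two macros expand to on most compilers (illustrative plain C++, not PyTorch's actual macro expansion): the string is initialized in-class, and the out-of-line definition only provides the symbol, which is why it can look redundant.
#include <iostream>

struct rand_like_op {
  // in-class initializer, analogous to STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA
  static constexpr const char* name = "aten::rand";
};

// out-of-line definition, analogous to STATIC_CONST_STR_OUT_OF_LINE_FOR_WIN_CUDA;
// required for ODR-use before C++17, otherwise it adds nothing new
constexpr const char* rand_like_op::name;

int main() {
  std::cout << rand_like_op::name << std::endl;  // prints "aten::rand"
}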
at::_ops::rand::call is defined as follows:
// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
static C10_NOINLINE c10::TypedOperatorHandle<rand::schema> create_rand_typed_handle() {
  return c10::Dispatcher::singleton()
      .findSchemaOrThrow(rand::name, rand::overload_name)
      .typed<rand::schema>();
}

// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
at::Tensor rand::call(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
    static auto op = create_rand_typed_handle();
    return op.call(size, dtype, layout, device, pin_memory);
}
Here create_rand_typed_handle is called: dispatch is done via a table lookup, and the table consulted during dispatch is generated from native_functions.yaml. The comment also shows that at::_ops::rand::call corresponds one-to-one to the aten::rand entry we will see shortly in native_functions.yaml.
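To make the lookup concrete, here is a hedged sketch (assuming a libtorch program with ATen/core/dispatch/Dispatcher.h available) that performs the same findSchemaOrThrow / typed / call sequence by hand; the function name call_rand_via_dispatcher is made up for illustration:
#include <ATen/ATen.h>
#include <ATen/core/dispatch/Dispatcher.h>
#include <vector>

at::Tensor call_rand_via_dispatcher() {
  // Same lookup as create_rand_typed_handle: operator name "aten::rand", empty overload name.
  auto op = c10::Dispatcher::singleton()
                .findSchemaOrThrow("aten::rand", "")
                .typed<at::Tensor(c10::SymIntArrayRef,
                                  c10::optional<at::ScalarType>,
                                  c10::optional<at::Layout>,
                                  c10::optional<at::Device>,
                                  c10::optional<bool>)>();
  // Roughly what torch.rand(3, 4) boils down to, with all keyword arguments left unset.
  std::vector<c10::SymInt> sizes = {3, 4};
  return op.call(sizes, c10::nullopt, c10::nullopt, c10::nullopt, c10::nullopt);
}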
native_functions.yaml
aten/src/ATen/native/native_functions.yaml
- func: rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  device_check: NoCheck
  device_guard: False
  dispatch:
    CompositeExplicitAutograd: rand
  autogen: rand.names_out
  tags: nondeterministic_seeded

- func: rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  device_check: NoCheck
  device_guard: False
  tags: nondeterministic_seeded
  dispatch:
    CompositeExplicitAutograd: rand
  autogen: rand.generator_with_names_out

- func: rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  tags: [core, nondeterministic_seeded]
  dispatch:
    CompositeExplicitAutograd: rand

- func: rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  tags: nondeterministic_seeded
  dispatch:
    CompositeExplicitAutograd: rand

- func: rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
  tags: nondeterministic_seeded
  dispatch:
    CompositeExplicitAutograd: rand_out

- func: rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
  tags: nondeterministic_seeded
native_functions.yaml contains six versions of rand, namely:
- size, names
- size, generator, names
- size
- size, generator
- size, out
- size, generator, out
Recall the four categories in torch/_C/_VariableFunctions.pyi: "size, generator, names", "size, generator (out optional)", "size (out optional)", and "size, names". Expanding them gives six: "size, generator, names", "size, generator, out", "size, generator", "size, out", "size", and "size, names", which map one-to-one onto the six entries here.
Recall the six functions in the THPVariable_rand comment: aten::rand.generator_with_names, aten::rand.generator, aten::rand.generator_out, aten::rand, aten::rand.out, and aten::rand.names. They correspond one-to-one to the six functions here. Also, because functions listed after func: in native_functions.yaml default to the aten namespace, every function in that comment carries the aten:: prefix.
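These schema variants also surface as overloads of at::rand / at::rand_out in the C++ frontend. Below is a hedged sketch exercising a few of them (assuming the overloads generated into ATen/ops/rand.h; getDefaultCPUGenerator is used only to have a Generator on hand):
#include <ATen/ATen.h>
#include <ATen/CPUGeneratorImpl.h>

void rand_overload_demo() {
  at::Tensor a = at::rand({3, 4});                      // aten::rand
  at::Generator gen = at::detail::getDefaultCPUGenerator();
  at::Tensor b = at::rand({3, 4}, gen);                 // aten::rand.generator
  at::Tensor out = at::empty({3, 4});
  at::rand_out(out, {3, 4});                            // aten::rand.out (out comes first in C++)
  at::Tensor c = at::rand({3, 4},
      c10::optional<at::DimnameList>(c10::nullopt));    // aten::rand.names, with names left empty
}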
at::_ops::rand::call (corresponding to aten::rand) calls c10::TypedOperatorHandle<rand::schema>::call to dispatch. From the Python bindings we already know that torch.rand maps to the third rand entry here, the one without generator, names, or out; looking it up in native_functions.yaml:
- func: rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  tags: [core, nondeterministic_seeded]
  dispatch:
    CompositeExplicitAutograd: rand
we then arrive at at::(anonymous namespace)::rand.
at::(anonymous namespace)::rand
#11 0x00007f0462c05329 in at::(anonymous namespace)::rand (pin_memory=..., device=..., layout=..., dtype=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:365
build/aten/src/ATen/RegisterBackendSelect.cpp
// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
C10_ALWAYS_INLINE
at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  DispatchKeySet _dk = c10::DispatchKeySet(c10::computeDispatchKey(dtype, layout, device));
  return at::_ops::rand::redispatch(
      _dk, size, dtype, layout, device, pin_memory);
}
Here the dispatch key set is computed from dtype, layout, and device, and a second dispatch (redispatch) is performed.
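The interesting part is c10::computeDispatchKey, which derives the backend dispatch key from the (possibly empty) dtype, layout, and device arguments. A small hedged sketch of what the BackendSelect kernel computes (assumes libtorch; the CUDA line is only meaningful on a CUDA build):
#include <c10/core/TensorOptions.h>
#include <iostream>

int main() {
  // torch.rand(3, 4): no dtype/layout/device given, so the key defaults to the CPU backend
  c10::DispatchKey cpu_key =
      c10::computeDispatchKey(c10::nullopt, c10::nullopt, c10::nullopt);
  // torch.rand(3, 4, device="cuda") would instead select the CUDA backend
  c10::DispatchKey cuda_key = c10::computeDispatchKey(
      c10::nullopt, c10::nullopt, c10::Device(c10::DeviceType::CUDA, 0));
  std::cout << cpu_key << " / " << cuda_key << std::endl;  // e.g. "CPU / CUDA"
}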
redispatch
at::_ops::rand::redispatch
#10 at::_ops::rand::redispatch (dispatchKeySet=..., size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...) at /root/Documents/pytorch/build/aten/src/ATen/Operators_2.cpp:5220
build/aten/src/ATen/Operators_2.cpp
// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
at::Tensor rand::redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
    static auto op = create_rand_typed_handle();
    return op.redispatch(dispatchKeySet, size, dtype, layout, device, pin_memory);
}
native_functions.yaml
Back in native_functions.yaml, we look up the table again, focusing on aten::rand:
- func: rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  tags: [core, nondeterministic_seeded]
  dispatch:
    CompositeExplicitAutograd: rand
Look at the dispatch field: under it there is a key named CompositeExplicitAutograd, which means the redispatch goes to the CompositeExplicitAutograd backend, i.e. the wrapper_CompositeExplicitAutograd__rand function we will see shortly. The value is rand, and since the default namespace for entries under dispatch: is at::native, the redispatch ends up at the at::native::rand function.
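These dispatch: entries end up as m.impl() registrations in the generated Register*.cpp files. As a hedged, out-of-tree analogue (not from the PyTorch source), this is roughly how a custom backend could hook aten::rand for its own key such as PrivateUse1; my_backend_rand is a made-up name:
#include <torch/library.h>
#include <ATen/ATen.h>

static at::Tensor my_backend_rand(c10::SymIntArrayRef size,
                                  c10::optional<at::ScalarType> dtype,
                                  c10::optional<at::Layout> layout,
                                  c10::optional<at::Device> device,
                                  c10::optional<bool> pin_memory) {
  (void)device;  // device dropped on purpose so the fallback below stays on CPU
  // For illustration only: delegate to the stock implementation.
  return at::rand(C10_AS_INTARRAYREF_SLOW(size),
                  at::TensorOptions().dtype(dtype).layout(layout).pinned_memory(pin_memory));
}

TORCH_LIBRARY_IMPL(aten, PrivateUse1, m) {
  m.impl("rand", TORCH_FN(my_backend_rand));
}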
CPU kernel
at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand
#3 0x00007f0462dd2c28 in at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand (pin_memory=..., device=..., layout=..., dtype=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/RegisterCompositeExplicitAutograd.cpp:2214
build/aten/src/ATen/RegisterCompositeExplicitAutograd.cpp
namespace {
at::Tensor wrapper_CompositeExplicitAutograd__rand(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand(C10_AS_INTARRAYREF_SLOW(size), dtype, layout, device, pin_memory);
}
} // anonymous namespace
Note: in PyTorch 1.14 this was the at::(anonymous namespace)::(anonymous namespace)::wrapper__rand function in build/aten/src/ATen/RegisterCompositeExplicitAutograd.cpp.
The rand family of functions in the same file:
namespace {
at::Tensor wrapper_CompositeExplicitAutograd_names_rand(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand(C10_AS_INTARRAYREF_SLOW(size), names, dtype, layout, device, pin_memory);
}
} // anonymous namespace
namespace {
at::Tensor & wrapper_CompositeExplicitAutograd_names_out_rand_out(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, at::Tensor & out) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand_names_out_symint(size, names, out);
}
} // anonymous namespace
namespace {
at::Tensor wrapper_CompositeExplicitAutograd_generator_with_names_rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand(C10_AS_INTARRAYREF_SLOW(size), generator, names, dtype, layout, device, pin_memory);
}
} // anonymous namespace
namespace {
at::Tensor & wrapper_CompositeExplicitAutograd_generator_with_names_out_rand_out(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::Tensor & out) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand_generator_with_names_out_symint(size, generator, names, out);
}
} // anonymous namespace
namespace {
at::Tensor wrapper_CompositeExplicitAutograd__rand(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand(C10_AS_INTARRAYREF_SLOW(size), dtype, layout, device, pin_memory);
}
} // anonymous namespace
namespace {
at::Tensor & wrapper_CompositeExplicitAutograd_out_rand_out(c10::SymIntArrayRef size, at::Tensor & out) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand_out(C10_AS_INTARRAYREF_SLOW(size), out);
}
} // anonymous namespace
namespace {
at::Tensor wrapper_CompositeExplicitAutograd_generator_rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand(C10_AS_INTARRAYREF_SLOW(size), generator, dtype, layout, device, pin_memory);
}
} // anonymous namespace
There are seven of them: names_rand, names_out_rand_out, generator_with_names_rand, generator_with_names_out_rand_out, _rand, out_rand_out, and generator_rand (the variant with both generator and out seems to be missing?).
at::native::rand
#2 0x00007f04620d61fc in at::native::rand (size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...)at /root/Documents/pytorch/aten/src/ATen/native/TensorFactories.cpp:781
aten/src/ATen/native/TensorFactories.cpp
// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ rand ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Tensor rand(IntArrayRef size,
    c10::optional<ScalarType> dtype,
    c10::optional<Layout> layout,
    c10::optional<Device> device,
    c10::optional<bool> pin_memory) {
  return native::rand(size, static_cast<c10::optional<Generator>>(c10::nullopt), dtype, layout, device, pin_memory);
}
It creates a nullopt generator and then calls the at::native::rand overload with the fuller signature in the same file.
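The pattern here is simply a thin overload forwarding an empty optional to the full one; a toy sketch with std::optional standing in for c10::optional:
#include <iostream>
#include <optional>

// stand-in for the full at::native::rand overload
int rand_full(int size, std::optional<int> generator) {
  return generator.value_or(0) + size;
}

// stand-in for the thin overload: forward an empty optional
int rand_thin(int size) {
  return rand_full(size, static_cast<std::optional<int>>(std::nullopt));
}

int main() {
  std::cout << rand_thin(12) << std::endl;  // 12
}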
at::native::rand
#1 0x00007f04620d60b1 in at::native::rand (size=..., generator=..., dtype=..., layout=..., device=..., pin_memory=...)at /root/Documents/pytorch/c10/util/Optional.h:204
In the backtrace we jump straight from at::native::rand into c10/util/Optional.h, which is somewhat puzzling; below is the author's guess at the full call path. First, at::native::rand passes a static_cast<c10::optional<Generator>>(c10::nullopt) argument, which uses c10::optional:
c10::optional
c10/util/Optional.h
template <class T>
class optional : private OptionalBase<T> {
  // ...
};
c10::OptionalBase
c10/util/Optional.h
template <class T>
using OptionalBase = std::conditional_t<
    detail_::is_arrayref<T>::value,
    arrayref_optional_base<T>,
    std::conditional_t<
        std::is_trivially_destructible<T>::value &&
            C10_IS_TRIVIALLY_COPYABLE(T) &&
            // Avoid using is_trivially_copy_{constructible,assignable}
            // because old GCC versions don't support them. Also,
            // is_trivially_copyable seems not to do what I expect, so check
            // trivially_copyable_optimization_optional_base directly.
            std::is_copy_constructible<
                trivially_copyable_optimization_optional_base<T>>::value &&
            std::is_copy_assignable<
                trivially_copyable_optimization_optional_base<T>>::value,
        trivially_copyable_optimization_optional_base<T>,
        std::conditional_t<
            std::is_trivially_destructible<T>::value, // if possible
            constexpr_optional_base<std::remove_const_t<T>>, // use base with
                                                             // trivial
                                                             // destructor
            optional_base<std::remove_const_t<T>>>>>;
which makes use of constexpr_optional_base.
c10::constexpr_optional_base
c10/util/Optional.h
template <class T>
struct constexpr_optional_base {
  bool init_;
  constexpr_storage_t<T> storage_;
  // ...
};
constexpr_storage_t
c10/util/Optional.h
template <class T>
union constexpr_storage_t {
  unsigned char dummy_;
  T value_;

#if __cplusplus >= 202002L
  // C++20 lifted the requirement to initialize a union member in order to be
  // constexpr.
  constexpr constexpr_storage_t(trivial_init_t) noexcept {
    new (&dummy_) unsigned char;
  }
#else
  constexpr constexpr_storage_t(trivial_init_t) noexcept : dummy_() {}
#endif

  template <class... Args>
  constexpr constexpr_storage_t(Args&&... args)
      : value_(constexpr_forward<Args>(args)...) {}

  ~constexpr_storage_t() = default;
};
Judging from the backtrace, execution takes the else branch here and then goes straight into the at::empty function; the author still does not fully understand why.
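To make the layering of constexpr_optional_base on top of constexpr_storage_t more concrete, here is a minimal self-contained sketch of the same idea (a toy, not the real c10 code): the union lets the empty state skip constructing T, and a bool records whether the value slot is live.
#include <iostream>

struct trivial_init_t {};

template <class T>
union storage_t {
  unsigned char dummy_;
  T value_;
  constexpr storage_t(trivial_init_t) noexcept : dummy_() {}
  template <class... Args>
  constexpr storage_t(Args&&... args) : value_(static_cast<Args&&>(args)...) {}
  ~storage_t() = default;  // fine because T is assumed trivially destructible
};

template <class T>
struct tiny_optional {
  bool init_;
  storage_t<T> storage_;
  constexpr tiny_optional() : init_(false), storage_(trivial_init_t{}) {}
  constexpr tiny_optional(T v) : init_(true), storage_(v) {}
  constexpr bool has_value() const { return init_; }
  constexpr const T& value() const { return storage_.value_; }
};

int main() {
  constexpr tiny_optional<int> empty;     // like c10::nullopt
  constexpr tiny_optional<int> three(3);  // like c10::optional<int>(3)
  std::cout << empty.has_value() << " " << three.value() << std::endl;  // 0 3
}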
at::native::rand
Below is the corresponding function definition the author located by matching the signature at the call site:
aten/src/ATen/native/TensorFactories.cpp
Tensor rand(IntArrayRef size, c10::optional<Generator> generator,
    c10::optional<ScalarType> dtype,
    c10::optional<Layout> layout,
    c10::optional<Device> device,
    c10::optional<bool> pin_memory) {
  // See [Note: hacky wrapper removal for TensorOptions]
  TensorOptions options = TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory);

  auto result = at::empty(size, options);
  return result.uniform_(0, 1, std::move(generator));
}
It first calls at::empty to obtain an at::Tensor, then uses at::Tensor::uniform_ to fill the tensor's elements with a uniform distribution. From here execution enters at::empty or uniform_, and the journey of the rand function comes to an end.
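As a closing illustration, here is a hedged sketch (assuming a libtorch program) of performing the same two steps by hand:
#include <ATen/ATen.h>
#include <iostream>

int main() {
  // what at::native::rand effectively does for torch.rand(3, 4)
  at::TensorOptions options = at::TensorOptions().dtype(at::kFloat).device(at::kCPU);
  at::Tensor result = at::empty({3, 4}, options);  // allocate uninitialized storage
  result.uniform_(0, 1);                           // fill in place with U(0, 1)
  std::cout << result.sizes() << std::endl;        // [3, 4]
}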