The Life of a Tensor - torch.rand Edition


  • Preface
  • torch/_C/_VariableFunctions.pyi
  • backtrace
  • Python bindings
    • torch::autograd::THPVariable_rand
    • operator()
  • C++ API
    • torch::rand_symint
    • at::rand_symint
  • dispatch
    • at::_ops::rand
    • at::_ops::rand::call
    • native_functions.yaml
    • at::(anonymous namespace)::rand
  • redispatch
    • at::_ops::rand::redispatch
    • native_functions.yaml
  • CPU kernel
    • at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand
    • at::native::rand
    • at::native::rand
      • c10::optional
      • c10::OptionalBase
      • c10::constexpr_optional_base
      • constexpr_storage_t
      • at::native::rand

Preface

The article Life of a Tensor walks through the call path of the torch.rand function from the Python API all the way down to the C++ internals. It was written in July 2019, when PyTorch was at version 1.1.0. Following that article, this post looks at the same end-to-end call flow, but based on the newer PyTorch 2.0.

First, write a torch_rand.py as follows:

import torch
torch.rand(3, 4)

If we jump to the definition of torch.rand, we land in torch/_C/_VariableFunctions.pyi.

torch/_C/_VariableFunctions.pyi

The file torch/_C/_VariableFunctions.pyi contains eight rand functions with different signatures:

@overload
def rand(size: Sequence[Union[_int, SymInt]], *, generator: Optional[Generator], names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(*size: _int, generator: Optional[Generator], names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, generator: Optional[Generator], out: Optional[Tensor] = None, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(*size: _int, generator: Optional[Generator], out: Optional[Tensor] = None, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, out: Optional[Tensor] = None, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(*size: _int, out: Optional[Tensor] = None, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
@overload
def rand(*size: _int, names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...

torch.rand(3, 4) calls the third-to-last rand overload above.

Note 1: For the types that appear here, such as *size, _int, and Sequence, and for the meaning of the bare `*,` marker, see the posts on Python's typing library and on torch.types.

Note 2: torch/_C/_VariableFunctions.pyi is generated by gen_pyi.py from the template torch/_C/_VariableFunctions.pyi.in and native_functions.yaml. For the details of how it is generated, see the post on the .pyi file generation mechanism in PyTorch.

If we align the similar parts and elide the parts that are completely identical:

@overload
def rand(size: Sequence[Union[_int, SymInt]], *, generator: Optional[Generator], names: Optional[Sequence[Union[str, ellipsis, None]]], ...
@overload
def rand(*size: _int,                            generator: Optional[Generator], names: Optional[Sequence[Union[str, ellipsis, None]]], ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *, generator: Optional[Generator], out: Optional[Tensor] = None, ...
@overload
def rand(*size: _int,                            generator: Optional[Generator], out: Optional[Tensor] = None, ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *,                                 out: Optional[Tensor] = None, ...
@overload
def rand(*size: _int,                                                            out: Optional[Tensor] = None, ...
@overload
def rand(size: Sequence[Union[_int, SymInt]], *,                                 names: Optional[Sequence[Union[str, ellipsis, None]]], ...
@overload
def rand(*size: _int,                                                            names: Optional[Sequence[Union[str, ellipsis, None]]], ...

Based on their parameters, they can be grouped into the following four categories:

  • size, generator, names

  • size, generator (out optional)

  • size (out optional)

  • size, names

Each of these four categories comes in two versions, depending on whether the first parameter (the shape) is taken as _int varargs or as a Sequence.

If we change the code in the Python script to:

torch.rand((3,4))

or:

torch.rand([3,4])

then the fourth-to-last overload, the Sequence version of rand, is called instead (it forms a pair with the third-to-last, _int version we just saw):

@overload
def rand(size: Sequence[Union[_int, SymInt]], *, out: Optional[Tensor] = None, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Union[_device, str, None]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ...
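
As a quick illustration of the four categories and the two shape conventions, the following hypothetical calls (my addition, not from the original article) each hit a different overload; the sizes and dimension names are arbitrary:

import torch

g = torch.Generator().manual_seed(0)

torch.rand(3, 4)                                 # size only, varargs form (_int version)
torch.rand((3, 4))                               # size only, Sequence form
torch.rand(3, 4, generator=g)                    # size + generator
torch.rand(2, 3, names=('N', 'C'))               # size + names (named-tensor variant)
torch.rand(2, 3, generator=g, names=('N', 'C'))  # size + generator + names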

backtrace

Next, create a file named gdbrand and put the following into it:

python sys.path.append("/usr/share/gcc/python");
set logging file rand.txt
set logging on
set breakpoint pending on
break at::empty
info breakpoints
run torch_rand.py
bt

Start gdb python, then run the script above with the source gdbrand command, and you will get the following backtrace:

#0  at::empty (size=..., options=..., memory_format=memory_format@entry=...) at /root/Documents/pytorch/build/aten/src/ATen/Functions.h:2652
#1  0x00007f04620d60b1 in at::native::rand (size=..., generator=..., dtype=..., layout=..., device=..., pin_memory=...)at /root/Documents/pytorch/c10/util/Optional.h:204
#2  0x00007f04620d61fc in at::native::rand (size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...)at /root/Documents/pytorch/aten/src/ATen/native/TensorFactories.cpp:781
#3  0x00007f0462dd2c28 in at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand (pin_memory=..., device=..., layout=..., dtype=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/RegisterCompositeExplicitAutograd.cpp:2214
#4  c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>), at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand>, at::Tensor, c10::guts::typelist::typelist<c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > >::operator() (args#4=..., args#3=..., args#2=..., args#1=..., args#0=..., this=<optimized out>)at /root/Documents/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13
#5  c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>), at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand>, at::Tensor, c10::guts::typelist::typelist<c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > >, at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)>::call(c10::OperatorKernel *, c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) (functor=<optimized out>, args#0=..., args#1=..., args#2=..., args#3=..., args#4=...) at /root/Documents/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:463
#6  0x00007f0462889309 in c10::callUnboxedKernelFunction<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > (dispatchKeySet=..., functor=<optimized out>, unboxed_kernel_func=<optimized out>)at /root/Documents/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
#7  c10::KernelFunction::call<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > (dispatchKeySet=..., opHandle=..., this=0x2075db8) at /root/Documents/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:91
#8  c10::Dispatcher::redispatch<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> >(c10::TypedOperatorHandle<at::Tensor (c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)> const&, c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) const (this=<optimized out>, currentDispatchKeySet=..., op=...)at /root/Documents/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:656
#9  c10::TypedOperatorHandle<at::Tensor (c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)>::redispatch(c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) const (args#4=..., args#3=..., args#2=..., args#1=..., args#0=..., currentDispatchKeySet=..., this=0x7f046a780150 <at::_ops::rand::redispatch(c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)::op>) at /root/Documents/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:492
#10 at::_ops::rand::redispatch (dispatchKeySet=..., size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...) at /root/Documents/pytorch/build/aten/src/ATen/Operators_2.cpp:5220
#11 0x00007f0462c05329 in at::(anonymous namespace)::rand (pin_memory=..., device=..., layout=..., dtype=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:365
#12 c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>), at::(anonymous namespace)::rand>, at::Tensor, c10::guts::typelist::typelist<c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > >::operator() (args#4=..., args#3=..., args#2=..., args#1=..., args#0=..., this=<optimized out>) at /root/Documents/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13
#13 c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>), at::(anonymous namespace)::rand>, at::Tensor, c10::guts::typelist::typelist<c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > >, at::Tensor(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)>::call(c10::OperatorKernel *, c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) (functor=<optimized out>, args#0=..., args#1=..., args#2=..., args#3=..., args#4=...) at /root/Documents/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:463
#14 0x00007f04628e3422 in c10::callUnboxedKernelFunction<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > (dispatchKeySet=..., functor=<optimized out>, unboxed_kernel_func=<optimized out>) at /root/Documents/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
#15 c10::KernelFunction::call<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> > (dispatchKeySet=..., opHandle=..., this=0x2076638) at /root/Documents/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:91
#16 c10::Dispatcher::call<at::Tensor, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool> >(c10::TypedOperatorHandle<at::Tensor (c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)> const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) const (op=..., this=<optimized out>) at /root/Documents/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:639
#17 c10::TypedOperatorHandle<at::Tensor (c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)>::call(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) const (args#4=..., args#3=..., args#2=..., args#1=..., args#0=..., this=0x7f046a780170 <at::_ops::rand::call(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>)::op>) at /root/Documents/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:487
#18 at::_ops::rand::call (size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...) at /root/Documents/pytorch/build/aten/src/ATen/Operators_2.cpp:5213
#19 0x00007f046abdd595 in at::rand_symint (options=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/Functions.h:5770
#20 torch::rand_symint (options=..., size=...) at /root/Documents/pytorch/torch/csrc/autograd/generated/variable_factories.h:418
#21 operator() (__closure=<synthetic pointer>, options=..., size=...) at /root/Documents/pytorch/torch/csrc/autograd/generated/python_torch_functions_0.cpp:5256
#22 torch::autograd::THPVariable_rand (self_=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at /root/Documents/pytorch/torch/csrc/autograd/generated/python_torch_functions_0.cpp:5258
#23 0x0000000000508127 in cfunction_call (func=0x7f048932aa90, args=<optimized out>, kwargs=<optimized out>) at /usr/local/src/conda/python-3.9.13/Objects/methodobject.c:543
#24 0x00000000004f0edc in _PyObject_MakeTpCall (tstate=0x1a610a0, callable=0x7f048932aa90, args=<optimized out>, nargs=<optimized out>, keywords=0x0) at /usr/local/src/conda/python-3.9.13/Objects/call.c:191
#25 0x00000000004ed255 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x1abc308, callable=0x7f048932aa90, tstate=<optimized out>) at /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:116
#26 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x1abc308, callable=0x7f048932aa90, tstate=<optimized out>) at /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:103
#27 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x1abc308, callable=0x7f048932aa90) at /usr/local/src/conda/python-3.9.13/Include/cpython/abstract.h:127
#28 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x1a610a0) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:5077
#29 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x1abc190, throwflag=<optimized out>) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:3489
#30 0x00000000004e70ca in _PyEval_EvalFrame (throwflag=0, f=0x1abc190, tstate=0x1a610a0) at /usr/local/src/conda/python-3.9.13/Include/internal/pycore_ceval.h:40
#31 _PyEval_EvalCode (tstate=<optimized out>, _co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=0x0, kwargs=0x0, kwcount=<optimized out>, kwstep=2, defs=0x0, defcount=<optimized out>, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:4329
#32 0x00000000004e6d57 in _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:4361
#33 0x00000000004e6d09 in PyEval_EvalCodeEx (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kws=<optimized out>, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:4377
#34 0x0000000000594e7b in PyEval_EvalCode (co=co@entry=0x7f049ec05870, globals=globals@entry=0x7f049ebfd780, locals=locals@entry=0x7f049ebfd780) at /usr/local/src/conda/python-3.9.13/Python/ceval.c:828
#35 0x00000000005c2307 in run_eval_code_obj (tstate=0x1a610a0, co=0x7f049ec05870, globals=0x7f049ebfd780, locals=0x7f049ebfd780) at /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:1221
#36 0x00000000005be270 in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0x7f049ebfd780, locals=0x7f049ebfd780, flags=<optimized out>, arena=<optimized out>) at /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:1242
#37 0x00000000004563ed in pyrun_file (fp=0x1a5d440, filename=0x7f049eb2d930, start=<optimized out>, globals=0x7f049ebfd780, locals=0x7f049ebfd780, closeit=1, flags=0x7ffecfd9a7f8) at /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:1140
#38 0x00000000005b8062 in pyrun_simple_file (flags=0x7ffecfd9a7f8, closeit=1, filename=0x7f049eb2d930, fp=0x1a5d440) at /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:450
#39 PyRun_SimpleFileExFlags (fp=0x1a5d440, filename=<optimized out>, closeit=1, flags=0x7ffecfd9a7f8) at /usr/local/src/conda/python-3.9.13/Python/pythonrun.c:483
#40 0x00000000005b55ce in pymain_run_file (cf=0x7ffecfd9a7f8, config=0x1a5faa0) at /usr/local/src/conda/python-3.9.13/Modules/main.c:379
#41 pymain_run_python (exitcode=0x7ffecfd9a7f0) at /usr/local/src/conda/python-3.9.13/Modules/main.c:604
#42 Py_RunMain () at /usr/local/src/conda/python-3.9.13/Modules/main.c:683
#43 0x0000000000588ff9 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at /usr/local/src/conda/python-3.9.13/Modules/main.c:1129
#44 0x00007f049ed56d90 in __libc_start_call_main (main=main@entry=0x588fb0 <main>, argc=argc@entry=2, argv=argv@entry=0x7ffecfd9aa28) at ../sysdeps/nptl/libc_start_call_main.h:58
#45 0x00007f049ed56e40 in __libc_start_main_impl (main=0x588fb0 <main>, argc=2, argv=0x7ffecfd9aa28, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffecfd9aa18) at ../csu/libc-start.c:392
#46 0x0000000000588eae in _start ()

The backtrace should be read from the bottom up. It starts at _start, passes through a dozen or so of Python's own functions, and the first PyTorch-related function it reaches is frame #22: torch::autograd::THPVariable_rand.

Python bindings

torch::autograd::THPVariable_rand

#22 torch::autograd::THPVariable_rand (self_=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at /root/Documents/pytorch/torch/csrc/autograd/generated/python_torch_functions_0.cpp:5258

The torch::autograd::THPVariable_rand function is defined in torch/csrc/autograd/generated/python_torch_functions_0.cpp; note that the same directory contains other .cpp files with similar names but different suffixes. These files (python_torch_functions_i.cpp) are generated at PyTorch build time by the gen.py script from native_functions.yaml and the template tools/autograd/templates/python_torch_functions.cpp. For the details, see the post on how the python_torch_functions_i.cpp files are generated in PyTorch.

The declaration of THPVariable_rand is as follows:

static PyObject * THPVariable_rand(PyObject* self_, PyObject* args, PyObject* kwargs);

A PyMethodDef array is defined so that the Python-level rand function maps to the C++ THPVariable_rand function:

static PyMethodDef torch_functions_shard[] = {
  //...
  {"rand", castPyCFunctionWithKeywords(THPVariable_rand), METH_VARARGS | METH_KEYWORDS | METH_STATIC, NULL},
  //...
};
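
As a small sanity check from the Python side (my addition, not part of the original trace), torch.rand is indeed this C-level binding rather than a Python function; the exact address in the repr will of course differ:

import torch

print(torch.rand)        # <built-in method rand of type object at 0x...>
print(type(torch.rand))  # <class 'builtin_function_or_method'>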

The implementation of THPVariable_rand is as follows:

// generated methods start here

// ...

// rand
static PyObject * THPVariable_rand(PyObject* self_, PyObject* args, PyObject* kwargs)
{
  HANDLE_TH_ERRORS
  static PythonArgParser parser({
    "rand(SymIntArrayRef size, *, Generator? generator, DimnameList? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",
    "rand(SymIntArrayRef size, *, Generator? generator, Tensor out=None, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",
    "rand(SymIntArrayRef size, *, Tensor out=None, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",
    "rand(SymIntArrayRef size, *, DimnameList? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",
  }, /*traceable=*/true);

  ParsedArgs<8> parsed_args;
  auto _r = parser.parse(nullptr, args, kwargs, parsed_args);
  if(_r.has_torch_function()) {
    return handle_torch_function(_r, nullptr, args, kwargs, THPVariableFunctionsModule, "torch");
  }
  switch (_r.idx) {
    case 0: {
      // aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
      auto __names = _r.toDimnameListOptional(2);
      c10::optional<DimnameList> names = __names ? c10::make_optional(DimnameList(__names.value())) : c10::nullopt;
      const auto options = TensorOptions()
          .dtype(_r.scalartypeOptional(3))
          .device(_r.deviceWithDefault(5, torch::tensors::get_default_device()))
          .layout(_r.layoutOptional(4))
          .requires_grad(_r.toBool(7))
          .pinned_memory(_r.toBool(6));
      torch::utils::maybe_initialize_cuda(options);
      auto dispatch_rand = [](c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options) -> at::Tensor {
        pybind11::gil_scoped_release no_gil;
        return torch::rand_symint(size, generator, names, options);
      };
      return wrap(dispatch_rand(_r.symintlist(0), _r.generator(1), names, options));
    }
    case 1: {
      if (_r.isNone(2)) {
        // aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
        const auto options = TensorOptions()
            .dtype(_r.scalartypeOptional(3))
            .device(_r.deviceWithDefault(5, torch::tensors::get_default_device()))
            .layout(_r.layoutOptional(4))
            .requires_grad(_r.toBool(7))
            .pinned_memory(_r.toBool(6));
        torch::utils::maybe_initialize_cuda(options);
        auto dispatch_rand = [](c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options) -> at::Tensor {
          pybind11::gil_scoped_release no_gil;
          return torch::rand_symint(size, generator, options);
        };
        return wrap(dispatch_rand(_r.symintlist(0), _r.generator(1), options));
      } else {
        // aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
        check_out_type_matches(_r.tensor(2), _r.scalartypeOptional(3),
                               _r.isNone(3), _r.layoutOptional(4),
                               _r.deviceWithDefault(5, torch::tensors::get_default_device()), _r.isNone(5));
        auto dispatch_rand_out = [](at::Tensor out, c10::SymIntArrayRef size, c10::optional<at::Generator> generator) -> at::Tensor {
          pybind11::gil_scoped_release no_gil;
          return at::rand_symint_out(out, size, generator);
        };
        return wrap(dispatch_rand_out(_r.tensor(2), _r.symintlist(0), _r.generator(1)).set_requires_grad(_r.toBool(7)));
      }
    }
    case 2: {
      if (_r.isNone(1)) {
        // aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
        const auto options = TensorOptions()
            .dtype(_r.scalartypeOptional(2))
            .device(_r.deviceWithDefault(4, torch::tensors::get_default_device()))
            .layout(_r.layoutOptional(3))
            .requires_grad(_r.toBool(6))
            .pinned_memory(_r.toBool(5));
        torch::utils::maybe_initialize_cuda(options);
        auto dispatch_rand = [](c10::SymIntArrayRef size, at::TensorOptions options) -> at::Tensor {
          pybind11::gil_scoped_release no_gil;
          return torch::rand_symint(size, options);
        };
        return wrap(dispatch_rand(_r.symintlist(0), options));
      } else {
        // aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
        check_out_type_matches(_r.tensor(1), _r.scalartypeOptional(2),
                               _r.isNone(2), _r.layoutOptional(3),
                               _r.deviceWithDefault(4, torch::tensors::get_default_device()), _r.isNone(4));
        auto dispatch_rand_out = [](at::Tensor out, c10::SymIntArrayRef size) -> at::Tensor {
          pybind11::gil_scoped_release no_gil;
          return at::rand_symint_out(out, size);
        };
        return wrap(dispatch_rand_out(_r.tensor(1), _r.symintlist(0)).set_requires_grad(_r.toBool(6)));
      }
    }
    case 3: {
      // aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
      auto __names = _r.toDimnameListOptional(1);
      c10::optional<DimnameList> names = __names ? c10::make_optional(DimnameList(__names.value())) : c10::nullopt;
      const auto options = TensorOptions()
          .dtype(_r.scalartypeOptional(2))
          .device(_r.deviceWithDefault(4, torch::tensors::get_default_device()))
          .layout(_r.layoutOptional(3))
          .requires_grad(_r.toBool(6))
          .pinned_memory(_r.toBool(5));
      torch::utils::maybe_initialize_cuda(options);
      auto dispatch_rand = [](c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options) -> at::Tensor {
        pybind11::gil_scoped_release no_gil;
        return torch::rand_symint(size, names, options);
      };
      return wrap(dispatch_rand(_r.symintlist(0), names, options));
    }
  }
  Py_RETURN_NONE;
  END_HANDLE_TH_ERRORS
}

In the parser definition, the rand overloads are grouped by signature into the following four categories, covering six functions in total:

Signature 0 requires size, generator, and names:

    "rand(SymIntArrayRef size, *, Generator? generator, DimnameList? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",
// aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor

Signature 1 requires size and generator:

    "rand(SymIntArrayRef size, *, Generator? generator, Tensor out=None, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",

Together with the comments, we can see that signature 1 is further split into two APIs. Variant 1-1 does not take out:

// aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor

Variant 1-2 is the variation that takes out:

// aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)

Signature 2 requires only size:

    "rand(SymIntArrayRef size, *, Tensor out=None, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",

Together with the comments, signature 2 is likewise split into two APIs. Variant 2-1 does not take out:

// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor

Variant 2-2 is the variation that takes out:

// aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)

Signature 3 requires size and names:

    "rand(SymIntArrayRef size, *, DimnameList? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False, bool? requires_grad=False)",
// aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor

Since we called torch.rand(3, 4) and passed only size, execution enters case 2 of the switch.

Next, _r.isNone(1) checks whether argument 1 (0-based), i.e. the out argument, is empty. We did not provide out, so we take the if branch, which corresponds to the size-only aten::rand:

        // aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
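
To make the branch selection concrete, here is a small hypothetical check from Python (my addition): omitting out takes the if branch (plain aten::rand), while passing out takes the else branch (aten::rand.out) and writes into the provided tensor.

import torch

a = torch.rand(3, 4)                    # no `out`: the if branch, plain aten::rand
buf = torch.empty(3, 4)
b = torch.rand(3, 4, out=buf)           # `out` given: the else branch, aten::rand.out
print(b.data_ptr() == buf.data_ptr())   # True: the result reuses buf's storage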

operator()

#21 operator() (__closure=<synthetic pointer>, options=..., size=...) at /root/Documents/pytorch/torch/csrc/autograd/generated/python_torch_functions_0.cpp:5256

torch/csrc/autograd/generated/python_torch_functions_0.cpp

This frame is simply the lambda defined inside the if branch of case 2 in torch::autograd::THPVariable_rand:

        auto dispatch_rand = [](c10::SymIntArrayRef size, at::TensorOptions options) -> at::Tensor {
          pybind11::gil_scoped_release no_gil;
          return torch::rand_symint(size, options);
        };

Here pybind11::gil_scoped_release no_gil; releases the GIL. The upcoming call to torch::rand_symint runs entirely in C++ and does not need the interpreter, so while it is running, other Python threads are free to execute. Note that no new thread is spawned here; the call still runs on the current thread, but releasing the GIL effectively gives Python real multi-threaded parallelism for this stretch of work. See the post on the Python GIL and its release/acquire functions.
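
The following is a minimal sketch of what releasing the GIL buys us (my own illustration, not from the original article): a pure-Python counter thread keeps making progress while the main thread is busy inside the C++ kernel, because that kernel runs with the GIL released.

import threading
import torch

ticks = 0
stop = False

def counter():
    # A pure-Python loop: it can only run while the GIL is available.
    global ticks
    while not stop:
        ticks += 1

t = threading.Thread(target=counter)
t.start()
torch.rand(5000, 5000)  # while this C++ call runs, dispatch_rand has released the GIL...
stop = True
t.join()
print(ticks)            # ...so the counter thread kept ticking in the meantime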

C++ API

torch::rand_symint

#20 torch::rand_symint (options=..., size=...) at /root/Documents/pytorch/torch/csrc/autograd/generated/variable_factories.h:418

torch/csrc/autograd/generated/variable_factories.h

inline at::Tensor rand_symint(c10::SymIntArrayRef size, at::TensorOptions options = {}) {
  at::AutoDispatchBelowADInplaceOrView guard;
  return autograd::make_variable(at::rand_symint(size, at::TensorOptions(options).requires_grad(c10::nullopt)), /*requires_grad=*/options.requires_grad());
}

torch::rand_symint likewise takes only size (plus TensorOptions). Its body first obtains an at::Tensor from at::rand_symint, then wraps it with autograd::make_variable and returns the result.

at::rand_symint returns a plain tensor without autograd support; autograd::make_variable then fills in the tensor's autograd metadata, and once that is done the tensor is capable of automatic differentiation. For the difference between torch::rand_symint and at::rand_symint, see the post on the difference between torch:: and at:: factory functions.
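
Seen from Python, this is why the requires_grad flag takes effect even though the underlying at:: call knows nothing about autograd; a small hypothetical check (my addition):

import torch

t = torch.rand(3, 4, requires_grad=True)
print(t.requires_grad)  # True: applied by the torch:: factory wrapper via make_variable
print(t.grad_fn)        # None: t is a leaf created by a factory function, not an op result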

Here are the other rand family functions that appear in the same file:

inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options = {}) {
  at::AutoDispatchBelowADInplaceOrView guard;
  return autograd::make_variable(at::rand(size, names, at::TensorOptions(options).requires_grad(c10::nullopt)), /*requires_grad=*/options.requires_grad());
}
//...
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options = {}) {
  at::AutoDispatchBelowADInplaceOrView guard;
  return autograd::make_variable(at::rand(size, generator, names, at::TensorOptions(options).requires_grad(c10::nullopt)), /*requires_grad=*/options.requires_grad());
}
//...
inline at::Tensor rand(at::IntArrayRef size, at::TensorOptions options = {}) {
  at::AutoDispatchBelowADInplaceOrView guard;
  return autograd::make_variable(at::rand(size, at::TensorOptions(options).requires_grad(c10::nullopt)), /*requires_grad=*/options.requires_grad());
}
//...
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options = {}) {
  at::AutoDispatchBelowADInplaceOrView guard;
  return autograd::make_variable(at::rand(size, generator, at::TensorOptions(options).requires_grad(c10::nullopt)), /*requires_grad=*/options.requires_grad());
}

They come in four flavors:

  • size, names
  • size, generator, names
  • size
  • size, generator

These match up one-to-one with the four categories in torch/_C/_VariableFunctions.pyi and in torch::autograd::THPVariable_rand.

at::rand_symint

#19 0x00007f046abdd595 in at::rand_symint (options=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/Functions.h:5770

build/aten/src/ATen/Functions.h

// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, at::TensorOptions options={}) {
  return at::_ops::rand::call(size, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}

The at::rand_symint function likewise takes only size (plus TensorOptions).

The comment spells out the relationship between at::rand_symint and aten::rand; we will see aten::rand again shortly in native_functions.yaml.

Just below it, an at::symint::rand function is also defined, with the same implementation as at::rand_symint.

namespace symint {
  template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>
  at::Tensor rand(c10::SymIntArrayRef size, at::TensorOptions options={}) {
    return at::_ops::rand::call(size, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
  }
}

Here are the rand family functions that appear in the same file. They fall into six groups, aten::rand.names, aten::rand.generator_with_names, aten::rand, aten::rand.generator, aten::rand.out, and aten::rand.generator_out, which correspond one-to-one with the six functions in the comments of torch::autograd::THPVariable_rand.

Each group is further split in two depending on whether the first parameter is an at::IntArrayRef or a c10::SymIntArrayRef, and split in two again depending on whether the remaining parameters are passed individually as dtype, layout, device, pin_memory or bundled into an at::TensorOptions, giving 6 * 2 * 2 = 24 functions in total.

// aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_names::call(c10::fromIntArrayRefSlow(size), names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_names::call(c10::fromIntArrayRefSlow(size), names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_names::call(c10::fromIntArrayRefSlow(size), names, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_names::call(c10::fromIntArrayRefSlow(size), names, dtype, layout, device, pin_memory);}
}// aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_names::call(size, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_names::call(size, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_names::call(size, names, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_names::call(size, names, dtype, layout, device, pin_memory);}
}// aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_generator_with_names::call(c10::fromIntArrayRefSlow(size), generator, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_generator_with_names::call(c10::fromIntArrayRefSlow(size), generator, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator_with_names::call(c10::fromIntArrayRefSlow(size), generator, names, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator_with_names::call(c10::fromIntArrayRefSlow(size), generator, names, dtype, layout, device, pin_memory);}
}// aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_generator_with_names::call(size, generator, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::TensorOptions options={}) {return at::_ops::rand_generator_with_names::call(size, generator, names, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator_with_names::call(size, generator, names, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator_with_names::call(size, generator, names, dtype, layout, device, pin_memory);}
}// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, at::TensorOptions options={}) {return at::_ops::rand::call(c10::fromIntArrayRefSlow(size), optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, at::TensorOptions options={}) {return at::_ops::rand::call(c10::fromIntArrayRefSlow(size), optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand::call(c10::fromIntArrayRefSlow(size), dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand::call(c10::fromIntArrayRefSlow(size), dtype, layout, device, pin_memory);}
}// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, at::TensorOptions options={}) {return at::_ops::rand::call(size, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, at::TensorOptions options={}) {return at::_ops::rand::call(size, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand::call(size, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand::call(size, dtype, layout, device, pin_memory);}
}// aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options={}) {return at::_ops::rand_generator::call(c10::fromIntArrayRefSlow(size), generator, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options={}) {return at::_ops::rand_generator::call(c10::fromIntArrayRefSlow(size), generator, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator::call(c10::fromIntArrayRefSlow(size), generator, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor rand(at::IntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator::call(c10::fromIntArrayRefSlow(size), generator, dtype, layout, device, pin_memory);}
}// aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options={}) {return at::_ops::rand_generator::call(size, generator, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::TensorOptions options={}) {return at::_ops::rand_generator::call(size, generator, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());}
}// aten::rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
inline at::Tensor rand_symint(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator::call(size, generator, dtype, layout, device, pin_memory);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {return at::_ops::rand_generator::call(size, generator, dtype, layout, device, pin_memory);}
}// aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_out(at::Tensor & out, at::IntArrayRef size) {return at::_ops::rand_out::call(c10::fromIntArrayRefSlow(size), out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor & rand_out(at::Tensor & out, at::IntArrayRef size) {return at::_ops::rand_out::call(c10::fromIntArrayRefSlow(size), out);}
}// aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_outf(at::IntArrayRef size, at::Tensor & out) {return at::_ops::rand_out::call(c10::fromIntArrayRefSlow(size), out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor & rand_outf(at::IntArrayRef size, at::Tensor & out) {return at::_ops::rand_out::call(c10::fromIntArrayRefSlow(size), out);}
}// aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_symint_out(at::Tensor & out, c10::SymIntArrayRef size) {return at::_ops::rand_out::call(size, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor & rand_out(at::Tensor & out, c10::SymIntArrayRef size) {return at::_ops::rand_out::call(size, out);}
}// aten::rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_symint_outf(c10::SymIntArrayRef size, at::Tensor & out) {return at::_ops::rand_out::call(size, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor & rand_outf(c10::SymIntArrayRef size, at::Tensor & out) {return at::_ops::rand_out::call(size, out);}
}// aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_out(at::Tensor & out, at::IntArrayRef size, c10::optional<at::Generator> generator) {return at::_ops::rand_generator_out::call(c10::fromIntArrayRefSlow(size), generator, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor & rand_out(at::Tensor & out, at::IntArrayRef size, c10::optional<at::Generator> generator) {return at::_ops::rand_generator_out::call(c10::fromIntArrayRefSlow(size), generator, out);}
}// aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_outf(at::IntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out) {return at::_ops::rand_generator_out::call(c10::fromIntArrayRefSlow(size), generator, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, int64_t>::value>>at::Tensor & rand_outf(at::IntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out) {return at::_ops::rand_generator_out::call(c10::fromIntArrayRefSlow(size), generator, out);}
}// aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_symint_out(at::Tensor & out, c10::SymIntArrayRef size, c10::optional<at::Generator> generator) {return at::_ops::rand_generator_out::call(size, generator, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor & rand_out(at::Tensor & out, c10::SymIntArrayRef size, c10::optional<at::Generator> generator) {return at::_ops::rand_generator_out::call(size, generator, out);}
}// aten::rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
inline at::Tensor & rand_symint_outf(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out) {return at::_ops::rand_generator_out::call(size, generator, out);
}
namespace symint {template <typename T, typename = std::enable_if_t<std::is_same<T, c10::SymInt>::value>>at::Tensor & rand_outf(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out) {return at::_ops::rand_generator_out::call(size, generator, out);}
}

Next, at::_ops::rand::call is invoked to perform the dispatch, but before stepping into that function, let's first look at the definition of at::_ops::rand.

dispatch

at::_ops::rand

build/aten/src/ATen/Operators.h

rand is a struct with two static member functions, call and redispatch, both of which will be used shortly.

struct TORCH_API rand {
  using schema = at::Tensor (c10::SymIntArrayRef, c10::optional<at::ScalarType>, c10::optional<at::Layout>, c10::optional<at::Device>, c10::optional<bool>);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")
  static at::Tensor call(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
  static at::Tensor redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
};

STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA declares three static string member variables on the at::_ops::rand struct, named name, overload_name, and schema_str, and assigns their initial values either here or, as we will see next, in build/aten/src/ATen/Operators_2.cpp.

The same file contains six rand family structs (rand_names, rand_generator_with_names, rand, rand_generator, rand_out, rand_generator_out), corresponding one-to-one with the six functions in the comments of torch::autograd::THPVariable_rand.

struct TORCH_API rand_names {
  using schema = at::Tensor (c10::SymIntArrayRef, c10::optional<at::DimnameList>, c10::optional<at::ScalarType>, c10::optional<at::Layout>, c10::optional<at::Device>, c10::optional<bool>);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "names")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")
  static at::Tensor call(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
  static at::Tensor redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
};

struct TORCH_API rand_generator_with_names {
  using schema = at::Tensor (c10::SymIntArrayRef, c10::optional<at::Generator>, c10::optional<at::DimnameList>, c10::optional<at::ScalarType>, c10::optional<at::Layout>, c10::optional<at::Device>, c10::optional<bool>);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "generator_with_names")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")
  static at::Tensor call(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
  static at::Tensor redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
};

struct TORCH_API rand {
  using schema = at::Tensor (c10::SymIntArrayRef, c10::optional<at::ScalarType>, c10::optional<at::Layout>, c10::optional<at::Device>, c10::optional<bool>);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")
  static at::Tensor call(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
  static at::Tensor redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
};

struct TORCH_API rand_generator {
  using schema = at::Tensor (c10::SymIntArrayRef, c10::optional<at::Generator>, c10::optional<at::ScalarType>, c10::optional<at::Layout>, c10::optional<at::Device>, c10::optional<bool>);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "generator")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")
  static at::Tensor call(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
  static at::Tensor redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory);
};

struct TORCH_API rand_out {
  using schema = at::Tensor & (c10::SymIntArrayRef, at::Tensor &);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "out")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)")
  static at::Tensor & call(c10::SymIntArrayRef size, at::Tensor & out);
  static at::Tensor & redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, at::Tensor & out);
};

struct TORCH_API rand_generator_out {
  using schema = at::Tensor & (c10::SymIntArrayRef, c10::optional<at::Generator>, at::Tensor &);
  using ptr_schema = schema*;
  // See Note [static constexpr char* members for windows NVCC]
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(name, "aten::rand")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(overload_name, "generator_out")
  STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA(schema_str, "rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)")
  static at::Tensor & call(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out);
  static at::Tensor & redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::Generator> generator, at::Tensor & out);
};

at::_ops::rand::call

#18 at::_ops::rand::call (size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...) at /root/Documents/pytorch/build/aten/src/ATen/Operators_2.cpp:5213

build/aten/src/ATen/Operators_2.cpp

STATIC_CONST_STR_OUT_OF_LINE_FOR_WIN_CUDA(rand, name, "aten::rand")
STATIC_CONST_STR_OUT_OF_LINE_FOR_WIN_CUDA(rand, overload_name, "")
STATIC_CONST_STR_OUT_OF_LINE_FOR_WIN_CUDA(rand, schema_str, "rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor")

In at::_ops::rand, the three members name, overload_name, and schema_str are declared with STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA. Here, STATIC_CONST_STR_OUT_OF_LINE_FOR_WIN_CUDA sets rand's name member to "aten::rand", its overload_name member to "", and its schema_str member to "rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor". If the initial values were already supplied in build/aten/src/ATen/Operators.h, the code here is effectively a no-op.
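
As a side note, the pattern these two macros expand to is roughly the classic "in-class static constexpr declaration plus out-of-line definition". The snippet below is only a minimal sketch of that idea; rand_like_op and its member are made-up names, not PyTorch code:

#include <cstdio>

// Hypothetical struct mimicking the declaration/definition split the macros handle.
struct rand_like_op {
  // In-class declaration with an initializer, roughly what
  // STATIC_CONSTEXPR_STR_INL_EXCEPT_WIN_CUDA provides on most toolchains.
  static constexpr const char* name = "aten::rand";
};

// Out-of-line definition, roughly what STATIC_CONST_STR_OUT_OF_LINE_FOR_WIN_CUDA
// provides. Since C++17, static constexpr members are implicitly inline, so this
// definition adds nothing when the in-class initializer is present, which matches
// the "effectively a no-op" observation above.
constexpr const char* rand_like_op::name;

int main() {
  std::printf("%s\n", rand_like_op::name);
  return 0;
}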

at::_ops::rand::call looks like this:

// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
static C10_NOINLINE c10::TypedOperatorHandle<rand::schema> create_rand_typed_handle() {
  return c10::Dispatcher::singleton()
      .findSchemaOrThrow(rand::name, rand::overload_name)
      .typed<rand::schema>();
}

// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
at::Tensor rand::call(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  static auto op = create_rand_typed_handle();
  return op.call(size, dtype, layout, device, pin_memory);
}

Here create_rand_typed_handle is called; it performs dispatch by looking up a table, and the table consulted during dispatch is generated from native_functions.yaml.

Also, the comment shows that at::_ops::rand::call is in a one-to-one relationship with the aten::rand entry we will see shortly in native_functions.yaml.
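
To make the table lookup concrete, the following sketch does by hand roughly what rand::call does: ask the dispatcher singleton for the "aten::rand" schema (empty overload name) and invoke it through a typed handle. It is only an illustration of the mechanism, assuming the usual ATen headers; normal code would simply call at::rand.

#include <ATen/ATen.h>
#include <ATen/core/dispatch/Dispatcher.h>
#include <vector>

// Mirror create_rand_typed_handle() + rand::call() from the generated code.
at::Tensor call_rand_through_dispatcher() {
  auto op = c10::Dispatcher::singleton()
                .findSchemaOrThrow("aten::rand", /*overload_name=*/"")
                .typed<at::Tensor(c10::SymIntArrayRef,
                                  c10::optional<at::ScalarType>,
                                  c10::optional<at::Layout>,
                                  c10::optional<at::Device>,
                                  c10::optional<bool>)>();
  std::vector<c10::SymInt> size = {3, 4};  // same shape as the torch.rand(3, 4) example
  return op.call(size, c10::nullopt, c10::nullopt, c10::nullopt, c10::nullopt);
}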

native_functions.yaml

aten/src/ATen/native/native_functions.yaml

- func: rand.names(SymInt[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  device_check: NoCheck
  device_guard: False
  dispatch:
    CompositeExplicitAutograd: rand
  autogen: rand.names_out
  tags: nondeterministic_seeded

- func: rand.generator_with_names(SymInt[] size, *, Generator? generator, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  device_check: NoCheck
  device_guard: False
  tags: nondeterministic_seeded
  dispatch:
    CompositeExplicitAutograd: rand
  autogen: rand.generator_with_names_out

- func: rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  tags: [core, nondeterministic_seeded]
  dispatch:
    CompositeExplicitAutograd: rand

- func: rand.generator(SymInt[] size, *, Generator? generator, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  tags: nondeterministic_seeded
  dispatch:
    CompositeExplicitAutograd: rand

- func: rand.out(SymInt[] size, *, Tensor(a!) out) -> Tensor(a!)
  tags: nondeterministic_seeded
  dispatch:
    CompositeExplicitAutograd: rand_out

- func: rand.generator_out(SymInt[] size, *, Generator? generator, Tensor(a!) out) -> Tensor(a!)
  tags: nondeterministic_seeded

native_functions.yaml has six versions of rand, namely:

  • size, names
  • size, generator, names
  • size
  • size, generator
  • size, out
  • size, generator, out

Recall the four groups from torch/_C/_VariableFunctions.pyi: “size, generator, names”, “size, generator (out optional)”, “size (out optional)”, and “size, names”. These four groups can be expanded into six: “size, generator, names”, “size, generator, out”, “size, generator”, “size, out”, “size”, and “size, names”, which match the six entries here one-to-one.

Recall the six functions in the THPVariable_rand comments: aten::rand.generator_with_names, aten::rand.generator, aten::rand.generator_out, aten::rand, aten::rand.out, and aten::rand.names. They correspond one-to-one with the six entries here. Also, because functions declared after func: in native_functions.yaml live in the aten namespace by default, all of the functions in those comments carry the aten:: prefix.
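
For reference, several of these entries are also reachable directly through the generated C++ API. The sketch below (shape and dtype are arbitrary example values, assuming <ATen/ATen.h>) exercises a few of them:

#include <ATen/ATen.h>

void rand_overload_tour() {
  at::Tensor a = at::rand({3, 4});               // aten::rand
  at::Tensor b = at::rand({3, 4}, at::kDouble);  // aten::rand with an explicit dtype
  at::Tensor out = at::empty({3, 4});
  at::rand_out(out, {3, 4});                     // aten::rand.out, writes into `out`
}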

at::_ops::rand::call (corresponding to aten::rand) invokes c10::TypedOperatorHandle<rand::schema>::call to perform the dispatch. From the Python bindings section we already know that torch.rand maps to the third rand entry here, the one without generator, names, or out. Looking it up in native_functions.yaml:

- func: rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  tags: [core, nondeterministic_seeded]
  dispatch:
    CompositeExplicitAutograd: rand

Execution then arrives at at::(anonymous namespace)::rand.

at::(anonymous namespace)::rand

#11 0x00007f0462c05329 in at::(anonymous namespace)::rand (pin_memory=..., device=..., layout=..., dtype=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:365

build/aten/src/ATen/RegisterBackendSelect.cpp

// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
C10_ALWAYS_INLINE
at::Tensor rand(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  DispatchKeySet _dk = c10::DispatchKeySet(c10::computeDispatchKey(dtype, layout, device));
  return at::_ops::rand::redispatch(
      _dk, size, dtype, layout, device, pin_memory);
}

This computes a dispatch key set from dtype, layout, and device, and then dispatches again (redispatch).
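
A small sketch of the key computation this wrapper relies on (assuming the c10 headers): with dtype, layout, and device all unset, as in our torch.rand(3, 4) call, it falls back to the default dense CPU backend key.

#include <iostream>
#include <c10/core/TensorOptions.h>

int main() {
  // Same call BackendSelect's rand wrapper makes before redispatching.
  c10::DispatchKey key = c10::computeDispatchKey(
      /*dtype=*/c10::nullopt, /*layout=*/c10::nullopt, /*device=*/c10::nullopt);
  std::cout << key << std::endl;  // expected to print the CPU backend key
  return 0;
}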

redispatch

at::_ops::rand::redispatch

#10 at::_ops::rand::redispatch (dispatchKeySet=..., size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...) at /root/Documents/pytorch/build/aten/src/ATen/Operators_2.cpp:5220

build/aten/src/ATen/Operators_2.cpp

// aten::rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
at::Tensor rand::redispatch(c10::DispatchKeySet dispatchKeySet, c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  static auto op = create_rand_typed_handle();
  return op.redispatch(dispatchKeySet, size, dtype, layout, device, pin_memory);
}

native_functions.yaml

Back in native_functions.yaml, we look up the table again, focusing on aten::rand:

- func: rand(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
  tags: [core, nondeterministic_seeded]
  dispatch:
    CompositeExplicitAutograd: rand

Looking at the dispatch field, there is a key named CompositeExplicitAutograd, which means the redispatch lands in the CompositeExplicitAutograd backend, i.e. the wrapper_CompositeExplicitAutograd__rand function we will see shortly. The value is rand; since the default namespace after dispatch: is at::native, the endpoint of the redispatch is the at::native::rand function.
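
The codegen ultimately turns such a yaml entry into a dispatcher registration. As a rough hand-written analogue (under a hypothetical myops namespace and operator, not what the generated RegisterCompositeExplicitAutograd.cpp literally contains), registering a kernel for the CompositeExplicitAutograd key looks like this:

#include <torch/library.h>
#include <ATen/ATen.h>

// Hypothetical operator that simply forwards to at::rand.
at::Tensor my_rand(at::IntArrayRef size) {
  return at::rand(size);
}

// Declare the schema...
TORCH_LIBRARY(myops, m) {
  m.def("my_rand(int[] size) -> Tensor");
}

// ...and register the kernel under the CompositeExplicitAutograd dispatch key,
// analogous to what the yaml dispatch section asks the codegen to do for rand.
TORCH_LIBRARY_IMPL(myops, CompositeExplicitAutograd, m) {
  m.impl("my_rand", my_rand);
}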

CPU kernel

at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand

#3  0x00007f0462dd2c28 in at::(anonymous namespace)::(anonymous namespace)::wrapper_CompositeExplicitAutograd__rand (pin_memory=..., device=..., layout=..., dtype=..., size=...) at /root/Documents/pytorch/build/aten/src/ATen/RegisterCompositeExplicitAutograd.cpp:2214

build/aten/src/ATen/RegisterCompositeExplicitAutograd.cpp

namespace {
at::Tensor wrapper_CompositeExplicitAutograd__rand(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand(C10_AS_INTARRAYREF_SLOW(size), dtype, layout, device, pin_memory);
}
} // anonymous namespace

Note: in PyTorch 1.14 this was the at::(anonymous namespace)::(anonymous namespace)::wrapper__rand function in build/aten/src/ATen/RegisterCompositeExplicitAutograd.cpp.

The rand family of wrappers in the same file:

namespace {
at::Tensor wrapper_CompositeExplicitAutograd_names_rand(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand(C10_AS_INTARRAYREF_SLOW(size), names, dtype, layout, device, pin_memory);
}
} // anonymous namespace
namespace {
at::Tensor & wrapper_CompositeExplicitAutograd_names_out_rand_out(c10::SymIntArrayRef size, c10::optional<at::DimnameList> names, at::Tensor & out) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand_names_out_symint(size, names, out);
}
} // anonymous namespace
namespace {
at::Tensor wrapper_CompositeExplicitAutograd_generator_with_names_rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand(C10_AS_INTARRAYREF_SLOW(size), generator, names, dtype, layout, device, pin_memory);
}
} // anonymous namespace
namespace {
at::Tensor & wrapper_CompositeExplicitAutograd_generator_with_names_out_rand_out(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::DimnameList> names, at::Tensor & out) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand_generator_with_names_out_symint(size, generator, names, out);
}
} // anonymous namespace
namespace {
at::Tensor wrapper_CompositeExplicitAutograd__rand(c10::SymIntArrayRef size, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand(C10_AS_INTARRAYREF_SLOW(size), dtype, layout, device, pin_memory);
}
} // anonymous namespace
namespace {
at::Tensor & wrapper_CompositeExplicitAutograd_out_rand_out(c10::SymIntArrayRef size, at::Tensor & out) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand_out(C10_AS_INTARRAYREF_SLOW(size), out);
}
} // anonymous namespace
namespace {
at::Tensor wrapper_CompositeExplicitAutograd_generator_rand(c10::SymIntArrayRef size, c10::optional<at::Generator> generator, c10::optional<at::ScalarType> dtype, c10::optional<at::Layout> layout, c10::optional<at::Device> device, c10::optional<bool> pin_memory) {
  // No device check
  // DeviceGuard omitted
  return at::native::rand(C10_AS_INTARRAYREF_SLOW(size), generator, dtype, layout, device, pin_memory);
}
} // anonymous namespace

That makes seven in total: names_rand, names_out_rand_out, generator_with_names_rand, generator_with_names_out_rand_out, _rand, out_rand_out, and generator_rand (the variant with both generator and out appears to be missing?).

at::native::rand

#2  0x00007f04620d61fc in at::native::rand (size=..., dtype=..., layout=..., layout@entry=..., device=..., pin_memory=...)at /root/Documents/pytorch/aten/src/ATen/native/TensorFactories.cpp:781

aten/src/ATen/native/TensorFactories.cpp

// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ rand ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Tensor rand(IntArrayRef size,
    c10::optional<ScalarType> dtype,
    c10::optional<Layout> layout,
    c10::optional<Device> device,
    c10::optional<bool> pin_memory) {
  return native::rand(size, static_cast<c10::optional<Generator>>(c10::nullopt), dtype, layout, device, pin_memory);
}

It constructs a nullopt generator and then calls the at::native::rand overload in the same file with the fuller signature.

at::native::rand

#1  0x00007f04620d60b1 in at::native::rand (size=..., generator=..., dtype=..., layout=..., device=..., pin_memory=...)at /root/Documents/pytorch/c10/util/Optional.h:204

In the backtrace, execution jumps from at::native::rand directly into c10/util/Optional.h, which is somewhat puzzling; below is the author's guess at the complete call path.

First, at::native::rand passes an argument written as static_cast<c10::optional<Generator>>(c10::nullopt), which involves c10::optional.
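
A minimal sketch of what that cast produces (assuming the ATen and c10 headers): an empty c10::optional<at::Generator>, whose constructor is defined in c10/util/Optional.h, which would explain why the inlined frame in the backtrace is attributed to that header.

#include <ATen/core/Generator.h>
#include <c10/util/Optional.h>
#include <cassert>

int main() {
  // Same expression as the argument built inside at::native::rand.
  c10::optional<at::Generator> gen =
      static_cast<c10::optional<at::Generator>>(c10::nullopt);
  assert(!gen.has_value());  // no generator supplied; the default one is used later
  return 0;
}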

c10::optional

c10/util/Optional.h

template <class T>
class optional : private OptionalBase<T> {
  // ...
};

c10::OptionalBase

c10/util/Optional.h

template <class T>
using OptionalBase = std::conditional_t<
    detail_::is_arrayref<T>::value,
    arrayref_optional_base<T>,
    std::conditional_t<
        std::is_trivially_destructible<T>::value &&
            C10_IS_TRIVIALLY_COPYABLE(T) &&
            // Avoid using is_trivially_copy_{constructible,assignable}
            // because old GCC versions don't support them. Also,
            // is_trivially_copyable seems not to do what I expect, so check
            // trivially_copyable_optimization_optional_base directly.
            std::is_copy_constructible<
                trivially_copyable_optimization_optional_base<T>>::value &&
            std::is_copy_assignable<
                trivially_copyable_optimization_optional_base<T>>::value,
        trivially_copyable_optimization_optional_base<T>,
        std::conditional_t<
            std::is_trivially_destructible<T>::value, // if possible
            constexpr_optional_base<std::remove_const_t<T>>, // use base with
                                                             // trivial
                                                             // destructor
            optional_base<std::remove_const_t<T>>>>>;

This in turn uses constexpr_optional_base.

c10::constexpr_optional_base

c10/util/Optional.h

template <class T>
struct constexpr_optional_base {
  bool init_;
  constexpr_storage_t<T> storage_;
  // ...
};

constexpr_storage_t

c10/util/Optional.h

template <class T>
union constexpr_storage_t {
  unsigned char dummy_;
  T value_;

#if __cplusplus >= 202002L
  // C++20 lifted the requirement to initialize a union member in order to be
  // constexpr.
  constexpr constexpr_storage_t(trivial_init_t) noexcept {
    new (&dummy_) unsigned char;
  }
#else
  constexpr constexpr_storage_t(trivial_init_t) noexcept : dummy_() {}
#endif

  template <class... Args>
  constexpr constexpr_storage_t(Args&&... args)
      : value_(constexpr_forward<Args>(args)...) {}

  ~constexpr_storage_t() = default;
};

Judging from the backtrace, execution takes the #else branch here and then proceeds straight into at::empty; the exact mechanics remain unclear to the author.

at::native::rand

Below is the corresponding function definition, located by the author from the signature at the call site:

aten/src/ATen/native/TensorFactories.cpp

Tensor rand(IntArrayRef size, c10::optional<Generator> generator,
    c10::optional<ScalarType> dtype,
    c10::optional<Layout> layout,
    c10::optional<Device> device,
    c10::optional<bool> pin_memory) {
  // See [Note: hacky wrapper removal for TensorOptions]
  TensorOptions options = TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory);

  auto result = at::empty(size, options);
  return result.uniform_(0, 1, std::move(generator));
}

It first calls at::empty to obtain an at::Tensor, then uses at::Tensor::uniform_ to fill the tensor's elements with samples from a uniform distribution. From here execution moves on into at::empty and uniform_, bringing the journey of the rand function to an end.
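
As a closing illustration, the same two steps can be reproduced with the public ATen C++ API. A small sketch (shape and dtype are arbitrary example values):

#include <ATen/ATen.h>
#include <iostream>

int main() {
  // Step 1: allocate an uninitialized tensor, as at::native::rand does.
  at::Tensor t = at::empty({3, 4}, at::TensorOptions().dtype(at::kFloat));
  // Step 2: fill it in place with samples from U[0, 1); passing no generator
  // corresponds to the nullopt generator seen above.
  t.uniform_(0.0, 1.0);
  std::cout << t << std::endl;
  return 0;
}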
