PyTorch Developer Podcast

PyTorch

A daily Technology podcast

Episodes of PyTorch Developer Podcast
TORCH_TRACE and tlparse are, respectively, a structured log and a log parser for PyTorch 2. Together they give useful information about what code was compiled and what the intermediate build products look like.
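A minimal sketch of how you might collect and parse such a trace (the directory path and toy function are placeholders; tlparse is a separate CLI, installable e.g. via pip):

    import os

    # Point structured logging at a directory before the first compile so
    # the trace writer picks it up (exporting it in the shell also works).
    os.environ["TORCH_TRACE"] = "/tmp/my_trace"

    import torch

    @torch.compile
    def f(x):
        return x.sin() + x.cos()

    f(torch.randn(8))

    # Then parse the per-process log into an HTML report:
    #   pip install tlparse
    #   tlparse /tmp/my_trace/<logfile>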
Higher order operators are a special form of operators in torch.ops which have relaxed input argument requirements: in particular, they can accept any form of argument, including Python callables. Their name comes from their most common use: like higher-order functions, they take other functions as arguments.
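For example, torch.cond is a higher order operator whose branches are ordinary Python callables (a minimal sketch, assuming a recent PyTorch where torch.cond is exposed):

    import torch

    def true_fn(x):
        return x.sin()

    def false_fn(x):
        return x.cos()

    @torch.compile
    def f(pred, x):
        # Both branch functions are passed in as Python callables.
        return torch.cond(pred, true_fn, false_fn, (x,))

    print(f(torch.tensor(True), torch.randn(4)))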
The post-grad FX passes in Inductor run after AOTAutograd has functionalized and normalized the input program into separate forward/backward graphs. As such, they generally can assume that the graph in question is functionalized, with some exceptions.
CUDA graph trees are the internal implementation of CUDA graphs used in PT2 when you say mode="reduce-overhead". Their primary innovation is that they allow the reuse of memory across multiple CUDA graphs, as long as they form a tree structure.
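Enabling them is just the mode flag (a minimal sketch with a toy function):

    import torch

    @torch.compile(mode="reduce-overhead")
    def f(x):
        return x.sin() + 1

    if torch.cuda.is_available():
        x = torch.randn(8, device="cuda")
        f(x)  # early calls warm up and record
        f(x)  # later calls replay the CUDA graph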
The min-cut partitioner makes decisions about what to save for backwards when splitting the forward and backwards graph from the joint graph traced by AOTAutograd. Crucially, it doesn't actually do a "split"; instead, it is deciding how much of the joint graph to recompute in backwards versus save from forwards.
AOTInductor is a feature in PyTorch that lets you export an inference model into a self-contained dynamic library, which can subsequently be loaded and used to run optimized inference. It is aimed primarily at CUDA and CPU inference applications.
Tensor subclasses allow you to extend PyTorch with new types of tensors without having to write any C++. They have been used to implement DTensor, FP8, Nested Jagged Tensor and Complex Tensor. Recent work by Brian Hirsh means that we can compile programs that use tensor subclasses with PT2.
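A minimal sketch of the mechanism via __torch_function__ (LoggingTensor is invented for exposition; real subclasses like DTensor typically use __torch_dispatch__, which sits below autograd):

    import torch

    class LoggingTensor(torch.Tensor):
        # Intercept every torch-level call, log it, then run it normally.
        @classmethod
        def __torch_function__(cls, func, types, args=(), kwargs=None):
            kwargs = kwargs or {}
            print(f"called {func.__name__}")
            return super().__torch_function__(func, types, args, kwargs)

    t = torch.randn(3).as_subclass(LoggingTensor)
    torch.add(t, t)  # prints: called add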
Compiled autograd is an extension to PT2 that permits compiling the entirety of a backward() call in PyTorch. This allows us to fuse accumulate grad nodes as well as trace through arbitrarily complicated Python backward hooks.
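A minimal sketch following the compiled autograd tutorial (the config flag's exact spelling may vary across releases):

    import torch
    import torch._dynamo

    torch._dynamo.config.compiled_autograd = True

    @torch.compile
    def train_step(x, w):
        loss = (x @ w).sin().sum()
        loss.backward()  # the backward() call itself is compiled
        return w.grad

    x = torch.randn(4, 4)
    w = torch.randn(4, 4, requires_grad=True)
    print(train_step(x, w))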
We discuss some extension points for customizing PT2 behavior across Dynamo, AOTAutograd and Inductor.
Define-by-run IR is how Inductor defines the internal compute of a pointwise/reduction operation. It is characterized by a function that calls a number of functions in the 'ops' namespace, where these ops can be overridden by different handlers.
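A toy illustration of the pattern (these classes are invented for exposition, not Inductor's actual handlers): the "program" is a Python function written against an abstract ops object, and swapping the handler reinterprets the same function.

    class PrintOps:
        def load(self, name):
            return name
        def add(self, a, b):
            return f"({a} + {b})"
        def store(self, name, value):
            print(f"{name} = {value}")

    def inner_fn(ops):
        a = ops.load("a")
        b = ops.load("b")
        ops.store("out", ops.add(a, b))

    inner_fn(PrintOps())  # prints: out = (a + b)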
Traditionally, unsigned integer support in PyTorch was not great; we only supported uint8. Recently, we added support for uint16, uint32 and uint64. Bare bones functionality works, but I'm entreating the community to help us build out the rest.
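A minimal sketch (assuming a release recent enough to have the wider dtypes; only bare-bones functionality is guaranteed):

    import torch

    x = torch.tensor([1, 2, 3], dtype=torch.uint16)
    y = x.to(torch.uint32)       # dtype conversion works
    print(x.dtype, y.dtype)      # torch.uint16 torch.uint32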
Inductor IR is an intermediate representation that lives between ATen FX graphs and the final Triton code generated by Inductor. It was designed to faithfully represent PyTorch semantics and accordingly models views, mutation and striding.
I talk about VariableTracker in Dynamo. VariableTracker is Dynamo's symbolic representation of the Python values it traces. I talk about some recent changes, namely eager guards and mutable VT. I also tell you how to find the functionality you care about in VariableTracker.
This podcast goes over the basics of unbacked SymInts. You might want to listen to this one before listening to https://pytorch-dev-podcast.simplecast.com/episodes/zero-one-specialization. We answer some questions along the way (h/t Gregory Chanan).
Mikey Dagistes joins me to ask some questions about the recent composability sync https://www.youtube.com/watch?v=NJV7YFbtoR4 where we discussed 0/1 specialization and its implications on export in PT2. What's the fuss all about?
What is torchdynamo? From a bird's eye view, what exactly does it do? What are some important things to know about it? How does it differ from other graph capture mechanisms? For more reading, check out https://docs.google.com/document/d/13K03JN
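One way to see what Dynamo captures is to hand torch.compile a trivial backend that just prints the FX graph (a minimal sketch; my_backend is an invented name):

    import torch

    def my_backend(gm, example_inputs):
        print(gm.graph)    # show the FX graph Dynamo captured
        return gm.forward  # run it unoptimized

    @torch.compile(backend=my_backend)
    def f(x):
        return torch.relu(x) + 1

    f(torch.randn(4))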
Soumith's keynote on PT2.0: https://youtu.be/vbtGZL7IrAw?t=1037 PT2 Manifesto: https://docs.google.com/document/d/1tlgPcR2YmC3PcQuYDPUORFmEaBPQEmo8dsh4eUjnlyI/edit# PT2 Architecture: https://docs.google.com/document/d/1wpv8D2iwGkKjWyKof9gFdTf8IS
Join me with Richard Zou to talk about the history of functorch. What was the thought process behind the creation of functorch? How did it get started? JAX's API and model are fairly different from PyTorch's; how did we validate that it would work?
What’s a learning rate? Why might you want to schedule it? How does the LR scheduler API in PyTorch work? What the heck is up with the formula implementation? Why is everything terrible?
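For instance, a stepwise schedule looks like this (a minimal sketch with a placeholder training loop):

    import torch

    model = torch.nn.Linear(4, 2)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    # Halve the learning rate every 10 epochs.
    sched = torch.optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.5)

    for epoch in range(30):
        opt.step()    # stand-in for a real epoch of optimizer steps
        sched.step()  # advance the schedule once per epoch
    print(sched.get_last_lr())  # [0.0125]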
Weak references: what are they good for? (Caches. Private fields.) C++ side support, how it's implemented / release resources. Python side support, how it's implemented. Weak ref tensor hazard due to resurrection. Downsides of weak references in C++. Scott Wolchok makes an appearance.
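A minimal sketch of the Python side:

    import weakref
    import torch

    t = torch.randn(3)
    r = weakref.ref(t)   # tensors support weak references
    print(r() is t)      # True while t is alive
    del t
    print(r())           # None once the tensor has been collected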
Mike Ruberry has an RFC about stride-agnostic operator semantics (https://github.com/pytorch/pytorch/issues/78050), so let's talk about strides. What are they? How are they used to implement views and memory format? How do you handle them properly?
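A minimal sketch of strides in action:

    import torch

    t = torch.arange(6).reshape(2, 3)
    print(t.stride())         # (3, 1): row-major (contiguous) layout
    v = t.t()                 # transpose is a view: same storage,
    print(v.stride())         # (1, 3): only the strides changed
    print(v.is_contiguous())  # False
    c = v.contiguous()        # materializes a row-major copy
    print(c.stride())         # (2, 1)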
AOTAutograd is a cool new feature in functorch for capturing both forward and backward traces of PyTorch operators, letting you run them through a compiler and then drop the compiled kernels back into a normal PyTorch eager program. Today, Horace He joins me to talk about it.
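A minimal sketch using the functorch-era API (print_graph is an invented compiler hook; newer entry points live under torch._functorch):

    import torch
    from functorch.compile import aot_function

    def print_graph(fx_g, example_inputs):
        print(fx_g.code)  # the captured forward or backward graph
        return fx_g       # GraphModules are callable, so this "compiles" to itself

    def f(x):
        return x.sin().sum()

    g = aot_function(f, fw_compiler=print_graph, bw_compiler=print_graph)
    g(torch.randn(4, requires_grad=True)).backward()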
Sherlock recently joined the PyTorch team, having previously worked on ONNX Runtime at Microsoft, and Sherlock’s going to ask me some questions about the dispatcher, and I’m going to answer them. We talked about the history of the dispatcher, among other topics.
PyTorch recently moved all of its CI from CircleCI to GitHub Actions. There were a lot of improvements in the process, making my old podcast about CI obsolete! Today, Eli Uriegas joins me to talk about why we moved to GitHub Actions and how the new setup works.
C++ has exceptions, Python has exceptions. But they’re not the same thing! How do exceptions work in CPython, how do we translate exceptions from C++ to Python (hint: it’s different for direct bindings versus pybind11), and what do warnings (which get translated similarly) look like?
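For example, a shape error raised as c10::Error in C++ surfaces in Python as a RuntimeError:

    import torch

    try:
        torch.randn(2, 3) @ torch.randn(2, 3)  # invalid matmul shapes
    except RuntimeError as e:
        print("caught:", type(e).__name__)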