tinygrad: A simple and powerful neural network framework


tinygrad

We write and maintain tinygrad, the fastest-growing neural network framework.

It's extremely simple, and it breaks down even the most complex networks into 3 OpTypes:

ElementwiseOps are UnaryOps, BinaryOps, and TernaryOps.
They operate on 1-3 tensors and run elementwise.
example: SQRT, LOG2, ADD, MUL, WHERE, etc...

ReduceOps operate on one tensor and return a smaller tensor.
example: SUM, MAX

MovementOps are virtual ops that operate on one tensor and move the data around.
They are copy-free thanks to ShapeTracker.
example: RESHAPE, PERMUTE, EXPAND, etc...
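
To make "copy-free" concrete, here is a minimal sketch of one way movement ops can be virtual. It assumes a plain shape-plus-strides view over a flat buffer. `StridedView` and its methods are hypothetical names made up for illustration and are not tinygrad's ShapeTracker API; the real thing may work differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StridedView:
    """Hypothetical stand-in for a shape tracker: a shape and strides over a flat buffer."""
    buffer: list
    shape: tuple
    strides: tuple

    @staticmethod
    def contiguous(buffer, shape):
        # Row-major strides: the last axis moves one element at a time.
        strides, step = [], 1
        for dim in reversed(shape):
            strides.append(step)
            step *= dim
        return StridedView(buffer, tuple(shape), tuple(reversed(strides)))

    def permute(self, order):
        # PERMUTE only reorders the shape and strides; the buffer is untouched.
        return StridedView(self.buffer,
                           tuple(self.shape[i] for i in order),
                           tuple(self.strides[i] for i in order))

    def expand(self, new_shape):
        # EXPAND repeats a size-1 axis for free by giving it stride 0.
        strides = tuple(0 if old == 1 and new != 1 else s
                        for old, new, s in zip(self.shape, new_shape, self.strides))
        return StridedView(self.buffer, tuple(new_shape), strides)

    def __getitem__(self, index):
        # Map a multi-dimensional index to an offset into the flat buffer.
        return self.buffer[sum(i * s for i, s in zip(index, self.strides))]


data = [0, 1, 2, 3, 4, 5]
view = StridedView.contiguous(data, (2, 3))
transposed = view.permute((1, 0))
row = StridedView.contiguous([7, 8, 9], (1, 3)).expand((4, 3))
```

Every view above points at the same underlying list, so no data is moved; only the index arithmetic changes.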

But how...where are your CONVs and MATMULs? Read the code to solve this mystery.
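
Without spoiling the whole mystery, here is one plausible way a MATMUL can fall out of the three OpTypes. NumPy is used purely as a stand-in, and `matmul_from_primitives` is a hypothetical helper name; this is a sketch of the idea, not tinygrad's actual implementation.

```python
import numpy as np


def matmul_from_primitives(a, b):
    """Compute a @ b using only movement, elementwise, and reduce steps."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    # MovementOps: RESHAPE then EXPAND both operands to a common (m, k, n) shape.
    # np.broadcast_to returns a view, so nothing is copied here.
    a3 = np.broadcast_to(a.reshape(m, k, 1), (m, k, n))
    b3 = np.broadcast_to(b.reshape(1, k, n), (m, k, n))
    # ElementwiseOp (a BinaryOp): MUL.
    prod = a3 * b3
    # ReduceOp: SUM over the shared k axis, leaving an (m, n) result.
    return prod.sum(axis=1)


a = np.arange(6, dtype=np.float32).reshape(2, 3)
b = np.arange(12, dtype=np.float32).reshape(3, 4)
result = matmul_from_primitives(a, b)
```

The intermediate `prod` is materialized in this sketch; a framework that fuses the MUL into the SUM would not need to store it.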

Work at tiny corp

We are now funded and hiring full time software engineers. Very talented interns okay.

See our bounty page to judge if you might be a good fit. Bounties pay you while judging that fit.

We are also hiring for operations and hardware, but if you haven't contributed to tinygrad your application won't be considered.

tinybox (now shipping)

We sell a computer called the tinybox. It comes in two colors.

Spec	red	green v2
FP16 (FP32 acc) FLOPS	738 TFLOPS	1492 TFLOPS
GPU Model	6x 7900XTX	4x 5090
GPU RAM	144 GB	128 GB
GPU RAM bandwidth	5760 GB/s	7168 GB/s
GPU link bandwidth	full fabric PCIe 4.0 x16	full fabric PCIe 5.0 x16
CPU	32 core AMD EPYC	32 core AMD GENOA
System RAM	128 GB	192 GB
System RAM bandwidth	204.8 GB/s	460.8 GB/s
Disk size	4 TB RAID array + 1 TB boot	4 TB RAID array + 1 TB boot
Disk read bandwidth	28.7 GB/s	28.7+ GB/s
Networking	2x 1 GbE + open OCP3.0 slot	2x 10 GbE + 2x 1 GbE + open OCP3.0 PCIe5
Noise	< 50 dB, 31 low speed fans	< 50 dB, 31 low speed fans
Power Supply	2x 1600W, 100V~240V	2x 1600W, 100V~240V
BMC	AST2500	AST2600
Operating System	Ubuntu 22.04	Ubuntu 24.04
Dimensions	12U, 16.25" deep, 90 lbs	12U, 16.25" deep, 90 lbs
Rack?	Freestanding or rack mount	Freestanding or rack mount
Driver Quality	Developing	Great
Shipping	IN STOCK - $15,000	IN STOCK - $29,000
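
As a rough way to compare the two boxes, the numbers in the table can be turned into price per TFLOPS and per-GPU figures. This is a back-of-the-envelope sketch using only the listed specs and prices; the dictionary keys and the `summarize` helper are made up for illustration.

```python
# Figures copied from the spec table above; prices in USD.
boxes = {
    "red": {"tflops": 738, "gpus": 6, "gpu_ram_gb": 144,
            "gpu_bw_gbs": 5760, "price_usd": 15_000},
    "green v2": {"tflops": 1492, "gpus": 4, "gpu_ram_gb": 128,
                 "gpu_bw_gbs": 7168, "price_usd": 29_000},
}


def summarize(spec):
    """Derive price/performance and per-GPU numbers from one column of the table."""
    return {
        "usd_per_tflops": round(spec["price_usd"] / spec["tflops"], 2),
        "ram_per_gpu_gb": spec["gpu_ram_gb"] / spec["gpus"],
        "bw_per_gpu_gbs": spec["gpu_bw_gbs"] / spec["gpus"],
    }


summary = {name: summarize(spec) for name, spec in boxes.items()}
for name, row in summary.items():
    print(name, row)
```

On these listed numbers, red works out to about $20.33 per FP16 TFLOPS and green v2 to about $19.44, with 24 GB and 32 GB of RAM per GPU respectively.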

For updates on products and inventory, sign up for the mailing list.

FAQ

What is a tinybox?: It is a very powerful computer for deep learning, and likely the best performance/$. It was benchmarked in MLPerf Training 4.0 vs computers that cost 10x as much. And of course, anything that can train can do inference.
How do I get a tinybox?: Place an order through the links above. The factory is up and running, and it will ship within one week of us receiving the payment. Currently offering pickup in San Diego + shipping worldwide.
Where can I learn more about the tinybox?: We have a lot of content on our Twitter. We also have a tinybox docs page and a #tinybox discord channel.
Can I customize my tinybox?: In order to keep prices low and quality high, we don't offer any customization to the box or ordering process. Of course, after you buy the tinybox, it's yours and you are welcome to do whatever you want with it!
Can you fill out this supplier onboarding form?: In order to keep prices low and quality high, we don't offer any customization to the box or ordering process. If you aren't capable of ordering through the website, I'm sorry but we won't be able to help.
Can I pay with something besides wire transfer?: In order to keep prices low and quality high, we don't offer any customization to the box or ordering process. Wire transfer is the only accepted form of payment.
Is tinygrad used anywhere?: tinygrad is used in openpilot to run the driving model on the Snapdragon 845 GPU. It replaces SNPE, is faster, supports loading ONNX files, supports training, and allows for attention (SNPE only allows fixed weights).
Is tinygrad inference only?: No! It supports full forward and backward passes with autodiff. This is implemented at a level of abstraction higher than the accelerator specific code, so a tinygrad port gets you this for free.
How can I use tinygrad for my next ML project?: Follow the installation instructions on the tinygrad repo. It has a similar API to PyTorch, yet simpler and more refined. Be warned that the API is less stable while tinygrad is in alpha, though it's been fairly stable for a while.
When will tinygrad leave alpha?: When we can reproduce a common set of papers on 1 NVIDIA GPU 2x faster than PyTorch. We also want the speed to be good on the M1. ETA, Q2 next year.
How is tinygrad faster than PyTorch?: For most use cases it isn't yet, but it will be. It has three advantages:
1. It compiles a custom kernel for every operation, allowing extreme shape specialization.
2. All tensors are lazy, so it can aggressively fuse operations.
3. The backend is 10x+ simpler, meaning optimizing one kernel makes everything fast.
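
To illustrate why laziness helps with fusion, here is a toy sketch in plain Python. `Lazy` is a hypothetical class invented for this example and has nothing to do with tinygrad's real internals; it only shows how recording ops first lets a chain of elementwise ops run as one loop with no intermediate buffers.

```python
import math


class Lazy:
    """Hypothetical lazy tensor: records ops instead of running them."""

    def __init__(self, data=None, op=None, srcs=()):
        self.data, self.op, self.srcs = data, op, srcs

    def __add__(self, other):
        return Lazy(op=lambda x, y: x + y, srcs=(self, other))

    def __mul__(self, other):
        return Lazy(op=lambda x, y: x * y, srcs=(self, other))

    def sqrt(self):
        return Lazy(op=math.sqrt, srcs=(self,))

    def _scalar_fn(self):
        # Compose every recorded op into one function of the element index.
        if self.op is None:
            return lambda i: self.data[i]
        fns = [src._scalar_fn() for src in self.srcs]
        return lambda i: self.op(*(f(i) for f in fns))

    def _length(self):
        return len(self.data) if self.op is None else self.srcs[0]._length()

    def realize(self):
        fused = self._scalar_fn()
        # A single pass over the data: MUL, ADD, and SQRT run per element
        # without allocating a buffer for each intermediate result.
        return [fused(i) for i in range(self._length())]


a = Lazy([1.0, 4.0, 9.0])
b = Lazy([3.0, 3.0, 3.0])
expr = (a * b + a).sqrt()   # nothing is computed yet
out = expr.realize()        # the whole expression runs in one loop
```

An eager framework would typically launch three kernels and write two temporary tensors for the same expression.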
Where is tinygrad development happening?: On GitHub and on Discord
How can the tiny corp work for me?: Email me, george@tinygrad.org. We are looking for contracts and sponsorships to improve various aspects of tinygrad.
How can I work for the tiny corp?: See hiring above. Contributions to tinygrad on GitHub are always welcome, and a good way to get hired.
Can I invest in the tiny corp?: Invest with your PRs.
What's the goal of the tiny corp?: To accelerate. We will commoditize the petaflop and enable AI for everyone.