Technical Writings

A collection of my technical blog posts and articles published across various platforms.

Blog Posts (amaarora.github.io)

An Introduction to Real-Time Guardrails & Qwen3Guard: A comprehensive technical review and introduction to Qwen3Guard, exploring how it addresses critical limitations in existing guardrail models through controversial classification, real-time streaming detection, and multilingual safety moderation across 119 languages.
How LLMs Scaled from 512 to 2M Context: A Technical Deep Dive: A comprehensive technical guide through the evolution of positional encodings such as APE, RoPE, Position Interpolation, NTK-Aware Scaling, Dynamic Scaling, and YaRN in Large Language Models.
What Makes Modern Day LLMs Agentic: Demystifying tool calling in LLMs - how special tokens and training patterns create the illusion of agency, when it’s really just next token prediction with clever scaffolding.
Claude’s New File Capabilities - My Notes and Reflections: Exploring Claude’s new file handling features and their implications for AI-assisted development.
Agent Frameworks Are So Much More Than For Loops: A deep dive into what makes modern agent frameworks powerful beyond simple iteration.
Deciphering LangChain: A Deep Dive into Code Complexity: An analysis of LangChain’s architecture and code complexity.
Paper Review - ‘LaMini-LM’: A review of the paper “LaMini-LM - A Diverse Herd of Distilled Models from Large-Scale Instructions”.
The Annotated CLIP (Part-2): A detailed explanation of the PyTorch code behind CLIP for model building and training.
The Annotated CLIP (Part-1): An introduction to CLIP, comparing it to other research papers and discussing the inspiration behind it.
Swin Transformer: Explanation and PyTorch implementation of the Swin Transformer Model Architecture.
The Annotated DETR: Explanation and PyTorch implementation of the DETR Model Architecture for end-to-end object detection.
The sad state of AI and tech startups in Australia today and what can we do about it: A discussion on the current state of AI and tech startups in Australia.
Adam and friends: Implementation of basic optimizers like Adam, SGD, RMSProp from scratch in PyTorch.
Vision Transformer: Detailed look at the Vision Transformer architectures and their re-implementation in PyTorch from scratch.
The EfficientDet Architecture in PyTorch: A guide on how to implement the EfficientDet architecture in PyTorch from scratch.
EfficientDet - Scalable and Efficient Object Detection: Explanation of how EfficientDets work step-by-step.
U-Net A PyTorch Implementation in 60 lines of Code: Implementation of the U-Net architecture in PyTorch in 60 lines of code.
Top 100 solution - SIIM-ACR Pneumothorax Segmentation: A PyTorch solution to the image segmentation problem posed by the SIIM-ACR Pneumothorax Segmentation competition.
GeM Pooling Explained with PyTorch Implementation and Introduction to Image Retrieval: Explanation of GeM pooling and implementation from scratch in PyTorch.
SIIM-ISIC Melanoma Classification - my journey to a top 5% solution and first silver medal on Kaggle: A walkthrough of my top 5% solution for the SIIM-ISIC Melanoma Classification Kaggle competition.
EfficientNet: A look at EfficientNet, the SOTA at the time of writing, with top-1 accuracy of 88.5% on ImageNet.
Group Normalization: A look at the Group Normalization research paper and its implementation in PyTorch from scratch.
DenseNet Architecture Explained with PyTorch Implementation from TorchVision: Introduction to dense blocks, transition layers and the TorchVision implementation of DenseNet step-by-step.
Squeeze and Excitation Networks Explained with PyTorch Implementation: Step-by-step re-implementation of Squeeze-and-Excitation networks in PyTorch, with very minor updates to the ResNet implementation.
Label Smoothing Explained using Microsoft Excel: Re-implementation of Label Smoothing in Microsoft Excel step by step.
An introduction to PyTorch Lightning with comparisons to PyTorch: An introduction to PyTorch Lightning, implementing tricks such as gradient accumulation, 16-bit precision training, and TPU/multi-GPU support in a few lines of code.
What is Focal Loss and when should you use it?: Understanding what Focal Loss is and when it is used, including a deep dive into its math and step-by-step implementation in PyTorch.
The Annotated GPT-2: An annotated version of the GPT-2 paper in the form of a line-by-line implementation in PyTorch.

External Publications (Weights & Biases Reports)

An Introduction to HuggingFace’s Accelerate Library: An introduction to the Accelerate Library by HuggingFace.
Train, Optimize, Analyze, Visualize and Deploy Models for Automatic Speech Recognition with NVIDIA’s NeMo: A guide on training, optimizing, analyzing, visualizing, and deploying models for Automatic Speech Recognition with NVIDIA’s NeMo.
Interpret any PyTorch Model Using W&B Embedding Projector: A guide on interpreting any PyTorch model using the W&B Embedding Projector.
How Weights and Biases Can Help with Audits & Regulatory Guidelines: A discussion on how Weights and Biases can assist with audits and regulatory guidelines.
How Weights & Biases and MS Fairlearn can help deal with Model and Dataset Bias: A guide on how Weights & Biases and MS Fairlearn can help deal with model and dataset bias.
ResNet Strikes Back: A Training Procedure in TIMM: A report on the training procedure of ResNet in TIMM.
Is MLP-Mixer a CNN in Disguise?: A discussion on whether MLP-Mixer is a CNN in disguise.
Are fully connected and convolution layers equivalent? If so, how?: A report on the equivalence of fully connected and convolution layers.
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases: A report on improving Vision Transformers with Soft Convolutional Inductive Biases.
A faster way to get working and up-to-date conda environments using “fastchan”: A guide on using “fastchan” for faster and up-to-date conda environments.
Explained: Characterizing Signal Propagation to Close the Performance Gap in Unnormalized ResNets: An explanation of characterizing signal propagation to close the performance gap in unnormalized ResNets.
Revisiting ResNets: Improved Training and Scaling Strategies: A report on improved training and scaling strategies for ResNets.
EfficientNetV2: A report on EfficientNetV2.
I trained on ImageNet for the “first time” - here’s what I learnt: A report on my experience and learnings from training on ImageNet for the first time.
Understanding Logits, Sigmoid, Softmax, and Cross-Entropy Loss in Deep Learning: A deep dive into understanding logits, sigmoid, softmax, and cross-entropy loss in deep learning.
How to Build a Robust Medical Model Using Weights & Biases: A guide on building a robust medical model using Weights & Biases.
Tracking CO2 Emissions of Your Deep Learning Models with CodeCarbon and Weights & Biases: A guide on tracking CO2 emissions of deep learning models with CodeCarbon and Weights & Biases.
How to track all your experiments using Microsoft Excel?: A guide on tracking all your experiments using Microsoft Excel.
How to save all your trained model weights locally after every epoch: A guide on saving all your trained model weights locally after every epoch.
How to prepare the dataset for the Melanoma Classification?: A guide on preparing the dataset for the Melanoma Classification.
How to use Weights & Biases for your Kaggle Competitions?: A guide on using Weights & Biases for Kaggle competitions.
How to use Weights & Biases for your next Machine Learning Project?: A guide on using Weights & Biases for your next Machine Learning project.
