• Aman Arora’s Blog
  • Aman Arora

What Makes Modern Day LLMs Agentic

Large Language Models
AI Agents

Demystifying tool calling in LLMs - how special tokens and training patterns create the illusion of agency, when it’s really just next token prediction with clever scaffolding

Sep 14, 2025

Claude’s New File Capabilities - My Notes and Reflections

AI

A hands-on exploration of Claude’s new file creation and spreadsheet analysis capabilities, with real-world testing results and insights

Sep 10, 2025

Agent Frameworks Are So Much More Than For Loops

AI Agents
Programming

A balanced perspective on the recent debate about agent frameworks vs. simple while loops

Sep 08, 2025

Building a user facing not-for-profit chatbot for a Hindu Temple

Large Language Models
AI Agents

This blogpost walks you through the process of building a user facing chatbot using a real-world case study of a WhatsApp chatbot for a Hindu Temple. Discover best practices in development, implementation, and crucially, how to properly evaluate your AI application to ensure its effectiveness and reliability.

Jul 28, 2024

Gemma 2

Large Language Models

In this post, we take a deep dive into the architectural components of Gemma 2 such as Grouped Query Attention, Sliding Window Attention, RoPE Embeddings, Logit soft-capping & Model-merging!

Jul 09, 2024

Sliding Window Attention

Large Language Models

In this post, we take a deep dive into Sliding Window Attention, which allows transformers to handle long context lengths. We do this with the help of animations and also implement it from scratch in PyTorch.

Jul 04, 2024

Image retrieval app using Apple’s 4M-21 any-to-any vision model

Computer Vision
AI

As part of this blog post, we build an image retriever app that takes three inputs (caption, brightness, and number of items per image) and retrieves the most similar image from a database based on their values.

Jul 01, 2024

Support bot with Claude 3.5 Sonnet using Claudette and Slack-SDK

Large Language Models
AI Agents

As part of this blog post, we will build a support bot on Slack that can respond to queries in a Slack channel using Claudette (a thin Python wrapper on top of Anthropic's Python SDK).

Jun 22, 2024

Demystifying Document Question-Answering Chatbot - A Comprehensive Step-by-Step Tutorial with LangChain

AI Agents
Large Language Models

Embark on an enlightening journey through the world of document-based question-answering chatbots using LangChain! With a keen focus on detailed explanations and code walk-throughs, you’ll gain a deep understanding of each component, from creating a vector database to response generation.

Jul 28, 2023

Deciphering LangChain: A Deep Dive into Code Complexity

AI Agents
Programming

Analyzing LangChain’s source code reveals impressive modularity but also surprising complexity in executing simple text generation. The deep call stack makes tracing execution flow challenging.

Jul 25, 2023

Paper Review - ‘LaMini-LM’

Large Language Models
AI

As part of this blog post, we regenerate a small sample of the 2.58M shared Instruction Dataset and also perform human evaluation on some of the generated models shared in the research paper.

May 01, 2023

The Annotated CLIP (Part-2)

Computer Vision
AI

This post is part 2 of a two-part series of blog posts on CLIP (for part 1, please refer to my previous blog post). In this post, we present the PyTorch code behind CLIP for model building and training. This blog post is itself a working Jupyter Notebook.

Mar 11, 2023

The Annotated CLIP (Part-1)

Computer Vision
AI

This post is part 1 of a two-part series of blog posts on CLIP. In this post, we present an introduction to CLIP in an easy-to-digest manner. We also compare CLIP to other research papers and look at the background and inspiration behind CLIP.

Mar 03, 2023

Swin Transformer

Computer Vision
Large Language Models

Swin Transformer Model Architecture explained with PyTorch implementation line-by-line.

Jul 04, 2022

The Annotated DETR

Computer Vision
Large Language Models

DETR Model Architecture explained with PyTorch implementation line-by-line.

Jul 26, 2021

The sad state of AI and tech startups in Australia today and what can we do about it

AI

“Did you know that Australia’s investment in AI was only 0.29% of the total investment in AI in 2020? This must explain the sad state of things here in Australia when it…

May 15, 2021

Adam and friends

Computer Vision

Basic optimizers from scratch in PyTorch with working notebook.

Mar 13, 2021

Vision Transformer

Computer Vision
Large Language Models

In this blog post, we will look at the Vision Transformer architecture in detail and also re-implement it in PyTorch from scratch.

Jan 18, 2021

The EfficientDet Architecture in PyTorch

Computer Vision

In this blog post, we will look at how to implement the EfficientDet architecture in PyTorch from scratch.

Jan 13, 2021

EfficientDet - Scalable and Efficient Object Detection

Computer Vision

As part of this blog post I will explain how EfficientDets work step-by-step.

Jan 11, 2021

Top 100 solution - SIIM-ACR Pneumothorax Segmentation

Computer Vision
AI

In this blog post, we will look at an image segmentation problem in PyTorch, using the SIIM-ACR Pneumothorax Segmentation competition as an example, and create a solution that gets us to a top-100 leaderboard position on Kaggle.

Sep 06, 2020

U-Net A PyTorch Implementation in 60 lines of Code

Computer Vision

As part of this blog post we will implement the U-Net architecture in PyTorch in 60 lines of code.

Aug 30, 2020

GeM Pooling Explained with PyTorch Implementation and Introduction to Image Retrieval

Computer Vision

As part of this blog post, we look at GeM pooling and the research paper Fine-tuning CNN Image Retrieval with No Human Annotation. We also implement GeM pooling from scratch in PyTorch.

Aug 30, 2020

SIIM-ISIC Melanoma Classification - my journey to a top 5% solution and first silver medal on Kaggle

Computer Vision
AI

As part of this blog post, I share my winning solution for the SIIM-ISIC Melanoma Classification Kaggle competition.

Aug 23, 2020

EfficientNet

Computer Vision

Look at the current SOTA, with top-1 accuracy of 88.5% on ImageNet.

Aug 13, 2020

Group Normalization

Computer Vision

In this blog post, we will look at the Group Normalization research paper and also implement Group Normalization in PyTorch from scratch.

Aug 09, 2020

DenseNet Architecture Explained with PyTorch Implementation from TorchVision

Computer Vision
Programming

In this blog post, we introduce dense blocks and transition layers, and look at the TorchVision implementation of DenseNet step-by-step.

Aug 02, 2020

Squeeze and Excitation Networks Explained with PyTorch Implementation

Computer Vision

In this blogpost, we re-implement Squeeze-and-Excitation networks in PyTorch step-by-step, with very minor updates to the ResNet implementation in torchvision.

Jul 24, 2020

Label Smoothing Explained using Microsoft Excel

Computer Vision

In this blogpost, we re-implement Label Smoothing in Microsoft Excel step by step.

Jul 18, 2020

An introduction to PyTorch Lightning with comparisons to PyTorch

Programming

In this blogpost, we go through an introduction to PyTorch Lightning and implement all the cool tricks, such as gradient accumulation and 16-bit precision training, and also add TPU/multi-GPU support, all in a few lines of code. We will use PyTorch Lightning to work on the SIIM-ISIC Melanoma Classification challenge on Kaggle.

Jul 12, 2020

What is Focal Loss and when should you use it?

Computer Vision

In this blogpost, we will understand what Focal Loss is and when it is used. We will also dive into its math and implement it step-by-step in PyTorch.

Jun 29, 2020

The Annotated GPT-2

Large Language Models

This post presents an annotated version of the paper in the form of a line-by-line implementation in PyTorch. This document itself is a working notebook, and should be a completely usable implementation.

Feb 18, 2020