Building a user facing not-for-profit chatbot for a Hindu Temple

A Step-by-Step Guide on building a user facing chatbot with proper evals, logging and monitoring

LLM

This blogpost walks you through the process of building a user-facing chatbot, using a real-world case study of a WhatsApp chatbot for a Hindu Temple. Discover best practices in development and implementation and, crucially, how to properly evaluate your AI application to ensure its effectiveness and reliability.

Jul 28, 2024

Aman Arora

Gemma 2

Improving Open Language Models at a Practical Size

LLM

In this post, we take a deep dive into the architectural components of Gemma 2 such as Grouped Query Attention, Sliding Window Attention, RoPE Embeddings, Logit soft-capping & Model-merging!

Jul 9, 2024

Aman Arora

Sliding Window Attention

Longformer - The Long-Document Transformer

LLM

In this post, we take a deep dive into Sliding Window Attention, which allows transformers to handle long context lengths. We do this with the help of animations and also implement it from scratch in PyTorch.

Jul 4, 2024

Aman Arora

Image retrieval app using Apple’s 4M-21 any-to-any vision model

4M-21 An Any-to-Any Vision Model for Tens of Tasks and Modalities

VLM

As part of this blog post, we are going to build an image retriever app that takes three inputs (caption, brightness, and number of items per image) and retrieves the most similar image from a database based on those values.

Jul 1, 2024

Aman Arora

Support bot with Claude 3.5 Sonnet using Claudette and Slack-SDK

Creating a support bot that supports API calls using Claudette

LLM

As part of this blog post, we will build a support bot on Slack that can respond to queries in a Slack channel using Claudette (a thin Python wrapper on top of Anthropic's Python SDK).

Jun 22, 2024

Aman Arora

Demystifying Document Question-Answering Chatbot - A Comprehensive Step-by-Step Tutorial with LangChain

Embark on an enlightening journey through the world of document-based question-answering chatbots using LangChain! With a keen focus on detailed explanations and code walk-throughs, you’ll gain a deep understanding of each component, from creating a vector database to response generation.

Jul 28, 2023

Aman Arora

Deciphering LangChain: A Deep Dive into Code Complexity

Analyzing LangChain’s source code reveals impressive modularity but also surprising complexity in executing simple text generation. The deep call stack makes tracing execution flow challenging.

Jul 25, 2023

Aman Arora

Ahead of Times - Issue 2 (May 01 - May 07)

Second issue of the weekly newsletter to help you stay ahead of the times with latest news & updates in the field of AI.

Newsletter

As part of this newsletter, I share with you key updates, projects, GitHub repos, research trends, research papers in the field of Computer Vision, Large Language Models and Stable Diffusion.

May 8, 2023

Aman Arora

Ahead of Times - Issue 1 (Apr 24 - Apr 30)

First issue of the weekly newsletter to help you stay ahead of the times with latest news & updates in the field of AI.

Newsletter

As part of this newsletter, I share with you key updates, projects, GitHub repos, research trends, research papers in the field of Computer Vision, Large Language Models and Stable Diffusion.

May 2, 2023

Aman Arora

Paper Review - ‘LaMini-LM’

Paper review of “LaMini-LM - A Diverse Herd of Distilled Models from Large-Scale Instructions” and analysis of the released 2.58M instruction dataset.

Large Language Models

Paper Review

As part of this blog post, we regenerate a small sample of the 2.58M shared Instruction Dataset and also perform human evaluation on some of the generated models shared in the research paper.

May 1, 2023

Aman Arora

The Annotated CLIP (Part-2)

Learning Transferable Visual Models From Natural Language Supervision

Multimodal

Transformers

Clip

This post is part 2 of a two-part series of blog posts on CLIP (for part 1, please refer to my previous blog post). In this blog post, we present the PyTorch code behind CLIP for model building and training. This blog post is itself a working Jupyter Notebook.

Mar 11, 2023

Aman Arora

The Annotated CLIP (Part-1)

Learning Transferable Visual Models From Natural Language Supervision

Multimodal

Transformers

This post is part 1 of a two-part series of blog posts on CLIP. In this blog post, we present an introduction to CLIP in an easy-to-digest manner. We also compare CLIP to other research papers and look at the background and inspiration behind CLIP.

Mar 3, 2023

Aman Arora

Swin Transformer

Hierarchical Vision Transformer using Shifted Windows

Computer Vision

Model Architecture

Transformers

Swin Transformer Model Architecture explained with PyTorch implementation line-by-line.

Jul 4, 2022

Aman Arora

The Annotated DETR

End-to-End Object Detection with Transformers

Computer Vision

Model Architecture

Object Detection

Transformers

DETR Model Architecture explained with PyTorch implementation line-by-line.

Jul 26, 2021

Aman Arora

The sad state of AI and tech startups in Australia today and what can we do about it

Jeremy Howard

“Did you know that Australia’s investment in AI was only 0.29% of the total investment in AI in 2020? This must explain the sad state of things here in Australia when it…

May 15, 2021

Aman Arora

Adam and friends

Adam, SGD, RMSProp from scratch in PyTorch.

Computer Vision

Basic optimizers from scratch in PyTorch with working notebook.

Mar 13, 2021

Aman Arora

Vision Transformer

An Image is Worth 16x16 Words - Transformers for Image Recognition at Scale

Computer Vision

Model Architecture

Transformers

In this blog post, we will look at the Vision Transformer architecture in detail and also re-implement it in PyTorch from scratch.

Jan 18, 2021

Aman Arora

The EfficientDet Architecture in PyTorch

Computer Vision

Model Architecture

Object Detection

In this blog post, we will look at how to implement the EfficientDet architecture in PyTorch from scratch.

Jan 13, 2021

Aman Arora

EfficientDet - Scalable and Efficient Object Detection

Computer Vision

Model Architecture

Object Detection

As part of this blog post I will explain how EfficientDets work step-by-step.

Jan 11, 2021

Aman Arora

GeM Pooling Explained with PyTorch Implementation and Introduction to Image Retrieval

Computer Vision

As part of this blog post we will be looking at GeM pooling and also look at the research paper Fine-tuning CNN Image Retrieval with No Human Annotation. We also implement GeM Pooling from scratch in PyTorch.

Aug 30, 2020

Aman Arora

U-Net A PyTorch Implementation in 60 lines of Code

U-Net Convolutional Networks for Biomedical Image Segmentation

Computer Vision

Model Architecture

Image Segmentation

As part of this blog post we will implement the U-Net architecture in PyTorch in 60 lines of code.

Aug 30, 2020

Aman Arora

SIIM-ISIC Melanoma Classification - my journey to a top 5% solution and first silver medal on Kaggle

Winning solution for SIIM-ISIC Melanoma Classification

Computer Vision

Kaggle

As part of this blog post I share my winning solution for SIIM-ISIC Melanoma Classification Kaggle Competition.

Aug 23, 2020

Aman Arora

EfficientNet

Rethinking Model Scaling for Convolutional Neural Networks

Computer Vision

Model Architecture

Look at the current SOTA, with top-1 accuracy of 88.5% on ImageNet.

Aug 13, 2020

Aman Arora

Group Normalization

Computer Vision

In this blog post, we will look at the Group Normalization research paper and also implement Group Normalization in PyTorch from scratch.

Aug 9, 2020

Aman Arora

DenseNet Architecture Explained with PyTorch Implementation from TorchVision

Densely Connected Convolutional Networks

Programming

Computer Vision

Model Architecture

In this blog post, we introduce dense blocks, transition layers and look at the TorchVision implementation of DenseNet step-by-step.

Aug 2, 2020

Aman Arora

Squeeze and Excitation Networks Explained with PyTorch Implementation

Squeeze-and-Excitation Networks

Computer Vision

Model Architecture

In this blogpost, we re-implement Squeeze-and-Excitation networks in PyTorch step by step, with very minor updates to the ResNet implementation in torchvision.

Jul 24, 2020

Aman Arora

Label Smoothing Explained using Microsoft Excel

Computer Vision

In this blogpost, we re-implement Label Smoothing in Microsoft Excel step by step.

Jul 18, 2020

Aman Arora

An introduction to PyTorch Lightning with comparisons to PyTorch

Programming

Computer Vision

In this blogpost, we go through an introduction to PyTorch Lightning and implement useful tricks such as gradient accumulation, 16-bit precision training, and TPU/multi-GPU support, all in a few lines of code. We use PyTorch Lightning to work on the SIIM-ISIC Melanoma Classification challenge on Kaggle.

Jul 12, 2020

Aman Arora

What is Focal Loss and when should you use it?

Computer Vision

Loss Function

In this blogpost, we will understand what Focal Loss is and when it is used. We will also dive into its math and implement it step by step in PyTorch.

Jun 29, 2020

Aman Arora

The Annotated GPT-2

Better language models and their implications

NLP

Transformers

This post presents an annotated version of the paper in the form of a line-by-line implementation in PyTorch. This document itself is a working notebook, and should be a completely usable implementation.

Feb 18, 2020

Aman Arora
