• Aman Arora’s Blog

Building a user facing not-for-profit chatbot for a Hindu Temple

A Step-by-Step Guide on building a user facing chatbot with proper evals, logging and monitoring
LLM
This blogpost walks you through the process of building a user facing chatbot using a real-world case study of a WhatsApp chatbot for a Hindu Temple. Discover best practices in development, implementation, and crucially, how to properly evaluate your AI application to ensure its effectiveness and reliability.
Jul 28, 2024
Aman Arora

Gemma 2

Improving Open Language Models at a Practical Size
LLM
In this post, we take a deep dive into the architectural components of Gemma 2 such as Grouped Query Attention, Sliding Window Attention, RoPE Embeddings, Logit soft-capping & Model-merging!
Jul 9, 2024
Aman Arora

Sliding Window Attention

Longformer - The Long-Document Transformer
LLM
In this post, we take a deep dive into Sliding Window Attention, which allows transformers to handle long context lengths. We do this with the help of animations and also implement it from scratch in PyTorch.
Jul 4, 2024
Aman Arora

Image retrieval app using Apple’s 4M-21 any-to-any vision model

4M-21 An Any-to-Any Vision Model for Tens of Tasks and Modalities
VLM
As part of this blog post, we build an image retrieval app that takes three inputs (caption, brightness, and number of items per image) and retrieves the most similar image from a database based on their values.
Jul 1, 2024
Aman Arora

Support bot with Claude 3.5 Sonnet using Claudette and Slack-SDK

Creating a support bot that supports API calls using Claudette
LLM
As part of this blog post, we will build a support bot on Slack that can respond to queries in a Slack channel using Claudette (a thin Python wrapper on top of Anthropic's Python SDK).
Jun 22, 2024
Aman Arora

Demystifying Document Question-Answering Chatbot - A Comprehensive Step-by-Step Tutorial with LangChain

Embark on an enlightening journey through the world of document-based question-answering chatbots using LangChain! With a keen focus on detailed explanations and code walk-throughs, you’ll gain a deep understanding of each component, from creating a vector database to response generation.
Jul 28, 2023
Aman Arora

Deciphering LangChain: A Deep Dive into Code Complexity

Analyzing LangChain’s source code reveals impressive modularity but also surprising complexity in executing simple text generation. The deep call stack makes tracing execution flow challenging.
Jul 25, 2023
Aman Arora

Ahead of Times - Issue 2 (May 01 - May 07)

Second issue of the weekly newsletter to help you stay ahead of the times with latest news & updates in the field of AI.
Newsletter
As part of this newsletter, I share with you key updates, projects, GitHub repos, research trends, research papers in the field of Computer Vision, Large Language Models and Stable Diffusion.
May 8, 2023
Aman Arora

Ahead of Times - Issue 1 (Apr 24 - Apr 30)

First issue of the weekly newsletter to help you stay ahead of the times with latest news & updates in the field of AI.
Newsletter
As part of this newsletter, I share with you key updates, projects, GitHub repos, research trends, research papers in the field of Computer Vision, Large Language Models and Stable Diffusion.
May 2, 2023
Aman Arora

Paper Review - ‘LaMini-LM’

Paper review of “LaMini-LM - A Diverse Herd of Distilled Models from Large-Scale Instructions” and analysis of the released 2.58M instruction dataset.
Large Language Models
Paper Review
As part of this blog post, we regenerate a small sample of the 2.58M shared Instruction Dataset and also perform human evaluation on some of the generated models shared in the research paper.
May 1, 2023
Aman Arora

The Annotated CLIP (Part-2)

Learning Transferable Visual Models From Natural Language Supervision
Multimodal
Transformers
Clip
This post is part 2 of a two-part blog series on CLIP (for part 1, please refer to my previous blog post). In this post, we present the PyTorch code behind CLIP for model building and training. This blog post is itself a working Jupyter Notebook.
Mar 11, 2023
Aman Arora

The Annotated CLIP (Part-1)

Learning Transferable Visual Models From Natural Language Supervision
Multimodal
Transformers
This post is part 1 of a two-part blog series on CLIP. In this post, we present an introduction to CLIP in an easy-to-digest manner. We also compare CLIP to other research papers and look at the background and inspiration behind CLIP.
Mar 3, 2023
Aman Arora

Swin Transformer

Hierarchical Vision Transformer using Shifted Windows
Computer Vision
Model Architecture
Transformers
Swin Transformer Model Architecture explained with PyTorch implementation line-by-line.
Jul 4, 2022
Aman Arora

The Annotated DETR

End-to-End Object Detection with Transformers
Computer Vision
Model Architecture
Object Detection
Transformers
DETR Model Architecture explained with PyTorch implementation line-by-line.
Jul 26, 2021
Aman Arora

The sad state of AI and tech startups in Australia today and what can we do about it

AI
Jeremy Howard
“Did you know that Australia’s investment in AI was only 0.29% of the total investment in AI in 2020? This must explain the sad state of things here in Australia when it…
May 15, 2021
Aman Arora

Adam and friends

Adam, SGD, RMSProp from scratch in PyTorch.
Computer Vision
Basic optimizers from scratch in PyTorch with working notebook.
Mar 13, 2021
Aman Arora

Vision Transformer

An Image is Worth 16x16 Words - Transformers for Image Recognition at Scale
Computer Vision
Model Architecture
Transformers
In this blog post, we will look at the Vision Transformer architecture in detail and also re-implement it in PyTorch from scratch.
Jan 18, 2021
Aman Arora

The EfficientDet Architecture in PyTorch

Computer Vision
Model Architecture
Object Detection
In this blog post, we will look at how to implement the EfficientDet architecture in PyTorch from scratch.
Jan 13, 2021
Aman Arora

EfficientDet - Scalable and Efficient Object Detection

Computer Vision
Model Architecture
Object Detection
As part of this blog post I will explain how EfficientDets work step-by-step.
Jan 11, 2021
Aman Arora

Top 100 solution - SIIM-ACR Pneumothorax Segmentation

Computer Vision
Kaggle
Image Segmentation
In this blog post, we will look at an image segmentation problem in PyTorch, with the SIIM-ACR Pneumothorax Segmentation competition serving as a useful example, and create a solution that gets us to a top-100 leaderboard position on Kaggle.
Sep 6, 2020
Aman Arora

GeM Pooling Explained with PyTorch Implementation and Introduction to Image Retrieval

Computer Vision
As part of this blog post, we will look at GeM pooling and the research paper “Fine-tuning CNN Image Retrieval with No Human Annotation”. We also implement GeM Pooling from scratch in PyTorch.
Aug 30, 2020
Aman Arora

U-Net A PyTorch Implementation in 60 lines of Code

U-Net Convolutional Networks for Biomedical Image Segmentation
Computer Vision
Model Architecture
Image Segmentation
As part of this blog post we will implement the U-Net architecture in PyTorch in 60 lines of code.
Aug 30, 2020
Aman Arora

SIIM-ISIC Melanoma Classification - my journey to a top 5% solution and first silver medal on Kaggle

Winning solution for SIIM-ISIC Melanoma Classification
Computer Vision
Kaggle
As part of this blog post I share my winning solution for SIIM-ISIC Melanoma Classification Kaggle Competition.
Aug 23, 2020
Aman Arora

EfficientNet

Rethinking Model Scaling for Convolutional Neural Networks
Computer Vision
Model Architecture
Look at the current SOTA, with top-1 accuracy of 88.5% on ImageNet.
Aug 13, 2020
Aman Arora

Group Normalization

Computer Vision
In this blog post, we will look at the Group Normalization research paper and also implement Group Normalization in PyTorch from scratch.
Aug 9, 2020
Aman Arora

DenseNet Architecture Explained with PyTorch Implementation from TorchVision

Densely Connected Convolutional Networks
Programming
Computer Vision
Model Architecture
In this blog post, we introduce dense blocks, transition layers and look at the TorchVision implementation of DenseNet step-by-step.
Aug 2, 2020
Aman Arora

Squeeze and Excitation Networks Explained with PyTorch Implementation

Squeeze-and-Excitation Networks
Computer Vision
Model Architecture
In this blogpost, we re-implement Squeeze-and-Excitation networks in PyTorch step by step, with very minor updates to the ResNet implementation in torchvision.
Jul 24, 2020
Aman Arora

Label Smoothing Explained using Microsoft Excel

Better language models and their implications
Computer Vision
In this blogpost, we re-implement Label Smoothing in Microsoft Excel step by step.
Jul 18, 2020
Aman Arora

An introduction to PyTorch Lightning with comparisons to PyTorch

Better language models and their implications
Programming
Computer Vision
In this blogpost, we go through an introduction to PyTorch Lightning and implement all the cool tricks, such as gradient accumulation, 16-bit precision training, and TPU/multi-GPU support, all in a few lines of code. We will use PyTorch Lightning to work on the SIIM-ISIC Melanoma Classification challenge on Kaggle.
Jul 12, 2020
Aman Arora

What is Focal Loss and when should you use it?

Better language models and their implications
Computer Vision
Loss Function
In this blogpost, we will understand what Focal Loss is and when it is used. We will also dive into its math and implement it step by step in PyTorch.
Jun 29, 2020
Aman Arora

The Annotated GPT-2

Better language models and their implications
NLP
Transformers
This post presents an annotated version of the paper in the form of a line-by-line implementation in PyTorch. This document itself is a working notebook, and should be a completely usable implementation.
Feb 18, 2020
Aman Arora
