Aru Sharma
Open to opportunities! I am actively looking for full-time positions and collaborations in AI/ML. Here is my Resume.

B.E. Information Technology Student at
University Institute of Engineering and Technology, PU

About Me: I am passionate about building intelligent systems that bridge the gap between human communication and machine understanding. I build multimodal AI systems, contribute to open-source projects, and explore the intersection of NLP and Computer Vision. My approach to engineering is to study where current solutions fall short in real-world applications, and develop practical improvements that make AI more accessible and useful.

Experience: My journey involves significant contributions to open-source ecosystems. I worked as an OSS contributor at Google Summer of Code with the Mifos Initiative, developing multi-agent bots. I interned at Summer of Bitcoin, contributing to Bitcoin Transcripts. I was also an LFX Mentee at CNCF WasmEdge and contributed to DocETL at UC Berkeley's EPIC Lab. I have also worked as an AI Engineer with Bennett Legal and HomeHive AI, and conducted risk analysis for crypto tokens.

Community: I lead the OSS club (Pclub) at my college to promote open source. I have hosted events such as Software Freedom Day and OSS hackathons such as FOSSHACK, and I started AISOC to help students learn how to begin contributing to OSS.

Achievements: Selected for the first edition of ESOC'25 under the Open-Source AI for Drug Discovery project. Ranked 15th globally in the NTIRE Image Dehazing and Denoising challenge at CVPR 2024. Published research on Speech Emotion Recognition, accepted at the 16th ICCCNT 2025.

About

Education

B.E. Information Technology

University Institute of Engineering and Technology, PU

📍 Chandigarh, India
📅 2022-pursuing

Mathematics and Computer Science

Little Scholars, Kashipur (CBSE)

📅 2019-2021
95%

Current Focus

Interested in memory-augmented AI systems that can learn and evolve over time, much as humans do.

Currently building multimodal AI systems and exploring the intersection of AI with current software landscapes.

Mechanistic Interpretability, Voice Agents, Kernel Programming, Inference Optimisations

Experience

ML Engineering Intern

Nannie.ai

London, UK

Sep 2025 – Present
  • Tested and deployed SOTA vision algorithms for classification, segmentation, and pose detection.
  • Deployed OSS text-to-video generation models for in-house testing and benchmarking against Veo 3.

Contract AI Engineer

Deskree

Toronto, Canada

Nov 2025 – Jan 2026
  • Worked on Tetrix, building AI agents for infrastructure, including cloud services such as AWS.
  • Developed Tetrix CLI, a tool that reviews architecture and security issues and enforces code quality for a project.
Jun 2025 – Sep 2025
  • Developed a multi-agent bot that reports the status of Jira tickets and answers questions about Slack discussions.
  • Developed a full-stack web application using FastAPI and Next.js, with Firestore as the database and auth client.

Software Engineering Intern

Summer of Bitcoin - Bitcoin-dev-project

Manhattan, NY

May 2025 – Aug 2025
  • Designed and prototyped AI-assisted coding tools for Bitcoin using small language models and domain-specific Retrieval-Augmented Generation (RAG).
  • Developed data pipelines to ingest knowledge from Bitcoin developer calls, YouTube talks, IRC logs, mailing lists, and forums.

Open Source Collaborator

EPIC Lab - University of California Berkeley

Berkeley, CA

Oct 2024 – Jan 2025
  • Contributed user-defined functions, LLM-based data parsing, and OCR modules to improve the usability and capabilities of DocETL.
  • Added structured generation support for open-source model backends using Outlines.

LFX Mentee

CNCF WasmEdge

Austin, TX

Sep 2024 – Dec 2024
  • Developed a RAG-based chatbot for code assistance using open-source LLMs on the WasmEdge runtime.
  • Created a pipeline that ingests data from a GitHub repository, augments it with Q&A pairs and summaries, and embeds the result into a Qdrant vector database.

Projects

Long Horizon Reasoning Agents

Building a personalised agent that can reason over long horizons by remembering and recalling information from past interactions.

Key Features:

  • Implementation of the EverMemOS paper from first principles
  • Keyword-based and semantic retrieval combined with a reranking mechanism

Technologies:

Long Horizon Reasoning, Memory Systems

Multimodal Emotion Recognition

Implemented a multimodal emotion recognition system using late and gated fusion techniques on audio and video embeddings to classify emotional states.

Key Features:

  • Whisper-large-v3 for audio feature extraction
  • V-JEPA for video visual embedding extraction
  • Gated Fusion Network for combining modalities
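As a rough illustration of gated fusion, the sketch below blends two embeddings with a learned per-dimension gate: `g = sigmoid(W [a; v] + b)` and `fused = g * a + (1 - g) * v`. It is not the project's actual network. It assumes the audio and video embeddings have already been projected to the same dimension, and `gate_weights` and `gate_bias` are hypothetical stand-ins for learned parameters.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    audio: Sequence[float],
    video: Sequence[float],
    gate_weights: Sequence[Sequence[float]],  # hypothetical learned matrix, shape (d, 2d)
    gate_bias: Sequence[float],               # hypothetical learned bias, shape (d,)
) -> List[float]:
    """Blend two same-sized embeddings with a per-dimension gate."""
    if len(audio) != len(video):
        raise ValueError("embeddings must share a dimension before fusion")
    concat = list(audio) + list(video)
    fused = []
    for i, (a, v) in enumerate(zip(audio, video)):
        # The gate for dimension i looks at both modalities at once.
        logit = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        g = sigmoid(logit)
        # g near 1 trusts audio, g near 0 trusts video.
        fused.append(g * a + (1.0 - g) * v)
    return fused
```

With all-zero gate parameters every gate equals 0.5, so the output reduces to the plain average of the two embeddings; training would move the gates away from that neutral point.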

Technologies:

Multimodal Fusion, FFmpeg

Multi-Agent Research Tool

Developed an autonomous multi-agent system that facilitates interaction and collaboration of specialized agents to perform comprehensive research tasks.

Key Features:

  • DuckDuckGo Search Agent for web articles
  • ArXiv Agent for academic papers
  • Supervisor Agent for task coordination

Technologies:

LangGraph, OpenAI

Bitcoin-ASR-Bench

Benchmarked models from the Open ASR Leaderboard on transcribing talks from Bitcoin conferences, with GPU acceleration support.

Key Features:

  • Multi-model support and evaluation
  • GPU acceleration for efficient processing
  • Chunked processing for long audio files
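Chunked processing of long audio can be sketched as splitting the recording into fixed-length windows with a small overlap, so that words at a boundary appear in both neighbouring chunks. The helper below is only an illustration: the function name, the 30-second chunk length, and the 2-second overlap are assumptions, not values taken from the project.

```python
from typing import List, Tuple


def chunk_windows(
    duration_s: float,
    chunk_s: float = 30.0,   # assumed chunk length
    overlap_s: float = 2.0,  # assumed overlap between neighbouring chunks
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds covering the whole recording."""
    if chunk_s <= 0 or not 0 <= overlap_s < chunk_s:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")
    windows: List[Tuple[float, float]] = []
    step = chunk_s - overlap_s
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        windows.append((start, end))
        if end >= duration_s:
            break  # the final window already reaches the end of the audio
        start += step
    return windows
```

Each window could then be cut from the source file (for example with FFmpeg) and passed to the ASR model separately, with the overlapping regions reconciled when the transcripts are stitched together.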

Technologies:

Hugging Face, NVIDIA NeMo, FFmpeg

LLM-Perf-Bench

A weekend sprint project that benchmarks the performance of LLM inference providers.

Key Features:

  • Uses a snippet of the ShareGPT dataset for benchmarking
  • Compares latency and throughput across providers
  • Simulates real-world usage patterns using different concurrency levels
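One way to measure latency and throughput at a given concurrency level is to cap the number of in-flight requests with a semaphore and time each call. The sketch below is illustrative only: `fake_provider` is a hypothetical stand-in that sleeps instead of calling a real inference endpoint, and `run_level` is not the project's actual harness.

```python
import asyncio
import statistics
import time
from typing import Awaitable, Callable, Dict, List


async def fake_provider(prompt: str) -> str:
    """Hypothetical provider: simulates network and inference time."""
    await asyncio.sleep(0.01)
    return prompt.upper()


async def run_level(
    prompts: List[str],
    concurrency: int,
    call: Callable[[str], Awaitable[str]] = fake_provider,
) -> Dict[str, float]:
    """Send every prompt with at most `concurrency` requests in flight."""
    sem = asyncio.Semaphore(concurrency)
    latencies: List[float] = []

    async def one(prompt: str) -> None:
        async with sem:
            t0 = time.perf_counter()
            await call(prompt)
            # Latency excludes time spent waiting for a semaphore slot.
            latencies.append(time.perf_counter() - t0)

    start = time.perf_counter()
    await asyncio.gather(*(one(p) for p in prompts))
    elapsed = time.perf_counter() - start
    return {
        "concurrency": concurrency,
        "requests": len(prompts),
        "mean_latency_s": statistics.mean(latencies),
        "throughput_rps": len(prompts) / elapsed,
    }
```

Running `asyncio.run(run_level(prompts, c))` for several values of `c` gives one row per concurrency level, which can then be compared across providers by swapping in a different `call`.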

Technologies:

vLLM, SGLang, ML-Infrastructure

vLLM hidden state extractor

Created a custom extractor for vLLM to extract hidden states from LLMs for downstream tasks.

Key Features:

  • Uses PyTorch forward hooks to extract hidden states from a specific layer.
  • Saves tensors in a GPU buffer that is released via TTL logic.
  • Lets a consumer process or thread read these tensors in near real time.
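The hook-plus-TTL pattern described above can be sketched without any GPU code. The example below is a simplified illustration, not the extractor itself: `TTLBuffer` and `make_forward_hook` are hypothetical names, the buffer stores arbitrary Python objects rather than GPU tensors, and releasing an entry simply drops the reference. The returned hook uses the `(module, inputs, output)` signature that PyTorch's `register_forward_hook` expects.

```python
import threading
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLBuffer:
    """Thread-safe store whose entries expire after `ttl_s` seconds."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, Any]] = {}
        self._lock = threading.Lock()

    def _evict_expired(self) -> None:
        now = self._clock()
        expired = [k for k, (deadline, _) in self._items.items() if deadline <= now]
        for key in expired:
            del self._items[key]  # dropping the reference releases the entry

    def put(self, key: str, value: Any) -> None:
        with self._lock:
            self._evict_expired()
            self._items[key] = (self._clock() + self._ttl_s, value)

    def pop(self, key: str) -> Optional[Any]:
        """Hand the value to a consumer once, or return None if missing or expired."""
        with self._lock:
            self._evict_expired()
            entry = self._items.pop(key, None)
            return None if entry is None else entry[1]

    def __len__(self) -> int:
        with self._lock:
            return len(self._items)


def make_forward_hook(buffer: TTLBuffer, request_id: str):
    """Build a hook that stores a layer's output under `request_id`."""

    def hook(module: Any, inputs: Any, output: Any) -> None:
        buffer.put(request_id, output)

    return hook
```

In a PyTorch setting the hook would be attached with something like `layer.register_forward_hook(make_forward_hook(buffer, request_id))`, and a consumer thread would call `buffer.pop(request_id)` shortly after the forward pass; entries nobody collects are dropped once their TTL passes.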

Technologies:

vLLM, PyTorch, GPU Programming

Contact

Get in Touch

+91 7452029206
Chandigarh, India
Available for Opportunities

Open to full-time positions and collaborations in AI/ML.
