Mohammad Ausaf

get to know me

I'm a software engineer in Bengaluru at Galleri5 (part of Collective Artists Network). I build generative AI systems and the infrastructure they run on. I joined when the tech team was about 10 people, and as one of four engineers on the AI stack I work the whole depth of it: ML research and model training at one end, distributed systems and production ops at the other. Most of that has come in my first 3 years.

Most of what I do is architect and build our canvas: an agentic system that turns a plain creative brief into finished images, video, and audio. Under it is a harness that drives generation end to end. You describe what you want, and it plans the work: picks the right model for each step, holds one route per turn so nothing drifts, and checks with you before it burns real compute. Give it a video brief and it goes all the way to a stitched cut. I'm as much an infra person as a dev, so I own what runs under it too: the H100 and RunPod fleet, the job dispatch and crash recovery, the deploys across GCP, Azure, and AWS.

The pipelines I built went into Mahabharat - Ek Dharmayudh on JioHotstar, and the AI teaser for Chiranjeevi Hanuman: The Eternal, which premiered on IMAX and did 8M+ views on YouTube and 12M+ on Instagram. I'm also co-developing an AI asset platform with Microsoft.

The canvas is only the last year. Before it I was hands-on with the models themselves: image and video pipelines in ComfyUI and PyTorch, CUDA where it mattered, LoRAs trained for character consistency and style transfer. That's the craft the canvas orchestrates now, and I still spend a lot of my time in it on the R&D side. Earlier than that it was classical ML, the NLP and vector search behind product discovery for Myntra and Ajio.


Tech Stack

Python

FastAPI

PyTorch

MongoDB

Redis

Docker

GCP

Azure

AWS

Firebase

Pinecone

ComfyUI

CLIP

LangChain

Google Gemini

OpenAI

Replicate

FAL AI

Agentic AI

Generative AI

Distributed Systems

experience

software engineer (ai)

Galleri5 - Bengaluru

Dec 2023 - Present

Agentic AI Platform: Architected and built AI Studio, a multi-tenant generative-media system (RBAC + enterprise permissions) where a canvas agent turns a creative brief into finished images and video, planning and running the generation itself. It commits to one route per turn, gates expensive generation behind human approval, and picks its models and pipeline shape from a live capability catalog tuned in config rather than hardcoded. Large batches are authored in two passes so results stream while the batch fills
Microsoft Collaboration: Co-developing an AI asset platform with Microsoft. Designed a multi-layer architecture handling versioned storage, semantic retrieval, automated quality evaluation, and preference-aware ranking. Implemented real-time review workflows with human feedback integration for creative iteration
Distributed Infrastructure: Built an Inference Gateway orchestrating 81+ models across 8 providers with weighted round-robin (WRR) fair scheduling and 4-level concurrency control. Designed a GPU Dispatcher managing an 8 H100 GPU server fleet with lease-based locking, priority queuing (Redis sorted sets), and crash recovery via a write-ahead log (WAL)
ML Pipelines: Built end-to-end ML pipelines serving major e-commerce clients (Myntra, Ajio, H&M). Developed CLIP-based vectorization for semantic product search across millions of SKUs, text classification with sentence transformers (BGE, mxbai-embed), and real-time vector upsertion with Pinecone
R&D & Prototyping: Researched and prototyped SOTA image/video generation techniques. Built custom ComfyUI workflows for client-specific image generation. Developed image generation evaluation loops using Gemini Vision for automated quality assessment
Team Leadership: Technical lead for a 3-person junior engineering team. Established code review processes, JIRA workflows, and onboarding practices
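
The fair scheduling named in the Distributed Infrastructure item can be illustrated with a smooth weighted round-robin. This is a minimal sketch of the general technique, not the gateway's actual code: `Provider`, `pick`, and the weights are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str         # hypothetical provider label
    weight: int       # share of traffic this provider should receive
    current: int = 0  # running credit tracked by the scheduler


def pick(providers: list[Provider]) -> Provider:
    """Smooth weighted round-robin: returns one provider per call."""
    total = sum(p.weight for p in providers)
    # Every provider earns credit in proportion to its weight.
    for p in providers:
        p.current += p.weight
    # The provider holding the most credit wins this round...
    chosen = max(providers, key=lambda p: p.current)
    # ...and pays back the total, so it cannot monopolise the queue.
    chosen.current -= total
    return chosen


# Illustrative weights only: "a" should get three picks for every one of "b".
providers = [Provider("a", 3), Provider("b", 1)]
order = [pick(providers).name for _ in range(8)]
```

Over any full cycle each provider is picked in proportion to its weight, and the picks are interleaved rather than served in long bursts, which is why this variant is often preferred for fairness across uneven backends.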

Tech Stack: Python, FastAPI, MongoDB, Redis, AWS, GCP, Azure, Firebase, PyTorch, CLIP, Pinecone, ComfyUI, Docker, LangChain, OpenAI, Gemini, Replicate, FAL AI

open-source systems tooling

cfgit - version control for live database records

2026

Record history: Built a Python tool for status, diff, log, adopt, tag, and restore on records that stay in MongoDB or Postgres
Drift checks: Kept hashing, diffing, rollback, and stale-write detection in a DB-neutral core with datastore-specific adapters
Shared actions: Exposed the same operations through CLI JSON, localhost UI, MCP tools, and portable agent instructions
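
The hashing, diffing, and stale-write detection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not cfgit's real API: the function names are hypothetical, and a plain dict stands in for a MongoDB or Postgres adapter.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """Stable hash of a record: keys are sorted so field order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff(old: dict, new: dict) -> dict:
    """Top-level field diff: added, removed, and changed keys."""
    return {
        "added": {k: new[k] for k in new.keys() - old.keys()},
        "removed": {k: old[k] for k in old.keys() - new.keys()},
        "changed": {
            k: (old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]
        },
    }


class StaleWriteError(Exception):
    """Raised when the live record changed after the caller read it."""


def safe_write(store: dict, key: str, new: dict, expected_hash: str) -> None:
    """Write only if the live record still matches the hash the caller last saw."""
    if record_hash(store[key]) != expected_hash:
        raise StaleWriteError(f"{key} changed since it was read")
    store[key] = new


# Hypothetical in-memory "datastore" with one config record.
store = {"pricing": {"tier": "pro", "limit": 10}}
seen = record_hash(store["pricing"])
safe_write(store, "pricing", {"tier": "pro", "limit": 20}, seen)
```

Keeping these functions free of any database import is what lets the same logic sit behind different datastore adapters.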

Tech Stack: Python, MongoDB, Postgres, MCP, CLI tooling, Localhost UI

bachelor of technology - computer science

KIET Group of Institutions, Ghaziabad

2024

Relevant Coursework: Probability and Statistics, Calculus, Operating Systems, Data Structures and Algorithms, Machine Learning, Databases

work

ai studio, agentic generative-media system

The canvas I architect and build at Galleri5. A creative brief goes in, finished images, video, and audio come out. It plans the work itself: picks the right model for each step from a live catalog (81+ models across 8 providers), holds one route per turn so nothing drifts, and gates spend behind your approval. Big batches are authored in two passes so results start streaming in seconds. Shipped to JioHotstar and IMAX; the Hanuman teaser did 8M+ views on YouTube and 12M+ on Instagram. Technologies: FastAPI, MongoDB, Redis, LangChain, Agentic Systems, Computer Vision.
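
To make the catalog-driven planning and approval gate concrete, here is a small sketch. It is not the AI Studio implementation: the catalog entries, the cost units, the "cheapest model that meets a quality bar" rule, and the `APPROVAL_THRESHOLD` value are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical catalog entries; the idea is that this lives in config, not code.
CATALOG = [
    {"model": "img-fast", "task": "image", "cost": 1, "quality": 2},
    {"model": "img-hq", "task": "image", "cost": 5, "quality": 5},
    {"model": "vid-std", "task": "video", "cost": 20, "quality": 4},
]

APPROVAL_THRESHOLD = 10  # assumed cost above which a human must confirm


@dataclass
class Step:
    task: str
    model: str
    cost: int
    needs_approval: bool


def plan_step(task: str, min_quality: int) -> Step:
    """Pick the cheapest catalog model that meets the quality bar for a task."""
    candidates = [
        m for m in CATALOG if m["task"] == task and m["quality"] >= min_quality
    ]
    if not candidates:
        raise LookupError(f"no model for task={task!r} at quality>={min_quality}")
    best = min(candidates, key=lambda m: m["cost"])
    return Step(task, best["model"], best["cost"], best["cost"] > APPROVAL_THRESHOLD)


def run(steps: list[Step], approved: bool) -> list[str]:
    """Run steps in order, stopping before the first unapproved expensive step."""
    done = []
    for step in steps:
        if step.needs_approval and not approved:
            break  # wait for the user before spending real compute
        done.append(step.model)
    return done


plan = [plan_step("image", 4), plan_step("video", 1)]
```

Calling `run(plan, approved=False)` completes the cheap image step and pauses before the video step; swapping a model in or out only means editing `CATALOG`.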

cfgit - version control for live database records

Git, but for the live config rows in your database. Version, diff, drift-detect, and restore records that stay in MongoDB or Postgres, without exporting them to files. History, tags, a localhost UI, and MCP tools so an agent can safely edit config too. Technologies: Python, MongoDB, Postgres, MCP, CLI tooling, Localhost UI.

searchy - local ai photo search for mac

Search your Mac photos by describing them, fully on-device, nothing leaves your machine. SigLIP/CLIP semantic search with OCR hybrid ranking, so a screenshot of "invoice 2024" ranks for that query. Face grouping, duplicate detection, external volumes, and GPU-accelerated indexing on Metal/MPS. Shipped as a signed Mac app. Technologies: Swift, SwiftUI, Python, SigLIP/CLIP, OCR, Metal/MPS.
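
The hybrid ranking idea can be sketched as a weighted blend of embedding similarity and OCR text overlap. This is an illustrative sketch, not searchy's code: the two-dimensional vectors stand in for real SigLIP/CLIP embeddings, and `alpha`, `ocr_score`, and `hybrid_rank` are hypothetical names and values.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def ocr_score(query: str, ocr_text: str) -> float:
    """Fraction of query tokens that appear in the photo's OCR text."""
    tokens = query.lower().split()
    words = set(ocr_text.lower().split())
    return sum(t in words for t in tokens) / len(tokens) if tokens else 0.0


def hybrid_rank(query, query_vec, photos, alpha=0.6):
    """Blend semantic similarity with OCR overlap; alpha is an assumed weight."""
    scored = [
        (
            alpha * cosine(query_vec, p["vec"])
            + (1 - alpha) * ocr_score(query, p["ocr"]),
            p["id"],
        )
        for p in photos
    ]
    return [photo_id for _, photo_id in sorted(scored, reverse=True)]


# Toy data: the screenshot is less similar visually but contains the query text.
photos = [
    {"id": "beach", "vec": [1.0, 0.0], "ocr": ""},
    {"id": "invoice_shot", "vec": [0.6, 0.8], "ocr": "Invoice 2024 total due"},
]
ranked = hybrid_rank("invoice 2024", [1.0, 0.0], photos)
```

With the OCR term included, the screenshot outranks the visually closer photo; with `alpha=1.0` the ranking falls back to pure embedding similarity.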


boring.notch lyrics - macOS notch music widget

Fork of Boring Notch adding real-time synchronized lyrics display. Features multiple display modes (flowing, alternating, stacked), per-monitor configuration, and timing offset controls. Supports Spotify, Apple Music, and YouTube Music. Technologies: Swift, SwiftUI, macOS APIs, Music Integration.

deep learning based medicinal plants detection

This project set out to identify medicinal plants with a deep learning model. It used ResNet, a convolutional neural network (CNN) architecture, and reached a validation accuracy of 98%.
