⚡ personal web world · local models

Ashesh.Kaji

About Me

I'm an AI Engineer currently pursuing an MS in Computer Engineering at NYU Tandon, building on a BS with Honors in Cognitive Science (ML & Neural Computation) from UC San Diego.

My work spans the full ML lifecycle — from designing RAG pipelines and fine-tuning LLMs for production at SageX Global and UniQreate, to conducting neuroscience research on neuroimaging and iron metabolism at UCSD. I'm especially drawn to problems at the intersection of efficient ML inference, quantized models, hardware-aware optimization, and autonomous systems.

When I'm not training models, you'll find me building Rust CLI tools, hacking on FPGA-based ZKP accelerators, or contributing to open-source projects like MediaSync (a media library pipeline) and stock portfolio optimization research.

3+ Years in AI/ML
12 Public Repos
3 Languages (EN/GU/HI)
MS+BS Degrees

Experience

01/2026 – PRESENT
Consulting AI Engineer
SageX Global · Remote
  • Consulting on AI tooling pipelines spanning LLM deployment, data transformation, and production MLOps
09/2024 – 01/2026
Artificial Intelligence Engineer
SageX Global · Remote
  • Designed pipelines for custom data transforms and fine-tuning strategies across varied data lifecycle requirements
  • Led development of major products: LLMs, SLMs, statistical mapping models, and semantic database memory retrieval
  • Stack: PyTorch, Scikit-Learn, Azure, AWS, Python, MLOps, NLP, RAG, LLMs, LangChain, AI-SDK, ACP, MCP servers, multimodal data pipelines
09/2023 – 07/2024
Machine Learning Intern
UniQreate · Remote
  • Designed and implemented data extraction pipelines and chat interfaces using LLMs and vector databases
  • Developed a Retrieval-Augmented Generation (RAG) product that went into production with Azure integration and serverless deployment of locally hosted LLMs
  • Owned architecture design, code implementation, rapid versioning, and production testing
10/2022 – 06/2025
Undergraduate Research Assistant
UC San Diego · Dr. Mary Boyle's Lab
  • Studied the relationship between peripheral iron levels, NAFLD, and neurodegenerative disorders using UK Biobank data
  • Explored the effect of metal exposure from vapes on MRI scans of subjects from the ABCD study
  • Stack: Python, CLI, NumPy, Pandas, Statsmodels API, epidemiology

Education

MS in Computer Engineering
NYU Tandon School of Engineering
01/2026 – 12/2027 (expected)
Graduate studies focusing on systems, ML infrastructure, and hardware-software co-design.
BS with Honors in Cognitive Science
UC San Diego
09/2021 – 06/2025
Specialization in Machine Learning and Neural Computation. Honors thesis on the effect of metal exposure from vapes on brain imaging. Completed advanced coursework in ML methods, deep learning, and neurophysiology.
International Baccalaureate Diploma
SVKM's JV Parekh International School
2019 – 2021 · Score: 39/45
Rigorous pre-university program with advanced coursework across sciences, mathematics, and humanities.

Projects & Open Source


Skills

Python · PyTorch · Rust · LLMs · RAG · NLP · MLOps · Scikit-Learn · NumPy · Pandas · Azure · AWS · Docker · Git · Linux / CLI · SQL · Vector Databases · WebAssembly · FPGA / Hardware · Statsmodels · Seaborn

Ask Me Anything

This chat attempts to run the real onnx-community/Bonsai-1.7B-ONNX model with q1 1-bit weights directly in your browser through WebGPU. No server. No API calls. No data leaves your machine. If WebGPU or the model fails, the interface reports the exact failure and does not fabricate a response.
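
As a rough illustration of the "report the exact failure" behavior described above, the WebGPU check might look something like the sketch below. It is not the page's actual code: `checkWebGpu`, `WebGpuCheck`, and `GpuLike` are hypothetical names introduced here, and the only real browser API assumed is `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable adapter is available.

```typescript
// Hypothetical result shape: `detail` carries the exact failure text
// so the interface can display it instead of inventing a response.
interface WebGpuCheck {
  ok: boolean;
  detail: string;
}

// Minimal structural stand-in for the browser's `navigator.gpu` object.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

// Takes a navigator-like object so the check can be exercised outside a browser.
async function checkWebGpu(nav: { gpu?: GpuLike }): Promise<WebGpuCheck> {
  if (!nav.gpu) {
    return { ok: false, detail: "navigator.gpu is undefined (WebGPU not available)" };
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    if (!adapter) {
      return { ok: false, detail: "requestAdapter() returned null (no suitable GPU adapter)" };
    }
    return { ok: true, detail: "WebGPU adapter acquired" };
  } catch (err) {
    // Surface the original error message unchanged.
    return { ok: false, detail: err instanceof Error ? err.message : String(err) };
  }
}
```

In a browser, the argument would be the global `navigator` object (typing it may require WebGPU type definitions), and model loading would only begin once `ok` is true.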

WebGPU Required · q1 1-bit Quantization · Real ONNX Model · 100% Client-Side · No Simulated Answers
Ashesh Kaji (Bonsai 1.7B)

Hey — this is the real browser-local Bonsai interface, not a canned chatbot.

Click "Load real Bonsai" to check WebGPU, load the q1 ONNX model, and enable local generation. If anything fails, the exact error stays visible here.

The model receives only a small factual profile context from this page and is instructed not to invent unknown details.

Pass condition: WebGPU ready + real model generation event.
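
One way the profile-context constraint mentioned above could be set up is by packing the facts into a system message. This is a sketch under assumptions, not the page's actual prompt: `buildMessages` and `ChatMessage` are hypothetical names, and the instruction wording is illustrative only.

```typescript
// Hypothetical chat message shape, loosely following the common role/content convention.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Builds a two-message prompt: the facts plus a "do not invent" instruction,
// followed by the visitor's question.
function buildMessages(profileFacts: string[], question: string): ChatMessage[] {
  const system = [
    "Answer using only the profile facts below.",
    "If the answer is not in the facts, say you do not know.",
    "",
    ...profileFacts.map((fact) => `- ${fact}`),
  ].join("\n");

  return [
    { role: "system", content: system },
    { role: "user", content: question.trim() },
  ];
}
```

Keeping the facts in a short bulleted system message keeps the prompt small, which matters for a 1.7B model running locally.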