About
I am a Machine Learning/AI Engineer. I started Scaled Focus to help companies evaluate, optimize and deploy agents.
I have written a book on Natural Language Processing for software engineers.
I have also contributed to Ragas, built FastEmbed and maintained Awesome NLP.
Book
Book: NLP in Python: Quickstart Guide has sold over 5000 copies.
I wrote this book in 2018 to make Natural Language Processing more accessible for software engineers and programmers. The accompanying code is available on GitHub.
Papers and Open Source Contributions
- Ragas
  - We helped improve the most widely used Agent Metrics layer. My contributions are here.
- FastEmbed
  - I built the first version of FastEmbed, an embedding library built for speed using ONNX. It is used by Qdrant, NVIDIA NeMo Guardrails, and more than 3000 other repositories.
- OpenAI Cookbook: Fine-tuned Retrieval Augmented Generation with Qdrant
- Hinglish (GitHub, paper): the paper, focussed on code-mixed languages, was published at ACL 2019.
- Awesome Project Ideas
  - Curated list of machine learning (mostly deep learning) project ideas with datasets. The ideas range from Vision, Text, and Forecasting to Recommender Systems.
- Awesome NLP
  Curated list of Natural Language Processing resources.
  - Recommended by Dr. Andrew Ng's (Stanford) CS 230
  - Featured in GitHub's official Machine Learning Collection since 2016
- 2018: State-of-the-art language modeling in Hindi, plus new datasets. The code is available at hindi2vec.
Recognitions
- Analytics Vidhya Magazine named me one of the Top 5 GenAI Scientists in India in 2023.
- Won the first-ever NLP-themed Kaggle Kernel Prize (2019): The Hitchhiker's Guide to NLP in spaCy
- My Jupyter Notebook best practices were appreciated by Nobel Laureate Dr. Paul Romer in 2018 (link).
Talks
- Fifth Elephant MLOps Conf 2021: Slides
- PyCon India 2019: Slides and YouTube
- InMobi Tech Talks: A Nightmare on the LM Street; Slides
- Wingify DevFest: NLP for Indian Languages; Slides, YouTube
- PyData Bengaluru Inaugural Talk: Quiz Generation with spaCy; YouTube
See more recognitions, competitions, and mentions
- Search and Informational Retrieval Ranking Challenge hosted by Bing AI Team (2019)
- FactorDaily's piece on [The great rush to data sciences in India](https://factordaily.com/rush-training-data-science-machine-learning-ai-india/) ends with a direct quote from me.
  - FactorDaily is a new-age news company that sits at the intersection of technology with life, culture, and society in India.
- First Runner-Up at the Future Group Datathon (March 2019)
  - A two-stage machine learning hackathon called [Tathastu](https://www.tathastu.ai/datathon), covering recommendation systems and item information extraction problems
- Opened AI Hackathon (2019)
  - Awesome NCERT won the Best use of IBM Watson API; [blog](https://medium.com/opened-ai/global-hackweek-winners-2017-a9e5da513270)
  - Idea: Find recent and relevant news articles for any NCERT chapter in sciences and social studies