初心 (Beginner's Mind)
I am primarily interested in Natural Language Processing, with a focus on AI Safety, Alignment, Reasoning, and Text-Watermarking. I obtained my Master's in Computer Science from the Georgia Institute of Technology and have industry experience as an Applied Scientist at Amazon. I have worked on a range of NLP problems, such as Semantic Parsing, Named Entity Recognition, Cross-Lingual Transfer Learning, and bias and fairness issues in Large Language Models (LLMs). More recently, I have worked on Red-Teaming LLMs, Text-Watermarking, and Hallucination Detection. I was also a co-organizer of the TrustNLP workshop at NAACL 2022 and have previously served as a committer at the Apache Software Foundation.
During undergrad, I participated in the Google Summer of Code program, where I worked on PhyloGeoRef, an open-source project for visualizing phylogenetic trees in Google Earth. After the program ended, I started contributing to Apache Hama, an open-source implementation of Google's Pregel paper (a minimal sketch of its vertex-centric model appears below). This experience helped me build a solid foundation in concurrency and synchronization primitives in distributed systems.
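Pregel-style systems such as Hama follow the bulk-synchronous parallel (BSP) model: in each superstep, every active vertex runs a user-defined compute function, messages sent in superstep t are delivered after a global barrier at superstep t+1, and the job halts once no vertex has work and no messages are in flight. Below is a minimal, single-process Python sketch of that model, not Hama's actual Java API; the names (`Vertex`, `run_supersteps`, `sssp`) are mine for illustration, and a vertex here implicitly votes to halt after each superstep and is reactivated by incoming messages.

```python
class Vertex:
    def __init__(self, vid, value, neighbors):
        self.id = vid
        self.value = value
        self.neighbors = neighbors   # outgoing edge targets (vertex ids)
        self.inbox = []              # messages delivered at the last barrier

def run_supersteps(vertices, compute, max_steps=30):
    """Bulk-synchronous execution: every vertex with pending input computes,
    then a global barrier delivers messages for the next superstep."""
    for step in range(max_steps):
        outbox = {vid: [] for vid in vertices}
        for v in vertices.values():
            if step == 0 or v.inbox:          # halted vertices wake on new messages
                compute(v, step, outbox)
        if all(not msgs for msgs in outbox.values()):
            break                             # everyone halted, nothing in flight
        for v in vertices.values():           # the barrier: deliver messages
            v.inbox = outbox[v.id]

def sssp(v, step, outbox):
    """Single-source shortest paths (unit edge weights) from vertex "a"."""
    candidate = 0 if (step == 0 and v.id == "a") else min(v.inbox, default=float("inf"))
    if candidate < v.value:                   # found a shorter path: update and notify
        v.value = candidate
        for n in v.neighbors:
            outbox[n].append(v.value + 1)

# Usage: a tiny chain graph a -> b -> c; after convergence, g["c"].value == 2.
g = {vid: Vertex(vid, float("inf"), nbrs)
     for vid, nbrs in {"a": ["b"], "b": ["c"], "c": []}.items()}
run_supersteps(g, sssp)
```

The barrier between supersteps is what makes the synchronization tractable to reason about: within a superstep, a vertex only reads messages from the previous one, so there are no races on vertex state.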
I began my career as an engineer at Bloomreach, where I spent several years building infrastructure for large-scale data processing. I was fortunate to work with amazing ex-Google and ex-Microsoft engineers who generously mentored me and helped me grow into a strong engineer. That formative period instilled an engineering rigor that underpins my research methodology to this day.
In 2013, I learned about the ImageNet Challenge and the advances made by Geoffrey Hinton's group. To build a better understanding of deep learning and machine learning theory, I decided to apply to graduate school. I enrolled at Georgia Tech in 2015, after taking some time to save up funds, explore my interests, research the graduate application process, and improve my English for the GRE and TOEFL.
During my time at Georgia Tech, I had the opportunity to take the first deep-learning course offered on campus, where I built a system to detect skin cancer from images of skin lesions. Back in 2016, when deep learning was still gaining traction, the system achieved 89% accuracy, comparable to that of a human expert; the models were trained with the Lua-based Torch framework. I also collaborated on two research papers with Professor Le Song's group, on detecting changes in dynamic events over networks (Li et al., 2017) and guiding information diffusion in online networks (Wang et al., 2018). Other projects from this period included multi-hop question answering using Key-Value Memory Networks and epileptic seizure detection with LSTMs.
After graduating, I joined Amazon in Boston as an Applied Scientist on the Alexa innovations team. From 2017 to 2018, I focused on developing core pipelines for scaling out Alexa NLU model training. From 2018 to 2020, I worked on an LSTM-CRF-based model for hierarchical parsing of long-form Alexa utterances. Due to visa issues, I had to move back to India in 2020, where I continued working for Amazon on model distillation and fine-tuning for multilingual Hindi models. 2021 was a tough year in India because of the pandemic. I returned to Amazon Boston in 2022 to join the Alexa Trustworthy AI team, where I contributed to multiple initiatives on evaluating and mitigating biases in Large Language Models (LLMs) (Soltan et al., 2022; Gupta et al., 2022; Krishna et al., 2022).
In 2023, after the large-scale layoffs at Alexa, I joined a promising startup. As a technology worker, I was lucky to keep my job throughout the pandemic despite visa challenges, and I now empathize with those who lost theirs. The news saddened me, as I strongly believed in the Alexa vision, but I view this as a chance to focus on my health and explore new areas of machine learning.
In 2024, I joined Bloomberg as a Senior Machine Learning Engineer and also enrolled in a doctoral program at NJIT. I am quite happy with the work-life balance at Bloomberg and its focus on philanthropy. One of my recent achievements is a comprehensive paper on Red-Teaming attacks against LLMs, which provides an extensive review of attack methods and defense mechanisms, as well as strategies for successful red-teaming (Verma et al., 2024).
My past research has focused on (1) Model Alignment, (2) Text-Watermarking, (3) Hallucination in LLMs, and (4) AI Safety. More recently, I have also become interested in hyperparameter transfer and reinforcement learning (RL) for LLMs. I always enjoy learning about the work of fellow researchers and exploring potential intersections for collaboration. Pursuing a part-time PhD affords me the privilege of operating somewhat like the subjects of Alec Foege's "The Tinkerers": adopting a curiosity-driven approach to research over the immediate pressures of the "publish-or-perish" cycle. I am not entirely immune to those pressures, but I try to be mindful of Claude Shannon's caution against the "research bandwagon" and to focus on research that is useful and expands our understanding. If you would like to explore a potential research collaboration, feel free to reach out via email or Twitter DM.