Research Statement

I am primarily interested in Natural Language Processing, Machine Learning, and Distributed Systems. I have an MS from Georgia Tech and have worked on significant projects at Amazon Alexa, Bloomreach, and the Apache Software Foundation. While at Amazon, I focused my research on semantic parsing, cross-lingual transfer learning, multimodal learning, and bias and fairness in large language models. AI fairness and safety are especially important to me, and I was honored to serve as one of the core organizers of the TrustNLP workshop at NAACL 2022, which focused on this vital topic.

At the beginning of my career, I had the opportunity to participate in the Google Summer of Code program, where I worked on an open-source project. After the program ended, I continued contributing to open source by joining the Apache Hama community. This experience allowed me to build a solid foundation in working with synchronization primitives and large-scale distributed systems.

As I entered engineering, I was fortunate to have amazing mentors and peers who introduced me to technologies such as Staged Event-Driven Architecture, Map-Reduce, NoSQL databases, data warehousing, Information Retrieval, web services, and search. This foundational knowledge has been invaluable in transforming my research into practical production systems and utilizing software-engineering principles to solve complex problems.

In 2013, I learned about the ImageNet Large Scale Visual Recognition Challenge and the advancements made by Geoffrey Hinton's group. It became apparent that I needed a better understanding of deep learning theory, which prompted me to apply to graduate school. I enrolled at Georgia Tech in 2015, after taking some time to research the graduate application process and to save funds for my education in case I could not secure funding.

During my time at Georgia Tech, I had some memorable experiences. One of the highlights was taking part in the first deep-learning course offered on campus, where I built a system to detect skin cancer from skin lesions. In 2016, when deep learning was still in its early days, this system already achieved an accuracy of 89%, comparable to that of an expert human. I was fortunate to work alongside skilled researchers and to collaborate on two research papers with Professor Le Song's group. Our research focused on detecting changes in dynamic events (Li et al., 2017) and information diffusion in online networks (Wang et al., 2018). I also worked on other notable projects, including Multi-hop Question Answering, which used Key-Value Memory Networks and graph embeddings, and Epileptic Seizure detection through deep learning.

I started working at Amazon Boston in the Applied Science group in 2017. From 2017 to 2018, I focused on developing core pipelines to scale out Alexa NLU model training. From 2018 to 2020, I worked on a successful hierarchical parsing feature designed for multiple intents and voice-controlled routines. Due to visa issues, I moved to Amazon Bangalore in 2020 to work on multilingual learning in Hindi and Indian English. 2021 was an unusual year, and I returned to Amazon Boston in 2022 to join the Alexa Trustworthy AI team. Some papers that I contributed to during this time are as follows:

(1) Fairness Evaluation of AlexaTM 20B Model: Large-scale pre-trained language models (LLMs) can demonstrate undesirable biases. In this work, we conducted experiments to understand the potential harm of the AlexaTM 20B model in the context of representational bias (harmful negative generalization about a particular social group resulting from stereotyping that can propagate to model output and performance) and toxicity in an open-ended generation setting. We also compared how various training objectives, such as generative versus denoising, can affect the fairness of an LLM (Soltan et al., 2022). We achieved a new state of the art on the Winogender dataset in the zero-shot setting.

(2) Auditing Models Along Fairness Attributes: Machine learning models can make incorrect predictions in scenarios that go against stereotypes. For example, image captioning models may identify the wrong gender in a non-stereotypical image, such as a woman skateboarding. To address this issue, we created ANALYST, an automatic framework that audits machine learning models along various attributes, such as visual objects and demographic annotations. Using SHAP, we analyze the importance of these attributes and learn a predictor function for model performance. In our study of the Microsoft OSCAR image captioning model, we discovered that certain attributes, such as gender=female, hurt performance.

(3) Fairness Measurement and Mitigation: In (Gupta et al., 2022), we propose a novel approach to mitigate gender disparity in text generation by learning a fair model during knowledge distillation. We also published (Krishna et al., 2022), which presents a new metric for measuring the fairness of text classifiers based on the model’s prediction sensitivity to perturbations in input features. Both works appeared at ACL 2022, the former in Findings and the latter in the main proceedings. As an Applied Scientist at Alexa AI, I also contributed to reducing performance disparities in Alexa NLU models across different demographic groups by implementing a fair data augmentation technique.

In 2023, I joined a promising startup after the large-scale layoffs at Alexa. As a tech employee, I was fortunate enough to keep my job during the pandemic; I now understand and sympathize with those who lost theirs because of it. I was disappointed by the news, as I truly believed in the Alexa vision. I am taking this as an opportunity to prioritize health and explore new areas in machine learning.

Several themes and directions for next steps emerge from the work and experiences that I’ve outlined above:

Scaling and alignment in distributed settings: Archimedes famously said, "Give me a lever long enough and a fulcrum on which to place it, and I shall move the world." Science has come a long way since then, and the current buzz around AI can be seen as a contemporary interpretation of this quote. The updated version goes, "Provided with ample data to train on and a powerful GPU, I can acquire human intelligence." Rich Sutton echoes this sentiment in "The Bitter Lesson", where he argues that scaling up computational models has proven a more fruitful path than a human-centric approach to machine learning. I believe this trend will continue. Scaling up may also require more scalable methods for providing oversight, such as learning from AI feedback, because human supervision is often slow and expensive. As someone with coursework in Reinforcement Learning and experience in Distributed Systems, I am interested in exploring compute- and parameter-efficient techniques to create models that better align with human preferences.

Multilingual/Multimodal Learning: While at Amazon, I focused on developing multilingual models for Indic languages. Models like GPT and PaLM have been successful for the top 20 languages worldwide and somewhat successful for the next 100, but it remains crucial to include low-resource languages and dialects in these models. Jacob Steinhardt also predicts that the modalities of large foundation models will continue to expand, as in ImageBind, for two reasons. Firstly, pairing less common modalities, such as proteins and networks, with language is economically beneficial. Secondly, language data resources are becoming scarce. In my research efforts, I intend to work with other researchers to develop novel architectures, data sources, and training methods for multilingual and multimodal settings.

Bias and Fairness, Interpretability, Risk Estimation: On the Trustworthy AI front, I am currently researching debiasing techniques that do not require prior knowledge of protected characteristics, such as race and gender, and that apply to multimodal tasks involving vision and language. I am also exploring methods from interpretable machine learning and risk estimation to boost human confidence in machine-generated predictions and outcomes.

Training and Inference Optimizations: I am passionate about engineering and love building things. My focus is on finding ways to improve the architecture of modern machine learning models to maximize performance and effectively implement large-scale models in real-world applications. I am particularly interested in investigating longer contexts, faster decoding, and better distillation and quantization strategies.

If you would like to collaborate with me on any of these topics, please message me.

[1]
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model.
By Soltan, S., Ananthakrishnan, S., FitzGerald, J.G.M., Gupta, R., Hamza, W., Khan, H., Peris, C.S., Rawls, S., Rosenbaum, A., Rumshisky, A., Prakash, C., Sridhar, M., Triefenbach, F., Verma, A., Tur, G. and Natarajan, P.
In ArXiv, vol. abs/2208.01448, 2022.
[2]
Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal.
By Gupta, U., Dhamala, J., Kumar, V., Verma, A., Pruksachatkun, Y., Krishna, S., Gupta, R., Chang, K.-W., Ver Steeg, G. and Galstyan, A.
In Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, pp. 658–678, 2022.
[3]
Measuring Fairness of Text Classifiers via Prediction Sensitivity.
By Krishna, S., Gupta, R., Verma, A., Dhamala, J., Pruksachatkun, Y. and Chang, K.-W.
In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, pp. 5830–5842, 2022.
[4]
Detecting Changes in Dynamic Events Over Networks.
By Li, S., Xie, Y., Farajtabar, M., Verma, A. and Song, L.
In IEEE Transactions on Signal and Information Processing over Networks, 2017.
[5]
A Stochastic Differential Equation Framework for Guiding Online User Activities in Closed Loop.
By Wang, Y., Theodorou, E., Verma, A. and Song, L.
In Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, vol. 84, pp. 1077–1086, 2018.