Staff Data Scientist - 09/2022-11/2024
Axios HQ
Founding team member of Axios HQ; helped grow the customer base from 10 to ~600 companies. My contributions spanned ML technical leadership, getting proprietary "Smart Brevity" ML models patented, maturing the company's data strategy by establishing a data flywheel, and using transparency and interpretability to help users develop better mental models.
Led ML lifecycle (shape, develop, release, monitor and iterate) for text generation, image generation and classification models to help users compose and edit communications in a proprietary style (Smart Brevity).
Inventor on Axios HQ’s patent (Oberoi et al., 2023)
Worked across Product, Data Engineering and leadership to advocate for and establish a data flywheel (data collection, monitoring, evaluation metrics, analysis and iteration feedback loop) so model output improves as users use the product.
Scaled team from 2 to a cross-disciplinary team of 7, including data engineering, ML engineering, data analysis, and a director of ML.
Presented ML technical content to non-technical stakeholders including existing and prospective customers, the company board of directors, executive leadership and cross-functional partners.
Established a data eventing pipeline and task-specific evaluations (evals) to measure model output quality and model its impact on north star metrics.
Drove product roadmap prioritization with Product, Design and User Research by validating use case assumptions and current user behavior through data analysis
Developed a domain-specific knowledge layer for user insight, product research, and retrieval augmented generation.
Staff Data Scientist - 07/2022-09/2022
Senior Data Scientist - 09/2019-07/2022
Axios (Media)
Developed the first version of Axios HQ for early customers; worked closely with editors to develop ML tools that deliver Axios' proprietary Smart Brevity style.
Established a data science team to create NLP tools for users to write in Smart Brevity™ style in a new writing platform (incubation for Axios HQ)
Instituted responsible ML practices: model cards (Mitchell et al., 2019) to describe model training and intended use, data cards to describe fine-tuning datasets, and model monitoring and observability.
Led a partnership with ArthurAI to develop product-specific metrics for monitoring
Data Scientist - 01/2017-09/2019
Deep Learning Analytics
Developed machine learning models for government and commercial clients using Python and a variety of deep learning frameworks (including TensorFlow, Caffe, Keras, PyTorch), and managed junior data scientists on the team
Researched and implemented methods to improve object detection and recognition in images under constraints such as limited or unbalanced data using Generative Adversarial Networks, style transfer and domain translation
Placed 2nd globally in the 2018 iNaturalist Competition and presented our results at CVPR 2018.
Prototyped interpretability tools that provide insight into model performance and failure; the product was adopted across three client projects and rolled out at GDMS to prepare data labels for machine learning.
Data Scientist - 02/2016-01/2017
Commerce Data Service, US Department of Commerce
Developed predictive models to identify businesses with the highest probability of partnering with the Department of Commerce, hierarchical clustering models to build profiles on DOC partners, and photovoltaic energy prediction models
Led the data science portfolio at NIST, which involved managing teams, domain research, project scoping, communicating statistical methods and findings, and facilitating data product implementation into existing processes
Established and managed the NetZero House open data pipeline and data release at the TechCrunch Disrupt SF Hackathon
Taught Data Science and R classes to 100+ employees across DOC
Sexual Orientation and Gender Identity Interagency (SOGI) Working Group: Used NLP topic modeling (latent Dirichlet allocation) to find latent topics in research on surveying sexual orientation and gender identity, in order to inform survey guidelines and research on SOGI
Pro Bono Data Science
Whitman Walker Health - 07/2016-01/2017
Worked with WWH to formulate goals for a data product, scoped out a multi-part data project, did domain research, cleaned and merged data from disparate sources, and developed statistical models to 1) cluster clients, 2) understand patterns in Medicare Part D usage, 3) manage caseworker loads, and 4) model legal service impact on health outcomes
Data Scientist (Lead Scientist) - 07/2015-02/2016
Data Scientist (Staff Scientist) - 01/2014-06/2015
Booz Allen Hamilton
Managed a team of 5 data scientists at the Food and Drug Administration to make data collected by the agency more readily available through data visualization and stats dashboards to promote transparency, manage workload and resources, encourage platform adoption and facilitate internal and congressional reporting
Built trust with clients at the FDA by understanding the drug application process, identifying pain points, developing user stories, and releasing minimum viable products early on for users to give feedback on
Prototyped binary classification models to flag applications at risk of missing PDUFA/GDUFA goal dates
Leveraged core groups of invested FDA users and rapid iterative development cycles to release more than 25 dashboards over 8 months to multiple offices, which promoted platform adoption and retired standalone databases and Excel sheets
Coded a web app with interactive data visualization for a genome variant storage and analysis platform using R and R Shiny; stood up instances in Amazon Web Services so Hadoop clusters could query the variant database
Acted as a liaison for Principal Investigators and other intramural researchers at the NIH’s National Heart, Lung and Blood Institute to meet their IT and application needs
Visualized patient data in D3.js (JavaScript) and Tableau dashboards from wearable tech to allow healthcare professionals to track patient health post-discharge and flag indicators for high readmission risk
Accreditation Manager - 07/2012-12/2013
National Committee for Quality Assurance
Led a cross-functional, multi-department effort to collect data, conduct analyses and prepare reports on the performance of health plans and clinical measure data for the Centers for Medicare & Medicaid Services (CMS)
Collected, managed and conducted statistical analyses on health plan performance datasets to find gaps in performance, identify areas for reviewer improvement, reevaluate standards of care, and prepare reports for CMS
Independently developed and maintained SQL databases to track national health plans
SNP Assessment Analyst - 10/2010-06/2012
National Committee for Quality Assurance
Collected, managed and conducted statistical analyses on health plan performance datasets to find gaps in performance, identify areas for reviewer improvement, reevaluate standards of care, and prepare reports for CMS
Bioinformatics Intern - 11/2010-02/2011
Smithsonian Institution, National Museum of Natural History
Handled museum holdings, collected project material, and researched taxonomic literature to populate and organize an integrated, taxonomic, specimen, nomenclatural, and image database for Tabanidae
Developed data processing scripts and Unix Shell scripting to parse large data files
SNP Assessment Coordinator - 01/2009-09/2010
National Committee for Quality Assurance
Coordinated efforts between NCQA and external surveyors, internal reviewers, and national health insurance plans regarding eligibility, product-related inquiries, process updates, and timely completion of reviews
MS in Biotechnology, concentration in Bioinformatics
Johns Hopkins University
Aug 2013
BA in Biology
Bard College
Dec 2009
Axios HQ Values Award - 2023
DCFemTech “Power Women in Code, Design, and Data” Award - 2018
iNaturalist competition 2nd place - 2018
Featured on USAjobs Data Science (Medium article by Data Society) - 2017
Featured on Booz Allen Hamilton's Innovo Magazine - 2015
NCQA Annual Award for Initiative - 2011
Keesing, F., P. Oberoi, R. Vaicekonyte, K. Gowen, L. Henry, S. Mount, P. Johns, and R.S. Ostfeld. 2011. Effects of Garlic Mustard (Alliaria petiolata) on Entomopathogenic Fungi. Ecoscience 18(2):164-168.