Data Innovation and Transformation
With Data Science and AI

I'm a passionate data scientist and analyst who is eager to take on new challenges.

I am an experienced and highly motivated Data Scientist and Analyst with a passion for Natural Language Understanding (NLU), Deep Learning (DL), and researching new technologies with innate curiosity and a love of learning.

With 6 years’ experience in data analytics, data science, and machine learning, I have a proven track record of writing quality code and delivering projects on schedule, as well as a demonstrated ability to learn new tools quickly and develop innovative solutions to problems.

My leadership experience includes training colleagues and assisting them with new technologies, as well as being active in S&P Global’s Women in Technology. I am passionate not only about succeeding but also about helping others succeed.

  • Predictive Modeling and Analytics: 95%
  • Data Visualization: 95%
  • ETL Solutions: 97%
  • Statistical Analysis: 93%
  • Python: 97%
  • Tableau, Plotly Dash: 95%
  • Azure Databricks, PySpark, Kafka, Hadoop, Presto, Git: 93%
  • SQL: 93%
  • AWS, GCP: 90%
  • Docker, Airflow: 88%
  • R: 80%

  • Vision Transformer (ViT)
  • Natural Language Processing (NLP) - Bidirectional Encoder Representations from Transformers (BERT) and BERTweet (A pre-trained language model for English Tweets)
  • Convolutional Neural Networks (CNN)
  • CNN with Transfer Learning:
    • VGG16, VGG19, ResNet50, ResNet152, ResNet152V2
    • DenseNet201, Xception, InceptionV3, EfficientNetB7, MobileNetV
  • Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNN)
  • Multi-Layer Perceptrons (MLPs)
  • XGBoost
  • Decision Trees
  • Random Forest
  • Simple Linear/Multivariate Regression
  • Logistic Regression
  • General Linear Model (GLM)
  • K-Means Clustering
  • Seasonal ARIMA
  • STL Decomposition
  • Exponential Smoothing
  • Prophet

Data Analyst

September 2021 - Present

► Develop five key accuracy metrics to evaluate the relative performance of 20 mainstream oil and gas price forecasts and forward curves, providing critical insights for 23 proprietary hedge funds.
► Build a Tableau dashboard to visualize price forecasts and forecast accuracy metrics for 20 oil and gas market data sets, streamlining manual reporting processes and saving approximately 5 hours per week.
► Engineer ETL solutions for market- and asset-level gas and oil data, significantly reducing average processing time from 7 days to under 5 minutes, leveraging Databricks, PySpark, Python, SQL, AWS, and Airflow.
► Train and mentor team members in adopting new technologies such as Databricks, PySpark, and AWS S3, which enabled the establishment and management of end-to-end data pipelines, cutting the average processing time for analysis-ready data from 6 weeks to a few days.
► Build a machine learning pipeline to detect anomalies on time-series data across different environments (production vs. development) and database systems (PostgreSQL and SQL Server) using various statistical methods such as hypothesis testing with t-tests and Augmented Dickey-Fuller tests, as well as advanced machine learning techniques such as Random Forest and RNN, improving detection accuracy by 95%.
► Active board member and speaker for various panel discussions in S&P Global’s Women in Technology, contributing to the promotion of diversity and inclusion within the tech community.

Teaching Assistant

January 23, 2023 - January 27, 2023

► Answered technical questions during a live-streamed Python training on Machine Learning (ML) and NLP for 75+ data practitioners at the Centers for Disease Control and Prevention (CDC).
► Guided trainees through code exercises in breakout sessions in real time and helped them move forward with their ML- and NLP-driven work projects.

Senior Data Researcher

May 2018 - September 2021

► Utilized machine learning and data science techniques to deliver actionable insights in global power generation in 10 geographical regions covering 190+ countries worldwide.
► Built an interactive dashboard using Dash Python to formulate energy market- and asset-level impacts on fossil fuel outlook.
► Developed high-performing ETL pipelines for power and utility data, decreasing average processing time from 10 days to under 10 minutes.

Research Contractor

May 2019 - July 2019

Collaborated with the R&D team to develop an RNN-based LSTM model, applying denoising tools such as the Fast Fourier Transform, to detect anomalies in time-series sensor data on athletes’ performance, improving detection performance by 55%.

Quantitative Methodology Analyst

May 2006 - September 2009

Collaborated with cross-functional teams to develop and manage methodologies and algorithms on the KPIs of mutual fund and fund. Led the quantitative analysis on the US side of individual funds, fund families, and asset classes/industries to identify trends and insights for investment decision-making.

University of California, Berkeley

Master of Information and Data Science - Graduated in May 2024

University of Colorado, Boulder

Studied Computer Science toward Bachelor of Science

University of Colorado, Denver

Pursued a Master of Business Administration (MBA) with an emphasis in Finance

  • Irvine, California, United States

Contact me if you're interested in learning more about my recent work.