September 2021 - Present
► Develop five key accuracy metrics to evaluate the relative performance of 20 mainstream oil and gas price forecasts and forward curves, providing critical insights for 23 proprietary hedge funds.
► Build a Tableau dashboard to visualize price forecasts and forecast accuracy metrics for 20 oil and gas market data sets, streamlining manual reporting processes and saving approximately 5 hours per week.
► Engineer ETL solutions for market- and asset-level gas and oil data, significantly reducing average processing time from 7 days to under 5 minutes, leveraging Databricks, PySpark, Python, SQL, AWS, and Airflow.
► Train and mentor team members in adopting new technologies such as Databricks, PySpark, and AWS S3, which enabled the establishment and management of end-to-end data pipelines, cutting the average processing time for analysis-ready data from 6 weeks to a few days.
► Build a machine learning pipeline to detect anomalies on time-series data across different environments (production vs. development) and database systems (PostgreSQL and SQL Server) using various statistical methods such as hypothesis testing with t-tests and Augmented Dickey-Fuller tests, as well as advanced machine learning techniques such as Random Forest and RNN, improving detection accuracy by 95%.
► Active board member and speaker for various panel discussions in S&P Global’s Women in Technology, contributing to the promotion of diversity and inclusion within the tech community.
January 23, 2023 - January 27, 2023
► Answered technical questions during live-stream Python training on Machine Learning (ML) and NLP for 75+ data practitioners at the Centers for Disease Control and Prevention (CDC).
► Guided trainees through real-time code exercises in breakout sessions and helped them advance their ML- and NLP-driven work projects.
May 2018 - September 2021
► Utilized machine learning and data science techniques to deliver actionable insights into global power generation across 10 geographical regions covering 190+ countries.
► Built an interactive dashboard using Dash (Python) to formulate energy market- and asset-level impacts on the fossil fuel outlook.
► Developed high-performing ETL pipelines for power and utility data, decreasing average processing time from 10 days to under 10 minutes.
May 2019 - July 2019
Collaborated with the R&D team to develop an LSTM-based recurrent neural network (RNN), applying denoising tools (i.e., Fast Fourier Transform) to detect anomalies in time-series sensor data on athletes’ performance, improving detection performance by 55%.
May 2006 - September 2009
Collaborated with cross-functional teams to develop and manage methodologies and algorithms for the KPIs of mutual funds and funds. Led the quantitative analysis on the US side of individual funds, fund families, and asset classes/industries to identify trends and insights for investment decision-making.