21/08/2023

LLM-Driven Gas Flow Aggregates QA & Anomaly Detection Automation - S&P Global Commodity Insights, 2026

Built an LLM-driven QA and anomaly-detection pipeline in Databricks (PySpark and SQL) to validate 2.7M+ gas flow aggregates (2005–present) across 400+ US, Mexico, and Canada pipelines at the component and throughput-group levels. The pipeline surfaces root causes through tabular anomaly outputs covering multiple anomaly types, visuals, and plain-text GPT-4 reports. Cut manual QA by ~85%, sped up detection ~15x, saved ~100 hours per month, and helped close ~90% of issues cross-functionally by shifting work from investigation to scalable, automated detection and faster resolution.

Gas Pipeline Flow Anomaly Detection & Data Alignment - S&P Global Commodity Insights, 2025

Developed ML dashboards and NLP models to detect gas flow anomalies and automate post-merger data reconciliation, improving accuracy by 95% and cutting processing time by 80%.

Oil and Gas Price Forecasts and Forecast Accuracy Metrics - S&P Global Commodity Insights, 2024

Developed five key accuracy metrics to evaluate the performance of 20 mainstream oil and gas price forecasts and forward curves, providing critical insights for 23 proprietary hedge funds. Created an interactive Tableau dashboard to visualize price forecasts and accuracy metrics for these datasets, streamlining manual reporting and saving around 5 hours each week.
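
The five metrics are not named above, so the following is only a minimal sketch of five commonly used forecast-accuracy measures (MAE, RMSE, MAPE, bias, and directional accuracy). The function name `accuracy_metrics`, the choice of metrics, and the sample prices are illustrative assumptions, not the project's actual definitions.

```python
import math


def accuracy_metrics(actual, forecast):
    """Return common accuracy metrics for two equal-length price series.

    Assumption: these five metrics are illustrative; the project's own
    five metrics are not specified in the description.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")

    errors = [f - a for a, f in zip(actual, forecast)]
    n = len(errors)

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    bias = sum(errors) / n  # positive means the forecast runs high

    # MAPE is undefined where the actual value is zero, so skip those points.
    pct = [abs(e) / abs(a) for a, e in zip(actual, errors) if a != 0]
    mape = 100 * sum(pct) / len(pct) if pct else float("nan")

    # Directional accuracy: share of periods in which the forecast moved the
    # same way as the actual, both measured against the previous actual.
    hits = moves = 0
    for i in range(1, n):
        actual_move = actual[i] - actual[i - 1]
        forecast_move = forecast[i] - actual[i - 1]
        if actual_move != 0:
            moves += 1
            if actual_move * forecast_move > 0:
                hits += 1
    directional = hits / moves if moves else float("nan")

    return {"mae": mae, "rmse": rmse, "mape": mape,
            "bias": bias, "directional_accuracy": directional}


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    print(accuracy_metrics([70, 72, 75, 73], [71, 71, 76, 74]))
```

Computing every metric from the same error series makes it straightforward to compare many forecast sources on equal terms.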

Fuzzy Matching Company Names - S&P Global Market Intelligence, 2022

Built an NLP-based organization entity resolution pipeline to support Named Entity Recognition (NER) workflows by extracting and normalizing company (ORG) entities and deduplicating 40,000+ name variants. Used TF-IDF vectorization and similarity scoring to map abbreviations, legal suffix variations (Inc/LLC/Ltd), omissions, and typos to a single canonical company name for consistent downstream analytics.
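
The matching step described above can be sketched as follows, assuming character trigram TF-IDF vectors and cosine similarity in plain Python. The suffix list, the 0.5 threshold, the helper names, and the sample company names are hypothetical choices for illustration, not the pipeline's actual configuration.

```python
import math
import re
from collections import Counter

# Assumed list of legal suffixes to strip before comparison.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "incorporated",
                  "limited", "corporation", "company"}


def normalize(name):
    """Lowercase, drop punctuation, and remove legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    kept = [t for t in tokens if t not in LEGAL_SUFFIXES]
    return " ".join(kept or tokens)


def ngrams(text, n=3):
    """Character n-grams, padded so word boundaries are captured."""
    padded = f" {text} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def tfidf_vectors(names):
    """Build L2-normalized TF-IDF vectors over character trigrams."""
    docs = [Counter(ngrams(normalize(name))) for name in names]
    doc_freq = Counter(g for d in docs for g in d)
    total = len(docs)
    vectors = []
    for d in docs:
        # Smoothed IDF so n-grams seen in every name keep a nonzero weight.
        vec = {g: tf * (math.log((1 + total) / (1 + doc_freq[g])) + 1)
               for g, tf in d.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({g: w / norm for g, w in vec.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two already-normalized sparse vectors."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(g, 0.0) for g, w in u.items())


def match_to_canonical(variants, canonical, threshold=0.5):
    """Map each variant to its most similar canonical name, or None."""
    vectors = tfidf_vectors(canonical + variants)
    canon_vecs = vectors[:len(canonical)]
    result = {}
    for variant, vec in zip(variants, vectors[len(canonical):]):
        score, best = max((cosine(vec, cv), c)
                          for cv, c in zip(canon_vecs, canonical))
        result[variant] = best if score >= threshold else None
    return result


if __name__ == "__main__":
    canonical = ["Exxon Mobil Corporation", "Chevron Corporation"]
    variants = ["Exxon Mobil Corp.", "Exon Mobil Inc", "Chevron Corp"]
    print(match_to_canonical(variants, canonical))
```

Character n-grams tolerate typos and omissions better than word tokens, which is why they are a common choice for name matching. At tens of thousands of names, a sparse-matrix library would be a more practical choice than the pairwise loop shown here.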

Global Energy Trends with Fossil Fuel Price Prediction - S&P Global Market Intelligence, 2022

Developed an interactive web interface with Python Dash and Plotly to explore global energy trends and forecast fossil fuel prices. Applied time series models, including Seasonal ARIMA, STL decomposition, exponential smoothing, and Prophet, to forecast energy prices.

Energy Market Sentiment Analysis with NLP and LSTM - S&P Global Market Intelligence, 2021

Designed an NLP-based sentiment and impact prediction pipeline to analyze energy-related news headlines and predict positive/negative signals influencing oil-and-gas index values. The pipeline used tokenization, embeddings, and LSTM neural networks with a softmax output, optimized via cross-entropy training and hyperparameter tuning. The results gave a clearer view of how sentiment drives market behavior.

Prediction of Capital and Operating Costs of Power Plants - S&P Global Market Intelligence, 2020

Applied multivariate regression to predict future capital expenditures and operating costs of U.S. power plants, accounting for various fuel and technology types. This analysis enabled more accurate financial forecasting, supporting decision-making across different plant configurations and cost structures.

Mapping Gas Power Plants to a Gas Hub in the U.S. - S&P Global Market Intelligence, 2019

Built a Python application to map U.S. gas power plants to their nearest gas hub district using the K-Means clustering algorithm and Vincenty distance function. This mapping ensured that each power plant was within a 12-mile radius of every other plant in the hub, helping optimize distribution networks and improve logistical efficiency.
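
As a rough illustration of the distance side of this mapping, the sketch below assigns each plant to its nearest hub and checks the largest plant-to-plant distance within a hub. It substitutes the simpler haversine (spherical) formula for the Vincenty (ellipsoidal) distance used in the project and does not run K-Means itself. All coordinates, names, and helper functions are hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; haversine assumes a sphere


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees.

    Assumption: used here as a stand-in for Vincenty distance.
    """
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def assign_to_nearest_hub(plants, hubs):
    """Map each plant name to the name of the closest hub."""
    return {
        name: min(hubs, key=lambda hub: haversine_miles(loc, hubs[hub]))
        for name, loc in plants.items()
    }


def max_pairwise_miles(locations):
    """Largest distance between any two locations (0.0 for fewer than two)."""
    return max((haversine_miles(a, b) for a, b in combinations(locations, 2)),
               default=0.0)


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    hubs = {"hub_a": (30.0, -92.0), "hub_b": (32.0, -97.0)}
    plants = {"plant_1": (30.05, -92.05),
              "plant_2": (30.02, -91.95),
              "plant_3": (32.10, -97.10)}

    assignment = assign_to_nearest_hub(plants, hubs)
    print(assignment)

    hub_a_plants = [plants[p] for p, h in assignment.items() if h == "hub_a"]
    print(max_pairwise_miles(hub_a_plants) <= 12)  # 12-mile limit check
```

A pairwise check of this kind is one way to verify a distance limit after clustering, since K-Means alone does not bound the distance between members of a cluster.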
