Jugal Marfatia


Data Scientist
Kaggle Competition Expert (Top 2%)


I am a 4th-year PhD student in economics at Washington State University, where I am also pursuing a master's degree in statistics.

My research interests lie at the intersection of econometrics and machine learning.

Please contact me at: jugal.marfatia@wsu.edu


  • Econometrics
  • Machine Learning
  • Data Science


  • PhD in Economics, Expected 2021

    Washington State University

  • MS in Statistics, Expected 2021

    Washington State University

  • MA in Economics, 2017

    San Diego State University







Machine Learning



Graduate Instructor / Research Assistant

Washington State University

August 2017 – Present, Pullman, Washington
  • EconS 321 - Economics of Sports in America (4 semesters: 2 in-class and 2 online)
  • GitHub link to course material: (link)

Data Science / Regional Models Intern

San Diego Association of Governments (SANDAG)

May 2016 – August 2017, San Diego, California

Responsibilities included:

  • Developed a Python-based demographic and economic forecasting model adopted by 17 San Diego counties for regional planning: (GitHub)
  • Contributed to the development of a real estate forecasting model, also written in Python. (GitHub)
  • Developed Python/SQL scripts that automated the import, storage, and analysis of data from external sources such as BLS, BEA, Census, California Dept. of Finance, California Dept. of Education, and others.
  • Developed several visualization and logical-check tools for data quality checks. Assisted with data requests from other internal departments and external organizations.
  • Presented the modeling process and results to team members and senior executives of SANDAG.

Research Assistant

San Diego State University

August 2015 – May 2017, San Diego, California
  • Conducted data and statistical analysis on health and elderly care that resulted in one published paper.
  • Conducted econometric analysis on immigration and income inequality in the United States.
  • Cleaned and analyzed large panel data sets from IPUMS with over 1 million observations.

Data Science Competitions

Top 9% (bronze medal), Kaggle M5 Forecasting - Accuracy

The main objective of this competition was to use hierarchical sales data from Walmart, the world’s largest company by revenue, to forecast daily sales for the next 28 days. The data covers stores in three US states (California, Texas, and Wisconsin) and includes item-level, department, product category, and store details. It also has explanatory variables such as price, promotions, day of the week, and special events.

Top 1% (silver medal), Kaggle NFL Big Data Bowl Competition 2020.

The main objective of this competition was to develop a model that predicts, as a play happens, how many yards a team will gain on a given rushing play. We were provided with game, play, and player-level data, including the position and speed of players from the NFL’s Next Gen Stats.

Top 10% (bronze medal), Kaggle IEEE-CIS Fraud Detection Competition 2019.

The main objective of this competition was to predict whether a particular credit card transaction is fraudulent.



Actual spread of COVID-19.

US County Level Analysis

Finalist at NFL Big Data Bowl 2020.

One of 6 finalists nationally in the NFL’s 2020 Big Data Bowl competition.

Wimbledon Predictions Using a Neural Network.

The neural network predicts that Novak Djokovic will win the championship, beating Roger Federer in the final.