Location: New York, NY, USA
School: New York University
Major: Management of Information System


Zhipeng Xu
28-30 Jackson Ave, Long Island City, NY 11101
New York University, New York, USA (Sep 2018 – Apr 2020)
Master of Science in Information System Management, GPA: 3.88/4
· Relevant Coursework: Database Design and Management, Database Process and Structure, Data Mining and Data Warehousing, etc.
Nanfang College of Sun Yat-sen University, Guangzhou, China (Sep 2014 – Jun 2018)
Bachelor of Arts in Accounting (Distinction), GPA: 3.9/4
· First-class individual scholarship for three academic years (top 1%)
· Outstanding Student Cadre (top 2%)
· Programming skills: Python (scikit-learn, pandas, NumPy), SQL, MATLAB
· Machine learning skills: Decision Tree, Random Forest, Linear Regression, Logistic Regression, K-Nearest Neighbors, Principal Component Analysis
· Statistical analysis: Hypothesis Testing, A/B Testing, Time Series Analysis
Amazon Prime Video Data Analysis
· Built 5 different machine learning models to predict a movie’s cumulative view time per day on Amazon Prime Video
· Preprocessed the Amazon Prime Video data set, for example by replacing all 3,424 missing values with the mean of the corresponding feature
· Applied Python packages (matplotlib, pandas plotting, seaborn) for data exploration
· Investigated feature importance to find the top predictive features that influence a movie’s performance
· Chose Random Forest as the final model since it had the lowest mean squared error (130458013.467)
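
The mean-imputation step mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the column names and values are hypothetical toy data, and `impute_with_mean` is a helper name introduced only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative and do not
# come from the real Amazon Prime Video data set.
df = pd.DataFrame({
    "budget": [10.0, np.nan, 30.0, 20.0],
    "imdb_votes": [100.0, 300.0, np.nan, 200.0],
})


def impute_with_mean(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy where each missing numeric value is replaced
    by the mean of its own column (computed over non-missing rows)."""
    return frame.fillna(frame.mean(numeric_only=True))


filled = impute_with_mean(df)
print(filled)
```

Filling with the per-feature mean keeps every row in the data set, at the cost of shrinking the variance of the imputed features.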

User Churn Prediction in Telecommunication Industry
· Applied machine learning algorithms to help telecommunication companies predict customer churn probability and identify customers likely to stop using the service
· Processed raw data through one-hot encoding of categorical features, data cleaning, feature scaling, etc.
· Built supervised learning models, including Random Forest and Logistic Regression, to predict customer churn probability
· Used test data to check whether models were overfitting and applied regularization to reduce overfitting
· Evaluated model performance with confusion-matrix metrics (precision, recall, accuracy) and selected Random Forest as the most suitable model
· Random Forest achieved the highest accuracy (0.969), precision (0.941), and recall (0.795)
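
The confusion-matrix metrics listed above can be computed as in the sketch below. It assumes binary labels where 1 means "churned"; the label lists are made-up toy values, and the function names are introduced here for illustration only, so the output is unrelated to the scores reported in this project.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = churned)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Derive accuracy, precision, and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when no positives are predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical toy labels, not real churn data.
y_true = [1, 0, 1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 0, 1]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

For churn data, where churners are usually the minority class, recall is often worth reading alongside accuracy, since a model can score high accuracy while missing many churners.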
CITIC Securities, Beijing, China (Dec 2017 – Feb 2018)
Intern, Investment Banking Division
· Helped 5 clients with their stock portfolios, gaining ~5 million RMB, by calculating and analyzing correlations between different assets
· Wrote pitchbooks for the acquisition of 2 major clients: Shanghai Rural Commercial Bank and Shenzhen Rural Commercial Bank

data analyst