Prashant, thank you for sharing your work.
After reviewing your work I have a general comment and a few specific comments.
I trust you will take my comments as constructive feedback.
1. General comment. Be very careful when making generalizations about what causes the results you see in the data.
Remember the most fundamental of rules: correlation does not imply causation. To be more specific,
as data scientists we must not assume facts that have not been placed into evidence. I will give specific examples below.
2. Input [16]: you state "Their future customer is more likely from NSW". This is not what the data shows. It only shows that historically more customers live in NSW. You cannot automatically infer that future customers will come from NSW. What if the market in NSW is saturated? Wouldn't this data raise the questions:
Is our advertising connecting well in the other states? Or are our product offerings right for the consumers in the other states? You did this in Input [17].
3. Input [17]: This has the same issue as [16]. Does the data show that car owners are better off financially than non-car owners? Could the non-car owners live in urban areas where car ownership is much more expensive, or unnecessary because everything is close to where they work and live?
4. Finally, you did a great job of reporting and visualizing the data. Now, using machine learning algorithms, you need to analyze the data. Look into the clustering, classification, association, and regression algorithms to quantify and predict the behavior you are interested in. So far you have only presented assumptions based on reporting, and no predictions based upon the data provided to you.
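To make the car ownership point concrete: rather than reading a relationship off a bar chart, you can put a number on it. Below is a minimal sketch in plain Python of a chi-square statistic and Cramér's V for a contingency table. The counts and the segment labels are made up for illustration and are not taken from your notebook, and the function names are my own. Keep in mind that even a strong association measured this way still says nothing about cause.

```python
from math import sqrt


def chi_square_stat(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramér's V effect size: 0 means no association, 1 means perfect association."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return sqrt(chi_square_stat(table) / (n * k))


# Hypothetical counts for illustration only (NOT from the actual dataset).
# Rows: owns a car (yes, no). Columns: three assumed wealth segments.
counts = [
    [500, 250, 250],
    [520, 240, 240],
]

print(f"chi-square = {chi_square_stat(counts):.3f}")
print(f"Cramér's V = {cramers_v(counts):.3f}")
```

In your notebook you would build the table from the real columns instead, for example with a pandas crosstab. A value of V close to 0 would mean car ownership tells you very little about wealth segment, whatever the chart appears to suggest.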
------------------------------
Lee Allan
------------------------------
Original Message:
Sent: Tue July 14, 2020 06:14 PM
From: Prashant Shukla
Subject: Evaluation of Data Analysis of Sales dataset
Please evaluate my data analysis and visualization of the sales dataset of a KPMG client company, which I got through a virtual internship.
I now have many doubts about which ML algorithm is appropriate for sales data to achieve the desired goal. The link to my Jupyter notebook is https://github.com/ShuklaPrashant21/KPMG_Virtual_Internship/blob/master/Task_Solutions/KPMG_Virtual_Internship_Data_Visualization.ipynb
Please guide me on model development and on how I can deploy a model on this dataset.
------------------------------
Prashant Shukla
------------------------------
#GlobalAIandDataScience
#GlobalDataScience