Getting Started with Data Science Using Python
Rosie Reeves is an entrepreneurial middle-school student who sells homemade lemonade from a stand at the park near her house. To promote her lemonade stand, she distributes flyers (leaflets) in the park. Rosie records details of her sales and flyer distribution, along with daily weather measurements including temperature and rainfall.
Data manipulation is often the first project to tackle when we want to dive into the data science field. The data come from Microsoft and, much like the Iris dataset, are a good place to begin. There are many things we can do with them, such as data aggregation, string manipulation, and sorting. The dataset consists of seven variables: date, day, temperature, rainfall, flyers, price, and sales. Following the story above, our task is to help Rosie solve her lemonade business problem.
By reading the Jupyter Notebook, you will learn about:
- Data importing
- Data structure and summary of statistics
- Data type manipulation
- Slicing, filtering, and subsetting data
- Missing value check
- Data aggregating
- Cross tabulation
- Standardization (min-max scaler and normalization)
- Data exporting
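
To give a feel for these steps, here is a minimal sketch using pandas. It is not the notebook's actual code: the rows are made up for illustration (they are not values from the real dataset), the lowercase column names simply follow the seven variables listed above, and the "more than 20 flyers" filter and the "15 or more sales" threshold are arbitrary assumptions. String slicing and normalization are left out, and only min-max scaling is shown.

```python
import io

import pandas as pd

# Hypothetical rows for illustration only; NOT values from the real dataset.
CSV_TEXT = """date,day,temperature,rainfall,flyers,price,sales
2017-01-01,Sunday,27.0,2.00,15,0.3,10
2017-01-02,Monday,28.9,1.33,15,0.3,13
2017-01-03,Tuesday,34.5,1.33,27,0.3,15
2017-01-04,Wednesday,44.1,1.05,28,0.3,17
2017-01-05,Thursday,,1.00,33,0.3,18
2017-01-06,Friday,25.3,1.54,23,0.3,11
"""

# Data importing: read_csv accepts a file path or a file-like object.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Data structure and summary statistics.
print(df.dtypes)
print(df.describe())

# Data type manipulation: parse the date strings into datetime values.
df["date"] = pd.to_datetime(df["date"])

# Missing value check: count missing entries per column.
missing = df.isna().sum()

# Filtering and subsetting: days with more than 20 flyers, selected columns only.
busy = df.loc[df["flyers"] > 20, ["date", "flyers", "sales"]]

# Data aggregating: mean sales per day of the week.
mean_sales = df.groupby("day")["sales"].mean()

# Cross tabulation: day of the week against an assumed sales level label.
df["sales_level"] = (df["sales"] >= 15).map({True: "high", False: "low"})
table = pd.crosstab(df["day"], df["sales_level"])

# Min-max scaling: rescale sales to the range [0, 1].
sales_min, sales_max = df["sales"].min(), df["sales"].max()
df["sales_scaled"] = (df["sales"] - sales_min) / (sales_max - sales_min)

# Data exporting: to_csv returns a string when no path is given.
csv_out = df.to_csv(index=False)
```

With the real data you would pass the CSV file's path to `pd.read_csv` and an output path to `df.to_csv` instead of working with in-memory text.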
For more details, feel free to browse my GitHub repository!