Data Science Academy - COMPFEST UI 2019
COMPFEST
COMPFEST is an annual one-stop IT event run by students of the Faculty of Computer Science, University of Indonesia. As the biggest student-run IT event, it has been held annually since the first edition in 2009. Since then, it has evolved into a core part of the Faculty of Computer Science's culture. Every year, we strive to deliver a different core message and theme, making each edition distinct from the others.
Data Science Academy (DSA)
Data science sits at the intersection of several disciplines, ranging from programming to statistics, and uses empirical processes to provide insight, answers, and knowledge about problems across many sectors, from industry and government to health. The big data phenomenon in the era of the industrial revolution 4.0 requires people to be able to extract and process data so that it is useful for various sectors of human life.
Data Science Academy is a series of events divided into two camps, aimed at providing knowledge of data science and its application, especially in the business industry. During the series, participants receive material from speakers and trainers who are experienced in data science, attend mentoring sessions, and work on case study problems that are presented at the 2nd camp. With the presence of Data Science Academy COMPFEST 12, we hope that participants can develop their knowledge and skills in data science to meet the needs of the business industry.
Preliminary of DSA 2019
Context
Eid al-Fitr is one of the most important holidays in Indonesia, a country where 87.2% of the population is Muslim. By tradition, in the days leading up to Eid al-Fitr each year, most Indonesians who have migrated to metropolitan areas such as Jabodetabek travel back to their hometowns. They go home to visit their relatives for silaturahmi and to celebrate Eid al-Fitr together. For Jabodetabek specifically, the Ministry of Transportation estimates that 15 million residents leave the area. Afterwards, the travellers who spent Eid in their hometowns return to the capital for work. It is estimated that as many as 71 thousand newcomers will come to Jakarta.
Assume you are a Data Scientist in the Human Resources Department (HRD) of a marketplace company called TokoLapak. The company has employees spread across Jabodetabek (Jakarta, Bogor City, Bogor Regency, Depok City, Tangerang City, South Tangerang City, Tangerang Regency, Bekasi City, and Bekasi Regency), and every employee is required to install the TokoLapak application, so you have sample data on each employee's location every day.
Datasets
There are two datasets: one with the employees' locations and one with their profiles.
- catatan_lokasi.csv: a sample record of the locations of TokoLapak employees from 21st May 2019 to 15th June 2019. It consists of the following variables:
  - id: unique ID of the employee
  - tanggal: date
  - lokasi_dominan: the location where the employee is, at the city/regency level
- data_profil.csv: the employee profiles. It consists of the following variables:
  - id: unique ID of the employee
  - jenis_kelamin: employee sex
  - divisi: the division where the employee works at TokoLapak
  - umur: employee age
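Because both files share the id column, a natural first step is to load them and attach each employee's profile to their daily location records. Below is a minimal sketch of that step using pandas. The sample rows, the date format, and the values for lokasi_dominan, jenis_kelamin, and divisi are made up for illustration and are not taken from the real files, and the helper name load_and_join is hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the real files.
# The actual value formats in catatan_lokasi.csv and data_profil.csv may differ.
LOKASI_CSV = """id,tanggal,lokasi_dominan
1,2019-05-21,Kota Depok
1,2019-06-05,Kota Bandung
2,2019-05-21,Kota Bekasi
"""

PROFIL_CSV = """id,jenis_kelamin,divisi,umur
1,L,Data,27
2,P,Marketing,31
"""


def load_and_join(lokasi_src, profil_src):
    """Read both datasets and attach the profile to each location record."""
    # Parse 'tanggal' as a datetime so records can be filtered by date later.
    lokasi = pd.read_csv(lokasi_src, parse_dates=["tanggal"])
    profil = pd.read_csv(profil_src)
    # Many location rows map to one profile row per employee;
    # validate raises an error if an id is duplicated in the profile data.
    return lokasi.merge(profil, on="id", how="left", validate="many_to_one")


# With the real data, pass the file paths instead of the in-memory samples,
# e.g. load_and_join("catatan_lokasi.csv", "data_profil.csv").
gabungan = load_and_join(io.StringIO(LOKASI_CSV), io.StringIO(PROFIL_CSV))
print(gabungan)
```

A left join keeps every location record even when an employee has no matching profile, which makes gaps in the profile data visible as missing values rather than silently dropping rows.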
For the questions and the answers (in Python and R), head over to my GitHub repository!
Sources