# Decision Trees, Entropy and Information Gain



Decision trees are an important class of machine learning models: you can use them for both classification and regression. Decision trees can also summarize the way humans reason. Below is a table of data gathered from a recent census in Ontario, Canada. Each individual is described by the following features:

- AGE: a continuous feature listing the age of the individual
- EDUCATION: a categorical feature listing the highest education award achieved by the individual (high school, bachelors, doctorate)
- MARITAL STATUS: a categorical feature (never married, married, divorced)
- OCCUPATION: a categorical feature (transport = works in the transportation industry; professional = doctors, lawyers, etc.; agriculture = works in the agricultural industry; armed forces = is a member of the armed forces)
- ANNUAL INCOME: the target feature, with 3 levels (<25K, 25K–50K, >50K)

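| ID | AGE | EDUCATION | MARITAL STATUS | OCCUPATION | ANNUAL INCOME |
|----|-----|-------------|----------------|--------------|---------------|
| 1 | 39 | bachelors | never married | transport | 25K–50K |
| 2 | 50 | bachelors | married | professional | 25K–50K |
| 3 | 18 | high school | never married | agriculture | <25K |
| 4 | 28 | bachelors | married | professional | 25K–50K |
| 5 | 37 | high school | married | agriculture | 25K–50K |
| 6 | 24 | high school | never married | armed forces | <25K |
| 7 | 52 | high school | divorced | transport | 25K–50K |
| 8 | 40 | doctorate | married | professional | >50K |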

a) Compute the entropy of this dataset with respect to the target feature, ANNUAL INCOME.
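As a way to check a hand calculation for part a), here is a minimal Python sketch. It assumes the usual Shannon entropy in bits, H = -Σ p_i · log2(p_i), taken over the levels of the target feature, and it assumes the income column reads, for IDs 1 to 8, as listed in `incomes` below (rows 3/4 and 6/7 are separated out of the run-together table). The names `incomes` and `entropy` are illustrative only.

```python
from collections import Counter
from math import log2

# ANNUAL INCOME for IDs 1..8, as read from the table in the question
# (assumption: interleaved rows 3/4 and 6/7 are split as shown here).
incomes = ["25K–50K", "25K–50K", "<25K", "25K–50K",
           "25K–50K", "<25K", "25K–50K", ">50K"]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Sum -p * log2(p) over the observed classes only, so p is never 0.
    return sum(-(n / total) * log2(n / total) for n in counts.values())

h = entropy(incomes)
print(round(h, 4))  # 1.2988
```

With 2 of 8 individuals at <25K, 5 of 8 at 25K–50K and 1 of 8 at >50K, the three terms are 0.5, about 0.4238 and 0.375 bits, which sum to roughly 1.2988 bits.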

b) Calculate the information gain (based on entropy) for the features EDUCATION, MARITAL STATUS, and OCCUPATION.
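For part b), one common approach is to take information gain as the entropy of the whole dataset minus the weighted entropy that remains after partitioning on a feature. The sketch below follows that definition; it is one way to check the arithmetic, not necessarily the intended solution method. The row values assume the table reads as IDs 1 to 8 with rows 3/4 and 6/7 separated, and the names `rows`, `FEATURES`, `entropy` and `information_gain` are hypothetical helpers introduced for illustration.

```python
from collections import Counter, defaultdict
from math import log2

# (EDUCATION, MARITAL STATUS, OCCUPATION, ANNUAL INCOME) for IDs 1..8,
# as read from the table in the question (assumed row separation).
rows = [
    ("bachelors",   "never married", "transport",    "25K–50K"),
    ("bachelors",   "married",       "professional", "25K–50K"),
    ("high school", "never married", "agriculture",  "<25K"),
    ("bachelors",   "married",       "professional", "25K–50K"),
    ("high school", "married",       "agriculture",  "25K–50K"),
    ("high school", "never married", "armed forces", "<25K"),
    ("high school", "divorced",      "transport",    "25K–50K"),
    ("doctorate",   "married",       "professional", ">50K"),
]
FEATURES = {"EDUCATION": 0, "MARITAL STATUS": 1, "OCCUPATION": 2}

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total)
               for n in Counter(labels).values())

def information_gain(rows, col):
    """Entropy of the target minus the weighted entropy after splitting on column `col`."""
    target = [r[-1] for r in rows]
    partitions = defaultdict(list)
    for r in rows:
        partitions[r[col]].append(r[-1])  # group target labels by feature level
    remainder = sum(len(p) / len(rows) * entropy(p)
                    for p in partitions.values())
    return entropy(target) - remainder

gains = {name: information_gain(rows, col) for name, col in FEATURES.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.4f}")
# EDUCATION: 0.7988
# MARITAL STATUS: 0.5488
# OCCUPATION: 0.7044
```

Under these assumptions EDUCATION gives the largest gain, since the bachelors and doctorate partitions are pure and only the high school partition (two <25K, two 25K–50K) retains any entropy.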
