Selecting important variables for model building

One benefit of random forest is its ability to handle large data sets with high dimensionality. It can handle thousands of input variables and identify the most significant ones, so it is often treated as a dimensionality reduction method. The model also outputs the importance of each variable, which is a very handy feature.


# Rank the predictors by the importance scores of the fitted model
featimp = pd.Series(model.feature_importances_, index=predictors).sort_values(ascending=False)

print(featimp)
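Once such a ranking is available, one possible way to use it for dimensionality reduction is to keep only the top-ranked variables that together account for most of the total importance. The following is a minimal sketch under stated assumptions: the helper `select_top_features`, its `coverage` parameter, and the scores in `example_scores` are all hypothetical and made up for illustration; they are not results from the loan data set.

```python
def select_top_features(importances, coverage=0.9):
    """Return the highest-ranked features whose scores sum to at least
    `coverage` times the total importance (hypothetical helper)."""
    total = sum(importances.values())
    # Sort feature names from most to least important
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for name, score in ranked:
        selected.append(name)
        running += score
        if running >= coverage * total:
            break
    return selected


# Made-up scores for illustration only
example_scores = {
    "feature_a": 0.5,
    "feature_b": 0.25,
    "feature_c": 0.125,
    "feature_d": 0.125,
}

kept = select_top_features(example_scores, coverage=0.75)
print(kept)
```

With a pandas Series such as featimp, the same helper could be called as select_top_features(featimp.to_dict()). The cut-off is a judgment call, so it is worth checking model performance again after dropping variables.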

I have selected all the features available in the training data set and modeled them using a random forest:

predictors = ['ApplicantIncome', 'CoapplicantIncome', 'Credit_History', 'Dependents', 'Education', 'Gender', 'LoanAmount',
              'Loan_Amount_Term', 'Married', 'Property_Area', 'Self_Employed', 'TotalIncome', 'Log_TotalIncome']
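For readers who want to try the whole workflow without the loan data, here is a rough, self-contained sketch on synthetic data. It assumes scikit-learn's RandomForestClassifier as the model (the exercise does not show how the model was built), and the column names `signal`, `noise_1`, `noise_2` as well as the `demo_` variables are hypothetical stand-ins.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: one informative column and two pure-noise columns
rng = np.random.default_rng(0)
n_rows = 500
X = pd.DataFrame({
    "signal": rng.normal(size=n_rows),
    "noise_1": rng.normal(size=n_rows),
    "noise_2": rng.normal(size=n_rows),
})
# The target depends only on the "signal" column
y = (X["signal"] > 0).astype(int)

demo_predictors = list(X.columns)

# Fit a random forest on all available predictors
demo_model = RandomForestClassifier(n_estimators=100, random_state=0)
demo_model.fit(X[demo_predictors], y)

# Rank the predictors by impurity-based importance, highest first
demo_featimp = pd.Series(
    demo_model.feature_importances_, index=demo_predictors
).sort_values(ascending=False)
print(demo_featimp)
```

Because the target here is built from a single column, that column should come out on top, and the scores sum to 1. On real data the ranking is rarely this clear-cut.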

Run the feature importance command and identify which variable has the highest impact on the model.

Possible answers

LoanAmount

Dependents

Gender

Education
