Hi Smita,

It's great that my article helped you in some way :)

In get_features(), the value is assigned to the high_correlations variable, which I then return from the function.

high_correlations = abs_corrs[abs_corrs > correlation_threshold].index.values.tolist()

The threshold value used is 0.05, i.e., the absolute correlation between an input feature and the target variable ‘quality’ must be greater than 0.05.

Here we are checking which values in abs_corrs are greater than correlation_threshold.

As you can see in the image below, we have negative coefficients as well, so I take the absolute value of each one and compare it with the threshold. The absolute correlation of the ‘residual sugar’ feature is less than 0.05, so it is eliminated. All the other features are kept.
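To make the filtering step concrete, here is a minimal sketch of how a function like get_features() could work. Only the high_correlations line comes from the article. The function signature, the way abs_corrs is built (including dropping the target column), and the small DataFrame are my assumptions for illustration, not the real wine data or the exact code from the article.

```python
import pandas as pd


def get_features(df, target="quality", correlation_threshold=0.05):
    # Assumed construction: absolute correlation of every feature with the
    # target, with the target's own entry (always 1) dropped.
    abs_corrs = df.corr()[target].drop(target).abs()
    # This line is the one from the article: keep the features whose
    # absolute correlation is greater than the threshold.
    high_correlations = abs_corrs[abs_corrs > correlation_threshold].index.values.tolist()
    return high_correlations


# Made-up values, chosen so that 'residual sugar' is uncorrelated with
# 'quality' and 'volatile acidity' is negatively correlated with it.
toy_df = pd.DataFrame({
    "alcohol": [9.4, 9.8, 10.5, 11.2, 12.0, 12.8],
    "volatile acidity": [0.70, 0.88, 0.60, 0.45, 0.40, 0.30],
    "residual sugar": [1.9, 2.5, 2.0, 2.6, 2.5, 1.9],
    "quality": [5, 5, 6, 6, 7, 7],
})

selected = get_features(toy_df)
print(selected)
```

Because of the abs() call, ‘volatile acidity’ is kept even though its correlation with ‘quality’ is negative, while ‘residual sugar’ falls below the threshold and is left out.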

In the heatmap, the diagonal entries have a correlation of 1 because each attribute is being compared with itself. So when we compute the correlation between ‘alcohol’ and ‘alcohol’, it returns 1.
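A quick way to check the diagonal yourself is sketched below. The two-column DataFrame is made-up data, used only to show that a column's correlation with itself is 1.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only
small_df = pd.DataFrame({
    "alcohol": [9.4, 9.8, 10.5, 11.2],
    "quality": [5, 5, 6, 7],
})

corr_matrix = small_df.corr()
# The diagonal holds each column's correlation with itself
diagonal = np.diag(corr_matrix.values)
print(diagonal)
```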
