Author: ANKIT GUPTA
Original: Solutions for Skilltest Machine Learning: Revealed
Note: answers and explanations are available; this post only gives the question index. For the full questions and solutions, please see the original article.
1,780 people took this test; the final score distribution is shown below:
Q1. Which of the following methods is best suited to detect outliers in an n-dimensional space, where n > 1?
Q2. Logistic regression differs from multiple regression analysis in which of the following ways?
Q3. What does it mean to bootstrap data?
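Bootstrapping (Q3) means resampling the original data with replacement to build many same-sized pseudo-datasets, typically to estimate the variability of a statistic. A minimal sketch, using made-up toy data (not from the test), estimating the standard error of the sample mean:

```python
import random
import statistics

random.seed(0)

# Hypothetical sample (illustrative only) whose mean we want to study.
data = [2.1, 3.4, 1.8, 5.0, 4.2, 2.9, 3.3, 4.8]

def bootstrap_sample(xs):
    """Draw len(xs) observations from xs with replacement."""
    return [random.choice(xs) for _ in xs]

# Bootstrap distribution of the sample mean over 1000 resamples.
boot_means = [statistics.mean(bootstrap_sample(data)) for _ in range(1000)]
se_mean = statistics.stdev(boot_means)  # bootstrap standard error of the mean
print(round(statistics.mean(data), 3), round(se_mean, 3))
```

The same resampling loop works for any statistic (median, a model coefficient, etc.), which is what makes the bootstrap so general.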
Q4. “Overfitting is a challenge with supervised learning, but not with unsupervised learning.” Is the above statement True or False?
Q5. Which of the following are true with regards to choosing “k” in k-fold cross validation?
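Choosing "k" in k-fold cross-validation (Q5) trades off bias, variance, and compute: each of the k folds serves once as validation data while the remaining folds train the model. A minimal sketch of the index splitting, with arbitrary example sizes:

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k consecutive folds of near-equal size."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

# Each fold is the validation set once; the rest is training data.
for i, val_idx in enumerate(kfold_indices(10, 3)):
    train_idx = [j for j in range(10) if j not in val_idx]
    print(f"fold {i}: validate on {val_idx}, train on {len(train_idx)} points")
```

In practice the indices are usually shuffled (or stratified by class) before splitting; the consecutive split here just keeps the sketch short.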
Q6. A regression model is suffering from multicollinearity. How would you deal with this situation without losing much information?
Q7. After evaluating the model, we find that it has high bias. What are possible ways to reduce it?
Q8. While building a decision tree based model, we split a node on the attribute with the highest information gain. In the image above, select the attribute with the highest information gain.
Q9. Which of the following is a correct statement about “Information Gain”, while splitting a node in a decision tree?
Q10. An SVM model is suffering from underfitting. Which of the following steps will help improve the model's performance?
Q11. Suppose we were plotting visualizations for different values of gamma (the kernel coefficient) in the SVM algorithm with an RBF kernel, but forgot to tag each visualization with its gamma value. Which of the following options best matches the gamma values to the images below (1, 2, 3 from left to right, with gamma values g1 for image 1, g2 for image 2 and g3 for image 3)?
Q12. We are solving a binary classification problem and predicting class probabilities instead of the actual outcomes (0, 1). Suppose we take the model probabilities and apply a threshold of 0.5 to predict the actual class: probabilities greater than or equal to 0.5 are considered the positive class (say 1), and those below 0.5 the negative class (say 0). If we instead use a threshold higher than 0.5 to separate the positive and negative classes, which of the answers below is most appropriate?
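The effect asked about in Q12 is easy to see in code: raising the threshold means fewer examples are predicted positive, so recall can only stay the same or fall, while precision tends to rise. A minimal sketch with hypothetical probabilities and labels (illustrative only, not from the test):

```python
def classify(probs, threshold):
    """Apply a decision threshold to predicted probabilities."""
    return [1 if p >= threshold else 0 for p in probs]

# Hypothetical model probabilities and true labels (illustrative only).
probs = [0.2, 0.4, 0.55, 0.6, 0.7, 0.9]
truth = [0,   0,   0,    1,   1,   1]

for t in (0.5, 0.8):
    preds = classify(probs, t)
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, truth))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, truth))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, truth))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    print(f"threshold={t}: precision={precision:.2f}, recall={recall:.2f}")
```

Sweeping the threshold over many values in this way is exactly what tracing a precision-recall or ROC curve does.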
Q13. "Click-through rate" prediction is a problem with imbalanced classes (say 99% negative class and 1% positive class in our training data). Suppose we build a model on such imbalanced data and find that our training accuracy is 99%. What can we conclude?
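The trap in Q13 is worth making concrete: with a 99%/1% class split, a degenerate model that always predicts the majority class already achieves 99% accuracy while finding none of the positives. A minimal sketch with made-up labels:

```python
# Hypothetical imbalanced labels: 99 negatives, 1 positive (illustrative only).
labels = [0] * 99 + [1]

# A "model" that ignores its input and always predicts the majority class.
preds = [0] * len(labels)

accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
recall_pos = 0.0  # the single positive example is never found
print(accuracy)  # 0.99
```

This is why imbalanced problems are usually evaluated with precision/recall, F1, or AUC rather than raw accuracy.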
Q14. Let's say we are training a model using kNN where the training data has few observations (below is a snapshot of the training data, with two attributes x, y and two labels "+" and "o"). For k = 1, what would be the leave-one-out cross-validation error?
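Leave-one-out CV with 1-NN, as in Q14, classifies each point by its nearest neighbour among all the other points. Since the snapshot from the original question is not reproduced here, the sketch below uses hypothetical 2-D points of its own:

```python
import math

# Hypothetical 2-D points (x, y, label); NOT the snapshot from the question.
points = [(1, 1, "+"), (1, 2, "+"), (2, 1, "o"),
          (5, 5, "o"), (5, 6, "o"), (6, 5, "+")]

def one_nn_loocv_error(data):
    """Leave each point out, classify it with its nearest neighbour (k=1)."""
    errors = 0
    for i, (x, y, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        nearest = min(rest, key=lambda p: math.hypot(p[0] - x, p[1] - y))
        errors += nearest[2] != label
    return errors / len(data)

print(one_nn_loocv_error(points))
```

The same loop, applied to the snapshot in the original question, gives the answer the test is after.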
Q15. We want to train a decision tree on a large dataset. What options could you consider for building a model that will take less time to train?
Q16. Which of the following options are true regarding neural networks?
Q17. Assume we are using the primal non-linearly-separable version of the SVM optimization objective function. What do we need to do to guarantee that the resulting model is linearly separable?
Q18. After training an SVM, can we discard all examples which are not support vectors and still classify new examples?
Q19. Which of the following algorithm(s) can be constructed with the help of a neural network?
Q20. Please choose the datasets / problems where we can apply a Hidden Markov Model.
Q21. We are building an ML model on a dataset with 5,000 features and more than a million observations. We want to train a model on this dataset but are facing a challenge with its size. What steps would you consider to train the model efficiently?
Q22. We want to reduce the number of features in a dataset. Which of the following steps can you take to reduce the number of features (choose the most appropriate answers)?
Q23. Please choose the options that are correct for RandomForest and GradientBoosting Trees.
Q24. For PCA (Principal Component Analysis) transformed features, the independence assumption of Naive Bayes would always be valid, because all principal components are orthogonal and hence uncorrelated. Is this statement True or False?
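The key distinction behind Q24 is that orthogonal principal components are guaranteed to be *uncorrelated* (zero linear correlation), but uncorrelated does not in general imply *independent*, which is what Naive Bayes assumes. A minimal numpy sketch of the first half of that claim, using made-up correlated data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical correlated 2-D data (illustrative only).
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.0], [1.5, 0.5]])
Xc = X - X.mean(axis=0)  # centre the data

# PCA via eigendecomposition of the covariance matrix.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
scores = Xc @ eigvecs  # projections onto the principal components

# Off-diagonal covariance of the scores is ~0: the components are uncorrelated.
print(np.cov(scores, rowvar=False).round(6))
```

Independence would additionally require the joint distribution to factorize, which only follows from uncorrelatedness in special cases (e.g. jointly Gaussian data) — hence the "always" in the statement is the part to scrutinize.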
Q25. Which of the following statements is true about PCA?
Q26. What would be the optimal number of principal components in the figure given below?
Q27. Data scientists often use multiple algorithms for prediction and combine the outputs of multiple machine learning algorithms (known as "ensemble learning") to obtain a more robust or generalized output that outperforms all the individual models. In which of the following options is this true (choose the best possible answer)?
Q28. How can we use a clustering method in a supervised machine learning challenge?
Q29. Which of the following statement(s) are correct?
Q30. Which option(s) is/are true regarding the GradientBoosting tree algorithm?
Q31. Which of the following is a decision boundary of KNN?
Q32. If a trained machine learning model achieves 100% accuracy on the test set, does this mean the model will perform similarly (i.e., give 100%) on a newer test set?
Q33. Below are the common Cross Validation Methods:
Q34. Removed
Q35. Variable selection is intended to select the "best" subset of predictors. In the case of variable selection, what things do we need to check with respect to model performance?
Q36. Which of the following statement(s) may be true after including additional variables in a linear regression model?
Q37. Suppose we evaluate model performance with the visualizations below, plotted for three different models on the same regression problem with the same training data. What do you conclude after seeing these visualizations?
Q38. What are the assumptions we need to follow while applying linear regression?
Q39. When we build linear models, we look at the correlation between variables. While searching the correlation matrices, suppose we find that the correlations between three pairs of variables (Var1 and Var2, Var2 and Var3, Var3 and Var1) are -0.98, 0.45 and 1.23 respectively. What can we infer from this?
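The giveaway in Q39 is that the Pearson correlation coefficient is bounded in [-1, 1] (by the Cauchy-Schwarz inequality), so a value of 1.23 cannot be a valid correlation. A minimal sketch of the coefficient, with made-up data, showing that even a perfect linear relationship caps out at |r| = 1:

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# A perfect linear relationship gives r = 1 (up to floating-point rounding);
# no data can push |r| above 1, so 1.23 must be an error in the matrix.
xs = [1.0, 2.0, 3.0, 4.0]
print(pearson_r(xs, [2 * x + 1 for x in xs]))
```

So the -0.98 (strong negative) and 0.45 (moderate positive) values are interpretable, but the 1.23 entry indicates a mistake in how the matrix was computed or transcribed.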
Q40. If there is high non-linearity and a complex relationship between the dependent and independent variables, a tree model is likely to outperform a classical regression method. Is this statement correct?
Q41. Removed