Formative-2 (Data Science)
Contributed by: Dhanalakshmi
  • 1. A decision tree is most powerful for ____
A) None of these
B) Both classification and prediction
C) classification
D) prediction
  • 2. Decision trees can handle ____
A) Low-dimensional data
B) Medium-dimensional data
C) High-dimensional data
D) None of these
  • 3. In the decision tree algorithm, at the beginning we consider the whole training set as the ____
A) root
B) stem
C) None of these
D) leaf
  • 4. ____ is the measure of uncertainty of a random variable; it characterizes the impurity of an arbitrary collection of examples.
A) Information Gain
B) None of these
C) Entropy
D) Gini Index
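As a rough illustration of how impurity measures like those named in question 4 are computed from class labels, here is a minimal sketch. The function names and the sample label collections are hypothetical, invented for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a collection of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Sum p * log2(1/p) over the classes that actually occur.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def gini_index(labels):
    """Gini index of a collection of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Hypothetical label collections, invented for illustration.
pure = ["yes"] * 8
mixed = ["yes"] * 4 + ["no"] * 4

print(entropy(pure), gini_index(pure))
print(entropy(mixed), gini_index(mixed))
```

The pure collection scores 0 on both measures, while the evenly split one scores 1 bit of entropy and a Gini index of 0.5.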
  • 5. What are the advantages of the decision tree?
A) Non-linear patterns in the data can be captured easily
B) What are the advantages of the decision tree?
C) None of these
D) Both
  • 6. Which of the following is correct with respect to random forest?
A) None of these
B) Random forests are easy to interpret but often very accurate
C) Random forests are difficult to interpret but often very accurate
D) Random forests are difficult to interpret and much less accurate
  • 7. Which of the following is an essential process in which intelligent methods are applied to extract data patterns?
A) Warehousing
B) Text Mining
C) Data Mining
D) Data Selection
  • 8. What is KDD in data mining?
A) Knowledge data house
B) Knowledge Data definition
C) Knowledge Discovery Data
D) Knowledge Discovery Database
  • 9. For what purpose do analysis tools pre-compute summaries of huge amounts of data?
A) To obtain the query response
B) In order to maintain consistency
C) For data access
D) For authentication
  • 10. What are the functions of Data Mining?
A) Cluster analysis and Evolution analysis
B) All of the above
C) Prediction and characterization
D) Association and correlation analysis, classification
  • 11. Which one of the following statements about K-means clustering is incorrect?
A) Nearest neighbor is the same as K-means
B) K-means clustering can be defined as a method of quantization
C) All of the above
D) The goal of K-means clustering is to partition n observations into k clusters
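To make the idea of partitioning n observations into k clusters concrete, here is a minimal sketch of Lloyd's algorithm, the usual iterative procedure for K-means. The function name, the sample points, and the naive initialisation from the first k observations are assumptions made for this example; practical implementations usually randomise the starting centroids.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: partition points into k clusters by nearest centroid."""
    # Naive deterministic start (an assumption for this sketch).
    centroids = list(points[:k])
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids


# Hypothetical 2-D observations forming two well-separated groups.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centers = kmeans(data, k=2)
print(labels)
print(centers)
```

With this data the first three observations end up in one cluster and the last three in the other.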
  • 12. In data mining, how many categories of functions are included?
A) 3
B) 2
C) 4
D) 5
  • 13. What is the importance of using PCA before clustering? Choose the most complete answer.
A) Find good features to improve your clustering score
B) Find which dimensions of the data maximize the feature variance
C) Find the explained variance
D) Avoid bad features
  • 14. When following the steps of the PCA algorithm, why is it so important to standardize your data?
A) Standardized data allows other people to better understand your work
B) Find the features that best predict Y
C) Make training faster
D) Use the best practices of data wrangling
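PCA looks for directions of maximum variance, so the scale of each feature affects the result. The sketch below shows how a feature measured in large units can account for almost all of the raw variance, and how z-score standardization evens this out. The feature names and values are hypothetical, invented for this example.

```python
import math


def variance(values):
    """Sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def standardize(values):
    """Rescale a feature to zero mean and unit sample variance (z-scores)."""
    mean = sum(values) / len(values)
    std = math.sqrt(variance(values))
    return [(v - mean) / std for v in values]


# Hypothetical features on very different scales, invented for illustration.
income = [32000.0, 45000.0, 51000.0, 60000.0, 75000.0]  # currency units
age = [25.0, 31.0, 38.0, 44.0, 52.0]                    # years

raw_total = variance(income) + variance(age)
print(variance(income) / raw_total)  # share of total variance from income

z_income, z_age = standardize(income), standardize(age)
z_total = variance(z_income) + variance(z_age)
print(variance(z_income) / z_total)  # share after standardizing
```

Before standardizing, income accounts for nearly all of the total variance; afterwards each feature contributes an equal share.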
  • 15. Which of the following models includes a backward elimination feature selection routine?
A) MARS
B) All of the mentioned
C) MCV
D) MCRS
  • 16. Which of the following functions is a wrapper for different lattice plots to visualize the data?
A) levelplot
B) featurePlot
C) plotsample
D) None of the mentioned
  • 17. Which of the following can be used to impute data sets based only on information in the training set?
A) preProcess
B) process
C) All of the above
D) postProcess
  • 18. The function preProcess estimates the required parameters for each operation.
A) False
B) True
  • 19. Which of the following can also be used to find new variables that are linear combinations of the original set with independent components?
A) SCA
B) ICA
C) None of the mentioned
D) PCA
  • 20. The preProcess class can be used for many operations on predictors.
A) False
B) True