A) The data is unlabeled, and the model must find patterns on its own. B) The data is labeled, meaning each example is paired with a target output. C) The data is generated randomly by the algorithm.
A) Memorize the entire training dataset perfectly. B) Generalize from the training data to make accurate predictions on new, unseen data. C) Discover hidden patterns without any guidance. D) Reduce the dimensionality of the input data for visualization.
A) The input features. B) The model's parameters. C) The loss function. D) The label or target output.
A) Predicting the selling price of a house based on its features. B) Diagnosing a tumor as malignant or benign based on medical images. C) Forecasting the temperature for tomorrow. D) Estimating the annual revenue of a company.
A) Dimensionality reduction problem. B) Classification problem. C) Regression problem. D) Clustering problem.
A) To predict a target variable based on labeled examples. B) To classify emails into spam and non-spam folders. C) To achieve perfect accuracy on a held-out test set. D) To discover the inherent structure, patterns, or relationships within unlabeled data.
A) Reinforcement Learning. B) Classification. C) Regression. D) Clustering.
A) A support vector machine for classification. B) Clustering, a type of unsupervised learning. C) Linear Regression, a type of supervised learning. D) Logistic Regression, a type of supervised learning.
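For context on the clustering answer above, a minimal sketch of customer segmentation with scikit-learn's KMeans; the two features (annual spend, visit frequency) and both synthetic customer groups are invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic data: two loose "customer" groups (features are made up).
customers = np.vstack([
    rng.normal([200.0, 2.0], [30.0, 0.5], size=(50, 2)),   # low spend, rare visits
    rng.normal([900.0, 10.0], [80.0, 1.5], size=(50, 2)),  # high spend, frequent visits
])

# No labels are given: k-means groups the points purely by similarity.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_[:5])        # cluster assignment per customer
print(kmeans.cluster_centers_)   # learned group centers
```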
A) Increase the number of features to improve model accuracy. B) Reduce the number of features while preserving the most important information in the data. C) Assign categorical labels to each data point. D) Predict a continuous output variable.
A) Deep learning with neural networks. B) Regression in supervised learning. C) Classification in supervised learning. D) Association rule learning in unsupervised learning.
A) It is always more accurate than fully supervised learning. B) It requires no labeled data at all. C) It is simpler to implement than unsupervised learning. D) Labeling data is often expensive and time-consuming, so it leverages a small labeled set with a large unlabeled set.
A) "H@ow much?" or "How many?" B) "What is the underlying group?" C) "Is this pattern anomalous?" D) "Which category?"
A) "What is the correlation between these variables?" B) "Whic@h category?" or "What class?" C) "How can I reduce the number of features?" D) "How much?" or "How many?"
A) Decision Tree for classification B) k-Nearest Neighbors for classification C) Logistic Regression D) Linear Regression
A) Dimensionality reduction B) Regression C) Multi-class classification D) Clustering
A) The average value of a continuous target B) The probability of moving to the next node C) The input features for a new data point D) The final class labels or decisions
A) A continuous value, often the mean of the target values of the training instances that reach the leaf B) The name of the feature used for splitting C) A categorical class label D) A random number
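A small sketch of the regression-tree behavior described above, using scikit-learn's DecisionTreeRegressor on made-up data: each leaf predicts the mean target of the training rows that reach it.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy data with two obvious groups of target values.
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y = np.array([1.0, 1.2, 0.8, 9.0, 11.0, 10.0])

# A depth-1 tree has exactly two leaves.
tree = DecisionTreeRegressor(max_depth=1).fit(X, y)

# Inputs left of the split get mean(1.0, 1.2, 0.8) = 1.0;
# inputs right of it get mean(9.0, 11.0, 10.0) = 10.0.
print(tree.predict([[2.5], [11.5]]))  # -> [ 1. 10.]
```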
A) Immunity to overfitting on noisy datasets B) Interpretability; the model's decision-making process is easy to understand and visualize C) Superior performance on all types of data compared to other algorithms D) Guarantee to find the global optimum for any dataset
A) Initialize the weights of a neural network B) Perform linear regression more efficiently C) Grow a tree structure by making sequential decisions D) Find a linear separating hyperplane in a high-dimensional feature space, even when the data is not linearly separable in the original space
A) All data points in the training set B) Data points that are closest to the decision boundary and most critical for defining the optimal hyperplane C) The weights of a neural network layer D) The axes of the original feature space
A) Their inherent resistance to any form of overfitting B) Their effectiveness in high-dimensional spaces and their ability to model complex, non-linear decision boundaries C) Their superior interpretability and simplicity D) Their lower computational cost for very large datasets
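To ground the SVM questions above, a short scikit-learn sketch on a synthetic dataset: concentric rings admit no separating line in the original 2-D space, but an RBF kernel separates them via the kernel trick, and the fitted model exposes its support vectors.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)

print("linear kernel accuracy:", linear.score(X, y))   # poor: no separating line exists
print("RBF kernel accuracy:", rbf.score(X, y))         # near 1.0 via the kernel trick
print("support vectors:", rbf.support_vectors_.shape)  # the boundary-defining points
```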
A) Clustering B) Training or model fitting C) Dimensionality reduction D) Data preprocessing
A) There are no ground truth labels to compare the results against B) The algorithms are not well-defined C) The data is always too small D) The models are always less accurate than supervised models
A) A Classification algorithm like Logistic Regression B) A Regression algorithm like Linear Regression C) Dimensionality Reduction techniques like Principal Component Analysis (PCA) D) An Association rule learning algorithm
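A minimal PCA sketch with scikit-learn, compressing the four correlated iris features down to two components while retaining most of the variance.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data               # 150 samples x 4 features
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)   # 150 samples x 2 features

print(X_reduced.shape)                      # (150, 2)
print(pca.explained_variance_ratio_.sum())  # ~0.98: little information lost
```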
A) Classification, a supervised learning method B) Clustering, an unsupervised learning method C) Regression, a supervised learning method D) A neural network for image recognition
A) Principal component B) Artificial neuron or perceptron, which receives inputs, applies a transformation, and produces an output C) Support vector D) Decision node in a tree
A) Loss function B) Kernel function C) Optimization algorithm D) Activation function
A) The identity function (f(x) = x) B) Rectified Linear Unit (ReLU) C) The mean squared error function D) A constant function
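A sketch of the single artificial neuron described above: a weighted sum of inputs plus a bias, passed through a ReLU activation. All weights and inputs are made-up numbers.

```python
import numpy as np

def relu(z):
    # ReLU outputs z when positive, 0 otherwise -- the non-linearity
    # that lets stacked layers model non-linear functions.
    return np.maximum(0.0, z)

x = np.array([0.5, -1.0, 2.0])   # inputs
w = np.array([0.8, 0.2, -0.4])   # learned weights
b = 0.1                          # learned bias

print(relu(w @ x + b))           # -> 0.0: the neuron is inactive for this input
```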
A) Iteratively adjusting the weights and biases to minimize a loss function B) Randomly assigning weights and never changing them C) Clustering the input data D) Manually setting the weights based on expert knowledge
A) Perform clustering on the output layer B) Visualize the network's architecture C) Initialize the weights before training D) Efficiently calculate the gradient of the loss function with respect to all the weights in the network, enabling the use of gradient descent
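A toy sketch of what the training answers above describe: one linear neuron fit by gradient descent, with the gradient of the MSE loss computed via the chain rule. Backpropagation generalizes exactly this gradient computation to every weight in a multi-layer network. The data is invented (true relation y = 2x).

```python
import numpy as np

X = np.array([[1.0], [2.0], [3.0]])
y = np.array([[2.0], [4.0], [6.0]])       # true relation: y = 2x

w, lr = 0.0, 0.1
for step in range(50):
    pred = X * w
    loss = np.mean((pred - y) ** 2)       # MSE loss
    grad = np.mean(2 * (pred - y) * X)    # dLoss/dw via the chain rule
    w -= lr * grad                        # gradient descent update

print(w)  # converges close to 2.0
```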
A) K-means clustering exclusively B) Neural networks with many layers (hence "deep") C) Decision trees with a single split D) Simple linear regression models
A) Operate without any need for data preprocessing B) Automatically learn hierarchical feature representations from data C) Be perfectly interpretable, like a decision tree D) Always train faster and with less data
A) Image data, due to their architecture which exploits spatial locality B) Text data and natural language processing C) Unsupervised clustering of audio signals D) Tabular data with many categorical features
A) Flatten the input into a single vector B) Detect local features (like edges or textures) in the input by applying a set of learnable filters C) Initialize the weights of the network D) Perform the final classification
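A numpy sketch of the convolutional idea above: a single filter sliding over a tiny image responds only where a local feature (here a vertical edge) appears. The kernel is hand-set for clarity; in a CNN its values would be learned.

```python
import numpy as np

image = np.zeros((5, 5))
image[:, 2:] = 1.0                  # dark left half, bright right half

kernel = np.array([-1.0, 1.0])      # responds to left-to-right brightness jumps

# Slide the 1x2 kernel across each row (no padding, stride 1).
out = np.zeros((5, 4))
for i in range(5):
    for j in range(4):
        out[i, j] = np.sum(image[i, j:j + 2] * kernel)

print(out)  # strong response only at column 1, where the edge sits
```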
A) Static, non-temporal data B) Independent and identically distributed (IID) data points C) Only image data D) Sequential data, like time series or text, due to their internal "memory" of previous inputs
A) The model overfitting to the training data B) The gradients becoming too large and causing numerical instability C) The gradients becoming exceedingly small as they are backpropagated through many layers, which can halt learning in early layers D) The loss function reaching a perfect value of zero
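A worked illustration of the vanishing-gradient answer above: backpropagating through many sigmoid layers multiplies the gradient by a derivative of at most 0.25 per layer, so it shrinks roughly geometrically with depth.

```python
max_sigmoid_derivative = 0.25        # sigmoid'(z) peaks at 0.25
for depth in (5, 10, 20):
    print(depth, max_sigmoid_derivative ** depth)
# 5  ~9.8e-04
# 10 ~9.5e-07
# 20 ~9.1e-13  -> early layers receive almost no learning signal
```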
A) Deploy the model in a production environment B) Tune the model's hyperparameters C) Fit the model's parameters (e.g., the weights in a neural network) D) Provide an unbiased evaluation of a final model's performance
A) Tuning hyperparameters and making decisions about the model architecture during development B) The initial training of the model's weights C) Data preprocessing and cleaning D) The final, unbiased assessment of the model's generalization error
A) Ignored in the machine learning pipeline B) Used repeatedly to tune the model's hyperparameters C) Used only once, for a final evaluation of the model's performance on unseen data after model development is complete
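A sketch of the train/validation/test splits these questions describe, using two calls to scikit-learn's train_test_split; the 60/20/20 ratio is a common but arbitrary choice.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First peel off 20% as the untouched test set...
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# ...then split the remainder 75/25 to get a validation set (0.25 * 0.8 = 0.2).
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 90 30 30
```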
A) Is too simple to capture the trends in the data B) Learns the training data too well, including its noise and outliers, and performs poorly on new, unseen data C) Fails to learn the underlying pattern in the training data D) Is evaluated using the training set instead of a test set
A) Training for more epochs without any checks B) Increasing the model's capacity by adding more layers C) Dropout, which randomly ignores a subset of neurons during training D) Using a smaller training dataset
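A sketch of (inverted) dropout as applied during training: each activation is zeroed with probability p, and the survivors are rescaled so the expected activation is unchanged at test time. The activation values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p=0.5):
    mask = rng.random(activations.shape) >= p   # keep each unit with prob 1-p
    return activations * mask / (1.0 - p)       # rescale the survivors

a = np.array([0.2, 1.5, 0.7, 0.9])
print(dropout(a))  # a random subset is zeroed on every call
```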
A) The error from sensitivity to small fluctuations in the training set, leading to overfitting B) The error from erroneous assumptions in the learning algorithm, leading to underfitting C) The activation function used in the output layer D) The weights connecting the input layer to the hidden layer
A) The error from erroneous assumptions in the learning algorithm, leading to underfitting B) The error from sensitivity to small fluctuations in the training set, leading to overfitting C) The intercept term in a linear regression model D) The speed at which the model trains
A) Only bias is important for model performance B) Decreasing bias will typically increase variance, and vice versa. The goal is to find a balance C) Only variance is important for model performance D) Bias and variance can be minimized to zero simultaneously
A) Underfitting B) A well-generalized model C) Overfitting D) Perfect model performance
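A numpy sketch of the bias-variance trade-off behind these questions: fitting noisy data with polynomials of increasing degree. A low degree underfits (high bias); a very high degree fits the training noise (high variance) and tends to do worse on held-out data. The data and degrees are arbitrary choices, and np.polyfit may warn about conditioning at high degree.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 30)
y = np.sin(3 * x) + rng.normal(0, 0.2, 30)            # noisy ground truth
x_test = rng.uniform(-1, 1, 30)
y_test = np.sin(3 * x_test) + rng.normal(0, 0.2, 30)

for degree in (1, 4, 15):
    coeffs = np.polyfit(x, y, degree)
    train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(degree, round(train_err, 3), round(test_err, 3))
# Expect: degree 1 -> both errors high (underfit); degree 15 -> tiny
# training error but a larger test error (the overfitting signature).
```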
A) The speed of the backpropagation algorithm B) The accuracy on the test set C) The number of layers in the network D) How poorly the model's predictions match the training data; it's the quantity we want to minimize during training
A) Randomly searches the parameter space for a good solution B) Is only used for unsupervised learning C) Iteratively adjusts parameters in the direction that reduces the loss function D) Guarantees finding the global minimum for any loss function
A) The amount of training data used in each epoch B) The activation function for the output layer C) The size of the step taken during each parameter update. A rate that is too high can cause divergence, while one that is too low can make training slow D) The number of layers in a neural network
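A toy sketch of the gradient descent and learning rate answers above: minimizing f(w) = (w - 3)^2 with two learning rates. A moderate rate converges to the minimum; an overly large one overshoots and diverges.

```python
def grad(w):
    return 2 * (w - 3)             # derivative of f(w) = (w - 3)^2

for lr in (0.1, 1.1):
    w = 0.0
    for _ in range(30):
        w -= lr * grad(w)          # step opposite the gradient
    print(lr, w)                   # 0.1 -> ~3.0 ; 1.1 -> huge magnitude (diverged)
```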
A) The final evaluation on the test set B) The processing of a single training example C) A type of regularization technique D) One complete pass of the entire training dataset through the learning algorithm
A) The number of layers in the network B) The number of training examples used in one forward/backward pass before the model's parameters are updated C) The number of validation examples D) The total number of examples in the training set
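A sketch of the epoch / batch / iteration bookkeeping these questions cover: with 120 examples and a batch size of 32, each epoch contains ceil(120/32) = 4 parameter updates. The counts are made-up.

```python
import math

n_examples, batch_size, n_epochs = 120, 32, 3
iterations_per_epoch = math.ceil(n_examples / batch_size)

total_updates = 0
for epoch in range(n_epochs):          # one epoch = one full pass over the data
    for batch in range(iterations_per_epoch):
        total_updates += 1             # one iteration = one parameter update

print(total_updates)                   # 3 epochs * 4 iterations = 12
```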