1. What is Supervised Learning?
Ans. Supervised learning is a type of machine learning where the algorithm learns from labeled data, making predictions or decisions based on input features.
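A minimal sketch of the idea, assuming scikit-learn is available and using made-up labeled data: the model is fitted on known input/label pairs and then predicts the label of a new input.

```python
# Illustrative only: toy data, scikit-learn assumed.
from sklearn.linear_model import LogisticRegression

X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]  # input features
y = [0, 0, 1, 1]                                      # known labels

model = LogisticRegression()
model.fit(X, y)                      # learn from the labeled examples
print(model.predict([[3.5, 3.5]]))   # predict the label of an unseen input
```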
2. What is the difference between Regression and Classification?
Ans. Regression predicts continuous numerical values, while classification predicts categorical labels or classes.
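For example (a sketch assuming scikit-learn; the numbers are made up), the same inputs can feed a regressor that outputs a continuous value or a classifier that outputs a class label:

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

X = [[1], [2], [3], [4]]

reg = LinearRegression().fit(X, [1.1, 1.9, 3.2, 3.9])  # continuous targets -> regression
clf = LogisticRegression().fit(X, [0, 0, 1, 1])        # class labels -> classification

print(reg.predict([[2.5]]))  # a continuous value
print(clf.predict([[2.5]]))  # a class label (0 or 1)
```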
3. What is Overfitting?
Ans. Overfitting occurs when a model learns the training data too well, capturing noise and random fluctuations. It performs poorly on new, unseen data.
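One way to see this (a sketch assuming scikit-learn, with synthetic data): an unconstrained decision tree can fit its training set perfectly yet score noticeably worse on held-out data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("train accuracy:", tree.score(X_tr, y_tr))  # typically 1.0
print("test accuracy: ", tree.score(X_te, y_te))  # noticeably lower
```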
4. What is a Hyperparameter?
Ans. A hyperparameter is a configuration setting of a model that is set before training and remains constant during training.
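For instance (a sketch assuming scikit-learn), the number of trees and the maximum depth of a random forest are hyperparameters chosen before training, while the trees themselves are learned from the data:

```python
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(
    n_estimators=100,  # hyperparameter: number of trees
    max_depth=5,       # hyperparameter: maximum depth of each tree
)
# model.fit(X, y) would then learn the trees (the model's parameters) from the data
```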
5. Give an example of a Classification Algorithm.
Ans. Logistic Regression is an example of a classification algorithm, used to predict binary outcomes.
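A minimal sketch, assuming scikit-learn and using made-up pass/fail data:

```python
from sklearn.linear_model import LogisticRegression

hours_studied = [[1], [2], [3], [4], [5], [6]]
passed = [0, 0, 0, 1, 1, 1]   # binary outcome

clf = LogisticRegression().fit(hours_studied, passed)
print(clf.predict([[3.5]]))        # predicted class (0 or 1)
print(clf.predict_proba([[3.5]]))  # estimated class probabilities
```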
6. Explain Bias-Variance Tradeoff.
Ans. The bias-variance tradeoff is the balance between error from overly simple assumptions (bias) and error from excessive sensitivity to the training data (variance). High bias leads to underfitting and high variance to overfitting; finding the right balance minimizes total prediction error.
7. What is Cross-Validation?
Ans. Cross-validation is a technique used to assess the performance of a model. The dataset is split into several folds; the model is trained and evaluated multiple times, each time holding out a different fold for testing, which gives a more robust evaluation than a single train/test split.
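A sketch of 5-fold cross-validation, assuming scikit-learn and its built-in Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)  # one score per fold
print(scores, scores.mean())
```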
8. Why do we use a Test Set?
Ans. The test set is used to evaluate the model’s performance on data it has never seen before, providing an unbiased estimate of its predictive power.
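A typical split, sketched with scikit-learn's train_test_split (assumed available):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = DecisionTreeClassifier().fit(X_train, y_train)  # trained only on the training split
print(model.score(X_test, y_test))                      # evaluated on data it has never seen
```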
9. What is Feature Engineering?
Ans. Feature engineering involves creating or transforming features from raw data to improve a model’s performance.
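A hypothetical example using pandas (the column names and values are made up): deriving a ratio and a date component as new features.

```python
import pandas as pd

df = pd.DataFrame({
    "price": [250000, 180000],
    "area_sqft": [2000, 1200],
    "sale_date": pd.to_datetime(["2021-06-01", "2021-11-15"]),
})

df["price_per_sqft"] = df["price"] / df["area_sqft"]  # engineered ratio feature
df["sale_month"] = df["sale_date"].dt.month           # engineered date feature
print(df)
```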
10. Explain Precision in Classification.
Ans. Precision is the ratio of true positive predictions to the total predicted positives. It measures the accuracy of positive predictions.
11. Define Recall in Classification.
Ans. Recall (Sensitivity) is the ratio of true positive predictions to the total actual positives. It measures the ability of the model to identify all relevant cases.
12. What is the F1-Score?
Ans. The F1-Score is the harmonic mean of precision and recall. It provides a balanced measure of a model’s performance.
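The three metrics can be computed together, as sketched below with scikit-learn (assumed) on made-up labels:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("f1:       ", f1_score(y_true, y_pred))         # harmonic mean of the two
```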
13. Explain Regularization.
Ans. Regularization is a technique used to prevent overfitting by adding a penalty term to the model’s loss function.
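For example (a sketch assuming scikit-learn), Ridge adds an L2 penalty and Lasso an L1 penalty on the coefficients; a larger alpha means a stronger penalty and smaller weights:

```python
from sklearn.linear_model import Lasso, Ridge

X = [[1, 2], [2, 1], [3, 4], [4, 3]]
y = [3, 3, 7, 7]

ridge = Ridge(alpha=1.0).fit(X, y)  # loss + alpha * sum(w_i**2)
lasso = Lasso(alpha=0.1).fit(X, y)  # loss + alpha * sum(|w_i|)
print(ridge.coef_, lasso.coef_)
```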
14. Name an Ensemble Learning Technique.
Ans. Bagging (Bootstrap Aggregating) is an ensemble learning technique that trains multiple base learners on bootstrap samples of the training data and combines their predictions (by voting or averaging) to improve overall performance and reduce variance.
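A sketch with scikit-learn (assumed), bagging decision trees on the built-in Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)
bag.fit(X, y)              # each tree sees a different bootstrap sample
print(bag.predict(X[:3]))  # predictions are aggregated across the 50 trees
```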
15. What is Gradient Descent?
Ans. Gradient Descent is an iterative optimization algorithm used to minimize the loss function of a model by adjusting the model’s parameters in the direction of steepest descent.
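A minimal NumPy sketch (the data and learning rate are made up) of gradient descent fitting y ≈ w*x + b by minimizing mean squared error:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])          # generated from y = 2x + 1

w, b, lr = 0.0, 0.0, 0.01                   # parameters and learning rate
for _ in range(5000):
    y_pred = w * x + b
    grad_w = 2 * np.mean((y_pred - y) * x)  # d(MSE)/dw
    grad_b = 2 * np.mean(y_pred - y)        # d(MSE)/db
    w -= lr * grad_w                        # step against the gradient
    b -= lr * grad_b

print(w, b)  # approaches w ≈ 2, b ≈ 1
```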
16. Define Confusion Matrix.
Ans. A confusion matrix is a table that visualizes the performance of a classification algorithm, showing the true positive, true negative, false positive, and false negative counts.
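With scikit-learn (assumed), the matrix for binary labels is arranged as [[TN, FP], [FN, TP]]:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))  # rows: actual class, columns: predicted class
```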
17. Explain One-Hot Encoding.
Ans. One-hot encoding is a technique used to represent categorical variables as binary vectors, where only one bit is ‘hot’ (1) indicating the category.
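A quick sketch with pandas (assumed): each category becomes its own binary column, with a single 1 per row:

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})
print(pd.get_dummies(df, columns=["color"]))
```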
18. What is a Decision Tree?
Ans. A decision tree is a tree-like model used for both classification and regression. It makes decisions based on the values of input features.
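A sketch with scikit-learn (assumed) that fits a shallow tree and prints the learned if/else rules:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree))   # threshold rules learned on the input features
print(tree.predict(X[:3]))
```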
19. Define K-Nearest Neighbors (KNN).
Ans. K-Nearest Neighbors is a simple, instance-based learning algorithm where a prediction for a new point is made from its ‘K’ nearest neighbors in the training set: by majority vote of their labels for classification, or by averaging their target values for regression.
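A classification sketch with scikit-learn (assumed) on two made-up clusters of points:

```python
from sklearn.neighbors import KNeighborsClassifier

X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]
y = [0, 0, 0, 1, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[1.5, 1.5], [8.5, 8.5]]))  # majority vote among the 3 nearest points
```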
20. What is a Neural Network?
Ans. A neural network is a computational model inspired by the structure of the human brain. It consists of layers of interconnected nodes (neurons) and is used for various machine learning tasks.
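A small sketch with scikit-learn's MLPClassifier (assumed): a neural network with one hidden layer of 16 neurons trained on the built-in Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
mlp.fit(X, y)           # weights of the interconnected neurons are learned here
print(mlp.score(X, y))  # training accuracy
```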