Machine Learning Interview Q and A

What is Machine Learning?

Machine learning is a subfield of artificial intelligence (AI) that uses statistical models and algorithms to let computers learn from data without being explicitly programmed for the task. The data is first represented in a form the algorithm can process, typically as numeric features.

What are the different types of Machine Learning?

  • Supervised learning
  • Unsupervised learning
  • Semi-supervised learning
  • Reinforcement learning

What are supervised learning algorithms?

Supervised learning algorithms are machine learning algorithms that learn from input/output pairs.

Some examples of supervised machine learning ?

  • Detecting fraudulent activity in credit card transactions
  • Identifying the zip code from handwritten digits on an envelope
  • Spam filtering

What is labelled data in supervised machine learning?

Input data that is already tagged with the correct output is known as labelled data.

Is it true that the training data works as a supervisor for the machine?

Yes. The labelled training data plays the role of the supervisor.

What is the role of the supervisor (training data) here?

The supervisor provides the correct outputs, which helps the machine learn to predict the output correctly.

Types of supervised machine learning ?

  • Regression
  • Classification

What is regression in machine learning ?

Regression is a technique for investigating the relationship between independent variables and a dependent variable.

When to use a regression algorithm?

When the output variable is continuous (a real number) and there is a relationship between the input and output variables.

When to use Classification algorithm ?

When the output variable is categorical.

Some examples of regression algorithm ?

  • Linear Regression
  • Regression Trees
  • Non-Linear Regression
  • Bayesian Linear Regression
  • Polynomial Regression

What is linear regression?

Linear regression is a supervised machine learning model that finds the best-fit straight line between the independent and dependent variables.

What are dependent and independent variables in machine learning?

Independent variables are the inputs to the process that is being analyzed.

Dependent variables are the outputs of the process.

Mathematical equation for linear regression ?

Y = aX + b

Y = dependent variable
X = independent variable
a and b are the linear coefficients: a is the slope of the line and b is the intercept.
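
The equation above can be fitted to data with ordinary least squares. The sketch below is only an illustration: the function name fit_line and the sample data are made up for this example, and it assumes a single input variable.

```python
def fit_line(xs, ys):
    """Return (a, b) that minimise the squared error of Y = a*X + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of X and Y divided by the variance of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - a * mean_x
    return a, b


# Hypothetical data that lies exactly on Y = 2X + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
print(a, b)
```

For real work, a library such as scikit-learn would normally be used instead of hand-written code.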

What are unsupervised learning algorithms?

In unsupervised learning, only the input data is known, and no known output data is given to the algorithm.

Examples of unsupervised learning:

  • Identifying topics in a set of blog posts
  • Detecting abnormal access patterns to a website

What is scikit-learn ?

  • scikit-learn is an open source project.
  • scikit-learn is a Python library for machine learning.
  • For more, see the scikit-learn User Guide.

What are training data and testing data?

Training data is the initial dataset you use to teach a machine learning application.

Testing data is a separate dataset, not used during training, that is used to evaluate the application's accuracy.
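
A simple way to obtain the two datasets is to shuffle the available rows and hold some of them back for testing. The following is a minimal sketch; the helper name split_train_test, the 25% test fraction, and the fixed seed are assumptions chosen for illustration.

```python
import random


def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(20))  # stand-in for 20 labelled examples
train, test = split_train_test(data)
print(len(train), len(test))
```

The model is trained only on train, and test is kept aside until evaluation.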

What is Jupyter Notebook ?

Jupyter Notebook is an open-source web application that allows a user, scientific researcher, scholar or analyst to create and share documents, called notebooks, containing live code, documentation, graphs, plots, and visualizations.

What is NumPy ?

NumPy is one of the most commonly used packages for scientific computing in Python. It provides a multidimensional array object, as well as variations such as masked arrays and matrices, which can be used for various math operations.

What is SciPy ?

SciPy is a collection of mathematical algorithms and convenience functions built on the NumPy extension of Python.

What is Matplotlib?

Matplotlib is a cross-platform, data visualization and graphical plotting library for Python and its numerical extension NumPy.

What is Pandas ?

Pandas is an open source Python package that is most widely used for data science/data analysis and machine learning tasks. It is built on top of another package named NumPy, which provides support for multi-dimensional arrays.

What is reinforcement learning?

Reinforcement learning is the field of machine learning in which an algorithm learns by interacting with its surroundings in order to maximise a reward function.

What is deep learning?

Deep learning is a branch of machine learning that makes use of neural networks that have numerous layers in order to understand complicated patterns and relationships in data.

What is a neural network?

A neural network is a type of machine learning algorithm that is intended to mimic the way in which the human brain processes information by passing it through a network of interconnected nodes or neurons.

What is overfitting?

Overfitting occurs when a machine learning algorithm learns noise or irrelevant features in the training data rather than the underlying pattern. This leads to good performance on the training data but poor performance on data that the system has not encountered before.

What is underfitting?

Underfitting is a problem that happens in machine learning when an algorithm is overly simplistic and cannot adequately represent the complexities of the data. This leads to poor performance on both the training data and the test data.

What is cross-validation?

The performance of a machine learning system can be evaluated through a method called cross-validation. This method involves dividing the available data into several subsets (folds), training on all but one of them and testing on the one held out, and repeating this so that each subset is used for testing once.
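
The splitting step can be sketched as follows. This is a minimal illustration of k-fold cross-validation, one common form of the method; the function name k_fold_indices and the choice of 10 samples and 3 folds are assumptions made for the example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

A model would be trained and scored once per fold, and the scores averaged.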

What is feature selection?

The process of picking the most essential features or variables contained within a dataset in order to enhance the overall performance of a machine learning system is referred to as feature selection.

What is data preprocessing?

In the context of machine learning, “data preprocessing” refers to the process of cleaning, converting, and otherwise getting data ready for analysis and modelling.

What is feature scaling?

The purpose of the machine learning technique known as feature scaling is to standardise the range and scale of various features or variables contained within a dataset in order to improve the overall performance of the model.
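
Two common ways to scale a feature are min-max scaling (to the range 0 to 1) and standardisation (to mean 0 and standard deviation 1). The sketch below shows both; the function names and the ages list are made up for illustration, and the code assumes the values are not all identical.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardise(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


ages = [20, 30, 40, 60]  # hypothetical feature column
scaled = min_max_scale(ages)
z_scores = standardise(ages)
print(scaled)
print(z_scores)
```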

What is a hyperparameter?

In the context of machine learning, the term “hyperparameter” refers to a parameter of an algorithm that is not learned from the data but rather is defined by the user.

What is a confusion matrix?

The performance of a machine learning algorithm can be evaluated using a table called a confusion matrix. This table compares the predicted and actual values of the target variable in order to determine how well the algorithm performed.

What is precision?

Precision is a metric that represents the percentage of true positives among all positive predictions: precision = TP / (TP + FP). It shows how many of the algorithm's positive predictions were actually correct.

What is meant by the term recall?

Recall is a metric that quantifies the percentage of true positives among all actual positives: recall = TP / (TP + FN). It shows how many of the actual positive cases the algorithm found.

What does the F1 score mean?

The F1 score combines precision and recall into a single number, their harmonic mean: F1 = 2 × precision × recall / (precision + recall).
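
The confusion matrix, precision, recall and F1 score can be computed by hand for a binary classifier. The sketch below assumes the usual definition of F1 as the harmonic mean of precision and recall; the function name confusion_counts and the label lists are made up for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, fn, tn), the four cells of a binary confusion matrix."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, really positive
        elif p == positive:
            fp += 1  # predicted positive, really negative
        elif a == positive:
            fn += 1  # predicted negative, really positive
        else:
            tn += 1  # predicted negative, really negative
    return tp, fp, fn, tn


# Hypothetical labels: 4 actual positives and 6 actual negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(tp, fp, fn, tn, precision, recall, f1)
```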

What is gradient descent?

Gradient descent is an optimisation algorithm that minimises the cost function by repeatedly adjusting the model parameters in the direction opposite to the gradient of the cost.
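
As an illustration, gradient descent can fit the line Y = aX + b by minimising the mean squared error. This is a minimal sketch: the function name, learning rate, step count and data are assumptions chosen so that the example converges.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit Y = a*X + b by gradient descent on the mean squared error."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error with respect to a and b.
        grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the cost.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Hypothetical data that lies exactly on Y = 2X + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = gradient_descent(xs, ys)
print(round(a, 4), round(b, 4))
```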

What is backpropagation?

Backpropagation is the technique used to calculate the gradient of the cost function with respect to the weights and biases in a neural network. It applies the chain rule layer by layer, starting from the output and working backwards.
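
The idea can be shown on a very small network with one input, one tanh hidden unit and one linear output. The network shape, parameter values and function names below are all assumptions made for this sketch; the numeric gradient is included only to check the chain-rule result.

```python
import math


def forward(params, x):
    """Return (hidden activation, output) for a 1-1-1 network."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    y_hat = w2 * h + b2
    return h, y_hat


def loss(params, x, y):
    _, y_hat = forward(params, x)
    return (y_hat - y) ** 2


def backprop(params, x, y):
    """Gradients of the loss w.r.t. (w1, b1, w2, b2) via the chain rule."""
    w1, b1, w2, b2 = params
    h, y_hat = forward(params, x)
    d_yhat = 2 * (y_hat - y)   # dL/dy_hat
    d_w2 = d_yhat * h
    d_b2 = d_yhat
    d_h = d_yhat * w2          # pass the error back to the hidden unit
    d_z = d_h * (1 - h ** 2)   # derivative of tanh
    d_w1 = d_z * x
    d_b1 = d_z
    return [d_w1, d_b1, d_w2, d_b2]


def numeric_grad(params, x, y, eps=1e-6):
    """Central-difference estimate of the same gradients, for checking."""
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up, x, y) - loss(down, x, y)) / (2 * eps))
    return grads


params = [0.5, -0.1, 0.8, 0.2]  # hypothetical weights and biases
grads = backprop(params, x=1.5, y=1.0)
check = numeric_grad(params, x=1.5, y=1.0)
print(grads)
print(check)
```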

What is a decision tree?

A decision tree is a sort of machine learning algorithm that models decisions and the probable outcomes of those decisions by employing a tree-like structure to organise the data.

What is a random forest?

An ensemble learning approach known as a random forest aggregates the results of numerous decision trees in order to increase the accuracy and robustness of the model.

What is clustering?

Clustering is a form of unsupervised learning in which the algorithm groups data points that are similar together based on the features or characteristics of those data points.
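
One widely used clustering algorithm is k-means, which the answer above does not name; it is used here only as an example. The sketch works on one-dimensional points, and the function name, starting centroids and data are assumptions.

```python
def k_means_1d(points, centroids, iterations=10):
    """Group 1-D points around the given starting centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (unchanged if empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]  # two obvious groups
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```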

What is a support vector machine?

Support vector machines (SVMs) are a type of supervised learning method that may be used for classification and regression analysis. For classification, an SVM locates the hyperplane that separates the classes with the largest margin.

What is deep reinforcement learning?

Combining deep learning with reinforcement learning results in deep reinforcement learning, which includes the use of neural networks to learn from interactions with an environment in order to maximise a reward function.

What is natural language processing?

Natural language processing is a subfield of artificial intelligence (AI) that makes use of algorithms and models to teach computers how to comprehend, translate, and create human language.

What is transfer learning?

Transfer learning is a method of machine learning in which an already-trained model is used as a starting point for the training of a new model to address a new task or issue, instead of building a model from the ground up.

What is regularization?

Regularisation is a technique that is used in machine learning to prevent overfitting. It does this by adding a penalty term, called the regularisation term, to the cost function, which pushes the model to have smaller weights or simpler features.
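
A common choice of penalty is the sum of squared weights (L2, also called ridge), scaled by a strength lam. The sketch below shows how the penalty raises the cost of a model with large weights; the function names, data and the value of lam are made up for illustration.

```python
def mse(weights, rows, targets):
    """Mean squared error of a linear model without an intercept."""
    errors = [
        sum(w * x for w, x in zip(weights, row)) - t
        for row, t in zip(rows, targets)
    ]
    return sum(e ** 2 for e in errors) / len(errors)


def ridge_cost(weights, rows, targets, lam):
    """Cost = mean squared error + lam * (sum of squared weights)."""
    penalty = lam * sum(w ** 2 for w in weights)
    return mse(weights, rows, targets) + penalty


rows = [[1.0, 2.0], [2.0, 1.0]]  # hypothetical inputs
targets = [5.0, 4.0]
weights = [1.0, 2.0]             # fits the targets exactly

plain = mse(weights, rows, targets)
penalised = ridge_cost(weights, rows, targets, lam=0.1)
print(plain, penalised)
```

Even though these weights fit the data perfectly, the penalised cost is above zero, so the optimiser is pulled towards smaller weights.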

What is a kernel in machine learning?

In support vector machines and other machine learning methods, a kernel is a function that measures the similarity of two inputs as if they had been converted into a space with a greater dimension, where the classes can be separated more effectively. The conversion itself never has to be computed explicitly.
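
A small example is the degree-2 polynomial kernel on two-dimensional inputs. The sketch below checks that the kernel value equals a dot product in a three-dimensional feature space; the function names and the input vectors are assumptions made for illustration.

```python
import math


def poly_kernel(x, y):
    """Degree-2 polynomial kernel: (x . y) squared."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def feature_map(x):
    """Explicit 3-D mapping whose dot product equals poly_kernel."""
    return [x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2]


x, y = [1.0, 2.0], [3.0, 4.0]
k_value = poly_kernel(x, y)
explicit = sum(a * b for a, b in zip(feature_map(x), feature_map(y)))
print(k_value, explicit)
```

The kernel reaches the same number without building the higher-dimensional vectors.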

What is a convolutional neural network?

A convolutional neural network is a type of neural network that is designed to handle images and videos by using convolutional layers that learn local features and patterns.

What is a recurrent neural network?

A recurrent neural network is a type of neural network that is designed to handle sequence data by using layers that can remember information over time.

What is a generative adversarial network?

A generative adversarial network is a type of neural network that is made up of two parts: a generator and a discriminator. The generator produces synthetic data and the discriminator tries to tell it apart from real data. The two parts are trained against each other until the synthetic data looks real.

What is a deep belief network?

A deep belief network is a sort of neural network that is optimised for unsupervised learning by employing numerous layers of restricted Boltzmann machines to build hierarchical representations of the data.

What is a loss function?

In machine learning, a loss function is a function that measures the difference between the predicted value of a target variable and its actual value.

What is a cost function?

In machine learning, a cost function is a function that measures the total cost or error of a model given a set of parameters or weights.
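
Under one common convention, the loss is measured for a single prediction and the cost is the average loss over the whole dataset. The sketch below follows that convention with squared error; the function names and numbers are made up for illustration.

```python
def squared_loss(predicted, actual):
    """Loss for one example: squared difference."""
    return (predicted - actual) ** 2


def mse_cost(predictions, actuals):
    """Cost for a dataset: mean of the per-example losses."""
    losses = [squared_loss(p, a) for p, a in zip(predictions, actuals)]
    return sum(losses) / len(losses)


predictions = [2.0, 4.0, 7.0]  # hypothetical model outputs
actuals = [3.0, 4.0, 5.0]
cost = mse_cost(predictions, actuals)
print(cost)
```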

What is a learning rate?

In machine learning, the learning rate is a hyperparameter that controls the size of the steps or the rate at which the algorithm updates the model’s weights or parameters while it is being trained.
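
The effect of the learning rate can be seen on a toy problem: minimising f(w) = w² with gradient descent. The function name and the two learning rates below are assumptions chosen to show one run that converges and one that diverges.

```python
def minimise_square(learning_rate, steps=20, w=1.0):
    """Run gradient descent on f(w) = w**2, whose gradient is 2*w."""
    for _ in range(steps):
        w -= learning_rate * 2 * w
    return w


small = minimise_square(0.1)  # each step shrinks w towards the minimum at 0
large = minimise_square(1.1)  # each step overshoots, so |w| keeps growing
print(small, large)
```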

What is early stopping?

Early stopping is a machine learning method that stops the training process when the performance on the validation set starts to get worse. This is done to avoid overfitting.
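
In practice, training is often allowed a few non-improving epochs (a "patience") before it is stopped. The sketch below applies that rule to a list of validation losses; the function name, the patience of 2 and the loss values are assumptions, and the list is assumed to be non-empty.

```python
def run_early_stopping(val_losses, patience=2):
    """Return (best_epoch, last_epoch_run) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    bad_epochs = 0
    for epoch, val_loss in enumerate(val_losses):
        if val_loss < best_loss:
            # New best: remember it and reset the patience counter.
            best_loss, best_epoch, bad_epochs = val_loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # validation loss has stopped improving
    return best_epoch, epoch


# Hypothetical validation losses: improving at first, then getting worse.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
best_epoch, stopped_at = run_early_stopping(val_losses)
print(best_epoch, stopped_at)
```

The model weights saved at best_epoch would be the ones kept.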

EasyExamNotes © 2023
