Sample questions

Trends on Artificial Intelligence

Read the following sentences and fill in the blanks.

The first AI boom occurred in the 1950s. Around that time, programs described as artificial intelligence solved problems based on (A). In particular, (B), developed by IBM in 1996, became famous for defeating Garry Kasparov, the world chess champion at that time. However, since (A) could only solve (C) such as mazes and puzzles, its scope of application was limited, and the first boom came to an end.

(A)
1. knowledge representation
2. representation learning
3. machine learning
4. exploration and reasoning

(B)
1. Deep Blue
2. Bonkras
3. Ponanza
4. Sharp

(C)
1. A/B testing
2. pattern matching
3. toy problems
4. Dartmouth workshop

Choose all statements that are correct about the international image recognition competition, “ILSVRC2012.”

1. Image recognition is a task that deep learning can achieve with the highest accuracy as of 2017.
2. ImageNet is a dataset for handwritten character recognition.
3. The winning team was SuperVision, led by Professor Geoffrey Hinton of the University of Toronto.
4. The result of this competition was called “a breakthrough of a 50-year-old problem in artificial intelligence research.”

Problems in The Field of Artificial Intelligence

The terms listed below are issues that were raised during the second AI boom.
Choose the one appropriate explanation for each issue.

(a) Frame problem
(b) Symbol grounding problem

1. It is difficult to systematize the vast amount of human knowledge.
2. It is difficult to select and consider only the necessary information from the huge amount of information.
3. It is difficult to connect symbols such as words with their meanings.
4. It is difficult to develop a computer to process a huge amount of knowledge.
5. It is difficult to set up the internet to get enough data.

Select two appropriate explanations for “Strong AI and Weak AI”.

1. “Strong AI” is called an expert system and is still widely used today.
2. What is called AGI (Artificial General Intelligence) is closer to “Strong AI”.
3. The development of “the computer that thinks like a human being” in the original sense was the trigger for the third AI boom.
4. In international image recognition competitions, “Weak AI” has achieved discrimination performance that surpasses that of humans.

Machine Learning Methods

The following sentences describe various machine learning techniques. Choose the term that best fits each blank.

There are several methods of machine learning, and it is important to understand the meaning of the terms correctly. A method that uses data with correct labels, called training data, is referred to as (A). In contrast, a method that uses data without correct labels is called (B), and there is also a method called (C), in which correct labels are given only to some samples.

1. unsupervised learning
2. supervised learning
3. reinforcement learning
4. representation learning
5. multi-task learning
6. semi-supervised learning
7. manifold learning

Select the options that best fit the blanks below.

There are various performance metrics for classification problems. Here, we consider a binary classification that divides samples into two classes, Positive and Negative. The (A) is simply the ratio of the number of correctly predicted samples to the total number of samples. It is desirable to use (B) when you want to focus on reducing false positives (FP), and (C) when you want to focus on reducing false negatives (FN). However, since there is a trade-off between the two, (D), which is obtained by taking their harmonic mean, is often used.

1. accuracy
2. realization rate
3. collaborative rate
4. harmony rate
5. precision
6. recall
7. f-score
8. p-value
9. t-statistic
10. z-value
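
Worked example (not part of the question): the candidate metrics above can be checked against their usual definitions. The sketch below is a minimal illustration that computes four of them from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f_score) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        # Harmonic mean of precision and recall.
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return accuracy, precision, recall, f_score


# Hypothetical counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives.
acc, prec, rec, f1 = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f-score={f1:.3f}")
```

Changing only fp or only fn in the call shows which metric reacts to which kind of error.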

In machine learning, it is standard practice to divide the available data into several parts and to use only some of them for training. In other words, the remaining data is held out and not used for model training. Choose the option that best describes the purpose of this technique.

1. To train with a small amount of data once and save the computational resources in the initial stage.
2. To remove samples with outliers contained in the data.
3. To enable semi-supervised learning even when some of the data is not labeled.
4. To correctly estimate the performance when the model is in operation.
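
Worked example (not part of the question): the splitting procedure described above can be sketched as a simple hold-out split. This is a minimal illustration using only the standard library. The function name, the 80/20 ratio, and the fixed seed are assumptions made for the example.

```python
import random


def holdout_split(samples, test_ratio=0.2, seed=0):
    """Shuffle a copy of the samples and split it into (train, held_out)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_held_out = int(len(shuffled) * test_ratio)
    return shuffled[n_held_out:], shuffled[:n_held_out]


# Hypothetical dataset of 100 sample IDs.
data = list(range(100))
train, held_out = holdout_split(data)
print(len(train), len(held_out))
```

The model is fitted on train only. The held_out part never influences the parameters.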

Choose the most suitable combination of words to fill the blanks.

Supervised learning problems can be broadly divided into two types depending on the type of output value. A problem is a (A) problem when the output is a discrete value and the goal is to predict a category. On the other hand, it is a (B) problem when the output is a continuous value and the goal is to predict that value itself.

1. (A) limited (B) general
2. (A) part (B) complete
3. (A) classification (B) regression
4. (A) linear (B) non-linear

Overview of Deep Learning

Choose all that apply as reason(s) why deep learning has rapidly become highly successful in recent years.

1. Progress in semiconductor technology has improved computer performance and enabled high-speed parallel computing on GPUs, making it possible to perform training in a realistic timeframe.
2. With advances in neuroscience, it has become possible to reproduce the structures of the human brain that correspond to each task, such as the visual cortex for image recognition and the language areas for natural language processing.
3. The spread of the Internet has made it possible to obtain enough data to train highly expressive models without overfitting.
4. The invention of the back-propagation method has made it possible to train multi-layer neural networks, which was difficult up until then.
5. Many frameworks for deep learning have been developed, making implementation easy.

Read the following sentences and fill in the blanks.

The steepest descent method, an optimization method used in conventional machine learning, is called (A) because it uses all the data for each parameter update. However, this is difficult in deep learning because of the large amount of data. Therefore, a method called stochastic gradient descent (SGD) is often used. The variant that uses only one sample per update is called (B). Both (A) and (B) have advantages and disadvantages, so it is recommended to adopt (C), which uses a certain number of samples per update.

1. set learning
2. batch learning
3. online learning
4. point learning
5. sampling learning
6. mini-batch learning
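
Worked example (not part of the question): the three update schemes described above differ only in how many samples contribute to each gradient step. The sketch below is a minimal illustration on a one-parameter model y = w * x with mean squared error. The data, learning rate, epoch count, and function names are all hypothetical.

```python
import random


def gradient_step(w, batch, lr):
    """One update of w for the model y = w * x under mean squared error."""
    grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
    return w - lr * grad


def train(data, batch_size, epochs=200, lr=0.05, seed=0):
    """Run gradient descent, using batch_size samples for each update."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        shuffled = data[:]
        rng.shuffle(shuffled)
        for start in range(0, len(shuffled), batch_size):
            w = gradient_step(w, shuffled[start:start + batch_size], lr)
    return w


# Hypothetical noiseless data generated from y = 3x.
data = [(x / 10, 3 * x / 10) for x in range(1, 11)]

w_all = train(data, batch_size=len(data))  # every sample in each update
w_one = train(data, batch_size=1)          # a single sample in each update
w_few = train(data, batch_size=4)          # a small group of samples in each update
print(w_all, w_one, w_few)
```

On this toy data all three settings approach w = 3. They differ in how many updates are made per pass over the data and in how noisy each update is.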

When training a neural network model, the error on the test data was observed. The error decreased steadily until the number of training iterations exceeded 100, after which it gradually increased.
Choose the most appropriate reason for this.

1. As the number of training iterations increases, the value of the error function becomes more difficult to update.
2. As the number of training iterations increases, the model becomes optimized for the training data only.
3. As the number of training iterations increases, the number of parameters that must be updated at one time increases.
4. As the number of training iterations increases, the time required for calculation processing increases.

Deep Learning Methods

Read the following sentences and choose the options that best fit the blanks.

ILSVRC (ImageNet Large Scale Visual Recognition Challenge) is an international competition for image recognition. In 2012, (A), a CNN model, won the championship. Since then, CNN models have been achieving great results. In 2014, (B), which utilized the structure of the inception module, won the championship, while (C) also achieved similarly excellent results in the same year. In 2015, (D), which enabled Deep Residual Network learning (Deep ResNet), won the championship.

1. AlexNet
2. ElmanNet
3. GoogLeNet
4. ImageNet
5. LeNet
6. ResNet
7. VGG
8. WaveNet

Read the following sentences and fill in the blanks.

In neural networks, (A) was initially used as the activation function in the intermediate layers. However, since (B), there was a problem that the gradient used for training became almost 0 as the network was made deeper. This is an important problem called the vanishing gradient problem.
(C), which is often used as an activation function in deep learning, is less likely to cause this problem than (A). It is also characterized by a small amount of calculation. On the other hand, it is known that training proceeds faster by using (D) even when (A) is used as the activation function.

(A)
1. Step function
2. ReLU
3. Sigmoid function
4. Softmax function

(B)
1. the output becomes constant when a negative value is input
2. the average value of the output is 0 and the standard deviation is not 1
3. there is a non-differentiable point in the function
4. the output will be almost constant if the absolute value of the input is large

(C)
1. Step function
2. ReLU
3. Sigmoid function
4. Softmax function

(D)
1. Dropout
2. Batch Normalization
3. Regularization
4. Weight Decay
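
Worked example (not part of the question): the vanishing gradient problem described above can be observed numerically by evaluating activation derivatives and multiplying them across layers. The sketch below is a minimal illustration that compares two of the candidate activation functions. It ignores weights, and the depth of 10 and the sample inputs are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x)), at most 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.6f}  relu'={relu_grad(x):.1f}")

# Back-propagation multiplies one derivative per layer (weights ignored here).
depth = 10
sigmoid_chain = sigmoid_grad(0.0) ** depth  # best case for the sigmoid: 0.25 per layer
relu_chain = relu_grad(1.0) ** depth        # active ReLU units pass the gradient unchanged
print(sigmoid_chain, relu_chain)
```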

Select the most suitable combination of (A) and (B) in the following sentence.

Originally, (A) was considered to be the most suitable for analyzing time-series data, but in the field of speech processing, where the data is a kind of time series, (B) achieves extremely high accuracy.

1. (A) Recurrent Neural Network (B) Convolutional Neural Network
2. (A) Recurrent Neural Network (B) Autoencoder
3. (A) Convolutional Neural Network (B) Recurrent Neural Network
4. (A) Convolutional Neural Network (B) Autoencoder
5. (A) Autoencoder (B) Convolutional Neural Network
6. (A) Autoencoder (B) Recurrent Neural Network

Read the following sentences and fill in the blanks.

RNN (Recurrent Neural Network) was developed to handle (A). Compared with the earlier feedforward neural network, RNN is characterized by adopting a (C) structure, in which (B) is input to the hidden layer in addition to the input data.

(A)
1. periodic data
2. cumulative data
3. chained data
4. series data

(B)
1. last input
2. state of the previous intermediate layer
3. all past inputs
4. state of all intermediate layers in the past

(C)
1. recursion
2. convolution
3. back-propagation
4. regularization
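
Worked example (not part of the question): a single step of a simple (Elman-style) recurrent layer can be sketched as follows. This is a minimal illustration under stated assumptions: the tanh activation, the layer sizes, the weight values, and the function names are all hypothetical.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """Compute the new hidden state from the current input and the previous hidden state."""
    return [
        math.tanh(
            sum(w_x[i][j] * x_t[j] for j in range(len(x_t)))
            + sum(w_h[i][k] * h_prev[k] for k in range(len(h_prev)))
            + b[i]
        )
        for i in range(len(b))
    ]


def run_sequence(inputs, w_x, w_h, b):
    """Feed the inputs one by one, carrying the hidden state forward."""
    h = [0.0] * len(b)
    for x_t in inputs:
        h = rnn_step(x_t, h, w_x, w_h, b)
    return h


# Hypothetical weights: 1 input unit, 2 hidden units.
w_x = [[0.5], [-0.3]]
w_h = [[0.1, 0.2], [0.0, 0.4]]
b = [0.0, 0.1]

h_a = run_sequence([[1.0], [0.5], [-1.0]], w_x, w_h, b)
h_b = run_sequence([[-1.0], [0.5], [-1.0]], w_x, w_h, b)
print(h_a, h_b)
```

The two sequences end with the same two inputs but start differently, and the final hidden states differ, so the layer's output depends on more than the current input.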

Select the one feature of Convolutional Neural Networks, not found in ordinary neural networks, that contributes most to improving generalization performance on classification problems.

1. Feedback is given recursively in the middle layer of the network.
2. The activation function is used to make the decision boundary non-linear.
3. Input features are extracted at regular intervals for the entire image.
4. At the output layer, the output is converted to probability.
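
Worked example (not part of the question): the convolution operation at the core of a Convolutional Neural Network can be sketched in plain Python. This is a minimal illustration with no padding. The image, the kernel, and the function name are hypothetical.

```python
def conv2d(image, kernel, stride=1):
    """Slide one shared kernel over the image and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    feature_map = []
    for top in range(0, len(image) - kh + 1, stride):
        row = []
        for left in range(0, len(image[0]) - kw + 1, stride):
            # The same kernel weights are reused at every position.
            row.append(sum(
                image[top + i][left + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            ))
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to a dark-to-bright step from left to right

print(conv2d(image, edge_kernel))  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

Because the kernel is shared across positions, the edge is detected wherever it appears in the image, with only four weights to learn.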

Research Fields in Deep Learning

Choose the options that best fit the blanks below.

Machine learning applications are also being advanced in the field of robotics.
For example, there are many cases in which (A) algorithms, such as Q learning and Monte Carlo methods, are used to control the motion of a robot.
Since a robot has a (B) system that can collect different kinds of sensor information, such as from a camera (visual sense), a microphone (auditory sense), and a pressure sensor (tactile sense), research is being conducted on the integrated processing of this information by DNN. Research is also being conducted on (C), which attempts to generate a series of robot motions with a single DNN.

(A)
1. End to End Learning
2. Supervised Learning
3. Motion Learning
4. Adaptive Learning
5. Reinforcement Learning
6. Representation Learning

(B)
1. Multi-modal
2. Inception
3. Cognitive
4. Full-scratch

(C)
1. End to End Learning
2. Supervised Learning
3. Motion Learning
4. Adaptive Learning
5. Reinforcement Learning
6. Representation Learning

Choose the most suitable reason why RNNs (Recurrent Neural Networks) have contributed to improving accuracy in the field of natural language processing.

1. By performing the convolution process in the convolution layer, the context can be read from the position where the word appears.
2. Past information can be retained in the hidden layer, and the meaning can be extracted from the character sequence.
3. By providing a storage part outside the network, it became possible to easily refer to sentence patterns.
4. It became possible to learn repeatedly and automatically until the correct sentences could be output.

JDLA Deep Learning Certificates