Keio University

An Invitation to Machine Learning

Published: October 11, 2021

Famous models that describe natural phenomena typically try to represent the phenomenon with a small number of parameters. In recent years, the field of machine learning has tackled prediction problems involving image recognition, natural language analysis, and many other kinds of data using deep learning (artificial neural networks), and I believe research along these lines is progressing in many fields and laboratories within our Faculty of Science and Technology as well. Deep learning models, however, require a vast number of parameters. To take convolutional neural networks, which are now in wide use, as an example: LeNet-5, proposed in 1998, had only about 100,000 parameters, whereas AlexNet, which achieved remarkable results in image recognition in 2012, had about 60 million, and VGGNet, proposed in 2014, exceeded 100 million. The number of parameters has thus been increasing year by year. Are so many parameters really necessary to capture the nature of things? Can't the models be simplified, like the models that describe natural phenomena?
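As a rough sketch of where such parameter counts come from, the following Python fragment tallies the trainable parameters of a small convolutional network. The layer shapes here are invented purely for illustration; they are not the architectures of LeNet-5, AlexNet, or VGGNet.

```python
# Minimal sketch: counting trainable parameters in a small CNN.
# Layer shapes below are illustrative assumptions, not a real architecture.

def conv_params(in_channels, out_channels, kernel_size):
    # Each output channel has one kernel per input channel, plus a bias term.
    return (kernel_size * kernel_size * in_channels + 1) * out_channels

def dense_params(in_features, out_features):
    # A fully connected layer: one weight per input-output pair, plus biases.
    return (in_features + 1) * out_features

layers = [
    conv_params(3, 64, 3),            # 3x3 conv, 3 -> 64 channels
    conv_params(64, 128, 3),          # 3x3 conv, 64 -> 128 channels
    dense_params(128 * 8 * 8, 256),   # flatten an assumed 8x8 feature map
    dense_params(256, 10),            # 10-class output layer
]
print(sum(layers))  # total trainable parameters (about two million here)
```

Even this toy network reaches millions of parameters after a couple of layers; stacking dozens of wider layers is what pushes modern networks to tens or hundreds of millions.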

In the field of machine learning, there is a principle known as Occam's razor: the idea that "one should not use more assumptions or factors than are necessary to explain a matter." Isaac Newton and others stated similar principles. The reason deep learning can currently solve such a variety of problems is attributable to algorithms and architectures that can find appropriate values for the parameters despite their vast number. Ordinarily, when the number of parameters is large, a model fits only the data used to build it and predicts poorly on other data, a problem known as overfitting (illustrated with a simple polynomial example after this passage). An intriguing characteristic of deep learning is that it has overcome this problem to a surprising degree.

In natural language processing in particular, performance is said to improve as the number of parameters grows, and models with parameters on the order of trillions, far beyond hundreds of millions, have recently been proposed. In the current state of deep learning, then, research institutions with access to vast computational resources may have an advantage, but the scope of application is broad, and each researcher should be able to devise methods and models suited to their own field of study.

Furthermore, the sheer number of parameters currently makes it difficult to interpret why a deep learning model made a particular prediction. For example, a model applied to medical images may predict a 90% probability of cancer, yet it is hard to explain the reason for that prediction. Interpreting and explaining prediction results is therefore also essential.
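To make the overfitting point above concrete, here is a minimal sketch using NumPy on invented toy data. It fits polynomials of low and high degree to a handful of noisy points; the high-degree fit, which has as many parameters as there are training points, matches the training data almost exactly but will typically predict worse on held-out points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: a simple underlying trend plus noise (invented example).
x_train = np.linspace(0, 1, 8)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.1, x_train.size)

# Held-out points drawn from the same underlying trend, without noise.
x_test = np.linspace(0.05, 0.95, 50)
y_test = np.sin(2 * np.pi * x_test)

for degree in (3, 7):
    # A degree-d polynomial has d + 1 parameters; degree 7 with 8 points
    # can interpolate the training data (noise included) exactly.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_err:.4f}, test MSE {test_err:.4f}")
```

Deep networks are of course far richer than polynomial fits, but the gap between training and test error in this sketch is exactly the generalization problem that deep learning has, surprisingly, managed to keep in check despite its enormous parameter counts.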

The field of neural networks has repeatedly alternated between periods of active research and periods of relative quiet. This does not mean, however, that the field stagnated when research was less active: the convolutional neural networks mentioned above and the algorithms for finding parameters were devised during such quiet periods and were later rediscovered, becoming the foundational technologies of today. If history repeats itself, there may come a time when research in this field slows again. In that case, models and methods that attract little attention now may once again lead to a breakthrough. Shouldn't we be advancing research with a view to the future, ten or twenty years ahead?

Gakumon no susume (An Encouragement of Learning) (Research Introduction)
