Tesla & Edison – Impact on the future of ML
Speaker: Srinivas Atreya, Chief Data Scientist, Cigniti
-
Here is the transcript:
Srini: Thanks for tuning in. This is Srini Atreya again.
Today I’m going to talk about two very influential personalities in modern science and how their differing styles may have a huge impact on the future of machine learning.
Tesla was born exactly 164 years ago, on the 10th of July, 1856. One of the most telling statements he made to The New York Times about Edison was that if Edison had a needle to find in a haystack, he would not stop to reason where it was most likely to be, but would proceed at once, with the feverish diligence of a bee, to examine straw after straw until he found the object of his search.
If this sounds eerily similar to backpropagation in neural networks, that is no accident. Looking at the field of machine learning, and at the developments in vision and natural language over the last decade or so, we can see that deep learning has been the most successful technique. Neural networks at large can be considered the brute-force hammers of the AI world: effective but inefficient. In fact, one of the reasons for their stupendous success in the recent past has been the cheap availability of hardware, including GPUs, and their ability to run massively parallel computations.

But of course, there is a dirty side to all of this: huge energy consumption, resulting in high carbon footprints. Specifically, there is a process called neural architecture search, which automates the design of neural networks through trial and error; a minimal sketch of that loop follows below. Hailed as one of the new kings of the deep learning castle, it is seen as a quick and dirty way of getting great accuracy on a machine learning task without too much thought put in. However, training something like the Transformer, the state-of-the-art architecture for natural language processing tasks, say to translate between English and French, is vastly more time-consuming with neural architecture search, up to 1,000 times more than without it. Some of these complex models have carbon emissions so high that a single training cycle emits more carbon than about five cars do over their entire lifetimes. Given the dire warning from the Intergovernmental Panel on Climate Change that the world has less than twelve years to limit the damage of global warming, large carbon footprints should be examined closely. Simply planting more trees may no longer cut it.

I call this the Edison way of doing machine learning. It definitely has its benefits in the shorter term, but it is not sustainable over the medium to long term. All is not lost, though: there are other methods out there that are trying to make learning more efficient.
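To make that trial-and-error loop concrete, here is a minimal sketch of a random-search flavour of neural architecture search, assuming scikit-learn and numpy are available. Everything in it is illustrative: a real NAS system searches a far richer space, and each candidate would be a large network rather than a tiny MLP.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Toy stand-in for a real task: 500 samples, 20 features.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
best_score, best_arch = -np.inf, None

for trial in range(10):                    # each trial is one FULL training run
    depth = int(rng.integers(1, 4))        # sample an architecture at random...
    width = int(rng.choice([16, 32, 64]))
    arch = (width,) * depth
    model = MLPClassifier(hidden_layer_sizes=arch, max_iter=300,
                          random_state=0).fit(X_tr, y_tr)
    score = model.score(X_val, y_val)      # ...and judge it only by held-out accuracy
    if score > best_score:
        best_score, best_arch = score, arch

print(f"best architecture {best_arch}, validation accuracy {best_score:.3f}")
```

The point to notice is that every pass through the loop is a complete training run. Multiply that by thousands of candidate architectures and a Transformer-sized model, and the energy bill I just described follows directly.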
Techniques like active learning, few-shot learning, and generative teaching networks are catching up and are expected to have a greater impact over the coming decade. I think the time for Edison-inspired brute-force methods is over. It is time to look to Tesla for inspiration. Future algorithms need to be not only effective but efficient too. Intelligent choice of priors, using meaningful domain assumptions, can go a long way in curing this ill of mindless brute-force search. There are also techniques like extreme learning machines, waiting in the wings, which show a lot of promise and can achieve similar levels of accuracy with just a single layer and no backpropagation; a toy sketch follows.
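To show what that Tesla-style alternative can look like, here is a minimal numpy sketch of an extreme learning machine on toy data. The dataset, layer size, and activation are all illustrative assumptions; the essential idea is that the hidden weights are random and frozen, and the output weights come from a single closed-form least-squares solve, with no backpropagation at all.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy data: 200 samples, 5 features, a binary label.
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] ** 2 > 0.5).astype(float)

n_hidden = 50
W = rng.normal(size=(5, n_hidden))   # random input weights, never trained
b = rng.normal(size=n_hidden)        # random biases, never trained

H = np.tanh(X @ W + b)               # hidden-layer activations
beta = np.linalg.pinv(H) @ y         # output weights from ONE least-squares solve

preds = (H @ beta > 0.5).astype(float)
print("training accuracy:", (preds == y).mean())
```

One linear solve replaces thousands of gradient steps, and that is exactly where the efficiency argument comes from.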
Once again, thanks for listening in, and as usual, all comments are welcome.