Thanks, Other Matt! My understanding is that machine learning requires very large volumes of matrix multiplications (on multidimensional arrays called tensors, in data science jargon). Until recently, graphics processing units (GPUs) were used for these computations: rendering graphics involves lots of matrix operations, so those chips were already relatively well suited to the task. However, a new type of processing unit called the tensor processing unit (TPU) has been developed and released for public use in the last few years. That's the hardware development.
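To make that concrete, here's a minimal sketch (assuming NumPy; the layer sizes are made up for illustration) of why a neural-network layer boils down to exactly the kind of matrix multiplication these accelerators speed up:

```python
import numpy as np

# A single neural-network layer is essentially one matrix multiplication:
# inputs (batch x features) times weights (features x units), plus a bias.
rng = np.random.default_rng(0)
batch, features, units = 32, 128, 64

x = rng.standard_normal((batch, features))   # a batch of inputs
w = rng.standard_normal((features, units))   # learned weights
b = rng.standard_normal(units)               # learned biases

y = x @ w + b   # the matrix multiplication an accelerator parallelizes
print(y.shape)  # one output row per input in the batch
```

Large models chain thousands of these multiplications per training step, which is why hardware built specifically for them matters.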
Training massive machine learning models requires whole fleets of TPUs, which in turn requires software that distributes those computations across many chips. That's the software development.
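Here's a toy illustration of the idea (simulated in plain NumPy; real systems use specialized frameworks, and the shapes here are invented): split a batch of inputs across "devices", let each compute its shard independently, then gather the results.

```python
import numpy as np

# Data parallelism in miniature: each "device" (simulated here as an
# array shard) multiplies its slice of the batch by a replicated weight
# matrix, and the partial results are reassembled at the end.
rng = np.random.default_rng(1)
x = rng.standard_normal((8, 16))   # full batch of 8 inputs
w = rng.standard_normal((16, 4))   # weights replicated on every device

shards = np.split(x, 4)            # 4 devices, 2 inputs each
partial = [s @ w for s in shards]  # each device works independently
y = np.concatenate(partial)        # gather the results back together

# Same answer as doing the whole multiplication on one device:
assert np.allclose(y, x @ w)
```

The engineering challenge at scale is coordinating that split-compute-gather cycle efficiently across hundreds or thousands of real chips.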
These technologies were brand new and hard to come by just a few years ago, which is why the cost was so high. Now, though, several companies (Microsoft, Amazon, and Google) are deploying them at very large scale as they compete for market share in cloud-based machine learning training. That deployment is what drove the dramatic plunge in prices over such a short period.