The rush to adopt machine learning (ML) in broad applications hasn’t yet been matched by efforts to use ML to inform computer architecture design. That’s now changing, and a paper, A Survey of Machine Learning Applied to Computer Architecture Design, by Oregon State University researchers and senior IEEE members Drew Penney and Lizhong Chen, provides a starting point.
They write, “[T]he recent resurgence in AI research is, at least partly, attributed to improved processing capabilities. These improvements are enhanced by hardware optimizations exploiting available parallelism, data reuse, sparsity, etc. in existing ML algorithms. In contrast, there has been relatively limited work applying ML to improve architectural design, with branch prediction being one of a few mainstream examples. This nascent work, although limited, presents an auspicious approach for architectural design. This paper presents an overview of ML applied to architectural design and analysis.”
Penney and Chen describe four main categories of learning approaches:
- Supervised learning: In supervised learning, the model is trained using input features and output targets, with the result being a model that can predict the output for new, unseen inputs. Common supervised learning applications include regression (predicting a value such as processor IPC (instructions per cycle)) and classification (predicting a label such as the optimal core configuration for application execution). Feature selection, discussed in Section 2.3, is particularly important in these applications as the model must learn to predict solely based on feature values.
- Unsupervised learning: Unsupervised learning uses just input data to extract information without human effort. These models can therefore be useful, for example, in reducing data dimensionality by finding appropriate alternative representations or clustering data into classes that may not be obvious for humans…Thus far, the primary two unsupervised learning models applied to architecture are principal components analysis (PCA) and k-means clustering.
- Semi-supervised learning: Semi-supervised learning represents a mix of supervised and unsupervised methods, with some paired input/output data, and some unpaired input data. Using this approach, learning can take advantage of limited labeled data and potentially significant unlabeled data. We note that this approach has, thus far, not yet found application in architecture. Nevertheless, one work on circuits analysis presents a possible strategy that could be adapted in future work.
- Reinforcement Learning: In reinforcement learning, an agent is sequentially provided with input based on an environment state and learns to perform actions that optimize a reward. For example, in the context of memory controllers, the agent replaces traditional control logic. Input could include pending reads and writes while actions could include standard memory controller commands (row read, write, pre-charge, etc.). Throughput could then be optimized by including it in the reward function. Given this setup, the agent will potentially, over time, learn to choose control actions that maximize throughput.
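To make the reinforcement learning description concrete, here is a minimal sketch of the memory-controller idea using tabular Q-learning. It is not taken from the paper or from any work it surveys. The single-bank, two-row DRAM model, the command set (ACTIVATE, READ, PRECHARGE), the 80% row-locality figure, and every function name are assumptions chosen to keep the example small. The state is the pair (open row, requested row), and the reward is 1 for each served read, so maximizing reward corresponds to maximizing throughput.

```python
import random

# Hypothetical command set for a toy single-bank controller (an assumption).
ACTIONS = ("ACTIVATE", "READ", "PRECHARGE")


def step(state, action, rng, locality=0.8):
    """Toy DRAM bank model (illustrative only). Returns (next_state, reward)."""
    open_row, wanted_row = state
    if action == "ACTIVATE" and open_row is None:
        return (wanted_row, wanted_row), 0.0      # open the requested row
    if action == "PRECHARGE" and open_row is not None:
        return (None, wanted_row), 0.0            # close the open row
    if action == "READ" and open_row == wanted_row:
        # Request served. The next request targets the same row with
        # probability `locality`, otherwise the other row.
        next_row = wanted_row if rng.random() < locality else 1 - wanted_row
        return (open_row, next_row), 1.0
    return state, 0.0                             # illegal command: wasted cycle


def train(steps=50_000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning over the toy bank model."""
    rng = random.Random(seed)
    q = {}                                        # (state, action) -> value
    state = (None, 0)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)          # explore
        else:                                     # exploit current estimates
            action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        nxt, reward = step(state, action, rng)
        best_next = max(q.get((nxt, a), 0.0) for a in ACTIONS)
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
        state = nxt
    return q


def greedy_policy(q):
    """Best-known command for each (open_row, wanted_row) state."""
    states = [(o, w) for o in (None, 0, 1) for w in (0, 1)]
    return {s: max(ACTIONS, key=lambda a: q.get((s, a), 0.0)) for s in states}


if __name__ == "__main__":
    for state, command in greedy_policy(train()).items():
        print(state, "->", command)
```

In this toy setting the agent is never told the DRAM protocol. It only sees the state and the reward, yet it should converge on the familiar open-row behavior: activate when the bank is closed, read on a row hit, and precharge on a row conflict. A real controller faces a far larger state space and strict timing constraints, which is why feature selection and hardware cost come up repeatedly as design considerations.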
The authors dig into many of the challenges, feature selection for example, and the approaches being developed to handle them.
Here’s an excerpt looking at GPU performance: “Porting applications for execution on GPUs is a challenging task with potentially uncertain benefits over CPU execution. Work has therefore examined methods to predict speedup or efficiency improvements using just CPU execution behavior. Baldini et al. [19] cast the problem as a classification task, training a modified nearest-neighbor and a support vector machine (SVM) model to determine, based on a threshold, whether GPU implementation would be beneficial. Using this approach, they predicted near-optimal configurations 91% of the time. In contrast, Ardalani et al. [20] trained a large ensemble of regression models to directly predict GPU performance for the code segment. Although several code segments exhibit high error, the geometric mean of the absolute value of the relative error is still just 11.6% and the model successfully identifies several code segments (both beneficial and non-beneficial) that are incorrectly predicted by human experts.”
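The classification framing in the excerpt can be sketched in a few lines. This is a plain k-nearest-neighbor vote, not the modified nearest-neighbor or SVM models of Baldini et al. The two features (fraction of parallelizable work and fraction of memory-bound time), the training points, the speedup values, and the function names are all made up for illustration. Each training sample is labeled "beneficial" when its measured GPU speedup meets a threshold, and a new code segment takes the majority label of its nearest neighbors.

```python
import math
from collections import Counter

# Synthetic, made-up training set (an assumption, not measured data):
# ((parallel_fraction, memory_bound_fraction) from a CPU run, GPU speedup)
TRAINING = [
    ((0.95, 0.10), 8.0),
    ((0.90, 0.20), 5.5),
    ((0.85, 0.15), 4.0),
    ((0.40, 0.70), 0.8),
    ((0.30, 0.60), 0.6),
    ((0.50, 0.80), 1.1),
]


def predict_beneficial(features, training=TRAINING, threshold=2.0, k=3):
    """Majority vote of the k nearest samples: is the speedup >= threshold?"""
    nearest = sorted(training, key=lambda item: math.dist(features, item[0]))[:k]
    votes = Counter(speedup >= threshold for _, speedup in nearest)
    return votes[True] > votes[False]


if __name__ == "__main__":
    print(predict_beneficial((0.92, 0.12)))  # highly parallel segment
    print(predict_beneficial((0.35, 0.65)))  # memory-bound segment
```

Raising `threshold` makes the classifier more conservative about recommending a port. The regression approach described in the same excerpt would instead predict the speedup value itself and leave the port decision to the user.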
The paper is organized as follows: “Section 2 provides background on ML and existing models to build intuition on ML applicability to architectural issues. Section 3 presents existing work on ML applied to architecture. Section 4 then compares and contrasts implementation strategies in existing work to highlight significant design considerations. Section 5 identifies possible improvements and extensions to existing work as well as promising, new applications for future work.”
Link to paper: https://arxiv.org/pdf/1909.12373.pdf