:: Your untrustworthy online reference document about the Skarpac. :: |
Navigation: > [Home page] > [papers] > [Power Estimation] | [Alphabetical Index] [Tree View] |
Local contents:
![]() |
File : | 01d_6.pdf | 286 kbytes | 2004-10-08 |
Title: | System-Level Power Analysis Methodology Applied to the AMBA AHB Bus | |||
Authors: | M. Caldari, M. Conti, M. Coppola, P. Crippa, S. Orcioni, L. Pieralisi, C. Turchetti | |||
Abstract: | The specification on power consumption of a digital system is extremely important due to the growing relevance of the market of portable devices and must be taken into account since the early phases of a complex System-on-Chip design. In this paper some guidelines are provided for the integration of the information on power consumption in the executable model of parameterized cores, with particular attention to the AMBA AHB bus. This will give important information for the analysis and choice between different design architectures driven by functional, timing and power constraints of the System-on-Chip. | |||
![]() |
File : | 2004_dac_anish.pdf | 162 kbytes | 2004-08-02 |
Title: | Automated Energy/Performance Macromodeling of Embedded Software | |||
Authors: | Anish Muttreja, Anand Raghunathan, Srivaths Ravi, Niraj K. Jha | |||
Abstract: | Efficient energy and performance estimation of embedded software is a critical part of any system-level design flow. Macromodeling based estimation is an attempt to speed up estimation by exploiting reuse that is inherent in the design process. Macromodeling involves pre-characterizing reusable software components to construct high-level models, which express the execution time or energy consumption of a sub-program as a function of suitable parameters. During simulation, macromodels can be used instead of detailed hardware models, resulting in orders of magnitude simulation speedup. However, in order to realize this potential, significant challenges need to be overcome in both the generation and use of macromodels, including how to identify the parameters to be used in the macromodel, how to define the template function to which the macromodel is fitted, etc. This paper presents an automatic methodology to perform characterization-based high-level software macromodeling, which addresses the aforementioned issues. Given a sub-program to be macromodeled for execution time and/or energy consumption, the proposed methodology automates the steps of parameter identification, data collection through detailed simulation, macromodel template selection, and fitting. We propose a novel technique to identify potential macromodel parameters and perform data collection, which draws from the concept of data structure serialization used in distributed programming. We utilize symbolic regression techniques to concurrently filter out irrelevant macromodel parameters, construct a macromodel function, and derive the optimal coefficient values to minimize fitting error. Experiments with several realistic benchmarks suggest that the proposed methodology improves estimation accuracy and enables wide applicability of macromodeling to complex embedded software, while realizing its potential for estimation speedup.
We describe a case study of how macromodeling can be used to rapidly explore algorithm-level energy tradeoffs, for the zlib data compression library. | |||
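As a rough illustration of what fitting a macromodel means, the sketch below fits a linear template, energy = a + b * size, by ordinary least squares. This is a deliberately simpler stand-in for the symbolic regression the abstract describes. The function names and the sample points are assumptions made for the example: the samples are synthetic, generated from a known line, and are not measurements from the paper.

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic characterization data (hypothetical): input size in bytes versus
# energy in arbitrary units, generated from energy = 2.0 + 0.5 * size.
sizes = [64, 128, 256, 512, 1024]
energies = [2.0 + 0.5 * s for s in sizes]

intercept, slope = fit_linear(sizes, energies)


def macromodel(size):
    """Estimate energy for a new input size without detailed simulation."""
    return intercept + slope * size
```

Once fitted, `macromodel(2048)` stands in for a detailed simulation run at that input size, which is where the speedup mentioned in the abstract would come from.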
![]() |
File : | 2004_vlsid_yunsi.pdf | 701 kbytes | 2004-08-02 |
Title: | Energy-Optimizing Source Code Transformations for OS-driven Embedded Software | |||
Authors: | Yunsi Fei, Srivaths Ravi, Anand Raghunathan, Niraj K. Jha | |||
Abstract: | The increasing software content of battery-powered embedded systems has fueled much interest in techniques for developing energy-efficient embedded software. Source code transformations have previously been considered for application software to reduce its energy consumption. For complex embedded software applications, which consist of multiple concurrent processes running with the support of an embedded operating system (OS), it is known that the OS and the application-OS interaction significantly affect energy consumption. However, source code transformations explicitly targeting these effects have not been sufficiently studied. This paper proposes novel transformations for the source code of OS-driven multi-process embedded software programs in order to reduce their energy consumption. The key features of our optimizations are that they span process boundaries, and that they minimize the energy consumed in the execution of OS functions and services - opportunities which are beyond the reach of conventional compiler optimizations and source code transformation techniques. We propose four types of transformations, namely process-level concurrency management, message vectorization, computation migration and inter-process communication mechanism selection. We discuss how to systematically identify opportunities for the proposed transformations and apply them directly to the program source code. We have applied the proposed techniques to several multi-process software benchmark programs, and evaluated their applicability in the context of an embedded system containing an Intel StrongARM processor and embedded Linux OS. Our techniques achieve up to 37.9% (23.8% on an average) energy reduction compared to highly compiler-optimized implementations. | |||
![]() |
File : | 00666239.pdf | 30.7 kbytes | 2004-11-13 |
Title: | Software Timing Analysis Using HW/SW Cosimulation and Instruction Set Simulator | |||
Authors: | J. Liu, M. Lajolo, A. Sangiovanni-Vincentelli | |||
Abstract: | Timing analysis for checking satisfaction of constraints is a crucial problem in real-time system design. In some current approaches, the delay of software modules is precalculated by a software performance estimation method, which is not accurate enough for hard real-time systems and complicated designs. In this paper, we present an approach to integrate a clock-cycle-accurate instruction set simulator (ISS) with a fast event-based system simulator. By using the ISS, the delay of events can be measured instead of estimated. An interprocess communication architecture and a simple protocol are designed to meet the requirement of robustness and flexibility. A cached refinement scheme is presented to improve the performance at the expense of accuracy. The scheme is especially effective for applications in which the delay of basic blocks is approximately data-independent. We also discuss the implementation issues by using the Ptolemy simulation environment and the ST20 simulator as an example. | |||
![]() |
File : | 00777398.pdf | 340 kbytes | 2004-11-12 |
Title: | A Compilation-based Software Estimation Scheme for Hardware/Software Co-Simulation | |||
Authors: | M. Lajolo, M. Lazarescu, A. Sangiovanni-Vincentelli | |||
![]() |
File : | 00957910.pdf | 695 kbytes | 2004-11-13 |
![]() |
File : | 00998484.pdf | 281 kbytes | 2004-07-11 |
![]() |
File : | 01012747.pdf | 687 kbytes | 2004-07-11 |
![]() |
File : | 01186633.pdf | 259 kbytes | 2003-04-15 |
Title: | Library Functions Timing Characterization for Source-Level Analysis | |||
Authors: | C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto | |||
Abstract: | Execution time estimation of software at source-level is nowadays a crucial phase of the system design flow, especially for portable devices and real-time systems. From a source-level perspective, a call to an external library function is a black-box: only the binary code of such functions is, in fact, available. This paper proposes a methodology for library functions analysis within a source-level estimation framework. | |||
![]() |
File : | Brandolese-codes00.pdf | 174 kbytes | 2004-01-27 |
Title: | Energy Estimation for 32bit Microprocessors | |||
Authors: | C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto | |||
![]() |
File : | Brandolese-codes01.pdf | 164 kbytes | 2004-01-27 |
Title: | Source–Level Execution Time Estimation of C Programs | |||
Authors: | C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto | |||
Abstract: | In this paper a comprehensive methodology for software execution time estimation is presented. The methodology is supported by rigorous mathematical models of C statements in terms of elementary operations. The deterministic contribution is combined with a statistical term accounting for all those aspects that cannot be quantified exactly. The methodology has been validated by realizing a complete prototype toolset, used to carry out the experiments. | |||
![]() |
File : | Brandolese-dac00.pdf | 188 kbytes | 2004-01-27 |
Title: | An Instruction-level Functionality-based Energy Estimation Model for 32bits Microprocessors | |||
Authors: | C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto | |||
![]() |
File : | Brandolese-date03.pdf | 84.3 kbytes | 2004-01-27 |
Title: | Library Functions Timing Characterization for Source-Level Analysis | |||
Authors: | C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto | |||
Abstract: | Execution time estimation of software at source-level is nowadays a crucial phase of the system design flow, especially for portable devices and real-time systems. From a source-level perspective, a call to an external library function is a black-box: only the binary code of such functions is, in fact, available. This paper proposes a methodology for library functions analysis within a source-level estimation framework. | |||
![]() |
File : | Brandolese-hdlcon00.pdf | 59.5 kbytes | 2004-01-27 |
Title: | Fast Software-Level Power Estimation for Design Space Exploration | |||
Authors: | Carlo Brandolese, William Fornaciari, Fabio Salice, Donatella Sciuto | |||
Abstract: | Aim of the proposed methodology is to perform design space exploration at a high-level of abstraction based on high-level estimations of different parameters. In particular, this paper presents a methodology for static and dynamic estimation of the power consumption of the software components. This analysis is based on a fast software compilation strategy that allows a fast re-targeting over different microprocessors. The paper focuses on the description of the overall power assessment flow and its application on an industrial application. | |||
![]() |
File : | Brandolese-iccad01.pdf | 86.3 kbytes | 2004-01-27 |
Title: | An Assembly–Level Execution–Time Model for Pipelined Architectures | |||
Authors: | G. Beltrame, C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto, V. Trianni | |||
Abstract: | The aim of this work is to provide an elegant and accurate static execution timing model for 32-bit microprocessor instruction sets, covering also inter–instruction effects. Such effects depend on the processor state and the pipeline behavior, and are related to the dynamic execution of assembly code. The paper proposes a mathematical model of the delays deriving from instruction dependencies and gives a statistical characterization of such timing overheads. The model has been validated on a commercial architecture, the Intel486, by means of timing analysis of a set of benchmarks, obtaining an error within 5%. This model can be seamlessly integrated with a static energy consumption model in order to obtain precise software power and energy estimations. | |||
![]() |
File : | Brandolese-isss00.pdf | 200 kbytes | 2004-01-27 |
Title: | Dynamic Modeling of Inter–Instruction Effects for Execution Time Estimation | |||
Authors: | G. Beltrame, C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto, V. Trianni | |||
Abstract: | The market for embedded applications is facing a growing interest in power consumption issues: this work is intended to provide a new model to estimate software–level power consumption of 32-bit microprocessors. This model extends previous ones by considering dynamic inter–instruction effects that take place during code execution, providing a static means to characterize their energy consumption. The model is formally sound: it is conceived for a generic architecture and it has been preliminary validated on the Intel486 architecture. | |||
![]() |
File : | Brandolese-isss01.pdf | 134 kbytes | 2004-01-27 |
Title: | Dynamic Modeling of Inter–Instruction Effects for Execution Time Estimation | |||
Authors: | G. Beltrame, C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto, V. Trianni | |||
Abstract: | The market for embedded applications is facing a growing interest in power consumption issues: this work is intended to provide a new model to estimate software–level power consumption of 32-bit microprocessors. This model extends previous ones by considering dynamic inter–instruction effects that take place during code execution, providing a static means to characterize their energy consumption. The model is formally sound: it is conceived for a generic architecture and it has been preliminary validated on the Intel486 architecture. | |||
![]() |
File : | Brandolese-isss02.pdf | 153 kbytes | 2004-01-27 |
Title: | Modeling Assembly Instruction Timing in Superscalar Architectures | |||
Authors: | G. Beltrame, C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto, V. Trianni | |||
Abstract: | This paper proposes an original model of the execution time of assembly instructions in superscalar architectures. The approach is based on a rigorous mathematical model and provides a methodology and a toolset to perform data analysis and model tuning. The methodology also provides a framework for building new trace simulators for generic architectures. The results obtained show a good accuracy paired with a satisfactory computational efficiency. | |||
![]() |
File : | Brandolese-jcsc02.pdf | 213 kbytes | 2004-01-27 |
Title: | The Impact of Source Code Transformations on Software Power And Energy Consumption | |||
Authors: | C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto | |||
Abstract: | Software power consumption minimization is becoming more and more a very relevant issue in the design of embedded systems, in particular those dedicated to mobile devices. The paper aims at reviewing state of the art source code transformations in terms of their effectiveness on power and energy consumption reduction. A design framework for the C language has been set up, using the gcc compiler with SimplePower as the simulation kernel. Some new transformations have been also identified aiming at reducing the power consumption. Four classes of transformations will be considered: loop transformations, data structures transformations, inter-procedural transformations and control structure transformations. For each transformation, together with the evaluation of the energy and power consumption, some applicability criteria have been defined. | |||
![]() |
File : | Brandolese-tcad02.pdf | 265 kbytes | 2004-01-27 |
![]() |
File : | Brandolese-wcc-icda00.pdf | 172 kbytes | 2004-01-27 |
Title: | A Retargetable Software Power Estimation Methodology | |||
Authors: | Carlo Brandolese | |||
![]() |
File : | Chackrabarti - instruction level power model of microcontrollers.pdf | 42.7 kbytes | 2003-03-26 |
Title: | Instruction Level Power Model Of Microcontrollers | |||
Authors: | Chaitali Chakrabarti, Dinesh Gaitonde | |||
Abstract: | In the design of low power systems, it is important to analyze and optimize both the hardware and the software component of the system. To evaluate the software component of the system, a good instruction-level energy model is essential. In this paper we present a methodology for instruction level modelling of microcontrollers using gate level power estimation tools. We use the microcontroller, M68HC11, to illustrate this method. We study two different implementations of the microcontroller and show that the energy consumption of each instruction is quite different. Our study reveals that data correlation does not significantly affect the energy consumption of most instructions. Finally, we show the correctness of this model by running some sample programs and showing that the predicted energy estimates are quite close to the actual estimates. | |||
![]() |
File : | chandrakasan 95 - optimizing power using transformations.pdf | 83.9 kbytes | 2003-03-27 |
Title: | Optimizing Power Using Transformations | |||
Abstract: | The increasing demand for portable computing has elevated power consumption to be one of the most critical design parameters. A high-level synthesis system, HYPER-LP, is presented for minimizing power consumption in application specific datapath intensive CMOS circuits using a variety of architectural and computational transformations. The synthesis environment consists of high-level estimation of power consumption, a library of transformation primitives, and heuristic/probabilistic optimization search mechanisms for fast and efficient scanning of the design space. Examples with varying degree of computational complexity and structures are optimized and synthesized using the HYPER-LP system. The results indicate that more than an order of magnitude reduction in power can be achieved over current-day design methodologies while maintaining the system throughput; in some cases this can be accomplished while preserving or reducing the implementation area. | |||
![]() |
File : | Choi2001-efficient-instruction-level-optimization.pdf | 345 kbytes | 2003-03-27 |
![]() |
File : | chung01source.pdf | 230 kbytes | 2003-03-26 |
Title: | Source Code Transformation based on Software Cost Analysis | |||
Authors: | Eui-Young Chung, Luca Benini, Giovanni De Micheli | |||
Abstract: | This paper presents a model and a strategy for source-code transformation applied to software application programs to reduce their energy cost. We propose a flexible performance and energy model for a processor-memory system. The benefit of the model is generality (it is not tied to a single memory and processor architecture) and effectiveness of evaluation. With this model, we first estimate the effects of source-code transformations (called transformation cost), representing the improvement ratios of processor cycles, I-cache misses, and D-cache misses. Next, we combine the transformation cost model with hardware parameters to estimate the actual effect of a transformation on performance and energy. The model can be used to guide software transformation selection for power and performance. The experimental results show that the proposed approach finds the optimal transformation in 95% of the cases, and that the penalty when the non-optimal transformation is selected is within 5%. | |||
![]() |
File : | Debray2001software-power-optimization.ps | 117 kbytes | 2003-03-27 |
![]() |
File : | ferrandi98power.pdf | 197 kbytes | 2003-03-26 |
Title: | Power Estimation of Behavioral Descriptions | |||
Authors: | Ferrandi, Fummi, Macii, Poncino, Sciuto | |||
![]() |
File : | FPGA-Power-iccad03.pdf | 614 kbytes | 2003-12-02 |
Title: | On The Interaction Between Power-Aware FPGA CAD Algorithms | |||
Authors: | Julien Lamoureux and Steven J.E. Wilton | |||
Abstract: | As Field-Programmable Gate Array (FPGA) power consumption continues to increase, lower power FPGA circuitry, architectures, and Computer-Aided Design (CAD) tools need to be developed. Before designing low-power FPGA circuitry, architectures, or CAD tools, we must first determine where the biggest savings (in terms of energy dissipation) are to be made and whether these savings are cumulative. In this paper, we focus on FPGA CAD tools. Specifically, we describe a new power-aware CAD flow for FPGAs that was developed to answer the above questions. Estimating energy using very detailed post-route power and delay models, we determine the energy savings obtained by our power-aware technology mapping, clustering, placement, and routing algorithms and investigate how the savings behave when the algorithms are applied concurrently. The individual savings of the power-aware technology-mapping, clustering, placement, and routing algorithms were 7.6%, 12.6%, 3.0%, and 2.6% respectively. The majority of the overall savings were achieved during the technology mapping and clustering stages of the power-aware FPGA CAD flow. In addition, the savings were mostly cumulative when the individual power-aware CAD algorithms were applied concurrently with an overall energy reduction of 22.6%. | |||
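One way to read "mostly cumulative" is to compare the reported overall reduction with two idealized ways of combining the four individual savings quoted above. The sketch below does that arithmetic. Both combination rules (plain addition and multiplicative compounding) are assumptions made for illustration and are not taken from the paper.

```python
# Individual energy savings quoted in the abstract, in percent.
individual = {
    "technology mapping": 7.6,
    "clustering": 12.6,
    "placement": 3.0,
    "routing": 2.6,
}
reported_overall = 22.6  # overall reduction quoted in the abstract, in percent

# Idealized case 1: savings simply add up.
additive = sum(individual.values())

# Idealized case 2: each stage removes its share of the energy that remains.
remaining = 1.0
for pct in individual.values():
    remaining *= 1.0 - pct / 100.0
compounded = (1.0 - remaining) * 100.0

print(f"additive: {additive:.1f}%  compounded: {compounded:.1f}%  "
      f"reported: {reported_overall:.1f}%")
```

Under these assumptions the additive figure is about 25.8% and the compounded figure about 23.7%, so the reported 22.6% falls slightly short of both, which fits the abstract's wording that the savings are mostly, but not fully, cumulative.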
![]() |
File : | Franke-compiler-transformation-of-pointers.ps | 147 kbytes | 2003-03-27 |
![]() |
File : | givargis00instructionbased.pdf | 636 kbytes | 2004-10-08 |
Title: | Instruction-based System-level Power Evaluation of System-on-a-chip Peripheral Cores | |||
Authors: | T. Givargis, F. Vahid, J. Henkel | |||
![]() |
File : | gls-vlsi98.pdf | 76.2 kbytes | 2002-06-04 |
Title: | How to transform an architectural synthesis tool for low power VLSI designs | |||
Authors: | S. Gailhard, N. Julien, J.-Ph. Diguet, E. Martin | |||
Abstract: | High Level Synthesis (HLS) for Low Power VLSI design is a complex optimization problem due to the Area/Time/Power interdependence. As few low power design tools are available, a new approach providing a modular low power synthesis method is proposed. Although based for the moment on a generic architectural synthesis tool Gaut, the use of different commercial tools is possible. Our Gaut_w HLS tool is constituted of low power modules = High level power dissipation estimation, Assignment, Module selection (Operators and supply voltage), Optimization criteria and Operators library. As illustration, power saving factors on DWT algorithms are presented. | |||
![]() |
File : | icics01-pignolo.pdf | 51.0 kbytes | 2002-06-04 |
Title: | High-Level Optimization of energy consumed by real-time applications embedded into DSP systems | |||
Authors: | Sebastien PIGNOLO, E. Martin, B. Saget, N. Julien, E. Senn | |||
Abstract: | This paper deals with Energy Consumption of Real-time applications running on embedded systems. Assumed that a significant part of the power dissipated by systems is due to Input / Output made between all the chips and especially between data-cacheless Digital Signal Processors shipped with tiny internal scratchpad data-memories and compared to huge external memories, it is worth settling a strategy to reduce their transfers by keeping inside the processor the more interesting data that yield to minimize the consumption. Our method is based on a data density criteria and on the modeling of the C source application into two Dependence Graphs of Data and Expression that give information about the behavioral of the data, their precise moments of use and the best instruction scheduling so that data reuse is maximal. First results show that by our method it is possible to reduce by 70% the energy of some applications. | |||
![]() |
File : | Julien - power estimation C DSP.PDF | 51.4 kbytes | 2002-06-04 |
Title: | Power Estimation of a C algorithm based on the Functional-Level Power Analysis of a Digital Signal Processor | |||
Authors: | Nathalie Julien, Johann Laurent, Eric | |||
Abstract: | A complete methodology to estimate power consumption at the C-level for on-the-shelf processors is introduced. It relies on the Functional-Level Power Analysis, which results in a power model of the processor that describes the consumption variations relatively to algorithmic and configuration parameters. Some parameters can be predicted directly from the C-algorithm with simple assumptions on the compilation. Maximum and minimum bounds for power consumption are obtained, together with a very accurate estimation; for the TI C6x, a maximum error of 6% against measurements is obtained for classical digital signal processing algorithms. Estimation results are summarized on a consumption map; the designer can compare the algorithm consumption, and its variations, with the application constraints. | |||
![]() |
File : | Julien-power-estimation-C.pdf | 40.7 kbytes | 2003-10-24 |
Title: | Power Estimation of a C algorithm based on the Functional-Level Power Analysis of a Digital Signal Processor | |||
Authors: | Nathalie Julien, Johann Laurent, Eric Senn, Eric Martin | |||
Abstract: | A complete methodology to estimate power consumption at the C-level for on-the-shelf processors is introduced. It relies on the Functional-Level Power Analysis, which results in a power model of the processor that describes the consumption variations relatively to algorithmic and configuration parameters. Some parameters can be predicted directly from the C-algorithm with simple assumptions on the compilation. Maximum and minimum bounds for power consumption are obtained, together with a very accurate estimation; for the TI C6x, a maximum error of 6% against measurements is obtained for classical digital signal processing algorithms. Estimation results are summarized on a consumption map; the designer can compare the algorithm consumption, and its variations, with the application constraints. | |||
![]() |
File : | Kandemir02compiler-power.ps | 173 kbytes | 2003-03-26 |
Title: | Influence of Compiler Optimizations on System Power | |||
Authors: | M. Kandemir, N. Vijaykrishnan, M. J. Irwin, and W. Ye | |||
Abstract: | High-level compiler optimizations have been widely used to achieve speedups on array-based codes. Such optimizations are becoming increasingly important in embedded signal processing and multimedia systems. The focus of these optimizations has traditionally been on improving performance. However, energy constraints are of critical importance in battery-operated embedded devices. In this paper, we present an experimental evaluation of several state-of-the-art compiler optimizations on energy consumption, considering both the processor core (datapath) and memory system. This is in contrast to many of the previous works that have considered them in isolation. | |||
![]() |
File : | kulkarni93-loop-data-transformation-tutorial.ps | 685 kbytes | 2003-03-27 |
Title: | Loop and Data Transformations: A Tutorial - Technical Report CSRI337, University of Toronto, June 1993 | |||
Authors: | Dattatraya Kulkarni and Michael Stumm | |||
Abstract: | In this tutorial, we address the problem of restructuring a (possibly sequential) program to improve execution efficiency on parallel machines. This restructuring involves the transformation and partitioning of loop structures and data so as to improve parallelism, static and dynamic locality, and load balance. We present previous and ongoing work on loop and data transformations and motivate a unified framework. Key Words: Dependence Analysis, Iteration and Data Spaces, Hierarchical Memory, Parallelism, Locality, Load Balance, Conventional and Unified Loop transformations, Data Alignment, Data Distributions. | |||
![]() |
File : | lebeck00power.pdf | 278 kbytes | 2003-03-26 |
Title: | Power Aware Page Allocation | |||
Authors: | Alvin R. Lebeck, Xiaobo Fan, Heng Zeng, Carla Ellis | |||
![]() |
File : | muth99alto.ps | 369 kbytes | 2003-03-27 |
![]() |
File : | paper1.pdf | 269 kbytes | 2003-02-18 |
Title: | Timing and Energy Estimation of C Programs | |||
Authors: | C. Brandolese, W. Fornaciari, F. Salice, D. Sciuto | |||
Abstract: | This paper addresses the problem of analyzing the timing and energetic aspects of software for embedded applications. The main goal of the approach is to enable design space exploration over different microprocessors, development environments and coding alternatives. The approach embodies the benefits of static and dynamic analysis within a formal mathematical framework and takes full advantage of the accuracy of low–level methodologies while operating at source code level. The experimental assessment of the methodology considered C programs derived from real–world applications and confirmed its accuracy and effectiveness. | |||
![]() |
File : | Power and Energy Conservation for Clusters.pdf | 162 kbytes | 2003-03-26 |
Title: | Research Directions in Power and Energy Conservation for Clusters | |||
Authors: | Ricardo Bianchini | |||
![]() |
File : | power-consumption-estimation-of.pdf | 170 kbytes | 2004-08-30 |
![]() |
File : | power-estimation-of-a.pdf | 54.3 kbytes | 2004-08-30 |
![]() |
File : | Russell - software-power-estimation.pdf | 87.8 kbytes | 2003-03-26 |
Title: | Software Power Estimation and Optimization for High Performance, 32-bit Embedded Processors | |||
Authors: | Jeffry T. Russell, Margarida F. Jacome | |||
![]() |
File : | sami00instructionlevel.pdf | 181 kbytes | 2003-03-26 |
Title: | Instruction-Level Power Estimation for Embedded VLIW Cores | |||
Authors: | M. Sami, D. Sciuto, C. Silvano, V. Zaccaria | |||
Abstract: | In this paper, a power estimation methodology operating at the instruction-level is proposed. The methodology is tightly related to the characteristics of the system architecture, mainly in terms of one or more target processors, the memory sub-system, the system-level buses and the co-processors. In this system-level framework, our main goal is to define a power model for CPU cores at the instruction-level. First, the proposed power model deals with a general five-stage pipeline processor architecture, then, the model is extended to VLIW processors. The derivation of a VLIW instruction-level power model results to be intractable from the point of view of spatial complexity (which grows exponentially w.r.t. the number of possible operations in the ISA). In order to tackle this complexity, a new kind of simplification, based on the original concept of separability of processor functional units, is introduced. The proposed system-level methodology is the first step toward a more general framework to support the design of power-oriented applications through hardware/software co-design. | |||
![]() |
File : | sim-wattch-1.02.tar.gz | 4.52 Mbytes | 2003-12-02 |
![]() |
File : | SimpleScalar-v2-TR.pdf | 962 kbytes | 2003-03-27 |
Title: | The SimpleScalar Tool Set, Version 2.0 | |||
Authors: | Doug Burger, Todd M. Austin | |||
Abstract: | This document describes release 2.0 of the SimpleScalar tool set, a suite of free, publicly available simulation tools that offer both detailed and high-performance simulation of modern microprocessors. The new release offers more tools and capabilities, precompiled binaries, cleaner interfaces, better documentation, easier installation, improved portability, and higher performance. This paper contains a complete description of the tool set, including retrieval and installation instructions, a description of how to use the tools, a description of the target SimpleScalar architecture, and many details about the internals of the tools and how to customize them. With this guide, the tool set can be brought up and generating results in under an hour (on supported platforms). | |||
![]() |
File : | Stanca99array-based-structure-loop.pdf | 107 kbytes | 2003-03-27 |
![]() |
File : | su94-low power architecture and compilation.pdf | 74.3 kbytes | 2004-01-27 |
Title: | Low Power Architecture Design and Compilation Techniques for High-Performance Processors | |||
Authors: | Ching-Long Su, Chi-Ying Tsui, Alvin M. Despain | |||
Abstract: | Reducing switching activity would significantly reduce power consumption of a processor chip. In this paper, we present two novel techniques, Gray code addressing and Cold scheduling, for reducing switching activity on high performance processors. We use Gray code, in which consecutive numbers differ in only one bit, for addressing. Due to locality of program execution, Gray code addressing can significantly reduce the number of bit switches. Experimental results show that for typical programs running on a RISC microprocessor, using Gray code addressing reduces the switching activity at the address lines by 30~50% compared to using normal binary code addressing. Cold scheduling is a software method which schedules instructions in a way that switching activity is minimized. We carried out experiments with cold scheduling on the VLSI-BAM. Preliminary results show that switching activity in the control path is reduced by 20-30%. | |||
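The one-bit property behind Gray code addressing can be checked with a short sketch. It counts how many address lines toggle when a purely sequential address stream is sent in plain binary versus reflected binary Gray code. The helper names and the 16-address stream are assumptions chosen for illustration; real programs are not purely sequential, so this does not reproduce the percentages reported in the abstract.

```python
def to_gray(n: int) -> int:
    """Return the reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def bit_flips(a: int, b: int) -> int:
    """Count the bit positions in which two bus values differ."""
    return bin(a ^ b).count("1")


def total_flips(addresses, encode=lambda x: x):
    """Sum the bit toggles over consecutive addresses after encoding them."""
    coded = [encode(a) for a in addresses]
    return sum(bit_flips(x, y) for x, y in zip(coded, coded[1:]))


# A purely sequential stream of 16 addresses (hypothetical workload).
sequential = list(range(16))
binary_flips = total_flips(sequential)         # plain binary addressing
gray_flips = total_flips(sequential, to_gray)  # Gray code addressing

print(binary_flips, gray_flips)
```

For this stream the binary encoding toggles 26 lines in total, while the Gray encoding toggles exactly one line per step, 15 in total.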
![]() |
File : | tan01high-level-sw-energy-modelling.pdf | 1.46 Mbytes | 2003-03-27 |
Title: | High-level Software Energy Macro-modeling | |||
Authors: | T.K. Tan, A. Raghunathan, G. Lakshminarayana, N.K. Jha | |||
![]() |
File : | tiwari94compilation-techniques-for-low-power.ps | 69.6 kbytes | 2003-03-27 |
![]() |
File : | tiwari94power.pdf | 224 kbytes | 2003-03-26 |
Title: | Power Analysis of Embedded Software: A First Step Towards Software Power Minimization | |||
Authors: | Tiwari, Malik, Wolfe | |||
![]() |
File : | tiwari96instruction.pdf | 303 kbytes | 2003-03-26 |
Title: | Instruction Level Power Analysis and Optimization of Software | |||
Authors: | V. Tiwari, S. Malik, A. Wolfe, M. T. Lee | |||
![]() |
File : | trace-driven-memory-simulation.pdf | 198 kbytes | 2003-04-16 |
Title: | Trace-driven Memory Simulation: A Survey | |||
Authors: | Richard A. Uhlig, Trevor N. Mudge | |||
Abstract: | As the gap between processor and memory speeds continues to widen, methods for evaluating memory-system designs before they are implemented in hardware are becoming increasingly important. One such method, trace-driven memory simulation, has been the subject of intense interest among researchers and has, as a result, enjoyed rapid development and substantial improvements during the past decade. This paper surveys and analyzes these developments by establishing criteria for evaluating trace-driven methods, and then applies these criteria to describe, categorize and compare over 50 trace-driven simulation tools. We discuss the strengths and weaknesses of different approaches and show that no single method is best when all criteria, including accuracy, speed, memory, flexibility, portability, expense, and ease-of-use are considered. In a concluding section, we examine fundamental limitations to trace-driven simulation, and survey some recent developments in memory simulation that may overcome these bottlenecks. | |||
![]() |
File : | wattch-isca2000.pdf | 249 kbytes | 2003-12-02 |
Title: | Wattch: a Framework for Architectural-level Power Analysis and Optimizations | |||
Authors: | David Brooks, Vivek Tiwari, Margaret Martonosi | |||
![]() |
File : | Wilton - CACTI An Enhanced Cache Access and Cycle Time Model.pdf | 284 kbytes | 2003-12-02 |
Title: | CACTI: An Enhanced Cache Access and Cycle Time Model | |||
Authors: | Steven J.E. Wilton, Norman P. Jouppi | |||
Abstract: | This paper describes an analytical model for the access and cycle times of on-chip direct-mapped and set-associative caches. The inputs to the model are the cache size, block size, and associativity, as well as array organization and process parameters. The model gives estimates that are within 6% of Hspice results for the circuits we have chosen. This model extends previous models and fixes many of their major shortcomings. New features include models for the tag array, comparator, and multiplexor drivers, non-step stage input slopes, rectangular stacking of memory subarrays, a transistor-level decoder model, column-multiplexed bitlines controlled by an additional array organizational parameter, load-dependent size transistors for wordline drivers, and output of cycle times as well as access times. Software implementing the model is available via ftp. | |||
![]() |
File : | Wilton WRL-TR-93.5.ps | 1.31 Mbytes | 2003-12-02 |
![]() |
File : | ye 2000 - The Design and use of simplepower.ps | 282 kbytes | 2003-03-26 |
This page was last updated on 2004-12-26 at 18:22:48. This site was automagically generated by MajaMaja, a simple and easy to use web content manager written in Tcl by scarpaz <scarpaz@scarpaz.com>. |