Day 1 (10/1, Sat)

AutoDL

https://sites.google.com/rice.edu/auto-dl/

  • Developing DNN models is hard
  • Developing DNN hardware accelerators is hard, too
    • Both are time- and labor-intensive
  • Even after doing both, there is no guarantee the combination performs optimally
    • In terms of both efficiency & accuracy

→ Hardware-algorithm co-design is needed

  • Keyword: NAS (neural architecture search)
    • HW-NAS (hardware-aware NAS; see the sketch after the three studies below)
  • Three related studies were introduced:
  1. Auto-NBA

http://proceedings.mlr.press/v139/fu21d/fu21d.pdf

  • Input: user requirements for accuracy & efficiency
  • Output: a matched triple of network, bitwidth, and accelerator
  2. HW-NAS-Bench

https://openreview.net/pdf?id=_0kaDkv3dVf

  • Searches for the optimal DNN architecture for a target non-customizable device
  • (maybe a hint for model conversion targeting a specific AI chip?)
  3. DNN-Chip Predictor

https://ieeexplore.ieee.org/document/9053977

  • An analytical predictor of a DNN accelerator's performance (latency/energy) before implementation
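
To make the HW-NAS / Auto-NBA idea concrete, below is a minimal sketch of why network, bitwidth, and accelerator must be scored jointly. Everything in it is a made-up stand-in: the candidate spaces, the accuracy proxy, and the latency model are hypothetical, and Auto-NBA itself uses differentiable search rather than the brute-force enumeration shown here.

```python
import itertools

# Hypothetical search spaces, purely for illustration (not Auto-NBA's real ones).
NETWORKS = {"small": 0.70, "medium": 0.76, "large": 0.80}  # name -> accuracy proxy
BITWIDTHS = [4, 8, 16]                                     # uniform quantization bits
ACCELERATORS = {"pe_64": 64, "pe_256": 256}                # name -> number of PEs
MACS = {"small": 50e6, "medium": 300e6, "large": 1e9}      # made-up MAC counts

def accuracy_proxy(net, bits):
    """Toy proxy: narrower bitwidths cost a little accuracy."""
    return NETWORKS[net] - {4: 0.03, 8: 0.005, 16: 0.0}[bits]

def latency_ms(net, bits, accel):
    """Toy analytical model: narrower operands pack more ops per PE per cycle."""
    ops_per_pe = 16 // bits                                # 4-bit packs 4x vs 16-bit
    cycles = MACS[net] / (ACCELERATORS[accel] * ops_per_pe)
    return cycles / 200e3                                  # assume a 200 MHz clock

def search(latency_budget_ms):
    """Return the most accurate (network, bitwidth, accelerator) triple
    that meets the latency budget."""
    best = None
    for net, bits, accel in itertools.product(NETWORKS, BITWIDTHS, ACCELERATORS):
        lat = latency_ms(net, bits, accel)
        if lat <= latency_budget_ms:
            cand = (accuracy_proxy(net, bits), net, bits, accel, lat)
            best = max(best, cand) if best else cand
    return best

print(search(latency_budget_ms=5.0))
```

The point of the sketch: under this toy model, a 4-bit "large" network on a big PE array beats an 8-bit "medium" one on a small array in both accuracy and latency, which is exactly why the three axes have to be co-searched rather than optimized one at a time.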

FireSim & Chipyard

https://fires.im/micro-2022-tutorial/


Day 2 (10/2, Sun)

FastPath

https://fastpathconference.github.io/FastPath2022/

International Workshop on Performance Analysis of Machine Learning Systems

  • Ingesting and Processing Data Efficiently for Machine Learning
  • Machine Learning for Better Medicine
    • Canceled
  • System Requirements for Deep Learning Foundational Models
  • Using Processing-in-Memory to Accelerate Edge Machine Learning
  • AI4Physics: From Conceptualization to AI-Driven Discovery at Scale
  • SODA: An End-To-End Open-Source Hardware Compiler for Machine Learning Accelerators
  • Faster Learning on Slow Hardware

Day 3 (10/3, Mon)

Session 4B: Machine Learning

  • Skipper: Enabling Efficient SNN Training Through Activation-Checkpointing and Time-Skipping
  • Going Further With Winograd Convolutions: Tap-Wise Quantization for Efficient Inference on 4x4 Tiles
  • Adaptable Butterfly Accelerator for Attention-Based NNs via Hardware and Algorithm Co-Design
  • DFX: A Low-Latency Multi-FPGA Appliance for Accelerating Transformer-Based Text Generation
  • HARMONY: Heterogeneity-Aware Hierarchical Management for Federated Learning System

Day 4 (10/4, Tue)

Keynote

Democratizing Customized Computing

  • How can programmers without circuit-design knowledge use FPGAs easily?
    • AutoSA
    • AutoDSE (see the toy DSE sketch after this list)
    • GNN-DSE
    • HeteroCL
  • But compile times are too long…
    • TAPA
  • MLIR? → seems worth investigating
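
To illustrate what a tool like AutoDSE automates, here is a toy design-space-exploration loop: enumerate HLS-style knobs (tile size, unroll factor), estimate latency and resources with a crude analytical model, and keep the fastest feasible point. The knobs, budgets, and cost model are all invented; the real tool drives an actual HLS compiler over a far larger space with smarter search strategies.

```python
# Toy DSE for an N x N matrix-multiply kernel on a hypothetical FPGA.
# All knobs, budgets, and the cost model are invented for illustration.

N = 1024
DSP_BUDGET = 2048      # hypothetical DSP count on the device
BRAM_BUDGET = 1024     # hypothetical number of 18Kb BRAM blocks

def estimate(tile, unroll):
    """Crude model: unrolling buys parallelism (costs DSPs),
    tiling buys on-chip reuse (costs BRAM), and both cut cycles."""
    dsps = 5 * unroll                             # ~5 DSPs per parallel MAC lane
    brams = 3 * (tile * tile * 4) // 2304 + 1     # three 4-byte tile buffers
    cycles = N**3 // unroll + 2 * (N**3 // tile)  # compute + off-chip traffic
    return cycles, dsps, brams

best = None
for tile in [16, 32, 64, 128]:
    for unroll in [8, 16, 32, 64, 128]:
        cycles, dsps, brams = estimate(tile, unroll)
        if dsps > DSP_BUDGET or brams > BRAM_BUDGET:
            continue                              # infeasible: over budget
        if best is None or cycles < best[0]:
            best = (cycles, tile, unroll, dsps, brams)

print("best (cycles, tile, unroll, DSPs, BRAMs):", best)
```

Real flows replace `estimate` with actual HLS runs, which is where the long compile times come from (hence the TAPA bullet above).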

Session 5B: Accelerators In/Near Memory

  • GenPIP: In-Memory Acceleration of Genome Analysis by Tight Integration of Basecalling and Read Mapping
  • BEACON: Scalable Near-Data-Processing Accelerators for Genome Analysis near Memory Pool with the CXL Support
  • Sparse Attention Acceleration with Synergistic In-Memory Pruning and On-Chip Recomputation
  • ICE: An Intelligent Cognition Engine with 3D NAND-based In-Memory Computing for Vector Similarity Search Acceleration

Day 5 (10/5, Wed)

Session 9A: Design Methodology

  • RemembERR: Leveraging Microprocessor Errata for Improving Design Testing and Validation
  • Datamime: Generating Representative Benchmarks by Automatically Synthesizing Datasets
  • An Architecture Interface and Offload Model for Low-Overhead, Near-Data, Distributed Accelerators
  • Towards Developing High Performance RISC-V Processors Using Agile Methodology

Session 10B: Machine Learning

  • 3D-FPIM: An Extreme Energy-Efficient DNN Acceleration System Using 3D NAND Flash-Based In-Situ PIM Unit
  • Sparseloop: An Analytical Approach to Sparse Tensor Accelerator Modeling
  • DeepBurning-SEG: Generating DNN Accelerators of Segment-Grained Pipeline Architecture
  • ANT: Exploiting Adaptive Numerical Data Type for Low-Bit Deep Neural Network Quantization
  • Ristretto: An Atomized Processing Architecture for Sparsity-Condensed Stream Flow in CNN
ℹ️ pat (이창림): a server developer from Korea