2. Overall Architecture

개요 / Overview

한국어

ENN-PyTorch는 사용자 API, 설정, 모델, 커널·정밀도, 데이터 파이프라인, 학습·예측 런타임이 하나의 실행 흐름으로 연결된 구조다.

파일별로만 보면 전체 구조가 잘 보이지 않는다. Model, Fuser, Scaler, KernelManager, StatelessAutocast, Sampler, Session, ProcessBroker, Checkpointer 같은 구성요소가 서로 다른 디렉터리에 흩어져 있지만, 실제 동작은 사용자 API에서 시작해 설정 구성, 데이터 준비, 워커 런타임 실행, 모델 forward, 결과물 회수로 이어진다.

따라서 ENN-PyTorch의 전체 아키텍처는 파일 목록보다 실행 흐름과 계층별 책임을 기준으로 이해하는 것이 좋다.

English

ENN-PyTorch is organized as a single execution flow that connects the user API, configuration, the model, the kernel and precision strategy, the data pipeline, and the training and prediction runtime.

The overall structure is hard to see from individual files alone. Components such as Model, Fuser, Scaler, KernelManager, StatelessAutocast, Sampler, Session, ProcessBroker, and Checkpointer live in different directories, but at run time execution starts at the user API and proceeds through configuration, data preparation, worker runtime execution, the model forward pass, and artifact collection.

For this reason, the architecture of ENN-PyTorch is best understood in terms of execution flow and per-layer responsibilities rather than as a list of files.

사용자 API / User API
  → 설정 구성 / Configuration
  → 데이터 준비 / Data Preparation
  → 워커 런타임 실행 / Worker Runtime Execution
  → 모델 forward / Model Forward
  → checkpoint·prediction·export artifact 생성 / Checkpoint, Prediction, and Export Artifact Generation
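
The stage order above can be sketched as a chain of functions that pass one context object along. This is a minimal illustration only: every name in it (`RunContext`, `configure`, `prepare_data`, `run_worker_runtime`, `model_forward`, `generate_artifacts`, `user_api_train`) is a hypothetical stand-in and not the ENN-PyTorch API, and the "model" is a trivial mean-centering function chosen so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# NOTE: all names here are hypothetical stand-ins, not the ENN-PyTorch API.


@dataclass
class RunContext:
    """State handed from one stage to the next."""
    trace: List[str] = field(default_factory=list)
    artifacts: Dict[str, object] = field(default_factory=dict)


def configure(ctx: RunContext) -> RunContext:
    ctx.trace.append("configuration")
    ctx.artifacts["config"] = {"precision": "bf16"}  # assumed example setting
    return ctx


def prepare_data(ctx: RunContext) -> RunContext:
    ctx.trace.append("data_preparation")
    rows = [r for r in ctx.artifacts["raw"] if r is not None]  # raw data cleanup
    ctx.artifacts["rows"] = rows
    ctx.artifacts["mean"] = sum(rows) / len(rows)  # a toy "scale statistic"
    return ctx


def model_forward(x: float, mean: float) -> float:
    return x - mean  # toy forward pass: center the input


def run_worker_runtime(ctx: RunContext) -> RunContext:
    ctx.trace.append("worker_runtime")
    ctx.trace.append("model_forward")  # the forward pass runs inside the runtime
    mean = ctx.artifacts["mean"]
    ctx.artifacts["outputs"] = [model_forward(x, mean) for x in ctx.artifacts["rows"]]
    return ctx


def generate_artifacts(ctx: RunContext) -> RunContext:
    ctx.trace.append("artifact_generation")
    ctx.artifacts["prediction"] = list(ctx.artifacts["outputs"])
    ctx.artifacts["checkpoint"] = {"mean": ctx.artifacts["mean"]}
    return ctx


def user_api_train(raw_rows: List[Optional[float]]) -> RunContext:
    """Entry point: the user API call drives every later stage in order."""
    ctx = RunContext()
    ctx.artifacts["raw"] = raw_rows
    for stage in (configure, prepare_data, run_worker_runtime, generate_artifacts):
        ctx = stage(ctx)
    return ctx
```

Calling `user_api_train([1.0, None, 3.0])` leaves the visited stage names in `ctx.trace` in the same order as the list above, which is the point of the sketch: the stages are easier to follow as one flow than as separate files.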

전체 실행 흐름 / Overall Execution Flow

한국어

아래 도식은 ENN-PyTorch의 전체 실행 흐름을 보여준다. 사용자 API와 설정 계층에서 시작한 실행은 데이터 준비, 학습·예측 런타임, 모델 실행, 결과 회수, 저장·내보내기 계층으로 이어진다. 커널·정밀도 전략과 운영 환경은 특정 한 계층에만 속하지 않고, 모델 실행과 런타임 안정성에 함께 영향을 준다.

English

The diagram below shows the overall execution flow of ENN-PyTorch. Execution starts at the user API and configuration layer, then moves through data preparation, the training and prediction runtime, model execution, artifact collection, and the save or export layer. The kernel and precision strategy and the operating environment do not belong to any single layer; they affect both model execution and runtime stability.

flowchart TD
    A["사용자 API<br/>User API<br/>new_model / train / predict<br/>save_model / load_model"] --> B["설정 구성<br/>Configuration<br/>ModelConfig / RuntimeConfig"]

    B --> C["데이터 준비<br/>Data Preparation<br/>raw data cleanup<br/>memmap staging<br/>scale statistics"]
    C --> D["학습·예측 런타임<br/>Training & Prediction Runtime<br/>elastic worker<br/>ProcessBroker<br/>distributed process group"]

    D --> E["모델 실행<br/>Model Execution<br/>Embedding / Scaler<br/>Template / Fuser<br/>Collector / SigmoidGate"]
    E --> D

    D --> F["결과 회수<br/>Artifact Collection<br/>checkpoint<br/>prediction chunks<br/>manifest"]

    F --> G["저장 또는 내보내기<br/>Save or Export<br/>native checkpoint<br/>ONNX / ORT / TensorRT<br/>PT2 / AOTI / ExecuTorch"]

    H["커널·정밀도 전략<br/>Kernel & Precision Strategy<br/>AttentionPolicy<br/>KernelManager<br/>Autocast / FP8 / BF16 / FP32"] --> E

    I["운영 환경<br/>Operating Environment<br/>Python / OS<br/>GPU / filesystem<br/>temporary cache"] --> C
    I --> D
    I --> H
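
One way to check the claim that the kernel and precision strategy and the operating environment reach across several layers is to encode the diagram's edges and compute what lies downstream of each node. The sketch below is illustrative only: the edge list is transcribed from the flowchart above, while the short node names and the `downstream` helper are labels and code introduced here, not part of ENN-PyTorch.

```python
from collections import deque
from typing import Dict, Set, Tuple

# Edges transcribed from the flowchart; node names are shorthand chosen here.
EDGES: Dict[str, Tuple[str, ...]] = {
    "user_api": ("configuration",),
    "configuration": ("data_preparation",),
    "data_preparation": ("runtime",),
    "runtime": ("model_execution", "artifact_collection"),
    "model_execution": ("runtime",),  # results return to the runtime
    "artifact_collection": ("save_or_export",),
    "kernel_precision": ("model_execution",),
    "operating_environment": ("data_preparation", "runtime", "kernel_precision"),
}


def downstream(start: str) -> Set[str]:
    """Return every node reachable from `start` by following diagram edges."""
    seen: Set[str] = set()
    queue = deque(EDGES.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # the runtime <-> model_execution cycle is visited once
        seen.add(node)
        queue.extend(EDGES.get(node, ()))
    return seen
```

With these edges, `downstream("kernel_precision")` contains `model_execution`, `runtime`, `artifact_collection`, and `save_or_export`, and `downstream("operating_environment")` additionally contains `data_preparation` and `kernel_precision`. Neither is confined to one layer, which matches the description of the diagram.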

한국어

이 흐름에서 모델 실행은 런타임과 분리되어 있지 않다. 모델은 단순히 forward()만 수행하는 객체가 아니라, Scaler, autocast, attention backend, prediction fallback, checkpoint 회수와 함께 동작한다.

English