| NNEF | |
|---|---|
| Name | NNEF |
| Developer | Khronos Group |
| Released | 2017 |
| Latest release | 1.0 |
| Operating system | Cross-platform |
| Genre | File format |
| License | Open standard |
NNEF
NNEF (Neural Network Exchange Format) is a standardized interchange format for representing neural network models and computational graphs, designed to facilitate portability between training frameworks and deployment targets. It was developed to provide a compact, framework-agnostic representation that complements exporter and importer toolchains from organizations such as Intel, Arm, NVIDIA, Google, and Apple. NNEF aims to reduce fragmentation between frameworks such as TensorFlow, PyTorch, Caffe, and MXNet and inference engines such as TensorRT, OpenVINO, ONNX Runtime, and Core ML.
NNEF is specified by the Khronos Group as an open standard describing computational graphs, tensor shapes, operator attributes, and binary weight data. The specification targets interoperability with frameworks such as TensorFlow, PyTorch, Caffe, MXNet, Theano, and Chainer, and with runtime engines including TensorRT, Intel's OpenVINO toolkit, ONNX Runtime, Core ML, TVM, Glow, PlaidML, and the Arm Compute Library. In practice, converters are run from environments such as Google Colaboratory, Jupyter Notebook, and Docker containers, and from CI systems hosted by providers such as Microsoft and Amazon Web Services. NNEF is often discussed alongside serialization technologies such as ONNX, FlatBuffers, Protocol Buffers, JSON, and HDF5.
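The flavour of this notation can be seen in a short example. The fragment below is a sketch in the style of NNEF's flat textual syntax, describing a one-convolution network; exact operator names and attribute spellings should be checked against the NNEF 1.0 specification:

```
version 1.0;

graph net( input ) -> ( output )
{
    input = external(shape = [1, 3, 224, 224]);
    filter = variable(shape = [16, 3, 3, 3], label = 'conv1/filter');
    conv1 = conv(input, filter, padding = [(1, 1), (1, 1)]);
    output = relu(conv1);
}
```

The weights of variable tensors (here labelled 'conv1/filter') are stored separately in binary data files, which keeps the graph description itself compact and human-readable.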
Work on NNEF began in the mid-2010s within the Khronos Group, with the aim of providing a vendor-neutral alternative to the interchange formats emerging from Google, Facebook, and other major contributors to machine learning tooling. Early contributors included engineers from Intel, Arm, NVIDIA, Qualcomm, and several academic labs. A provisional NNEF 1.0 specification was released in late 2017, and the finalized 1.0 specification followed in 2018. The specification matured through public working group meetings and sessions at machine learning conferences such as CVPR, NeurIPS, ICML, and ECCV, and was published to encourage broad tooling support from vendors such as Samsung and Sony. NNEF's development paralleled that of the Open Neural Network Exchange (ONNX), but differentiated itself by focusing on a compact textual graph notation and explicit binary packaging of weights.
NNEF defines a textual syntax for operators, tensors, and graph topology, plus a binary format for serialized parameter data. The format specifies tensor metadata (data type, shape, layout) and operator signatures for primitives such as convolution, pooling, batch normalization, activation functions, recurrent units, and quantization. Numeric types follow conventions familiar to hardware and software implementers, including IEEE floating-point formats. The specification documents explicit semantics for data-layout transformations between the NHWC and NCHW conventions, and normative behavior for edge cases, similar in spirit to the definitions found in BLAS libraries and compiler backends such as LLVM. NNEF also provides an extension mechanism for vendor-specific operators, such as those targeting Google's accelerators, Apple's Neural Engine, NVIDIA's CUDA kernels, and custom ASICs from companies like Graphcore and Cerebras Systems.
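The NHWC/NCHW layout transformations mentioned above amount to a tensor transpose. A minimal sketch in Python with NumPy (the helper names are our own, not part of any NNEF tooling):

```python
import numpy as np

def nhwc_to_nchw(x: np.ndarray) -> np.ndarray:
    """Convert a batch of feature maps from NHWC to NCHW layout."""
    assert x.ndim == 4, "expected a rank-4 tensor (N, H, W, C)"
    # Move the channel axis from last to second position.
    return np.ascontiguousarray(x.transpose(0, 3, 1, 2))

def nchw_to_nhwc(x: np.ndarray) -> np.ndarray:
    """Inverse transformation: NCHW back to NHWC."""
    assert x.ndim == 4, "expected a rank-4 tensor (N, C, H, W)"
    return np.ascontiguousarray(x.transpose(0, 2, 3, 1))

# A 1x4x4x3 image (N, H, W, C) becomes 1x3x4x4 (N, C, H, W).
img = np.arange(48, dtype=np.float32).reshape(1, 4, 4, 3)
chw = nhwc_to_nchw(img)
assert chw.shape == (1, 3, 4, 4)
assert np.array_equal(nchw_to_nhwc(chw), img)  # round-trip is lossless
```

Both layouts hold the same values; only the memory order differs, which is why a converter can translate between frameworks that disagree on layout without losing information.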
Typical use cases include exchanging models between research prototypes developed in Jupyter Notebook or Google Colaboratory and deployment pipelines targeting inference engines such as TensorRT, OpenVINO, Core ML, and TVM, or embedded SDKs from Arm and NXP Semiconductors. Converters exist to translate models from TensorFlow, Keras, PyTorch, Caffe, and MXNet into NNEF, often as intermediate artifacts for testing on hardware from NVIDIA, Intel, Samsung, and Qualcomm. NNEF has featured in interoperability demonstrations at trade shows and conferences such as CES, Mobile World Congress, and ISC High Performance. Tooling includes open-source parsers and exporters hosted on platforms such as GitHub, GitLab, and Bitbucket, exercised by continuous-integration services such as Travis CI and Jenkins.
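A converter's core job is to map a framework's in-memory graph onto NNEF's operator vocabulary and then serialize it. The sketch below uses a deliberately tiny, hypothetical intermediate representation (the `Op` class and `emit_nnef_like` function are illustrative inventions, not part of any real converter) to show the shape of that final serialization step:

```python
from dataclasses import dataclass, field

@dataclass
class Op:
    """One node of a minimal, hypothetical graph IR."""
    name: str       # operator name, e.g. 'conv' or 'relu'
    output: str     # identifier of the produced tensor
    inputs: list    # identifiers of consumed tensors
    attrs: dict = field(default_factory=dict)  # attribute name -> textual value

def emit_nnef_like(graph_name, inputs, outputs, ops):
    """Render a list of ops as NNEF-flavoured text (illustrative only)."""
    lines = [f"graph {graph_name}( {', '.join(inputs)} ) -> ( {', '.join(outputs)} )", "{"]
    for op in ops:
        args = list(op.inputs) + [f"{k} = {v}" for k, v in op.attrs.items()]
        lines.append(f"    {op.output} = {op.name}({', '.join(args)});")
    lines.append("}")
    return "\n".join(lines)

ops = [
    Op("conv", "conv1", ["input", "filter1"], {"padding": "[(1,1), (1,1)]"}),
    Op("relu", "output", ["conv1"]),
]
print(emit_nnef_like("net", ["input"], ["output"], ops))
```

Real converters, such as those in the Khronos NNEF-Tools repository, additionally handle shape inference, layout conversion, and exporting the weight tensors into NNEF's binary data files.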
Adoption of NNEF has been strongest among hardware vendors and middleware providers seeking a compact exchange format for inference deployment. Companies including Intel, Arm, NVIDIA, Samsung, Qualcomm, Google, and Apple have engaged with Khronos working groups or published tooling that references the format. Academic groups at Stanford University, MIT, UC Berkeley, and Carnegie Mellon University have used NNEF in research projects comparing its portability with ONNX and custom serialization strategies. While NNEF did not displace formats such as ONNX across the broader ecosystem, it influenced best practices for operator semantics and binary packaging adopted by various SDKs and vendor libraries, including runtime deployments on Android, iOS, Windows, and Linux distributions maintained by Canonical and Red Hat.
NNEF is often compared to ONNX, TensorFlow's SavedModel, Apple's Core ML model format, FlatBuffers-based formats such as TensorFlow Lite's, and protobuf-based exports. Compared with ONNX, NNEF emphasizes a compact textual graph plus binary blobs rather than a single protobuf container, gaining human readability at the cost of ONNX's larger ecosystem momentum. Compared with TensorFlow SavedModel and Core ML, NNEF is vendor-neutral, whereas those formats are tightly integrated with the Google and Apple toolchains respectively. Compared with generic serialized containers such as HDF5 and Protocol Buffers, NNEF provides explicit operator semantics and extension hooks aimed at easing correct implementation across diverse accelerators, such as those from NVIDIA, Intel, Arm, and emerging vendors like Graphcore and Cerebras Systems.
Category:File formats