Mesh Intermediary Representation (MIR)

The Mesh Intermediary Representation (MIR) specification is an open standard describing a platform-neutral data format for long-term storage and interchange of mesh-like data.

It has also been designed to serve as an intermediary step when converting the mesh-like data between different computational mechanics application formats.

You can read the MIR specification online, or download a printable PDF.

This research is based upon the previous work by Peter Iványi [Ivanyi2012] [Ivanyi2014].

MIR concepts

Warning

A valid MIR data file must contain all of the layers, datasets and attributes defined by this specification, even if some of them are empty.

Identifiers

Identifiers (also referred to as names) are described by the following lexical definitions:

identifier ::=  first_word ("_" word)*
first_word ::=  letter [word]
word       ::=  (letter | digit)+
letter     ::=  "a"..."z"
digit      ::=  "0"..."9"

where letter has to be a lowercase ASCII character.

Non-standard identifiers

Non-standard or vendor specific identifiers must have their names prefixed with _x_:

non_standard_identifier ::=  "_x_" identifier

Data types

MIR specification defines the following atomic data types:

  • integer numbers (signed)
  • floating point numbers
  • Unicode strings

Data elements using non-standard or vendor specific data types must use a non-standard identifier for their name and are beyond the scope of MIR specification.

Attributes

An attribute is a small metadata object attached directly to a dataset or a layer, used to describe their nature and/or intended usage.

Non-standard or vendor specific attributes must use a non-standard identifier for their name and are beyond the scope of MIR specification.

Datasets

A dataset is a rectangular 2D array (generally known as a matrix) of data elements, arranged in rows and columns.

A dataset may use different data types for each column, but all data elements within a single column must be of the same data type.

Non-standard or vendor specific datasets must use a non-standard identifier for their name and are beyond the scope of MIR specification.

Layers

Layers are used to organize individual features and properties of a MIR data file into separate logical groups.

Non-standard or vendor specific layers must use a non-standard identifier for their name and are beyond the scope of MIR specification.

A MIR data file is composed of the following layers:

geometry layer

The geometry layer has the following geometry primitives, each stored in a separate dataset:

Dataset name Vertices Edges Faces Dimensions
lines 2 1 0 1
triangles 3 3 1 2
quads 4 4 1 2
hexagons 6 6 1 2
tetrahedrons 4 6 4 3
cuboids 8 12 6 3

A vertex is a point in 3D space whose coordinates are defined as a (x, y, z) triplet, while a single coordinate is represented by a floating point number.

Each dataset stores a single geometry primitive per row, and has 3 times as many columns as its corresponding geometry primitive has vertices. This is because only the coordinates are stored, and each vertex is defined by 3 coordinates.

For example, the quads dataset requires 12 columns to store its coordinates, as it has 4 vertices. A single row of this dataset would be layed out like so:

# Vertex "A"    Vertex "B"    Vertex "C"    Vertex "D"
# x   y   z     x   y   z     x   y   z     x   y   z

 1.1 1.2 1.3   2.1 2.2 2.3   3.1 3.2 3.3   4.1 4.2 4.3

Each dataset also has the following attributes attached to it:

num_vertices
Number of vertices, stored as an integer number.
num_coordinates

Number of coordinates, stored as an integer number.

Equal to: num_vertices * 3

metadata layer

The metadata layer has no datasets, and instead uses the following attributes to further describe the MIR data file:

generator_name

Name of the program that created the file, stored as a Unicode string.

For example: foobar

generator_version

Version of the program that created the file, stored as a Unicode string.

For example: 1.0.3

mir_version

Version of MIR specification used when creating the file, stored as a Unicode string.

For example: 1.12.3

created_at

Date and time at which the file was created, stored as a Unicode string in ISO 8601 format.

For example: 2017-08-15T13:51:02+02:00

MIR file format

MIR is backed up by the HDF5 data file format.

HDF5 is an extensible and open standard, comprised of platform independent technologies which are available under Open Source licenses. HDF5 format has been designed with the objective of creating data storages that are self-descriptive, flexible, and have extremely fast and efficient access patterns to the stored data.

Note

Although not mandatory, MIR data files should have the .mir extension (lowercase), for easier identification on the file system.

HDF5 key concepts

Although HDF5 key concepts are best explained by the HDF5 User’s Guide, a quick overview is given bellow for readers convenience:

Group

A collection of HDF5 objects (including other groups).

As suggested by its name “Hierarchical Data Format”, an HDF5 file is hierarchically structured, like a tree. This tree structure is very similar to the file system structures employed on UNIX systems, with directories and files. HDF5 groups are analogous to the directories, HDF5 datasets are analogous to the files.

Every HDF5 data file has at least one object, the root group. All other objects (including groups) are either members of the root group or its descendants.

Dataset

A multidimensional array of data elements, together with supporting metadata.

An HDF5 dataset is an object composed of a collection of data elements and metadata that stores a description of the data elements, data layout, and all other information necessary to write, read, and interpret the stored data.

Data type

A description of a specific class of data element including its storage layout as a pattern of bits.

HDF5 data types implement a flexible, extensible, and portable mechanism for specifying and discovering the storage layout of the data elements, determining how to interpret the elements (for example, as float­ing point numbers), and for transferring data from different compatible layouts.

Atomic data types are indivisible. Composite data types are composed of multiple elements of atomic data types.

In addition to the standard types, users can define additional custom data types.

Attribute

A small metadata object attached to a group or a dataset.

Attributes are a critical part of what makes HDF5 a “self-describing” format. They are small named pieces of data attached directly to group or dataset objects. This is the official way to store metadata in HDF5.

Mapping MIR concepts onto HDF5

In general, MIR concepts map cleanly onto their HDF5 counterparts:

MIR HDF5
Data type Data type
Attribute Attribute
Dataset Dataset
Layer Group

Detailed MIR/HDF5 data type map:

  • integer numbers are stored as 32-bit signed integers, using the H5T_STD_I32LE HDF5 data type
  • floating point numbers are stored as IEEE 754 binary64, using the H5T_IEEE_F64LE HDF5 data type
  • all Unicode strings are UTF-8 encoded

Additionally, each MIR layer (HDF5 group) is a child of the root HDF5 group.

Example programs and data files

An example MIR data file, filled with random data, is available on our GitHub repository, under examples/random-data.mir.

Creating a MIR data file

The following Python program creates a MIR data file, filled with random data:

#!/usr/bin/env python
from datetime import datetime
import os

import numpy as np # $ pip install numpy
import tables as tb # $ pip install tables
from tzlocal import get_localzone # $ pip install tzlocal


filename = 'random-data.mir'

sample_num_rows = 100
geometry_primitives = [
    # dataset_name , num_vertices
    ('lines'       , 2),
    ('triangles'   , 3),
    ('quads'       , 4),
    ('hexagons'    , 6),
    ('tetrahedrons', 4),
    ('cuboids'     , 8),
]


filters = tb.Filters(complib='zlib', complevel=1, shuffle=True, fletcher32=True)
with tb.open_file(filename, 'w', filters=filters) as fp:
    geometry_layer = fp.create_group(where='/', name='geometry')
    for dataset_name, num_vertices in geometry_primitives:
        num_coordinates = num_vertices * 3 # num_coordinates_per_vertex

        dataset = fp.create_earray(
            where=geometry_layer,
            name=dataset_name,
            atom=tb.Float64Atom(),
            shape=(0, num_coordinates),
        )
        dataset._v_attrs.num_vertices = num_vertices
        dataset._v_attrs.num_coordinates = num_coordinates

        dataset.append(np.random.rand(sample_num_rows, num_coordinates))

    metadata_layer = fp.create_group(where='/', name='metadata')
    metadata_layer._v_attrs.generator_name = __file__
    metadata_layer._v_attrs.generator_version = '1.0'
    metadata_layer._v_attrs.mir_version = '0.1'
    metadata_layer._v_attrs.created_at = datetime.now(get_localzone()).replace(
        microsecond=0
    ).isoformat()

Note

The geometry layer, while structurally valid, may not contain valid geometry primitives due to randomly generated coordinates.

Reading from a MIR data file

The following Python program reads some data from a MIR data file:

#!/usr/bin/env python
import tables as tb # $ pip install tables


filename = 'random-data.mir'


with tb.open_file(filename) as fp:
    print 'Loading the `/geometry/quads` dataset...'
    dataset = fp.root.geometry.quads

    print '\nFirst 5 geometry primitives in the dataset:'
    print dataset[:5]

    print '\nLast vertex of the first 5 geometry primitives in the dataset:'
    print dataset[:5, -3:]

    print '\n`num_vertices` dataset attribute:',
    print dataset._v_attrs.num_vertices
[Ivanyi2012]Iványi P. Finite element mesh conversion based on regular expressions. Advances in Engineering Software. 2012;51:20–39.
[Ivanyi2014]Iványi P. A Technique for the Long Term Preservation of Finite Element Meshes. In: 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC, CSS, ICESS). IEEE; 2014. p. 116–120.