Introduction to SXP Reflection System

    Class reflection provides run-time introspection of classes, which enables, for example, versioned object serialization, exposing class variables in editors, and data replication for networking. Instead of writing code for each class in each of these systems, the systems can rely on generic class reflection, which relieves programmers from writing and maintaining large amounts of system-specific code. However, because the C++ language has no intrinsic support for class reflection, Spin-X Platform (SXP) provides a reflection infrastructure as a collection of macros, which can be used to easily add reflection to existing C++ classes.

    In this article I'll cover how reflection is defined for types using the SXP reflection system. To keep the article from stretching too long, I won't delve much into how classes with reflection can be used, but will keep strictly to the reflection definition side of the documentation.

    The SXP reflection system is designed to provide a pragmatic solution to a wide range of real-life cases. The system is used extensively within SXP and has evolved over the years to address situations that are common in performance- and memory-intensive game engine code. The design of the reflection system has the following traits to maximize its usability:
    • No meta language: No need to learn and be limited by a custom meta language. Simpler build process. Can easily add reflection to classes as an afterthought.
    • Portable: Uses standard C++ features to support a wide range of platforms (e.g. no pdb parsing).
    • Support for data encapsulation: No need to compromise data encapsulation by making class members public for reflection.
    • No enforced inheritance: No need to derive classes from a "reflection" class, as it induces multiple inheritance and potentially adds memory/performance overhead.
    • Natural support for lightweight classes: No memory/performance overhead for monomorphic classes (e.g. vec3f). No custom class reflection code in the reflection system.
    • Support for custom reflection code: Ability to write C++ reflection code for special cases (e.g. write data directly to GPU texture memory upon deserialization).
    • Local reflection definitions: Complete reflection definition for a class exists in the same scope as the reflected class (e.g. no global type lists or type-specific code in the reflection system).
    • High performance: Usable in very performance intensive code (e.g. able to pass a lot of object state data within a frame using reflection).
    • No need for RTTI: Support for minimalistic builds to save resources on constrained systems.
    • Support for nested types: Support for nested (public/private) types defined within classes.
    • Support for template classes: Not limited only to non-template classes.
    • No C++ code obfuscation: Reflection definition doesn't obfuscate class definitions (for the sake of code readability and to enable tools to cope with the code).
    • Custom properties: In addition to name and type properties, custom properties such as description and flags can be associated with class members.


    Monomorphic Classes

    Monomorphic classes (as opposed to polymorphic classes) are classes without virtual functions. A reflection definition can be added to such classes with the PFC_MONO() macro as follows:
    Code cpp:
    struct foo
    { PFC_MONO(foo) {PFC_VAR3(x, y, z);}
      int x, y, z;
    };
    For this class sizeof(foo)==3*sizeof(int), because the macro doesn't add member variables or virtual functions to the class, which keeps it lightweight. PFC_VAR3() is a convenience macro for reflecting three variables at once; the same definition could also be written as PFC_VAR(x); PFC_VAR(y); PFC_VAR(z);

    Now that the class has a reflection definition, we can for example serialize the content of foo objects to a memory buffer like this:
    Code cpp:
    foo f={1, 2, 3};
    char buf[32];
    mem_output_stream ms(buf, sizeof(buf));
    ms<<f;
    mem_output_stream uses the reflection system to serialize each reflected member variable to the memory stream. As a result, the first 12 bytes of buf will be [0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00] (assuming a little-endian system and a 32-bit int type). This simple case could of course be achieved with a plain memcpy(buf, &f, sizeof(f)); which is a common method in C, but that assumes a specific layout for the foo class. For example, the class could have transient member variables that we do not wish to serialize. It's perfectly valid to omit member variables from a reflection definition, for example to reduce the amount of serialized data.
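    To make the idea concrete, here is a minimal standalone sketch of member-by-member serialization that adds nothing to the object itself. It does not use SXP: byte_writer, foo_sketch, enum_props() and serialize() are hypothetical names, and the static member template only stands in for whatever the PFC_MONO() macro really expands to.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical output enumerator: appends the raw bytes of each visited member.
struct byte_writer
{
  std::vector<unsigned char> bytes;
  template<typename T> void var(const T &v_)
  {
    const unsigned char *p=reinterpret_cast<const unsigned char*>(&v_);
    bytes.insert(bytes.end(), p, p+sizeof(T));
  }
};

struct foo_sketch
{
  // Stand-in for a reflection definition (assumption, not the SXP expansion).
  // A static member template adds no data members and no virtual functions.
  template<class PE> static void enum_props(PE &pe_, const foo_sketch &v_)
  {
    pe_.var(v_.x);
    pe_.var(v_.y);
    pe_.var(v_.z);
    // v_.cache is deliberately not listed, so it is never serialized
  }
  int x, y, z;
  int cache; // transient value we do not want in the stream
};

// Serializes only the reflected members of v_.
std::vector<unsigned char> serialize(const foo_sketch &v_)
{
  byte_writer w;
  foo_sketch::enum_props(w, v_);
  return w.bytes;
}
```

    Calling serialize() on a foo_sketch yields 3*sizeof(int) bytes even though the object holds four ints, which is the difference to a plain memcpy() of the whole object.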

    The following example saves the object content to the buffer but swaps the endianness of all the member variables, thus writing the following data to the buffer instead: [0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x03]
    Code cpp:
    foo f={1, 2, 3};
    char buf[32];
    mem_output_stream ms(buf, sizeof(buf));
    endian_output_stream es(ms);
    es<<f;
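    The byte order reversal in the example above can be sketched in standalone C++ as follows. This is only an illustration of the technique, not the implementation of endian_output_stream; swap_bytes() and write_swapped() are hypothetical helpers and the sketch handles 32-bit values only.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reverses the byte order of one 32-bit value.
std::uint32_t swap_bytes(std::uint32_t v_)
{
  return (v_>>24)|((v_>>8)&0x0000ff00u)|((v_<<8)&0x00ff0000u)|(v_<<24);
}

// Copies count_ values to dst_, byte-swapping each value on the way out.
// dst_ must have room for count_*sizeof(std::uint32_t) bytes.
void write_swapped(unsigned char *dst_, const std::uint32_t *src_, std::size_t count_)
{
  for(std::size_t i=0; i<count_; ++i)
  {
    std::uint32_t s=swap_bytes(src_[i]);
    std::memcpy(dst_+i*sizeof(s), &s, sizeof(s));
  }
}
```

    A reflection-driven stream can apply the same swap per member because it knows the size of each reflected variable, which a raw memcpy() of the whole object does not.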
    The reflection definition also enables using the class as a member of another class, which can be reflected in the reflection definition of the enclosing class:
    Code cpp:
    struct bar
    { PFC_MONO(bar) {PFC_VAR4(f, a, b, c);}
      foo f;
      float a, b, c;
    };

    Polymorphic Classes

    Reflection can be added to polymorphic base classes with the PFC_BASE_CLASS() macro and to derived classes with the PFC_CLASS() macro as shown below:
    Code cpp:
    class foo
    { PFC_BASE_CLASS(foo) {PFC_VAR3(x, y, z);}
    private:
      int8 x, y, z;
    };
     
    class bar: public foo
    { PFC_CLASS(bar, foo) {PFC_VAR(w);}
    private:
      int8 w;
    };
    When the class is introspected, the base class members are reflected first, followed by the derived classes in the order of inheritance. For example, if a bar object is assigned x=1, y=2, z=3 and w=4, the following code serializes it to a memory buffer and generates the data [0x01, 0x02, 0x03, 0x04] in buf:
    Code cpp:
    bar a;
    char buf[32];
    mem_output_stream ms(buf, sizeof(buf));
    foo *base=&a;
    ms<<*base; // use base class pointer just to demonstrate serialization via base class pointer. could be also: ms<<a;
    Note that a reflected base class defined with PFC_BASE_CLASS() doesn't have to be the actual base class of the class hierarchy. For example, the foo class could itself be derived from another class without changing the above reflection definitions, in which case that base class simply wouldn't be reflected. This can be important particularly in situations where the base class is defined in an external library. Also, reflection doesn't have to be defined for each of the derived classes either, though it's good practice to define the reflection anyway, even if there are no members in the class to be reflected.

    The reflection system doesn't support multiple inheritance reflection definitions. It's possible to have classes with multiple inheritance, but it's not possible to add reflection which defines two or more parent classes. In the case of multiple inheritance only one of the parent classes can be defined for reflection.
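    The base-first ordering described in this section can be illustrated with a small standalone sketch. The article does not say how SXP dispatches to the most derived class, so the virtual function used here is an assumption, and byte_sink, base_sketch and derived_sketch are hypothetical names.

```cpp
#include <vector>

// Hypothetical sink that records every enumerated byte-sized member.
struct byte_sink
{
  std::vector<signed char> bytes;
  void var(signed char v_) {bytes.push_back(v_);}
};

class base_sketch
{
public:
  base_sketch(): x(1), y(2), z(3) {}
  virtual ~base_sketch() {}
  virtual void enum_props(byte_sink &s_) const {s_.var(x); s_.var(y); s_.var(z);}
private:
  signed char x, y, z;
};

class derived_sketch: public base_sketch
{
public:
  derived_sketch(): w(4) {}
  virtual void enum_props(byte_sink &s_) const
  {
    base_sketch::enum_props(s_); // base class members are enumerated first
    s_.var(w);                   // then the members added by this class
  }
private:
  signed char w;
};
```

    Enumerating a derived_sketch through a base_sketch reference records 1, 2, 3, 4 in that order, and the members stay private throughout.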


    Enumerated Types

    While enum types can be serialized to the streams used in the previous examples without enum reflection definitions, some systems, such as versioned object serialization and editor property sheets, may require that reflection is defined for enum types as well. Missing an enum type reflection definition when it's required results in a compile-time error. Enum type reflection can be added by using the PFC_ENUM() macro alongside the enum declaration, and defining the reflection for the enum (usually in .cpp scope) as follows:
    Code cpp:
    // in file.h:
    enum e_fruit
    {
      fruit_orange,
      fruit_apple,
      fruit_peach
    };
    PFC_ENUM(e_fruit);
     
    // in file.cpp:
    #define PFC_ENUM_TYPE e_fruit
    #define PFC_ENUM_PREFIX fruit_
    #define PFC_ENUM_VALS PFC_ENUM_VAL(orange)\
                          PFC_ENUM_VAL(apple)\
                          PFC_ENUM_VAL(peach)
    #include "core/enum.inc"
    This adds functions for mapping enum values to enum strings and vice versa for the given list of enum values. Note that you don't have to list all the enum values for reflection. For example, it's common to have fruit_enum_end to define the number of values in an enum type, but it doesn't have to be declared in the list, as it's never supposed to be assigned to an enum variable. Explicitly defined enum values are also supported (e.g. fruit_orange=0x12345678), with the limitation that the same value can't be used for multiple enum values (this results in a compile-time error).

    Once the enum reflection definition has been added, functions enum_string() and enum_value() can be used for the mapping as follows:
    Code cpp:
    const char *str=enum_string(fruit_orange); // str="orange"
    e_fruit f;
    enum_value(f, "peach"); // f=fruit_peach
    If the mapping fails in either direction, enum_string() returns a null pointer and enum_value() returns false.

    Enum values are usually exposed in editors by displaying the string for each enum value without the prefix. Mutable enum variables display a drop-down list showing all the reflected enum strings for users to pick from. Enum strings are also used instead of enum values in versioned object serialization, so that deserialization tolerates changes in enum values (e.g. when rearranging enum values without explicitly defined values).
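    The value/string mapping described in this section can be generated from a single list of names with the common X-macro technique. The sketch below is standalone and only assumes that technique; it is not the content of core/enum.inc, and FRUIT_VALS, fruit_string() and fruit_value() are hypothetical names.

```cpp
#include <cstring>

enum e_fruit {fruit_orange, fruit_apple, fruit_peach, fruit_enum_end};

// Single list of reflected values; fruit_enum_end is intentionally left out.
#define FRUIT_VALS(X) X(orange) X(apple) X(peach)

// Maps an enum value to its string without the prefix, or returns 0 if unlisted.
const char *fruit_string(e_fruit v_)
{
  switch(v_)
  {
    #define FRUIT_CASE(name) case fruit_##name: return #name;
    FRUIT_VALS(FRUIT_CASE)
    #undef FRUIT_CASE
    default: return 0;
  }
}

// Maps a string back to the enum value; returns false if the string is unknown.
bool fruit_value(e_fruit &v_, const char *str_)
{
  #define FRUIT_CMP(name) if(std::strcmp(str_, #name)==0) {v_=fruit_##name; return true;}
  FRUIT_VALS(FRUIT_CMP)
  #undef FRUIT_CMP
  return false;
}
```

    With a switch-based mapping like this one, two listed enumerators sharing the same value produce duplicate case labels and therefore a compile-time error. Whether SXP reports duplicates for the same reason is not stated in the article.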


    Member Variable Definitions

    Mutability Properties
    In the above examples we used only the PFC_VAR() macro for defining member variables for reflection. While this is enough for data serialization, other macros are useful for other purposes. The following family of macros defines how variables can be modified in editors:
    • PFC_VAR() immutable variable, immutable pointer data hierarchies
    • PFC_MVAR() mutable variable, immutable pointer data hierarchies
    • PFC_VARMP() immutable variable, mutable pointer data hierarchies
    • PFC_MVARMP() mutable variable, mutable pointer data hierarchies
    • PFC_HVAR() hidden variable (not visible in an editor but is serialized)
    If a variable is defined as immutable, it can't be changed in an editor. If a variable is defined with "immutable pointer data hierarchies", immutability propagates to the entire hierarchy referred to by a pointer when exposed in an editor. This is similar to const pointers in C++, except that the constness of a C++ pointer doesn't propagate beyond the first level: in C++ you can't modify the variables of an object referred to by const foo*, but you can modify the variables of objects pointed to by a non-const pointer inside foo. To demonstrate how immutability of variables works, below is a simple example of defining mutable and immutable variable reflection in a class:
    Code cpp:
    struct foo
    { PFC_MONO(foo)
      {
        PFC_MVAR(x); // can modify x in an editor
        PFC_VAR(y);  // can't modify y in an editor
      }
      int x, y;
    };
    When the foo class is referred by pointers in a data hierarchy, also the other member reflection definition macros become relevant:
    Code cpp:
    struct bar
    { PFC_MONO(bar)
      {
        PFC_VAR(f1);    // can't change f1 pointer, can't change foo variables
        PFC_MVAR(f2);   // can change f2 pointer, can't change foo variables
        PFC_VARMP(f3);  // can't change f3 pointer, can change f3->x
        PFC_MVARMP(f4); // can change f4 pointer, can change f4->x
        PFC_VAR(v1);    // can't change v1 variable
        PFC_MVAR(v2);   // can change v2 variable
      }
      foo *f1, *f2, *f3, *f4;
      int v1, v2;
    };
    To further demonstrate the propagation of immutability in a data hierarchy, below is an example of a class using bar class as a member variable:
    Code cpp:
    struct pla
    { PFC_MONO(pla)
      {
        PFC_VAR(b1);    // can't change bar variables, can't change foo variables
        PFC_MVAR(b2);   // can change b2.f2, b2.f4 and b2.v2, can't change foo variables
        PFC_VARMP(b3);  // can't change bar variables, can change b3.f3->x and b3.f4->x
        PFC_MVARMP(b4); // can change b4.f2, b4.f4 and b4.v2, can change b4.f3->x and b4.f4->x
      }
      bar b1, b2, b3, b4;
    };
    The immutability propagation ensures that an entire data structure can be guaranteed to be immutable. A practical example of this is a 3D mesh class which refers to materials, where the material class in turn refers to textures. We may not want to allow mutation of material or texture variables when they are exposed via the 3D mesh class in the 3D mesh editor, or mutation of texture variables when they are exposed in the material editor. Thus we can define the material pointers in the 3D mesh class and the texture pointers in the material class using the PFC_VAR() or PFC_MVAR() macros. However, we still need to be able to modify material variables in the material editor and texture variables in the texture editor. Thus the material and texture editors can expose a pointer to the material and texture classes with PFC_VARMP(), which enables the variables to be modified in their respective editors.
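    The comparison with C++ const made earlier in this section can be seen in a few lines of standalone C++. The first half shows that const stops at the first pointer level. The second half shows a small wrapper that propagates constness to the pointee, which is loosely analogous to "immutable pointer data hierarchies". All names here are hypothetical and none of this is SXP code.

```cpp
struct texture_sketch {int width;};
struct material_sketch {texture_sketch *tex;};

// const stops at the first level: *m_ is read-only, but the texture it points to is not.
void widen_through_const(const material_sketch *m_)
{
  // m_->tex=0;        // would not compile: the pointer member is const here
  m_->tex->width=256;  // compiles: the pointee is still mutable
}

// Pointer wrapper that hands out a const pointee when accessed through a const path.
template<typename T> class deep_ptr
{
public:
  explicit deep_ptr(T *p_): m_p(p_) {}
  T *operator->() {return m_p;}
  const T *operator->() const {return m_p;}
private:
  T *m_p;
};

struct guarded_material {deep_ptr<texture_sketch> tex;};

int read_width(const guarded_material &g_)
{
  // g_.tex->width=1;  // would not compile: constness now reaches the texture
  return g_.tex->width;
}
```

    The editor-side rules in the article go further than this wrapper, since the choice between PFC_VAR() and PFC_VARMP() is made per exposed member rather than per pointer type.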


    Array Variables
    Array member variables can be reflected by using the PFC_AVAR() family of macros as follows:
    Code cpp:
    struct foo
    { PFC_MONO(foo) {PFC_AVAR(arr, 4);}
      float arr[4];
    };
    The size of the array must be either a constant, as in the example above, or a member value reflected before the array. For dynamically sized arrays, the array or deque template classes are usually used instead, because they take care of the memory management as well. Like the PFC_VAR() family of macros, PFC_AVAR() also has variants for mutability, such as PFC_MAVAR().


    Post-Mutate Expressions
    Another useful family of macros is PFC_MVAR_MCALL(). This family can be used to define mutable variables which execute a C++ expression after the variable has been mutated. Note that the macro takes an expression, not a function, as an argument, so you can define the arguments to be passed in a function call or write a small piece of code to be executed after the mutation. In the example below, the update() function, which updates the vlen variable, is called after the vector variable is mutated:
    Code cpp:
    struct foo
    { PFC_MONO(foo) {PFC_MVAR_MCALL(vector, update());}
      vec3f vector;
      float vlen;
      void update() {vlen=norm(vector);}
    };
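    The point that the macro receives an expression rather than a function can be illustrated with a tiny standalone macro. SET_THEN() and vec_sketch are hypothetical and unrelated to how PFC_MVAR_MCALL() is implemented; the sketch only shows that any expression, including a call with arguments, can be pasted in after the assignment.

```cpp
#include <cmath>

// Hypothetical helper: assign a value, then evaluate an arbitrary expression.
#define SET_THEN(var, value, expr) do {(var)=(value); expr;} while(0)

struct vec_sketch
{
  float x, y, z, vlen;
  // Recomputes the cached length from the current components.
  void update() {vlen=std::sqrt(x*x+y*y+z*z);}
};
```

    For a vec_sketch v, writing SET_THEN(v.x, 3.0f, v.update()); assigns v.x and then refreshes v.vlen, and the third argument could equally be a statement such as v.vlen=0.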


    Mutable Virtual Variables
    Instead of using post-mutate expressions to hook functionality to variable mutations, another option is to define a variable setter function with the PFC_MVVAR() macro family. The mutation of the variable is then entirely the responsibility of the function implementation. The function signature must match exactly void func(const T&, unsigned var_index_), where T is the type of the variable and var_index_ is a custom index given in the variable definition. The example below demonstrates how mutation of variable x is clamped to the range [0, 100] with a custom setter function:
    Code cpp:
    struct foo
    { PFC_MONO(foo) {PFC_MVVAR(x, set_x, 0);}
      void set_x(const int &x_, unsigned) {x=clamp(x_, 0, 100);}
      int x;
    };
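    As a standalone illustration of routing a mutation through a setter with the signature given above, the sketch below calls the setter via a member function pointer. mutate() and clamped_sketch are hypothetical, and how an SXP editor actually invokes the setter is not covered by the article.

```cpp
#include <algorithm>

struct clamped_sketch
{
  // Signature follows the article: const T& plus an unsigned index (unused here).
  void set_x(const int &x_, unsigned) {x=std::min(std::max(x_, 0), 100);}
  int x;
};

// Hypothetical editor-side call: mutate through the setter instead of writing the field.
template<class C, typename T>
void mutate(C &obj_, void (C::*setter_)(const T&, unsigned), const T &value_, unsigned index_)
{
  (obj_.*setter_)(value_, index_);
}
```

    Because the field is never written directly, the class keeps its invariant regardless of what value the caller passes in.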


    Variable Deprecation
    When member variables are renamed, versioned object deserialization treats the rename as a remove-add operation, which causes the data for the member variable to be lost. The object deserialization system can't know whether a variable was truly removed and a new one added, or whether the variable was merely renamed. This issue can be addressed by declaring a deprecated variable using the PFC_DEP_VAR() macro, which causes data stored under the old variable name to be loaded into the renamed variable.

    If upon serialization a class has the following signature:
    Code cpp:
    struct foo
    { PFC_MONO(foo) {PFC_VAR(x);}
      int x;
    };
    And the member variable x is renamed to y, we can define the following reflection to load the data that was originally stored as variable x to the renamed variable y:
    Code cpp:
    struct foo
    { PFC_MONO(foo) {PFC_VAR(y); PFC_DEP_VAR(y, x);}
      int y;
    };
    Once the class using this new signature is serialized again, the new name y is used for the variable, effectively converting the data to the new format. Finally, once all the data has been converted (for example after a nightly/weekly data build), the PFC_DEP_VAR() definition can be removed from the reflection definition.
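    The fallback from a new variable name to a deprecated one can be sketched with name-keyed data. The sketch is standalone and assumes that versioned data stores values by variable name; named_values and load_var() are hypothetical and say nothing about the actual SXP storage format.

```cpp
#include <map>
#include <string>

// Stand-in for versioned, name-keyed object data (assumption).
typedef std::map<std::string, int> named_values;

// Loads into dst_ using the current name, falling back to a deprecated name.
// Pass 0 as old_name_ when the variable has no deprecated name.
bool load_var(const named_values &data_, const char *name_, const char *old_name_, int &dst_)
{
  named_values::const_iterator it=data_.find(name_);
  if(it==data_.end() && old_name_)
    it=data_.find(old_name_);
  if(it==data_.end())
    return false; // neither name is present: leave dst_ untouched
  dst_=it->second;
  return true;
}
```

    Data written before the rename only contains "x", so loading variable "y" with the deprecated name "x" recovers the value, while data written after the rename is found under "y" directly.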


    Variable Decorations
    Class variables exposed in an editor can be decorated with PFC_VEXP_*() family of macros. These macros enable adding decorations such as description, alternative variable name and color, to make the variables more accessible to users. Decorations are added for a variable by listing decoration macros right after the variable reflection definition:
    Code cpp:
    struct sphere
    { PFC_MONO(sphere)
      {
        PFC_VAR(rad);
        PFC_VEXP_N("Radius");
        PFC_VEXP_C(0xff0000);
        PFC_VEXP_D("Radius of the sphere");
      }
      float rad;
    };
    In the above example, the sphere class variable rad is exposed as red "Radius" (instead of "rad") in the editor with "Radius of the sphere" description for the variable.

    For convenience, several shorthand versions of the decoration macros exist, such as PFC_VEXP_NDC() for defining an alternative variable name, description and color with a single macro, or PFC_MVAR_D() to combine a mutable variable reflection definition with a description decoration.


    Customized Reflection

    While the common reflection definitions can be used in most cases, sometimes there's a need to provide custom reflection functionality instead. This can be done by using the PFC_CUSTOM_STREAMING() macro at the beginning of the reflection definition and writing C++ code in the definition instead of using only the variable reflection macros. This enables custom logic to be written for data streaming and exposure.

    Writing custom reflection code requires some understanding of the details of the reflection system implementation: the reflection definition is in fact a friend template function with two parameters, a property enumerator PE &pe_ and the introspected object T &v_. The property enumerator interface matches the interface of class prop_enum_interface, which can be used in the custom reflection code for custom behavior.
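    The shape of such a function can be sketched without SXP as a template that receives an enumerator and the object, and branches on a compile-time constant of the enumerator type. Everything below is hypothetical (reader_sketch, writer_sketch, counter_sketch and the *_sketch enum values), and a static member template is used in place of the friend function for brevity.

```cpp
enum e_penum_sketch {penum_input_sketch, penum_output_sketch};

// Hypothetical enumerators; each exposes its kind as a compile-time constant.
struct reader_sketch
{
  enum {pe_type=penum_input_sketch};
  int next;
  void var(int &v_) {v_=next;} // "deserializes" by overwriting the member
};

struct writer_sketch
{
  enum {pe_type=penum_output_sketch};
  int last;
  void var(int &v_) {last=v_;} // "serializes" by recording the member
};

struct counter_sketch
{
  template<class PE> static void enum_props(PE &pe_, counter_sketch &v_)
  {
    // The switch value is a constant for each instantiation, so a compiler
    // is free to drop the branch that does not apply.
    switch(unsigned(PE::pe_type))
    {
      case penum_input_sketch: pe_.var(v_.value); v_.loads++; break;
      case penum_output_sketch: pe_.var(v_.value); break;
    }
  }
  int value, loads;
};
```

    The same definition serves both directions: instantiated with reader_sketch it overwrites value and counts the load, and instantiated with writer_sketch it only reads the member.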

    The following example demonstrates a simple dynamic data buffer which requires a custom reflection implementation. When the data is deserialized, the custom reflection code below allocates the required buffer and reads the raw data into it. Upon serialization and exposure, the size and the raw data are simply passed to the enumerator:
    Code cpp:
    struct foo
    { PFC_MONO(foo)
      {
        PFC_CUSTOM_STREAMING(0);
        switch(unsigned(PE::pe_type))
        {
          case penum_input:
          {
            // read data
            PFC_MEM_FREE(v_.data);
            PFC_VAR(size);
            v_.data=PFC_MEM_ALLOC(v_.size);
            pe_.data(v_.data, v_.size);
          } break;
     
          case penum_output:
          case penum_display:
          {
            // write/expose data
            PFC_VAR(size);
            pe_.data(v_.data, v_.size);
          } break;
        }
      }
     
      foo() {size=0; data=0;}
      unsigned size;
      void *data;
    };
    It's important to realize that custom reflection code doesn't automatically support versioned object serialization; versioning must be handled manually in the reflection code. Besides the more verbose reflection code, this is a major reason to avoid custom reflection code in general, as manual versioning can impose significant maintenance overhead and hinder class changes. If part of the class data requires custom reflection, it's better to move that data into a separate class with custom reflection so that the rest of the class variables can still use automatic versioning.

    Manual version management of the data is done by using the data version number returned by the PFC_CUSTOM_STREAMING() macro. The macro takes the current data layout version number as an argument, which must be incremented whenever the data layout changes. The returned version number is the version of the deserialized data, and it is used to branch in the code according to changes in the data layout. For example, if we added a new variable data_sum to the serialization of the above class, the reflection code would look something like this:
    Code cpp:
    struct foo
    { PFC_MONO(foo)
      {
        unsigned ver=PFC_CUSTOM_STREAMING(1);
        switch(unsigned(PE::pe_type))
        {
          case penum_input:
          {
            // read data
            PFC_MEM_FREE(v_.data);
            PFC_VAR(size);
            if(ver>0)
              PFC_VAR(data_sum);
            v_.data=PFC_MEM_ALLOC(v_.size);
            pe_.data(v_.data, v_.size);
          } break;
     
          case penum_output:
          case penum_display:
          {
            // write/expose data
            PFC_VAR(size);
            PFC_VAR(data_sum);
            pe_.data(v_.data, v_.size);
          } break;
        }
      }
     
      foo() {size=0; data_sum=0; data=0;}
      unsigned size, data_sum;
      void *data;
    };
    Here the data version was increased from 0 to 1. In the deserialization code, if the version of the data is greater than 0 (i.e. the data has the new member variable), the data_sum variable is deserialized. For serialization there's no need for version checking, since we always want to serialize the latest data (the version is always the latest version).
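    The same branching pattern can be tried out in a standalone sketch that stores the version explicitly in the data. The byte layout here is an assumption made for the example ([version][size][data_sum if version>0][payload], host byte order, no bounds checking) and is not the SXP stream format; blob_sketch, load() and save() are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

enum {current_version=1};

struct blob_sketch
{
  std::uint32_t size, data_sum;
  std::vector<unsigned char> data;
};

// Reads one 32-bit value in host byte order and advances the cursor.
std::uint32_t read_u32(const unsigned char *&p_)
{
  std::uint32_t v;
  std::memcpy(&v, p_, sizeof(v));
  p_+=sizeof(v);
  return v;
}

// Appends one 32-bit value in host byte order.
void append_u32(std::vector<unsigned char> &out_, std::uint32_t v_)
{
  unsigned char tmp[sizeof(v_)];
  std::memcpy(tmp, &v_, sizeof(v_));
  out_.insert(out_.end(), tmp, tmp+sizeof(v_));
}

// Loading branches on the stored version; older data has no data_sum field.
void load(blob_sketch &b_, const unsigned char *p_)
{
  unsigned ver=*p_++;
  b_.size=read_u32(p_);
  b_.data_sum=ver>0?read_u32(p_):0;
  b_.data.assign(p_, p_+b_.size);
}

// Saving never branches: it always writes the latest layout.
std::vector<unsigned char> save(const blob_sketch &b_)
{
  std::vector<unsigned char> out;
  out.push_back(static_cast<unsigned char>(current_version));
  append_u32(out, b_.size);
  append_u32(out, b_.data_sum);
  out.insert(out.end(), b_.data.begin(), b_.data.end());
  return out;
}
```

    Every further layout change adds another branch to load() that has to be kept for as long as old data exists, which is the maintenance cost mentioned above.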

    While the data version management is done manually in custom deserialization, it's still able to tolerate data type changes of individual member variables. For example, changing "unsigned size, data_sum;" to "uint16 size, data_sum;" works without changes to the custom deserialization code, and the conversion of the variables from unsigned to uint16 is done automatically. This is particularly important when deserializing class type variables, since the deserialization of class types may change independently of the custom deserialization.

    Finally, custom reflection adds quite a bit of bloat to the class interface, so the implementation can be decoupled from the interface with the PFC_INTROSPEC_DECL and PFC_INTROSPEC_CPP_DEF / PFC_INTROSPEC_INL_DEF macros (depending on whether the implementation is in .cpp or .inl file scope):
    Code cpp:
    // in file.h
    struct foo
    { PFC_MONO(foo) PFC_INTROSPEC_DECL;
     
      foo() {size=0; data_sum=0; data=0;}
      uint8 size, data_sum;
      void *data;
    };
     
    // in file.cpp
    PFC_INTROSPEC_CPP_DEF(foo)
    {
      PFC_CUSTOM_STREAMING(0);
      switch(unsigned(PE::pe_type))
      {
        case penum_input:
        {
          // read data
          PFC_MEM_FREE(v_.data);
          PFC_VAR(size);
          v_.data=PFC_MEM_ALLOC(v_.size);
          pe_.data(v_.data, v_.size);
        } break;
     
        case penum_output:
        case penum_display:
        {
          // write/expose data
          PFC_VAR(size);
          pe_.data(v_.data, v_.size);
        } break;
      }
    }
    For polymorphic classes, it's possible to mix default and custom reflection in the class hierarchy. This is properly handled by the SXP reflection system, and versioned object deserialization is still able to automatically convert data for default reflection upon changes.


    Conclusion

    This article should have given a fairly good idea of the basics of the SXP reflection system and of how to interpret SXP reflection definitions. Some additional reflection system features were omitted for the sake of brevity, but they are not needed to get started with the reflection system. For more examples of reflection system use cases, you can browse the SXP code base to see how the system is used in different scenarios.



    [Update 01/17/2012: Added "Mutable Virtual Variables" section]