Issue 12508: Computation of KeyHash is unspecified (dds-interop-rtf)
Source: PrismTech (Mr. Niels Kortstee, niels.kortstee@prismtech.com)
Nature: Uncategorized Issue
Severity:
Summary: The specification does not describe how the KeyHash is computed from the Key. This implies that key hashes in messages coming from different RTPS implementations can never be interpreted, because one implementation may use a different key hash algorithm than the other. I would prefer that the hash algorithm become part of the specification. RTPS implementations can choose to implement the prescribed algorithm or simply use a zero-valued key.
Resolution: The UDP PIM should mandate that the KeyHash is computed either as the serialized key or else as an MD5 digest, depending on whether the serialized key for the type can exceed the 128 bits that are used for the KeyHash. The resolution of this issue depends on the presence of the new section 9.6.3.3, titled "Key Hash", added as part of the resolution of issue 12504. If issue 12504 is resolved in a way that does not add this section, then the resolution proposed here would not be valid.
Revised Text: Apply the following changes after the modifications indicated in issue 12504.

Add to the end of Section 9.6.3.3 (KeyHash (PID_KEY_HASH), which was added by the resolution of issue 12504):

The KeyHash_t is computed from the Data using one of two algorithms, depending on whether the Data type is such that the maximum size of the sequential CDR encapsulation of all the key fields is guaranteed to be less than 128 bits (the size of the KeyHash_t):

· If the maximum size of the sequential CDR encapsulation of all the key fields is guaranteed to be less than 128 bits, then the KeyHash_t shall be computed as the CDR Big-Endian encapsulation of all the Key fields in sequence. Any unfilled bits in the KeyHash_t after all the key fields have been encapsulated shall be set to zero.
· Otherwise the KeyHash_t shall be computed as a 128-bit MD5 Digest (IETF RFC 1321) applied to the CDR Big-Endian encapsulation of all the Key fields in sequence.

Note that the choice of the algorithm to use depends on the data-type, not on any particular data value.

Example 1: Assume the following IDL-described type:

    struct TypeWithShortKey {
        long id;          /* assume defined as a key field */
        string name<6>;   /* assume defined as a key field */
        /* other non-key fields */
    };

Then we know that the maximum size for the CDR encapsulation of the key fields is 15 Bytes (4 for the 'id' field, plus 4 for the length of the string 'name', plus at most 7 Bytes for the string, which includes the extra byte for the terminating NUL). In this example the KeyHash_t shall be computed as:

    [CDR(id), CDR(name), <zero fill to 16 bytes> ]

Where CDR(x) represents the big-endian CDR encapsulation of that field.
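As a rough illustration of the two cases described above, the following Python sketch computes a KeyHash_t for a type shaped like the one in Example 1. It is not taken from the specification: the function names (cdr_be_long, cdr_be_string, serialize_key, key_hash) and the max_name_len parameter are hypothetical, the sketch handles only a key made of one long followed by one bounded string (so no CDR alignment padding arises), and it applies the threshold exactly as worded above (strictly less than 128 bits).

```python
import hashlib
import struct

KEY_HASH_SIZE = 16  # KeyHash_t is 128 bits, i.e. 16 bytes


def cdr_be_long(value: int) -> bytes:
    """Big-endian CDR encoding of an IDL long (signed 32-bit)."""
    return struct.pack(">i", value)


def cdr_be_string(text: str) -> bytes:
    """Big-endian CDR encoding of an IDL string.

    A 4-byte length (counting the terminating NUL) is followed by the
    characters and the NUL itself.
    """
    raw = text.encode("ascii") + b"\x00"
    return struct.pack(">I", len(raw)) + raw


def serialize_key(key_id: int, name: str) -> bytes:
    """Key fields in sequence; the string length is already 4-byte aligned."""
    return cdr_be_long(key_id) + cdr_be_string(name)


def key_hash(key_id: int, name: str, max_name_len: int) -> bytes:
    """Assumed helper: pick the algorithm from the type bound, not the value."""
    # 4 (id) + 4 (string length) + bound + 1 (terminating NUL)
    max_key_size = 4 + 4 + (max_name_len + 1)
    serialized = serialize_key(key_id, name)
    if max_key_size < KEY_HASH_SIZE:
        # Short key: serialized key, zero-filled up to 16 bytes.
        return serialized.ljust(KEY_HASH_SIZE, b"\x00")
    # Long key: MD5 digest of the serialized key.
    return hashlib.md5(serialized).digest()
```

Calling key_hash(32, "hello", 6) takes the zero-fill branch, while the same value with a bound of 8 takes the MD5 branch, because only the type-level bound enters the decision.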
A concrete data value of this type such as { 32, "hello", …} would be encapsulated as:

    0......8.....16.....24.....32
    +------+------+------+------+
    | 0x00 | 0x00 | 0x00 | 0x20 |
    +------+------+------+------+
    | 0x00 | 0x00 | 0x00 | 0x06 |
    +------+------+------+------+
    | 'h'  | 'e'  | 'l'  | 'l'  |
    +------+------+------+------+
    | 'o'  | 0x00 | 0x00 | 0x00 |
    +------+------+------+------+

Note that, for clarity, we use a notation in which each byte is represented either as a hexadecimal number (e.g. 0x20) or as a character (e.g. 'h').

Example 2: Assume the following IDL-described type:

    struct TypeWithShortKey {
        long id;          /* assume defined as a key field */
        string name<8>;   /* assume defined as a key field */
        /* other non-key fields */
    };

Then we know that the maximum size for the CDR encapsulation of the key fields is 17 Bytes (4 for the 'id' field, plus 4 for the length of the string 'name', plus at most 9 Bytes for the string, which includes the extra byte for the terminating NUL). In this example the KeyHash_t shall be computed as:

    MD5( [CDR(id), CDR(name)] )

Proposed Disposition: Resolved
Actions taken:
May 20, 2008: received issue
October 27, 2008: closed issue
Discussion:
End of Annotations:=====
MG Issue No: 12508R#11
Title: Computation of KeyHash is unspecified
Source: PrismTech (Niels Kortstee, Niels.Kortstee@prismtech.com)
Summary: The specification does not describe how the KeyHash is computed from the Key. This implies that key hashes in messages coming from different RTPS implementations can never be interpreted, because one implementation may use a different key hash algorithm than the other. I would prefer that the hash algorithm become part of the specification. RTPS implementations can choose to implement the prescribed algorithm or simply use a zero-valued key.
Resolution: The UDP PIM should mandate that the KeyHash is computed either as the serialized key or else as an MD5 digest, depending on whether the serialized key for the type can exceed the 128 bits that are used for the KeyHash.
Revised Text: Apply the following changes after the modifications indicated in R#6.

Add Section 9.6.3.3

9.6.3.3 KeyHash (PID_KEY_HASH)

The key hash inline parameter contains the CDR encoding of the KeyHash_t. The KeyHash_t is defined as a 16-Byte octet array (see Table 9.4); therefore the key hash inline parameter just copies those 16 Bytes.

The KeyHash_t is computed from the Data using one of two algorithms, depending on whether the Data type is such that the maximum size of the sequential CDR encapsulation of all the key fields is guaranteed to be less than 128 bits (the size of the KeyHash_t):

· If the maximum size of the sequential CDR encapsulation of all the key fields is guaranteed to be less than 128 bits, then the KeyHash_t shall be computed as the CDR Big-Endian encapsulation of all the Key fields in sequence. Any unfilled bits in the KeyHash_t after all the key fields have been encapsulated shall be set to zero.
· Otherwise the KeyHash_t shall be computed as a 128-bit MD5 Digest (IETF RFC 1321) applied to the CDR Big-Endian encapsulation of all the Key fields in sequence.

Note that the choice of the algorithm to use depends on the data-type, not on any particular data value.

Example 1: Assume the following IDL-described type:

    struct TypeWithShortKey {
        long id;          /* assume defined as a key field */
        string name<6>;   /* assume defined as a key field */
        /* other non-key fields */
    };

Then we know that the maximum size for the CDR encapsulation of the key fields is 15 Bytes (4 for the 'id' field, plus 4 for the length of the string 'name', plus at most 7 Bytes for the string, which includes the extra byte for the terminating NUL). In this example the KeyHash_t shall be computed as:

    [CDR(id), CDR(name), <zero fill to 16 bytes> ]

Where CDR(x) represents the big-endian CDR encapsulation of that field.
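The size arithmetic used to choose between the two algorithms can be written out as a small Python sketch. The helper names (max_key_size_bytes, uses_md5) are hypothetical and cover only a key made of one long and one bounded string; the comparison follows the wording above, under which MD5 is used unless the maximum size is strictly less than 128 bits.

```python
KEY_HASH_BITS = 128  # size of KeyHash_t in bits


def max_key_size_bytes(string_bound: int) -> int:
    """Maximum CDR size of a key made of one long and one bounded string."""
    long_size = 4                      # the 'id' field
    length_prefix = 4                  # length of the string 'name'
    string_bytes = string_bound + 1    # characters plus the terminating NUL
    return long_size + length_prefix + string_bytes


def uses_md5(string_bound: int) -> bool:
    """True when the type-level bound is not strictly below 128 bits."""
    return max_key_size_bytes(string_bound) * 8 >= KEY_HASH_BITS
```

With a bound of 6 the maximum is 15 Bytes and the serialized key is used directly; with a bound of 8 the maximum is 17 Bytes and the MD5 digest is used. The decision takes only the declared bound as input, never the length of an actual string value.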
A concrete data value of this type such as { 32, "hello", …} would be encapsulated as:

    0......8.....16.....24.....32
    +------+------+------+------+
    | 0x00 | 0x00 | 0x00 | 0x20 |
    +------+------+------+------+
    | 0x00 | 0x00 | 0x00 | 0x06 |
    +------+------+------+------+
    | 'h'  | 'e'  | 'l'  | 'l'  |
    +------+------+------+------+
    | 'o'  | 0x00 | 0x00 | 0x00 |
    +------+------+------+------+

Note that, for clarity, we use a notation in which each byte is represented either as a hexadecimal number (e.g. 0x20) or as a character (e.g. 'h').

Example 2: Assume the following IDL-described type:

    struct TypeWithShortKey {
        long id;          /* assume defined as a key field */
        string name<8>;   /* assume defined as a key field */
        /* other non-key fields */
    };

Then we know that the maximum size for the CDR encapsulation of the key fields is 17 Bytes (4 for the 'id' field, plus 4 for the length of the string 'name', plus at most 9 Bytes for the string, which includes the extra byte for the terminating NUL). In this example the KeyHash_t shall be computed as:

    MD5( [CDR(id), CDR(name)] )

Disposition: