Issue 14357: OCL 2.1 Resolution of missing Concrete Syntaxes and Reserved Words (ocl2-rtf)

Source: Model Driven Solutions (Dr. Edward Willink, ed(at)willink.me.uk)
Nature: Uncategorized Issue
Severity:
Summary:
Define the concrete syntax of a simpleNameCS to avoid punctuation collisions, support Unicode characters, and add a double quoted form with escape sequences for awkward names.
Define the concrete syntax of a StringLiteralExpCS to support escape sequences for awkward characters.
Define the concrete syntax of RealLiteralExpCS and IntegerLiteralExpCS.
Define a variety of effectively reserved words such as true, self, Bag, String as reserved.

Resolution:
Revised Text:

At the end of 7.4 add

Multiple adjacent strings are concatenated allowing a long string to be specified on multiple lines.

    'This is a ' 'concatenated ''string' -- 'This is a concatenated string'

Unicode characters are used within single quoted sequences, with the following backslash based escape sequences used to define backslash and other characters.

    \b -- backspace
    \t -- horizontal tab
    \n -- linefeed
    \f -- form feed
    \r -- carriage return
    \" -- double quote
    \' -- single quote
    \\ -- backslash
    \xhh -- #x00 to #xFF
    \uhhhh -- #x0000 to #xFFFF

where h is a hex digit: 0 to 9, A to F or a to f.

Reserved words such as true and arbitrary awkward spellings may be used as names by enclosing the name in underscore-prefixed single quotes.

    self._'if' = _'tabbed\tvariable'._'spaced operation'()

In 7.4.8 replace

    is conceptually equal to the expression: a.+(b)

by

    is equivalent to the expression: a._'+'(b)

In the first paragraph of Section 9.3 replace

As a convention to the concrete syntax, conflicting properties or conflicting class names can be aliased using the «_» (underscore) prefix. Inside an OCL expression that is written with the concrete syntax, when a property name or a class name is found to start with a «_», firstly the symbol is lookup in the metamodel.
If not found, the same symbol with the «_» skipped is tried.

by

In the concrete syntax, names that are reserved words or include punctuation characters can be used by enclosing the required name in underscore-prefixed single quotes.

    _'and' _'>='

[In OCL 2.0 and 2.1 a reserved word could be used as a name after prefixing it with an underscore.

    _and

The subsequent symbol lookup would look first for the spelling with an underscore in the meta-model and if that was not found would attempt a further lookup after removing the underscore. This behaviour was indeterminate, could not access names that existed both with and without prefixes, and did not support punctuation characters. The simple underscore prefix is therefore deprecated in OCL 2.3 and will be removed in OCL 3.0.]

In 9.3 simpleNameCS replace

The exact syntax of a String is undefined in UML 1.4, and remains undefined in OCL 2.0. The reason for this is internationalization.

    simpleNameCS ::= <String>

    Abstract syntax mapping
        simpleNameGr.ast : String
    Synthesized attributes
        simpleNameGr.ast = <String>
    Inherited attributes
        -- none
    Disambiguating rules
        -- none

by

The abstract syntax of a simpleNameCS String is undefined in UML 2.3, and so is undefined in OCL 2.3. The reason for this is internationalization. The concrete syntax of a simpleNameCS String supports a Unicode letter-prefixed identifier (form [A]). Reserved words and names involving awkward characters such as punctuation may be specified by prefixing a String Literal with an '_' (form [B] and [C]).
    [A] simpleNameCS ::= NameStartChar NameChar*
    [B] simpleNameCS ::= '_' #x27 StringChar* #x27
    [C] simpleNameCS[1] ::= simpleNameCS[2] WhiteSpaceChar* #x27 StringChar* #x27

The identifier form starts with a Unicode letter:

    NameStartChar ::= [A-Z] | "_" | "$" | [a-z]
        | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF]
        | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D]
        | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF]
        | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]

and may continue with a Unicode letter or digit.

    NameChar ::= NameStartChar | [0-9]

The StringChar form is defined under StringLiteralExpCS.

Example simpleNameCS values are:

    String i3 a?et? MAX_VALUE isLetterOrDigit _'true' _'>=' _'\''

    Abstract syntax mapping
        simpleNameCS.ast : String
    Synthesized attributes
        [A] simpleNameCS.ast = <CodePoints of NameStartChar NameChar*>
        [B] simpleNameCS.ast = <CodePoints of StringChar*>
        [C] simpleNameCS[1].ast = simpleNameCS[2] + <CodePoints of StringChar*>
    Inherited attributes
        -- none
    Disambiguating rules
        [1] [A] the character, if any, following the last NameChar is not a NameChar.
        [2] [A] simpleNameCS.ast is not a reserved word
        [3] [B] No whitespace is permitted between the '_' and the first NameChar.
        [4] [C] simpleNameCS[2] is a simpleNameCS [B] or [C].

In 9.3 IntegerLiteralExpCS replace

This rule represents integer literal expressions.

    IntegerLiteralExpCS ::= <String>
    ...
    Synthesized attributes
        IntegerLiteralExpCS.ast.integerSymbol = <String>.toInteger()

by

This rule represents integer literal expressions. The lexical representation of an integer is a sequence of at least one of the decimal digit characters, without a leading zero; except that a single leading zero character is required for the zero value.

    IntegerLiteralExpCS ::= <Integer Lexical Representation>
    ...
    Synthesized attributes
        IntegerLiteralExpCS.ast.integerSymbol = <Integer Value>

In 9.3 RealLiteralExpCS replace

This rule represents real literal expressions.

    RealLiteralExpCS ::= <String>
    ...
    Synthesized attributes
        RealLiteralExpCS.ast.realSymbol = <String>.toReal()

by

This rule represents real literal expressions. A real literal consists of an integer part, a fractional part and an exponent part. The exponent part consists of either the letter 'e' or 'E', followed optionally by a '+' or '-' letter followed by an exponent integer part. Each integer part consists of a sequence of at least one of the decimal digit characters. The fractional part consists of the letter '.' followed by a sequence of at least one of the decimal digit characters. Either the fraction part or the exponent part may be missing but not both.

    RealLiteralExpCS ::= <Real Lexical Representation>
    ...
    Synthesized attributes
        RealLiteralExpCS.ast.realSymbol = <Real Value>

In 9.3 StringLiteralExpCS replace

This rule represents string literal expressions.

    StringLiteralExpCS ::= “<String> “
    ...
    Synthesized attributes
        StringLiteralExpCS.ast.symbol = <String>

by

This rule represents string literal expressions. The concrete syntax comprises a sequence of zero or more characters or escape sequences surrounded by single quote characters. The [B] form with adjacent strings allows a long string literal to be split into fragments or to be written across multiple lines.
    [A] StringLiteralExpCS ::= #x27 StringChar* #x27
    [B] StringLiteralExpCS[1] ::= StringLiteralExpCS[2] WhiteSpaceChar* #x27 StringChar* #x27

where

    StringChar ::= Char | EscapeSequence
    WhiteSpaceChar ::= #x09 | #x0a | #x0c | #x0d | #x20
    Char ::= [#x20-#x26] | [#x28-#x5B] | [#x5D-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    EscapeSequence ::= '\' 'b' -- #x08: backspace BS
        | '\' 't' -- #x09: horizontal tab HT
        | '\' 'n' -- #x0a: linefeed LF
        | '\' 'f' -- #x0c: form feed FF
        | '\' 'r' -- #x0d: carriage return CR
        | '\' '"' -- #x22: double quote "
        | '\' ''' -- #x27: single quote '
        | '\' '\' -- #x5c: backslash \
        | '\' 'x' Hex Hex -- #x00 to #xFF
        | '\' 'u' Hex Hex Hex Hex -- #x0000 to #xFFFF
    Hex ::= [0-9] | [A-F] | [a-f]

    Synthesized attributes
        [A] StringLiteralExpCS.ast.symbol = <CodePoints of StringChar*>
        [B] StringLiteralExpCS.ast.symbol = StringLiteralExpCS[2] + <CodePoints of StringChar*>

In 9.3 OperationCallExpCS remove

    [3] [C] The name of the referred Operation cannot be an operator.
        Set{‘+’,’-’,’*’,’/’,’and’,’or’,’xor’,’=’,’<=’,’>=’,’<‘,’>’}->excludes(simpleNameCS.ast)

At the end of 9.4.1 add

In OCL 2.0 and 2.1 a reserved word could be used as a name after prefixing it with an underscore. Therefore, for compatibility, a simpleNameCS[A] name with a leading underscore may need to be looked up twice. The symbol is first looked up in the meta-model with the underscore prefix, and if no value is found, the symbol is looked up again without the underscore prefix. A double lookup is not required for a simpleNameCS[B] or [C] name (an underscore-prefixed singly quoted string). The second lookup after removing the underscore prefix is deprecated in OCL 2.3 and will be discontinued in OCL 3.0. Tool implementors should provide a warning message for this deprecated usage.

Actions taken:
September 10, 2009: received issue
April 25, 2011: closed issue

Discussion:

See Issue 14583 for a resolution of reserved words and definition of reservedKeywordCS.
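As an illustration of the StringLiteralExpCS rules in the revised text above, the following Python sketch decodes a literal made of one or more single-quoted fragments. The function names (decode_fragment, decode_string_literal) are hypothetical and not taken from any OCL tool, and the character pattern is deliberately looser than the Char rule: it accepts any character other than a quote or a backslash, including control characters that Char excludes.

```python
import re

# Single-character escapes listed in the EscapeSequence rule.
SIMPLE_ESCAPES = {
    "b": "\b", "t": "\t", "n": "\n", "f": "\f", "r": "\r",
    '"': '"', "'": "'", "\\": "\\",
}

# One escape sequence, or one character that is neither a quote nor a backslash.
# Assumption: looser than the Char rule, which also excludes control characters.
STRING_CHAR = r"(?:\\x[0-9A-Fa-f]{2}|\\u[0-9A-Fa-f]{4}|\\[btnfr\"'\\]|[^'\\])"
FRAGMENT = re.compile(r"'(" + STRING_CHAR + r"*)'")
WHITESPACE = " \t\n\f\r"  # the WhiteSpaceChar rule


def decode_fragment(body: str) -> str:
    """Replace each escape sequence in an already validated fragment body."""
    def replace(match):
        escape = match.group(1)
        if len(escape) > 1:  # \xhh or \uhhhh
            return chr(int(escape[1:], 16))
        return SIMPLE_ESCAPES[escape]
    return re.sub(r"\\(x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}|.)", replace, body, flags=re.S)


def decode_string_literal(text: str) -> str:
    """Decode form [A], or form [B] with adjacent fragments concatenated."""
    pos, parts = 0, []
    while True:
        match = FRAGMENT.match(text, pos)
        if match is None:
            raise ValueError(f"malformed string literal at offset {pos}")
        parts.append(decode_fragment(match.group(1)))
        pos = match.end()
        # WhiteSpaceChar* may separate adjacent fragments.
        while pos < len(text) and text[pos] in WHITESPACE:
            pos += 1
        if pos == len(text):
            return "".join(parts)
```

Under this sketch the OCL 2.0 literal consisting of a lone backslash between quotes is rejected, which matches the change of meaning for the backslash noted in the discussion of this issue.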
Numbers

A valid integer literal should have no leading zeroes to avoid confusion with languages that use leading zeroes to indicate octal.

A valid integer (or real) literal should have no leading sign; a unary minus may be used to create negative values. If a leading minus is part of a number, there is a problem parsing "5-4" after tokenising as UnlimitedNatural(5) Integer(-4) rather than UnlimitedNatural(5) Letter(-) UnlimitedNatural(4).

A valid real literal should not have a leading or trailing dot to avoid confusion with a dot or dot dot operator. e.g. "1..2" should be a collection range rather than "1." and ".2". Prohibiting the edge dots avoids the ambiguity.

Strings

OCL 2.1 defines a string as a character string surrounded by single quotes, but defines no mechanism for defining a single quote. It is unclear how non-printable characters such as new-lines should be interpreted within strings, or how non-printable characters can be specified in the concrete syntax without causing problems with text editors and other tools that provide special treatment for non-printable characters.

QVT 1.0 Operational Mappings defines a number of Java-inspired extensions, although it applies Unicode escapes in the Concrete Syntax mapping rather than the character serialisation. QVTo also defines concatenation of adjacent strings.

Adoption of Java-like escape sequences solves the problem of awkward characters, although support for octal sequences seems unnecessary in modern languages; the QVTo specification of octal sequences is flawed ('\111' has three valid meanings).

Concatenation of adjacent strings is useful.

Introduction of \ as an escape character changes the semantics of '\' which in OCL 2.0 and 2.1 was a valid string defining a single backslash character, although many practical OCL 2.0 tools may have anticipated this specification of backslash sequences.

Identifiers

UML places no constraints on identifier spellings.
OCL 2.1 in 9.3 simpleNameCS echoes this lack of constraint, but fails to identify how the arbitrary Abstract Syntax can be realised in the Concrete Syntax.

Punctuation should not ever be a valid name in the Concrete Syntax. 9.3 OperationCallExpCS disambiguating rule 3 specifically prohibits the 'conceptual' example in 7.4.8.

A valid Concrete Syntax name should use a Unicode variant of a letter then letter-or-digit identifier.

An arbitrary Abstract Syntax name should be expressible by enclosing the correspondingly arbitrary character sequence in some form of quotes, using Java-like escaping definitions for awkward characters. The 'conceptual' example in 7.4.8 should therefore be valid using a quoted form.

OCL 2.1 uses single quotes, and although OCL 2.1 does not define any semantics for double quotes, derived languages such as QVTo do. It is therefore appropriate to use the existing underscore prefix for reserved words to prefix a singly quoted string and so convert the string literal to an identifier. This accommodates any spelling and since the need for conversion is rare the clumsiness of three characters is acceptable.

9.3 OperationCallExpCS disambiguating rule 3 should therefore prohibit only unquoted punctuation and reserved words. a.+(b) or a.and(b) is invalid, but a._'+'(b) or a._'and'(b) is valid. The prohibition is therefore on the reserved spelling rather than the use of the referenced operation.

A similar escaping mechanism should apply to awkward characters as defined for single quoted strings, so that the \= operation can be invoked as a._'\\='(b).

Since the underscore-prefixed single quotes support any awkward characters, the OCL 2.0 and 2.1 underscore identifier prefix is redundant. In view of the inadequacies highlighted in 14224 and the need to avoid the underscore prefix syntax applying within _'_self', it is appropriate to deprecate the old underscore prefix with a view to removing it in OCL 3.0.
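The lexical rules discussed above (integers without leading zeroes, reals without edge dots or signs, letter-led identifiers, and underscore-prefixed quoted names) can be sketched as Python regular expressions. This is only an illustration under stated assumptions: classify and RESERVED_SAMPLE are hypothetical names, the reserved-word set is a tiny sample (the real list is the subject of Issue 14583), leading zeroes are accepted in the integer part of a real because the revised text does not restrict them, and the adjacent-fragment form [C] of simpleNameCS is not covered.

```python
import re

# IntegerLiteralExpCS: digits without a leading zero, or the single digit 0.
INTEGER = re.compile(r"0|[1-9][0-9]*")

# RealLiteralExpCS: integer part, then a fraction, an exponent, or both.
# Assumption: leading zeroes in the integer part are accepted.
REAL = re.compile(r"[0-9]+(?:\.[0-9]+(?:[eE][+-]?[0-9]+)?|[eE][+-]?[0-9]+)")

# NameStartChar ranges copied from the revised simpleNameCS text.
NAME_START = (
    "A-Z_$a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
NAME = re.compile(f"[{NAME_START}][{NAME_START}0-9]*")  # form [A]

# Form [B]: underscore immediately followed by a single-quoted StringChar sequence.
QUOTED_NAME = re.compile(
    r"_'(?:\\x[0-9A-Fa-f]{2}|\\u[0-9A-Fa-f]{4}|\\[btnfr\"'\\]|[^'\\])*'"
)

# Illustrative sample only; not the normative reserved-word list.
RESERVED_SAMPLE = {"and", "if", "true"}


def classify(token: str) -> str:
    """Return the lexical category of a complete token."""
    if INTEGER.fullmatch(token):
        return "integer"
    if REAL.fullmatch(token):
        return "real"
    if QUOTED_NAME.fullmatch(token):
        return "quoted name"
    if NAME.fullmatch(token):
        return "reserved word" if token in RESERVED_SAMPLE else "name"
    return "invalid"
```

With these patterns "007", "1." and ".2" are rejected, while the awkward spellings _'>=' and _'true' are accepted as names.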
End of Annotations:=====

Date: Thu, 10 Sep 2009 20:14:21 +0100
From: Ed Willink
To: issues@omg.org
Subject: OCL 2.1 Resolution of missing Concrete Syntaxes and Reserved Words

Hi

The attached provides revised text to:

Define the concrete syntax of a simpleNameCS to avoid punctuation collisions, support Unicode characters, and add a double quoted form with escape sequences for awkward names.
Define the concrete syntax of a StringLiteralExpCS to support escape sequences for awkward characters.
Define the concrete syntax of RealLiteralExpCS and IntegerLiteralExpCS.
Define a variety of effectively reserved words such as true, self, Bag, String as reserved.

Regards
Ed Willink

Date: Tue, 27 Oct 2009 17:12:44 +0000
From: Ed Willink
To: ocl2-rtf@omg.org
Subject: Re: Issue 14357 Resolution of missing Concrete Syntaxes and Reserved Words

Hi Mariano

Attached resolution defines the concrete syntax of Strings, Integers, Reals and Identifiers, satisfying the requirements for arbitrary string/identifier characters.

Arbitrary strings can have escape sequences. Doubled single quote is a single quote. Strings can be concatenated across lines. Arbitrary identifiers can have escape sequences when expressed in a double quoted form.

Reserved words are a much more complicated issue; see Issue 14583.
Regards
Ed Willink

14357-ConcreteSyntaxes.odt

Date: Sat, 14 Nov 2009 18:49:01 +0000
From: Ed Willink
To: "'ocl2-rtf@omg.org'"
Subject: Re: Issue 14357 Resolution of missing Concrete Syntaxes and Reserved Words

Hi

Attached revised resolution:
a) Adopts _'xxx' rather than "xxx" as an option for awkward identifier spelling
b) Provides the overlapping resolution for Issue 14224 that _xxx is always xxx.

Regards
Ed Willink

14357-ConcreteSyntaxes1.odt

Sender: Adolfo Sanchez Barbudo
Date: Sun, 15 Nov 2009 19:20:18 +0000
From: Adolfo Sánchez-Barbudo Herrera
Organization: Open Canarias S.L.
To: Ed Willink
CC: "'ocl2-rtf@omg.org'"
Subject: Re: Issue 14357 Resolution of missing Concrete Syntaxes and Reserved Words

Hi Ed,

I think you forgot to adjust some parts of the revised text:

Reserved words such as true and arbitrary awkward spellings may be used as names by enclosing the name in double quotes.

    self."if" = "tabbed\tvariable"."spaced operation"()

In 7.4.8 replace
    is conceptually equal to the expression: a.+(b)
by
    is equivalent to the expression: a."+"(b)

Should be:

Reserved words such as true and arbitrary awkward spellings may be used as names by enclosing the name in single quotes and preceded by an underscore:

    self._'if' = _'tabbed\tvariable'._'spaced operation'()

In 7.4.8 replace
    is conceptually equal to the expression: a.+(b)
by
    is equivalent to the expression: a._'+'(b)

---------

Apart from that, the resolution seems to prohibit an _id property from also being accessed by "myVar._id" (you have to write __id), which looks like another example of a compatibility problem (there could be a lot of OCL expressions which are already using "myVar._id" to access an "_id" property).
I guess you want to remove this alternative to make the processing from the concrete syntax uniform (the first _ is always removed from the identifier's symbol, and just the said symbol is looked up once). Maybe the reason comes from the QVTr issue you mention in the 14224 resolution, which I honestly haven't followed.

In any case, I think that issue 14224 can be solved without prohibiting "myVar._id" from accessing an "_id" property. The idea is that you first look up the name WITHOUT <<_>>. If not found, you look up the name WITH the <<_>>. This obviously may provoke an unnecessary second lookup, which could be avoided if everybody knows and understands the use of the underscore in OCL 2.3. Besides, in the case where I have an "_id" property, the use of "myVar._id" could be confusing and may vary depending on whether I also have an "id" property or not.

Again, we have an interesting debate concerning the compatibility problems a specification change may provoke. What does the RTF think?

Cheers,
Adolfo.

Ed Willink wrote:
>>> Hi
>>> Attached revised resolution:
>>> a) Adopts _'xxx' rather than "xxx" as an option for awkward identifier spelling
>>> b) Provides the overlapping resolution for Issue 14224 that _xxx is always xxx.
>>> Regards
>>> Ed Willink

--
Adolfo Sánchez-Barbudo Herrera
adolfosbh(at)opencanarias(dot)com
C/Elías Ramos González, 4, ofc. 304
38001 SANTA CRUZ DE TENERIFE
Tel.: +34 922 240231 / +34 617 718268

Date: Mon, 16 Nov 2009 09:18:31 +0000
From: Ed Willink
To: "'ocl2-rtf@omg.org'"
Subject: Re: Issue 14357 Resolution of missing Concrete Syntaxes and Reserved Words

Hi Adolfo

>>> I think you forgot to adjust some parts of the revised text:

Agreed.
>>> Apart from that, the resolution seems to prohibit an _id property from also being accessed by "myVar._id" (you have to write __id), which looks like another example of a compatibility problem (there could be a lot of OCL expressions which are already using "myVar._id" to access an "_id" property).
>>>
>>> I guess you want to remove this alternative to make the processing from the concrete syntax uniform (the first _ is always removed from the identifier's symbol, and just the said symbol is looked up once). Maybe the reason comes from the QVTr issue you mention in the 14224 resolution, which I honestly haven't followed.
>>>
>>> In any case, I think that issue 14224 can be solved without prohibiting "myVar._id" from accessing an "_id" property. The idea is that you first look up the name WITHOUT <<_>>. If not found, you look up the name WITH the <<_>>. This obviously may provoke an unnecessary second lookup, which could be avoided if everybody knows and understands the use of the underscore in OCL 2.3. Besides, in the case where I have an "_id" property, the use of "myVar._id" could be confusing and may vary depending on whether I also have an "id" property or not.
>>>
>>> Again, we have an interesting debate concerning the compatibility problems a specification change may provoke.

Ow! This is horrible.

In a meta-model with both _self and self.

A) OCL 2.0 looks up _self as _self else self and so does not permit access to self.
B) The 14224 submission, 14357 resolution looks up _self as self always and so requires __self to access _self.
C) Adolfo's idea looks up _self as self else _self and so does not permit access to _self.

The original problem with A) was obscure and is soluble by using _'self'. The problem with C) is perhaps even more obscure and also soluble by _'_self' if the quoted name is literal. B) is simpler long term but has an only slightly obscure incompatibility.

None of these work.
I followed the simplification too readily because I wanted to avoid _'_self' being resolved as _self in one parsing phase and then as self in another, either causing confusion from double _ treatments, or requiring implementations to maintain context to indicate whether the AS _self came from _self or _'_self'.

C) provides a bridge to B) whereby the fallback lookup is provided for compatibility, but is deprecated for removal in OCL 3.0.

A) and C) are incompatible whenever a meta-model contains both an x and an _x property.

-----

B) is simple to understand, specify and implement.

A) and C) both expose the failure to integrate the first paragraph of 9.3 into the specification of Environment::lookupLocal. The problem is that if _'_x' is to look up precisely _x, whereas _x looks up _x or x, lookupLocal must be sensitive to whether an argument of _x originated as _'_x' or _x. Only the latter can be _x or x. The meta-model dependency cannot be performed in the simpleNameCS[B] synthesized attribute because it is not known whether the simpleNameCS is a reference to a property, class, or ... or a definition (without passing additional environment information that may be suspect for syntaxes in need of disambiguation).

I see the following options for _xx and _'xx' support

a) Adopt B) and introduce an incompatibility for accessing _xx.
b) Abandon _'xx' and fail to support arbitrary identifiers
c) Add an isLiteral argument to lookupLocal and related functions
d) Change lookupLocal and related functions to have a simpleNameCS rather than String argument
e) Add a parallel family of lookupLocal and related functions with a simpleNameCS rather than String argument

Environment is a 'less normative' part of the specification so c) and d) are perhaps possible.

I prefer a) but doubt that it's acceptable. b) achieves nothing. c) and d) have minor compatibility problems so that leaves e) which aligns with a change I had already been considering for the MDT/OCL implementation.
e) maintains compatibility and implements the specification by performing the potentially two invocations of lookupLocal(String) within lookupLocal(SimpleNameCS).

-----

So, unless an incompatible change requiring accesses to _id to be rewritten as __id is acceptable, I propose that we:

- specify a parallel family of lookupXXX functions with simpleNameCS/pathNameCS arguments to implement the unquoted _ lookup
- interpret a reference _x as x if there is an x, else as _x if there is an _x, else invalid
- interpret a definition _x as x
- specify that interpretation of _x as _x is deprecated

Regards
Ed Willink

Date: Mon, 16 Nov 2009 20:02:52 +0000
From: Ed Willink
To: ocl2-rtf@omg.org
Subject: Re: Issue 14357 Resolution of missing Concrete Syntaxes and Reserved Words

Hi Mariano

I think that you're proposing a different long term solution.

myobject.name always looks up exactly name and name may have a leading underscore, but may not be a reserved word and may not have punctuation characters.

The new _'name' syntax is available to look up exactly name, with a completely arbitrary character sequence.

In the interim, the entire OCL 2.0 _ prefix syntax is preserved but deprecated for removal in OCL 3.0. In practice the preferred lookup of _x resolving to _x is supported unchanged, the fall-back lookup of _x resolving to x requires a warning message.

Eventually this eliminates simpleNameCS[B] and allows _ as a NameStartChar for simpleNameCS[A].

This is even simpler than my B) solution, avoiding a problem for _id, but requiring a rewrite to e.g. _'domain' in QVTr2QVTc.

I quite like this; it is fewer syntaxes for users to remember; there is no subtle lexical mangling, just one very obvious gross mangle.
Unfortunately we ought to specify the OCL 2.3 semantics accurately, which requires changes/augmentation to lookupXXXX signatures. I favour the change to simpleNameCS/pathNameCS arguments.

However, if we deprecate the entire OCL 2.0 _ prefix syntax, we could get away with a paragraph explaining the compatibility in words without changing lookupXXXX at all.

Regards
Ed Willink

mariano.belaunde@orange-ftgroup.com wrote:

Hi Ed and Adolfo,

Very interesting discussion. My feedback below.

If a class in a MM has an _id property, I think an OCL user would feel much more comfortable if in all circumstances (H,I,J from Oscar message), myobject._id references this property (avoiding double underscores). The actual OCL 2.0 resolution order for the '_' convention guarantees this.

Now in the case of a keyword, like 'self', since we are introducing the new notation _'xxx', as you reported the issue 14224 is no longer an issue (access to the 'self' property can now be done with the new notation myobject._'self', and myobject._self will access _self if _self exists).

I guess an OCL 2.3 tool could warn that use of unquoted _ to escape keywords is obsolete but retain it in OCL 2.3 for compatibility. But if, at some version of the standard, we remove the deprecated unquoted _ lookup convention, this would mean simply that the second lookup call will no longer be executed (underscore _ becomes an ordinary character except when immediately followed by quotes).

Concerning _'_self', from my point of view, the treatment of the part within quotes should not be recursive: the expression is solved in one pass as a reference to the property _self. If the _self property does not exist, it does not look for a property named self and it simply raises an error. Do you agree?

If you think that, after making necessary adjustments to lookupXXX definitions in 9.3, we still have a problem concerning solution (A), please let me know. Maybe I miss something.
Cheers,
Mariano

Subject: RE: Issue 14357 Resolution of missing Concrete Syntaxes and Reserved Words
Date: Tue, 17 Nov 2009 10:28:50 +0100

Hi Ed,

>>> I quite like this; it is fewer syntaxes for users to remember; there is no subtle lexical mangling, just one very obvious gross mangle.

Great. So if no major objection comes, let's follow this.

>>> but requiring a rewrite to e.g. _'domain' in QVTr2QVTc

This would mean issuing a specific issue for the QVT revision that will be aligned with OCL 2.3.

>>> However, if we deprecate the entire OCL 2.0 _ prefix syntax, we could get away with a paragraph explaining the compatibility in words without changing lookupXXXX at all.

OK. I find this acceptable: since the functionality is deprecated, but retained temporarily for compatibility in v2.3, we can explain in words how lookupXXXX functions would need to be changed to support the deprecated feature.
This will avoid extra effort on formalizing something which is only temporary and which usage is no more reccommended. Regards, Mariano -------------------------------------------------------------------------------- De : Ed Willink [mailto:ed@willink.me.uk] EnvoyĂ© lundi 16 novembre 2009 21:03 Ă€: ocl2-rtf@omg.org Objet : Re: Issue 14357 Resolution of missing Concrete Syntaxes and Reserved Words Hi Mariano I think that you're proposing a different long term solution. myobject.name always looks up exactly name and name may have a leading underscore, but may not be a reserved word and may not have punctaution characters. The new _'name' syntax is available to look up exactly name, with a completely arbitrary character sequence. In the interim, the entire OCL 2.0 _ prefix syntax is preserved but deprecated for removal in OCL 3.0. In practice the preferred lookup of _x resolving to _x is supported unchanged, the fall-back lookup of _x resolving to x requires a warning message. Eventually this eliminates simpleNameCS[B] and allows _ as a NameStartChar for simpleNameCS[A]. This is even simpler than my B) solution, avoiding a problem for _id, but requiring a rewrite to e.g. _'domain' in QVTr2QVTc. I quite like this; it is fewer syntaxes for users to remember; there is no subtle lexical mangling, just one very obvious gross mangle. Unfortunately we ought to specify the OCL 2.3 semantics accurately, which requires changes/augmentation to lookupXXXX signatures. I favour the change to simpleNameCS/pathNameCS arguments. However, if we deprecate the entire OCL 2.0 _ prefix syntax, we could get away with a paragraph explaining the compatibility in words without changing lookupXXXX at all. Regards Ed Willink mariano.belaunde@orange-ftgroup.com wrote: Hi Ed and Adolfo, Very interesting discussion. My feedback below. 
If a class in a MM has an _id property, I think an OCL user would feel much more comfortable if, in any circumstances (H, I, J from Oscar's message), myobject._id references this property (avoiding double underscores). The OCL 2.0 resolution order for the '_' convention actually guarantees this.

Now in the case of a keyword, like 'self', since we are introducing the new notation _'xxx', as you reported, issue 14224 is no longer an issue (access to the 'self' property can now be done with the new notation myobject._'self', and myobject._self will access _self if _self exists).

I guess an OCL 2.3 tool could warn that use of the unquoted _ to escape keywords is obsolete but retain it in OCL 2.3 for compatibility. But if, at some version of the standard, we remove the deprecated unquoted _ lookup convention, this would simply mean that the second lookup call will no longer be executed (the underscore _ becomes an ordinary character except when immediately followed by quotes).

Concerning _'_self', from my point of view, the treatment of the part within quotes should not be recursive: the expression is resolved in one pass as a reference to the property _self. If the _self property does not exist, it does not look for a property named self and it simply raises an error. Do you agree?

If you think that, after making the necessary adjustments to the lookupXXX definitions in 9.3, we still have a problem concerning solution (A), please let me know. Maybe I am missing something.

Cheers,
Mariano

-----Original Message-----
From: Ed Willink [mailto:ed@willink.me.uk]
Sent: Monday, 16 November 2009 10:19
To: 'ocl2-rtf@omg.org'
Subject: Re: Issue 14357 Resolution of missing Concrete Syntaxes and Reserved Words

Hi Adolfo

I think you forgot to adjust some parts of the revised text:

Agreed.
Apart from that, the resolution seems to prohibit an _id property from also being accessed by "myVar._id" (you have to write __id), which looks like another example of a compatibility problem (there could be a lot of OCL expressions which are already using "myVar._id" to access an "_id" property). I guess you want to remove this alternative to make the processing from the concrete syntax uniform (the first _ is always removed from the identifier's symbol, and just that symbol is looked up once). Maybe the reason comes from the QVTr issue you mention in the 14224 resolution, which I honestly haven't followed.

In any case, I think that issue 14224 can be solved without prohibiting "myVar._id" from accessing an "_id" property. The idea is that you first look up the name WITHOUT <<_>>. If not found, you look up the name WITH the <<_>>. This obviously may provoke an unnecessary second lookup, which could be avoided if everybody knew and understood the use of the underscore in OCL 2.3. Besides, in the case where I have an "_id" property, the use of "myVar._id" could be confusing and may vary depending on whether I also have an "id" property or not.

Again, we have an interesting debate concerning the compatibility problems a change to a specification may provoke.

Ow! This is horrible.

In a meta-model with both _self and self:

A) OCL 2.0 looks up _self as _self else self and so does not permit access to self.
B) The 14224 submission, 14357 resolution looks up _self as self always and so requires __self to access _self.
C) Adolfo's idea looks up _self as self else _self and so does not permit access to _self.

The original problem with A) was obscure and is soluble by using _'self'. The problem with C) is perhaps even more obscure and also soluble by _'_self' if the quoted name is literal. B) is simpler long term but has an only slightly obscure incompatibility.

None of these work.
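The three lookup orders A), B) and C) can be sketched as follows. This is only an illustration under stated assumptions: a meta-model is reduced to a set of property names, and the function names lookup_a, lookup_b and lookup_c are hypothetical; none of this is the specification's Environment::lookupLocal.

```python
# Illustrative only: a meta-model is modelled as a set of property names,
# and the three function names below are hypothetical.

def lookup_a(name, properties):
    """A) OCL 2.0: try the name as written, then without the leading underscore."""
    if name in properties:
        return name
    if name.startswith("_") and name[1:] in properties:
        return name[1:]
    return None


def lookup_b(name, properties):
    """B) 14224 submission: strip one leading underscore, then look up once."""
    stripped = name[1:] if name.startswith("_") else name
    return stripped if stripped in properties else None


def lookup_c(name, properties):
    """C) Adolfo's idea: try without the leading underscore, then as written."""
    if name.startswith("_") and name[1:] in properties:
        return name[1:]
    return name if name in properties else None


# A meta-model that has both spellings.
both = {"self", "_self"}
for lookup in (lookup_a, lookup_b, lookup_c):
    print(lookup.__name__, "resolves _self to", lookup("_self", both))
print("lookup_b resolves __self to", lookup_b("__self", both))
```

With both properties present, A) resolves _self to _self and so never reaches self, while B) and C) resolve it to self; only B) offers __self as a route back to _self.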
I followed the simplification too readily because I wanted to avoid _'_self' being resolved as _self in one parsing phase and then as self in another, either causing confusion from double _ treatments, or requiring implementations to maintain context to indicate whether the AS _self came from _self or _'_self'.

C) provides a bridge to B) whereby the fallback lookup is provided for compatibility, but is deprecated for removal in OCL 3.0.

A) and C) are incompatible whenever a meta-model contains both an x and an _x property.

-----

B) is simple to understand, specify and implement.

A) and C) both expose the failure to integrate the first paragraph of 9.3 into the specification of Environment::lookupLocal.

The problem is that if _'_x' is to look up precisely _x, whereas _x looks up _x or x, lookupLocal must be sensitive to whether an argument of _x originated as _'_x' or _x. Only the latter can be _x or x. The meta-model dependency cannot be resolved in the simpleNameCS[B] synthesized attribute because it is not known whether the simpleNameCS is a reference to a property, class, or ... or a definition (without passing additional environment information that may be suspect for syntaxes in need of disambiguation).

I see the following options for _xx and _'xx' support:

a) Adopt B) and introduce an incompatibility for accessing _xx.
b) Abandon _'xx' and fail to support arbitrary identifiers.
c) Add an isLiteral argument to lookupLocal and related functions.
d) Change lookupLocal and related functions to have a simpleNameCS rather than String argument.
e) Add a parallel family of lookupLocal and related functions with a simpleNameCS rather than String argument.

Environment is a 'less normative' part of the specification so c) and d) are perhaps possible.

Date: Thu, 12 Nov 2009 10:18:06 +0000
From: Adolfo Sánchez-Barbudo Herrera
Organization: Open Canarias S.L.
To: Ed Willink
CC: ocl2-rtf@omg.org
Subject: Re: Preparing first ballot for OCL 2.3 RTF

Hi Ed,

Yes. To clarify; in the two places where the editing instruction says "In 8.1 CollectionLiteralExp replace" please read it as "In 8.3.7 CollectionLiteralExp replace"

Of course

----------------------- Issue 14357 ------------------------

About the double quotes: As commented in the Issue resolution, this one will create an incompatibility in QVTo due to the use of double quotes for strings. Expressions like 2."+"(2) will fail in QVTo, and 2.+(2) wouldn't be valid anymore. A new Issue to remove the use of double quotes as String would be needed (with the compatibility problems this change produces).

2.+(2) was not valid in OCL 2.0. QVTo can define it as an extension without problem at present, but there is a risk that future OCL evolution, perhaps involving regular expression syntaxes, may provide a conflicting semantic.

The "x" compatibility is certainly unfortunate, but it's QVTo's problem. QVT claims OCL compatibility. OCL like many other languages has exactly one clear syntax for a string.
Introducing the "x" alternative is confusing, rather than helpful, for users of C-like languages where 'x' means character literal and "x" means string literal. It allows programmers to think that "x" is valid OCL and then get confused as to why 'x' is not a character literal in QVTo and why "x" doesn't work in any other OCL tool including QVTr or QVTc. Users of OCL are not helped by trying to make OCL masquerade as Java. Overloading the quotes was not compatible with the spirit of OCL and now that incompatibility causes the problem.

?? A new Issue could resolve the permitted extensibility of OCL to clarify what an extending language may do without imposing constraints on OCL evolution ??

Sorry for not being clear enough:

1. I should have said that I'm in favour of the issue resolution. I only wanted to remark that this change has a big impact on QVTo, and hence:
2. About the new issue, I meant a new one against the QVT spec, of course >.<. This change needs a quick adaptation in the QVTo spec. As soon as this change is introduced in OCL, an immediate QVT issue (and resolution) is required to forbid the use of double quotes for String literals. Otherwise neither 2.+(2) nor 2."+"(2) would be possible in QVTo: only 2 + 2.

About the assertion that 2.+(2) was not valid in OCL 2.0: let me say that this is debatable. If I read in a specific section (7.4.8) that:

The expression: a + b is conceptually equal to the expression: a.+(b)

I could pedantically say that the OCL spec lets you write 2.+(2). The point is that this section contradicts other sections, such as the concrete syntax, which only allows the use of a SimpleName in operation call expressions. So, we have two options to solve this contradiction:
- Change section 7.4.8 so that it does not contradict the other sections
- or change the other sections (for instance, an operation call exp) so that they do not contradict section 7.4.8.
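To make the two spellings concrete, here is a minimal sketch that rewrites an infix expression into the operation-call form. It assumes the underscore-prefixed single-quote spelling from the Revised Text (a._'+'(b)) rather than the a.+(b) form of 7.4.8; the helper names quoted_name and as_operation_call are hypothetical and exist only for this illustration.

```python
# Illustrative only: quoted_name and as_operation_call are hypothetical helpers.

def quoted_name(name):
    """Wrap a name in underscore-prefixed single quotes, escaping \\ and '."""
    escaped = name.replace("\\", "\\\\").replace("'", "\\'")
    return "_'" + escaped + "'"


def as_operation_call(source, operator, argument):
    """Spell the infix expression 'source operator argument' as an operation call."""
    return f"{source}.{quoted_name(operator)}({argument})"


print(as_operation_call("a", "+", "b"))
print(as_operation_call("2", "+", "2"))
```

Because the operator sits inside a quoted name, the call form no longer needs a bare operator where the concrete syntax expects a simple name.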
I agree with the resolution; using an operator or a reserved keyword in an operation call exp is, at least, weird (but it could be tractable). However, until the contradiction above is resolved, I definitely wouldn't dare to assert that 2.+(2) is not valid in OCL. There is a section which contradicts that assertion.

Best regards,
Adolfo.

Regards

Ed Willink

--
Adolfo Sánchez-Barbudo Herrera
adolfosbh(at)opencanarias(dot)com
C/Elías Ramos González, 4, ofc. 304
38001 SANTA CRUZ DE TENERIFE
Tel.: +34 922 240231 / +34 617 718268

Date: Thu, 12 Nov 2009 10:39:16 +0000
From: Ed Willink
To: ocl2-rtf@omg.org
Subject: Re: Preparing first ballot for OCL 2.3 RTF

Hi Adolfo

7.4.8 is certainly debatable, but not normative.

9.3 OperationCallExpCS Disambiguation Rule is normative.

[3] [C] The name of the referred Operation cannot be an operator.
Set{'+','-','*','/','and','or','xor','=','<=','>=','<','>'}->excludes(simpleNameCS.ast)

The resolution has to remove this rule to allow 2."+"(2).

Regards

Ed

Subject: RE: Preparing first ballot for OCL 2.3 RTF
Date: Thu, 12 Nov 2009 12:44:41 +0100

Hi Adolfo and Ed,

Concerning the issue on use of double quotes, I have a different opinion.

The target of QVTo is people who are familiar with modern imperative programming languages like Java, Javascript and Python. That's why it was important to minimize the gap between OCL syntax and the syntax commonly used in these programming languages. One of the big problems was the mismatch with the equality operator: OCL uses '=' whereas programmers are more familiar with "==". That's why QVTo allows people to use "==" in place of "=". The same with strings: most programmers use double quotes and single quotes interchangeably (double is even more common). That's why QVTo has introduced this flexibility.

So, from my point of view, QVTo should not change that: flexibility with use of double quotes should continue. This is more important than trying to be 100% compatible with OCL at any cost.

For OCL, from my point of view, use of double quotes symmetrically with use of single quotes should also be possible for strings. But if most OCL practitioners don't like this evolution, I will not necessarily push in this direction.
Anyway, I would be happy to hear arguments for not considering this evolution in OCL. I don't think the case of the C language is useful here, since the distinction between characters and strings does not exist in OCL.

Cheers,
Mariano

StringConcreteSyntax.odt