Issue 1642: Java to IDL identifier mapping (java2idl-rtf)

Source: (, )
Nature: Revision
Severity:
Summary: orbos/98-02-01 says:

    We have provided name manglings to work around [the limitations], but we recommend that OMG consider extending the definitions of IDL identifiers so that a wider range of unicode characters can be supported and so that case is significant in distinguishing identifiers.

I would suggest to remove this recommendation because it is pragmatically unimplementable for implementation languages that do not support extended character sets. If the OMG were to take this step, it would effectively alienate CORBA from all such implementation languages, which I believe would be detrimental to the success of CORBA.

    Because overloaded methods are a popular feature in object-oriented programming languages we recommend that OMG considers extending IDL to allow overloaded methods.

Again, I suggest to remove the recommendation because it is pragmatically unimplementable. Steve Vinoski recently posted an interesting article on this topic in comp.object.corba -- I have attached a copy below.

Resolution:
Revised Text:
Actions taken:
July 8, 1998: received issue
February 23, 1999: closed issue

Discussion:

End of Annotations:=====
Return-Path:
X-Authentication-Warning: tigger.dstc.edu.au: michi owned process doing -bs
Date: Tue, 7 Jul 1998 22:32:42 +1000 (EST)
From: Michi Henning
To: java2idl-rtf@omg.org, issues@omg.org
Subject: Java to IDL identifier mapping

orbos/98-02-01 says:

    We have provided name manglings to work around [the limitations], but we recommend that OMG consider extending the definitions of IDL identifiers so that a wider range of unicode characters can be supported and so that case is significant in distinguishing identifiers.

I would suggest to remove this recommendation because it is pragmatically unimplementable for implementation languages that do not support extended character sets.
If the OMG were to take this step, it would effectively alienate CORBA from all such implementation languages, which I believe would be detrimental to the success of CORBA.

    Because overloaded methods are a popular feature in object-oriented programming languages we recommend that OMG considers extending IDL to allow overloaded methods.

Again, I suggest to remove the recommendation because it is pragmatically unimplementable. Steve Vinoski recently posted an interesting article on this topic in comp.object.corba -- I have attached a copy below.

On the topic of name mangling:

    It is possible that the name mangling rules we define in Section 5.2 may result in name collisions. [...] We believe that in practice the name mangling rules we have chosen will create names that would not have been used by normal Java programmers, and that the risk of of namepsace collisions by the mangled names is acceptably low.

I take it that if such a collision occurs due to the presence of an abnormal Java programmer, the Java is not translatable to legal IDL? If so, this should be explicitly stated.

The name mangling algorithm in general results in virtually impossible-to-use names in the IDL. For example, the spec quotes a three-character identifier that results in a seven-character mangled identifier (a$b -> aU0024b). For a three-character identifier consisting of characters that are not legal in IDL, this results in an 18-character IDL identifier that looks something like

    U0024BU0024CU0024D

This is not to mention things like a 6-character Kanji or Hangul identifier, which would result in a 36-character IDL identifier that is completely incomprehensible.

Similarly, the mangling for overloaded operations results in identifiers such as

    hello__long__a_b_c__long
    hello__CORBA_sequences_sequence_of_long

These identifier names are the result of overloaded Java operations that are perfectly reasonable and innocuous.
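
As a rough illustration of the character mangling discussed above, the sketch below replaces each character outside an assumed safe set with "U" followed by four uppercase hex digits, which reproduces the a$b -> aU0024b example. The class name IdlNameMangler, its method names, and the choice of safe set (ASCII letters, digits, and underscore) are assumptions made for illustration, not taken from orbos/98-02-01; the actual rules in Section 5.2 may well differ in detail.

```java
// Hypothetical helper for illustration only; not part of orbos/98-02-01.
final class IdlNameMangler {
    private IdlNameMangler() {}

    // Assumption: only ASCII letters, digits and '_' pass through unchanged.
    static boolean isPlainIdlChar(char c) {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || c == '_';
    }

    // Replace every other UTF-16 code unit with 'U' plus four uppercase hex digits.
    static String mangle(String javaName) {
        StringBuilder idl = new StringBuilder();
        for (int i = 0; i < javaName.length(); i++) {
            char c = javaName.charAt(i);
            if (isPlainIdlChar(c)) {
                idl.append(c);
            } else {
                idl.append(String.format("U%04X", (int) c));
            }
        }
        return idl.toString();
    }

    public static void main(String[] args) {
        System.out.println(mangle("a$b")); // prints aU0024b
    }
}
```

Every replaced character grows to several characters, so an identifier written entirely in a non-Latin script ends up as an unbroken run of hex escapes, which is the readability problem raised above.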
For Java identifiers differing only in case, the name mangling creates identifiers such as jack_, Jack_0, and jAcK_1_3.

I am doubtful that such a name mangling is practical. I think it is likely that anyone will turn away in disgust from a Java interface that has been translated this way if it contains even a handful of such identifiers. What is the point of having a mapping if it creates identifiers that no-one can use?

Why not take a different approach?

1) Compile the Java. If the Java input contains any identifiers that don't map cleanly or the Java contains overloaded operations, the compiler produces a list of the problematic identifiers, such as

        a$b
        jack
        Jack
        jAcK
        void hello();
        void hello(int x, a.b.c y, int z);
        void hello(int z[]);

2) I now augment the list given to me by the compiler by adding the identifiers I want to use to map away from the problems. In other words, I edit the list to read:

        a$b                                   a_dollar_b
        jack                                  jack
        Jack                                  big_jack
        jAcK                                  mixed_up_jack
        void hello();                         hello
        void hello(int x, a.b.c y, int z);    hello3
        void hello(int z[]);                  hello_array

3) I compile the Java spec a second time and feed the file I have just created into the compiler to augment the compilation. The compiler uses the identifiers I have specified to map away from the problems.

With this approach, I get IDL that is readable, makes sense to a human, is usable, and actually has some chance of not being rejected by everyone. The implementation effort is no larger than mangling names.

Nice, easy, understandable, and effective. Why not use it?

Cheers,

Michi.
--
Michi Henning               +61 7 33654310
DSTC Pty Ltd                +61 7 33654311 (fax)
University of Qld 4072      michi@dstc.edu.au
AUSTRALIA                   http://www.dstc.edu.au/BDU/staff/michi-henning.html

Steve's article:

Subject: Re: Why no overloaded methods in IDL?
From: Steve Vinoski
Date: 1998/06/20
Message-ID: <358B6423.F10D57DE@iona.com>
Newsgroups: comp.object.corba

I chopped out a bunch of stuff in a failed attempt to keep this short.
The topic is whether IDL should support the overloading of operation names.

Greg Pasquariello wrote:

> The Java-to-IDL spec already forces this, in the opposite direction. In
> addition, this is only an issue for mappings to languages that do not
> support IDL.

On a side note, my own personal opinion is that they can do whatever they want with the Java-to-IDL mapping, including making it as ugly as they like, as long as they don't try to change IDL just to make that mapping easier. The recent damage to the OMG IDL grammar and the CORBA object model caused by the Java-hype-influenced Objects By Value Specification is bad enough. But don't get me started.

> This is not directed personally at anyone, but so far all I've seen are
> people telling me why it can't be done! I think something that allows for a
> richer interface mapping, particularly one that maps naturally to many
> modern languages, should be seriously considered.

I agree with Greg that it would have been much nicer to allow for overloading when mapping into languages like C++ and Java that support it, since there are many, many more CORBA users writing in C++ and Java than there are in C or other languages that do not support overloading. However, I think there are some pretty tough issues to resolve before it would even start appearing to be possible.

One issue is that we would have to figure out exactly what type equality means in IDL. Believe it or not, this is not at all clear, even after all these years and despite the presence of the TypeCode::equal operation. For example, in the C++ mapping, the following two sequences yield different types that would allow for overloading of C++ functions:

    typedef sequence A;
    typedef sequence B;

However, in C, they map to the same underlying sequence type of which A and B are just aliases. So, are these different IDL types or not? TypeCode::equal says they are, but that's not the only legal or useful interpretation.
What if C++ said the two types were just aliases of the same underlying type, as in the C mapping? If they're different in IDL, overloading two IDL operations where one takes an A and one takes a B would be fine, and yet such overloading would fail to map into overloading in this hypothetical C++ mapping. Similarly, string and string<3> might be different in IDL, but they're not in C or C++.

Jon Biggar has recently been on a crusade to fix the type equality problem, but I don't remember the details so I don't know if his proposals would actually tighten things up enough to allow overloading in IDL.

IDL overloading might also have to take into account the lowest common denominator programming language overloading rules. For example, C++ does not allow overloading on return type only, so IDL might have to avoid that too.

As someone else has already pointed out, there is also the issue of overloading operations in base interfaces with operations in derived interfaces. One question when faced with such a situation is whether the derived interface writer actually intended to overload the operation in the base interface four bases down, or was he just unaware that the same name was already used? Should the IDL compiler warn about such a situation, or should special overloading syntax always be required so as to make overloading explicit?

Someone also (sorry for being too lazy to look it up who it was) brought up the issue of how to name overloaded operations on the wire so as to make them unique. I believe Greg has suggested some form of mangling performed by the IDL compiler, but that would be less than ideal at the very least for the poor DII programmer who has to use the mangled names to create requests, and for the Interface Repository user who has to look up operations by their mangled names.
The mangling would have to be done such that adding operations did not renumber or rename existing operations, as has already been pointed out, and the mangled names would need to be mappable directly into the various languages for which mappings already exist, so the mangling would have to avoid putting leading numbers, underscores, etc. onto the mangled names.

A better approach might be to make the IDL writer responsible for mangling, maybe like this:

    interface A {
        void op(in string s) mangle(op_s);
        void op(in char c) mangle(op_c);
    };

A language mapping like C would ignore the name "op" and always use the "op_s" and "op_c" names, where C++ could just use "op" for both. This way, the mangled names would be documented in the interface, could be made less ugly than an automated mangling approach, could be checked by the IDL compiler to avoid collisions in base interfaces, and would serve as the "on the wire" operation names. Of course, at this point you'd have to ask if it's really worth it -- if the IDL developer has to think up the mangled name anyway, one could argue that they might as well not overload them in the first place and just go with the mangled name.

Finally, there is much inertia in the OMG when it comes to big changes in IDL, and rightfully so. IDL is the basis for all CORBA interfaces, both those in the CORBA specs and those developed by application developers, and changes in IDL tend to ripple into all corners of the CORBA spec and your applications. Changes of this magnitude, even if they are fairly straightforward, are greeted with skepticism simply because folks are afraid of unforeseen consequences, and because of the amount of work they take to address all the places in all the specs that are affected. The recent major screw-up of the IDL grammar by the Objects By Value submission should be evidence enough that tampering with IDL should be left to the experts.

The bottom line in my opinion is that while overloading would have been nice if designed into IDL from the start, it's just too late and too far-reaching to do it now.

--steve

--
Steve Vinoski               vinoski at iona.com
Senior Architect            1-800-ORBIX-4U
IONA Technologies, Inc.     Cambridge, MA USA 02138
60 Aberdeen Ave.            http://www.iona.com/hyplan/vinoski/
Copyright 1998 Stephen B. Vinoski. All Rights Reserved.

Return-Path:
From: "Daniel R. Frantz"
To: "'Michi Henning'" , ,
Subject: RE: Java to IDL identifier mapping
Date: Tue, 7 Jul 1998 10:03:09 -0400
X-MSMail-Priority: Normal
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.2106.4

>-----Original Message-----
>From: Michi Henning [mailto:michi@dstc.edu.au]
>Sent: Tuesday, July 07, 1998 8:33 AM
>To: java2idl-rtf@omg.org; issues@omg.org
>Subject: Java to IDL identifier mapping
>
>
>orbos/98-02-01 says:
>
> We have provided name manglings to work around [the
>limitations], ...
>
>I would suggest to remove this recommendation because it is ...

That recommendation was in the Java-to-IDL document's "rationale" section. It isn't normative. Like recommendations in other submissions' rationale sections, it has no force. Either the Task Force picks it up and issues an RFP or it doesn't. If somebody else makes a normative proposal in another submission, it gets discussed in terms of that submission.

I guess what I'm saying, Michi, is that it's not something to worry about.

FWIW, I agree with both your comments (name mangling, overloading). They recommend and we say forget it. Easy enough.

Dan

Return-Path:
To: Michi Henning
cc: java2idl-rtf@omg.org, issues@omg.org, crawley@dstc.edu.au
Subject: Re: Java to IDL identifier mapping
Date: Wed, 08 Jul 1998 12:09:25 +1000
From: Stephen Crawley

[WARNING: Humour Alert!]

> orbos/98-02-01 says:
> It is possible that the name mangling rules we define in Section 5.2
> may result in name collisions. [...]
> We believe that in practice
> the name mangling rules we have chosen will create names that would
> not have been used by normal Java programmers, and that the risk
> of of namepsace collisions by the mangled names is acceptably low.
>
> I take it that if such a collision occurs due to the presence of an
> abnormal Java programmer, the Java is not translatable to legal IDL?
> If so, this should be explicitly stated.

Perhaps ... the spec needs to say how a java2idl compiler can diagnose abnormal Java programmers, and what it should do if it does.

Should it dispatch them?
Should it name-mangle them?
Should it narrow them into a more derived class?
Should it garbage collect them?
Should it update their behaviour via their meta-object interfaces?
Should it force them to use the C++ binding?

:-)

-- Steve