Subject: Summary of KBSE-92

The Seventh Annual Knowledge-Based Software Engineering Conference

W. Lewis Johnson
USC / Information Sciences Institute
4676 Admiralty Way
Marina del Rey, CA 90292-6695
johnson@isi.edu

The 7th Annual Knowledge-Based Software Engineering Conference was held at the McLean Hilton at Tysons Corner, in McLean, Virginia, on Sept. 20-23, 1992. The conference was sponsored by Rome Laboratory and held in cooperation with the IEEE Computer Society, ACM SIGART and SIGSOFT, and the American Association for Artificial Intelligence (AAAI).

The focus of the KBSE conferences is the application of artificial intelligence and knowledge-based techniques to software engineering problems. This includes techniques for constructing, representing, reasoning about, understanding, and adapting software artifacts and processes. The conferences are concerned with all activities related to software, including project planning, domain modeling, requirements acquisition, specification, design, coding, documentation, understanding, reuse, evolution, testing, and maintenance, provided that intelligent tools can perform these activities, support humans in performing them, or cooperate with humans in performing them.

The core of this year's conference was a three-day block of technical presentations, including panels, paper sessions, and demonstration sessions. The main conference was preceded by a day of tutorials. The conference proceedings may be ordered from IEEE Computer Society Press, P.O. Box 3014, Los Alamitos, CA 90720-1264, order number 2880.

Background

In 1983 RADC (now Rome Laboratory) published a report calling for the development of a knowledge-based software assistant (KBSA), which would employ artificial intelligence techniques to support all phases of the software development process.
The original KBSA vision revolved around a new software process model, including knowledge-based software design, prototyping via executable specifications, and the generation of implementations using semantics-preserving rules. Research and development efforts around the world, including those supported by Rome Laboratory's long-term KBSA program, have led to the development of significant pieces of this vision. The annual KBSA Conference has provided a forum for discussion and presentation of work related to the KBSA effort. In 1991, the conference expanded its scope to include other work in knowledge-based software engineering, and changed its name to the Knowledge-Based Software Engineering Conference. The 1991 conference was quite successful in attracting technical papers on knowledge-based software engineering from around the world.

Since 1991, the KBSE conference has taken further steps to establish itself as the major conference in the field. An independent steering committee was established, consisting of distinguished researchers and sponsors of research; a program committee was formed, including many important KBSE researchers; and a conference committee representing business, government, and academia was formed to organize the conference.

Major Themes

The papers and presentations at the conference provided a wide variety, even divergence, of perspectives on knowledge-based software engineering. This variety is to be expected, since software engineering subsumes a number of different activities and problems, and artificial intelligence offers a number of techniques to bring to bear on these problems. There were groups of papers that focused on common issues, such as transformational synthesis or reuse. On the other hand, there were papers that focused on topics virtually ignored by the rest of the conference, such as software configuration management or the training of KBSE professionals. Nevertheless, there were some recurrent themes at this year's conference.
One was the advantage of making KBSE systems domain-oriented, that is, specializing them so that they meet the needs of software practitioners and users in particular application areas. Another theme was how to get KBSE systems adopted and used by software engineers.

Major Presentations

The theme of domain orientation was prominent in the two invited presentations at the conference, by Elaine Kant of the Schlumberger Laboratory for Computer Science and Gerhard Fischer of the University of Colorado.

Dr. Kant's talk was concerned with knowledge-based support for scientific programming. Kant observed that until recently scientific computing was carried out by mathematical modelers writing Fortran programs. However, the programming process has become more time-consuming as faster machines make it feasible to tackle more complex problems. Furthermore, the scientific concepts and principles underlying scientific programs are well understood and relatively easy to codify. Scientific computing is therefore an attractive domain in which to automate code synthesis. Dr. Kant has been working for several years on developing such a system, called SINAPSE, which generates finite difference programs to simulate partial differential equation models, optimized for various target languages and architectures. In SINAPSE the requirements of a given program can be described interactively, in terms of the relevant mathematical concepts rather than programming concepts. The project demonstrates the effectiveness of a domain-oriented approach, insofar as it has been used successfully to generate a wide variety of codes in this particular application domain. Kant appears to have been rather successful at getting potential users interested and involved in the project.

Gerhard Fischer's presentation gave an overview of a new paradigm for automated design support, which he calls Domain-Oriented Design Environments (DODE).
In his talk he pointed out a number of problems with conventional software engineering, which in his view are not adequately addressed in current knowledge-based software engineering systems. First, the hardest part of designing complex systems is deciding what the system should do. Systems that focus on generation of code from specifications ignore the hard task of getting the specifications right in the first place. Second, an understanding of the desired properties of a system arises only through interaction with and evaluation of designs. These designs must be expressed in ``languages of doing,'' i.e., notations that designers can manipulate directly. Specification-based approaches that separate specification from implementation are bound to fail, in Fischer's view, because they distance the designer from the design activity, and because specification languages are not intended as languages of doing. Finally, the design environment should be domain-oriented, employing notations that are familiar to domain specialists from their daily practice, and incorporating knowledge about the domain. Fischer went on to describe a number of prototypes developed in his laboratory that embody the principles he described. However, in response to a question from the audience, Dr. Fischer acknowledged that his prototypes rely on a visual, MacDraw-like metaphor, and it is not clear how applicable this metaphor is to complex software development.

Presented Papers

It is not possible in this space to summarize all of the presentations at the conference, particularly the paper presentations. I will focus here on those that stimulated the most interest, and refer the reader to the proceedings for discussion of the others.

Sanjay Bhansali, of Stanford University, presented a paper authored by him and Penny Nii titled "Software Design by Reusing Architectures."
The work presented provides a formalism for describing generic architectures, and a tool that assists in the customization of such generic architectures to produce specific architectures. It has so far been applied only to a single generic architecture, a blackboard architecture for signal processing.

Don Cohen and Neil Campbell of USC / Information Sciences Institute, in their paper titled "Automatic Composition of Data Structures to Represent Relations," described a method for automatically constructing complex data structures based on descriptions of the desired behaviors of those data structures. This work uses a graphical interface to specify data behaviors, such as data sharing, composition interdependencies, etc. There were two questions from the audience following the presentation. The first, from Cordell Green, asked about the utility of graphical specification languages. The answer was that the system also accepts a textual interface, and that in any case a tool would be needed to translate the interfaces of other programming languages into the one this system accepts. The second question, from Richard Jullig, asked how relational languages, such as the one used in this system, compare with languages like Refine, which use primitives like maps, sets, and tuples. Don Cohen explained that a relational language raises the level of abstraction above such primitives, since there are many ways of describing the same relation in terms of different combinations of maps, sets, and tuples.

Martin Feather of USC/ISI presented a paper titled "Explorations on the Formal Frontier of Distributed System Design." This paper describes a formal method for deriving a specification for a composite system, i.e., a system consisting of multiple interacting agents, in this case the set of agents and communication protocols responsible for controlling railroad train signals. This work follows on studies conducted by him and Steve Fickas's group at the University of Oregon.
This paper was selected as a prize paper.

Manfred Jeusfeld presented a logic-based approach to software configuration management. In this deductive theory of software development, modules are represented as predicates and configuration threads as deductive rules. Using recent results on deductive integrity checking, semantic query optimization, and intensional updates, many of the problems of configuration management can be addressed in a single framework.

Rich Keller presented a paper by him and Michal Rimon, titled "A Knowledge-Based Software Development Environment for Scientific Model-Building." This paper was also selected as a prize paper. This work focuses on constructing a scientific software model from a graphical representation of the underlying equations in a system called SIGMA. Construction requires a domain-specific knowledge base and specifically incorporates constraint propagation. This work illustrates how data flow diagrams can be a good representation for conveying scientific models. There were three questions. The first asked whether symbols *plus* constraints were necessary and sufficient for conveying semantics. Dr. Keller explained that sufficiency depends on the expressibility of the constraint language. In principle, symbols plus constraints are sufficient. In practice, SIGMA has a limited constraint language, which thus far has been sufficient to say what it needs to say about semantics. The second question concerned the sufficiency of data flow diagrams for representing simultaneous sets of equations. Rich Keller explained that he does not claim that the data flow diagram representation is optimal for every type of scientific model, only that it is useful for a reasonable subclass of models. Empirically, thus far, he has found this to be true. In simultaneous sets of equations, the user is relying on a black-box algorithm to solve the equations in the proper sequence. Thus, the control is implicit.
The data flow diagram makes the control explicit, so one can think of a data flow diagram as a more explicit representation of a set of simultaneous equations that have been ordered. The third question was a comment from Elaine Kant that sometimes one needs to derive approximations of equations in order to implement a model, and that this is part of the synthesis process. Rich Keller agreed, and went on to explain that approximation may or may not be considered part of the specification process, depending on where one draws the line between specification and synthesis.

Wojtek Kozaczynski, Jim-Qun Ning, and Tom Sarver, of Andersen Consulting, contributed a paper titled "Program Concept Recognition." This paper describes work at Andersen to build a tool that can recognize programming plans in existing old code. The paper describes the techniques employed in clear detail, and was selected as a prize paper. Discussion of the paper at the conference centered on questions of how many concepts could be recognized, how easily the technique would scale up, and how robust the technique is in the face of poorly written code.

Yves Ledru, of the Université Catholique de Louvain, presented a paper by him and M.-H. Liégeois titled "Prototyping VDM Specifications with KIDS." This paper describes how the KIDS system of the Kestrel Institute may be used to prototype VDM specifications. The authors describe a VDM-style development process that yields a Refine specification as the target. The process proceeds through a sequence of mechanized steps, some of which are user-guided and some of which are fully automatic.

Yingsha Liao, currently at NEC, presented a paper titled "Efficiently Computing Derived Performance Data." This work focuses on combining data collection and data processing by providing a method, called PMMS, to embed event recognition and instrumentation in a program.
The PMMS system builds a temporal event dependency graph to represent the required event recognition and data collection. The intent is to provide an efficient program debugging system. There were two questions from the audience following the presentation. The first asked how the system deals with cyclic temporal dependencies among events. Dr. Liao explained that the system checks each monitoring question for cyclic temporal dependencies using the event dependency graph; an error is issued if the question definition contains a cyclic temporal dependency. The second question asked whether this methodology could be used in parallel and distributed computation models. Dr. Liao explained that the methodology, of letting users specify the monitoring question and letting the machine figure out what data to collect, where to insert instrumentation, and how to process the collected data, can be applied to parallel and distributed models. However, the reasoning process based on temporal dependency (especially reasoning that depends on the execution order of events) would need to be enhanced.

Neil Maiden of City University, London, presented a paper by him and Alistair Sutcliffe titled "Domain Abstractions in Requirements Engineering: An Exemplar Approach?" The paper focuses on the process of reuse at the domain level. By representing domains as a hierarchy of abstractions, and providing concrete examples, visualization, and guided acquisition, their system, AIR (Advisor for Intelligent Reuse), assists a user in describing a domain and finding related domain abstractions to improve the clarity and completeness of requirements specifications. Testing of a prototype has begun.
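The cyclic-dependency check that Liao described for PMMS amounts, at bottom, to ordinary cycle detection over a directed graph of events. The following is only an illustrative sketch of that general idea; the function and event names are invented here and are not PMMS's actual interface.

```python
# Sketch of cycle detection over a temporal event dependency graph.
# A monitoring question is assumed to induce a mapping from each event
# to the events it temporally depends on; a cyclic question is rejected.

def has_cycle(deps):
    """Return True if the directed dependency graph contains a cycle.

    deps maps each event name to a collection of events it depends on.
    Standard depth-first search: encountering an event that is still on
    the active path (a "back edge") means the dependencies are cyclic.
    """
    visiting, done = set(), set()

    def visit(event):
        if event in visiting:        # back edge: cyclic dependency
            return True
        if event in done:            # already fully explored, no cycle
            return False
        visiting.add(event)
        if any(visit(dep) for dep in deps.get(event, ())):
            return True
        visiting.remove(event)
        done.add(event)
        return False

    return any(visit(event) for event in deps)
```

For example, a question requiring event A to be recognized before B and B before A induces the graph {"A": ["B"], "B": ["A"]}, which has_cycle reports as cyclic, and which a PMMS-style checker would reject with an error.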
Walt Scacchi of the University of Southern California, in his paper co-authored with Peiwei Mi and Ming-June Lee titled "A Knowledge-Based Software Process Library for Process-Driven Software Development," described the SPLib system, which aids a user in creating a customized process model that will, for example, comply with national standards and organizational practice and accommodate new technology. Process models and instances are represented in a formal knowledge representation system that aids in the retrieval and composition of instances in the repository to create a new process model.

Peter Selfridge, of AT&T Bell Laboratories, presented a paper titled "Managing Design Knowledge to Provide Assistance to Large-Scale Software Development," co-authored with Loren Terveen and David Long. This paper describes an experiment in which design ``folklore'' was captured in an advice-giving tool. This tool, developed in cooperation with production software engineers, is in current use. It is an excellent example of successful development and adoption of KBSE technology.

Panels

The program included four panels. The first was titled "Software Process and Knowledge-Based Tools," and was chaired by Ron Willis of Hughes Aircraft Company. The panelists included Gail Kaiser, of Columbia University; Win Royce, of TRW; William C. Sasso, of Andersen Consulting; and Walt Scacchi, of the University of Southern California. The panel addressed questions of how knowledge-based software engineering tools might change the process by which software is developed.

Larry Miller of Aerospace Corporation chaired a panel titled "Program Understanding - Does It Offer Hope for Aging Software?" The participants included Prem Devanbu, of AT&T Bell Laboratories; W. Lewis Johnson, of USC/Information Sciences Institute; Jim-Qun Ning, of Andersen Consulting; and Alex Quilici, of the University of Hawaii.
The panelists discussed various techniques for using program understanding technology to support maintenance and the recovery of reusable components from old code. Throughout, there was some confusion as to what the subject of "program understanding" is: is it automatic inference of intent from programs, or is it tools that help people understand programs? The consensus was that both will be required to some degree. For example, Lewis Johnson raised concerns about whether automatic program understanding would ever result in significant assistance for software maintenance. The problem is that automated tools may be able to recognize some plans in code, but other parts of the code will not be automatically understandable, and a partial understanding is not adequate for maintenance.

William Sasso of Andersen Consulting chaired a panel titled "DoD Software Technology Plans: What Do They Mean for Knowledge-Based Software Engineering?" The participants were Barry Boehm, of the University of Southern California; Richard Jullig, of the Kestrel Institute; Mort Hirschberg, of the Army Ballistics Research Laboratory; and Douglas White, of Rome Laboratory. Much of the discussion centered on two major software development plans, the SWTS plan proposed by DARPA and the KBSA plan being carried out by Rome Laboratory. Of the two, the SWTS plan envisions incremental adoption of new technology, while KBSA takes a more radical approach.

Finally, Peter Selfridge of AT&T Bell Laboratories chaired a panel titled "Assessing KBSE Research: Issues in Goals, Metrics, and Transferability," whose participants were Barry Boehm, Gerhard Fischer, Douglas Smith, Lewis Johnson, Louis Hoebel of Rome Laboratory, and Glover Ferguson of Andersen Consulting. The panel touched on issues of how to evaluate KBSE research, and how to get it used. Doug Smith argued that formal mathematical analysis is critical for the evaluation of research results, and there was not much disagreement about this.
A number of the panelists raised questions about what is really involved in getting revolutionary techniques into practice, and what the consequences of such revolutionary steps might be. Generally, evolutionary adoption is much more likely to succeed than revolutionary adoption; in fact, Selfridge questioned whether revolutionary adoption was ever likely to occur. Lewis Johnson, using Columbus's discovery of America as a metaphor, observed that revolutionary steps are likely to lead to unforeseen consequences. Glover Ferguson noted that many of the problems software engineers face are so severe that the only recourse is revolutionary change.

Generally, the panels addressed interesting topics. Unfortunately, the panelists tended to give long presentations focusing on their own work. Either there was no debate among the panelists, or time ran out just as the discussion was getting interesting.

Demonstrations

In this year's KBSE conference program, demonstrations were given a prominent place as a separate track, alongside the paper and panel sessions. Demonstration presenters each gave forty-five-minute presentations before a large audience, and published descriptions of their demonstrations in the proceedings. The purpose was to allow conference attendees to get a detailed view of current KBSE systems, and to foster discussion of those systems. This style of demonstration has been implemented successfully at other conferences, such as the SIGCHI conference.

From the very beginning, the demonstration track ran into problems. First, projection devices for displaying computer video output on a screen were found to be either extremely expensive or unacceptably low in quality. The computer equipment used in the demonstrations had numerous hardware problems, both the hardware supplied by the conference and the hardware brought in by individual demonstrators. Some demonstrators were simply unable to set up their presentations in time.
In spite of these technical problems, the demonstrations contributed substantially to the overall conference program.

William Mark's demonstration of Lockheed's Comet system was rather successful. Comet is a tool to aid in the design of software systems by modifying and reusing existing software artifacts. Comet tracks the ``commitments'' in the design, i.e., the constraints that hold between components in the design. Different components of the design may be required to establish and maintain these constraints, and other components can then assume that the constraints hold. Comet represents a subset of these constraints in a form such that an automated reasoning capability can detect when they are not being met in an evolving design, and indicate to the designer the impact of changing a commitment.

The Kestrel Institute's KIDS system, demonstrated by Douglas Smith, gave a good in-depth view of a transformational approach to software synthesis. In KIDS, algorithms are synthesized by applying generic design tactics such as divide-and-conquer, and techniques such as finite differencing, to executable specifications, in order to transform them into efficient code. During this process a directed inference capability is employed to simplify the program, and to prove properties of the program that can be used to guide the transformation process. A system for organizing programming knowledge into ``theories'' has been developed, making it possible to build up and make use of substantial knowledge bases.

The ARIES system, demonstrated by Lewis Johnson, is an integrated tool that supports the process of acquiring informal requirements and using them to generate formal specifications. ARIES allows analysts to build up specifications through a gradual transformation process. It is based upon a common knowledge base representation that spans both requirements and specifications.
A number of diagrammatic and textual notations are supported, each of which is mapped onto the underlying representation. The transformation and presentation systems are closely coordinated, so that when an analyst thinks of a change to make to a presentation, the set of transformations that can effect that change is presented to the analyst to select from. Among the notations supported is natural language: ARIES incorporates a robust natural language generation capability, making it relatively easy to compare formalized requirements with initial informal descriptions of requirements.

Andersen Consulting's KBSA Concept Demonstration was not ready for presentation until late in the conference, due to hardware problems, and not many people got a chance to see it. This is regrettable, as it was one of the most impressive demonstrations at the conference. The Concept Demonstration showcases a number of the technologies developed in the course of Rome Laboratory's KBSA research and development program, and gives a taste of how future knowledge-based software engineering systems will revolutionize the entire software development process.

Given the importance of demonstrations in the KBSE field, it is hoped that future KBSE conferences will continue to give demonstrations a prominent role in the program, and will overcome the technical difficulties involved. This will involve planning for and identifying hardware problems ahead of time, investing in adequate projection facilities, and possibly scheduling demonstrations more than once so that more attendees get a chance to see them.

General Observations

It would appear from this conference that the field of knowledge-based software engineering is in a transitional phase, and its apparent promise is not fully realized. KBSE is an application-oriented discipline, yet there are few real applications in current use. On the other hand, the situation now is quite different from what it was even a few years ago.
Automatic programming was an active area of research in the 1960s and 1970s, but then for quite a few years there was little apparent progress. This was in part because fully automatic code synthesis is extremely hard, and could only be applied to the construction of relatively small programs. The KBSA report called for a departure from automatic programming, addressing the broader issues of software engineering while employing more interactive techniques. The KBSA report itself projected that a fifteen-year effort would be required before KBSA systems were fully usable. The KBSA efforts have moved progressively closer to industrial-strength systems. Meanwhile, researchers have been searching for approaches to the problem that will yield results in the nearer term. Domain-oriented techniques seem to offer the best promise of making this possible.

Gauging from the reactions of attendees, a great deal of cohesion and common interest exists within the KBSE community, even though workers in the field focus on different software engineering problems. Attendees tended not to pick and choose which sessions to attend, but instead tried to attend as many sessions as possible. Attendees were therefore unusually critical of the organization of the conference program into parallel sessions. The degree of shared interest is good news for KBSE as a field, but it is something that future KBSE conferences will have to take into account when developing their programs.

Judging from my own observations and from the evaluations of the attendees, this year's KBSE Conference was quite successful. Attendees rated the conference quite highly, and the vast majority of first-time attendees indicated that they plan to attend future KBSE conferences. When the KBSE conference format was first introduced in 1991, there was some question as to whether there was sufficient need for an annual conference in this field.
At this point the consensus of the community is that these annual conferences are viable and desirable, and will be continued. Plans are already underway for upcoming conferences. Next year's conference will be held in Chicago, chaired by Bruce Johnson, the head of Andersen Consulting's research facility. The 1994 conference will be held in California, chaired by Douglas Smith of the Kestrel Institute. I encourage those interested in knowledge-based software engineering to keep these future conferences in mind.

Acknowledgements

I would like to thank Gail Kaiser, Bill Sasso, Peter Selfridge, Dorothy Setliff, Alistair Sutcliffe, and the anonymous conference attendees who contributed their thoughts and observations to this summary. Barbara Radzisz's assistance in collecting, collating, and summarizing conference evaluations was also greatly appreciated.