Publications
2015
Muşlu, Kivanç; Brun, Yuriy; Meliou, Alexandra Preventing Data Errors with Continuous Testing Inproceedings In: Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), pp. 373–384, Baltimore, MD, USA, 2015. Today, software systems that rely on data are ubiquitous, and ensuring the data's quality is an increasingly important challenge as data errors result in annual multi-billion dollar losses. While software debugging and testing have received heavy research attention, less effort has been devoted to data debugging: identifying system errors caused by well-formed but incorrect data. We present continuous data testing (CDT), a low-overhead, delay-free technique that quickly identifies likely data errors. CDT continuously executes domain-specific test queries; when a test fails, CDT unobtrusively warns the user or administrator. We implement CDT in the ConTest prototype for the PostgreSQL database management system. A feasibility user study with 96 humans shows that ConTest was extremely effective at guarding against data errors in a data-entry setting: with ConTest, users corrected 98.4% of their errors, as opposed to 40.2% without, even when we injected 40% false positives into ConTest's output. Further, when using ConTest, users corrected data entry errors 3.2 times faster than when using state-of-the-art methods.
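The core CDT loop is easy to picture: a background process repeatedly runs domain-specific test queries and raises an unobtrusive warning when one fails. Below is a minimal sketch of that loop, not the ConTest prototype itself (which targets PostgreSQL); the employees schema, the query, and the salary invariant are all hypothetical, and SQLite stands in for the database.

```python
# Minimal sketch of continuous data testing: a background thread repeatedly
# runs a domain-specific test query and warns, unobtrusively, when it fails.
# Assumes company.db defines employees(id, manager_id, salary) -- hypothetical.
import sqlite3
import threading
import time

def cdt_loop(db_path, test_query, description, interval=1.0, stop=None):
    """Continuously execute one test query; report rows that violate it."""
    while stop is None or not stop.is_set():
        conn = sqlite3.connect(db_path)
        try:
            violations = conn.execute(test_query).fetchall()
        finally:
            conn.close()
        if violations:
            print(f"CDT warning: {description} ({len(violations)} violating rows)")
        time.sleep(interval)

# Hypothetical domain invariant: no employee may out-earn their manager.
TEST_QUERY = """
SELECT e.id FROM employees e JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary
"""

stop = threading.Event()
threading.Thread(
    target=cdt_loop,
    args=("company.db", TEST_QUERY, "employee out-earns manager"),
    kwargs={"stop": stop},
    daemon=True,
).start()
```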
Shin, Seung Yeob; Brun, Yuriy; Osterweil, Leon J.; Balasubramanian, Hari; Henneman, Philip L. Resource Specification for Prototyping Human-Intensive Systems Inproceedings In: Proceedings of the 18th International Conference on Fundamental Approaches to Software Engineering (FASE), pp. 332–346, London, England, 2015. Today's software systems rely heavily on complex resources, such as humans. Human-intensive systems are particularly important in our society, especially in the healthcare, financial, and software development domains. One challenge in developing such systems is that the system design must account for the constraints, capabilities, and allocation policies of their complex resources, particularly the humans. The resources, their capabilities, and their allocation policies and constraints need to be carefully specified and modeled. Toward the goal of supporting the design of systems that respect, and make effective use of, the capabilities of such resources, we introduce a resource specification language and a process-aware, discrete-event simulation engine that simulates system executions while adhering to these resource specifications. The simulation supports (1) modeling the resources that are used by the system, and the ways in which they are used, (2) experimenting with different resource capability mixes and allocation policies, and (3) identifying such undesirable situations as bottlenecks and inefficiencies that result from these mixes and policies. The joint use of detailed resource specifications and simulation supports rapid evaluation of different approaches to human-intensive system designs. We evaluate our specification language and simulation framework in the healthcare domain, modeling and evaluating, together with a domain expert, different approaches to developing a software system that manages a hospital emergency department's patient care.
Beschastnikh, Ivan; Brun, Yuriy; Abrahamson, Jenny; Ernst, Michael D.; Krishnamurthy, Arvind Using declarative specification to improve the understanding, extensibility, and comparison of model-inference algorithms Journal Article In: IEEE Transactions on Software Engineering (TSE), vol. 41, no. 4, pp. 408–428, 2015, ISSN: 0098-5589. It is a staple development practice to log system behavior. Numerous powerful model-inference algorithms have been proposed to aid developers in log analysis and system understanding. Unfortunately, existing algorithms are typically specified procedurally, making them difficult to understand, extend, and compare. This paper presents InvariMint, an approach to specify model-inference algorithms declaratively. We applied the InvariMint declarative approach to two model-inference algorithms. The evaluation results illustrate that InvariMint (1) leads to new fundamental insights and better understanding of existing algorithms, (2) simplifies creation of new algorithms, including hybrids that combine or extend existing algorithms, and (3) makes it easy to compare and contrast previously published algorithms. InvariMint's declarative approach can outperform procedural implementations. For example, on a log of 50,000 events, InvariMint's declarative implementation of the kTails algorithm completes in 12 seconds, while a procedural implementation completes in 18 minutes. We also found that InvariMint's declarative version of the Synoptic algorithm can be over 170 times faster than the procedural implementation.
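InvariMint's declarative style can be pictured as composing small property automata into one model. Below is a minimal sketch of that composition using the standard DFA product construction; the two property DFAs over open/close events are hypothetical stand-ins for mined invariants, not InvariMint's actual property types.

```python
# Sketch: a model specified declaratively as the intersection of property DFAs.
# DFAs are dicts with delta keyed by (state, symbol); both properties below
# are hypothetical examples.
def intersect(m1, m2):
    """Standard DFA product construction over a shared alphabet."""
    alphabet = m1["alphabet"] & m2["alphabet"]
    start = (m1["start"], m2["start"])
    delta, seen, todo = {}, {start}, [start]
    while todo:
        q1, q2 = todo.pop()
        for a in alphabet:
            nxt = (m1["delta"][(q1, a)], m2["delta"][(q2, a)])
            delta[((q1, q2), a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    accept = {q for q in seen if q[0] in m1["accept"] and q[1] in m2["accept"]}
    return {"alphabet": alphabet, "start": start, "delta": delta, "accept": accept}

def accepts(m, trace):
    state = m["start"]
    for event in trace:
        state = m["delta"][(state, event)]
    return state in m["accept"]

# P1: no "close" may occur before the first "open".
P1 = {"alphabet": {"open", "close"}, "start": "s0",
      "delta": {("s0", "open"): "ok", ("s0", "close"): "bad",
                ("ok", "open"): "ok", ("ok", "close"): "ok",
                ("bad", "open"): "bad", ("bad", "close"): "bad"},
      "accept": {"s0", "ok"}}
# P2: the trace must end closed (nesting ignored; it is only a sketch).
P2 = {"alphabet": {"open", "close"}, "start": "c",
      "delta": {("c", "open"): "o", ("c", "close"): "c",
                ("o", "open"): "o", ("o", "close"): "c"},
      "accept": {"c"}}

model = intersect(P1, P2)
print(accepts(model, ["open", "close"]))  # True
print(accepts(model, ["close", "open"]))  # False
```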
2014
Barr, Earl T.; Brun, Yuriy; Devanbu, Premkumar; Harman, Mark; Sarro, Federica The Plastic Surgery Hypothesis Inproceedings In: Proceedings of the 22nd ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), pp. 306–317, Hong Kong, China, 2014. Recent work on genetic-programming-based approaches to automatic program patching has relied on the insight that the content of new code can often be assembled out of fragments of code that already exist in the code base. This insight has been dubbed the plastic surgery hypothesis; successful, well-known automatic repair tools such as GenProg rest on this hypothesis, but it has never been validated. We formalize and validate the plastic surgery hypothesis and empirically measure the extent to which raw material for changes actually already exists in projects. In this paper, we mount a large-scale study of several large Java projects, and examine a history of 15,723 commits to determine the extent to which these commits are graftable, i.e., can be reconstituted from existing code, and find an encouraging degree of graftability, surprisingly independent of commit size and type of commit. For example, we find that changes are 43% graftable from the exact version of the software being changed. With a view to investigating the difficulty of finding these grafts, we study the abundance of such grafts in three possible sources: the immediately previous version, prior history, and other projects. We also examine the contiguity or chunking of these grafts, and the degree to which grafts can be found in the same file. Our results are quite promising and suggest an optimistic future for automatic program patching methods that search for raw material in already extant code in the project being patched.
Krka, Ivo; Brun, Yuriy; Medvidovic, Nenad Automatic Mining of Specifications from Invocation Traces and Method Invariants Inproceedings In: Proceedings of the 22nd ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), pp. 178–189, Hong Kong, China, 2014. Software library documentation often describes individual methods' APIs, but not the intended protocols and method interactions. This can lead to library misuse, and restrict runtime detection of protocol violations and automated verification of software that uses the library. Specification mining, if accurate, can help mitigate these issues, which has led to significant research into new model-inference techniques that produce FSM-based models from program invariants and execution traces. However, there is currently a lack of empirical studies that, in a principled way, measure the impact of the inference strategies on model quality. To this end, we identify four such strategies and systematically study the quality of the models they produce for nine off-the-shelf, real-world libraries. We find that (1) using invariants to infer an initial model significantly improves model quality, increasing precision by 4% and recall by 41%, on average; (2) effective invariant filtering is crucial for quality and scalability of strategies that use invariants; and (3) using traces in combination with invariants greatly improves robustness to input noise. We present our empirical evaluation, implement new and extend existing model-inference techniques, and make public our implementations, subject libraries, ground-truth models, and experimental data. Our work can lead to higher-quality model inference, and directly improve the many techniques and tools that rely on model, specification, and API inference.
Ohmann, Tony; Herzberg, Michael; Fiss, Sebastian; Halbert, Armand; Palyart, Marc; Beschastnikh, Ivan; Brun, Yuriy Behavioral Resource-Aware Model Inference Inproceedings In: Proceedings of the 29th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 19–30, Västerås, Sweden, 2014. Software bugs often arise because of differences between what developers think their system does and what the system actually does. These differences frustrate debugging and comprehension efforts. We describe Perfume, an automated approach for inferring behavioral, resource-aware models of software systems from logs of their executions. These finite state machine models ease understanding of system behavior and resource use. Perfume improves on the state of the art in model inference by differentiating behaviorally similar executions that differ in resource consumption. For example, Perfume separates otherwise identical requests that hit a cache from those that miss it, which can aid understanding how the cache affects system behavior and removing cache-related bugs. A small user study demonstrates that using Perfume is more effective than using logs and another model inference tool for system comprehension. A case study on the TCP protocol demonstrates that Perfume models can help understand non-trivial protocol behavior. Perfume models capture key system properties and improve system comprehension, while being reasonably robust to noise likely to occur in real-world executions.
Abrahamson, Jenny; Beschastnikh, Ivan; Brun, Yuriy; Ernst, Michael D. Shedding Light on Distributed System Executions Inproceedings In: Proceedings of the Poster Track at the International Conference on Software Engineering (ICSE), pp. 598–599, Hyderabad, India, 2014. In a distributed system, the hosts execute concurrently, generating asynchronous logs that are challenging to comprehend. We present two tools: ShiVector to transparently add vector timestamps to distributed system logs, and ShiViz to help developers understand distributed system logs by visualizing them as space-time diagrams. ShiVector is the first tool to offer automated vector timestamp instrumentation without modifying source code. The vector-timestamped logs capture partial ordering information, useful for analysis and comprehension. ShiViz space-time diagrams are simple to understand and interactive --- the user can explore the log through the visualization to understand complex system behavior. We applied ShiVector and ShiViz to two systems and found that they aid developers in understanding and debugging.
Ohmann, Tony; Thai, Kevin; Beschastnikh, Ivan; Brun, Yuriy Mining Precise Performance-Aware Behavioral Models from Existing Instrumentation Inproceedings In: Proceedings of the New Ideas and Emerging Results Track at the International Conference on Software Engineering (ICSE), pp. 484–487, Hyderabad, India, 2014. Software bugs often arise from differences between what developers envision their system does and what that system actually does. When faced with such conceptual inconsistencies, debugging can be very difficult. Inferring and presenting developers with accurate behavioral models of the system implementation can help developers reconcile their view of the system with reality and improve system quality. We present Perfume, a model-inference algorithm that improves on the state of the art by using performance information to differentiate otherwise similar-appearing executions and to remove false positives from the inferred models. Perfume uses a system's runtime execution logs to infer a concise, precise, and predictive finite state machine model that describes both observed executions and executions that have not been observed but that the system can likely generate. Perfume guides the model inference process by mining temporal performance-constrained properties from the logs, ensuring precision of the model's predictions. We describe the model inference process and demonstrate how it improves precision over the state of the art.
Beschastnikh, Ivan; Brun, Yuriy; Ernst, Michael D.; Krishnamurthy, Arvind Inferring Models of Concurrent Systems from Logs of their Behavior with CSight Inproceedings In: Proceedings of the 36th International Conference on Software Engineering (ICSE), pp. 468–479, Hyderabad, India, 2014. Concurrent systems are notoriously difficult to debug and understand. A common way of gaining insight into system behavior is to inspect execution logs and documentation. Unfortunately, manual inspection of logs is an arduous process, and documentation is often incomplete and out of sync with the implementation. To provide developers with more insight into networked systems, we developed CSight. CSight mines logs of a system's executions to infer a concise and accurate model of that system's behavior, in the form of a communicating finite state machine (CFSM). Engineers can use the inferred CFSM model to understand complex behavior, detect anomalies, debug, and increase confidence in the correctness of their implementations. CSight's only requirement is that the logged events have vector timestamps. We provide a tool that automatically adds vector timestamps to system logs. Our tool prototypes are available at http://synoptic.googlecode.com/. This paper presents algorithms for inferring CFSM models from traces of concurrent systems, proves them correct, provides an implementation, and evaluates the implementation in two ways: by running it on logs from three different networked systems and via a user study that focused on bug finding. Our evaluation finds that CSight infers accurate models that can help developers find bugs.
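CSight's one requirement, vector-timestamped events, comes from the classic vector-clock algorithm. The sketch below shows that algorithm in miniature; it is not ShiVector's source-free instrumentation, and the two-node setup is hypothetical.

```python
# Minimal vector-clock sketch illustrating the timestamps CSight consumes;
# this is the classic algorithm, not ShiVector's automated instrumentation.
class VectorClock:
    def __init__(self, node, all_nodes):
        self.node = node
        self.clock = {n: 0 for n in all_nodes}

    def local_event(self):
        # Tick our own component and return a snapshot to log.
        self.clock[self.node] += 1
        return dict(self.clock)

    def send(self):
        # Attach the returned timestamp to the outgoing message.
        return self.local_event()

    def receive(self, msg_clock):
        # Merge component-wise maximum, then tick our own component.
        for n, t in msg_clock.items():
            self.clock[n] = max(self.clock[n], t)
        self.clock[self.node] += 1
        return dict(self.clock)

a = VectorClock("a", ["a", "b"])
b = VectorClock("b", ["a", "b"])
ts = a.send()
print(b.receive(ts))  # {'a': 1, 'b': 1}: the receive is ordered after the send
```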
2013
Brun, Yuriy; Holmes, Reid; Ernst, Michael D.; Notkin, David Early Detection of Collaboration Conflicts and Risks Journal Article In: IEEE Transactions on Software Engineering (TSE), vol. 39, no. 10, pp. 1358–1375, 2013, ISSN: 0098-5589, (Recognized as a Spotlight Paper.). Conflicts among developers' inconsistent copies of a shared project arise in collaborative development and can slow progress and decrease quality. Identifying and resolving such conflicts early can help. Identifying situations that may lead to conflicts can prevent some conflicts altogether. By studying nine open-source systems totaling 3.4 million lines of code, we establish that conflicts are frequent, persistent, and appear not only as overlapping textual edits but also as subsequent build and test failures. Motivated by this finding, we develop a speculative analysis technique that uses previously-unexploited information from version control operations to precisely diagnose important classes of conflicts. Then, we design and implement Crystal, a publicly-available tool that helps developers identify, manage, and prevent conflicts. Crystal uses speculative analysis to make concrete advice unobtrusively available to developers.
Muşlu, Kivanç; Brun, Yuriy; Meliou, Alexandra Data Debugging with Continuous Testing Inproceedings In: Proceedings of the New Ideas Track at the 9th Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 631–634, Saint Petersburg, Russia, 2013. Today, systems rely as heavily on data as on the software that manipulates those data. Errors in these systems are incredibly costly, annually resulting in multi-billion dollar losses, and, on multiple occasions, in death. While software debugging and testing have received heavy research attention, less effort has been devoted to data debugging: discovering system errors caused by well-formed but incorrect data. In this paper, we propose continuous data testing: using otherwise-idle CPU cycles to run test queries, in the background, as a user or database administrator modifies a database. This technique notifies the user or administrator about a data bug as quickly as possible after that bug is introduced, leading to at least three benefits: (1) The bug is discovered quickly and can be fixed before it is likely to cause a problem. (2) The bug is discovered while the relevant change is fresh in the user's or administrator's mind, increasing the chance that the underlying cause of the bug, as opposed to only the discovered side-effect, is fixed. (3) When poor documentation or company policies contribute to bugs, discovering the bug quickly is likely to identify these contributing factors, facilitating updating documentation and policies to prevent similar bugs in the future. We describe the problem space and potential benefits of continuous data testing, our vision for the technique, challenges we encountered, and our prototype implementation for PostgreSQL. The prototype's low overhead shows promise that continuous data testing can address the important problem of data debugging.
Muşlu, Kivanç; Brun, Yuriy; Ernst, Michael D.; Notkin, David Making Offline Analyses Continuous Inproceedings In: Proceedings of the 9th Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 323–333, Saint Petersburg, Russia, 2013. Developers use analysis tools to help write, debug, and understand software systems under development. A developer's change to the system source code may affect analysis results. Typically, to learn those effects, the developer must explicitly initiate the analysis. This may interrupt the developer's workflow and/or delay the point at which the developer learns the implications of the change. The situation is even worse for impure analyses --- ones that modify the code on which they run --- because such analyses block the developer from working on the code. This paper presents Codebase Replication, a novel approach to easily convert an offline analysis --- even an impure one --- into a continuous analysis that informs the developer of the implications of recent changes as quickly as possible after the change is made. Codebase Replication copies the developer's codebase, incrementally keeps this copy codebase in sync with the developer's codebase, makes that copy codebase available for offline analyses to run without disturbing the developer and without the developer's changes disturbing the analyses, and makes analysis results available to be presented to the developer. We have implemented Codebase Replication in Solstice, an open-source, publicly-available Eclipse plug-in. We have used Solstice to convert three offline analyses --- FindBugs, PMD, and unit testing --- into continuous ones. Each conversion required on average 436 NCSL and took, on average, 18 hours. Solstice-based analyses experience no more than 2.5 milliseconds of runtime overhead per developer action.
Brun, Yuriy; Medvidovic, Nenad Entrusting Private Computation and Data to Untrusted Networks Journal Article In: IEEE Transactions on Dependable and Secure Computing (TDSC), vol. 10, no. 4, pp. 225–238, 2013, ISSN: 1545-5971. We present sTile, a technique for distributing trust-needing computation onto insecure networks, while providing probabilistic guarantees that malicious agents that compromise parts of the network cannot learn private data. With sTile, we explore the fundamental cost of achieving privacy through data distribution and bound how much less efficient a privacy-preserving system is than a non-private one. This paper focuses specifically on NP-complete problems and demonstrates how sTile-based systems can solve important real-world problems, such as protein folding, image recognition, and resource allocation. We present the algorithms involved in sTile and formally prove that sTile-based systems preserve privacy. We develop a reference sTile-based implementation and empirically evaluate it on several physical networks of varying sizes, including the globally distributed PlanetLab testbed. Our analysis demonstrates sTile's scalability and ability to handle varying network delay, as well as verifies that problems requiring privacy-preservation can be solved using sTile orders of magnitude faster than using today's state-of-the-art alternatives.
Sukkerd, Roykrong; Beschastnikh, Ivan; Wuttke, Jochen; Zhang, Sai; Brun, Yuriy Understanding Regression Failures through Test-Passing and Test-Failing Code Changes Inproceedings In: Proceedings of the New Ideas and Emerging Results Track at the 35th International Conference on Software Engineering (ICSE), pp. 1177–1180, San Francisco, CA, USA, 2013. Debugging and isolating changes responsible for regression test failures are some of the most challenging aspects of modern software development. Automatic bug localization techniques reduce the manual effort developers spend examining code, for example by focusing attention on the minimal subset of recent changes that results in the test failure, or on changes to components with most dependencies or highest churn. We observe that another subset of changes is worth the developers' attention: the complement of the maximal set of changes that does not produce the failure. While for simple, independent source-code changes, existing techniques localize the failure cause to a small subset of those changes, we find that when changes interact, the failure cause is often in our proposed subset and not in the subset existing techniques identify. In studying 45 regression failures in a large, open-source project, we find that for 87% of those failures, the subset our technique finds is different from existing work, and that for 78% of the failures, our technique identifies relevant changes ignored by existing work. These preliminary results suggest that combining our ideas with existing techniques, as opposed to using either in isolation, can improve the effectiveness of bug localization tools.
Beschastnikh, Ivan; Brun, Yuriy; Abrahamson, Jenny; Ernst, Michael D.; Krishnamurthy, Arvind Unifying FSM-inference algorithms through declarative specification Inproceedings In: Proceedings of the 35th International Conference on Software Engineering (ICSE), pp. 252–261, San Francisco, CA, USA, 2013. Logging system behavior is a staple development practice. Numerous powerful model inference algorithms have been proposed to aid developers in log analysis and system understanding. Unfortunately, existing algorithms are difficult to understand, extend, and compare. This paper presents InvariMint, an approach to specify model inference algorithms declaratively. We apply InvariMint to two model inference algorithms and present evaluation results to illustrate that InvariMint (1) leads to new fundamental insights and better understanding of existing algorithms, (2) simplifies creation of new algorithms, including hybrids that extend existing algorithms, and (3) makes it easy to compare and contrast previously published algorithms. Finally, InvariMint's declarative approach can outperform equivalent procedural algorithms.
Shin, Seung Yeob; Balasubramanian, Hari; Brun, Yuriy; Henneman, Philip L.; Osterweil, Leon J. Resource Scheduling through Resource-Aware Simulation of Emergency Departments Inproceedings In: Proceedings of the 5th International Workshop on Software Engineering in Health Care (SEHC), pp. 64–70, San Francisco, CA, USA, 2013. This paper proposes using resource-aware, discrete-event simulation to measure the effects of resource scheduling in hospital emergency departments. Determining staffing and resource allocation is a complex constraint-optimization problem that has significant impact on hospital costs and patient care quality. We developed detailed models of the emergency department process of caring for patients, the resources available to support that process, and the scheduling constraints on the deployment of those resources. We then ran a battery of discrete-event simulations of this process, varying details of process, resource mixes, and scheduling constraints, to analyze the effects of resource availability (e.g., staffing patterns) on patient length of stay. Our simulation approach proved to be particularly adept at supporting the systematic investigation of two issues of particular interest to domain experts: (1) an excessive focus on minimizing the average length of stay (the objective most typically used for optimizing emergency department staffing) can have undesirable, previously unappreciated effects, and (2) too strong a focus on one particular kind of resource as the preferred vehicle for decreasing patient length of stay can tend to obscure the value of considering other kinds of resources. The unexpected nature of some of our results raises open questions about how to validate the results of complex simulations.
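A toy version of the kind of simulation at the heart of this study fits in a few lines: a discrete-event loop that varies the number of doctors and measures average patient length of stay. The sketch below is hypothetical in every detail (arrival rate, treatment times, single shared queue) and vastly simpler than the paper's process models; it only illustrates how a staffing sweep exposes bottlenecks.

```python
# Toy discrete-event staffing sweep: how does the number of doctors affect
# average length of stay? All distributions below are hypothetical.
import heapq
import random

def simulate(n_doctors, n_patients=1000, seed=0):
    rng = random.Random(seed)
    free_at = [0.0] * n_doctors   # each doctor's next-free time
    heapq.heapify(free_at)
    arrival, total_stay = 0.0, 0.0
    for _ in range(n_patients):
        arrival += rng.expovariate(1 / 10)        # a patient every ~10 minutes
        doctor_free = heapq.heappop(free_at)      # earliest-available doctor
        start = max(arrival, doctor_free)
        finish = start + rng.expovariate(1 / 25)  # ~25-minute treatment
        heapq.heappush(free_at, finish)
        total_stay += finish - arrival
    return total_stay / n_patients

# Two doctors cannot keep up with this load, so stays balloon; three can.
for docs in (2, 3, 4):
    print(docs, "doctors -> average stay", round(simulate(docs), 1), "minutes")
```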
Zhao, Xiang; Brun, Yuriy; Osterweil, Leon J. Supporting Process Undo and Redo in Software Engineering Decision Making Inproceedings In: Proceedings of the 8th International Conference on Software and System Process (ICSSP), pp. 56–60, San Francisco, CA, USA, 2013. This paper presents a provenance-based approach for supporting undo and redo for software engineers. Writing software entails creating and reworking intricately intertwined software artifacts. After discovering a mistake in an earlier-completed task, a developer may wish to redo this task, but without undoing much of the work done since. Unfortunately, state-of-the-practice undo and redo mechanisms force the developer to manually redo the work completed since the mistake. This can cause considerable extra, often error-prone work. We propose tracking the software engineering process provenance data, and using it to enable (1) undoing tasks by reverting the state of the process execution, (2) revisiting an old task while storing the provenance of undone tasks, and (3) automatically redoing those undone tasks that are consistent with the revision. Our case study of a developer performing a well-understood but complex refactoring demonstrates how our approach can greatly reduce the cost of mistakes made early but discovered late.
Zhao, Xiang; Boose, Emery R.; Brun, Yuriy; Lerner, Barbara Staudt; Osterweil, Leon J. Supporting Undo and Redo in Scientific Data Analysis Inproceedings In: Proceedings of the 5th USENIX Workshop on the Theory and Practice of Provenance (TaPP), Lombard, IL, USA, 2013. This paper presents a provenance-based technique to support undoing and redoing data analysis tasks. Our technique targets scientists who experiment with combinations of approaches to processing raw data into presentable datasets. Raw data may be noisy and in need of cleaning, it may suffer from sensor drift that requires retrospective calibration and data correction, or it may need gap-filling due to sensor malfunction or environmental conditions. Different raw datasets may have different issues requiring different kinds of adjustments, and each issue may potentially be handled by different approaches. Thus, scientists must often experiment with different sequences of approaches. In our work, we show how provenance information can be used to facilitate this kind of experimentation with scientific datasets. We describe an approach that supports the ability to (1) undo a set of tasks while setting aside the artifacts and consequences of performing those tasks, (2) replace, remove, or add a data-processing technique, and (3) automatically redo those set-aside tasks that are consistent with the changed technique. We have implemented our technique and demonstrate its utility with a case study of a common, sensor-network, data-processing scenario showing how our approach can reduce the cost of changing intermediate data-processing techniques in a complex, data-intensive process.
Brun, Yuriy; Desmarais, Ron; Geihs, Kurt; Litoiu, Marin; Lopes, Antonia; Shaw, Mary; Smit, Mike A design space for adaptive systems Incollection In: Lemos, Rogério; Giese, Holger; Müller, Hausi A.; Shaw, Mary (Ed.): Software Engineering for Self-Adaptive Systems II, vol. 7475, pp. 33–50, Springer-Verlag, 2013, ISBN: 978-3-642-35813-5. Self-adaptive systems research is expanding as systems professionals recognize the importance of automation for managing the growing complexity, scale, and scope of software systems. The current approach to designing such systems is ad hoc, varied, and fractured, often resulting in systems with parts of multiple, sometimes poorly compatible designs. In addition to the challenges inherent to all software, this makes evaluating, understanding, comparing, maintaining, and even using such systems more difficult. This paper discusses the importance of systematic design and identifies the dimensions of the self-adaptive system design space. It identifies key design decisions, questions, and possible answers relevant to the design space, and organizes these into five clusters: observation, representation, control, identification, and enacting adaptation. This characterization can serve as a standard lexicon that, in turn, can aid in describing and evaluating the behavior of existing and new self-adaptive systems. The paper also outlines the future challenges for improving the design of self-adaptive systems.
Lemos, Rogério; Giese, Holger; Müller, Hausi A.; Shaw, Mary; Andersson, Jesper; Baresi, Luciano; Becker, Basil; Bencomo, Nelly; Brun, Yuriy; Cukic, Bojan; Desmarais, Ron; Dustdar, Schahram; Engels, Gregor; Geihs, Kurt; Goeschka, Karl M.; Gorla, Alessandra; Grassi, Vincenzo; Inverardi, Paola; Karsai, Gabor; Kramer, Jeff; Litoiu, Marin; Lopes, Antonia; Magee, Jeff; Malek, Sam; Mankovskii, Serge; Mirandola, Raffaela; Mylopoulos, John; Nierstrasz, Oscar; Pezzè, Mauro; Prehofer, Christian; Schäfer, Wilhelm; Schlichting, Rick; Schmerl, Bradley; Smith, Dennis B.; Sousa, João P.; Tamura, Gabriel; Tahvildari, Ladan; Villegas, Norha M.; Vogel, Thomas; Weyns, Danny; Wong, Kenny; Wuttke, Jochen Software engineering for self-adaptive systems: A second research roadmap Incollection In: Lemos, Rogério; Giese, Holger; Müller, Hausi A.; Shaw, Mary (Ed.): Software Engineering for Self-Adaptive Systems II, vol. 7475, pp. 1–32, Springer-Verlag, 2013, ISBN: 978-3-642-35813-5. The goal of this roadmap paper is to summarize the state-of-the-art and identify research challenges when developing, deploying and managing self-adaptive software systems. Instead of dealing with a wide range of topics associated with the field, we focus on four essential topics of self-adaptation: design space for self-adaptive solutions, software engineering processes for self-adaptive systems, from centralized to decentralized control, and practical run-time verification & validation for self-adaptive systems. For each topic, we present an overview, suggest future directions, and focus on selected challenges. This paper complements and extends a previous roadmap on software engineering for self-adaptive systems published in 2009 covering a different set of topics, and reflecting in part on the previous paper. This roadmap is one of the many results of the Dagstuhl Seminar 10431 on Software Engineering for Self-Adaptive Systems, which took place in October 2010.
2012
Muşlu, Kivanç; Brun, Yuriy; Holmes, Reid; Ernst, Michael D.; Notkin, David Speculative Analysis of Integrated Development Environment Recommendations Inproceedings In: Proceedings of the 27th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 669–682, Tucson, AZ, USA, 2012. Modern integrated development environments recommend and automate common tasks, such as refactorings, auto-completions, and error corrections. However, these tools present little or no information about the consequences of the recommended changes. For example, a rename refactoring may: (1) modify the source code without changing program semantics, (2) modify the source code and (incorrectly) change program semantics, (3) modify the source code and (incorrectly) create compilation errors, (4) show a name collision warning and require developer input, or (5) show an error and not change the source code. Having to compute the consequences of a recommendation --- either mentally or by making source code changes --- puts an extra burden on the developers. This paper aims to reduce this burden with a technique that informs developers of the consequences of code transformations. Taking Eclipse Quick Fix as an example, we build a plug-in, Quick Fix Scout, that computes the consequences of Quick Fix recommendations. Our experiments with 20 users demonstrate that developers complete compilation-error removal tasks 10% faster when using Quick Fix Scout than when using Eclipse's Quick Fix.
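The underlying idea generalizes beyond Eclipse: speculatively apply each recommended fix to a copy of the code and report the outcome before the developer commits to one. Below is a minimal sketch of that loop using Python's built-in compiler as the consequence checker; the broken snippet and candidate fixes are hypothetical, and this is not Quick Fix Scout's Eclipse integration.

```python
# Sketch of speculative analysis for fix recommendations: apply every
# candidate fix to a copy of the source and report which copies compile,
# instead of making the developer guess. Snippet and fixes are hypothetical.
BROKEN = "def area(r):\n    return 3.14 * r **\n"

CANDIDATE_FIXES = {
    "complete the exponent": lambda src: src.replace("r **", "r ** 2"),
    "swap the operands": lambda src: src.replace("r **", "** r"),
}

def speculate(source, fixes):
    outcomes = {}
    for name, fix in fixes.items():
        patched = fix(source)
        try:
            compile(patched, "<candidate>", "exec")  # the consequence check
            outcomes[name] = "compiles"
        except SyntaxError as err:
            outcomes[name] = f"still broken: {err.msg}"
    return outcomes

print(speculate(BROKEN, CANDIDATE_FIXES))
```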
Edwards, George; Brun, Yuriy; Medvidovic, Nenad Automated analysis and code generation for domain-specific models Inproceedings In: Proceedings of the Joint 10th Working IEEE/IFIP Conference on Software Architecture and 6th European Conference on Software Architecture (WICSA/ECSA), pp. 161–170, Helsinki, Finland, 2012. Domain-specific languages (DSLs) concisely express the essential features of system designs. However, using a DSL for automated analysis and code generation requires developing specialized tools. We describe how to create model analysis and code generation tools that can be applied to a large family of DSLs, and show how we created the LIGHT platform, a suite of such tools for the family of software architecture-based DSLs. These tools can be easily reused off-the-shelf with new DSLs, freeing engineers from having to custom-develop them. The key innovation underlying our strategy is to enhance DSL metamodels with additional semantics, and then automatically synthesize configurations and plug-ins for flexible analysis and code generation frameworks. Our evaluation shows that, for a DSL of typical size, using our strategy relieves software engineers of developing approximately 17,500 lines of code, which amounts to several person-months of programming work.
Wuttke, Jochen; Beschastnikh, Ivan; Brun, Yuriy Effects of Centralized and Distributed Version Control on Commit Granularity Journal Article In: Tiny Transactions on Computer Science, vol. 1, 2012. Version control systems are critical for coordinating work in large software engineering teams. Recently, distributed version control (DVC) systems have become popular, as they have many advantages over their centralized (CVC) counterparts. DVC allows for more frequent commits, and simplifies branching and merging. These features encourage developers to make smaller, finer-grained commits that do not interleave changes related to different development tasks. Such commits improve accountability and ease certain tasks, such as reverting changes that later cause problems. DVC systems are also better suited for repository mining techniques, making available more useful information about the development process. For example, approaches that infer collaboration patterns can benefit from the more detailed attribution of data in DVC. This can be used by an integration server to send email about failed test cases to just the subset of developers who authored the relevant code. DVC may also lead to smaller and more focused commits, which could benefit mining techniques that identify changes relevant to specific development tasks, such as refactorings. However, to date, there has been no explicit evaluation of the practical differences in mining DVC over CVC, though some work acknowledges that developers might use DVC and CVC differently. We report on such an evaluation with one counterintuitive finding that raises doubts about certain DVC promises and opens research questions into what causes DVC and CVC differences. Further, our finding indicates that repository type should be controlled for in repository mining experiments.
Brun, Yuriy; Medvidovic, Nenad Keeping Data Private while Computing in the Cloud Inproceedings In: Proceedings of the 5th International Conference on Cloud Computing (CLOUD), pp. 285–294, Honolulu, HI, USA, 2012. The cloud offers unprecedented access to computation. However, ensuring the privacy of that computation remains a significant challenge. In this paper, we address the problem of distributing computation onto the cloud in a way that preserves the privacy of the computation's data even from the cloud nodes themselves. The approach, called sTile, separates the computation into small subcomputations and distributes them in a way that makes it prohibitively hard to reconstruct the data. We evaluate sTile theoretically and empirically: First, we formally prove that sTile systems preserve privacy. Second, we deploy a prototype implementation on three different networks, including the globally-distributed PlanetLab testbed, to show that sTile is robust to network delay and efficient enough to significantly outperform existing privacy-preserving approaches.
Wuttke, Jochen; Brun, Yuriy; Gorla, Alessandra; Ramaswamy, Jonathan Traffic Routing for Evaluating Self-Adaptation Inproceedings In: Proceedings of the 7th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS), pp. 27–32, Zurich, Switzerland, 2012. Toward improving the ability to evaluate self-adaptation mechanisms, we present the automated traffic routing problem. This problem involves large numbers of agents, partial knowledge, and uncertainty, making it well-suited to be solved using many, distinct self-adaptation mechanisms. The well-defined nature of the problem allows for comparison and proper evaluation of the underlying mechanisms and the involved algorithms. We (1) define the problem, (2) outline the sources of uncertainty and partial information that can be addressed by self-adaptation, (3) enumerate the dimensions along which self-adaptive systems should be evaluated to provide a benchmark for comparison of self-adaptation and traditional mechanisms, (4) present Adasim, an open-source traffic routing simulator that allows easy implementation and comparison of systems solving the automated traffic routing problem, and (5) demonstrate Adasim by implementing two traffic routing systems.
Muşlu, Kivanç; Brun, Yuriy; Holmes, Reid; Ernst, Michael D.; Notkin, David Improving IDE Recommendations by Considering Global Implications of Existing Recommendations Inproceedings In: Proceedings of the New Ideas and Emerging Results Track at the 34th International Conference on Software Engineering (ICSE), pp. 1349–1352, Zurich, Switzerland, 2012. Modern integrated development environments (IDEs) offer recommendations to aid development, such as auto-completions, refactorings, and fixes for compilation errors. Recommendations for each code location are typically computed independently of the other locations. We propose that an IDE should consider the whole codebase, not just the local context, before offering recommendations for a particular location. We demonstrate the potential benefits of our technique by presenting four concrete scenarios in which the Eclipse IDE fails to provide proper Quick Fixes at relevant locations, even though it offers those fixes at other locations. We describe a technique that can augment an existing IDE's recommendations to account for non-local information. For example, when some compilation errors depend on others, our technique helps the developer decide which errors to resolve first.
Brun, Yuriy; Muşlu, Kivanç; Holmes, Reid; Ernst, Michael D.; Notkin, David Predicting Development Trajectories to Prevent Collaboration Conflicts Inproceedings In: Proceedings of the Future of Collaborative Software Development (FCSD), Seattle, WA, USA, 2012. The benefits of collaborative development are reduced by the cost of resolving conflicts. We posit that reducing the time between when developers introduce and learn about conflicts reduces this cost. We outline the state-of-the-practice of managing and resolving conflicts and describe how it can be improved by available state-of-the-art tools. Then, we describe our vision for future tools that can predict likely conflicts before they are even created, warning developers and allowing them to avoid potentially costly situations.
Brun, Yuriy Efficient 3-SAT algorithms in the tile assembly model Journal Article In: Natural Computing, vol. 11, no. 2, pp. 209–229, 2012. Self-assembly is a powerful process found in nature that guides simple objects assembling, on their own, into complex structures. Self-assembly is of interest to computer scientists because self-assembling systems can compute functions, assemble shapes, and guide distributed robotics systems. The tile assembly model is a formal mathematical model of self-assembly that allows the study of time and space complexities of self-assembling systems that lie at the heart of several molecular computer implementations and distributed computational software systems. These implementations and systems require efficient tile systems with small tilesets and fast execution times. The state of the art, however, requires vastly complex tile systems with large tilesets to implement fast algorithms. In this paper, I present a tile system that decides 3-SAT by creating $O^\star(1.8393^n)$ nondeterministic assemblies in parallel, improving on the previous best known solution that requires $\Theta(2^n)$ such assemblies. This solution directly improves the feasibility of building molecular 3-SAT solvers and efficiency of distributed software. I formally prove the correctness of the system, the number of required parallel assemblies, that the size of the system's tileset is $147 = \Theta(1)$, and that the assembly time is nondeterministic linear in the size of the input.
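For orientation, the constant $1.8393$ is the familiar 3-SAT branching constant (the tribonacci constant); the textbook recurrence below shows where it comes from. This is only a sketch of the counting argument, not the paper's tile-assembly construction or proof.

```latex
% Branching on a three-literal clause (x_1 \lor x_2 \lor x_3) produces three
% satisfying subproblems that fix 1, 2, and 3 variables, respectively:
%   x_1 = 1;    x_1 = 0, x_2 = 1;    x_1 = 0, x_2 = 0, x_3 = 1.
\[
  T(n) \le T(n-1) + T(n-2) + T(n-3)
  \quad\Longrightarrow\quad
  T(n) = O^\star(\beta^n),
\]
\[
  \text{where } \beta \approx 1.8393 \text{ is the largest real root of }
  \beta^3 = \beta^2 + \beta + 1.
\]
```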
2011
Beschastnikh, Ivan; Brun, Yuriy; Ernst, Michael D.; Krishnamurthy, Arvind; Anderson, Thomas E. Mining temporal invariants from partially ordered logs Journal Article In: ACM SIGOPS Operating Systems Review, vol. 45, no. 3, pp. 39–46, 2011, ISSN: 0163-5980. A common assumption made in log analysis research is that the underlying log is totally ordered. For concurrent systems, this assumption constrains the generated log to either exclude concurrency altogether, or to capture a particular interleaving of concurrent events. This paper argues that capturing concurrency as a partial order is useful and often indispensable for answering important questions about concurrent systems. To this end, we motivate a family of event ordering invariants over partially ordered event traces, give three algorithms for mining these invariants from logs, and evaluate their scalability on simulated distributed system logs.
Edwards, George; Brun, Yuriy; Medvidovic, Nenad Isomorphism in model tools and editors Inproceedings In: Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 458–461, Lawrence, KS, USA, 2011. Domain-specific languages (DSLs) are modeling languages that are customized for a specific context or project. DSLs allow for fast and precise modeling because the language features and constructs can be precisely tailored based on the needs of the modeling effort. There exist highly customizable model-editing tools that can be easily configured to support DSLs defined by end-users (e.g., system architects, engineers, and analysts). However, to leverage models created using these tools for automated analysis, simulation, and code generation, end-users must build custom analysis tools and code generators. In contrast to model editors, the implementation and maintenance of these analysis and code generation tools can be tedious and hampers the utility of DSLs. In this paper, we posit that analysis and code generation tools for DSLs are, in fact, isomorphic to model editing tools. The implication of this insight is that model editors, analysis tools, and code generators can be treated as analogs conceptually and architecturally, and highly customizable analysis and code generation tools for DSLs can be built using the same approach that has already proven successful for the construction of DSL model editors.
Beschastnikh, Ivan; Brun, Yuriy; Ernst, Michael D.; Krishnamurthy, Arvind; Anderson, Thomas E. Bandsaw: Log-powered test scenario generation for distributed systems Inproceedings In: The Work In Progress track of the 23rd ACM Symposium on Operating Systems Principles (SOSP), Cascais, Portugal, 2011.
Brun, Yuriy; Holmes, Reid; Ernst, Michael D.; Notkin, David Crystal: Precise and unobtrusive conflict warnings Inproceedings In: Proceedings of the 8th Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering Tool Demonstration Track (ESEC/FSE), pp. 444–447, Szeged, Hungary, 2011. During collaborative development, individual developers can create conflicts in their copies of the code. Such conflicting edits are frequent in practice, and resolving them can be costly. We present Crystal, a tool that proactively examines developers' code and precisely identifies and reports on textual, compilation, and behavioral conflicts. When conflicts are present, Crystal enables developers to resolve them more quickly, and therefore at a lesser cost. When conflicts are absent, Crystal increases the developers' confidence that it is safe to merge their code. Crystal uses an unobtrusive interface to deliver pertinent information about conflicts. It informs developers about actions that would address the conflicts and about people with whom they should communicate.
Beschastnikh, Ivan; Abrahamson, Jenny; Brun, Yuriy; Ernst, Michael D. Synoptic: Studying logged behavior with inferred models Inproceedings In: Proceedings of the 8th Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering Tool Demonstration Track (ESEC/FSE), pp. 448–451, Szeged, Hungary, 2011. Logging is a powerful method for capturing program activity and state during an execution. However, log inspection remains a tedious activity, with developers often piecing together what went on from multiple log lines and across many files. This paper describes Synoptic, a tool that takes logs as input and outputs a finite state machine that models the process generating the logs. The paper overviews the model inference algorithms. Then, it describes the Synoptic tool, which is designed to support a rich log exploration workflow.
Brun, Yuriy; Holmes, Reid; Ernst, Michael D.; Notkin, David Proactive detection of collaboration conflicts Inproceedings In: Proceedings of the 8th Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 168–178, Szeged, Hungary, 2011, (ACM SIGSOFT Distinguished Paper Award.). Collaborative development can be hampered when conflicts arise because developers have inconsistent copies of a shared project. We present an approach to help developers identify and resolve conflicts early, before those conflicts become severe and before relevant changes fade away in the developers' memories. This paper presents three results. First, a study of open-source systems establishes that conflicts are frequent, persistent, and appear not only as overlapping textual edits but also as subsequent build and test failures. The study spans nine open-source systems totaling 3.4 million lines of code; our conflict data is derived from 550,000 development versions of the systems. Second, using previously-unexploited information, we precisely diagnose important classes of conflicts using the novel technique of speculative analysis over version control operations. Third, we describe the design of Crystal, a publicly-available tool that uses speculative analysis to make concrete advice unobtrusively available to developers, helping them identify, manage, and prevent conflicts.
Beschastnikh, Ivan; Brun, Yuriy; Schneider, Sigurd; Sloan, Michael; Ernst, Michael D. Leveraging existing instrumentation to automatically infer invariant-constrained models Inproceedings In: Proceedings of the 8th Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 267–277, Szeged, Hungary, 2011. Computer systems are often difficult to debug and understand. A common way of gaining insight into system behavior is to inspect execution logs and documentation. Unfortunately, manual inspection of logs is an arduous process and documentation is often incomplete and out of sync with the implementation. This paper presents Synoptic, a tool that helps developers by inferring a concise and accurate system model. Unlike most related work, Synoptic does not require developer-written scenarios, specifications, negative execution examples, or other complex user input. Synoptic processes the logs most systems already produce and requires developers only to specify a set of regular expressions for parsing the logs. Synoptic has two unique features. First, the model it produces satisfies three kinds of temporal invariants mined from the logs, improving accuracy over related approaches. Second, Synoptic uses refinement and coarsening to explore the space of models. This improves model efficiency and precision, compared to using just one approach. In this paper, we formally prove that Synoptic always produces a model that satisfies exactly the temporal invariants mined from the log, and we argue that it does so efficiently. We empirically evaluate Synoptic through two user experience studies, one with a developer of a large, real-world system and another with 45 students in a distributed systems course. Developers used Synoptic-generated models to verify known bugs, diagnose new bugs, and increase their confidence in the correctness of their systems. None of the developers in our evaluation had a background in formal methods but were able to easily use Synoptic and detect implementation bugs in as little as a few minutes.
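Synoptic's mined temporal invariants take three forms: "a is always followed by b", "a is never followed by b", and "a always precedes b". The sketch below mines them from totally ordered traces in the obvious quadratic way; the login/logout traces are hypothetical, and Synoptic's actual miner is considerably more efficient.

```python
# Straightforward miner for Synoptic-style temporal invariants over totally
# ordered traces; the example traces are hypothetical.
from itertools import product

def mine_invariants(traces):
    events = sorted({e for t in traces for e in t})
    invariants = []
    for a, b in product(events, repeat=2):
        # Every a is eventually followed by some b, in every trace.
        afby = all(b in t[i + 1:] for t in traces for i, e in enumerate(t) if e == a)
        # No a is ever followed by a b, in any trace.
        nfby = all(b not in t[i + 1:] for t in traces for i, e in enumerate(t) if e == a)
        # Every b is preceded by some a, in every trace.
        ap = all(a in t[:i] for t in traces for i, e in enumerate(t) if e == b)
        if afby:
            invariants.append(f"{a} AlwaysFollowedBy {b}")
        if nfby:
            invariants.append(f"{a} NeverFollowedBy {b}")
        if ap and a != b:
            invariants.append(f"{a} AlwaysPrecedes {b}")
    return invariants

logs = [["login", "search", "logout"], ["login", "logout"]]
print(mine_invariants(logs))  # includes, e.g., "login AlwaysPrecedes logout"
```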
Brun, Yuriy; Edwards, George; Bang, Jae; Medvidovic, Nenad Smart redundancy for distributed computation Inproceedings In: Proceedings of the 31st International Conference on Distributed Computing Systems (ICDCS), pp. 665–676, Minneapolis, MN, USA, 2011. Many distributed software systems allow participation by large numbers of untrusted, potentially faulty components on an open network. As faults are inevitable in this setting, these systems utilize redundancy and replication to achieve fault tolerance. In this paper, we present a novel smart redundancy technique called iterative redundancy, which ensures efficient replication of computation and data given finite processing and storage resources, even when facing Byzantine faults. Iterative redundancy is more efficient and more adaptive than comparable state-of-the-art techniques that operate in environments with unknown system resource reliability. We show how systems that solve computational problems using a network of independent nodes can benefit from iterative redundancy. We present formal analytical and empirical analyses, demonstrate iterative redundancy on a real-world volunteer-computing system, and compare it to existing methods.
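The flavor of iterative redundancy can be sketched as a stopping rule: assign the task to fresh nodes one at a time and stop once the leading answer's vote margin over all others reaches a target. The code below is only that sketch; the margin threshold and the 80%-reliable worker are hypothetical simplifications of the paper's confidence analysis.

```python
# Stopping-rule sketch of iterative redundancy: replicate a task one fresh
# node at a time, and stop as soon as the leading answer's vote margin over
# all other answers reaches a target. Margin and worker are hypothetical.
import random
from collections import Counter

def iterative_redundancy(run_on_fresh_node, margin=3, max_nodes=50):
    votes = Counter()
    for n in range(1, max_nodes + 1):
        votes[run_on_fresh_node()] += 1
        ranked = votes.most_common(2)
        lead = ranked[0][1] - (ranked[1][1] if len(ranked) > 1 else 0)
        if lead >= margin:
            return ranked[0][0], n  # answer, plus how many replicas it took
    return votes.most_common(1)[0][0], max_nodes

rng = random.Random(1)
def flaky_worker():
    # Hypothetical node: correct answer 80% of the time, garbage otherwise.
    return 42 if rng.random() < 0.8 else rng.randint(0, 9)

print(iterative_redundancy(flaky_worker))
```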
Kiddon, Chloé; Brun, Yuriy That's what she said: Double entendre identification Inproceedings In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 89–94, Portland, OR, USA, 2011. Humor identification is a hard natural language understanding problem. We identify a subproblem --- the "that's what she said" problem --- with two distinguishing characteristics: (1) use of nouns that are euphemisms for sexually explicit nouns and (2) structure common in the erotic domain. We address this problem in a classification approach that includes features that model those two characteristics. Experiments on web data demonstrate that our approach improves precision by 12% over baseline techniques that use only word-based features.
Medvidovic, Nenad; Tajalli, Hossein; Garcia, Joshua; Brun, Yuriy; Krka, Ivo; Edwards, George Engineering heterogeneous robotics systems: A software architecture-based approach Journal Article In: IEEE Computer, vol. 44, no. 5, pp. 61–71, 2011, ISSN: 0018-9162. This article presents RoboPrism, a framework that supports software-architecture-based development of robotic systems, is accessible to nonexperts in robotics, deals effectively with heterogeneity in distributed and mobile robotics systems, and facilitates adaptation in complex, dynamic environments.
Brun, Yuriy Improving efficiency of 3-SAT-solving tile systems Journal Article In: Lecture Notes in Computer Science, vol. 6518/2011, pp. 1–12, 2011. The tile assembly model has allowed the study of nature's process of self-assembly and the development of self-assembling systems for solving complex computational problems. Research into this model has led to progress in two distinct classes of computational systems: Internet-sized distributed computation, such as software architectures for computational grids, and molecular computation, such as DNA computing. The design of large complex tile systems that emulate Turing machines has shown that the tile assembly model is Turing universal, while the design of small tile systems that implement simple algorithms has shown that tile assembly can be used to build private, fault-tolerant, and scalable distributed software systems and robust molecular machines. However, in order for these types of systems to compete with traditional computing devices, we must demonstrate that fairly simple tile systems can implement complex and intricate algorithms for important problems. The state of the art, however, requires vastly complex tile systems with large tile sets to implement such algorithms. In this paper, I present a tile system that decides 3-SAT by creating $O^\star\left(1.8393^n\right)$ nondeterministic assemblies in parallel, while the previous best known solution requires $\Theta\left(2^n\right)$ such assemblies. In some sense, this tile system follows the most complex algorithm implemented using tiles to date. I show that the number of required parallel assemblies is $O^\star\left(1.8393^n\right)$, that the size of the system's tileset is $147 = \Theta(1)$, and that the assembly time is nondeterministic linear in the size of the input. This work directly improves the time and space complexities of tile-inspired computational-grid architectures and bridges theory and today's experimental limitations of DNA computing.
2010
Brun, Yuriy; Holmes, Reid; Ernst, Michael D.; Notkin, David Speculative analysis: Exploring future states of software Inproceedings In: Proceedings of the 2010 Foundations of Software Engineering Working Conference on the Future of Software Engineering Research (FoSER), pp. 59–63, Santa Fe, NM, USA, 2010. Most existing software tools and environments help developers analyze the present and past development states of their software systems, but few approaches have investigated the potential consequences of future actions the developers may perform. The commoditization of hardware, multi-core architectures, and cloud computing provide new potential for delivering apparently-instantaneous feedback to developers, informing them of the effects of changes to the software that they may be considering. For example, modern IDEs often provide "quick fix" suggestions for resolving compilation errors. Developers must scan this list and select the option they think will resolve the problem. Instead, we propose that the IDE should speculatively perform each of the suggestions and provide information that helps developers select the best option for the given context. We believe the feedback enabled by speculative operations can lead to increased developer productivity and software quality.
Schneider, Sigurd; Beschastnikh, Ivan; Chernyak, Slava; Ernst, Michael D.; Brun, Yuriy Synoptic: Summarizing system logs with refinement Inproceedings In: Proceedings of the Workshop on Managing Systems via Log Analysis and Machine Learning Techniques (SLAML), Vancouver, Canada, 2010. @inproceedings{Schneider10, Distributed systems are often difficult to debug and understand. A typical way of gaining insight into system behavior is by inspecting its execution logs. Manual inspection of logs is arduous, and few tools can analyze an arbitrary system log out of the box. To support this task we developed Synoptic. Synoptic outputs a concise graph representation of logged events that captures important temporal event invariants mined from the log. We applied Synoptic to synthetic and real distributed system logs, and found that it augmented a distributed system designer's understanding of system behavior with reasonable overhead for an offline analysis tool. Synoptic makes no assumptions about the system, and requires no system modifications. In contrast to prior approaches, Synoptic uses a combination of refinement and coarsening instead of just coarsening to explore the space of representations. Additionally, it infers temporal event invariants to capture distributed system semantics that are often present in system logs. These invariants drive the exploration process and are preserved in the final representation. |
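As a rough illustration of the mining step (a simplified sketch, not Synoptic's implementation), one family of temporal invariants described above, ``a is always followed by b,'' can be checked over a set of event traces as follows:

    def always_followed_by(traces, a, b):
        """True iff, in every trace, each occurrence of event `a`
        is eventually followed by an occurrence of event `b`."""
        for trace in traces:
            for i, event in enumerate(trace):
                if event == a and b not in trace[i + 1:]:
                    return False
        return True

    # Example: two session logs from a hypothetical system
    traces = [["open", "auth", "close"], ["open", "close"]]
    assert always_followed_by(traces, "open", "close")
    assert not always_followed_by(traces, "open", "auth")

Invariants that hold across all traces are the ones that drive refinement and must still hold in the final graph representation.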
Malek, Sam; Edwards, George; Brun, Yuriy; Tajalli, Hossein; Garcia, Joshua; Krka, Ivo; Medvidovic, Nenad; Mikic-Rakic, Marija; Sukhatme, Gaurav An architecture-driven software mobility framework Journal Article In: Journal of Systems and Software, vol. 83, no. 6, pp. 972–989, 2010. @article{Malek10, Software architecture has been shown to provide an appropriate level of granularity for assessing a software system's quality attributes (e.g., performance and dependability). Similarly, previous research has adopted an architecture-centric approach to reasoning about and managing the run-time adaptation of software systems. For mobile and pervasive software systems, which are known to be innately dynamic and unpredictable, the ability to assess a system's quality attributes and manage its dynamic run-time behavior is especially important. In the past, researchers have argued that a software architecture-based approach can be instrumental in facilitating mobile computing. In this paper, we present an integrated architecture-driven framework for modeling, analysis, implementation, deployment, and run-time migration of software systems executing on distributed, mobile, heterogeneous computing platforms. In particular, we describe the framework's support for dealing with the challenges posed by both logical and physical mobility. We also provide an overview of our experience with applying the framework to a family of distributed mobile robotics systems. This experience has verified our envisioned benefits of the approach, and has helped us to identify several avenues of future work. |
Brun, Yuriy Improving impact of self-adaptation and self-management research through evaluation methodology Inproceedings In: Proceedings of Software Engineering for Adaptive and Self-Managing Systems (SEAMS), pp. 1–9, Cape Town, South Africa, 2010. @inproceedings{Brun10seams, Today, self-adaptation and self-management approaches to software engineering are viewed as specialized techniques and reach a somewhat limited community. In this paper, I overview the current state and expectation of self-adaptation and self-management impact in industry and in premier publication venues and identify what we, as a community, may do to improve such impact. In particular, I find that common evaluation methodologies make it relatively simple for self-adaptation and self-management research to be compared to other such research, but not to more-traditional software engineering research. I argue that extending the evaluation to include comparisons to traditional software engineering techniques may improve a reader's ability to judge the contribution of the research and increase its impact. Finally, I propose a set of evaluation guidelines that may ease the promotion of self-adaptation and self-management as mainstream software engineering techniques. |
Krka, Ivo; Brun, Yuriy; Popescu, Daniel; Garcia, Joshua; Medvidovic, Nenad Using dynamic execution traces and program invariants to enhance behavioral model inference Inproceedings In: Proceedings of the New Ideas and Emerging Results Track at the 32nd International Conference on Software Engineering (ICSE), pp. 179–182, Cape Town, South Africa, 2010. @inproceedings{Krka10icse-nier, Software behavioral models have proven useful for design, validation, verification, and maintenance. However, existing approaches for deriving such models sometimes overgeneralize what behavior is legal. We outline a novel approach that utilizes inferred program invariants and method invocation sequences to obtain an object-level model that describes legal execution sequences. The key insight is using program invariants to identify similar states in the sequences. We exemplify how our approach improves upon certain aspects of state-of-the-art FSA-inference techniques. |
2009 |
Cheng, Betty H. C.; de Lemos, Rogério; Giese, Holger; Inverardi, Paola; Magee, Jeff; Andersson, Jesper; Becker, Basil; Bencomo, Nelly; Brun, Yuriy; Cukic, Bojan; Serugendo, Giovanna Di Marzo; Dustdar, Schahram; Finkelstein, Anthony; Gacek, Cristina; Geihs, Kurt; Grassi, Vincenzo; Karsai, Gabor; Kienle, Holger M.; Kramer, Jeff; Litoiu, Marin; Malek, Sam; Mirandola, Raffaela; Müller, Hausi A.; Park, Sooyong; Shaw, Mary; Tichy, Matthias; Tivoli, Massimo; Weyns, Danny; Whittle, Jon Software engineering for self-adaptive systems: A research roadmap Incollection In: Cheng, Betty H. C.; de Lemos, Rogério; Giese, Holger; Inverardi, Paola; Magee, Jeff (Ed.): Software Engineering for Self-Adaptive Systems, vol. 5525, pp. 1–26, Springer-Verlag, 2009, ISBN: 978-3-642-02160-2. @incollection{Cheng09, The goal of this roadmap paper is to summarize the state-of-the-art and to identify critical challenges for the systematic software engineering of self-adaptive systems. The paper is partitioned into four parts, one for each of the identified essential views of self-adaptation: modelling dimensions, requirements, engineering, and assurances. For each view, we present the state-of-the-art and the challenges that our community must address. This roadmap paper is a result of the Dagstuhl Seminar 08031 on ``Software Engineering for Self-Adaptive Systems,'' which took place in January 2008. |
Krka, Ivo; Brun, Yuriy; Edwards, George; Medvidovic, Nenad Synthesizing partial component-level behavior models from system specifications Inproceedings In: Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 305–314, Amsterdam, The Netherlands, 2009. @inproceedings{Krka09fse, Initial system specifications, such as use-case scenarios and properties, only partially specify the future system. We posit that synthesizing partial component-level behavior models from these early specifications can improve software development practices. In this paper, we provide a novel algorithm for deriving a Modal Transition System (MTS) for individual system components from system-level scenario and property specifications. These generated MTSs capture the possible component implementations that (1) necessarily provide the behavior required by the scenarios, (2) restrict behavior forbidden by the properties, and (3) leave the behavior that is neither explicitly required nor forbidden as undefined. We also show how our algorithm helps discover potential design flaws. |
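For readers unfamiliar with Modal Transition Systems: an MTS separates the transitions every implementation must provide from those an implementation may provide. A minimal Python encoding of that structure (illustrative only; the synthesis algorithm itself is not reproduced here, and the check below is a simplification of true modal refinement, which is defined over simulation relations rather than identically named states):

    from dataclasses import dataclass, field

    @dataclass
    class MTS:
        """Modal Transition System: `required` transitions must appear
        in every implementation; `possible` transitions may appear
        ("maybe" behavior). Required transitions are also possible."""
        states: set = field(default_factory=set)
        required: set = field(default_factory=set)  # (src, action, dst)
        possible: set = field(default_factory=set)  # (src, action, dst)

        def refines(self, abstract):
            """Necessary (not sufficient) condition for refinement over a
            shared state space: keep all required behavior, and add no
            transitions the abstract MTS leaves out of `possible`."""
            return (abstract.required <= self.required
                    and self.possible <= abstract.possible)

The third behavior category, neither required nor forbidden, is exactly the gap `possible - required`, which is what lets a single generated MTS stand in for the whole family of component implementations consistent with the early specifications.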
Brun, Yuriy; Medvidovic, Nenad Crystal-growth-inspired algorithms for computational grids Inproceedings In: Proceedings of the Workshop on Bio-Inspired Algorithms for Distributed Systems (BADS), pp. 19–26, Barcelona, Spain, 2009, ISBN: 978-1-60558-584-0. @inproceedings{Brun09bads, Biological systems surpass man-made systems in many important ways. Most notably, systems found in nature are typically self-adaptive and self-managing, capable of surviving drastic changes in their environments, such as internal failures and malicious attacks on their components. Large distributed software systems share requirements with some biological systems, particularly in the number and power of individual components and in the qualities of service of the system. However, it is not immediately clear how engineers can extract useful properties from natural systems and inject them into software systems. In this paper, we explore nature's process of crystal growth and develop mechanisms inspired by that process for designing large distributed computational grid systems. The result is the tile architectural style, a set of design principles for building distributed software systems that solve complex computational problems. Systems developed using the tile style scale well to large computations, tolerate faults and malicious attacks, and preserve the privacy of the data. |
Krka, Ivo; Edwards, George; Brun, Yuriy; Medvidovic, Nenad From system specifications to component behavioral models Inproceedings In: Proceedings of the New Ideas and Emerging Results Track at the 31st International Conference on Software Engineering (ICSE), pp. 315–318, Vancouver, Canada, 2009. @inproceedings{Krka09icse-nier, Early system specifications, such as use-case scenarios and properties, are rarely complete. Partial models of system-level behavior, derived from these specifications, have proven useful in early system analysis. We believe that the scope of possible analyses can be enhanced by component-level partial models. In this paper, we outline an algorithm for deriving a component-level Modal Transition System (MTS) from system-level scenario and property specifications. The generated MTSs capture the possible component implementations that (1) necessarily provide the behavior required by the scenarios, (2) restrict behavior forbidden by the properties, and (3) leave the behavior that is neither explicitly required nor forbidden as undefined. We discuss how these generated models can help discover system design flaws, support requirements elicitation, and help select off-the-shelf components. |
Brun, Yuriy; Reishus, Dustin Path finding in the tile assembly model Journal Article In: Theoretical Computer Science, vol. 410, no. 15, pp. 1461–1472, 2009, ISSN: 0304-3975. @article{Brun09path-finding, Swarm robotics, active self-assembly, and amorphous computing are fields that focus on designing systems of large numbers of small, simple components that can cooperate to complete complex tasks. Many of these systems are inspired by biological systems, and all attempt to use the simplest components and environments possible, while still being capable of achieving their goals. The canonical problems for such biologically-inspired systems are shape assembly and path finding. In this paper, we demonstrate path finding in the well-studied tile assembly model, a model of molecular self-assembly that is strictly simpler than other biologically-inspired models. As in related work, our systems function in the presence of obstacles and can be made fault-tolerant. The path-finding systems use $\Theta(1)$ distinct components and find minimal-length paths in time linear in the length of the path. |
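The flavor of the result can be sketched in conventional code: a breadth-first wavefront over a grid with obstacles finds a minimal-length path using one uniform attachment rule, loosely mirroring the $\Theta(1)$ distinct components. This analogy is mine, not a tile-system simulator, and sequential BFS runs in time linear in the cells explored, whereas the tile systems exploit the parallelism of self-assembly to reach time linear in the path length:

    from collections import deque

    def shortest_path(grid, start, goal):
        """BFS over a grid of 0 (free) / 1 (obstacle) cells, analogous
        to a self-assembly wavefront: each step 'attaches' cells to the
        frontier, so the first time the goal is reached, the path traced
        back through `parent` is minimal-length."""
        rows, cols = len(grid), len(grid[0])
        frontier = deque([start])
        parent = {start: None}  # also serves as the visited set
        while frontier:
            cell = frontier.popleft()
            if cell == goal:
                path = []
                while cell is not None:
                    path.append(cell)
                    cell = parent[cell]
                return path[::-1]
            r, c = cell
            for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                nr, nc = nxt
                if (0 <= nr < rows and 0 <= nc < cols
                        and grid[nr][nc] == 0 and nxt not in parent):
                    parent[nxt] = cell
                    frontier.append(nxt)
        return None  # no path around the obstacles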
Brun, Yuriy; Reishus, Dustin Connecting the dots: Molecular machinery for distributed robotics Journal Article In: Lecture Notes on Computer Science, vol. 5347/2009, pp. 102–111, 2009. @article{Brun09dna-lncs, Nature is considered a promising source of inspiration for designing robotic systems. Some work in swarm robotics has tried to build systems that resemble distributed biological systems and inherit biology's fault tolerance, scalability, dependability, and robustness. Such systems, as well as ones in the areas of active self-assembly and amorphous computing, typically use relatively simple components with limited communication, memory, and computational power to accomplish complex tasks such as forming paths in the presence of obstacles. We demonstrate that such tasks can be accomplished in the well-studied tile assembly model, a model of molecular self-assembly that is strictly simpler than other biologically-inspired models. Our systems use a small number of distinct components to find minimal-length paths in time linear in the length of the path while inheriting the scalability and fault tolerance of the underlying natural process of self-assembly. |