Checkpointing Support for Distributed Containerized CPS Co-Simulations

dc.contributor.advisor: Gokhale, Aniruddha
dc.contributor.advisor: Barve, Yogesh Damodar
dc.creator: Li, Ziqi
dc.date.accessioned: 2021-09-22T14:49:42Z
dc.date.available: 2021-09-22T14:49:42Z
dc.date.created: 2021-08
dc.date.issued: 2021-07-13
dc.date.submitted: August 2021
dc.identifier.uri: http://hdl.handle.net/1803/16859
dc.description.abstract: Co-simulations allow heterogeneous simulators to be composed as a distributed system, making it possible to simulate a large-scale system that comprises heterogeneous subsystems. Cyber-Physical Systems (CPS) is one domain that can benefit from such co-simulations, where they can serve as Digital Twins at run-time to control the CPS. However, any distributed system is prone to failures, and any such failure may force a co-simulation to restart from the beginning unless appropriate distributed checkpointing and rollback capabilities are provided. With the increasing trend toward designing co-simulations as containerized distributed systems, realizing a coordinated, distributed checkpointing and rollback capability remains an unresolved problem for several reasons: the lack of checkpointing support in the underlying container technology, the need to coordinate checkpoints effectively across the distributed containers, and the need for resource management that minimizes the impact of checkpointing on simulation response time. To address these issues, this research has designed and implemented a novel and systematic approach to checkpointing and restoring distributed containerized co-simulations, based on Docker’s experimental checkpointing feature. First, it addresses fundamental technical issues for CRIU, such as checkpointing JGroups multicast and the automatic removal of members triggered by heartbeat detection. Second, it provides an application-level pause mechanism that pauses all the containers before they are checkpointed. Third, it extends EXPPO, an existing co-simulation-as-a-service middleware, with the newly designed checkpointing and rollback capabilities so that a co-simulation can be recovered from a failure. The research outcomes are demonstrated and validated using a co-simulation constructed from applications in the PARSEC benchmarking suite on the Chameleon cloud computing research platform.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: distributed containerized co-simulation, checkpoint
dc.title: Checkpointing Support for Distributed Containerized CPS Co-Simulations
dc.type: Thesis
dc.date.updated: 2021-09-22T14:49:42Z
dc.type.material: text
thesis.degree.name: MS
thesis.degree.level: Masters
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Vanderbilt University Graduate School
dc.creator.orcid: 0000-0001-5212-5903