Research Projects

Multi-Core OS and Virtualization for Embedded and Cyber-Physical Systems

This work aims to achieve predictable real-time performance and efficient workload consolidation in embedded and cyber-physical systems using modern multi-core processors. The main goals of this project include:
(1) the predictable use of shared hardware resources in multi-core platforms, e.g., cache, DRAM and I/O devices
(2) the integration of multi-core real-time scheduling and synchronization mechanisms
(3) applications to real-world cyber-physical systems, such as an autonomous vehicle

Systems Infrastructure for an Autonomous Vehicle

This work focuses on the development of systems infrastructure for next-generation automotive systems and autonomous vehicles, including operating systems, virtualization, the multi-core memory hierarchy, code placement, and hardware acceleration. Our contributions have been reviewed by and tech-transferred to General Motors.

Coordinated CPU and GPU Parallelization of CPS Workloads

The high computational demands of complex algorithms used in recent cyber-physical systems, e.g., perception and motion planning in an autonomous vehicle, pose substantial challenges in guaranteeing their timeliness. Fortunately, many of these algorithms have parallelizable execution segments that can be accelerated with a multi-core CPU as well as a GPU. Since neither CPU-only nor GPU-only parallelization consistently outperforms the other, we focus on the development of coordinated CPU and GPU parallelization schemes.
  • Publications: In preparation

Predictable Cache Management in Virtualization

Existing software-based cache/DRAM management schemes developed for non-virtualized systems do not function properly in a virtualized system, due to an additional address layer introduced by the hypervisor. Even if they work, tasks running on a guest OS that does not support such schemes will suffer from cache or memory interference. In this project, we focus on the predictable management of a shared cache in a multi-core virtualization environment. With our proposed hypervisor-level techniques, even tasks running on closed-source OSs that do not support any cache management scheme can be provided with predictable cache performance.
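The effect of the additional address layer can be illustrated with a small sketch. It is not our hypervisor implementation: the number of cache colors, the modulo-based color function, and the guest-to-host mapping are all hypothetical values chosen for illustration. It only shows that a color chosen by a guest OS from guest-physical addresses need not match the color of the host-physical frame that actually backs the page.

```python
# Illustrative only: all sizes and mappings below are hypothetical.
NUM_COLORS = 32  # assumed number of cache colors in the shared cache


def color_of(frame_number):
    """Cache color of a page frame, assuming a simple modulo-indexed cache."""
    return frame_number % NUM_COLORS


# Hypothetical second-stage mapping maintained by the hypervisor:
# guest-physical frame number -> host-physical frame number.
guest_to_host = {0: 7, 1: 40, 2: 9, 32: 71}


def guest_view(guest_frames):
    """Colors the guest OS believes its pages have."""
    return [color_of(g) for g in guest_frames]


def host_view(guest_frames):
    """Colors the pages actually occupy in the physical cache."""
    return [color_of(guest_to_host[g]) for g in guest_frames]


frames = [0, 1, 2, 32]
print(guest_view(frames))  # colors computed from guest-physical frames
print(host_view(frames))   # colors of the backing host-physical frames
```

Because the two views differ, a guest-level scheme cannot control cache placement on its own; the color has to be decided where the host-physical mapping is known, which is why this sketch places the decision at the hypervisor level.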

Other Research Projects

Responsive and Enforced Interrupt Handling in Virtualization

Many I/O devices, such as sensors, use interrupts to interact with the physical environment with lower latency than polling. Therefore, in addition to synchronization, two strong requirements are imposed on interrupt handling: (i) providing responsive and bounded interrupt handling time, and (ii) enforcing limits on interrupt handling to protect task executions from interrupt storms. These requirements have been studied extensively in non-virtualized environments, but prior work does not satisfy them in a virtualized environment. In this work, we present a responsive, bounded, and enforced interrupt handling scheme for virtualized systems.
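Requirement (ii) can be sketched as a per-source interrupt budget. The following is a minimal illustration, not the scheme proposed in this work: the class name `InterruptEnforcer`, the fixed replenishment period, and the budget values are assumptions. At most `budget` interrupts are admitted per period; any further arrivals in the same period are deferred, so a storm cannot consume unbounded CPU time.

```python
class InterruptEnforcer:
    """Hypothetical budget-based limiter for one interrupt source."""

    def __init__(self, budget, period):
        self.budget = budget      # interrupts admitted per period
        self.period = period      # replenishment period (time units)
        self.window_start = 0     # start of the current period
        self.used = 0             # interrupts admitted in this period

    def admit(self, now):
        """Return True if an interrupt arriving at `now` may be handled."""
        # Replenish the budget once the current period has elapsed.
        if now - self.window_start >= self.period:
            elapsed_periods = (now - self.window_start) // self.period
            self.window_start += elapsed_periods * self.period
            self.used = 0
        if self.used < self.budget:
            self.used += 1
            return True
        return False              # over budget: defer until replenishment


enforcer = InterruptEnforcer(budget=2, period=10)
decisions = [enforcer.admit(t) for t in [0, 1, 2, 3, 11, 12, 13]]
print(decisions)
```

In this run the first two arrivals in each period are admitted and the rest are deferred, which bounds the interference that tasks can experience from this source to `budget` handler executions per period.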

Synchronization for Multi-Core Virtual Machines

The virtualization of embedded and cyber-physical systems has received significant interest for its many benefits, such as the consolidation of individually developed, complex applications into a single hardware platform while maintaining their implementations. However, such consolidation inevitably introduces the sharing of logical and physical resources among tasks, e.g., shared memory for communication, network stacks, and I/O devices. As the number of processing cores grows and more tasks are consolidated, a synchronization mechanism with bounded blocking times becomes increasingly necessary for the timing predictability of tasks. To address this issue, we developed vMPCP, a virtualization-aware multi-core real-time synchronization framework.

Bounding and Reducing Memory Interference

Main memory is a major shared resource among processor cores. A task running on one core can be delayed by other tasks running simultaneously on different cores due to interference in the shared main memory system. Such memory interference delay can be large and highly variable, thereby posing a significant challenge for the design of predictable systems. To address this issue, we proposed techniques to bound memory interference in a COTS-based multi-core system. In addition, based on the observations made from our analysis, we developed a memory interference-aware task allocation algorithm that accommodates memory interference delay during the allocation phase.

Coordinated OS-Level Cache Management in Multi-Core Systems

A large, shared last-level cache (LLC) on modern multi-core processors can effectively reduce memory bandwidth consumption. However, due to resulting cache interference among tasks, the uncontrolled use of the LLC can significantly hamper the predictability and analyzability of a system. As a solution to this issue, we proposed an OS-level cache management scheme for a multi-core platform with an LLC [8]. Our scheme provides predictable cache performance through tight coordination of cache reservation, cache sharing, and cache-aware task allocation. Our scheme does not require special hardware cache partitioning support or modifications to application software. Hence, it is readily applicable to commodity multi-core processors.
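One common way to partition a cache purely in software is page coloring, sketched below. The description above does not spell out the mechanism, so this is an assumption-laden illustration rather than our exact scheme: it assumes a physically indexed LLC without address hashing, 4 KiB pages, and a hypothetical 2 MiB, 16-way cache. The helper names are made up for the example.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def num_colors(cache_size, ways, page_size=PAGE_SIZE):
    """Number of page colors: bytes covered by one cache way, in pages."""
    return cache_size // ways // page_size


def page_color(phys_addr, cache_size, ways, page_size=PAGE_SIZE):
    """Color of the page containing phys_addr (simple modulo indexing)."""
    return (phys_addr // page_size) % num_colors(cache_size, ways, page_size)


def frames_for_task(free_frames, allowed_colors, cache_size, ways):
    """Free page frames a task may use, given its reserved cache colors."""
    return [
        f for f in free_frames
        if page_color(f * PAGE_SIZE, cache_size, ways) in allowed_colors
    ]


LLC_SIZE = 2 * 1024 * 1024  # hypothetical 2 MiB LLC
LLC_WAYS = 16               # hypothetical associativity

print(num_colors(LLC_SIZE, LLC_WAYS))
print(frames_for_task(range(64), {0, 1}, LLC_SIZE, LLC_WAYS))
```

Restricting a task's physical pages to a set of colors confines it to the matching cache sets, which is how a reservation can be enforced by the OS memory allocator alone, without hardware partitioning support or application changes.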

Improving Temporal Isolation of Memory Reservation Schemes

Memory reservation provides real-time applications with guaranteed access to a specified amount of physical memory. However, previous work on memory reservation focused primarily on private pages and paid little attention to shared pages. In this work, we characterized the problems that shared pages cause in real-time applications and proposed a shared-page management scheme that enhances the temporal isolation of existing memory reservation schemes. The scheme was implemented and evaluated in the Linux kernel.

Monitoring Timing Constraints for Distributed Real-Time Systems

This work develops a run-time tool for monitoring end-to-end timing constraints of event flows in distributed real-time systems. The tool detects timing violations by transparently embedding timing information into event flow instances instead of using an external monitoring thread to record events. Through this approach, the tool is able to detect timing violations without creating an extra monitoring thread, without using inter-process communication for event logging, and without transmitting additional network packets to check inter-node timing constraints.
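The idea of carrying timing information inside the event itself can be sketched as follows. This is a simplified illustration, not the tool's actual data format: the `Event` structure, the millisecond timestamps, and the stage names are hypothetical, and for inter-node checks it assumes that node clocks are synchronized.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical event-flow instance carrying its own timing data."""
    payload: str
    origin_ms: int                      # timestamp embedded at flow start
    hops: list = field(default_factory=list)


def start_flow(payload, now_ms):
    """Create an event and embed the time at which the flow began."""
    return Event(payload, now_ms)


def on_stage(event, stage, now_ms, deadline_ms):
    """Check the end-to-end constraint inline, where the event is handled.

    No monitoring thread, log message, or extra packet is needed: the
    elapsed time is derived from the timestamp travelling with the event.
    """
    elapsed = now_ms - event.origin_ms
    event.hops.append((stage, elapsed))
    return elapsed <= deadline_ms


ev = start_flow("sensor-sample", now_ms=100_000)
ok_filter = on_stage(ev, "filter", now_ms=100_004, deadline_ms=10)
ok_actuate = on_stage(ev, "actuate", now_ms=100_015, deadline_ms=10)
print(ok_filter, ok_actuate, ev.hops)
```

Here the second stage observes 15 ms of elapsed time against a 10 ms constraint, so the violation is detected by the stage itself at the moment the event arrives.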

Adaptive System Software for Sensor Network Applications

This work focuses on the development of system software for sensor networks, including an operating system that supports diverse sensor devices and provides a rapid prototyping environment. We developed a novel sensor-network operating system called RETOS. It runs on microcontroller-based devices, such as the TI MSP430 (8 MHz, 10 KB RAM, 48 KB ROM). Specifically, RETOS features preemptive multithreading optimization techniques and a mechanism that protects the system from erroneous applications on MMU-less devices via static/dynamic code checking.

Power Management System based on Application Monitoring

The purpose of this research was to develop a kernel-based low-power management mechanism for battery-operated mobile devices. We developed a power-aware LCD management mechanism based on dynamic refresh-rate scaling and frame buffer monitoring. It does not require additional hardware or modifications to applications. We also worked on the development of a process monitor for Dynamic Voltage Scaling (DVS).
  • Publications: [ SPE07 ]
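
The combination of frame buffer monitoring and dynamic refresh-rate scaling described in the power management project can be sketched as follows. This is a user-level illustration only, not the kernel mechanism itself: the supported refresh rates, the one-second sampling window, and the function names are assumptions made for the example.

```python
def count_changes(snapshots):
    """Count how many consecutive frame buffer snapshots differ."""
    changes = 0
    for prev, cur in zip(snapshots, snapshots[1:]):
        if cur != prev:
            changes += 1
    return changes


def select_refresh_rate(changes_per_sec, rates=(15, 30, 60)):
    """Pick the lowest supported rate that keeps up with content updates."""
    for rate in sorted(rates):
        if changes_per_sec <= rate:
            return rate
    return max(rates)  # content changes faster than the panel supports


# Hypothetical snapshots taken over a one-second window.
window = [b"frame-a", b"frame-a", b"frame-b", b"frame-b", b"frame-c"]
updates = count_changes(window)
print(updates, select_refresh_rate(updates))
```

Because the decision is driven only by observed frame buffer activity, a mostly static screen drops to the lowest rate while applications remain unmodified, in line with the transparency goal stated above.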