Zhiling Chen

ScanBot: Towards Intelligent Surface Scanning in Embodied Robotic Systems
arXiv
Zhiling Chen, Yang Zhang, Fardin Jalil Piran, Qianyu Zhou, Jiong Tang, Farhad Imani.
[CODE]
Abstract: We introduce ScanBot, a novel dataset designed for instruction-conditioned, high-precision surface scanning in robotic systems. In contrast to existing robot learning datasets that focus on coarse tasks such as grasping, navigation, or dialogue, ScanBot targets the high-precision demands of industrial laser scanning, where submillimeter path continuity and parameter stability are critical. The dataset covers laser scanning trajectories executed by a robot across 12 diverse objects and 6 task types, including full-surface scans, geometry-focused regions, spatially referenced parts, functionally relevant structures, defect inspection, and comparative analysis. Each scan is guided by natural language instructions and paired with synchronized RGB, depth, and laser profiles, as well as robot pose and joint states. Despite recent progress, existing vision-language-action (VLA) models still fail to generate stable scanning trajectories under fine-grained instructions and real-world precision demands. To investigate this limitation, we benchmark a range of multimodal large language models (MLLMs) across the full perception–planning–execution loop, revealing persistent challenges in instruction-following under realistic constraints.

Multimodal RAG-driven Anomaly Detection and Classification in Laser Powder Bed Fusion using Large Language Models
IDETC 2025
Kiarash Naghavi Khanghah, Zhiling Chen, Lela Romeo, Qian Yang, Rajiv Malhotra, Farhad Imani, Hongyi Xu.

Abstract: Additive manufacturing (AM) enables the fabrication of complex designs while minimizing waste, but faces challenges related to defects and process anomalies. This study presents a novel multimodal Retrieval-Augmented Generation (RAG)-based framework that automates anomaly detection across various AM processes by leveraging information retrieved from the literature, including images and descriptive text, rather than training datasets. The framework integrates text and image retrieval from scientific literature with multimodal generation models to perform zero-shot anomaly identification, classification, and explanation generation in a Laser Powder Bed Fusion (L-PBF) setting. The proposed framework is evaluated on four L-PBF manufacturing datasets from Oak Ridge National Laboratory, featuring various printer makes, models, and materials. This evaluation demonstrates the framework’s adaptability and generalizability across diverse images without requiring additional training. Comparative analysis using Qwen2-VL-2B and GPT-4o-mini as the MLLM within the proposed framework shows that GPT-4o-mini outperforms both Qwen2-VL-2B and a proportional random baseline in manufacturing anomaly classification. Additionally, the evaluation of the RAG system confirms that incorporating retrieval mechanisms improves average accuracy by 12% by reducing the risk of hallucination and providing additional information. The proposed framework can be continuously updated by integrating emerging research, allowing seamless adaptation to the evolving landscape of AM technologies. This scalable, automated, and zero-shot-capable framework streamlines AM anomaly analysis, enhancing efficiency and accuracy.

Can Multimodal Large Language Models be Guided to Improve Industrial Anomaly Detection?
arXiv
Zhiling Chen, Hanning Chen, Mohsen Imani, Farhad Imani.
[CODE]
Abstract: In industrial settings, the accurate detection of anomalies is essential for maintaining product quality and ensuring operational safety. Traditional industrial anomaly detection (IAD) models often struggle with flexibility and adaptability, especially in dynamic production environments where new defect types and operational changes frequently arise. Recent advancements in Multimodal Large Language Models (MLLMs) hold promise for overcoming these limitations by combining visual and textual information processing capabilities. MLLMs excel in general visual understanding due to their training on large, diverse datasets, but they lack domain-specific knowledge, such as industry-specific defect tolerance levels, which limits their effectiveness in IAD tasks. To address these challenges, we propose Echo, a novel multi-expert framework designed to enhance MLLM performance for IAD. Echo integrates four expert modules: Reference Extractor, which provides a contextual baseline by retrieving similar normal images; Knowledge Guide, which supplies domain-specific insights; Reasoning Expert, which enables structured, stepwise reasoning for complex queries; and Decision Maker, which synthesizes information from all modules to deliver precise, context-aware responses. Evaluated on the MMAD benchmark, Echo demonstrates significant improvements in adaptability, precision, and robustness, moving closer to meeting the demands of real-world industrial anomaly detection.

Federated Hyperdimensional Computing for hierarchical and distributed quality monitoring in smart manufacturing
Internet of Things
Zhiling Chen, Danny Hoang, Fardin Jalil Piran, Ruimin Chen, Farhad Imani.

Abstract: In emerging smart manufacturing, the integration of Internet of Things (IoT) and edge devices is essential for in-situ sensing, communication, and adaptive learning. Federated Learning (FL) leverages edge-cloud collaboration to preserve data privacy and minimize communication overhead compared to centralized models. However, conventional FL approaches face significant challenges in manufacturing: (1) non-Independent and Identically Distributed (non-IID) data and diverse feature distributions complicate local model training within hierarchical, complex industrial data structures; (2) directly overwriting local models with a global model during updates causes clients to lose critical task-specific information unique to their environments; and (3) transmitting model updates causes massive communication overhead, limiting scalability. We propose Federated Distributed Hyperdimensional Computing (FedDHD), an FL framework that employs Hyperdimensional Computing (HDC) to optimize communication for hierarchical manufacturing data. Unlike neural networks, HDC offers robust performance with lower computational demands and inherent resilience to noisy, non-IID data, enabling FedDHD to naturally handle data heterogeneity and reduce computational burdens on edge devices. FedDHD integrates a hierarchical graph-based learning model with a node pruning module to alleviate computational load and implements a novel client-cloud update strategy leveraging HDC’s high-dimensional representations to streamline synchronization, thereby minimizing communication costs and improving scalability. We validate FedDHD through a case study on hybrid manufacturing using a Sinumerik edge device, focusing on the geometric quality assessment of two counterbore diameters. FedDHD achieved an F1-score of 95.3% and demonstrated performance improvements of up to 12.6% over state-of-the-art neural network-based FL methods, highlighting its superior efficiency and scalability in complex industrial settings.

Vision Language Model for Interpretable and Fine-grained Detection of Safety Compliance in Diverse Workplaces
Expert Systems with Applications
Zhiling Chen, Hanning Chen, Mohsen Imani, Ruimin Chen, Farhad Imani.
[CODE]
Abstract: Workplace accidents due to personal protective equipment (PPE) non-compliance raise serious safety concerns and lead to legal liabilities, financial penalties, and reputational damage. While object detection models have shown the capability to address this issue by identifying safety items, most existing models, such as YOLO, Faster R-CNN, and SSD, are limited in verifying the fine-grained attributes of PPE across diverse workplace scenarios. Vision language models (VLMs) are gaining traction for detection tasks by leveraging the synergy between visual and textual information, offering a promising solution to traditional object detection limitations in PPE recognition. Nonetheless, VLMs face challenges in consistently verifying PPE attributes due to the complexity and variability of workplace environments, requiring them to interpret context-specific language and visual cues simultaneously. We introduce Clip2Safety, an interpretable detection framework for diverse workplace safety compliance, which comprises four main modules: scene recognition, the visual prompt, safety items detection, and fine-grained verification. The scene recognition module identifies the current scenario to determine the necessary safety gear ...

Distributed Hyperdimensional Computing for Real-Time Data Aggregation and Interpretable Quality Monitoring in Manufacturing
IMECE 2024 (Portland, OR)
Zhiling Chen, Danny Hoang, Ruimin Chen, Farhad Imani.

Abstract: The integration of diverse sensors in manufacturing processes offers enhanced potential for real-time quality assurance through the collection of complementary data. However, the limited interpretability of current machine learning models often hampers the effective discrimination of each sensor's unique contributions, primarily due to the complexity of decoding the interdependencies among various signals. This paper proposes a novel computational framework, Distributed Hyperdimensional Computing (DHDC), which is designed to leverage efficient cognitive operations, such as binding and bundling, for interpretable learning across multi-level sensor data. DHDC operates by encoding and aggregating data systematically across a distributed architecture, thereby enhancing transparency in computational efficiency. Our real-world experimental results on the 5-axis machining for the fabrication of the counterbore hole feature demonstrate that the framework not only effectively characterizes the impact of individual sensors but also achieves a high degree of predictive accuracy, as evidenced by an F1 score of 90.4%. The proposed framework holds the potential for interpretable and scalable quality control in distributed additive and subtractive manufacturing.
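For readers unfamiliar with the binding and bundling operations mentioned in this abstract, the following is a minimal, generic sketch of how they are commonly defined for bipolar hypervectors. It is not the DHDC implementation: the dimensionality, the sensor names, the random "reading" vectors, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

DIM = 10_000  # hypervector dimensionality (illustrative assumption)
rng = np.random.default_rng(0)


def random_hv():
    """Random bipolar hypervector with entries in {-1, +1}."""
    return rng.choice(np.array([-1, 1]), size=DIM)


def bind(a, b):
    """Binding: elementwise product. The result is dissimilar to both
    inputs, and binding again with the same vector undoes it."""
    return a * b


def bundle(*hvs):
    """Bundling: elementwise sum followed by sign. The result stays
    similar to each of its inputs."""
    total = np.sum(hvs, axis=0)
    return np.where(total >= 0, 1, -1)


def cosine(a, b):
    """Cosine similarity between two hypervectors."""
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))


# Hypothetical sensor channels, each with an ID vector and an encoded reading.
sensor_ids = {name: random_hv() for name in ("vibration", "temperature", "power")}
readings = {name: random_hv() for name in sensor_ids}

# Aggregate all sensors into one record: bind each ID to its reading, then bundle.
record = bundle(*(bind(sensor_ids[n], readings[n]) for n in sensor_ids))

# Unbinding with one sensor's ID yields a noisy copy of that sensor's reading.
recovered = bind(record, sensor_ids["vibration"])
sim_match = cosine(recovered, readings["vibration"])
sim_other = cosine(recovered, readings["temperature"])
```

In this sketch, `sim_match` is clearly larger than `sim_other`, which stays near zero. That ability to query an aggregated record for one channel's contribution is one reason such representations are often described as interpretable.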

Privacy-preserving Federated Learning with Differentially Private Hyperdimensional Computing
Computers and Electrical Engineering
Fardin Jalil Piran, Zhiling Chen, Mohsen Imani, Farhad Imani.

Abstract: Federated Learning (FL) is essential for efficient data exchange in Internet of Things (IoT) environments, as it trains Machine Learning (ML) models locally and shares only model updates. However, FL is vulnerable to privacy threats like model inversion and membership inference attacks, which can expose sensitive training data. To address these privacy concerns, Differential Privacy (DP) mechanisms are often applied. Yet, adding DP noise to black-box ML models degrades performance, especially in dynamic IoT systems where continuous, lifelong FL accumulates excessive noise over time. To mitigate this issue, we introduce Federated HyperDimensional computing with Privacy-preserving (FedHDPrivacy), an eXplainable Artificial Intelligence (XAI) framework that combines the neuro-symbolic paradigm with DP. FedHDPrivacy carefully manages the balance between privacy and performance by theoretically tracking cumulative noise from previous rounds and adding only the necessary incremental noise to meet privacy requirements. In a real-world case study involving in-process monitoring of manufacturing machining operations, FedHDPrivacy demonstrates robust performance, outperforming standard FL frameworks, including Federated Averaging (FedAvg) ...
