Leveraging data engineering principles to streamline semiconductor research and development pipelines
Synopsis
The Research and Development (R&D) of semiconductor technologies is a multibillion-dollar effort for the worldwide semiconductor industry. To keep pace with the rapid advance of semiconductor technology and to enhance productivity, semiconductor R&D pipelines need to be streamlined without compromising research fidelity or flexibility. This work applies existing Data Engineering principles for handling datasets, encompassing scientific data provenance and manuscript rewrite pipelines, to that problem. A semantic provenance data model, structured data integration pipelines, and distributed workflows are demonstrated and discussed with respect to three relevant semiconductor R&D tasks: site-specific ion implantation for high-performance embedded non-volatile memory, atomic layer doped gate technology for sub-5 nm node FinFETs, and pitch-scaled extreme ultraviolet lithography implementation for 0.5 nm node logic technologies. The results show the potential of Data Engineering principles to improve semiconductor data management and research productivity by addressing the unique needs and preferences of semiconductor R&D.
Research and Development (R&D) is the foundation of advances in semiconductor technology, which drives performance improvements in computers and smartphones, enables low-cost ubiquitous electronics, and has wide implications for other fields such as healthcare and information technology. Each advance in semiconductor technology generally takes on the order of 2–3 years and involves multibillion-dollar investments in worldwide efforts by semiconductor and equipment manufacturers, fabrication plants, and research institutions. During this extended time, new semiconductor devices typically combine a handful of new processing technologies with precise control of many device-structure, doping, and chemical-composition parameters. R&D is the exercise of empirically exploring, implementing, optimizing, and demonstrating new technologies step by step, often with little prior knowledge of whether they will succeed. The general requirement for semiconductor technologies is that they be manufacturable, which implies extensive investigation of conveying, mass production, contamination, yield, and material transition issues (Begum & Chowdary, 2024; Ketelaars, n.d.; Raghunathan et al., 2024).
The semiconductor industry faces an impending crisis at the 5 nm technology node. There is a compelling need to sustain dedicated R&D in wafer-scale fabrication without large upfront investments in factories, chemicals, and process equipment, and without exposing trade secrets. Currently, R&D typically relies on legacy tools such as spreadsheets and on proprietary solutions from equipment vendors, while custom database-derived solutions play only a minor role. Such solutions are hardwired to semiconductor use cases and create data silos. There is a need for a general-purpose solution that can be implemented within a year and is easily adaptable to many related tasks beyond semiconductors (Xu et al., 2023; Schiller & Larochelle, 2024; Schilling-Wilhelmi et al., 2024).