Data engineering for real-time processing, streaming analytics, and scalable decision intelligence
Synopsis
Data engineering has emerged as a pivotal specialization within computer engineering and computer science. The field focuses on the design and optimization of data pipelines and architectures that serve a wide array of applications, spanning artificial intelligence, business intelligence, real-time analytics, and other specialized domains. These use cases are diverse, however: their requirements differ in scale, performance, and other supporting engineering metrics, so no single technology or solution fits all of these domains. Data applications also exhibit widely varying telemetry at every level of the technology stack (Gulisano et al., 2012; Akidau et al., 2015; Carbone et al., 2015).
A large fraction of the data volume that is ingested, stored, and processed is typically attributable to applications that are part of real-time serving systems. The pipelines and systems that support the storage, processing, and serving needs of these applications must be optimized for throughput, latency, fault tolerance, query expressiveness, and any other relevant domain-specific metric, such as data retention and availability. These applications form an important part of the data ecosystem, with reliable systems providing real-time recommendations, personalization, and fraud detection to users. Their failure or obsolescence could significantly disrupt business ecosystems and the services offered across major economies, particularly affecting companies that rely on recommendations to monetize user interactions. The unprecedented scale of the customer bases already served by some of these products provides a powerful incentive for organizations to continually push the latency boundaries of the pipeline architecture (Stonebraker & Çetintemel, 2005; Hesse & Lorenz, 2019).