
Ingesting and processing data for Cymbal Retail

1. Ingesting and processing data for Cymbal Retail

Let’s begin by exploring the role of a Professional Data Engineer in handling data ingestion and data processing for Cymbal Retail, including the Google Cloud services that support this work.

Cymbal Retail receives data from multiple sources, both internal and external. As the business has grown, the volume of ingested data has increased exponentially, and processing that data has become increasingly complex and costly. In Cymbal Retail’s current on-premises data centers, Spark and Hadoop jobs are executed on pre-configured, static infrastructure. Part of your role involves determining how to lift and shift these jobs to Google Cloud.

You have to design the architecture for data ingestion and processing. Some data can be loaded directly into the data warehouse using an extract and load (EL) approach, while other data might need to be transformed before it is loaded. Building, deploying, and operating effective, flexible data pipelines for all the stages of data processing is a primary expectation of you as a Professional Data Engineer. You need to choose the right approach among EL, ETL (extract, transform, load), and ELT (extract, load, transform), and pick the right Google Cloud tools for the job.

Cymbal Retail’s customers want features that require increased use of real-time data. Here, skills with tools like the open source Apache Beam and the hosted Dataflow service are important for a data professional. Knowing how to apply different types of windowing to various use cases will help you choose the right approach to analyzing streaming data.

Your role also requires you to optimize all data ingestion and data processing tasks. Your optimizations should bring considerable savings in effort and cost while improving availability and responsiveness. As the volume of data and the scale of processing increase, Cymbal Retail does not want latency, effort, or cost to increase linearly or, worse, exponentially. Your early design decisions on automation and orchestration could reduce effort later on.
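To make the idea of "different types of windowing" more concrete, here is a minimal sketch in plain Python. It is not the Apache Beam or Dataflow API; it only illustrates how fixed, sliding, and session windows would group the same events. The event data and the names `events`, `fixed_windows`, `sliding_windows`, and `session_windows` are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Hypothetical purchase events: (timestamp in seconds, amount).
events = [(5, 10), (20, 20), (65, 5), (130, 40)]


def fixed_windows(events, size):
    """Assign each event to exactly one non-overlapping window of `size` seconds."""
    windows = defaultdict(list)
    for ts, value in events:
        start = (ts // size) * size
        windows[(start, start + size)].append(value)
    return dict(windows)


def sliding_windows(events, size, period):
    """Assign each event to every window of `size` seconds, where a new
    window starts every `period` seconds, whose span contains the event."""
    windows = defaultdict(list)
    for ts, value in events:
        start = (ts // period) * period  # latest window start at or before ts
        while start > ts - size:         # window [start, start + size) still covers ts
            windows[(start, start + size)].append(value)
            start -= period
    return dict(windows)


def session_windows(events, gap):
    """Group events into sessions; a new session begins when an event
    arrives `gap` seconds or more after the previous one."""
    sessions = []
    last_ts = None
    for ts, value in sorted(events):
        if last_ts is None or ts - last_ts >= gap:
            sessions.append([])
        sessions[-1].append(value)
        last_ts = ts
    return sessions


fixed = fixed_windows(events, size=60)
sliding = sliding_windows(events, size=60, period=30)
sessions = session_windows(events, gap=60)
```

With these sample events, fixed windows place each purchase in exactly one one-minute bucket, sliding windows place each purchase in two overlapping buckets (because the window size is twice the period), and session windows group purchases by bursts of activity rather than by clock time. Which of these fits depends on the use case, for example per-minute totals, moving averages, or per-visit summaries.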

2. Let's practice!
