Big Data Cloud Architecture

This article shares the best practices of InMobi, a user of Alibaba Cloud’s open source big data service.

The deep integration of open source and cloud-based technologies has enabled Alibaba Cloud’s open source big data service to accumulate rich experience in performance, usability, and security. Many companies rely on it to focus on their core business, shorten development cycles, reduce management and maintenance work, and expand their business activities.

InMobi is a global platform for mobile advertising and marketing technology. Building on the applications and users it connects worldwide, InMobi provides mobile advertising and marketing services for domestic brands and applications, and marketing and monetization services for developers. The company was founded in 2007 and entered the Chinese market in 2011. It focuses on technology research and development and plays an important role in the mobile advertising industry, both in China and worldwide. InMobi reaches over one billion monthly active unique users through local service teams in 23 countries and regions around the world. It delivers accurately targeted mobile advertising using thousands of audience classifications, thousands of metrics, data from sample libraries that match tens of millions of users, and location-based services (LBS).

As a leading technology company, InMobi was named one of CNBC’s Disruptive 50 in 2019 and one of the most innovative companies by Fast Company magazine in 2018.

The previous figure shows InMobi’s big data architecture in China, which is divided into a data integration layer, a storage layer, a computing layer, and a reporting layer. First, collected data (especially RR logs and other data) enters the data integration layer. Then the data is stored offline in HDFS, and data processing is handled by the compute cluster. Finally, the processed results are presented to end users through reports.

When computing resources are insufficient, some tasks must be deferred (or stopped) so that important tasks run first, which is not suitable for generating reports on time.
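The trade-off described above can be illustrated with a minimal sketch. It assumes a fixed-capacity cluster and a simple priority-ordered admission rule; the `Job` class, the job names, and the slot counts are all hypothetical and are not taken from InMobi’s actual scheduler.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    priority: int                       # lower number = more important
    name: str = field(compare=False)    # hypothetical job name
    slots: int = field(compare=False)   # compute slots the job needs


def schedule(jobs, capacity):
    """Admit jobs in priority order while slots remain; defer the rest."""
    heap = list(jobs)
    heapq.heapify(heap)
    running, deferred = [], []
    free = capacity
    while heap:
        job = heapq.heappop(heap)
        if job.slots <= free:
            running.append(job.name)
            free -= job.slots
        else:
            deferred.append(job.name)
    return running, deferred


# Hypothetical workload on a cluster with 10 slots.
jobs = [
    Job(2, "hourly_report", 4),
    Job(0, "billing_etl", 6),
    Job(1, "model_training", 3),
]
running, deferred = schedule(jobs, capacity=10)
print("running:", running)
print("deferred:", deferred)
```

In this toy run the report job is the one that gets deferred, which is the situation the text describes: with fixed capacity, protecting important tasks comes at the cost of report latency.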

Data reporting is not timely enough and cannot meet the business teams’ need for reports generated within minutes.

3. InMobi’s Big Data Optimization Project in China

How InMobi approaches big data optimization:

Adopt big data services in the cloud and use their elasticity to overcome shortages in computing and storage, especially in short-term, high-load scenarios such as the 618 and Double 11 shopping festivals.

As an open source product, ClickHouse has been widely implemented in the business environment of Internet companies in China.

A real-time database system solves the report timeliness problem: reports are available at minute-level latency at worst, and at second-level latency for special reports and requests.

Read RR logs from Kafka and write real-time reports to ClickHouse. Read other useful data from Kafka and store it in MySQL and PostgreSQL according to business requirements.
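As a rough illustration of the real-time path from Kafka to ClickHouse, the sketch below rolls log records up into per-minute report rows. It is only a sketch under stated assumptions: a plain Python list stands in for the Kafka topic, the returned rows stand in for what would be inserted into a ClickHouse table, and the record fields (`ts`, `campaign`, `filled`) are hypothetical, since the source does not describe the RR log schema.

```python
from collections import defaultdict
from datetime import datetime, timezone


def minute_bucket(ts: float) -> str:
    """Truncate a Unix timestamp to its UTC minute, e.g. '1970-01-01 00:01'."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M")


def aggregate_by_minute(events):
    """Roll raw log records up into one report row per (minute, campaign)."""
    counts = defaultdict(lambda: {"requests": 0, "responses": 0})
    for event in events:
        key = (minute_bucket(event["ts"]), event["campaign"])
        counts[key]["requests"] += 1
        if event["filled"]:  # hypothetical flag: the request got a response
            counts[key]["responses"] += 1
    return [
        {"minute": minute, "campaign": campaign, **totals}
        for (minute, campaign), totals in sorted(counts.items())
    ]


# Hypothetical records, as they might be consumed from a Kafka topic.
events = [
    {"ts": 0, "campaign": "a", "filled": True},
    {"ts": 30, "campaign": "a", "filled": False},
    {"ts": 61, "campaign": "a", "filled": True},
    {"ts": 65, "campaign": "b", "filled": True},
]
rows = aggregate_by_minute(events)
for row in rows:
    print(row)
```

Pre-aggregating to the minute keeps the report table small. In a real deployment the same rollup could instead be left to the database, which is a design choice this sketch does not settle.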

Store all raw data from Kafka in the HDFS cluster with Flume for data analysis and rule-making. In the offline big data cluster, all business requirements for offline reports are handled by Spark jobs. Finally, the results are written to ClickHouse and displayed in offline data reports.

4. Future technical plans: stream and batch processing with Flink and Hologres to build a real-time database.

The Hologres architecture separates storage and computation. The compute layer is deployed entirely on Kubernetes, and the storage layer uses shared storage. You can choose HDFS or OSS on the cloud based on business requirements, which allows resources to scale elastically and solves the problems caused by insufficient resources. This suits InMobi’s advertising business.

Flink performs ETL processing on streaming and batch data and writes the processed data to Hologres for storage and querying. Business applications then connect directly to Hologres to provide web services.

Building an on-premises big data platform requires significant infrastructure investment to support data collection, processing, classification, storage, and analysis. Businesses looking to migrate their applications and big data platforms to the cloud (to increase agility and scale, and to move from a capital-investment model to an operating-cost model) should consider building on a cloud-based big data platform.

Businesses can use big-data-as-a-service solutions on cloud platforms, including managed Kafka messaging, streaming analytics, analytics engines (built on open-source Apache Hadoop and Apache Spark), and cloud object storage. A cloud solution is simple to use, without the hassle of setup or high maintenance costs. In addition, data scientists can begin providing value immediately by accessing and analyzing data sets in cloud object storage with data science tools.

A few months ago, Cloud Garage worked with a mid-sized company to evaluate all of their applications and transform them for the cloud. At the end of the initial assessment, we provided a transformation vision and an implementation plan based on the Cloud Garage methodology. We also provided a customized cloud architecture, proposed a team model, and created a work plan that breaks the project into several sub-projects and pilots to modernize their applications and move them to the cloud.

In this article, I’ll share these implementation considerations and the target architecture components to help others implement big data platforms in the cloud.

Based on this engagement and others, here are a few key requirements for building a secure cloud platform:

Although many companies today are looking to use big data to generate new business insights, this company’s use of big data is central to its focus and baked into many of its business decisions. Big data drives its innovation in customer service by predicting what customers like and how they interact. The company learns from these interactions to improve future experiences.

Their current big data platform is adequate but expensive. It lags behind modern technology because of the number of vendors and installations involved. The platform was in dire need of technical updates and process upgrades. Modernization accelerates innovation, improves data management, lowers maintenance costs, and most importantly creates a single source of truth that resolves data inconsistencies, which will increase the business value of the technology.

Before embarking on cloud migration and digital transformation, organizations must understand their long-term business goals, pain points, application and data architecture, infrastructure, and operations and processes. Cloud Garage comprehensively examined the client’s applications that support key business functions, grouped them into categories, and evaluated different cloud delivery models. As mentioned above, this client’s big data platform is one of the core components of all their applications. In our evaluation, we also found that the platform had been built with various technologies over time, which introduces many complications to consider.

When we joined the client, they had already started building the target architecture model using open source technologies. Under that plan, however, the client would have assembled the big data platform on the cloud manually, without using any cloud-native services that can be deployed quickly (such as Hadoop and Spark clusters) or object storage for data flexibility and elasticity.

Our team recognized the client’s goal of adopting open source technologies and proposed a new target architecture that meets their core requirements. Our architecture is divided into three major parts to address data flow:

First, all three workflows require data ingestion. We planned to consolidate all data ingestion through Kafka to accommodate the large volume of data flowing in through different channels. Although the previous platform used Kafka alongside other custom producers, we proposed moving the various integrations onto Kafka running on the cloud platform. Kafka supports real-time data flow. Since Kafka is an open source technology, the client is not locked in and can easily change providers.
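The idea of consolidating ingestion from several channels into one Kafka entry point can be sketched as follows. This is a minimal illustration under assumptions, not the client’s implementation: `InMemoryBroker` is a hypothetical stand-in for a Kafka cluster, and the topic name `ingest.raw`, the envelope fields, and the channel names are invented for the example.

```python
import json
import time
from collections import defaultdict


class InMemoryBroker:
    """Hypothetical stand-in for a Kafka cluster: one append-only list per topic."""

    def __init__(self):
        self.topics = defaultdict(list)

    def produce(self, topic: str, value: bytes) -> None:
        self.topics[topic].append(value)


def to_envelope(source: str, payload: dict, now: float) -> bytes:
    """Wrap a channel-specific record in one common, self-describing envelope."""
    envelope = {"source": source, "ingested_at": now, "payload": payload}
    return json.dumps(envelope, sort_keys=True).encode("utf-8")


def ingest(broker, source, records, topic="ingest.raw", now=None):
    """Send every record from one channel to the shared topic; return the count."""
    stamp = time.time() if now is None else now
    sent = 0
    for record in records:
        broker.produce(topic, to_envelope(source, record, stamp))
        sent += 1
    return sent


# Two hypothetical channels feeding the same topic.
broker = InMemoryBroker()
ingest(broker, "web", [{"page": "/home"}, {"page": "/cart"}], now=1.0)
ingest(broker, "mobile", [{"screen": "checkout"}], now=2.0)

messages = [json.loads(m) for m in broker.topics["ingest.raw"]]
for message in messages:
    print(message["source"], message["payload"])
```

Because every channel writes the same envelope to the same topic, downstream consumers need only one parser, and the `source` field preserves where each record came from.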

Second, we proposed cloud-based streaming analytics to build real-time applications for data processing. The streaming service is a proven technology that reaches hundreds of millions of users. Although it is not open source, it supports developing applications in Java, Scala, and Python. It also supports Apache Beam, a programming model that allows users to build engine-agnostic streaming applications. The controller is the lowest-level stream program,
