Big Data Security Solutions

In recent years, the "digital economy" has shown how flexibly data can be used and has become a driving force of economic development. In September 2015, the State Council of the People's Republic of China issued the Action Plan on Promoting Big Data Development, a guiding document for promoting the application and development of big data in China and for empowering industry. However, big data faces many security challenges, notably data leakage and the exposure of personal privacy.

As one of the most important technologies in cyberspace security, cryptography can be applied to ensure data authenticity, integrity, confidentiality, and non-repudiation. It serves as the most effective, reliable, and economical foundation for cyberspace security protection. The Cryptography Law of the People's Republic of China took effect on January 1, 2020, providing a legal basis for the comprehensive promotion of cryptography applications.

Requirement Analysis





Data inside the big data platform exists in three states: in transit, in use, and at rest. Without data security protection, plaintext data is at risk throughout its life cycle, and a data leak can have serious consequences.

The security risks of the data during the whole life cycle include:

Lack of Security Mechanisms in the Big Data Platform

In the early design of the Hadoop ecosystem, the security schemes for user authentication, access control, key management, and security auditing were inadequate.

Severe Risks of Private Data Leakage

The big data platform stores large volumes of data, often hundreds of terabytes. At that scale, sensitive information must be proactively protected.

Insufficient Traditional Security Protection Methods

Traditional cryptographic methods, such as TLS (Transport Layer Security) and TDE (Transparent Data Encryption), address encryption only for data in transit and at rest. Weaknesses remain, such as the lack of application-level encryption, uncontrollable permissions, and the use of insecure cryptographic algorithms.

Lack of an Independent Data Security Authority System

The data security authority system in the big data platform relies on the platform's own authority control over users and administrators. Without independent control over sensitive data, high-privilege accounts can easily be abused and a single layer of control can be breached.

Scheme Architecture





In response to these data security risks, Sansec has built a full-life-cycle big data security system based on cryptographic technology, drawing on many years of research, trials, and experience.

This security system:

  • Thoroughly extends cryptographic protection of sensitive data to the application layer.
  • Overcomes the security drawbacks of relying only on "transmitting with TLS and storing with TDE".
  • Effectively solves the difficulty of retrieving and computing on ciphertext data.
  • Builds an independent third-party data security authority system.

Figure 1. Big Data Encryption Scheme


Technical Architecture

This solution combines a security platform with cryptographic middleware to provide encryption functions for multiple application components and databases in the big data platform. The architecture is shown in the following figure:

Figure 2. Technical Architecture of Big Data Encryption Solution


Security platform: located at the core of the entire cryptographic system, it is responsible for:

  • Providing hardware-level security protection for the entire cryptographic framework (HSM)
  • Managing keys securely based on KMIP (Key Management Interoperability Protocol)
  • Providing identity authentication and access management for the ciphertext search engine and the cryptographic middleware

Cryptographic middleware: installed as an application-side software agent. This component is embedded transparently inside the application and works with the security platform to encrypt sensitive data and manage keys securely. It supports several kinds of cryptographic algorithms, including FPE (format-preserving encryption) and homomorphic encryption.
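To illustrate what format preservation means for a field such as a card or ID number, here is a minimal sketch in Python. It is a toy alternating Feistel construction over decimal digits with HMAC-SHA256 as the round function. It is not the NIST FF1/FF3-1 algorithm and not Sansec's implementation; the function names, the round count, and the sample key are all assumptions made for the example, and the code should not be used to protect real data.

```python
import hashlib
import hmac


def _round_value(key: bytes, round_no: int, other_half: str, total_len: int) -> int:
    """Keyed pseudo-random integer for one Feistel round (HMAC-SHA256)."""
    msg = f"{round_no}|{total_len}|{other_half}".encode()
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest(), "big")


def _check(digits: str, rounds: int) -> None:
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a string of at least two ASCII digits")
    if rounds <= 0 or rounds % 2:
        raise ValueError("rounds must be a positive even number")


def toy_fpe_encrypt(key: bytes, digits: str, rounds: int = 10) -> str:
    """Encrypt a digit string into a digit string of the same length."""
    _check(digits, rounds)
    n = len(digits)
    u, v = n // 2, n - n // 2
    a, b = digits[:u], digits[u:]
    for i in range(rounds):
        m = u if i % 2 == 0 else v          # width of the half being replaced
        c = (int(a) + _round_value(key, i, b, n)) % 10**m
        a, b = b, str(c).zfill(m)
    return a + b


def toy_fpe_decrypt(key: bytes, digits: str, rounds: int = 10) -> str:
    """Invert toy_fpe_encrypt by running the rounds backwards."""
    _check(digits, rounds)
    n = len(digits)
    u, v = n // 2, n - n // 2
    a, b = digits[:u], digits[u:]
    for i in reversed(range(rounds)):
        m = u if i % 2 == 0 else v
        c = (int(b) - _round_value(key, i, a, n)) % 10**m
        a, b = str(c).zfill(m), a
    return a + b


if __name__ == "__main__":
    key = b"demo-key-not-for-production"      # hypothetical key
    ciphertext = toy_fpe_encrypt(key, "6222021234567890")
    print(ciphertext, toy_fpe_decrypt(key, ciphertext))
```

Because the ciphertext is still a 16-digit string, it fits an existing column type and validation rule without schema changes, which is the practical appeal of FPE for application-layer encryption.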

Application layer: Big data platforms mostly use data processing components based on the Hadoop ecosystem, which fall into three groups:

  • Data cleaning and message distribution (ETL, Kafka, etc.)
  • Data storage and processing (HDFS, Hive, HBase, Spark, Flink, etc.)
  • Analysis and presentation (BI)

This solution can support additional types of application components and can be quickly customized and adapted to practical requirements.


Product Deployment

The security platform includes an HSM, a key management system, a ciphertext search engine, and a management terminal. These servers are connected to the big data production cluster over Ethernet and are deployed independently in a secure subnet. The cryptographic middleware is deployed in the various clusters.

Figure 3. Big Data Encryption Product Deployment


Main Functions

1. High-performance Data Encryption

Self-developed cryptographic algorithm engines and optimized algorithm implementations deliver high-performance data encryption.

2. Key Security Management

The root key is stored securely in the HSM, and the KMIP protocol provides centralized key management for multiple clients.
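One common way to organize such a key hierarchy is to derive a separate data key for each client from the root key. The sketch below shows this with HKDF-SHA256 (RFC 5869) built from the Python standard library. It is an assumption for illustration only: the source does not say how data keys are produced, the `ToyKeyManager` class and its method are hypothetical, a local byte string stands in for the HSM-protected root key, and no KMIP messages are exchanged.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-then-expand (RFC 5869) using HMAC-SHA256."""
    if not 0 < length <= 255 * HASH_LEN:
        raise ValueError("invalid output length")
    # Extract: condense the input keying material into a pseudo-random key.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


class ToyKeyManager:
    """Hypothetical stand-in for a central key service.

    In a real deployment the root key would never leave the HSM; here it is
    an in-memory value so the example can run anywhere.
    """

    def __init__(self, root_key: bytes):
        self._root_key = root_key

    def data_key_for(self, client_id: str, purpose: str = "field-encryption") -> bytes:
        # Binding the client id and purpose into `info` gives each client
        # an independent key from the same root.
        info = f"{purpose}|{client_id}".encode()
        return hkdf_sha256(self._root_key, salt=b"toy-key-hierarchy", info=info)


if __name__ == "__main__":
    manager = ToyKeyManager(root_key=b"\x01" * 32)   # placeholder root key
    print(manager.data_key_for("hive-node-1").hex())
    print(manager.data_key_for("spark-node-7").hex())
```

With this layout, compromising one client's data key does not reveal the root key or the keys of other clients, and rotating the root key changes every derived key in one step.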

3. Independent Access Control for Sensitive Data

The cryptographic system provides its own access control. On top of the native Hadoop permission framework, it enforces independent, third-party permission control over sensitive data.
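The effect of layering an independent policy on top of the platform's own permissions can be sketched as follows. All class and function names here are hypothetical and the policy model is deliberately simplified; the point is only that a user with platform-level read access still sees ciphertext unless the separate security policy also grants decryption.

```python
from dataclasses import dataclass, field


@dataclass
class PlatformAcl:
    """Stands in for the platform's native table permissions (hypothetical)."""
    readers: dict = field(default_factory=dict)   # table name -> set of users

    def can_read(self, user: str, table: str) -> bool:
        return user in self.readers.get(table, set())


@dataclass
class SecurityPolicy:
    """Independent policy held by the security platform (hypothetical)."""
    grants: set = field(default_factory=set)      # (user, table, column) tuples

    def can_decrypt(self, user: str, table: str, column: str) -> bool:
        return (user, table, column) in self.grants


def read_column(user: str, table: str, column: str,
                acl: PlatformAcl, policy: SecurityPolicy) -> str:
    """Return which view of a sensitive column the user receives."""
    if not acl.can_read(user, table):
        return "denied"                 # blocked by the platform itself
    if policy.can_decrypt(user, table, column):
        return "plaintext"              # both layers agree
    return "ciphertext"                 # platform access alone is not enough


if __name__ == "__main__":
    acl = PlatformAcl(readers={"customers": {"cluster_admin", "analyst"}})
    policy = SecurityPolicy(grants={("analyst", "customers", "id_number")})
    for user in ("cluster_admin", "analyst", "guest"):
        print(user, read_column(user, "customers", "id_number", acl, policy))
```

In this sketch the cluster administrator can read the table but receives only ciphertext, which is the separation of duties the independent authority system is meant to provide.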

4. Ciphertext Retrieval and Ciphertext Calculation

  • Supports exact, fuzzy, and high-frequency ciphertext queries.
  • Overcomes the limitation that ciphertext can be searched only by exact conditions and restricted fuzzy conditions.
  • Provides ciphertext retrieval in all scenarios that is indistinguishable from plaintext retrieval. The cryptographic engine supports homomorphic algorithms such as Paillier and ElGamal.
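As an illustration of ciphertext calculation, the sketch below implements textbook Paillier encryption with the common simplification g = n + 1 and shows that multiplying two ciphertexts yields an encryption of the sum of their plaintexts. It is a toy under stated assumptions: the two primes are small, publicly known Mersenne primes chosen so the example runs instantly, the function names are made up for the example, and it says nothing about how the product's cryptographic engine is built. It needs Python 3.8 or later for modular inverses via `pow`.

```python
import math
import secrets

# Toy parameters only: two publicly known Mersenne primes. Real deployments
# use secret, randomly generated primes of 1024 bits or more.
P = 2**31 - 1
Q = 2**61 - 1


def keygen(p: int = P, q: int = Q):
    """Return (public modulus n, private key (lam, mu))."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                                # valid because g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt 0 <= m < n with fresh randomness r."""
    if not 0 <= m < n:
        raise ValueError("message out of range")
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # (n + 1)^m mod n^2 equals 1 + m*n, so no exponentiation is needed for g^m.
    return (1 + m * n) * pow(r, n, n2) % n2


def decrypt(n: int, private_key, c: int) -> int:
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return (x - 1) // n * mu % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Ciphertext of (m1 + m2) mod n, computed without decrypting."""
    return c1 * c2 % (n * n)


def multiply_by_constant(n: int, c: int, k: int) -> int:
    """Ciphertext of (k * m) mod n for a plaintext constant k."""
    return pow(c, k, n * n)


if __name__ == "__main__":
    n, priv = keygen()
    total = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
    print(decrypt(n, priv, total))   # 42
```

A sum over an encrypted column can therefore be computed by a party that holds only the public modulus, with decryption of the single result left to whoever holds the private key.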

Solution Features





Application Layer Data Encryption

Data on the platform is stored, processed, and transported as ciphertext and is decrypted to plaintext only when required. This eliminates the insufficient protection that results from encrypting only at lower layers.

Centralized Management and Distributed Encryption

Keys and authorities are managed centrally, while sensitive data is encrypted in a distributed manner using the high-performance computing capacity of the big data cluster.

Support Multi-platform Applications

The solution supports CDH, Apache Hadoop, Huawei FusionInsight, H3C Dataengine and other big data platforms.


The solution has obtained the Commercial Cryptography Certification and meets the requirements of relevant policies and regulations by using Chinese cryptographic algorithms.

Applicable Fields





This solution is suitable for the finance, government affairs, public security, energy, education, healthcare, and enterprise sectors.

Application Cases















Copyright © Sansec all rights reserved.

400-00-90196 / 010-59785977

Room 1406, 14/F, Building 2, Yard 16, North Guangshun Street, Chaoyang District, Beijing [100102]
