A Survey on Homomorphic Encryption in Cloud Security

Kavitha C.R.1, Bharati Harsoor2

1Research Scholar, Dept. of Information Science & Engineering, PDACE, Kalburgi, India

2Professor, Dept. of Information Science & Engineering, PDACE, Kalburgi, India

[email protected], [email protected]

Abstract- An outsourcing of data is increasing the amount of data storage and management in Cloud. This raises many new challenges of privacy concerns for individuals and businesses alike. One of the approaches to handle the privacy concerns is that the users send the encrypted data on to the cloud. Homomorphic encryption scheme is used so that the cloud can perform meaningful computations on the data. Fully Homomorphic Encryption allows arbitrary computation on encrypted data. In the last few years, many solutions for fully homomorphic encryption have been proposed and improved upon, but it is hard to prove the efficiency. In this paper, an analysis has been made to exhibit a number of real-world applications. Also, a number of application-specific optimizations to the encryption schemes have been shown. The system must be efficient without compromising the security services required in the cloud environment.

Keywords: Fully Homomorphic Encryption, privacy, security

1 INTRODUCTION

In cloud computing, Data sharing is an essential aspect for secure, efficient and flexible sharing of data with the other authorized users. The cloud storage development as well as computing platforms supports users to outsource storage and computations on the data. However, concerns over loss of privacy and business value of private data are always a challenge to the adoption of cloud services by consumers and businesses alike. The approaches to privacy protection in data storage phase are based on encryption procedures. Encryption based techniques can be classified into Identity Based Encryption (IBE), Attribute Based Encryption (ABE) and storage path encryption. An excellent way to address these privacy concerns is to store all data in the cloud encrypted, and perform computations on encrypted data. Therefore, we need an encryption scheme that allows meaningful computation on encrypted data, namely a homomorphic encryption scheme. Fully Homomorphic Encryption (FHE) schemes allow arbitrary computation on encrypted data 1.

The simplest encryption scheme for which one can hope to achieve security is the Caesar which is not secure. The conventional public-key encryption schemes with modular exponentiations are secure, but modular exponentiation is not a very simple operation. If we were to forget our current schemes and start from scratch, perhaps something like the following scheme would be a good candidate for a simple symmetric encryption scheme:

KeyGen: The key is an odd integer, chosen from some interval p ? 2n-1, 2n.

Encrypt (p, m): To encrypt a bit m ? {0, 1}, set the ciphertext as an integer whose residue mod p has the same parity as the plaintext. Namely, set c = pq + 2r + m, where the integers q, r

are random in some other prescribed intervals, such that 2r < p/2 in absolute value.

Decrypt (p, c): Output (c mod p) mod 2.

We stress that our main motivation is conceptual simplicity – namely, to demonstrate that even something as complex as fully homomorphic encryption can be achieved using “elementary” techniques.

2 related work

There are many approaches to privacy preservation storage on cloud. When data are stored on cloud, data security consists of three dimensions, confidentiality, integrity and availability 2. If data confidentiality or integrity is breached it will have a direct effect on users privacy. Availability of information ensures that authorized parties are able to access the information when required. A fundamental requirement for big data storage system is to protect the privacy of an individual. There are some existing mechanisms to fulfil that requirement. For example, a sender can encrypt the data using pubic key encryption (PKE) in a manner that only the valid recipient can decrypt the data. The approaches to safeguard the privacy of the user when data are stored on the cloud are as follows 3:

•?Attribute based encryption (ABE) Access control is based on the identity of a user complete access over all resources.

•?Homomorphic encryption can be deployed in IBE or ABE scheme settings updating cipher text receiver is possible.

•?Storage path encryption secures storage of big data on clouds.

•?Usage of Hybrid clouds Hybrid cloud is a cloud computing environment which utilizes a blend of private cloud and third-party, public cloud services with organization between the two platforms.

3 Motivation

The major deficiencies of several modern techniques have given us the necessary challenges to carry out the current investigation on the Cloud Security.

An appropriate cryptography technique is suggested to attain safe data storage and transaction in the cloud computing.

The data privacy problem faced in the course of the third party auditing cannot be steered clear of completely with the encryption technique, though it may be just converted into the composite key management domain.

The data safety guaranteed by means of the Privacy-Preserving Public Auditing is not effective.

It may not be easily possible for any user to access the entire data of significance from the cloud data centre. Because several cloud service providers accumulate the needed data. Therefore, a gloom of doubt pervades the users, when the data is accessed through the medium of the cloud service providers.

PRELIMINARIES

Homomorphic Public Key Encryption scheme E has four algorithms: the usual KeyGen, Encrypt, Decrypt and an additional algorithm Evaluate. The algorithm Evaluate takes as input a public key pk, a circuit C, a tuple of ciphertexts c1 = <c1, . . . , ct > (one for every input bit of C), and outputs another ciphertext c.

Fully Homomorphic Encryption scheme E = (KeyGen, Encrypt, Decrypt, Evaluate) is homomorphic for a class C1 of circuits if it is correct for all circuits C ? C1. E is fully homomorphic if it is correct for all boolean circuits.

Lauter and colleagues implemented only the most efficient parts of a fully homomorphic encryption system. As a result, they’ve produced a system dubbed “somewhat” homomorphic that can only perform some calculations, but is speedy enough to be used in real software. Only additions and a few multiplications can be done on a piece of encrypted data sent to the system, but that’s enough for many services, says Lauter. Still can do a lot of statistical functions and perform analysis like logistical regression, which is used to do things like predict how likely a person is to have a heart attack. As techniques for fully homomorphic encryption evolve, it might be possible to gradually increase the complexity of calculations that can be performed practically.

Group Homomorphic Encryption (GHE) schemes are

public key encryptions that allow to compute an operation on ciphertexts being equivalent to some binary operation on the corresponding plaintexts 7.

Somewhat Homomorphic Encryption (SHE) schemes

allow a specific class of functions to be evaluated on

ciphertexts. Usually this scheme supports an arbitrary number of a type of operation but only a limited number of second categories of operations. FHE scheme is an extended form of group homomorphic encryption.

Parallel homomorphic encryption (PHE) schemes are encryption schemes that support computation over encrypted data through the use of an evaluation algorithm that can be efficiently executed in parallel. Using a PHE scheme, a client can outsource the evaluation of a function f on some private input x to a cluster of w machines as follows. The client encrypts x and sends the ciphertext and f to the controller. Using the ciphertext, the controller generates n jobs that it distributes to the workers and, as above, the workers execute their jobs in parallel. When the entire computation is finished, the client receives a ciphertext which it decrypts to recover f(x). To handle cases where even the function f must be hidden, we introduce a second variant of PHE which we refer to as delegated PHE. This variant includes an additional token generation algorithm that takes as input f and outputs a token ? that reveals no information about f but that, nonetheless, can be used by the evaluation algorithm (instead of f) to return an encryption of f(x).

MapReduce-Parallel Homomorphic Encryption MapReduce algorithm is divided into five (possibly probabilistic) algorithms: parse, map, partition, reduce and merge. The parse algorithm takes an n-bit input x and outputs a set of label/value pairs {(li, vi)}i. The map algorithm takes a single input pair (li, vi) and outputs a set of intermediate pairs {(?j , ?j)}j , where ?j is from some label space ?. Note that the intermediate pairs do not have to be from the same space as the input pairs. The partition algorithm takes as input a set of intermediate pairs and partitions it into the sets {Pl}l. The reduce algorithm takes as input a label ? and a partition Pl and returns a string z. Finally, the merge algorithm takes a set of strings {zr}r and returns an output y. For the algorithm to work properly, it is crucial that the map and reduce algorithm be stateless.

Paillier Encryption- Paillier Encryption is an additively homomorphic encryption proposed by Paillier 5. In this paper, we mainly focus on its additive homomorphism, which means it can compute a new ciphertext of plaintext (m1 + m2) based on the two ciphertexts of plaintext m1 and m2 without revealing the plaintexts. Specifically,

• Given m1 and m2, we can compute m1 +m2 = m1.

m2.

• Given m1 and a plaintext ?, we can compute ? · m1 = m1 ?.

where m describes the Paillier ciphertext of a plaintext m. More details of construction of Paillier encryption can be found in 5.

BGN Encryption- Boneh-Goh-Nissim (BGN) encryption is

a somewhat homomorphic encryption, which was proposed by Boneh 6. With the homomorphic property, it can compute an arbitrary number of additions and also one multiplication on encrypted data. Specifically, we have

• Given ?m1? and ?m2?, compute ?m1 + m2? = ?m1? ? ?m2?

• Given ?m1? and ?m2?, compute ?m1 × m2? = ?m1?? ?m2?where ?m? represents the BGN ciphertext of a plaintext m, ? and ? denote the corresponding operations in the ciphertext domain.

More construction details of BGN can be found in 6. We will use Paillier and BGN respectively for evaluating inner

products with different privacy requirements, which essentially leads to secure association rule mining. Understanding the homomorphic properties of these two primitives will be sufficient to follow the design of our protocols. The details and security proofs of Paillier and BGN can be found in 5 and 6 respectively.

Challenges of Executing Encrypted Programs

Conventional program which run on unencrypted data has

a long history. They configure the general purpose computer to perform operations as required to solve a problem. The instructions are the sequence of operations stored in the memory. However, algorithms on encrypted data bring forward many new challenges. We enumerate some of the important challenges:

Handing of arithmetic operations: The underlying

ALU operations supported by general purpose computers are traditional computations: like addition, multiplication, equality, comparisons, bitwise operators etc. On the other hand, encrypted programs demand the requirement of FHE operations, which are complex operations. Hence, translation of the code from an unencrypted to an encrypted program should be tackled efficiently.

Handling of conditional branches and detecting Termination conditions: The conditional operations are encrypted and hence branching and termination need to be handled depending on encrypted conditions.

Supporting recursive programs: Handling recursion with encrypted operations is another major challenge

since existing recursion stacks are unencrypted.

The main objective is to study the implementation of algorithms on encrypted data. For this reason, implementations of basic operations need to be defined. In

the next section, we shall discuss about the basic operations

required to implement standard algorithms and then those operations will be realized on encrypted data.

BASIC OPERATIONS FOR ALGORITHM HANDLING

AND THEIR ENCRYPTED VARIANTS

Any algorithm is consists of sequences of specific mathematical and logical manipulations. An algorithm consists of different variables and instructions to be executed on them (Execution of same instruction may be once or repeated in a loop). First step of translating any algorithm is realizing the variables in encrypted domain. The integer input variables will be translated to type encrypted variables.

Instructions on the variables are basically a single operation or function to be executed. To handle those operations on encrypted data, here we first identify the different class of unencrypted operators and then design their respective encrypted counterparts. According to the standard programming languages (like C), those manipulations or operations can be classified into different types:

Arithmetic operators perform addition, subtraction,

multiplication and division of two operands.

Relational operators checks if the values of two operands

are equal. These operators also decide the greater—lesser relations between two operands.

Logical operators include logical AND, OR and NOT

operations among the operands.

Bitwise operators perform bitwise operations among

the operands (like bitwise AND, OR operations).

Assignment operators assign values from right side

operands to left side operand.

PROPOSED WORK

The sequential homomorphic encryption involves the main step towards private outsourced computation; an increasing fraction of the computations performed in the cloud is on massive datasets and therefore requires the computation to be performed on clusters of machines. To address this, we make the following contributions:

PHE and delegated PHE (which also hides the function being evaluated). The MapReduce model of parallel computation is considered and formalizes MapReduce parallel PHE schemes. Given the practical importance of the MapReduce model and the emergence of cloud-based MapReduce clusters, we believe the study of MapReduce-parallel HE to be important and well motivated.

Consider new Randomized Reductions (RR) for univariate and multivariate polynomials with a small number of variables. The reduction for univariate polynomials is information theoretically secure while the reduction for multivariate polynomials is secure based on the multi-dimensional noisy curve reconstruction assumption 9.

A general transformation from any RR to a MR-parallel HE scheme given any public-key HE scheme that can evaluate the reductions’ recovery algorithm. If the RR works for any function within a class C, then the resulting MR-parallel scheme is C-homomorphic.

A delegated MR-parallel HE scheme for any function whose output values can be computed by evaluating a (single) univariate polynomial over the input values can be shown. This is based on the RR for univariate polynomials and makes black-box use of a 2DNF-HE scheme (i.e., a HE scheme that can handle a single multiplication and any number of additions) such as 10.

Construct a MR-parallel HE scheme for any function which consists of all functions whose outputs depend on at most m input elements. We note that the construction is not delegated. The scheme makes use of our RR for multivariate polynomials and makes black-box use of additively HE.

Finally, MR-parallel HE schemes can be used to evaluate various database queries on encrypted datasets including keyword search and set membership testing.

Methodology/Modules

Delegated Map Reduce Parallel Homomorphic Encryption Scheme is

Generate: probabilistic algorithm that takes security parameter s as input and returns a key K.

Encrypt: probabilistic algorithm that takes a key K and an input x from some message space X and returns a ciphertext c.

Token: probabilistic algorithm that takes a key K and a function f as input and returns a token T.

Parse: deterministic algorithm that takes a token T and a ciphertext c as input and returns a sequence of input pairs (li,vi).

Map: probabilistic algorithm that takes a pair (l,v) as input and returns a sequence of intermediate pairs (?,?).

Partition: probabilistic algorithm that takes a pair (?, ?) and returns a value h in some space H.

Reduce: probabilistic algorithm that takes a label ? and a partition P of intermediate values and returns an output pair (?, z).

Merge: deterministic algorithm that takes a set of output pairs and returns a ciphertext c1.

Decrypt: deterministic algorithm that takes a key K and a ciphertext c1 as input and that returns an output y.

Correctness: y = f(x).

APPLICATIONS

Simple database queries: The simplest query we can support

is for set membership. More precisely, if S ? F, the query Set(x, S) returns a n-bit string such that the ith bit indicates whether xi ? S.

Keyword search: A more complex query that can also be

performed with m = 2 is keyword search. Here, let x = (w1, v1), … (wn,vn) be a dataset where wi and vi are in F. The query KS(x, w) outputs a sequence (z1, .. zn) such that zi = vi if wi = w and such that zi = 0 if wi ? w. The KS query can be handled using the approach of 11 which for every pair (wi, vi) computes the polynomial Qw, r (wi, vi) = p1(vi) + r . p2(wi), where r ?$ F, p1(v) returns v|| 0k and p2(w) = 0.

Medical Applications- Private data and Public functions:

A private cloud medical records storage system (Patient Controlled Encryption) in which all data for a patient’s medical record is encrypted by the healthcare providers before being uploaded to the patient’s record in the cloud storage system. The patient controls sharing and access to the record by sharing secret keys with specific providers (features include a hierarchical structure of the record, ability to search the encrypted data, and various choices for how to handle key distribution). However this system does not provide for the cloud to do any computation other than search (exact keyword match, or possibly conjunctive searches). With the FHE implementation, the ability for the cloud to do computation on the encrypted data on behalf of the patient is added. Imagine a future where monitors or other devices may be constantly streaming data on behalf of the patient to the cloud. With FHE, the cloud can compute functions on the encrypted data and send the patient updates, alerts, or recommendations based on the received data. The functions to be computed in this scenario may include averages, standard deviations or other statistical functions such as logistical regression which can help predict the likelihood of certain dangerous health episodes. Encrypted input to the functions could include blood pressure or heart monitor or blood sugar readings, for example, along with information about the patient such as age, weight, gender, and other risk factors. The functions computed may not need to be private in this case since they may be a matter of public health and thus public.

Financial Applications- Private data and Private functions

In the financial industry there is a potential application scenario in which both the data and the function to be computed on the data is private and proprietary.

As an example, data about corporations, their stock price or their performance or inventory is often relevant to making investment decisions. Data may even be streamed on a continuous basis reflecting the most up-to-date information necessary for making decisions for trading purposes.

Functions which do computations on this data may be proprietary, based on new predictive models for stock price performance and these models may be the product of costly research done by financial analysts, so a company may want to keep these models private to preserve their advantage and their investment.

With FHE, some functions can be evaluated privately as follows. The customer uploads an encrypted version of the function to the cloud, for example a program where some of the evaluations involve encrypted inputs which are specified. The streaming data is encrypted to the customer’s public key and uploaded to the cloud. The cloud service evaluates the private function by applying the encrypted description of the program to the encrypted inputs it receives. After processing, the cloud returns the encrypted output to the customer.

Advertising and Pricing

Imagine an advertiser, for example a cosmetics company, who wants to use contextual information to target advertising to potential customers. The consumer uses a mobile phone as a computing device, and the device constantly uploads contextual information about the consumer, including location, the time of day, information from email or browsing activity such as keywords from email or browser searches. In the future, imagine that information is uploaded potentially constantly from video devices: either pictures of objects of interest such as brands or faces which are automatically identified, or from a video stream from a camera on the body which is identifying context in the room (objects, people, workplace vs. home vs. store). When contextual information is uploaded to the cloud server and made accessible to the cosmetics company, the company computes some function of the contextual data and determine which targeted advertisement to send back to the consumer’s phone.

Encrypted version: The problem with these scenarios is the invasion of privacy resulting from giving that much detailed information about the consumer to the server or to the advertising company. Now, imagine an encrypted version of this entire picture. All the contextual data is encrypted and then uploaded to the server; the advertiser uploads encrypted ads to the server; the server computes a function on the encrypted inputs which determines which encrypted ad to send

to the consumer; this function could be either private/proprietary or not. All contextual data and all ads are encrypted to the consumer’s public key. Then the cloud can operate and compute on this data, and the consumer can decrypt the received ad. As long as the cloud service provider does not collude with the advertisers, and semantically secure FHE encryption is employed, the cloud and the advertisers don’t learn anything about the consumer’s data.

Conclusion

We have described various homomorphic encryption schemes that use only simple integer arithmetic operations. The primary open problem is to improve the efficiency of the scheme, to the extent that it is possible.

FUTURE SCOPE

Optimizing communication with the cloud: A solution to help mitigate the problem of the large ciphertext size based FHE solution. In any of the above applications, a client communicates with the cloud service and uploads its data encrypted under a FHE scheme, and the cloud operates on these data and returns encrypted outputs to the client. Each ciphertext has size n log(q), and for functions requiring a large number of multiplications, q and n could be very large. The solution to this is two-fold. First, all encryptions that the client sends to the server can be encrypted using (a semantically secure encryption scheme based on) AES (which, by itself, is not homomorphic at all). The main observation is that the steps of AES decryption can all be carried out on FHE-encrypted entries. A onetime set-up cost is that the client uploads the FHE-encryption of its AES secret key K:

“Client sends FHE(K) to Cloud”

Then for each piece of content m to be uploaded to the cloud, the client uploads only the AES-encryption of m to the cloud, encrypted under its own secret key K.

“Client sends AESK(m) to Cloud”

Now the cloud is expected to operate on the inputs it receives for the client and compute and return FHE-encryptions of functions of those inputs. In order to do that, the cloud must first compute the FHE-encryption of m: in other words the cloud computes the public key FHE-encryption of the AES encryption of the content, FHE (AESK(m)), and must now unravel the AES encryption inside the FHE encryption to obtain FHE(m). Once this is done, the cloud computes the FHE encryption of f(m), for the appropriate function f.

There is still a snag in this solution, namely that the resulting ciphertext that the server returns to the client is still a large FHE ciphertext. The solution to this is the dimension education technique introduced by 12. In particular, the dimension reduction technique converts a ciphertext in Zqx/ (xn + 1) (where both n and q are large in order to support expressive homomorphisms) to a ciphertext in Zpx/ (xk + 1),

where both k and p are small. The resulting ciphertext encrypts the same message, although it does not support any further homomorphisms. The server then applies this transformation and sends the resulting short ciphertext to the client. In short, all the communication over the network consists of short, non-homomorphic cipher-texts. At the server’s end, the ciphertexts are first “upgraded” to homomorphic ciphertexts which are then computed on, and finally “downgraded” to short non-homomorphic ciphertexts which are then sent to the client.

11. References

Marten van Dijk, Craig Gentry, Shai Halevi and Vinod Vaikuntanathan, Fully Homomorphic Encryption over the Integers, MIT and IBM Research, June 2010.

Xiao Z, Xiao Y. Security and privacy in cloud computing. In: IEEE Trans on communications surveys and tutorials, vol 15, no. 2, 2013. pp. 843–59.

Mehmood A, Natgunanathan I, Xiang Y, Hua G, Guo S. Protection of big data privacy. In: IEEE translations and content mining are permitted for academic research 2016.

Priyank Jain, Manasi Gyanchandani and Nilay Khare, Big data privacy: a technological perspective and review, Journal of Big data, 2016.

P. Paillier, “Public-key cryptosystems based on composite degree residuosity classes.” Springer, 1999, pp. 223–238.

D. Boneh, E.-J. Goh, and K. Nissim, “Evaluating 2-dnf formulas on ciphertexts,” in Theory of cryptography. Springer, 2005, pp. 325–341.

S. Rass and D. Slamanig, Cryptography for Security and Privacy in Cloud Computing. Norwood, MA, USA: Artech House, Inc., 2013.

Chatterjee and Sengupta: Translating Algorithms to Handle Fully Homomorphic Encrypted Data on the Cloud, IEEE Transactions on Cloud Computing, Vol. 6, No. 1, 2018.

Y. Ishai, E. Kushilevitz, R. Ostrovsky, and A. Sahai. Cryptography from anonymity. In IEEE Symposium on Foundations of Computer Science (FOCS ’06). IEEE Computer Society, 2006.

C. Gentry, S. Halevi, and V. Vaikuntanathan. A simple bgn-type cryptosystem from lwe. In Advances in Cryptology – EUROCRYPT ’10, pages 506{522. Springer, 2010.

M. Freedman, Y. Ishai, B. Pinkas, and O. Reingold. Keyword search and obliviosu pseudorandom functions. In Theory of Cryptography Conference (TCC ’05). Springer, 2005.

Zvika Brakerski and Vinod Vaikuntanathan. Efficient fully homomorphic

encryption from (standard) LWE. FOCS, 2011.