
Fuzzy Keyword Search over Encrypted Data in Cloud Computing

Pranjuli Yavatkar, Nikita Patil, Sneha Kale
Department of Computer Engineering, Bharati Vidyapeeth College of Engineering, Navi Mumbai, Maharashtra

Abstract - As Cloud Computing becomes prevalent, more and more sensitive information is being centralized into the cloud. To protect data privacy, sensitive data usually has to be encrypted before outsourcing, which makes effective data utilization a very challenging task. Although traditional searchable encryption schemes allow a user to securely search over encrypted data through keywords and selectively retrieve files of interest, these techniques support only exact keyword search. That is, there is no tolerance of minor typos and format inconsistencies, which are typical of user searching behavior and happen very frequently.

This significant drawback makes existing techniques unsuitable in Cloud Computing, as it greatly affects system usability, rendering the user's search experience very frustrating and system efficacy very low. In this paper, for the first time we formalize and solve the problem of effective fuzzy keyword search over encrypted cloud data while maintaining keyword privacy. Fuzzy keyword search greatly enhances system usability by returning the matching files when users’ search inputs exactly match the predefined keywords, or the closest possible matching files based on keyword similarity semantics when exact match fails.

KEYWORDS: Encryption, Fuzzy Keyword, Cloud Computing

I. INTRODUCTION

As Cloud Computing becomes prevalent, more and more sensitive information is being centralized into the cloud, such as emails, personal health records, government documents, etc. By storing their data in the cloud, the data owners can be relieved from the burden of data storage and maintenance so as to enjoy the on-demand high quality data storage service. However, the fact that data owners and cloud server are not in the same trusted domain may put the outsourced data at risk, as the cloud server may no longer be fully trusted. It follows that sensitive data usually should be encrypted prior to outsourcing for data privacy and combating unsolicited accesses. However, data encryption makes effective data utilization a very challenging task given that there could be a large amount of outsourced data files. Moreover, in Cloud Computing, data owners may share their outsourced data with a large number of users. The individual users might want to only retrieve certain specific data files they are interested in during a given session.

One of the most popular ways is to selectively retrieve files through keyword-based search instead of retrieving all the encrypted files back, which is completely impractical in cloud computing scenarios. Such keyword-based search techniques allow users to selectively retrieve files of interest and have been widely applied in plaintext search scenarios, such as Google search. Unfortunately, data encryption restricts the user’s ability to perform keyword search and thus makes the traditional plaintext search methods unsuitable for Cloud Computing. Besides this, data encryption also demands the protection of keyword privacy, since keywords usually contain important information related to the data files.

Although encryption of keywords can protect keyword privacy, it further renders the traditional plaintext search techniques useless in this scenario. To securely search over encrypted data, searchable encryption techniques have been developed in recent years. Searchable encryption schemes usually build up an index for each keyword of interest and associate the index with the files that contain the keyword. By integrating the trapdoors of keywords within the index information, effective keyword search can be realized while both file content and keyword privacy are well preserved. Although they allow searches to be performed securely and effectively, the existing searchable encryption techniques do not suit cloud computing scenarios since they support only exact keyword search. That is, there is no tolerance of minor typos and format inconsistencies. It is quite common that users’ search input does not exactly match the pre-set keywords due to possible typos, representation inconsistencies, and/or their lack of exact knowledge about the data.

The naive way to support fuzzy keyword search is through simple spell check mechanisms. However, this approach does not completely solve the problem and can be ineffective for the following reasons. On the one hand, it requires additional user interaction to determine the correct word from the candidates generated by the spell check algorithm, which unnecessarily costs the user extra effort. On the other hand, if the user accidentally types some other valid keyword by mistake (for example, searching for “hat” by carelessly typing “cat”), the spell check algorithm would not work at all, as it cannot differentiate between two valid words. Thus, the drawbacks of existing schemes signify the need for new techniques that support searching flexibility, tolerating both minor typos and format inconsistencies. In this paper, we focus on enabling effective yet privacy-preserving fuzzy keyword search in Cloud Computing.

To the best of our knowledge, we formalize for the first time the problem of effective fuzzy keyword search over encrypted cloud data while maintaining keyword privacy. Fuzzy keyword search greatly enhances system usability by returning the matching files when users’ search inputs exactly match the predefined keywords, or the closest possible matching files based on keyword similarity semantics when exact match fails. More specifically, we use edit distance to quantify keyword similarity and develop a novel technique, i.e., a wildcard-based technique, for the construction of fuzzy keyword sets. This technique eliminates the need for enumerating all the fuzzy keywords, and the resulting size of the fuzzy keyword sets is significantly reduced. Based on the constructed fuzzy keyword sets, we propose an efficient fuzzy keyword search scheme.
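As an illustration of the similarity metric, the sketch below computes edit distance under the assumption that the standard Levenshtein definition is meant (single-character insertion, deletion, and substitution, each costing 1). The function name `edit_distance` is chosen here for illustration and does not come from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (assumed definition).

    Counts the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b.
    """
    # prev[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute ca with cb, or match
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("hat", "cat"))       # one substitution apart
    print(edit_distance("CASTLE", "CASLE"))  # one deletion apart
```

Under this metric, “hat” and “cat” are at distance 1, which is why a spell checker that only flags non-words cannot help with that kind of slip, while a distance-bounded fuzzy search can.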

II. RELATED WORK

Plaintext fuzzy keyword search: Recently, the importance of fuzzy search has received attention in the context of plaintext searching in the information retrieval community [11]–[13]. These works address the problem in the traditional information access paradigm by allowing the user to search without using a try-and-see approach for finding relevant information based on approximate string matching. At first glance, it seems possible to directly apply these string matching algorithms to the context of searchable encryption by computing the trapdoors on a character basis within an alphabet. However, this trivial construction suffers from dictionary and statistics attacks and fails to achieve search privacy.

Searchable encryption: Traditional searchable encryption has been widely studied in the context of cryptography.

Among those works, most focus on efficiency improvements and security definition formalizations. The first construction of searchable encryption was proposed by Song et al., in which each word in the document is encrypted independently under a special two-layered encryption construction. Goh proposed to use Bloom filters to construct the indexes for the data files. To achieve more efficient search, Chang et al. and Curtmola et al. both proposed similar “index” approaches, where a single encrypted hash table index is built for the entire file collection.

In the index table, each entry consists of the trapdoor of a keyword and an encrypted set of file identifiers whose corresponding data files contain the keyword. As a complementary approach, Boneh et al. presented a public-key based searchable encryption scheme.
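To make the index-table idea concrete, here is a minimal sketch of an exact-match index keyed by keyword trapdoors. It is not the construction of any of the cited works: the use of HMAC-SHA256 as the trapdoor function and the names `trapdoor`, `build_index`, and `search` are assumptions made for illustration, and the file identifier sets are left unencrypted for brevity, whereas the schemes described above store them encrypted.

```python
import hashlib
import hmac


def trapdoor(sk: bytes, keyword: str) -> bytes:
    # Assumption: a keyed hash stands in for the scheme's trapdoor function.
    return hmac.new(sk, keyword.encode("utf-8"), hashlib.sha256).digest()


def build_index(sk: bytes, keyword_to_files: dict) -> dict:
    # One entry per keyword: trapdoor -> set of file identifiers.
    # A real scheme would encrypt the identifier set; omitted here.
    return {trapdoor(sk, w): set(fids) for w, fids in keyword_to_files.items()}


def search(index: dict, td: bytes) -> set:
    # The server sees only trapdoors, never plaintext keywords.
    return index.get(td, set())


if __name__ == "__main__":
    sk = b"\x00" * 32  # placeholder key for the demo only
    index = build_index(sk, {"castle": ["f1", "f2"], "tower": ["f3"]})
    print(search(index, trapdoor(sk, "castle")))  # exact keyword: files found
    print(search(index, trapdoor(sk, "castel")))  # one typo: nothing found
```

Because the lookup key is a keyed hash of the whole keyword, a single mistyped character yields an unrelated trapdoor and an empty result.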

Note that all these existing schemes support only exact keyword search, and thus are not suitable for Cloud Computing.

III. PROBLEM FORMULATION

A. System Model

In this paper, we consider a cloud data system consisting of data owner, data user and cloud server. Given a collection of $n$ encrypted data files $C = (F_1, F_2, \ldots, F_n)$ stored in the cloud server and a predefined set of distinct keywords $W = \{w_1, w_2, \ldots, w_p\}$, the cloud server provides the search service for the authorized users over the encrypted data $C$. We assume the authorization between the data owner and users is appropriately done. An authorized user types in a request to selectively retrieve data files of his/her interest. The cloud server is responsible for mapping the searching request to a set of data files, where each file is indexed by a file ID and linked to a set of keywords. The fuzzy keyword search scheme returns the search results according to the following rules: 1) if the user’s searching input exactly matches the pre-set keyword, the server is expected to return the files containing the keyword; 2) if there exist typos and/or format inconsistencies in the searching input, the server will return the closest possible results based on pre-specified similarity semantics.

An architecture of fuzzy keyword search is shown in Fig. 1.

B. Threat Model

We consider a semi-trusted server. Even though data files are encrypted, the cloud server may try to derive other sensitive information from users’ search requests while performing keyword-based search over $C$. Thus, the search should be conducted in a secure manner that allows data files to be securely retrieved while revealing as little information as possible to the cloud server. More specifically, it is required that nothing should be leaked from the remotely stored files and index beyond the outcome and the pattern of search queries.

C. Design Goals

In this paper, we address the problem of supporting efficient yet privacy-preserving fuzzy keyword search services over encrypted cloud data. Specifically, we have the following goals: i) to explore a new mechanism for constructing storage-efficient fuzzy keyword sets; ii) to design an efficient and effective fuzzy search scheme based on the constructed fuzzy keyword sets; iii) to validate the security of the proposed scheme.

IV. CONSTRUCTIONS OF EFFECTIVE FUZZY KEYWORD SEARCH IN CLOUD

The key idea behind our secure fuzzy keyword search is two-fold: 1) building up fuzzy keyword sets that incorporate not only the exact keywords but also the ones differing slightly due to minor typos, format inconsistencies, etc.; 2) designing an efficient and secure searching approach for file retrieval based on the resulting fuzzy keyword sets.

A. Advanced Technique for Constructing Fuzzy Keyword Sets

To provide more practical and effective fuzzy keyword search constructions with regard to both storage and search efficiency, we now propose an advanced technique to improve the straightforward approach for constructing the fuzzy keyword set. Without loss of generality, we will focus on the case of edit distance $d = 1$ to elaborate the proposed advanced technique.

For larger values of $d$, the reasoning is similar. Note that the technique is carefully designed in such a way that while suppressing the fuzzy keyword set, it will not affect the search correctness.

Wildcard-based Fuzzy Set Construction

In the straightforward approach, all the variants of the keywords have to be listed even if an operation is performed at the same position. Based on this observation, we propose to use a wildcard to denote edit operations at the same position. The wildcard-based fuzzy set of $w_i$ with edit distance $d$ is denoted as $S_{w_i,d} = \{S'_{w_i,0}, S'_{w_i,1}, \cdots, S'_{w_i,d}\}$, where $S'_{w_i,\tau}$ denotes the set of variants of $w_i$ with $\tau$ wildcards.

Note that each wildcard represents an edit operation on $w_i$. For example, for the keyword CASTLE with the pre-set edit distance 1, its wildcard-based fuzzy keyword set can be constructed as $S_{CASTLE,1}$ = {CASTLE, *CASTLE, *ASTLE, C*ASTLE, C*STLE, ···, CASTL*E, CASTL*, CASTLE*}. The total number of variants on CASTLE constructed in this way is only 13 + 1, instead of 13 × 26 + 1 as in the exhaustive enumeration approach when the edit distance is set to 1.
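The CASTLE example can be reproduced with a short sketch for the case $d = 1$. The function name `wildcard_set_d1` is hypothetical, and the code covers only edit distance 1; it places a wildcard either in a gap between characters or in place of a character, which is how the variants listed above appear to be formed.

```python
def wildcard_set_d1(word: str) -> set:
    """Wildcard-based fuzzy set of `word` for edit distance 1 (sketch)."""
    variants = {word}  # the exact keyword (zero wildcards)
    # Wildcard placed in each of the len(word) + 1 gaps,
    # e.g. *CASTLE, C*ASTLE, ..., CASTLE*
    for i in range(len(word) + 1):
        variants.add(word[:i] + "*" + word[i:])
    # Wildcard replacing each of the len(word) characters,
    # e.g. *ASTLE, C*STLE, ..., CASTL*
    for i in range(len(word)):
        variants.add(word[:i] + "*" + word[i + 1:])
    return variants


if __name__ == "__main__":
    s = wildcard_set_d1("CASTLE")
    print(len(s))
    print(sorted(s))
```

For CASTLE this yields 7 gap variants, 6 replacement variants, and the exact keyword, i.e. 13 + 1 = 14 entries, matching the count stated in the text.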

Generally, for a given keyword $w_i$ with length $l$, the size of $S_{w_i,1}$ will be only $2l + 1 + 1$, as compared to $(2l + 1) \times 26 + 1$ obtained in the straightforward approach. The larger the pre-set edit distance, the more storage overhead can be reduced: with the same setting as the example in the straightforward approach, the proposed technique can help reduce the storage of the index from 30GB to approximately 40MB. When the edit distance is set to 2 and 3, the sizes of $S_{w_i,2}$ and $S_{w_i,3}$ will be $C_{l+1}^{1} + C_{l}^{1} \cdot C_{l}^{1} + 2C_{l+2}^{2}$ and $C_{l}^{1} + C_{l}^{3} + 2C_{l}^{2} + 2C_{l}^{2} \cdot C_{l}^{1}$, respectively.

In other words, the number is only $O(l^d)$ for a keyword with length $l$ and edit distance $d$.

B. The Efficient Fuzzy Keyword Search Scheme

Based on the storage-efficient fuzzy keyword sets, we show how to construct an efficient and effective fuzzy keyword search scheme.

The scheme of the fuzzy keyword search goes as follows:

1) To build an index for $w_i$ with edit distance $d$, the data owner first constructs a fuzzy keyword set $S_{w_i,d}$ using the wildcard-based technique. Then he computes the trapdoor set $\{T_{w'_i}\}$ for each $w'_i \in S_{w_i,d}$ with a secret key $sk$ shared between data owner and authorized users. The data owner encrypts $FID_{w_i}$ as $Enc(sk, FID_{w_i} \| w_i)$. The index table $\{(\{T_{w'_i}\}_{w'_i \in S_{w_i,d}}, Enc(sk, FID_{w_i} \| w_i))\}_{w_i \in W}$ and encrypted data files are outsourced to the cloud server for storage;
2) To search with $(w, k)$, the authorized user computes the trapdoor set $\{T_{w'}\}_{w' \in S_{w,k}}$, where $S_{w,k}$ is also derived from the wildcard-based fuzzy set construction. He then sends $\{T_{w'}\}_{w' \in S_{w,k}}$ to the server;
3) Upon receiving the search request $\{T_{w'}\}_{w' \in S_{w,k}}$, the server compares them with the index table and returns all the possible encrypted file identifiers $\{Enc(sk, FID_{w_i} \| w_i)\}$ according to the fuzzy keyword definition in Section III-D. The user decrypts the returned results and retrieves relevant files of interest.

In this construction, the technique of constructing the search request for $w$ is the same as the construction of the index for a keyword.
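The three steps above can be sketched end to end for edit distance 1. Everything concrete in this sketch is an assumption made for illustration: HMAC-SHA256 stands in for the trapdoor function, `toy_enc`/`toy_dec` are an insecure stand-in for $Enc$ (a hash-derived XOR keystream, used only because the Python standard library has no symmetric cipher), and all function names are hypothetical.

```python
import hashlib
import hmac
import os


def wildcard_set_d1(word):
    # Wildcard-based fuzzy set for edit distance 1 (gaps and replacements).
    variants = {word}
    for i in range(len(word) + 1):
        variants.add(word[:i] + "*" + word[i:])
    for i in range(len(word)):
        variants.add(word[:i] + "*" + word[i + 1:])
    return variants


def trapdoor(sk, word):
    # Assumption: keyed hash as the trapdoor T_w.
    return hmac.new(sk, word.encode("utf-8"), hashlib.sha256).digest()


def _keystream(sk, nonce, n):
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(sk + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def toy_enc(sk, plaintext):
    # TOY stand-in for Enc(sk, .): NOT secure, for illustration only.
    nonce = os.urandom(16)
    ks = _keystream(sk, nonce, len(plaintext))
    return nonce + bytes(p ^ k for p, k in zip(plaintext, ks))


def toy_dec(sk, blob):
    nonce, body = blob[:16], blob[16:]
    ks = _keystream(sk, nonce, len(body))
    return bytes(c ^ k for c, k in zip(body, ks))


def build_index(sk, keyword_to_fids):
    # Step 1 (data owner): one entry per keyword w_i, holding the trapdoor
    # set of its fuzzy variants and Enc(sk, FID_wi || w_i).
    index = []
    for w, fids in keyword_to_fids.items():
        tds = {trapdoor(sk, v) for v in wildcard_set_d1(w)}
        payload = (",".join(fids) + "||" + w).encode("utf-8")
        index.append((tds, toy_enc(sk, payload)))
    return index


def make_request(sk, query):
    # Step 2 (user): trapdoor set over the query's own fuzzy set.
    return {trapdoor(sk, v) for v in wildcard_set_d1(query)}


def server_search(index, request):
    # Step 3 (server): return entries sharing at least one trapdoor.
    return [blob for tds, blob in index if tds & request]


def decrypt_results(sk, blobs):
    # User side: recover (keyword, file ids) from each returned entry.
    results = []
    for blob in blobs:
        fids, _, keyword = toy_dec(sk, blob).decode("utf-8").partition("||")
        results.append((keyword, fids.split(",")))
    return results


if __name__ == "__main__":
    sk = os.urandom(32)
    index = build_index(sk, {"CASTLE": ["f1", "f7"], "TOWER": ["f2"]})
    hits = server_search(index, make_request(sk, "CASLE"))  # typo: missing T
    print(decrypt_results(sk, hits))
```

In this sketch the mistyped query CASLE still reaches the CASTLE entry because both fuzzy sets contain the variant CAS*LE, so their trapdoor sets intersect even though the server never sees either word.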

As a result, the search request is a trapdoor set based on $S_{w,k}$, instead of a single trapdoor as in the straightforward approach. In this way, the correctness of the search result can be ensured.

V. CONCLUSION

In this paper, we formalize and solve the problem of supporting efficient yet privacy-preserving fuzzy search for achieving effective utilization of remotely stored encrypted data in Cloud Computing. We design an advanced technique (i.e., wildcard-based technique) to construct the storage-efficient fuzzy keyword sets by exploiting a significant observation on the similarity metric of edit distance. Based on the constructed fuzzy keyword sets, we further propose an efficient fuzzy keyword search scheme. Through rigorous security analysis, we show that our proposed solution is secure and privacy-preserving, while correctly realizing the goal of fuzzy keyword search.

REFERENCES

[1] D. Song, D. Wagner, and A. Perrig, “Practical techniques for searches on encrypted data,” in Proc. of IEEE Symposium on Security and Privacy’00, 2000.
[2] E.-J. Goh, “Secure indexes,” Cryptology ePrint Archive, Report 2003/216, 2003, http://eprint.iacr.org/.
[3] “Fuzzy keyword search over encrypted data in cloud computing,” Illinois Institute of Technology, ISSN: 2321-8134.
[4] Y.-C. Chang and M. Mitzenmacher, “Privacy preserving keyword searches on remote encrypted data,” in Proc. of ACNS’05, 2005.
[5] D. Boneh, G. D. Crescenzo, R. Ostrovsky, and G. Persiano, “Public key encryption with keyword search,” in Proc. of EUROCRYPT’04, 2004.
[6] R. Curtmola, J. A. Garay, S. Kamara, and R. Ostrovsky, “Searchable symmetric encryption: improved definitions and efficient constructions,” in Proc. of ACM CCS’06, 2006.
[7] D. Boneh and B. Waters, “Conjunctive, subset, and range queries on encrypted data,” in Proc. of TCC’07, 2007, pp. 535–554.
[8] N. Gala, “Review paper on fuzzy search over encrypted data in cloud computing,” ISSN: 2393-9842.
[9] F. Bao, R. Deng, X. Ding, and Y. Yang, “Private query on encrypted data in multi-user settings,” in Proc. of ISPEC’08, 2008.
[10] C. Li, J. Lu, and Y. Lu, “Efficient merging and filtering algorithms for approximate string searches,” in Proc. of ICDE’08, 2008.

