Friday, September 20, 2019
Obfuscated Data Storage For Cloud Environment
Ravi Pandey and Kamlesh Chandra Purohit

Abstract. Data storage is one of the most attractive services offered by cloud service providers. Despite the benefits of cloud computing, threats to data confidentiality, integrity and availability may stop data owners from switching to a cloud environment. Handing data over to a third party to store and manage raises security issues, because the owner of the data cannot deploy its own security policies on the storage service provider's premises, and a storage service provider may misuse the client's data. Since data in a cloud environment is stored on the service provider's premises, there should be a mechanism that hides the meaning of the data from the service provider and from any other unauthorized entity. In this paper, we propose a mechanism that combines existing schemes such as erasure correcting codes, AES and SHA256 with some new techniques to provide data security guarantees against any unauthorized entity.

Keywords: Data storage service, cloud computing, erasure correcting code, AES, SHA256.

1. Introduction

Cloud computing, popular for its pay-as-you-go model, attracts enterprises and individuals to host their data in a cloud environment. According to IDC's 2012 North American CloudTrack Survey, "more than 30% of organizations expect that within five years, the majority of their IT capability will be delivered through public cloud services and that within three years, they will access 45.5% of IT resources through some form of cloud – public, private, or hybrid". Because cloud computing has advantages for both providers and users, it is developing at a rapid pace and is predicted to be adopted by a large number of users in the near future [1]. Cloud storage is an important cloud computing service, which allows data owners to move data from their local computing systems to the cloud [2].
The demand for storage space increases every day because data is generated at a very high rate. According to the IDC Digital Universe Study of June 2011, "In 2011, the amount of information created and replicated will surpass 1.8 zettabytes (1.8 trillion gigabytes), growing by a factor of nine in just five years. That's nearly as many bits of information in the digital universe as stars in the physical universe". Various surveys show that data security is the foremost concern among clients who wish to move their data to the cloud. Existing encryption schemes provide security guarantees for data in transit over the network, where the data resides only for a very short time. Data stored in a cloud environment, in contrast, resides on the service provider's storage premises for a long time, so the service provider or any of its employees can attempt a brute force attack to extract information from the client's stored data. A cloud environment may contain internal attackers, such as employees of the service provider who behave dishonestly. Applications hosted by other clients on the same server may also be malicious and try to access the data of other clients. It is difficult for an external attacker to target specific data, because the location where the data is stored in the cloud is not known to the attacker, but the attacker can still attack data at random. Therefore, the client expects a secure network over which to upload data, an honest service provider, and a third party auditor who takes responsibility for checking the integrity of the data stored on the cloud server. This paper is an extension of our previous paper []. Here we explain the proposed scheme in more detail and analyze the complexity of our algorithm.

2. System Model

The cloud storage auditing system consists of three entities: the client, the cloud server and the third party auditor. The client is the owner of the data to be stored in the cloud. The client generates the data to be hosted in the cloud and can access, modify or delete that data.
The cloud storage server stores the data and provides mechanisms to access, modify or delete it.

Fig. 1. Cloud data storage architecture.

Storage servers are geographically distributed, and data is stored redundantly on multiple servers for security reasons. The third party auditor is an authorised system that checks the integrity of the stored data. Data flows between any pair of entities in encrypted form. The system is prone to internal and external attacks; other issues such as hardware failures, software bugs and networking problems may also affect it. We believe that the third party auditing scheme proposed by many researchers can, with some modification, make the whole cloud storage environment more reliable and secure.

3. Design Goal

The key problem in existing schemes is that data is stored on the cloud storage server in a meaningful form. Our design goal is to obfuscate the data before uploading it to the cloud server. After that, we use existing data encryption techniques and a hashing algorithm to provide user authentication and ensure data integrity.

4. Proposed Work

4.1. Data Obfuscation Algorithm

Let F be the private file that is to be uploaded to the cloud environment. (We can view F as an array of bytes indexed from 0 to Flength - 1.) Select a key K, an array containing the 10 digits 0 to 9, each exactly once. Initialize 10 files f0, f1, f2, f3, …, f9, which we call file components. For each byte F[i] of file F, compute j = i % 10, look up K[j], and append the byte F[i] to the component fK[j].

Figure 1. Demonstration of the file F, which is to be uploaded in the cloud environment.

Figure 2. Demonstration of the key K.

Figure 3. Demonstration of the file component fj.

4.2. Program Code for data obfuscation algorithm

Python program code for splitting a file into components:
    # Source file and component files (paths as in the original listing).
    # Raw strings keep "\a" and "\115" from being read as escape sequences.
    # Note: this listing uses nine components; the fixed order myL plays the role of the key.
    path = r"I:\abcd"
    names = ["115", "116", "117", "120", "121", "123", "124", "125", "126"]
    khol = open(path, "rb")
    clones = [open("I:\\" + name, "wb") for name in names]  # clone1 .. clone9
    # Write order: clone3, clone5, clone2, clone7, clone1, clone9, clone4, clone8, clone6.
    order = [3, 5, 2, 7, 1, 9, 4, 8, 6]
    myL = [clones[i - 1] for i in order]

    count = 0
    byte = khol.read(1)
    while byte:
        # Byte number count is appended to the component at position count % 9.
        myL[count % 9].write(byte)
        count = count + 1
        byte = khol.read(1)

    khol.close()
    for clone in clones:
        clone.close()

Python program code for regenerating the main file from its components:

    names = ["115", "116", "117", "120", "121", "123", "124", "125", "126"]
    clones = [open("I:\\" + name, "rb") for name in names]  # clone1 .. clone9
    recover = open(r"I:\abcd", "wb")
    # The read order must match the write order used when splitting.
    order = [3, 5, 2, 7, 1, 9, 4, 8, 6]
    myL = [clones[i - 1] for i in order]

    done = False
    while not done:
        for x in myL:
            byte = x.read(1)
            if not byte:
                # The first exhausted component marks the end of the original file.
                done = True
                break
            recover.write(byte)

    for clone in clones:
        clone.close()
    recover.close()

4.3. Erasure Correcting Code

After dividing file F into 10 components, we use an erasure correcting code [] to achieve data availability against Byzantine failure. Erasure coding (EC) is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces, and stored across a set of different locations or storage media.
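The idea of encoding fragments with redundant pieces can be illustrated with the simplest erasure code, a single XOR parity fragment. The following is a minimal sketch and not the code used in the scheme: the function names are hypothetical, the fragments are assumed to have equal length, and one parity fragment can repair only one lost fragment, whereas codes such as Reed-Solomon add several redundant fragments and tolerate several losses.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(fragments: List[bytes]) -> List[bytes]:
    """Return the data fragments followed by one parity fragment (their XOR)."""
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def recover(stored: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild the data fragments when at most one stored fragment is None."""
    missing = [i for i, frag in enumerate(stored) if frag is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost fragment")
    repaired = list(stored)
    if missing:
        present = [frag for frag in stored if frag is not None]
        # XOR of all surviving fragments equals the lost fragment.
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired[:-1]  # drop the parity fragment


data = [b"abcd", b"efgh", b"ijkl"]
stored = encode(data)   # three data fragments plus one parity fragment
stored[1] = None        # simulate the loss of one storage location
assert recover(stored) == data
```

In this sketch any three of the four stored fragments are enough to rebuild the original three, which is the property that lets the components survive the failure of a storage location.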
Erasure coding creates a mathematical function to describe a set of numbers so that they can be checked for accuracy and recovered if one is lost. Referred to as polynomial interpolation or oversampling, this is the key concept behind erasure codes. In mathematical terms, the protection offered by erasure coding can be represented in simple form by the equation n = k + m. The variable "k" is the original amount of data or symbols. The variable "m" stands for the extra or redundant symbols that are added to provide protection from failures. The variable "n" is the total number of symbols created by the erasure coding process. For instance, in a [10:16] method six extra symbols (m) are added to the 10 base symbols (k). The 16 data fragments (n) are spread across 16 drives, nodes or geographic locations. The original file can be reconstructed from 10 verified fragments.

4.4. Component Encryption (AES)

We then encrypt each file component with AES, using a 128-bit, 192-bit or 256-bit key.

4.5. Token Generation (SHA256)

We calculate the SHA256 hash of every data component, H(fi) = xi. SHA256 is a cryptographic hash function that produces a 256-bit hash value, written as 64 hexadecimal digits. We use this value as a challenge token for auditing the storage server. The third party auditor keeps the token value of each encrypted file component. File components are identified by a random_number generated by the client. The size of random_number depends on the amount of storage space the client requires.

4.6. Database description

After uploading the files to the data storage server, the client can delete them from its local machine. For each file, the client keeps a database record containing the file name, the AES encryption password and the file distribution password. In addition, for each file the client keeps a database of that file's components, which contains the random_number associated with each file component.
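The token generation of Section 4.5 and the random_number bookkeeping described here can be sketched with Python's standard hashlib and secrets modules. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function names, the 64-bit size of random_number, and the in-memory list and dictionary standing in for the client and auditor databases are all hypothetical, and each component is assumed to be available as already encrypted bytes.

```python
import hashlib
import secrets


def make_token(component: bytes) -> str:
    """Return the SHA256 challenge token x_i = H(f_i) as 64 hexadecimal digits."""
    return hashlib.sha256(component).hexdigest()


def register_components(components):
    """Assign a random_number to each component and record its token."""
    client_db = []   # random_number of each component, kept by the client
    auditor_db = {}  # random_number -> challenge token, kept by the auditor
    for component in components:
        random_number = secrets.randbits(64)  # 64-bit size is an assumption
        while random_number in auditor_db:    # avoid reusing an identifier
            random_number = secrets.randbits(64)
        client_db.append(random_number)
        auditor_db[random_number] = make_token(component)
    return client_db, auditor_db


components = [b"encrypted component 0", b"encrypted component 1"]
client_db, auditor_db = register_components(components)
for random_number, component in zip(client_db, components):
    # Every token is a 64-digit hexadecimal string.
    assert len(auditor_db[random_number]) == 64
    assert auditor_db[random_number] == make_token(component)
```

Because the auditor's record holds only identifiers and hash values, it reveals neither the file name nor the order in which components must be recombined.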
The third party auditor keeps the client's id, the random_number associated with each file component, and the challenge token corresponding to each component. The storage server stores the client id and the file components, each named with the random_number generated by the client.

5. Third Party auditing

The TPA sends a random_number to the cloud storage server. On receiving this random_number, the cloud storage server calculates the hash of the corresponding file component. The storage server encrypts this hash value with a key shared between the TPA and the storage server and sends the encrypted hash value to the TPA. The TPA compares the received value with the value in its database. If the stored hash value and the received hash value are the same, the file component is stored correctly; otherwise the TPA sends an alert message to the client who owns that file.

6. Algorithm Analysis

In this section, we evaluate the complexity of the scheme proposed in Section 4.1. We used files of size 10 KB, 100 KB, 1000 KB and 10,000 KB to analyze the time complexity of the algorithm. Using Python's time function, we repeatedly measured the execution time of the proposed algorithm. We found that the execution time depends linearly on the file size, O(n). Similarly, we measured the time needed to regenerate a file from its components and found that it also depends linearly on the file size, O(n). After executing the algorithm, we found that the total size of the file components generated from the file to be uploaded is equal to the size of the original file. The proposed scheme therefore does not impose any extra storage burden.

7. Conclusion

To ensure the security of cloud data storage, it is essential to hide the meaning of the data from all third party entities, such as the storage service provider and the third party auditor. This is only possible when the owner of the data obfuscates the file on its own machine before uploading it.
The scheme we have proposed ensures that the information stored in the file cannot be interpreted by the third party auditor or the storage service provider. Hence, the client's file is safe from both internal and external attackers. The use of an existing erasure correcting code provides protection against Byzantine failure, and the use of a random_number associated with each file component, together with a secure hash algorithm, allows the third party auditor to audit file components without sharing any information that could help the storage service provider interpret the meaning of the stored file.

References

Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules. In Proceedings of the 20th International Conference on Very Large Databases. Morgan Kaufmann, Santiago, Chile, 487-499. (1994)

Garcia-Molina, H., Ullman, J. D., Widom, J.: Database Systems: The Complete Book. Prentice Hall, New Jersey, USA. (2002)

Wang, X., Bettini, C., Brodsky, A., Jajodia, S.: Logical Design for Temporal Databases with Multiple Granularities. ACM Transactions on Database Systems, Vol. 22, No. 2, 115-170. (1997)

Bruce, K. B., Cardelli, L., Pierce, B. C.: Comparing Object Encodings. In: Abadi, M., Ito, T. (eds.): Theoretical Aspects of Computer Software. Lecture Notes in Computer Science, Vol. 1281. Springer-Verlag, Berlin Heidelberg New York, 415-438. (1997)

van Leeuwen, J. (ed.): Computer Science Today. Recent Trends and Developments. Lecture Notes in Computer Science, Vol. 1000. Springer-Verlag, Berlin Heidelberg New York. (1995)

Ribière, M., Charlton, P.: Ontology Overview. Motorola Labs, Paris. (2002). [Online]. Available: http://www.fipa.org/docs/input/f-in-00045/f-in-00045.pdf (current October 2003)