Data on-chain process

Overall process flow

After data is transmitted to CESS through DeOSS, its flow through the network is depicted as follows:

CESS stores data at the object level. Users access the CESS object storage service through the service client. New users need to purchase space and create buckets before uploading data objects.

1. The client uploads a data object through the following process:

  • Create a "storage deal" on the chain, and pre-load the meta information of the data object (including MHT-ROOT and leaf node meta information). The "storage deal" status is OPEN;

  • The blockchain network randomly selects a scheduling node as the "executor" of the deal. The "storage deal" status is ASSIGNED;

  • The client sends the data object to the target scheduler.

2. After receiving the data object, the scheduler first checks whether the deal is assigned to itself. If so, it pre-processes the data by slicing, backing up (copying), and marking (sending to the SGX Enclave for processing), which produces several data segments. Finally, it sends each data segment to a randomly selected storage node.

3. The storage miner first checks whether the received data segment information matches the deal. If so, it uses its "authentication key pair" to sign the hash of the data segment and returns the signature and a confirmation message to the scheduling node. At the same time, the storage node stores the data segment in a temporary area. After the deal succeeds, the storage node replaces an idle data segment in its storage area with the new data segment. If the deal fails, the data segment can simply be removed from the temporary area.

4. After receiving confirmation messages from all storage nodes, the scheduler node packages the storage node information and writes it into the deal on the chain.

5. The blockchain network checks whether the signature of each data segment in the deal matches the corresponding storage node. If all signatures are confirmed to be correct, the deal is officially complete and enters the SUCCESS status. Otherwise, the deal remains in the ASSIGNED status.
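The status transitions in the steps above can be sketched as a small state machine. This is a minimal illustration, not the CESS chain logic: the `StorageDeal` class, its field names, and the `verify` callback (a stand-in for real signature verification) are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class DealStatus(Enum):
    OPEN = "OPEN"
    ASSIGNED = "ASSIGNED"
    SUCCESS = "SUCCESS"


@dataclass
class StorageDeal:
    """Hypothetical on-chain deal record; field names are illustrative."""
    mht_root: str
    segment_hashes: list                      # expected hash of each data segment
    status: DealStatus = DealStatus.OPEN
    executor: Optional[str] = None            # scheduler chosen by the chain
    confirmations: dict = field(default_factory=dict)  # segment hash -> miner id

    def assign(self, scheduler_id: str) -> None:
        """OPEN -> ASSIGNED once the chain has picked an executor."""
        if self.status is not DealStatus.OPEN:
            raise ValueError("only an OPEN deal can be assigned")
        self.executor = scheduler_id
        self.status = DealStatus.ASSIGNED

    def submit_confirmations(self, scheduler_id, confirmations, verify):
        """ASSIGNED -> SUCCESS only if every segment has a valid signature.

        `confirmations` maps segment hash -> (miner id, signature).
        `verify(miner, segment_hash, signature)` is an assumed callback.
        """
        if self.status is not DealStatus.ASSIGNED or scheduler_id != self.executor:
            raise ValueError("deal is not assigned to this scheduler")
        all_valid = set(confirmations) == set(self.segment_hashes) and all(
            verify(miner, seg_hash, sig)
            for seg_hash, (miner, sig) in confirmations.items()
        )
        if all_valid:
            self.confirmations = {h: m for h, (m, _) in confirmations.items()}
            self.status = DealStatus.SUCCESS
        return self.status  # stays ASSIGNED when any check fails
```

In this sketch, an incomplete or invalid set of confirmations leaves the deal in ASSIGNED, so the scheduler can resubmit later.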

Data pre-processing

Before each file is uploaded, the data will be preprocessed, including meta information extraction, data slicing, multi-copy replication, etc.

Meta information includes data ownership, data summary description, and data keyword description.

Data slicing is used to standardize data storage and to form standard data segments for unified processing and scheduling.

Multiple replicas are mainly used to improve the reliability of data and ensure the data will not be lost due to a single point of failure.
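As a rough illustration of slicing and replication, the sketch below splits a byte string into fixed-size segments and produces several copies of each. The segment size, the replica count, and the zero-padding of the last segment are assumptions chosen for readability, not CESS parameters.

```python
SEGMENT_SIZE = 8  # bytes; deliberately tiny for illustration, not a CESS value
REPLICAS = 3      # assumed replica count


def slice_data(data: bytes, size: int = SEGMENT_SIZE) -> list:
    """Split data into fixed-size segments, zero-padding the last one."""
    segments = [data[i:i + size] for i in range(0, len(data), size)]
    if segments and len(segments[-1]) < size:
        segments[-1] = segments[-1].ljust(size, b"\x00")
    return segments


def replicate(segments: list, replicas: int = REPLICAS) -> list:
    """Return (segment_index, replica_index, payload) tuples for scheduling."""
    return [
        (seg_idx, rep_idx, payload)
        for seg_idx, payload in enumerate(segments)
        for rep_idx in range(replicas)
    ]
```

Because every segment has the same length, later stages can schedule and store segments without knowing anything about the original file layout.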

On-chain file declaration

The file declaration on the CESS blockchain, including the data owner information, is one of the basic elements for file uploading. The file declaration is shown in Figure 3.

The Scheduler

File data fragments and replica fragments are randomly sent to the nearest scheduler node using TCP data streaming. If the data streaming fails, other scheduling nodes will be selected to continue sending until the transmission is completed.

The scheduler checks the validity of the file fragments according to the file declaration on the chain and then sends them randomly to storage miners. If a miner does not receive the file fragment, the scheduler will send the file fragment to another miner until the scheduling is complete.
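The validate-then-dispatch behaviour described above might look like the following sketch. It assumes the on-chain declaration exposes a set of SHA-256 fragment hashes and that `send(miner, fragment)` is a transport callback returning `True` on acknowledgement; both are hypothetical stand-ins.

```python
import hashlib
import random


def schedule_fragment(fragment: bytes, declared_hashes, miners, send, rng=random):
    """Validate a fragment against the declaration, then try miners at random.

    Returns the id of the miner that accepted the fragment.
    """
    digest = hashlib.sha256(fragment).hexdigest()
    if digest not in declared_hashes:
        raise ValueError("fragment does not match the on-chain declaration")

    candidates = list(miners)
    rng.shuffle(candidates)          # random selection order
    for miner in candidates:
        if send(miner, fragment):    # fall through to the next miner on failure
            return miner
    raise RuntimeError("no miner accepted the fragment")
```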

Declaration on chain

The scheduler collects data such as each fragment's information and the information of the miner that received it to form the file metadata, and then sends this metadata to the blockchain. After that, the application layer can obtain the storage address of each file fragment by querying the corresponding information on the chain. When downloading the file, the complete file can be restored by obtaining enough fragments from the nearest storage miners.
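A download along these lines could be sketched as follows. The metadata layout (an ordered list of entries with a fragment hash and the miners holding it), the `latency` function used to rank miners by distance, and the `fetch` callback are assumptions for illustration; the sketch also assumes plain replication, so one valid copy of each fragment is enough.

```python
import hashlib


def restore_file(metadata, latency, fetch) -> bytes:
    """Rebuild a file from fragments, preferring the nearest miner.

    metadata: ordered list of {"hash": str, "miners": [miner ids]}
    latency:  callable miner id -> comparable distance (assumed)
    fetch:    callable (miner id, hash) -> bytes or None (assumed)
    """
    parts = []
    for entry in metadata:
        for miner in sorted(entry["miners"], key=latency):
            data = fetch(miner, entry["hash"])
            # Accept the fragment only if it matches the on-chain hash.
            if data is not None and hashlib.sha256(data).hexdigest() == entry["hash"]:
                parts.append(data)
                break
        else:
            raise LookupError("no miner returned fragment " + entry["hash"])
    return b"".join(parts)
```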

Random challenge

Because DeOSS is an open file storage system, malicious miners may exist. DeOSS therefore takes a zero-trust approach toward miners and uses the Proof of Data Reduplication and Recovery (PoDR²) algorithm to ensure that files are stored reliably.

According to the PoDR² algorithm, the blockchain network will randomly challenge the storage miners. For each stored data fragment, the miner needs to submit the storage certificate to the network. The blockchain network is responsible for verifying the submitted certificate to prove whether the miner has completed the storage task.
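The general challenge-response pattern can be illustrated with a plain Merkle-proof challenge. This is explicitly not PoDR²: it is a simplified stand-in in which the verifier holds only a Merkle root, picks a random chunk index, and checks the chunk and sibling path the miner returns. All function names are hypothetical, and SHA-256 is an assumed hash.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(chunks):
    """Return all tree levels, leaves first; an odd node is paired with itself."""
    levels = [[_h(c) for c in chunks]]
    while len(levels[-1]) > 1:
        cur = list(levels[-1])
        if len(cur) % 2:
            cur.append(cur[-1])
        levels.append([_h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def prove(chunks, index):
    """Miner side: return the challenged chunk and its sibling hashes."""
    path, i = [], index
    for level in build_tree(chunks)[:-1]:
        sibling = i ^ 1
        if sibling >= len(level):
            sibling = i  # last node of an odd level was paired with itself
        path.append(level[sibling])
        i //= 2
    return chunks[index], path


def verify(root: bytes, index: int, chunk: bytes, path) -> bool:
    """Verifier side: recompute the root from the chunk and sibling path."""
    node = _h(chunk)
    for sibling in path:
        node = _h(node + sibling) if index % 2 == 0 else _h(sibling + node)
        index //= 2
    return node == root


def random_challenge(num_chunks: int) -> int:
    """Pick an unpredictable chunk index to challenge."""
    return secrets.randbelow(num_chunks)
```

A miner that has discarded the data cannot produce a chunk that hashes up to the stored root, which is the property the random challenge relies on in this simplified setting.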

Summary

Unlike other distributed storage systems, DeOSS does not impose any restrictions on the storage miner's file management system or storage method; it focuses only on the correctness and reliability of file input and output. A storage miner can therefore not only store files distributed by DeOSS but also freely use idle space to store its own files, which improves resource utilization and makes the system more flexible to expand.
