Fujitsu Laboratories Develops Data Transfer Technology That Increases Speed Of Remote File Access


Fujitsu Laboratories has developed a software-based technology to increase data-transfer speeds for accessing files on remote enterprise file-sharing servers. When accessing remote file-sharing servers in the cloud, slow upload and download speeds caused by network latency have been an issue for typical file-sharing systems.
Newly developed software that relays communications between the client and server significantly reduces the number of exchanges needed to obtain information such as file names and file sizes for multiple files over a remote network, lowering the effects of network latency. In an internal experiment, file transfers were confirmed to be up to ten times faster when dealing with multiple small files. Transfers of large files can be up to twenty times faster when combined with the deduplication technology [1] that Fujitsu Laboratories announced last year. Simply installing this software on a client and server speeds up file access for existing file-sharing systems.
Background

In file sharing, files are stored on a server connected to a network, and multiple clients can share the same files. Enterprises use this to share information and manage documents. Previously, individual locations maintained their own file-sharing servers on-site, but server consolidation has become more common as a way to improve security and reduce operating costs through combined management, and with it the need to access file-sharing servers remotely. With CIFS and SMB [2], two network file-sharing protocols that are widely used in file-sharing systems, the effects of network latency can impose significant wait times when accessing files, creating demand for faster access.
Technological Issues

Fujitsu Laboratories has already developed a deduplication technology for use with remote data transfers, which accelerates the process by avoiding retransmissions of previously sent data. This technology can be applied to a variety of situations, but it has had limited effectiveness with the CIFS and SMB file-sharing protocols because of their unique processes. Improving networks and installing specialized hardware are other ways of increasing speeds, but these are expensive, and installation of specialized hardware has limited effectiveness when handling large numbers of small files only a few kilobytes in size. The CIFS and SMB file-sharing protocols have the following unique processes and challenges.
When copying a folder containing a large number of files, all of the file-attribute information is requested for each file, and the accumulation of these requests in a remote network causes significant latency.
When sending relatively large files, their data is split into pieces tens of kilobytes in size, and header information is attached to each piece. Because this header information changes each time, the transmitted data differs even when the same file is sent again, which makes deduplication ineffective.
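The first of these challenges can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from Fujitsu's experiment: the function name, the number of requests per file, the round-trip time, and the bandwidth are all hypothetical values chosen only to show how per-file round trips can dominate the copy time for small files.

```python
def folder_copy_time(num_files, requests_per_file, rtt_s,
                     bytes_per_file, bandwidth_bps):
    """Split the estimated copy time into a latency part and a payload part.

    Assumes every request waits for a full round trip before the next
    one is issued, which is a simplification of real protocol behavior.
    """
    round_trips = num_files * requests_per_file
    latency_s = round_trips * rtt_s
    payload_s = num_files * bytes_per_file * 8 / bandwidth_bps
    return latency_s, payload_s


# Illustrative assumptions only: 100 files of 1 KB, 3 requests per file,
# a 30 ms round trip, and a 100 Mbit/s link.
latency_s, payload_s = folder_copy_time(
    num_files=100,
    requests_per_file=3,
    rtt_s=0.03,
    bytes_per_file=1024,
    bandwidth_bps=100e6,
)
print(f"waiting on round trips: {latency_s:.3f} s")
print(f"moving the actual data: {payload_s:.6f} s")
```

Under these assumed numbers, nearly all of the time is spent waiting on round trips rather than moving data, which is why adding bandwidth alone does little for folders of small files.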

About the Technology

Fujitsu Laboratories has developed a technology that accelerates data transfers for file-sharing servers using only software. Key features of the technology are as follows.
1. Collective proxy read-ahead for multiple files and proxy response

With this technology, a module that accelerates data transfers is installed on both the client and the server. The process works as follows: 1) the server-side module identifies when a download of a folder containing multiple files starts; 2) acting as a proxy for the client, it reads ahead all of the files to be downloaded as a batch; 3) it bundles these read-ahead files together and transmits them to the client-side module; and 4) the client-side module then responds to the client's data requests as a proxy for the server. In this way, the number of communications generated by obtaining file attributes, such as multiple file names and file sizes, is greatly reduced, as are the delays caused by network latency.
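The effect of this read-ahead scheme on the number of WAN exchanges can be shown with a toy simulation. This is a minimal sketch under stated assumptions, not Fujitsu's implementation: the class names `RemoteFileServer` and `ClientSideModule`, their methods, and the one-round-trip-per-request cost model are all hypothetical, and the real CIFS/SMB message sequence is more involved.

```python
class RemoteFileServer:
    """Stand-in for a file server behind a high-latency link.

    Every method call is counted as one WAN round trip (an assumption).
    """

    def __init__(self, files):
        self.files = dict(files)  # file name -> file contents
        self.round_trips = 0

    def list_folder(self):
        self.round_trips += 1
        return sorted(self.files)

    def get_attributes(self, name):
        self.round_trips += 1
        return {"name": name, "size": len(self.files[name])}

    def read_file(self, name):
        self.round_trips += 1
        return self.files[name]

    def read_ahead_bundle(self):
        # Hypothetical server-side module: reads every file locally and
        # ships them to the client side as a single bundle.
        self.round_trips += 1
        return dict(self.files)


class ClientSideModule:
    """Answers the client's requests locally, in place of the server."""

    def __init__(self, server):
        self.bundle = server.read_ahead_bundle()

    def list_folder(self):
        return sorted(self.bundle)

    def get_attributes(self, name):
        return {"name": name, "size": len(self.bundle[name])}

    def read_file(self, name):
        return self.bundle[name]


def copy_folder(endpoint):
    """Copy a folder the way a naive client would: one file at a time."""
    copied = {}
    for name in endpoint.list_folder():
        attrs = endpoint.get_attributes(name)
        data = endpoint.read_file(name)
        assert attrs["size"] == len(data)
        copied[name] = data
    return copied


# One hundred 1-KB files with distinct contents.
files = {f"file{i:03d}.txt": bytes([i]) * 1024 for i in range(100)}

direct_server = RemoteFileServer(files)
direct_copy = copy_folder(direct_server)

proxied_server = RemoteFileServer(files)
proxied_copy = copy_folder(ClientSideModule(proxied_server))

print("round trips without proxy:", direct_server.round_trips)
print("round trips with proxy:   ", proxied_server.round_trips)
```

In this toy model the direct copy needs 201 round trips (one listing plus two requests per file), while the proxied copy needs a single bundled transfer. The client code in `copy_folder` is unchanged in both cases, which mirrors the idea that the modules sit transparently between an existing client and server.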
2. Effective deduplication due to header separation

Fujitsu Laboratories developed a technology that works on the server-side module to separate the transmitted data into headers and file contents. This makes deduplication of retransmitted data more precise, leading to more effective network traffic reduction.
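Why header separation helps deduplication can be sketched in a few lines. The example below is an illustration under assumptions, not the actual protocol or Fujitsu's algorithm: the 8-byte header holding only a message ID, the 32 KB piece size, and the SHA-256 lookup set are all hypothetical simplifications of real CIFS/SMB traffic and of real deduplication engines.

```python
import hashlib
import struct

HEADER_LEN = 8          # hypothetical header: a single 64-bit message ID
CHUNK_SIZE = 32 * 1024  # assumed piece size ("tens of kilobytes")


def packetize(data, first_message_id):
    """Split data into pieces and prepend a header that changes per message."""
    messages = []
    for i, offset in enumerate(range(0, len(data), CHUNK_SIZE)):
        header = struct.pack(">Q", first_message_id + i)
        messages.append(header + data[offset:offset + CHUNK_SIZE])
    return messages


def bytes_sent_whole_message(messages, seen):
    """Deduplicate on the full message, header included."""
    sent = 0
    for message in messages:
        digest = hashlib.sha256(message).digest()
        if digest in seen:
            sent += len(digest)   # send only a short reference
        else:
            seen.add(digest)
            sent += len(message)  # send the full message
    return sent


def bytes_sent_header_separated(messages, seen):
    """Send headers as-is and deduplicate only the file contents."""
    sent = 0
    for message in messages:
        header, payload = message[:HEADER_LEN], message[HEADER_LEN:]
        digest = hashlib.sha256(payload).digest()
        sent += len(header)
        if digest in seen:
            sent += len(digest)
        else:
            seen.add(digest)
            sent += len(payload)
    return sent


# A 256 KB test file whose 32 KB pieces all differ from one another.
file_data = b"".join(
    hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(8192)
)

# The same file is sent twice; the message IDs differ on the second pass.
first_pass = packetize(file_data, first_message_id=1)
second_pass = packetize(file_data, first_message_id=1000)

whole_seen = set()
whole_first = bytes_sent_whole_message(first_pass, whole_seen)
whole_second = bytes_sent_whole_message(second_pass, whole_seen)

split_seen = set()
split_first = bytes_sent_header_separated(first_pass, split_seen)
split_second = bytes_sent_header_separated(second_pass, split_seen)

print("whole-message dedup, 2nd transfer:   ", whole_second, "bytes")
print("header-separated dedup, 2nd transfer:", split_second, "bytes")
```

In this sketch the second transfer costs as much as the first when whole messages are hashed, because the changed header alters every digest. When only the file contents are hashed, the second transfer shrinks to the headers plus one short reference per piece.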

Results

In Fujitsu Laboratories' internal experiment, software that implements this technology was found to have the following effects.
Increase in speed of multiple small file transfers: In a test environment that simulated the network latency for accessing a file-sharing server in Kawasaki from a location in Kyushu, batch downloads of folders containing one hundred 1-KB files were found to be ten times faster.
Increase in speed of large file transfers: In the same test environment, a download of a single 10 MB file was found to be as much as twenty times faster (compared with having no acceleration technologies such as deduplication).
This technology is implemented as software and can be installed on existing file-sharing systems. It can also be applied to cloud and server-virtualization environments, mobile devices, etc., and can be extended to a variety of network services. This technology enables more efficient file sharing and joint development between remote locations.
Future Plans

Fujitsu Laboratories plans to incorporate this technology into a product, as a function of a WAN optimization solution, during fiscal 2015, after internal testing at Fujitsu.
[1] Deduplication technology: Deduplication and compression reduce the volume of data being transmitted, effectively increasing data-transmission speeds as much as ten times. See April 8, 2014, press release: http://www.fujitsu.com/global/about/resources/news/press-releases/2014/0408-02.html
[2] CIFS and SMB: CIFS stands for Common Internet File System; SMB stands for Server Message Block. Both are widely used communications protocols for file sharing. Strictly speaking, CIFS represents SMB 1.0, but is widely referred to as SMB.