Objective:
In this paper, we are
interested in studying the load rebalancing problem in distributed file systems
specialized for large-scale, dynamic and data-intensive clouds.
Abstract:
Distributed file systems are
key building blocks for cloud computing applications based on the MapReduce
programming paradigm. In such file systems, nodes simultaneously serve
computing and storage functions; a file is partitioned into a number of chunks
allocated in distinct nodes so that MapReduce tasks can be performed in
parallel over the nodes. However, in a cloud computing environment, failure is
the norm, and nodes may be upgraded, replaced, and added in the system.
Files can
also be dynamically created, deleted, and appended. This results in load
imbalance in a distributed file system; that is, the file chunks are not
distributed as uniformly as possible among the nodes. Emerging distributed file
systems in production systems strongly depend on a central node for chunk
reallocation. This dependence is clearly inadequate in a large-scale,
failure-prone environment because the central load balancer is put under
considerable workload that is linearly scaled with the system size, and may
thus become the performance bottleneck and the single point of failure. In this
paper, a fully distributed load rebalancing algorithm is presented to cope with
the load imbalance problem.
Our
algorithm is compared against a centralized approach in a production system and
a competing distributed solution presented in the literature. The simulation
results indicate that our proposal is comparable with the existing centralized
approach and considerably outperforms the prior distributed algorithm in terms
of load imbalance factor, movement cost, and algorithmic overhead.
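The abstract does not define the load imbalance factor here; as an illustration only, one common way to quantify imbalance is the ratio of the heaviest node's chunk count to the mean chunk count per node. The short Java sketch below uses that assumed definition; the method name and sample values are illustrative, not taken from the paper.

// Illustrative only: computes an assumed load-imbalance factor
// (maximum node load divided by mean node load) for a set of chunk counts.
public final class ImbalanceFactor {

    // loads[i] = number of file chunks currently stored on node i
    static double imbalanceFactor(long[] loads) {
        long max = Long.MIN_VALUE;
        long sum = 0;
        for (long load : loads) {
            max = Math.max(max, load);
            sum += load;
        }
        double mean = (double) sum / loads.length;
        return max / mean;   // 1.0 means a perfectly uniform distribution
    }

    public static void main(String[] args) {
        long[] chunkCounts = {40, 40, 35};            // e.g. three nodes' chunk counts
        System.out.printf("imbalance factor = %.2f%n",
                imbalanceFactor(chunkCounts));        // about 1.04 for this input
    }
}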
Existing System:
• Some existing approaches use the concept of virtual servers.
• However, these approaches:
– either ignore the heterogeneity of node capabilities,
– or transfer loads without considering proximity relationships between nodes,
– or both.
Disadvantage:
Emerging distributed file systems in production
systems strongly depend on a central node for chunk reallocation. This
dependence is clearly inadequate in a large-scale, failure-prone environment
because the central load balancer is put under considerable workload that is
linearly scaled with the system size, and may thus become the performance
bottleneck and the single point of failure.
Proposed System:
• The load of each virtual server is stable over the timescale during which load balancing is performed.
• Load balancing is performed in a proximity-aware manner to minimize the overhead of load movement (bandwidth usage) and to allow faster, more efficient load balancing (see the sketch below).
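A minimal sketch of this proximity-aware shedding step, assuming each node knows its current load, its target load, and a network distance to candidate nodes; the Node class, the Distance interface, and the shedLoad method are illustrative assumptions, not the paper's actual implementation.

import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: a heavy node picks the closest light nodes
// (smallest network distance first) and sheds chunks until it reaches its target.
class Node {
    final String id;
    int load;          // current number of chunks
    final int target;  // ideal number of chunks for this node

    Node(String id, int load, int target) {
        this.id = id;
        this.load = load;
        this.target = target;
    }

    boolean isHeavy() { return load > target; }
    boolean isLight() { return load < target; }
}

final class ProximityAwareRebalancer {

    // In practice the distance would come from network coordinates or latency
    // probes; here it is an assumed, pluggable function.
    interface Distance { double between(Node a, Node b); }

    static void shedLoad(Node heavy, List<Node> candidates, Distance d) {
        candidates.stream()
                .filter(Node::isLight)
                .sorted(Comparator.comparingDouble((Node n) -> d.between(heavy, n)))
                .forEach(light -> {
                    if (!heavy.isHeavy()) return;          // already at or below target
                    int surplus = heavy.load - heavy.target;
                    int room = light.target - light.load;
                    int moved = Math.min(surplus, room);   // chunks to migrate
                    heavy.load -= moved;
                    light.load += moved;
                });
    }
}

Sorting the candidate light nodes by distance keeps chunk migrations among nearby nodes, which is what keeps the bandwidth usage (movement cost) of rebalancing low.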
Advantage:
– Nodes can take on more load.
– Consistency and speed are maintained.
Block Diagram:
• L = Load, T = Target Load.
• Node A: L = 40, T = 50 (below target; can accept load)
• Node B: L = 40, T = 30 (HEAVY; must shed load)
• Node C: L = 35, T = 40 (below target; can accept load)
• The edge labels 10, 15, and 20 in the original diagram indicate chunk transfers between the nodes.
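As a hedged, concrete reading of the diagram above, the Java sketch below encodes nodes A, B, and C with their L and T values and reports each node's surplus or deficit; the class and field names are hypothetical, and the sketch does not reproduce the exact transfer amounts labelled on the edges.

// Illustrative reading of the block diagram: a node whose load L exceeds its
// target T is HEAVY and should shed (L - T) chunks; a node with L < T can
// absorb (T - L) chunks.
public final class DiagramExample {

    static final class NodeState {
        final String name;
        final int load;    // L
        final int target;  // T
        NodeState(String name, int load, int target) {
            this.name = name; this.load = load; this.target = target;
        }
        int surplus() { return load - target; }   // positive => heavy
    }

    public static void main(String[] args) {
        NodeState[] nodes = {
            new NodeState("A", 40, 50),   // light: room for 10 more chunks
            new NodeState("B", 40, 30),   // HEAVY: 10 chunks over target
            new NodeState("C", 35, 40),   // light: room for 5 more chunks
        };
        for (NodeState n : nodes) {
            int s = n.surplus();
            System.out.printf("Node %s: L=%d, T=%d -> %s%n",
                    n.name, n.load, n.target,
                    s > 0 ? "shed " + s + " chunks"
                          : "can absorb " + (-s) + " chunks");
        }
    }
}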
Software Requirements:
Operating System : Windows XP
Language Used : JAVA 2
Tools : NetBeans IDE, MySQL Server
Hardware Requirements:
Processor : >2 GHz
Main Memory : 512 MB RAM
Hard Disk : 80 GB