The growing popularity of peer-to-peer (P2P) systems has made it necessary to
manage huge volumes of data efficiently to ensure acceptable user response
times. Dynamically changing popularities of data items and skewed user query patterns
in P2P systems may cause some of the peers to become bottlenecks, thereby
resulting in severe load imbalance and consequently increased user response times.
An effective load-balancing mechanism becomes a necessity in such cases.
Such load-balancing can be achieved by efficient online data migration/replication.
While much work has been done to harness the huge computing resources of P2P
systems for high-performance computing and scientific applications, issues concerning
load-balancing with a view towards faster access to data for normal users
have not received adequate attention. Notably, the sheer size of P2P networks
and the inherent dynamism of the environment pose significant challenges to
load-balancing. The main contributions of our proposal are three-fold. First,
we view a P2P system as comprising clusters of peers and present techniques for
both intra-cluster and inter-cluster load-balancing. Second, we analyze the trade-offs
between the options of migration and replication and formulate a strategy
based on which the system decides at run-time which option to use. Third, we
propose an effective strategy aimed at automatically self-evolving clusters
of peers. Our performance evaluation demonstrates that our proposed technique for
inter-cluster load-balancing is indeed effective in improving the system performance
significantly. To our knowledge, this work is one of the earliest attempts
at addressing load-balancing via both online data migration and replication
in P2P environments. |
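The run-time choice between migration and replication can be illustrated with a minimal sketch. The rule below is an assumption made for illustration only, not the strategy formulated in this work: it replicates read-mostly hot items (so that several peers can serve reads) and migrates write-heavy ones (so that no replica has to be kept consistent). The names `ItemStats`, `choose_action` and `write_ratio_limit` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ItemStats:
    """Observed access statistics for one hot data item (hypothetical fields)."""
    reads_per_min: float
    writes_per_min: float


def choose_action(item: ItemStats, write_ratio_limit: float = 0.2) -> str:
    """Return 'replicate', 'migrate' or 'none' for a data item.

    Assumed rule, not taken from the original work: a read-mostly item is
    replicated, because extra copies spread the read load; an item with
    many writes is migrated, because every write would otherwise have to
    reach every replica.
    """
    total = item.reads_per_min + item.writes_per_min
    if total == 0:
        return "none"  # an item nobody accesses causes no load
    write_ratio = item.writes_per_min / total
    if write_ratio <= write_ratio_limit:
        return "replicate"
    return "migrate"
```

For example, `choose_action(ItemStats(reads_per_min=90, writes_per_min=10))` has a write ratio of 0.1 and therefore selects replication under the assumed limit of 0.2.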
In mobile ad-hoc peer-to-peer (M-P2P) networks, frequent network partitioning typically
leads to low data availability, thereby making data replication a necessity.
This work proposes EcoRep, a novel economic model for dynamic replica allocation
in M-P2P networks. EcoRep performs replica allocation based on a data item’s
relative importance, which is quantified by the data item’s price in terms
of a virtual currency. The price of a data item depends on its access frequency,
the number of users who accessed it, the number of its existing replicas, its
(replica) consistency and the average response time required for accessing it.
EcoRep ensures fair replica allocation by considering the origin of queries for
data items. EcoRep requires a query-issuing user to pay the price of the requested
data item to the user serving the request. This discourages free-riding
and encourages user participation by providing an incentive for users to become
service-providers. EcoRep also considers other issues such as load, energy and
network topology as replication criteria. Our performance study indicates that
EcoRep is indeed effective in improving query response times and data availability
in M-P2P networks. |
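The payment step described above, in which the query-issuing user pays the item's price to the serving user, can be sketched as a small virtual-currency ledger. This is only an illustration under stated assumptions: the class `Ledger`, the uniform starting balance and the rule that a request is refused when the requester cannot afford the price are all hypothetical, and the price is taken as an input rather than computed from the factors listed above.

```python
class Ledger:
    """Toy virtual-currency ledger (hypothetical, for illustration only)."""

    def __init__(self, initial_balance: float):
        # Assumption: every user starts with the same amount of currency.
        self.initial_balance = initial_balance
        self.balances = {}

    def balance(self, user: str) -> float:
        """Current balance of a user; unseen users hold the initial balance."""
        return self.balances.get(user, self.initial_balance)

    def pay_for_query(self, requester: str, provider: str, price: float) -> bool:
        """Move `price` from requester to provider; return False if unaffordable."""
        if price < 0:
            raise ValueError("price must be non-negative")
        if self.balance(requester) < price:
            # Assumption: a user without enough currency is not served,
            # which is what makes pure free-riding unsustainable here.
            return False
        self.balances[requester] = self.balance(requester) - price
        self.balances[provider] = self.balance(provider) + price
        return True
```

In this sketch a user who only issues queries eventually runs out of currency, while a user who serves requests earns it, which mirrors the incentive to become a service-provider.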
OVERVIEW OF RESEARCH PROJECTS |
P2P Projects: Brief Overview |
GRID Projects: Brief Overview |
The unprecedented growth as well as the growing importance of available spatial data
at geographically distributed locations has made efficient networking of such
data a necessity for availability reasons. The emergence of grid computing coupled
with large and powerful computer networks, which have the capability to
connect thousands of geographically distributed computers worldwide, has opened
a world of opportunities for such networking. This provides a strong motivation
for designing a spatial grid which supports fast data retrieval and allows its
users to transparently access data of any location from anywhere. However, several
challenging issues need to be addressed for the spatial grid to work efficiently
in practice. In particular, mechanisms for efficient search and effective
load-balancing need to be in place. This paper focuses on dynamic load-balancing
in spatial grids via data migration/replication to prevent degradation in
system performance owing to severe load imbalance among the nodes. Notably, issues
concerning load-balancing are more complex in the case of grids than for traditional
domains primarily because a grid usually spans multiple administrative
domains. The main contributions of our proposal are as follows. First, we
view a spatial grid as comprising several clusters where each cluster is a local
area network (LAN) and propose a novel inter-cluster load-balancing algorithm
which uses migration/replication of data. Second, we present a novel scalable
technique for dynamic data placement that not only improves data availability
but also minimizes disruptions and downtime to the system. Our performance study
demonstrates the effectiveness of our proposed approach in correcting workload
skews, thereby facilitating improvement in system performance. To our knowledge,
this work is one of the earliest attempts at addressing load-balancing via
both online data migration and replication in GRID environments. |
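To make the idea of correcting workload skew across clusters concrete, the following is a minimal sketch of a greedy planner. It is not the inter-cluster algorithm proposed in this work; the function `plan_transfers`, the `tolerance` parameter and the use of a single scalar load per cluster are assumptions made for illustration.

```python
def plan_transfers(loads: dict, tolerance: float = 0.1) -> list:
    """Return a list of (source, destination, amount) load transfers.

    Assumed heuristic: a cluster is overloaded when its load exceeds the
    mean by more than `tolerance`; its surplus is shifted to the clusters
    that are furthest below the mean.
    """
    if not loads:
        return []
    mean = sum(loads.values()) / len(loads)
    surplus = {c: l - mean for c, l in loads.items() if l > mean * (1 + tolerance)}
    deficit = {c: mean - l for c, l in loads.items() if l < mean}
    transfers = []
    # Handle the most overloaded cluster first.
    for src in sorted(surplus, key=surplus.get, reverse=True):
        # Fill the most underloaded clusters first.
        for dst in sorted(deficit, key=deficit.get, reverse=True):
            if surplus[src] <= 0:
                break
            amount = min(surplus[src], deficit[dst])
            if amount > 0:
                transfers.append((src, dst, amount))
                surplus[src] -= amount
                deficit[dst] -= amount
    return transfers
```

Whether each planned transfer is carried out by migrating or by replicating data is left open in this sketch.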
ONGOING WORK ON P2P SYSTEMS AND GRIDs |
Currently, in the context of both P2P systems and GRIDs, we are working on the following
topics:
1) Deciding at run-time which option is preferable -- data migration or data replication.
2) Optimizing data migration and replication in P2P systems.
3) Determining the optimal granularity of data migration.
4) Ascertaining the source peers and the destination peers for data movement.
5) Investigating the overhead of load-balancing across huge P2P/GRID systems.
6) Expediting search in P2P/GRID systems.
7) Examining possibilities for integrating load-balancing into existing P2P/GRID systems.
8) Studying issues concerning peer communities. |
DataBanking Project: Brief Overview |
Banks have been traditionally viewed as institutions that deal in money and its substitutes
and provide other financial services. However, the unprecedented increase
in the complexity of information related to the customers coupled with the
ever-increasing demands of globalization as well as the emergence of large and
powerful computer networks motivates a strong need for institutions which do
not restrict themselves only to managing their customers' financial resources,
but also deal with more complex information pertaining to their customers. In this
regard, the main contribution of our work is the proposal for the creation
of a new kind of institution, which we designate as DataBank, which goes well beyond
the financial resource handling characteristic of traditional
banks to storing and managing all the important information associated
with a customer. In other words, a DataBank acts as a personal information repository
for each of its customers and allows its customers to access their respective
data from ANYWHERE at ANYTIME without requiring its customers to bother
about the complex processes which mediate their access to their respective data.
The possibilities of DataBanking are immense, but several issues such as
security, privacy, dependability, access control, concurrency control and data dynamism
need to be addressed for a DataBank to work effectively in practice. We
are currently working on these issues. |
Economic incentive models for Mobile Peer-to-Peer Networks: Brief Overview |
Further details concerning this project are available in the following publications: |
Anirban Mondal, Sanjay Kumar Madria and Masaru Kitsuregawa "CADRE: A Collaborative replica allocation and deallocation approach for Mobile-P2P Networks." Proceedings of the International Database Engineering & Applications Symposium -- IDEAS 2006
Anirban Mondal, Sanjay Kumar Madria and Masaru Kitsuregawa "EcoRep: An Economic Model for Efficient Dynamic Replication in Mobile-P2P networks." Proceedings of the International Conference on Management of Data -- COMAD 2006
Anirban Mondal, Sanjay Kumar Madria and Masaru Kitsuregawa "CLEAR: An Efficient Context and Location-based Dynamic Replication Scheme for Mobile-P2P Networks." Proceedings of the International Conference on Database and Expert Systems Applications -- DEXA 2006
Anirban Mondal and Masaru Kitsuregawa "Privacy, Security and Trust in P2P environments: A Perspective (Invited talk)." Proceedings of the International Workshop on P2P Data Management, Security and Trust -- PDMST 2006 |
Further details concerning my P2P and GRID projects are available in the following
publications: |
Anirban Mondal and Masaru Kitsuregawa "Effective Dynamic Replication in Wide-Area Network Environments: A Perspective (Invited talk)." Proceedings of the International Workshop on High Availability of Distributed Systems -- HADIS 2005
Anirban Mondal, Yi Lifu and Masaru Kitsuregawa "On Improving the Performance Dependability of Unstructured P2P Systems via Replication." Proceedings of the International Conference on Database and Expert Systems Applications -- DEXA 2004
Anirban Mondal and Masaru Kitsuregawa "Load-Balancing Remote Spatial Join Queries in a Spatial GRID." Proceedings of ER Conference on Conceptual Modelling -- ER 2004
Anirban Mondal, Kazuo Goda and Masaru Kitsuregawa "Effective Load-Balancing via Migration and Replication in Spatial GRIDs." Proceedings of the International Conference on Database and Expert Systems Applications -- DEXA 2003
Anirban Mondal, Yi Lifu and Masaru Kitsuregawa "P2PR-Tree: An R-Tree-Based Spatial Index for Peer-to-Peer Environments." Proceedings of the International EDBT Workshop on P2P and Databases -- P2PDB 2004 |
Further details of this industrial track work can be found in the following publication: |
Anirban Mondal, Pankaj Garg and Masaru Kitsuregawa "DataBank: A Blueprint for efficient privacy-preserving personalized user data management worldwide (Invited talk)." Proceedings of the International Conference on Management of Data (COMAD), Industrial Track, 2005 |
This page was last updated on November 30, 2006 |