In Search of an Understandable Consensus Algorithm (Raft)

Paxos Made Simple

ZooKeeper: Wait-free coordination for Internet-scale systems

Using Paxos to Build a Scalable, Consistent, and Highly Available Datastore

Impossibility of Distributed Consensus With One Faulty Process

Consensus in the presence of partial synchrony

Viewstamped Replication Revisited

Replication

Don’t be lazy, be consistent: Postgres-R, a new way to implement Database Replication

PacificA: Replication in Log-Based Distributed Storage Systems

Chain Replication for Supporting High Throughput and Availability

Byzantine Chain Replication

A Comprehensive Study of Convergent and Commutative Replicated Data Types

Optimistic Replication

Causality/Transactions

Stronger Semantics for Low-Latency Geo-Replicated Storage (Eiger)

Calvin: Fast Distributed Transactions for Partitioned Database Systems

Sinfonia: a new paradigm for building scalable distributed systems

Understanding the Limitations of Causally and Totally Ordered Communication

A Response to Cheriton and Skeen’s Criticism of Causal and Totally Ordered Communication

MDCC: Multi-Datacenter Consistency

Spanner: Google’s globally distributed database

Concurrency

Transactional Memory: Architectural Support for Lock-Free Data Structures

Software Transactional Memory

Sharing Memory Robustly in Message-Passing Systems

Wait-free Synchronization

ZooKeeper’s atomic broadcast protocol: Theory and practice

Kafka (LinkedIn)

Omega: flexible, scalable schedulers for large compute clusters

Thialfi: A Client Notification Service for Internet-Scale Applications

Large-scale Incremental Processing Using Distributed Transactions and Notifications

Note: We haven’t included anything already covered in 6.824, but you should read those papers too.

Paxos Made Live: An Engineering Perspective

Viewstamped Replication: A new primary copy method to support highly-available distributed systems

Time, Clocks, and the Ordering of Events in a Distributed System

The Part-Time Parliament

Paxos Made Practical

The papers from SOSP 2013


Distributed operating systems


  • Klimiankou Y Serafini M Xu H (2022) Towards practical multikernel OSes with MySyS Proceedings of the 13th ACM SIGOPS Asia-Pacific Workshop on Systems 10.1145/3546591.3547525 (29-37) Online publication date: 23-Aug-2022 https://dl.acm.org/doi/10.1145/3546591.3547525
  • Iorio M Risso F Palesandro A Camiciotti L Manzalini A (2022) Computing Without Borders: The Way Towards Liquid Computing IEEE Transactions on Cloud Computing 10.1109/TCC.2022.3229163 (1-18) Online publication date: 2022 https://doi.org/10.1109/TCC.2022.3229163
  • Kang Y Yang J Peng J Zhao J Cheng S Jia G (2022) Analysis of Resources Scheduling for DL Workloads on Multi-tenant GPU Clusters 2022 IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC) 10.1109/IMCEC55388.2022.10019820 (1776-1781) Online publication date: 16-Dec-2022 https://doi.org/10.1109/IMCEC55388.2022.10019820

Recommendations

Advanced non-distributed operating systems course.

The use of Non-Distributed Operating Systems is very common and old. Many researchers feel that this field of research is outmoded, and therefore put their efforts into Distributed Operating Systems. Advanced Operating Systems courses generally include ...

Functional specialization in distributed operating systems

A distributed operating system provides the same functionality and interface as a monolithic operating system. That is, for both systems the goal is to make the computing and storage facilities as provided by the hardware available to the users of the ...

Distributed Operating Systems

Reviewer: John George Fletcher

This paper is a review of the current state of the art in distributed operating systems. It contrasts such a system, in which the “users . . . should not know (or care) on which machine . . . their programs are running,” with the much less transparent notion of a network operating system. The authors then discuss various principles and techniques relating to the design of such systems, particularly in regard to communication primitives, naming and protection, resource management, fault tolerance, and services provided. It concludes with sketches of four actual distributed systems: the Cambridge distributed computing system, Amoeba, the V Kernel, and the Eden project. On the whole, the paper is accurate, complete, and clear. This reviewer recommends it highly, with the following reservations: (1) The various aspects of a distributed system are perhaps discussed a bit too independently of one another. It is not mentioned how certain concepts fit together and how some concepts solve (or cause) several problems at once. (2) Criticisms of several popular ideas are often too gentle, frequently only being implied. The inefficiency of the OSI design is noted, without mentioning how unnecessary its complexity is. Highly theoretical analyses, particularly in regard to scheduling, are discussed at length, with only a passing phrase or two about their impracticality. Much of the discussion is phrased so as to sound familiar and comforting to UNIX aficionados, until late in the paper a reference is made to “poor performance and a fair amount of effort spent trying to convince UNIX to do things against its will.”


Published in ACM Computing Surveys

Association for Computing Machinery, New York, NY, United States

  • 221 Total Citations View Citations
  • 15,363 Total Downloads
  • Downloads (Last 12 months) 1,348
  • Downloads (Last 6 weeks) 100
  • Tanenbaum A Renesse R Staveren H Sharp G Mullender S Jansen J Rossum G (2022) Experiences with the Amoeba Distributed Operating System Classic Operating Systems 10.1007/978-1-4757-3510-9_25 (550-586) Online publication date: 3-Aug-2022 https://doi.org/10.1007/978-1-4757-3510-9_25
  • Hale K Gunawi H Ma X (2021) Coalescent computing Proceedings of the 12th ACM SIGOPS Asia-Pacific Workshop on Systems 10.1145/3476886.3477503 (79-88) Online publication date: 24-Aug-2021 https://dl.acm.org/doi/10.1145/3476886.3477503
  • Brightwell R Ferreira K Maccabe A Pedretti K Riesen R (2019) Sandia Line of LWKs Operating Systems for Supercomputers and High Performance Computing 10.1007/978-981-13-6624-6_3 (23-46) Online publication date: 16-Oct-2019 https://doi.org/10.1007/978-981-13-6624-6_3
  • Chen Y Sun M (2018) TSSA: A two step scheduling algorithm for the event-driven clusters 2018 20th International Conference on Advanced Communication Technology (ICACT) 10.23919/ICACT.2018.8323690 (184-189) Online publication date: Feb-2018 https://doi.org/10.23919/ICACT.2018.8323690
  • van Haren P Oomens N (2017) Data Acquisition Systems for Fusion Devices Fusion Technology 10.13182/FST93-A30189 24 :4 (391-402) Online publication date: 9-May-2017 https://doi.org/10.13182/FST93-A30189
  • Alappatt A (2017) Network Applications Are Interactive Queue 10.1145/3134434.3145628 15 :4 (89-113) Online publication date: 1-Aug-2017 https://dl.acm.org/doi/10.1145/3134434.3145628
  • Alappatt A (2017) Network applications are interactive Communications of the ACM 10.1145/3133319 61 :1 (46-53) Online publication date: 27-Dec-2017 https://dl.acm.org/doi/10.1145/3133319


A curated list to learn about distributed systems

theanalyst/awesome-distributed-systems


Awesome-distributed-systems.

A (hopefully) curated list of awesome material on distributed systems, inspired by other awesome frameworks like awesome-python. Most links will tend to be readings on architecture rather than code.

Read things here before you start.

  • CAP Theorem, also a plain English explanation
  • Fallacies of Distributed Computing, expect things to break, everything
  • Distributed systems theory for the distributed engineer, most of the papers/books in the blog might reappear in this list again. Still a good BFS approach to distributed systems.
  • FLP Impossibility Result (paper), and an easier blog post to follow along
  • An Introduction to Distributed Systems, @aphyr's excellent introduction to distributed systems
  • Distributed Systems for fun and profit [Free]
  • Distributed Systems Principles and Paradigms, Andrew Tanenbaum [Free with registration]
  • Scalable Web Architecture and Distributed Systems [Free]
  • Principles of Distributed Systems [Free] [ETH Zurich University]
  • Making reliable distributed systems in the presence of software errors, [Free] Joe Armstrong's (author of Erlang) PhD thesis
  • Designing Data Intensive Applications [Amazon Link]
  • Distributed Machine Learning Patterns, Yuan Tang , Practical patterns for scaling machine learning from your laptop to a distributed cluster
  • Distributed Computing, Hagit Attiya and Jennifer Welch
  • Distributed Algorithms, Nancy Lynch [Amazon Link]
  • Impossibility Results for Distributed Computing [paywall]
  • Designing Distributed Systems, Brendan Burns [Free with registration]
  • Distributed Systems: Concepts and Design, George Coulouris [Amazon Link]
  • Akka in Action, Second Edition
  • Systemantics: how systems work and especially how they fail
  • Think Distributed Systems [Free with subscription]

Must read papers on distributed systems. While nearly all of Lamport's work should feature here, just adding a few that must be read.

  • Time, Clocks, and the Ordering of Events in a Distributed System, Lamport's paper, the quintessential distributed systems primer
  • Session Guarantees for Weakly Consistent Replicated Data, a '94 paper that discusses various recommendations for session guarantees in eventually consistent systems; many of these, like monotonic reads and read your writes, are standard vocabulary in other distributed systems papers.
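The ordering idea behind Lamport's paper can be illustrated with a logical clock. The following is a minimal sketch, not code from the paper; the class and method names (`LamportClock`, `tick`, `receive`) are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Logical clock: a counter that only moves forward."""
    time: int = 0

    def tick(self) -> int:
        # Local event or message send: advance the counter by one.
        self.time += 1
        return self.time

    def receive(self, sender_time: int) -> int:
        # On receipt, jump past both our own time and the sender's timestamp.
        self.time = max(self.time, sender_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()              # local event on A
stamp = a.tick()      # A sends a message carrying this timestamp
b.receive(stamp)      # B's clock moves past the timestamp it received
```

The receive rule is what guarantees that a message's receipt is always timestamped later than its send, even though the two processes never share a physical clock.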

Storage & Databases

  • Dynamo: Amazon's Highly Available Key Value Store. Paraphrasing @fogus from their blog, it is very rare for a paper describing an active production system to influence the state of active research in any industry; this is one of those seminal distributed systems papers that solves the problem of a highly available and fault-tolerant database in an elegant way, later paving the way for systems like Cassandra and many other AP systems that use consistent hashing.
  • Bigtable: A Distributed Storage System for Structured Data
  • The Google File System
  • Cassandra: A Decentralized Structured Storage System. Inspired heavily by Dynamo, and now open source
  • CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data, the algorithm at the basis of the Ceph distributed storage system; for the architecture itself, read RADOS
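Consistent hashing, mentioned above in connection with Dynamo-style systems, can be sketched in a few lines. This is an illustrative ring with virtual nodes, not the partitioning scheme of any specific system; the names `HashRing`, `lookup`, and the `node-a` style identifiers are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Any stable hash works; MD5 is used only for an even spread, not security.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes=64):
        # Each node is placed at several points ("virtual nodes") on the ring.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key: str) -> str:
        # Walk clockwise to the first point at or after the key's hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect_left(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"key-{i}" for i in range(1000)]
before = HashRing(["node-a", "node-b", "node-c"])
after = HashRing(["node-a", "node-b", "node-c", "node-d"])
moved = [k for k in keys if before.lookup(k) != after.lookup(k)]
```

The property worth noticing is that every key in `moved` now belongs to the newly added node; keys never shuffle between the existing nodes, which is what makes the scheme attractive when cluster membership changes.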

Messaging systems

  • The Log: What every software engineer should know about real-time data's unifying abstraction, a somewhat long read that brilliantly covers logs, which are at the heart of most distributed systems
  • Kafka: a Distributed Messaging System for Log Processing

Distributed Consensus and Fault-Tolerance

  • Practical Byzantine Fault Tolerance
  • The Byzantine Generals Problem
  • Impossibility of Distributed Consensus with One Faulty Process
  • The Part-Time Parliament, Lamport's original Paxos paper, a bit difficult to understand; may require multiple passes
  • Paxos Made Simple, a terser, more readable Paxos paper by Lamport himself. Shorter and easier than the original.
  • The Chubby Lock Service for loosely coupled distributed systems, Google's lock service used for loosely coupled distributed systems. Sort of Paxos as a Service for building other distributed systems. Primary inspiration behind other service discovery and coordination tools like ZooKeeper, etcd, Consul, etc.
  • Paxos Made Live - An Engineering Perspective, Google's lessons from implementing systems atop Paxos. Demonstrates various practical issues encountered while implementing a theoretical concept.
  • Raft Consensus Algorithm, an alternative to Paxos for distributed consensus that is much simpler to understand. Do check out an interesting visualization of Raft
  • Conflict-free Replicated Data Types presents an approach for Strong Eventual Consistency which has been applied in projects such as Riak, Redis and Akka. A great talk on the subject by Martin Kleppmann can be found here
  • Speculative algorithms for global state synchronizations Azos.Sky.Server.Locking uses probability based QOS (Quality of Service)/Trust measure to ensure probability-based consensus. The approach avoids distributed state machine/phase synchronization and is very simple to understand and implement
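To make the idea of Strong Eventual Consistency from the CRDT entry above concrete, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest state-based CRDTs. The class name and replica identifiers are hypothetical; this is not the implementation used by Riak, Redis, or Akka.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged with element-wise max."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount: int = 1) -> None:
        # A replica only ever increases its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Max is commutative, associative, and idempotent, so replicas converge
        # regardless of merge order or repeated delivery.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
b.merge(a)  # merging the same state twice changes nothing
```

After the merges both replicas hold the same per-replica counts and report the same total, without any coordination beyond exchanging state.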

Testing, monitoring and tracing

While designing distributed systems is hard enough, testing them is even harder.

  • Dapper, Google's large-scale distributed-systems tracing infrastructure; this was also the basis for the design of open source projects such as Zipkin, Apache SkyWalking, Pinpoint and HTrace.

Programming Models

  • Distributed Programming Model
  • PSync: a partially synchronous language for fault-tolerant distributed algorithms Video: Conference Video
  • Programming Models for Distributed Computing
  • Logic and Lattices for Distributed Programming

Verification of Distributed Systems

  • Curated list of resources on testing distributed systems includes links to materials on testing by various companies (Google, Amazon, Netflix, Microsoft, Dropbox, etc) and research papers.
  • Jepsen, a framework for distributed systems verification, with fault injection. @aphyr has featured enough times in this list already, but Jepsen and the blog posts that go with it are a quintessential addition to any distributed systems reading list.
  • Verdi, A Framework for Implementing and Formally Verifying Distributed Systems (Paper)
  • Distributed Deep Dive interview series by Ably Realtime.
  • Distributed Systems in One Lesson by Tim Berglund
  • Reliable Distributed Algorithms, Part 1 , KTH Sweden
  • Reliable Distributed Algorithms, Part 2 , KTH Sweden
  • Cloud Computing Concepts , University of Illinois
  • CMU: Distributed Systems in Go Programming Language
  • Software Defined Networking , Georgia Tech.
  • ETH Zurich: Distributed Systems
  • ETH Zurich: Distributed Systems Part 2 , covers Distributed control algorithms, communication models, fault-tolerance among other things. In particular fault tolerance issues (models, consensus, agreement) and replication issues (2PC,3PC, Paxos), which are critical in understanding distributed systems are explained in great detail.
  • Distributed Systems Course, a beginner course on distributed systems by Chris Colohan, a Google employee who contributed to SUIF, MapReduce, TCMalloc, Percolator, Caffeine, Borg, Omega, and Piper.
  • MIT 6.824, YouTube playlist of MIT distributed systems lectures; each video discusses papers like GFS, ZooKeeper, Raft, Spanner...
  • Distributed Systems, Lectures 9 to 16 of the Cambridge University lecture "Concurrent and Distributed Systems", given by Dr. Martin Kleppmann (YouTube playlist). An introductory computer science course that covers basic models and algorithms in distributed systems and also discusses CRDTs, collaboration software and Google's Spanner.

Blogs and other reading links

  • Amazon Builder's Library , a collection of Amazon's learnings on distributed systems
  • How we implemented consistent hashing efficiently
  • Notes on Distributed Systems for Young Bloods
  • High Scalability, several architectures of huge internet services, e.g. Twitter, WhatsApp
  • There is No Now , Problems with simultaneity in distributed systems
  • Turing Lecture: The Computer Science of Concurrency: The Early Years , An article by Leslie Lamport on concurrency
  • The Paper Trail blog, a very readable blog covering various aspects of distributed systems
  • aphyr, posts in the Jepsen series are pretty awesome
  • All Things Distributed - Werner Vogels' (Amazon CTO) blog on distributed systems
  • Distributed Systems: Take Responsibility for Failover
  • The C10K problem
  • On Designing and Deploying Internet-Scale Services
  • Files are hard A blog post on filesystem consistency, pretty important to read if you are into distributed storage or databases.
  • Distributed Systems Testing: The Lost World. Testing distributed systems is hard enough; this is a well-researched blog post that links to various approaches and other papers
  • SWIM Protocol explained A blog post on popular SWIM failure detector
  • ACM Symposium on Principles of Distributed Computing (PODC) and International Symposium on Distributed Computing (DISC) , a list of resources from PODC–DISC community including conference series, mailing lists, youtube, twitter, etc.
  • IEEE International Parallel & Distributed Processing Symposium (IPDPS) , an international forum for engineers and scientists to present their latest research findings.
  • Springer Distributed Computing Journal , a journal about theory, design, specification, and implementation of distributed systems.

Other lists like this one

  • Readings in distributed systems
  • Distributed Systems meta list
  • List of required readings for Distributed Systems Part of CMU's Engineering Distributed Systems course
  • The Distributed Reader
  • A Distributed Systems Reading List , A collection of material, mostly papers on Distributed Systems Theory as well as seminal industry papers
  • Distributed Systems Readings , A comprehensive list of online courses related to distributed systems
  • Awesome Distributed Consensus , Another list of materials on distributed consensus protocols
  • Beginner's Guide to Distributed Systems A blog post with some useful getting started links for distributed systems


A Distributed Systems Reading List

Introduction.

I often argue that the toughest thing about distributed systems is changing the way you think. The below is a collection of material I've found useful for motivating these changes.

Thought Provokers

Ramblings that make you think about the way you design. Not everything can be solved with big servers, databases and transactions.

  • Harvest, Yield and Scalable Tolerant Systems - Real world applications of CAP from Brewer et al
  • On Designing and Deploying Internet Scale Services - James Hamilton
  • The Perils of Good Abstractions - Building the perfect API/interface is difficult
  • Chaotic Perspectives - Large scale systems are everything developers dislike - unpredictable, unordered and parallel
  • Data on the Outside versus Data on the Inside - Pat Helland
  • Memories, Guesses and Apologies - Pat Helland
  • SOA and Newton's Universe - Pat Helland
  • Building on Quicksand - Pat Helland
  • Why Distributed Computing? - Jim Waldo
  • A Note on Distributed Computing - Waldo, Wollrath et al
  • Stevey's Google Platforms Rant - Yegge's SOA platform experience
  • Latency Exists, Cope! - Commentary on coping with latency and its architectural impacts
  • Latency - the new web performance bottleneck - not at all new (see Patterson), but noteworthy
  • The Tail At Scale - the challenges inherent in dealing with latency in large-scale systems

Somewhat about the technology but more interesting is the culture and organization they've created to work with it.

  • A Conversation with Werner Vogels - Coverage of Amazon's transition to a service-based architecture
  • Discipline and Focus - Additional coverage of Amazon's transition to a service-based architecture
  • Vogels on Scalability
  • SOA creates order out of chaos @ Amazon

Current "rocket science" in distributed systems.

  • Chubby Lock Manager
  • Google File System
  • Data Management for Internet-Scale Single-Sign-On
  • Dremel: Interactive Analysis of Web-Scale Datasets
  • Large-scale Incremental Processing Using Distributed Transactions and Notifications
  • Megastore: Providing Scalable, Highly Available Storage for Interactive Services - Smart design for low latency Paxos implementation across datacentres.
  • Spanner - Google's scalable, multi-version, globally-distributed, and synchronously-replicated database.
  • Photon - Fault-tolerant and Scalable Joining of Continuous Data Streams. Joins are tough especially with time-skew, high availability and distribution.
  • Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing - Data warehousing system that stores critical measurement data related to Google's Internet advertising business.

Consistency Models

Key to building systems that suit their environments is finding the right tradeoff between consistency and availability.

  • CAP Conjecture - Consistency, Availability, Partition Tolerance cannot all be satisfied at once
  • Consistency, Availability, and Convergence - Proves the upper bound for consistency possible in a typical system
  • CAP Twelve Years Later: How the "Rules" Have Changed - Eric Brewer expands on the original tradeoff description
  • Consistency and Availability - Vogels
  • Eventual Consistency - Vogels
  • Avoiding Two-Phase Commit - Two phase commit avoidance approaches
  • 2PC or not 2PC, Wherefore Art Thou XA? - Two phase commit isn't a silver bullet
  • Life Beyond Distributed Transactions - Helland
  • If you have too much data, then 'good enough' is good enough - NoSQL, Future of data theory - Pat Helland
  • Starbucks doesn't do two phase commit - Asynchronous mechanisms at work
  • You Can't Sacrifice Partition Tolerance - Additional CAP commentary
  • Optimistic Replication - Relaxed consistency approaches for data replication

Papers that describe various important elements of distributed systems design.

  • Distributed Computing Economics - Jim Gray
  • Rules of Thumb in Data Engineering - Jim Gray and Prashant Shenoy
  • Fallacies of Distributed Computing - Peter Deutsch
  • Impossibility of distributed consensus with one faulty process - also known as FLP [access requires account and/or payment, a free version can be found here ]
  • Unreliable Failure Detectors for Reliable Distributed Systems. A method for handling the challenges of FLP
  • Lamport Clocks - How do you establish a global view of time when each computer's clock is independent
  • The Byzantine Generals Problem
  • Lazy Replication: Exploiting the Semantics of Distributed Services
  • Scalable Agreement - Towards Ordering as a Service
  • Scalable Eventually Consistent Counters over Unreliable Networks - Scalable counting is tough in an unreliable world

Languages and Tools

Issues of distributed systems construction with specific technologies.

  • Programming Distributed Erlang Applications: Pitfalls and Recipes - Building reliable distributed applications isn't as simple as merely choosing Erlang and OTP.

Infrastructure

  • Principles of Robust Timing over the Internet - Managing clocks is essential for even basics such as debugging
  • Consistent Hashing and Random Trees
  • Amazon's Dynamo Storage Service

Paxos Consensus

Understanding this algorithm is the challenge. I would suggest reading "Paxos Made Simple" before the other papers and again afterward.

  • The Part-Time Parliament - Leslie Lamport
  • Paxos Made Simple - Leslie Lamport
  • Paxos Made Live - An Engineering Perspective - Chandra et al
  • Revisiting the Paxos Algorithm - Lynch et al
  • How to build a highly available system with consensus - Butler Lampson
  • Reconfiguring a State Machine - Lamport et al - changing cluster membership
  • Implementing Fault-Tolerant Services Using the State Machine Approach: a Tutorial - Fred Schneider

Other Consensus Papers

  • Mencius: Building Efficient Replicated State Machines for WANs - consensus algorithm for wide-area network
  • In Search of an Understandable Consensus Algorithm - The extended version of the Raft paper, an alternative to Paxos.

Gossip Protocols (Epidemic Behaviours)

  • How robust are gossip-based communication protocols?
  • Astrolabe: A Robust and Scalable Technology For Distributed Systems Monitoring, Management, and Data Mining
  • Epidemic Computing at Cornell
  • Fighting Fire With Fire: Using Randomized Gossip To Combat Stochastic Scalability Limits
  • Bi-Modal Multicast
  • ACM SIGOPS Operating Systems Review - Gossip-based computer networking
  • SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol
  • Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications
  • Kademlia: A Peer-to-peer Information System Based on the XOR Metric
  • Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems
  • PAST: A large-scale, persistent peer-to-peer storage utility - storage system atop Pastry
  • SCRIBE: A large-scale and decentralised application-level multicast infrastructure - wide area messaging atop Pastry
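The XOR metric named in the Kademlia title above is easy to illustrate. This is a minimal sketch using small 4-bit IDs for readability (real deployments use much longer identifiers); the helper names `xor_distance` and `closest` are hypothetical.

```python
def xor_distance(a: int, b: int) -> int:
    # Distance between two IDs is their bitwise XOR read as an integer.
    return a ^ b


def closest(target: int, node_ids, k: int = 2):
    # Return the k known node IDs nearest to the target under XOR distance.
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:k]


nodes = [0b0001, 0b0100, 0b0111, 0b1100]
nearest = closest(0b0101, nodes)
```

Here the distances from `0b0101` are 4, 1, 2, and 9 respectively, so the two nearest IDs are `0b0100` and `0b0111`. Because XOR is symmetric, a node that considers another node close is also considered close by it.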


Research on SDP-BF Method with Low False Positive Face to Passive Detection System

1. Introduction

  • SDP-DCBF-SFF is proposed, in which the interested endpoints are coarsely filtered by DCBF and the uninterested endpoints are accurately filtered by SFF. It eliminates all the false positive endpoint connection requests caused by the Hash collision of Bloom filter without generating extra flow overhead.
  • We propose an exit mechanism for elements in SFF based on the vitality factor, allowing “false positive” endpoints to exit SFF.
  • A network performance evaluation index based on the correction of false positive rate is proposed. The evaluation accuracy of network resource consumption in the discovery process is enhanced by introducing the false positive rate into the total number of messages in the network and other indexes.
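The false positives discussed above come from the way a Bloom filter answers membership queries. The following is a minimal sketch of a plain Bloom filter, not the paper's DCBF or SFF; the class, its parameters (`size_bits`, `num_hashes`), and the endpoint keys are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits: int = 64, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter()
for key in ["A1", "A2"]:
    bf.add(key)

present = [bf.might_contain(k) for k in ["A1", "A2"]]
# Keys that were never added but still pass the check are false positives.
false_positives = [k for k in (f"B{i}" for i in range(1000)) if bf.might_contain(k)]
```

Added keys are always reported as possibly present, so there are no false negatives. Any entries in `false_positives` are keys whose bit positions happen to collide with bits set by other keys, which is the kind of spurious match that a second, exact filtering stage would have to catch.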

2. System Model and Basic Definitions

2.1. Distributed Non-Cooperative Perception System Model

2.2. Data Distribution Protocol

2.3. Traditional SDP and SDP-BF

3. Proposed Discovery Protocol SDP-DCBF-SFF

3.1. The Proposed Discovery Protocol Model

  1. Create participants and endpoints. Create local participants and remote participants in the communication network as well as the corresponding participant endpoint information. To better explain the algorithm, we set Local Participant A and Remote Participant B, Endpoints A1 and A2 of Participant A, and Endpoints B1 and B2 of Participant B.
  2. Detect participants and endpoints in the network. Detect all participants in the network and the endpoint information of the participants.
  3. Configure initial parameters. Configure the SFF parameters of A to B in Participant A as an empty array and the DCBF parameters as the KEY of Endpoints A1 and A2 after Hash mapping; configure the SFF parameters of B to A in Participant B as an empty array and the DCBF parameters as the KEY of Endpoints B1 and B2 after Hash mapping.
  4. Packetize MPDP messages. Participant A generates a PDP message according to the DDS protocol, which constitutes an MPDP transmission message packet together with the KEY value stored in DCBF and the parameter information in SFF.
  5. Network transmission. Transmit the messages packetized in Step 4 to the remote participants through the network.
  6. Judge MPDP message matching. Participant B checks the MPDP message packets received from Participant A against the DDS interoperability standard. If the protocol version, supplier identification, supported discovery protocol, etc. match on both sides, go to Step 7. Otherwise, finish the current discovery process, return to Step 2, and redetect the participants.
  7. Judge DCBF endpoint matching in MEDP. Carry out independent Hash mapping on the endpoints of Participant B and match the mapping result with the DCBF KEY in the MPDP message packet from Participant A. If the endpoint is matched successfully, go to Step 8. Otherwise, repeat Step 7 for the next endpoint. If all the endpoints are mismatched, finish the current discovery process, return to Step 2, and redetect the participants.
  8. Judge SFF endpoint matching in MEDP. Compare the endpoints screened in Step 7 with the information stored in the SFF of Participant B. If the current endpoint is not from SFF, store the information in the SFF and maintain the SFF (see Section 3.2.2). Otherwise, repeat Step 8 for the next endpoint. If all the endpoints are from SFF, finish the current discovery process, return to Step 2, and redetect the participants.
[Algorithm listing not recovered. Surviving labels: MPDP (all Endpoints in Participant; all N_ParticipantDATA; Discovery finish); MEDP (all Endpoints passed DCBF; Endpoint ∈ SFF; all Endpoints ∈ SFF; Life_Time < 10 LeaseDuration).]
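
The endpoint-matching steps of the discovery procedure (Steps 7 and 8) can be read as a two-stage filter. The sketch below is one possible reading in Python, not the authors' implementation. The names `match_endpoints`, `remote_dcbf_might_contain`, `sff`, and `is_really_interested` are hypothetical, the DCBF lookup and the exact endpoint match are stood in for by plain callables, and treating the SFF as a set of known false positives that later rounds skip is an assumption drawn from the introduction.

```python
def match_endpoints(local_endpoints, remote_dcbf_might_contain, sff, is_really_interested):
    """Two-stage endpoint filter (illustrative reading of Steps 7 and 8).

    local_endpoints: endpoint keys of the receiving participant.
    remote_dcbf_might_contain: callable standing in for the DCBF lookup;
        it may return True for a non-member (a false positive).
    sff: set of endpoint keys already known to be false positives.
    is_really_interested: callable standing in for the exact endpoint match.
    """
    matched = []
    for endpoint in local_endpoints:
        # Stage 1 (coarse): drop endpoints the Bloom filter definitely does not contain.
        if not remote_dcbf_might_contain(endpoint):
            continue
        # Stage 2 (exact): skip endpoints already recorded as false positives.
        if endpoint in sff:
            continue
        if is_really_interested(endpoint):
            matched.append(endpoint)
        else:
            # The Bloom filter said "maybe" but the exact match failed:
            # remember the endpoint so later rounds do not retry it.
            sff.add(endpoint)
    return matched
```

With a simulated hash collision on one endpoint, the first round records that endpoint in `sff` and the second round skips it without repeating the exact match.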

3.2. Composition of Key Modules

3.2.1. DCBF Module

  1. Set initial parameters. The initial sub-BF ID is set to 0, the BF value is set to an all-zero array, and the threshold of the false positive rate is set to 0.5. k independent Hash mapping functions are initialized.
  2. Detect the number of endpoints. The endpoints of participants are the elements to be mapped in DCBF. The DCBF module detects the number of elements of Participant B. If it detects an increase in elements, go to Step 3. If it detects a decrease in elements, go to Step 6. Otherwise, keep the detection state.
  3. Judge the threshold of the false positive rate. When a newly added element is detected, the current FP is calculated according to Equation (1). If FP is higher than the threshold, go to Step 4; otherwise, go to Step 5.
  4. Create a new BF. Create a new sub-BF with the same length and Hash functions, add 1 to its ID flag bit, set the sub-BF to the Active state, and then go to Step 5.
  5. Hash mapping of new endpoints. Carry out independent Hash mapping on the newly added endpoints in the DDS system 3 times, and add 1 to the count value at the corresponding position each time. The mapped sequence is stored in the sub-BF in the Active state, and the mapped result is the DCBF KEY. Then, go to Step 9.
  6. Judge the element to be deleted. When an element to be deleted is detected, judge its position. If it is the last element in the DCBF, go to Step 7; otherwise, go to Step 8.
  7. Delete the sub-BF. Delete the current sub-BF in the whole DCBF, decrease the sub-BF quantity counter by 1, and then go to Step 8.
  8. Delete the Hash mapping of endpoints. Carry out independent Hash mapping on the deleted endpoints in the DDS system 3 times, and subtract 1 from the count value at the corresponding position each time. The mapped sequence is stored in the sub-BF in the Active state, and the mapped result is the DCBF KEY. Then, go to Step 9.
[Algorithm listing not recovered. Surviving labels: DCBF (DDS first initialization; all Endpoints in Participant; index(E) < Length_req; Endpoint is updated; Endpoint E deleted; count < Length_req; Endpoint E added; count > Length_req).]
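
The DCBF module steps can be sketched as a small dynamic counting Bloom filter. This is a minimal Python sketch under stated assumptions, not the authors' code. Equation (1) is not reproduced in this text, so the sketch substitutes the standard approximation (1 - e^(-kn/m))^k for the false positive rate. The SHA-256 based hashing, the filter length of 64 counters, and all class and method names are hypothetical. The threshold of 0.5 and the three hash mappings follow the steps. As a simplification, a sub-filter is dropped once it becomes empty.

```python
import hashlib
import math


class CountingBloomFilter:
    """One sub-filter: counters instead of bits, so elements can be deleted."""

    def __init__(self, size=64, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size
        self.count = 0  # number of elements currently stored

    def _positions(self, key):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big") % self.size
            for i in range(self.num_hashes)
        ]

    def add(self, key):
        for pos in self._positions(key):
            self.counters[pos] += 1
        self.count += 1

    def might_contain(self, key):
        # True may be a false positive; False is always correct.
        return all(self.counters[pos] > 0 for pos in self._positions(key))

    def remove(self, key):
        # Only safe for keys that were really added; a counting filter cannot verify that.
        if not self.might_contain(key):
            return False
        for pos in self._positions(key):
            self.counters[pos] -= 1
        self.count -= 1
        return True

    def estimated_fp(self, extra=0):
        # Standard approximation, assumed here in place of the paper's Equation (1).
        n = self.count + extra
        return (1.0 - math.exp(-self.num_hashes * n / self.size)) ** self.num_hashes


class DynamicCountingBloomFilter:
    """Chain of sub-filters; the last one is the Active sub-filter."""

    def __init__(self, size=64, num_hashes=3, fp_threshold=0.5):
        self.size = size
        self.num_hashes = num_hashes
        self.fp_threshold = fp_threshold
        self.sub_filters = [CountingBloomFilter(size, num_hashes)]  # sub-filter ID 0

    def add(self, key):
        # Open a new Active sub-filter if one more element would exceed the threshold.
        if self.sub_filters[-1].estimated_fp(extra=1) > self.fp_threshold:
            self.sub_filters.append(CountingBloomFilter(self.size, self.num_hashes))
        self.sub_filters[-1].add(key)

    def might_contain(self, key):
        return any(f.might_contain(key) for f in self.sub_filters)

    def remove(self, key):
        for f in self.sub_filters:
            if f.remove(key):
                # Simplification: drop a sub-filter once it is empty (keep at least one).
                if f.count == 0 and len(self.sub_filters) > 1:
                    self.sub_filters.remove(f)
                return True
        return False
```

With these illustrative parameters (64 counters, 3 hashes, threshold 0.5), the estimate crosses the threshold at the 34th element, so the first sub-filter holds 33 elements before a second one is opened.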

3.2.2. SFF Module

  1. Set initial parameters. Initialize SFF as an empty list.
  2. Update the SFF value. Participant A monitors the MPDP messages from Participant B and updates the B to A SFF of Participant B to the SFF of Participant A. This is divided into two processes. In Process 1, set the “life factor” for the new endpoint entering SFF and then go to Step 3. In Process 2, parse the MPDP messages. If the participant messages are matched, go to Step 4; otherwise, go to Step 2 again. The “life factor” characterizes the endpoint’s residence time in SFF, and it is the duration parameter value in LifespanQoS. The IDL definition is as follows: struct LifespanQosPolicy { Duration_t duration; };
  3. Judge the life factor threshold. Monitor the “life factor” value of each endpoint. When the “life factor” is greater than 0, go to Step 3 again. When it is equal to 0, remove the endpoint information from SFF and then go to Step 6.
[Algorithm listing not recovered. Surviving labels: SFF (receive Keys; all Endpoints Matched; Endpoints ∈ SFF; Publish/Subscribe).]
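
The life factor mechanism of the SFF module can be illustrated with a countdown per entry. The Python sketch below is a simplified assumption, not the authors' implementation: the class `SFF`, its methods, and the integer time units are hypothetical, and the life factor is modeled as a remaining duration that is decremented on each call to `tick` until the entry is removed at zero.

```python
class SFF:
    """Exact list of endpoint keys, each with a remaining life factor."""

    def __init__(self):
        self.entries = {}  # endpoint key -> remaining life factor

    def add(self, key, life_factor):
        # A new endpoint enters the SFF with its initial life factor.
        self.entries[key] = life_factor

    def __contains__(self, key):
        return key in self.entries

    def tick(self, elapsed=1):
        """Age every entry by `elapsed` and remove those whose life factor reaches 0.

        Returns the list of keys that left the SFF on this call.
        """
        expired = []
        for key in list(self.entries):
            self.entries[key] -= elapsed
            if self.entries[key] <= 0:
                del self.entries[key]
                expired.append(key)
        return expired
```

An endpoint that leaves the SFF this way becomes eligible for matching again in a later discovery round.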

3.3. Evaluation Index of Discovery Protocol Based on Correction of False Positive Rate

4. Simulation Results

4.1. Number of Messages

4.2. False Positive Rate

4.3. Transmission Delay

5. Conclusions

Author Contributions

Data Availability Statement

Conflicts of Interest

References

  • Zou, Y.; Wang, X.; Shen, W. Physical-layer security with multiuser scheduling in cognitive radio networks. IEEE Trans. Commun. 2013, 61, 5103–5113.
  • Kurte, R.; Salcic, Z.; Kevin, I.; Wang, K. A distributed service framework for the internet of things. IEEE Trans. Ind. Inform. 2019, 16, 4166–4176.
  • Chi, Y.; Liu, L.; Song, G.; Li, Y.; Guan, Y.L.; Yuen, C. Constrained Capacity Optimal Generalized Multi-User MIMO: A Theoretical and Practical Framework. IEEE Trans. Commun. 2022, 70, 8086–8104.
  • Wang, H.; Roman, H.E.; Yuan, L.; Huang, Y.; Wang, R. Connectivity, coverage and power consumption in large-scale wireless sensor networks. Comput. Netw. 2014, 75, 212–225.
  • Habib, H.F.; Fawzy, N.; Esfahani, M.M.; Mohammed, O.A.; Brahma, S. An Enhancement of Protection Strategy for Distribution Network Using the Communication Protocols. IEEE Trans. Ind. Appl. 2020, 56, 1240–1249.
  • DDS Version 1.4. Available online: https://www.omg.org/spec/DDS/ (accessed on 1 March 2015).
  • Köksal, Ö.; Tekinerdogan, B. Obstacles in data distribution service middleware: A systematic review. Future Gener. Comput. Syst. 2017, 68, 191–210.
  • Yoon, G.; Choi, J.; Park, H.; Choi, H. Topic naming service for DDS. In Proceedings of the 2016 International Conference on Information Networking (ICOIN), Kota Kinabalu, Malaysia, 13–15 January 2016; pp. 378–381.
  • Scordino, C.; Mariño, A.G.; Fons, F. Hardware Acceleration of Data Distribution Service (DDS) for Automotive Communication and Computing. IEEE Access 2022, 10, 109626–109651.
  • Abdellatif, S.; Berthou, P.; Villemur, T.; Simo, F. Management of industrial communications slices: Towards the Application Driven Networking concept. Comput. Commun. 2020, 155, 104–116.
  • Patgiri, R.; Biswas, A.; Nayak, S. Malicious URL detection using learned Bloom Filter and evolutionary deep learning. Comput. Commun. 2023, 200, 30–41.
  • Liu, L.F.; Miao, S.X.; Hu, H.P. Pseudorandom bit generator based on non-stationary logistic maps. IET Inf. Secur. 2016, 10, 87–94.
  • Al-Madani, B.; Khan, A.U.H.; Baig, Z.A. A novel mobility-aware data transfer service (MADTS) based on DDS standards. Arab. J. Sci. Eng. 2014, 39, 2843–2856.
  • Zhai, H.; Zhuang, Y.; Huo, Y. Publish/subscribe automatic discovery algorithm based on service ability vector. Comput. Eng. 2014, 40, 52–54. (In Chinese)
  • An, K.; Gokhale, A.; Schmidt, D.; Tambe, S.; Pazandak, P.; Pardo-Castellote, G. Content-based filtering discovery protocol (CFDP): Scalable and efficient OMG DDS discovery protocol. In Proceedings of the ACM International Conference on Distributed Event-Based Systems, Mumbai, India, 26–29 May 2014; pp. 130–141.
  • Jia, Y.; Xu, L.; Yang, Y.; Zhang, X. Lightweight automatic discovery protocol for OpenFlow-based software-defined networking. IEEE Commun. Lett. 2020, 24, 312–315.
  • Sanchez-Monedero, J.; Povedano-Molina, J.; Lopez-Vega, J.M.; Lopez-Soler, J.M. DDS-enabled Cloud management support for fast task offloading. In Proceedings of the IEEE Symposium on Computers and Communications, Cappadocia, Turkey, 1–4 July 2012; pp. 000067–000074.
  • Putra, H.A.; Kim, D.S. Node discovery scheme of DDS combat management system. Comput. Stand. Interfaces 2014, 37, 20–28.
  • Khaefi, M.R.; Im, J.-Y.; Kim, D.-S. An efficient DDS node discovery scheme for naval combat system. In Proceedings of the IEEE 20th Conference on Emerging Technologies & Factory Automation (ETFA), Luxembourg, 8–11 September 2015; pp. 1–8.
  • Nwadiugwu, W.P.; Cha, J.-H.; Kim, D.-S. Enhanced SDP-dynamic bloom filters for a DDS node discovery in real-time distributed systems. In Proceedings of the 2017 22nd IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Limassol, Cyprus, 12–15 September 2017; pp. 1–4.
  • Geng, H.; Li, Y. Publish subscribe automatic discovery algorithm based on hierarchical bloom filter. J. Comput. Eng. Des. 2019, 40, 3494–3499.
  • Fan, Z.; Zhang, T.; Liu, Z. DDS automatic discovery algorithm based on single hash count bloom. J. Comput. Eng. Des. 2022, 43, 1964–1971.
  • Liu, Z.; Liu, S.; Fan, Z.; Zhao, Z. Low consumption automatic discovery protocol for DDS-based large-scale distributed parallel computing. Parallel Comput. 2023, 118, 103052.
  • Williams-Paul Nwadiugwu, D.; Kim, D.-S.; Ejaz, W.; Anpalagan, A. MAD-DDS: Memory-efficient automatic discovery data distribution service for large-scale distributed control network. IET Commun. 2023, 17, 1432–1446.
  • Li, J.; Zhang, Q.; Zhang, Y.; Wu, X.; Wang, X.; Su, Y. Hidden Phase Space Re-construction: A Novel Chaotic Time Series Prediction Method for Speech Signals. Chin. J. Electron. 2018, 27, 1221–1228.
  • Griffiths, H.; Willis, N. Klein Heidelberg—the first modern bistatic radar system. IEEE Trans. Aerosp. Electron. Syst. 2010, 46, 1571–1588.
  • Lee, J.; Byun, H.; Lim, H. Dual-load Bloom filter: Application for name lookup. Comput. Commun. 2020, 151, 1–9.
  • Blanco-Justicia, A.; Domingo-Ferrer, J. Efficient privacy-preserving implicit authentication. Comput. Commun. 2018, 125, 13–23.


Method (Author) | Memory Consumption | Positive Misjudgment | Extra Flow Overhead | Transmission Delay
Sanchez-Monedero [ ] ×××
Putra [ ] ××
Khaefi [ ] ××
Nwadiugwu [ ] ××
Geng [ ] ××
Fan [ ] ××
Liu [ ] ×
Nwadiugwu [ ] ××
Proposed discovery protocol

System Scale | P | E | Pub/Sub
Small | 15 | 540 | 210/280
Medium | 30 | 1300 | 540/790
Large | 45 | 1900 | 790/1110
Super large | 60 | 3150 | 930/1450

Jiang, C.; Li, J.; Yang, Y. Research on SDP-BF Method with Low False Positive Face to Passive Detection System. Electronics 2024, 13, 3240. https://doi.org/10.3390/electronics13163240


Development of the Online Data Processing System for the BM@N Experiment at NICA

  • COMPUTER TECHNOLOGIES IN PHYSICS
  • Published: 14 August 2024
  • Volume 21, pages 789–792 (2024)


  • E. Alexandrov
  • I. Alexandrov
  • A. Chebotov
  • I. Filozova
  • K. Gertsenberger
  • I. Romanov
  • G. Shestakova

Large modern high-energy physics experiments, including those of the NICA (Nuclotron-based Ion Collider fAcility) project at the Joint Institute for Nuclear Research, must collect, store, and process huge amounts of experimental data, which places corresponding performance requirements on their online systems. The online data processing system developed for the BM@N experiment within the NICA project is based on a distributed architecture, which lets it meet these requirements through scalability and parallel computing. The purpose of the online system is selective data processing (conversion to event digits in the CERN ROOT format and fast event reconstruction) and data monitoring of the ongoing experiment. To this end, the FairMQ package implemented by the FAIR collaboration (GSI Institute, Germany) was chosen so that distributed processes running on the nodes of the computing infrastructure can communicate by exchanging messages. One of the issues in developing and using such systems is the distributed launch and control of the processes; this has been solved with the FAIR DDS (Dynamic Deployment System) toolkit. The BM@N online system starts the predefined software tasks in the required sequence and allows them to be managed during sessions, including the transmission of messages between the tasks and the update of some properties. The paper presents the purposes and architecture of the Online Data Processing System for the BM@N experiment and the features of the current implementation.


ROOT macros contain pure C++ code, which is interpreted at runtime.

The Condition Database stores various parameters that are used in the data processing algorithms.

The FairRunAna manager is part of the FairRoot infrastructure and is used to perform event reconstruction and physics data analysis tasks.


This work was supported by ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.

Author information

Authors and affiliations.

Joint Institute for Nuclear Research, 141980, Dubna, Moscow oblast, Russia

E. Alexandrov, I. Alexandrov, A. Chebotov, I. Filozova, K. Gertsenberger, I. Romanov & G. Shestakova


Corresponding author

Correspondence to I. Romanov.

Ethics declarations

The authors of this work declare that they have no conflicts of interest.


About this article

Alexandrov, E., Alexandrov, I., Chebotov, A. et al. Development of the Online Data Processing System for the BM@N Experiment at NICA. Phys. Part. Nuclei Lett. 21, 789–792 (2024). https://doi.org/10.1134/S154747712470136X


Received: 01 February 2024

Revised: 12 February 2024

Accepted: 20 February 2024

Published: 14 August 2024

Issue Date: August 2024

DOI: https://doi.org/10.1134/S154747712470136X


COMMENTS

  1. IEEE Transactions on Parallel and Distributed Systems


  2. Distributed Systems

    1. Distributed computer control The main aim of this paper is to present a review of the literature concerning distributed systems and large scale systems modelling and conjunction of these subjects. In the experimental part, an example of a large scale system model design is presented.

  3. A brief introduction to distributed systems

    Distributed systems are by now commonplace, yet remain an often difficult area of research. This is partly explained by the many facets of such systems and the inherent difficulty to isolate these facets from each other. In this paper we provide a brief overview of distributed systems: what they are, their general design goals, and some of the most common types.

  4. Journal of Parallel and Distributed Computing

    The journal publishes original research papers and timely review articles on the theory, design, evaluation, and use of parallel and/or distributed computing systems. The journal also features special issues on these topics, again covering the full range from the design to the use of our targeted systems. Research Areas Include:

  5. The evolution of distributed computing systems: from ...

    Distributed systems have been an active field of research for over 60 years, and have played a crucial role in computer science, enabling the invention of the Internet that underpins all facets of modern life. Through technological advancements and their changing role in society, distributed systems have undergone a perpetual evolution, with each change resulting in the formation of a new ...

  6. 320007 PDFs

    A distributed system consists of multiple autonomous... | Explore the latest full-text research PDFs, articles, conference papers, preprints and more on DISTRIBUTED SYSTEMS. Find methods ...

  7. Distributed Systems and Parallel Computing

    From our company's beginning, Google has had to deal with both issues in our pursuit of organizing the world's information and making it universally accessible and useful. We continue to face many exciting distributed systems and parallel computing challenges in areas such as concurrency control, fault tolerance, algorithmic efficiency, and ...

  8. PDF Future Directions for Parallel and Distributed Computing

    achieve a set of goals that go beyond traditional systems aims to meet society's needs for more scalable, energy-efficient, reliable, verifiable, and secure computing systems. 2 Recommendations This section briefly summarizes top-level recommendations for research in parallel and distributed computing.

  9. A Comparative Study of Consensus Algorithms for Distributed Systems

    Consensus algorithms are sometimes very difficult to understand and therefore implement correctly. In this paper we chose to complete a comparative study between three different consensus algorithms Raft, Paxos, and pBFT. We provided our implementation for the three algorithms with details of the assumptions taken.

  10. Dapper, a Large-Scale Distributed Systems Tracing Infrastructure

    Here we introduce the design of Dapper, Google's production distributed systems tracing infrastructure, and describe how our design goals of low overhead, application-level transparency, and ubiquitous deployment on a very large scale system were met. Dapper shares conceptual similarities with other tracing systems, particularly Magpie [3 ...

  11. Papers

    The papers from SOSP 2013. DSRG is a Distributed Systems Reading Group at MIT. We meet once a week on the 9th floor of Stata to discuss distributed systems research papers, and cover papers from conferences like SOSP, OSDI, PODC, VLDB, and SIGMOD. We try to have a healthy mix of current systems papers and older seminal papers.

  12. A brief introduction to distributed systems

    A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system. This definition refers to two characteristic features of distributed ...

  13. A Perspective on Distributed Computer Systems

    Distributed computer systems have been the subject of a vast amount of research. Many prototype distributed computer systems have been built at university, industrial, commercial, and government research laboratories, and production systems of all sizes and types have proliferated. It is impossible to survey all distributed computing system research. Instead, this paper identifies six ...

  14. Distributed operating systems

    This paper is intended as an introduction to distributed operating systems, and especially to current university research about them. After a discussion of what constitutes a distributed operating system and how it is distinguished from a computer network, various key design issues are discussed.

  15. Distributed Systems and Recent Innovations: Challenges and Benefits

    It is worth noting that a vast majority of distributed systems (e.g., [8]) can be represented by this system model. A similar system model can be found in the literature [11]. ...

  16. Distributed Systems Research Papers

    Development of secured and trusted distributed systems is a critical research issue. This paper is a contribution towards the summarization of work carried out in this field as well as identifies new research lines. In this paper several approaches about security aspects in distributed systems are covered.

  17. PDF Time, Clocks, and the Ordering of Events in a Distributed System

    ... realize that the order in which events occur is only a partial ordering. We believe that this idea is useful in understanding any multiprocess system. It shou ... suppose it is sent at time t and received at time t'. We pretend that m has a clock Cm which runs at a constant rate such that Cm(t) = tm and Cm(t') = tm + μm. Then ...

  18. theanalyst/awesome-distributed-systems

    Must read papers on distributed systems. While nearly all of Lamport's work should feature here, just adding a few that must be read. Times, Clocks and Ordering of Events in Distributed Systems: Lamport's paper, the quintessential distributed systems primer; Session Guarantees for Weakly Consistent Replicated Data: a '94 paper that talks about various recommendations for session guarantees for ...

  19. Distributed System Research Papers

    In this paper, we describe how the distributed systems paradigm can be extended to provide a unified abstraction for both hardware and software components. Moreover, based on that abstraction, we define a low-overhead system-wide communication architecture that offers communication transparency between all kinds of components.

  20. Distributed Systems Reading List

    Current "rocket science" in distributed systems. MapReduce. Chubby Lock Manager. Google File System. BigTable. Data Management for Internet-Scale Single-Sign-On. Dremel: Interactive Analysis of Web-Scale Datasets. Large-scale Incremental Processing Using Distributed Transactions and Notifications.

  21. (PDF) Distributed Computing: An Overview

    research papers in International Journals/Conferences in . ... In distributed system, the most common important factor is the information collection about loads on different nodes.

  22. PDF Bigtable: A Distributed Storage System for Structured Data

    Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands

  23. Electronics

    With the rapid development of 5G, UAV, and military communications, the data volume obtained by the non-cooperative perception system has increased exponentially, and the distributed system has become the development trend of the non-cooperative perception system. The data distribution service (DDS) produces a significant effect on the performance of distributed non-cooperative perception ...

  24. High‐power radio frequency wireless energy transfer system

    1 INTRODUCTION 1.1 Motivation and problem description. Removing the physical contact between power source and electrical components by using approaches such as autonomous feeding and wireless power transfer was always a problem [1-4]. For this purpose, the solar photovoltaic (PV) solution beside some other renewable energy harvesting methods grew, but wireless energy transfer cause remained ...

  25. Distributed consensus time‐varying optimization algorithm for multi

    Firstly, by utilizing exponential function, a novel criterion with respect to the stability in predefined time is developed for the nonlinear systems. Secondly, a new distributed switching control protocol, which can inhibit the external unbounded disturbances, is piecewise designed to drive the states of agents to the sliding mode surface via ...

  26. (PDF) Distributed computing systems

    Distributed computing systems refer to a network of computers that work together to achieve a common goal. In a distributed computing system, individual computers are connected to each other ...

  27. FIT5046-Assessment 2- Research Paper Presentation-2024

    Information-systems document from Monash University, 3 pages, FIT5046 (Mobile and Distributed Computing Systems) Assessment 2: Research Paper Analysis Presentation (15%) Group Assignment (Groups of 2) The 10-minute presentation in Week 7 Attendance Compulsory Submission of the PowerPoint file to Moodle by Monday 15

  28. Development of the Online Data Processing System for the BM@N

    Abstract A huge amount of experimental data should be collected, stored and processed in large modern high-energy physics experiments, including the experiments of the NICA (Nuclotron-based Ion Collider fAcility) project at the Joint Institute for Nuclear Research. In this regard, corresponding performance requirements are put forward for existing online systems. The online data processing ...