Data Synchronization

Sync

Definition
An automated process that ensures that the data in store A is made equal to that in store B.

How does it work?

Data is stored in multiple copies, either in multiple databases or in multiple file systems. The process has a number of variations:

  • All data of store A may need to be synchronized with store B, or just part of the data.
  • The data may need to be transformed from A's format to B's format.
  • The synchronization may be one-way (only A to B) or two-way (B to A as well).
  • The synchronizer may copy all data to be synced every time, or it may calculate the difference between the stores and sync only that difference.
  • The synchronization may happen periodically or based on a trigger caused by a change in the data.
  • There may be a single source of truth that acts as the main authority, or the stores may be equal peers.
  • If multiple sources of events need to be synchronized, clocks may be necessary to ensure that the events are integrated in the correct order.
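
The difference-based variant can be illustrated with a minimal sketch. It assumes each store is a plain in-memory Python dict and that the sync is one-way from A to B; the function names compute_diff, apply_diff and sync_one_way are hypothetical and chosen only for this example.

```python
# Sentinel that marks "key not present", so that a stored None still counts as a value.
_MISSING = object()


def compute_diff(source: dict, target: dict) -> tuple[dict, set]:
    """Return (upserts, deletes) needed to make target equal to source."""
    # Keys that are new in source or whose value differs from target.
    upserts = {k: v for k, v in source.items() if target.get(k, _MISSING) != v}
    # Keys that exist only in target and must be removed.
    deletes = set(target) - set(source)
    return upserts, deletes


def apply_diff(target: dict, upserts: dict, deletes: set) -> None:
    """Apply a previously computed difference to target in place."""
    target.update(upserts)
    for key in deletes:
        del target[key]


def sync_one_way(source: dict, target: dict) -> int:
    """Sync only the difference from source to target; return the number of changes."""
    upserts, deletes = compute_diff(source, target)
    apply_diff(target, upserts, deletes)
    return len(upserts) + len(deletes)


store_a = {"a": 1, "b": 2, "c": 3}
store_b = {"a": 1, "b": 20, "d": 4}
changed = sync_one_way(store_a, store_b)
print(changed, store_b == store_a)
```

Running the sync a second time finds no difference and changes nothing, which is what makes this variant cheaper than copying all data every time.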

If the data consists of files that are copied as single units, and the sync is two-way (or has multiple sources), it may be necessary to calculate the differences between files and to merge these differences.
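
One common way to merge such differences is a three-way merge. The sketch below is an illustration under assumptions that the text above does not state: each file is modelled as a dict of named fields, and a common base version (the last state both sides agreed on) is available. The name three_way_merge is hypothetical.

```python
# Sentinel that marks "field not present in this version".
_ABSENT = object()


def three_way_merge(base: dict, a: dict, b: dict) -> tuple[dict, list]:
    """Merge versions a and b, which were both derived from base.

    Returns (merged, conflicts). A field is a conflict when both sides
    changed it in different ways; the base value is kept for such fields
    until the conflict is resolved.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(a) | set(b)):
        v_base = base.get(key, _ABSENT)
        v_a = a.get(key, _ABSENT)
        v_b = b.get(key, _ABSENT)
        if v_a == v_b:         # both sides agree (including both deleting the field)
            result = v_a
        elif v_a == v_base:    # only B changed the field
            result = v_b
        elif v_b == v_base:    # only A changed the field
            result = v_a
        else:                  # both changed it differently
            conflicts.append(key)
            result = v_base
        if result is not _ABSENT:
            merged[key] = result
    return merged, conflicts


base = {"title": "Draft", "body": "Hello", "tag": "x"}
version_a = {"title": "Final", "body": "Hello", "tag": "x"}  # A renamed the title
version_b = {"title": "Draft", "body": "Hello world"}        # B edited the body, removed the tag
merged, conflicts = three_way_merge(base, version_a, version_b)
print(merged, conflicts)
```

Changes that touch different fields merge automatically; only fields changed on both sides end up in the conflict list.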

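If the data in store A differs from that in store B, the stores are inconsistent. It is the job of data synchronization to restore consistency. A special type of consistency is eventual consistency: the stores may differ temporarily, but they converge to the same state once all updates have been propagated.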
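
Eventual consistency between equal peers can be sketched as follows. The example assumes a last-writer-wins policy ordered by a simplified logical clock, with the replica name as a tie-breaker; this is only one possible policy, and the Replica class with its methods is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A peer store that accepts local writes and merges state from other peers."""
    name: str
    clock: int = 0                             # simplified logical clock
    data: dict = field(default_factory=dict)   # key -> (clock, replica name, value)

    def write(self, key, value) -> None:
        """Record a local write, stamped with the next clock value."""
        self.clock += 1
        self.data[key] = (self.clock, self.name, value)

    def merge_from(self, other: "Replica") -> None:
        """Adopt every entry of other that is newer than the local one."""
        for key, entry in other.data.items():
            # Compare (clock, replica name); the name breaks ties deterministically.
            if key not in self.data or entry[:2] > self.data[key][:2]:
                self.data[key] = entry
        # Later local writes must be ordered after everything seen so far.
        self.clock = max(self.clock, other.clock)

    def values(self) -> dict:
        """Return the plain key -> value view of the store."""
        return {key: entry[2] for key, entry in self.data.items()}


a, b = Replica("A"), Replica("B")
a.write("x", 1)
b.write("x", 2)
b.write("y", 3)
print(a.values() == b.values())   # the stores are temporarily inconsistent

a.merge_from(b)
b.merge_from(a)
print(a.values() == b.values())   # after exchanging state they have converged
```

Last-writer-wins silently discards the losing write, so it suits data where losing a concurrent update is acceptable.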

Examples

  • Database replication
  • Version control systems

When should you use it?

  • For performance reasons
  • When working remotely or offline

Problems

  • If multiple users change the same piece of data, the changes collide, and the collision may need to be resolved, sometimes manually.

Links