From Cyrus

Merge Index

Executive Summary

  • Super-fast expunge (just a flags update!)
  • Index file never rewritten underfoot (removes a pile of race conditions)
  • QRESYNC/CONDSTORE have access to information about deleted messages, in sorted order, with full metadata still available
  • Deletions get a modseq.

Preconditions

Racefree Locking

Index Format Changes

cyrus.index header

  • NUM_RECORDS - like EXISTS but includes expunged messages
  • need_cleanup mailbox flag - tells cyr_expire there are things that need doing
  • LAST_EXPIRE - timestamp of the last expire run, handy for replication and "opportunistic expire", where mailboxes get auto_expired on close if they haven't been expired for a while.

cyrus.index records

  • is_expunged flag - to know that the message should be skipped when looking at existing messages
  • file_removed flag - to show that the message file no longer exists
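
As a rough illustration of the fields listed above, here is a minimal sketch of how they might sit in memory. The struct names, the SKETCH_* bit values and the helper functions are all hypothetical; the proposal names the fields but does not fix a layout.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical bit assignments; the proposal names the flags, not their values. */
#define SKETCH_OPT_NEED_CLEANUP (1u << 0)  /* header: need_cleanup */
#define SKETCH_FLAG_EXPUNGED    (1u << 0)  /* record: is_expunged */
#define SKETCH_FLAG_FILE_GONE   (1u << 1)  /* record: message file no longer exists */

struct header_sketch {
    uint32_t exists;       /* visible (non-expunged) messages */
    uint32_t num_records;  /* all records, expunged ones included */
    uint32_t options;      /* carries need_cleanup */
    time_t   last_expire;  /* LAST_EXPIRE timestamp */
};

struct record_sketch {
    uint32_t uid;
    uint32_t message_flags;
};

/* EXISTS can never exceed NUM_RECORDS: every visible message has a record. */
static int header_is_sane(const struct header_sketch *h)
{
    assert(h != NULL);
    return h->exists <= h->num_records;
}

/* Number of expunged records still occupying slots in the index. */
static uint32_t expunged_count(const struct header_sketch *h)
{
    assert(header_is_sane(h));
    return h->num_records - h->exists;
}

/* A record is visible to clients unless is_expunged is set. */
static int record_is_visible(const struct record_sketch *r)
{
    return (r->message_flags & SKETCH_FLAG_EXPUNGED) == 0;
}

/* "Opportunistic expire": more than max_age seconds since the last run? */
static int expire_is_due(const struct header_sketch *h, time_t now, time_t max_age)
{
    return (now - h->last_expire) > max_age;
}
```

A close path could call expire_is_due() with the current time and a configured interval to decide whether to expire the mailbox on the spot.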

Implementation

Exists would still work the same way, but rather than this pattern:

for (msgno = 1; msgno <= mailbox->exists; msgno++) {
  mailbox_read_index_record(mailbox, msgno, &record);
  // do stuff
}

We would have:

for (recno = 1; recno <= mailbox->num_records; recno++) {
  mailbox_read_index_record(mailbox, recno, &record);
  if (record.message_flags & MAILBOX_FLAG_EXPUNGED)
    continue;
  // do stuff
}

To get the msgno to UID mapping in imapd, we would need to keep an in-memory mapping of which records are valid. We already have this with seenflag, and it would just be an array of record numbers, e.g.:

  • 1
  • 3
  • 4
  • 5
  • 9
  • 10

which could be a file with 10 records: 4 expunged (2, 6, 7 and 8) and 6 regular.

Then index_checkseen would reconcile that against the current state of the cyrus.index file and emit messages as appropriate.
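
The mapping described above can be sketched as a single pass over the records. This is only an illustration: build_msgno_map, record_sketch and SKETCH_FLAG_EXPUNGED are hypothetical names, and the records are taken from an in-memory array rather than read with mailbox_read_index_record.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_FLAG_EXPUNGED (1u << 0)  /* hypothetical bit for is_expunged */

struct record_sketch {
    uint32_t uid;
    uint32_t message_flags;
};

/*
 * Build the msgno -> recno map. On success returns 0, stores a malloc'd
 * array in *map_out such that (*map_out)[msgno - 1] is the 1-based record
 * number, and stores the number of visible messages in *exists_out.
 * The caller frees the array. Returns -1 if allocation fails.
 */
static int build_msgno_map(const struct record_sketch *records,
                           uint32_t num_records,
                           uint32_t **map_out, uint32_t *exists_out)
{
    uint32_t *map;
    uint32_t recno, exists = 0;

    assert(map_out != NULL && exists_out != NULL);

    /* Allocate at least one slot so an empty mailbox still gets a valid pointer. */
    map = malloc((num_records ? num_records : 1) * sizeof(*map));
    if (map == NULL)
        return -1;

    for (recno = 1; recno <= num_records; recno++) {
        if (records[recno - 1].message_flags & SKETCH_FLAG_EXPUNGED)
            continue;           /* expunged records get no msgno */
        map[exists++] = recno;  /* msgno is exists (after increment) */
    }

    *map_out = map;
    *exists_out = exists;
    return 0;
}
```

With records 2, 6, 7 and 8 expunged out of 10, the resulting array holds the record numbers 1, 3, 4, 5, 9, 10, matching the list above.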

EXPUNGE_DELAYED

  • Just set is_expunged and update the modseq on the record, then reduce exists and the answered, deleted and flagged counts in the header as appropriate.
  • No index rewrite cost.
  • cyr_expire: rewrite the index file, removing all is_expunged records (optionally keeping those less than X days old), and unlink the message files.

EXPUNGE_FORCE

  • Just set is_expunged and update the modseq on the record, then reduce exists and the answered, deleted and flagged counts in the header as appropriate.
  • Also unlink the message file and set the file_removed flag in the record.
  • optionally: set need_cleanup in the header.
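
The forced variant might look like the sketch below. It assumes message files are named "<uid>." inside the mailbox directory and that modseq values come from a highestmodseq header field; both are assumptions for illustration, as are all struct, macro and function names. The per-flag counters are left out for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical flag bits. */
#define SKETCH_FLAG_EXPUNGED     (1u << 0)
#define SKETCH_FLAG_FILE_REMOVED (1u << 1)
#define SKETCH_OPT_NEED_CLEANUP  (1u << 0)

struct header_sketch {
    uint32_t exists;
    uint32_t options;
    uint64_t highestmodseq;
};

struct record_sketch {
    uint32_t uid;
    uint32_t message_flags;
    uint64_t modseq;
};

/*
 * Expunge one record and remove its message file immediately.
 * Returns 0 on success, -1 if the path is too long or unlink fails.
 * Nothing is modified on failure.
 */
static int expunge_force(const char *dir, struct header_sketch *h,
                         struct record_sketch *r, int set_need_cleanup)
{
    char path[4096];
    int n;

    assert(dir != NULL && h != NULL && r != NULL);

    /* Assumed naming scheme: "<dir>/<uid>." */
    n = snprintf(path, sizeof(path), "%s/%lu.", dir, (unsigned long)r->uid);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    /* A file that is already gone is treated as removed. */
    if (unlink(path) != 0 && errno != ENOENT)
        return -1;

    if (!(r->message_flags & SKETCH_FLAG_EXPUNGED)) {
        r->message_flags |= SKETCH_FLAG_EXPUNGED;
        r->modseq = ++h->highestmodseq;
        h->exists--;
    }
    r->message_flags |= SKETCH_FLAG_FILE_REMOVED;

    if (set_need_cleanup)
        h->options |= SKETCH_OPT_NEED_CLEANUP;
    return 0;
}
```

The record itself stays in the index, so clients that already have the mailbox open still see a stable record numbering.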

LOCKED MAILBOXES

If a mailbox is open anywhere, the process that has it open holds a SHARED lock on the cyrus.lock file, which means the index can never be rewritten by any other task while it is open. Records never move once created. Instead, cyr_expire will set the need_cleanup flag in the header and unlink the files that it would have unlinked, setting file_removed on each index record but leaving the records in place.

Before ANY process unlocks an index file, it looks at the need_cleanup flag. If the flag is set, the process performs a "checkpoint", which just creates new cyrus.index and cyrus.cache files, copying every index record that doesn't have file_removed set. (The cyrus.cache rewrite could be optional, based on a leaked_cache header field...)
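
The record-copying part of a checkpoint can be sketched as a filter over the old records. This works on in-memory arrays only; writing the new cyrus.index and cyrus.cache files is left out, and the struct, the flag bit and checkpoint_records are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FLAG_EXPUNGED     (1u << 0)  /* hypothetical bit for is_expunged */
#define SKETCH_FLAG_FILE_REMOVED (1u << 1)  /* hypothetical bit for file_removed */

struct record_sketch {
    uint32_t uid;
    uint32_t message_flags;
};

/*
 * Copy every record that does not have file_removed set from src to dst,
 * preserving order. dst must have room for n records.
 * Returns the number of records kept.
 */
static size_t checkpoint_records(const struct record_sketch *src, size_t n,
                                 struct record_sketch *dst)
{
    size_t i, kept = 0;

    assert(src != NULL && dst != NULL);

    for (i = 0; i < n; i++) {
        if (src[i].message_flags & SKETCH_FLAG_FILE_REMOVED)
            continue;  /* file is gone, so the record can be dropped */
        dst[kept++] = src[i];
    }
    return kept;
}
```

Records that are expunged but still have their file on disk are kept, which is what lets a delayed expunge stay visible to QRESYNC until cyr_expire removes the file.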

This is done by unlocking and then TRYING to lock in exclusive mode. If the lock succeeds, perform the checkpoint; otherwise just finish. Another process will reach the same point soon, and the last one to do so will get the exclusive lock.
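
The unlock-then-try-exclusive step could be sketched with flock(2) and a non-blocking exclusive request. Using flock rather than fcntl locking is an assumption here, and open_shared, counting_checkpoint and release_and_maybe_checkpoint are hypothetical helpers. A real checkpoint would presumably re-read the header after getting the exclusive lock, since another process may already have done the work.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Open the lock file and take the SHARED lock that every open mailbox holds. */
static int open_shared(const char *lock_path)
{
    int fd = open(lock_path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_SH) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Stand-in checkpoint callback: only counts how often it ran. */
static int counting_checkpoint(void *ctx)
{
    ++*(int *)ctx;
    return 0;
}

/*
 * Drop the shared lock and close fd. need_cleanup is the header flag as
 * read before unlocking. Returns 1 if this process won the exclusive
 * lock and ran the checkpoint, 0 if there was nothing to do or another
 * process still holds the mailbox open, -1 on error.
 */
static int release_and_maybe_checkpoint(int fd, int need_cleanup,
                                        int (*checkpoint)(void *), void *ctx)
{
    int rc = 0;

    assert(checkpoint != NULL);

    if (flock(fd, LOCK_UN) != 0) {
        close(fd);
        return -1;
    }

    if (need_cleanup) {
        if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
            /* Nobody else has the mailbox open: safe to rewrite. */
            rc = (checkpoint(ctx) == 0) ? 1 : -1;
            flock(fd, LOCK_UN);
        } else if (errno != EWOULDBLOCK) {
            rc = -1;
        }
        /* EWOULDBLOCK: someone else is still in; they will try later. */
    }

    close(fd);
    return rc;
}
```

With two holders of the shared lock, the first one to release gets 0 back and skips the checkpoint; the second finds the lock free and performs it.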