Everyone that has ever used server or desktop virtualization probably has used clones. Even though “clone” is not a well defined storage term, in most cases it is used to describe a data (image) copy. Technically this “copy” can be achieved using several technologies: copy clones, snapshots based, or mirror based (BCV), etc. VMWare is using the term “Full clone” to describe a “copy clone”, while clones that use delta/copy on write (snapshot) mechanism are named linked clones. Some people treat clones only as linked clones and/or writable snapshots. Netapp has a feature called Flex-clone that is just a writable snapshot.
My view on this is that the term “clone” (as it is used in virtualization systems) should describe the use case and not the technology. Even though snapshots and clones may use the same underlying technology, their use cases and use patterns are not the same. For example, under many systems the snapshots’ source volume is more important to the user than its snapshots and has a preferred status over them (backup scenario). Technically, the source volume is often full provisioned and has strict space accounting and manual removal policy while the snapshots are likely to be thin-provisioned (“space efficient”), may have automatic removal (expiration/exhaustion) policy, and soft/heuristic space management (see XIV for example).
This preferred source scheme will not work for clones; in many cases the source of the clones is just a template that is never used by itself, so you can store it on much less powerful storage (tier), and once you finished generating the clones, you can delete it if you want. The outcome of the clone is much more important from the template, so if space runs out you may delete few old templates, but you wont remove the clones if they are in use – each of them is a standalone VM image.
This can be demonstrated by VMware’s linked clones that are implemented as writable snapshots on top of a readonly base. When you generate linked clone pool using VMWare View Manager, the manager creates a readonly full clone (the “replica”) and takes snapshots from it. This (clever) way hides the snapshot source and in most cases you don’t directly manage or use replicas. The base template has no role after the cloning ends and can be deleted.
Another major difference is the different creation patterns: snapshot creating events tend to be periodic (backup/data set separation scenarios), and cloning creation (at least for VDI use cases) tend to be bursty, meaning each time clones are created from a base (template/replica) many of them are created at once.
This means that if you build a source-snapshot(or clone) creation over time graph, a typical snapshot graph will be dense and long tree (see below) while the equivalent typical clone tree will be very shallow but with big span out factor (maybe it should be called clone bush 😉 ). The following diagrams depict such graphs:
Due to these differences, even though under the hood snapshots and (linked) clones may be implemented using the same technologies, it is bad and in fact they shouldn’t be implemented in the same way as many (if not most) implementation assumptions for snapshots are not valid for clones and vice versa!
A very good example for such assumption is the span out level – many snapshots are implemented as follows: the source has it own guaranteed space, and each snapshot has its own delta space. When a block in the source is modified, the old block is copied to the snapshot delta spaces (copy on write). This common technique is very efficient for (primary) source and (secondary) snapshot scheme but on the other side it also assumes that the span out level is low – because the modified block has to be copied to each snapshot delta space. Image what will happen if you have 1000 (snapshot based) clones created from the same source!
If we go back to the VMWare’s linked clone case, the read only replica is enabling VMWare to generate many writable snapshots on top of a single source. The original snapshot mechanism cannot do that!
To sum this post up I want to claim that:
- Clones (even linked) are not snapshots
- Most (even all) storage systems are not implementing the clone use case, but rather just the snapshot use case
- It is time that storage systems will implement clones