Collections¶
A collection holds items that all have a single common numerical component, whose value differs between each item.
Each collection comprises three main attributes:
Given items such as:
- file.0001.jpg
- file.0002.jpg
The head would be file.
, the tail .jpg
and the
padding 4
.
Note
If the numerical component is unpadded then the padding would be 0 and a variable index width supported.
A collection can be manually created using the Collection
class:
>>> import clique
>>> collection = clique.Collection(head='file.', tail='.jpg', padding=4)
Adding & Removing Items¶
Items can then be added
to the
collection:
>>> collection.add('file.0001.jpg')
If an item does not match the collection’s expression a
CollectionError
is raised:
>>> collection.add('file.0001.dpx')
CollectionError: Item does not match collection expression.
Whether an item matches the collection expression can be tested ahead of time
if desired using match()
:
>>> print collection.match('file.0002.jpg')
<_sre.SRE_Match object at 0x0000000003710D78>
>>> print collection.match('file.0002.dpx')
None
To remove an item:
>>> collection.remove('file.0001.jpg')
If the item is not present, a CollectionError
is
raised:
>>> collection.remove('file.0001.jpg')
CollectionError: Item not present in collection.
Accessing Items¶
To access items in the collection, iterate over it:
>>> collection.add('file.0001.jpg')
>>> collection.add('file.0002.jpg')
>>> for item in collection:
... print item
file.0001.jpg
file.0002.jpg
Note
A collection may be sparse and so is not directly indexable. If you need to access an item by index, convert the collection to a list:
>>> print list(collection)[-1]
file.0002.jpg
Manipulating Indexes¶
Internally, Clique does not store the items directly, but rather just the properties to recreate the items (head, tail, padding). In addition it holds a sorted set of indexes present in the collection.
This set of indexes can be manipulated directly to perform the equivalent of adding and removing items (perhaps in bulk).
>>> print collection.indexes
[1, 2]
>>> collection.indexes.update([2, 3, 4])
>>> for item in collection:
... print item
file.0001.jpg
file.0002.jpg
file.0003.jpg
file.0004.jpg
Note
It is not possible to assign a new index set directly:
>>> collection.indexes = set([1, 2, 3])
AttributeError: Cannot set attribute defined as unsettable.
Instead, first clear and update the set as required:
>>> collection.indexes.clear()
>>> collection.indexes.update(set([1, 2, 3])
Formatting¶
It is useful to express a collection as a string that represents the collection
expression and ranges in a standard way. Clique supports basic formatting of a
collection through its format()
method:
>>> collection = clique.Collection('file.', '.jpg', 4, indexes=set([1, 2]))
>>> print collection.format()
file.%04d.jpg [1-2]
The format()
method can be passed an
alternative pattern if required:
>>> print collection.format('{head}[index]{tail}')
file.[index].jpg
The passed pattern should match the formatting rules of Python’s standard string formatter and will have the following keyword variables available to it:
- :term:`head` - Common leading part of the collection.
- :term:`tail` - Common trailing part of the collection.
- :term:`padding` - Padding value in
%0d
format.- range - Total range in the form
start-end
- ranges - Comma separated ranges of indexes.
- holes - Comma separated ranges of missing indexes.
Structure¶
Clique makes it easy to get further information about the structure of a collection and act on that structure.
To check if a collection contains items that make up a contiguous
sequence use is_contiguous()
:
>>> collection = clique.Collection('file.', '.jpg', 4)
>>> collection.indexes.update([1, 2, 3, 4, 5])
>>> print collection
file.%04d.jpg [1-5]
>>> print collection.is_contiguous()
True
>>> collection.indexes.discard(3)
>>> print collection
file.%04d.jpg [1-2, 4-5]
>>> print collection.is_contiguous()
False
To access the missing indexes in a non-contiguous collection use the
holes()
method (which returns a new
Collection
):
>>> missing = collection.holes()
>>> print missing.indexes
[3]
To separate a non-contiguous collection into a number of
contiguous collections use the separate()
method:
>>> subcollections = collection.separate()
>>> for subcollection in subcollections:
... print subcollection
file.%04d.jpg [1-2]
file.%04d.jpg [4-5]
And to merge compatible collections into another use the
merge()
method:
>>> collection_a = clique.Collection('file.', '.jpg', 4, set([1, 2]))
>>> collection_b = clique.Collection('file.', '.jpg', 4, set([4, 5]))
>>> print collection_a.indexes
[1, 2]
>>> collection_a.merge(collection_b)
>>> print collection_a.indexes
[1, 2, 4, 5]
Note
The collection being merged into is modified in-place, whilst the collection being merged is left unaltered.
A collection can be tested for compatibility using the
is_compatible()
method:
>>> collection_a = clique.Collection('file.', '.jpg', 4, set([1, 2]))
>>> collection_b = clique.Collection('file.', '.jpg', 4, set([4, 5]))
>>> collection_c = clique.Collection('file.', '.dpx', 4, set([4, 5]))
>>> print collection_a.is_compatible(collection_b)
True
>>> print collection_a.is_compatible(collection_c)
False