Iterators are one of the basic pillars on which the ndimage module stands. As the name suggests, they are used to iterate over input arrays of arbitrary dimension. Even though the NumPy library already provides a complete iterator container, PyArrayIterObject, with all the required members, the ndimage module has its own optimized iterator structure, NI_Iterator.
But before jumping directly to iterators, it is better to first understand PyArrayObject, which will help us decipher iterators. In C, every ndarray is accessed through a pointer to a PyArrayObject structure. It contains all the information required to deal with an ndarray in C, and every instance of ndarray has this structure. It is defined as:
    typedef struct PyArrayObject {
        PyObject_HEAD
        char *data;
        int nd;
        npy_intp *dimensions;
        npy_intp *strides;
        PyObject *base;
        PyArray_Descr *descr;
        int flags;
        PyObject *weakreflist;
    } PyArrayObject;
- data is the pointer to the first element of the array.
- nd refers to number of dimensions in the array.
- dimensions is an array of integers giving the size of the array along each dimension, i.e. its shape.
- strides is an array of integers giving, for each dimension, the number of bytes that must be skipped to get to the next element in that dimension.
The rest of the members are not of much use to us right now, so we can safely ignore them.
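To make the role of strides concrete, here is a minimal sketch that computes the byte offset of an element from its index and then reads it through a char pointer, the same way data is addressed. It deliberately avoids the NumPy headers: ptrdiff_t stands in for npy_intp, and element_offset and read_double are hypothetical helpers, not NumPy functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a NumPy function): byte offset of the element
 * at index[0..nd-1], given the per-dimension strides in bytes. */
ptrdiff_t element_offset(int nd, const ptrdiff_t *strides,
                         const ptrdiff_t *index)
{
    ptrdiff_t offset = 0;
    int ii;

    assert(nd > 0);
    for (ii = 0; ii < nd; ii++) {
        /* Moving index[ii] steps along axis ii skips strides[ii] bytes each. */
        offset += index[ii] * strides[ii];
    }
    return offset;
}

/* Hypothetical helper: read one double from a strided buffer, starting
 * from a char pointer to the first element (like the data member). */
double read_double(const char *data, int nd, const ptrdiff_t *strides,
                   const ptrdiff_t *index)
{
    return *(const double *)(data + element_offset(nd, strides, index));
}
```

For a C-contiguous 2x3 array of doubles the strides are {3 * sizeof(double), sizeof(double)}, so element (1, 2) sits 5 * sizeof(double) bytes after the first element.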
Now let’s come back to PyArrayIterObject. It is another container defined in NumPy that holds the information required to iterate through an array.
The NI_Iterator container, defined in ni_support.h, looks like this:
    typedef struct {
        int rank_m1;
        npy_intp dimensions[MAXDIM];
        npy_intp coordinates[MAXDIM];
        npy_intp strides[MAXDIM];
        npy_intp backstrides[MAXDIM];
    } NI_Iterator;
- rank_m1 is the rank of the array to be iterated, minus one. Its value is N - 1, where N is the number of dimensions of the underlying array.
- dimensions is an array holding, for each dimension the iterator iterates over, the size of that dimension minus one. Each entry is one less than the corresponding entry of dimensions in the PyArrayObject of the array to be iterated.
- coordinates is an N-dimensional index into the array, i.e. it tells us the position the iterator has most recently reached.
- strides gives the stride along each dimension of the array. It is the same as strides in the PyArrayObject of the array to be iterated.
- backstrides gives, for each dimension, the number of bytes needed to jump from the end of that dimension back to its beginning. Note that backstrides[k] = dimensions[k] * strides[k]. It is stored purely as an optimization.
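The way these members cooperate is easiest to see in the step that moves the iterator forward. The following is a minimal sketch of such a step, written as a plain function over a simplified copy of the struct. The names SketchIterator, sketch_next and SKETCH_MAXDIM are hypothetical, and ptrdiff_t stands in for npy_intp; it illustrates the idea and is not the code shipped in ndimage.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAXDIM 32  /* assumed bound, standing in for MAXDIM */

/* Simplified copy of the NI_Iterator layout (assumption: ptrdiff_t
 * replaces npy_intp so the sketch needs no NumPy headers). */
typedef struct {
    int rank_m1;
    ptrdiff_t dimensions[SKETCH_MAXDIM];   /* axis size minus one */
    ptrdiff_t coordinates[SKETCH_MAXDIM];
    ptrdiff_t strides[SKETCH_MAXDIM];
    ptrdiff_t backstrides[SKETCH_MAXDIM];
} SketchIterator;

/* Hypothetical helper: advance to the next element in C order and
 * return the updated data pointer. */
const char *sketch_next(SketchIterator *it, const char *pointer)
{
    int ii;

    assert(it->rank_m1 < SKETCH_MAXDIM);
    for (ii = it->rank_m1; ii >= 0; ii--) {
        if (it->coordinates[ii] < it->dimensions[ii]) {
            /* Room left on this axis: take one step along it. */
            it->coordinates[ii]++;
            pointer += it->strides[ii];
            break;
        }
        /* Axis exhausted: rewind it in a single jump using the
         * precomputed backstride, then carry over to the next axis. */
        it->coordinates[ii] = 0;
        pointer -= it->backstrides[ii];
    }
    return pointer;
}
```

Because dimensions stores the size minus one, the test coordinates[ii] < dimensions[ii] asks directly whether another step fits on that axis, and the backstride saves a multiplication on every rewind. After the last element every axis is rewound, so the pointer ends up back at the first element.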
PyArrayIterObject is a richer version of NI_Iterator. Along with the members of NI_Iterator, it contains some extra members that provide additional functionality. The following is the PyArrayIterObject struct, annotated with the function of each of its members:
    typedef struct {
        PyObject_HEAD
        /* Same as rank_m1 in NI_Iterator */
        int nd_m1;
        /* The current 1-D index into the array. */
        npy_intp index;
        /* The total size of the array to be iterated. */
        npy_intp size;
        /* Same as coordinates in NI_Iterator */
        npy_intp coordinates[NPY_MAXDIMS];
        /* Same as dimensions in NI_Iterator */
        npy_intp dims_m1[NPY_MAXDIMS];
        /* Same as strides in NI_Iterator */
        npy_intp strides[NPY_MAXDIMS];
        /* Same as backstrides in NI_Iterator */
        npy_intp backstrides[NPY_MAXDIMS];
        /* Used to convert a 1-D index into N-D coordinates. */
        npy_intp factors[NPY_MAXDIMS];
        /* Pointer to the underlying array. */
        PyArrayObject *ao;
        /* Pointer to the element of the ndarray indicated by the index. */
        char *dataptr;
        /* This flag is true if the underlying array is C-contiguous. */
        /* It is used to simplify calculations. */
        Bool contiguous;
    } PyArrayIterObject;
The members whose comments say "Same as ... in NI_Iterator" are also part of NI_Iterator. As we can see, the extra members of PyArrayIterObject are mostly convenience values that can be derived when they are needed.
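As a rough illustration of what "can be derived" means, the sketch below recomputes the size and factors values, and the N-D coordinates of a flat index, from nothing but the per-axis sizes. All three helper names are hypothetical and ptrdiff_t stands in for npy_intp. The recurrence used for factors reflects my understanding of how the field is filled for C-order iteration, so treat it as an assumption rather than a quotation of the NumPy source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: total number of elements (the size member),
 * computed from the "size minus one" values stored per axis. */
ptrdiff_t total_size(int nd, const ptrdiff_t *dims_m1)
{
    ptrdiff_t size = 1;
    int ii;

    for (ii = 0; ii < nd; ii++) {
        size *= dims_m1[ii] + 1;
    }
    return size;
}

/* Hypothetical helper: factors[ii] is the number of elements spanned by
 * one step along axis ii in C order (product of all later axis sizes). */
void fill_factors(int nd, const ptrdiff_t *dims_m1, ptrdiff_t *factors)
{
    int ii;

    assert(nd > 0);
    factors[nd - 1] = 1;
    for (ii = nd - 2; ii >= 0; ii--) {
        factors[ii] = factors[ii + 1] * (dims_m1[ii + 1] + 1);
    }
}

/* Hypothetical helper: convert a flat 1-D index into N-D coordinates. */
void index_to_coordinates(int nd, const ptrdiff_t *factors, ptrdiff_t index,
                          ptrdiff_t *coordinates)
{
    int ii;

    for (ii = 0; ii < nd; ii++) {
        coordinates[ii] = index / factors[ii];
        index %= factors[ii];
    }
}
```

For an array of shape (2, 3, 4) this gives a size of 24 and factors of {12, 4, 1}, so flat index 17 maps to coordinates (1, 1, 1), since 1 * 12 + 1 * 4 + 1 * 1 = 17.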
To iterate through any ndarray, the members of NI_Iterator need to be set according to the ndarray to be iterated. This is done by the NI_InitPointIterator function. It takes values from the input array and initializes all the members of NI_Iterator. The function is quite simple to understand and is defined as:
    int NI_InitPointIterator(PyArrayObject *array, NI_Iterator *iterator)
    {
        int ii;

        iterator->rank_m1 = array->nd - 1;
        for(ii = 0; ii < array->nd; ii++) {
            iterator->dimensions[ii] = array->dimensions[ii] - 1;
            iterator->coordinates[ii] = 0;
            iterator->strides[ii] = array->strides[ii];
            iterator->backstrides[ii] =
                array->strides[ii] * iterator->dimensions[ii];
        }
        return 1;
    }
This sets NI_Iterator up to iterate over all the dimensions of the input array. But there may be cases where we don’t need to iterate over the whole ndarray, only over some of its axes. For this, two variants of NI_InitPointIterator are available. These are:
- NI_SubspaceIterator: This function sets the iterator to iterate through a given set of axes. It takes two arguments, first an iterator and second the set of axes to be iterated, and adjusts the iterator accordingly. Its code is in ni_support.c.
- NI_LineIterator: This function sets the iterator up for working along one given axis: the iterator visits the start of every line that runs along that axis. It does this by calling NI_SubspaceIterator with every axis except the given one in the `axes` argument. Its code is in ni_support.c.
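The idea of restricting an iterator to a set of axes can be sketched as follows. This is a minimal illustration, assuming the set of axes is passed as a bitmask in which bit k stands for axis k. The names SubspaceSketch, sketch_subspace and SKETCH_MAXDIM are hypothetical, and ptrdiff_t stands in for npy_intp, so read it as a sketch under those assumptions rather than as the ndimage source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAXDIM 32  /* assumed bound, standing in for MAXDIM */

/* Simplified copy of the NI_Iterator layout (assumption: ptrdiff_t
 * replaces npy_intp so the sketch needs no NumPy headers). */
typedef struct {
    int rank_m1;
    ptrdiff_t dimensions[SKETCH_MAXDIM];   /* axis size minus one */
    ptrdiff_t coordinates[SKETCH_MAXDIM];
    ptrdiff_t strides[SKETCH_MAXDIM];
    ptrdiff_t backstrides[SKETCH_MAXDIM];
} SubspaceSketch;

/* Hypothetical helper: keep only the axes whose bit is set in `axes`,
 * packing their entries to the front of the iterator's arrays. */
int sketch_subspace(SubspaceSketch *it, uint32_t axes)
{
    int ii, last = 0;

    assert(it->rank_m1 < SKETCH_MAXDIM);
    for (ii = 0; ii <= it->rank_m1; ii++) {
        if (axes & ((uint32_t)1 << ii)) {
            if (last != ii) {
                /* Slide the kept axis down into the next free slot. */
                it->dimensions[last] = it->dimensions[ii];
                it->strides[last] = it->strides[ii];
                it->backstrides[last] = it->backstrides[ii];
            }
            ++last;
        }
    }
    /* The iterator now has as many axes as there were bits set. */
    it->rank_m1 = last - 1;
    return 1;
}
```

For a 2x3x4 array, passing a mask with bits 0 and 2 set leaves a two-axis iterator over axes 0 and 2. Axis 1 keeps its place in memory, but the iterator never steps along it.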
I would like to thank my mentor Jaime who helped me in understanding this.
In the next post I will explain NI_LineBuffer and NI_FilterIterator in detail.