Maybe this explanation can help understand the concept:
data is an array containing all the nonzero elements of the sparse matrix.
indices is an array mapping each element in data to its column in the sparse matrix.
indptr then maps the elements of indices to the rows of the sparse matrix. This is done with the following reasoning:
- If the sparse matrix has M rows, indptr is an array containing M+1 elements.
- For row i, indptr[i]:indptr[i+1] gives the range of positions to take from indices (and from data) for row i. So suppose indptr[i]=k and indptr[i+1]=l: the column indices of row i are indices[k:l], and the corresponding values are data[k:l].
This is the tricky part, and I hope the following example helps in understanding it.
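The slicing rule above can be sketched in plain Python. The helper name csr_row and the small 3 x 3 arrays are made up for illustration; the data values are letters so they cannot be mistaken for column or row numbers.

```python
def csr_row(data, indices, indptr, i):
    """Return the stored (column, value) pairs of row i."""
    k, l = indptr[i], indptr[i + 1]  # start and end of row i's slice
    return list(zip(indices[k:l], data[k:l]))


# Hypothetical 3 x 3 example (3 rows, so indptr has 3 + 1 entries)
data = ["a", "b", "c", "d", "e", "f"]
indices = [0, 2, 2, 0, 1, 2]
indptr = [0, 2, 3, 6]

for i in range(len(indptr) - 1):
    print(i, csr_row(data, indices, indptr, i))
# 0 [(0, 'a'), (2, 'b')]
# 1 [(2, 'c')]
# 2 [(0, 'd'), (1, 'e'), (2, 'f')]
```

Read as a dense matrix, these arrays describe the rows [a, 0, b], [0, 0, c] and [d, e, f].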
EDIT: I replaced the numbers in data with letters to avoid confusion in the following example.
Note: the values in indptr are necessarily non-decreasing, because the next cell in indptr (the next row) refers to the next values in indices corresponding to that row. Two consecutive cells are equal exactly when a row has no stored elements.
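To see why indptr can repeat a value but never go down, here is a rough sketch that builds the three arrays from a dense list of lists. The helper dense_to_csr is hypothetical (it is not a SciPy function) and the matrix is made up; it only mirrors the layout described above.

```python
def dense_to_csr(dense):
    """Build (data, indices, indptr) from a dense list of rows."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(col)
        # After each row, record how many elements have been stored so far.
        indptr.append(len(data))
    return data, indices, indptr


dense = [
    [1, 0, 2],
    [0, 0, 0],  # empty row: indptr repeats its previous value
    [0, 3, 0],
]
data, indices, indptr = dense_to_csr(dense)
print(indptr)  # [0, 2, 2, 3]

# Differences of consecutive entries give the stored elements per row.
row_counts = [indptr[i + 1] - indptr[i] for i in range(len(dense))]
print(row_counts)  # [2, 0, 1]
```

Because each entry of indptr is a running count of stored elements, it can stay flat on an empty row but cannot decrease.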
Representing the data as a 4 x 4 matrix:
import numpy as np
data = np.array([10,0,5,99,25,9,3,90,12,87,20,38,1,8])
indices = np.array([0,1,2,3,0,2,3,0,1,2,3,1,2,3])
indptr = np.array([0,4,7,11,14])
- indptr (index pointers) is an array of offsets into indices (the column indices).
- indptr[i] and indptr[i+1] are the start and end of the slice of indices (and data) that belongs to row i.
- The last entry, 14, is len(data), so indptr could also be written as np.array([0,4,7,11,len(data)]).
- 0,4 → indices[0:4], the columns 0,1,2,3 of row 0
- 4,7 → indices[4:7], the columns 0,2,3 of row 1
- 7,11 → indices[7:11], the columns 0,1,2,3 of row 2
- 11,14 → indices[11:14], the columns 1,2,3 of row 3
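# Representing the data in a 4,4 matrix
# (dtype=int, because the np.int alias was removed from recent NumPy releases)
from scipy.sparse import csr_matrix
a = csr_matrix((data,indices,indptr),shape=(4,4),dtype=int)
a.todense()
# matrix([[10,  0,  5, 99],
#         [25,  0,  9,  3],
#         [90, 12, 87, 20],
#         [ 0, 38,  1,  8]])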
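As a cross-check that needs neither NumPy nor SciPy, the same three arrays can be walked by hand with plain lists. This is only a sketch of the lookup rule, not how SciPy implements it internally, and the variable names are illustrative.

```python
data = [10, 0, 5, 99, 25, 9, 3, 90, 12, 87, 20, 38, 1, 8]
indices = [0, 1, 2, 3, 0, 2, 3, 0, 1, 2, 3, 1, 2, 3]
indptr = [0, 4, 7, 11, 14]

n_rows, n_cols = len(indptr) - 1, 4
dense = [[0] * n_cols for _ in range(n_rows)]

for i in range(n_rows):
    start, stop = indptr[i], indptr[i + 1]  # slice belonging to row i
    for col, value in zip(indices[start:stop], data[start:stop]):
        dense[i][col] = value

for row in dense:
    print(row)
# [10, 0, 5, 99]
# [25, 0, 9, 3]
# [90, 12, 87, 20]
# [0, 38, 1, 8]

# data here holds one explicitly stored zero (row 0, column 1),
# so 14 elements are stored but only 13 of them are nonzero.
stored = len(data)
nonzero = sum(1 for v in data if v != 0)
print(stored, nonzero)  # 14 13
```

The zero at row 0, column 1 is stored explicitly in data, while the zeros at row 1, column 1 and row 3, column 0 are simply absent from the arrays; both kinds look the same in the dense output.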