A NumPy based one with np.tile :

np.tile(np.arange(m), (n+m-1)//m)[:n]
Sample run 
In [58]: m,n = 7,12
In [59]: np.tile(np.arange(m), (n+m-1)//m)[:n]
Out[59]: array([0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4])
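The repetition count (n+m-1)//m is integer ceiling division: it is the smallest number of copies of np.arange(m) whose combined length is at least n, so the trailing [:n] slice always has enough elements to cut from. A minimal check of that reasoning, where `reps` and `tiled` are just illustrative names:

```python
import numpy as np

m, n = 7, 12
reps = (n + m - 1) // m              # ceil(n / m) using integer arithmetic only
tiled = np.tile(np.arange(m), reps)  # reps full copies of 0..m-1

print(reps)        # 2
print(tiled.size)  # 14, which is >= n
print(tiled[:n])   # [0 1 2 3 4 5 6 0 1 2 3 4]
```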
Benchmarking
If you are looking for efficiency, especially on decent to large sized data, NumPy would do well. In this section we are timing the NumPy solutions with variations across m and n.
Setup :
import numpy as np

def resize(m, n):
    return np.resize(np.arange(m), n)

def mod(m, n):
    return np.mod(np.arange(n), m)

def tile(m, n):
    return np.tile(np.arange(m), (n+m-1)//m)[:n]
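Before timing, it may be worth confirming that the three approaches return the same array. The following is a small sanity check, not part of the original benchmark; the (m, n) pairs are arbitrary and chosen to cover n > m, n == m and n < m:

```python
import numpy as np

def resize(m, n):
    return np.resize(np.arange(m), n)

def mod(m, n):
    return np.mod(np.arange(n), m)

def tile(m, n):
    return np.tile(np.arange(m), (n + m - 1) // m)[:n]

# Arbitrary test pairs: n > m, n much larger than m, n == m, n < m
for m, n in [(7, 12), (10, 105), (5, 5), (8, 3)]:
    expected = resize(m, n)
    assert np.array_equal(mod(m, n), expected)
    assert np.array_equal(tile(m, n), expected)
```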
Run timings code on IPython console :
# Setup inputs and timeit those on posted NumPy approaches
m_ar = [10,100,1000]
s_ar = [10,20,50,100,200,500,1000] # scaling array
resize_timings = []
mod_timings = []
tile_timings = []
sizes_str = []
for m in m_ar:
    for s in s_ar:
        n = m*s+m//2
        size_str = str(m) + 'x' + str(n)
        sizes_str.append(size_str)
        # -o returns a result object, -q suppresses the printed output
        p = %timeit -o -q resize(m,n)
        resize_timings.append(p.best)
        p = %timeit -o -q mod(m,n)
        mod_timings.append(p.best)
        p = %timeit -o -q tile(m,n)
        tile_timings.append(p.best)
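The %timeit magic only works inside IPython. Outside of it, a rough equivalent can be sketched with the standard library timeit module. The helper `best_time` and its `number` / `repeat` defaults are assumptions made for illustration, and the numbers will not match %timeit exactly, since %timeit picks its loop count automatically:

```python
import timeit
import numpy as np

def tile(m, n):
    return np.tile(np.arange(m), (n + m - 1) // m)[:n]

def best_time(func, m, n, number=1000, repeat=5):
    # Hypothetical helper: best per-call time in seconds,
    # taken over `repeat` runs of `number` calls each.
    totals = timeit.repeat(lambda: func(m, n), number=number, repeat=repeat)
    return min(totals) / number

m = 10
n = m * 10 + m // 2   # same sizing rule as the loop above, with s = 10
print(best_time(tile, m, n))
```

The same helper can be called for resize and mod inside the nested loop in place of the %timeit lines.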
Plot the results :
# Use pandas to study results
import pandas as pd
df_data = {'Resize':resize_timings,'Mod':mod_timings,'Tile':tile_timings}
df = pd.DataFrame(df_data,index=sizes_str)
FGSZ = (20,6)
T = 'Timings(s)'
FTSZ = 16
df.plot(figsize=FGSZ,title=T,fontsize=FTSZ).get_figure().savefig("timings.png")
Results
Comparing all three
The resize and tile based ones seem to be doing well.
Comparing resize and tile
Let's plot only those two :
tile seems to be doing better between these two.
Studying in chunks
Now, let's study the timings in chunks corresponding to the three different m's :
The mod based one wins only on small m and small n, where the timings are of the order of 5-6 usecs. It loses out on most of the other scenarios, and it's the compute involved (an elementwise modulo over all n elements) that's killing it on those.