python random.choice TypeError: '(array([ 109, 1280, 427, 531, 1563, 102, 1774, 802, 560, 0]),

Time:01-11

I'm trying to read an Excel document into a DataFrame and then select x random rows without replacement. I get the error below when I run the code and would appreciate some guidance. I'm writing a Jupyter Notebook in VS Code.

#import libraries.
import os
import subprocess
import sys
import pandas as pd
import numpy as np
import tkinter as tk

#allow user to browse for specific excel file
from tkinter import filedialog
root = tk.Tk()
root.withdraw()
file_path = filedialog.askopenfilename()
sizeOfSample = 10

#read in excel as dataframe after user selects file in explorer
df = pd.read_excel (file_path)

#select random rows from df to display.
number_of_rows = df.shape[0]
random_indices = np.random.choice(number_of_rows, size=sizeOfSample, replace=False)
random_rows = df[random_indices, :]

print (random_rows)

This is the output I'm getting.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_1716/1509119795.py in <module>
     21 number_of_rows = initArr.shape[0]
     22 random_indices = np.random.choice(number_of_rows, size=sizeOfSample, replace=False)
---> 23 random_rows = initArr[random_indices, :]
     24 
     25 print (random_rows)

C:\Python39\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   3456             if self.columns.nlevels > 1:
   3457                 return self._getitem_multilevel(key)
-> 3458             indexer = self.columns.get_loc(key)
   3459             if is_integer(indexer):
   3460                 indexer = [indexer]

C:\Python39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3359             casted_key = self._maybe_cast_indexer(key)
   3360             try:
-> 3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:
   3363                 raise KeyError(key) from err

C:\Python39\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

C:\Python39\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

TypeError: '(array([ 109, 1280,  427,  531, 1563,  102, 1774,  802,  560,    0]), slice(None, None, None))' is an invalid key

CodePudding user response:

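Plain df[...] looks up column labels, so the tuple (random_indices, slice) is rejected as an invalid key. Replace:

random_rows = df[random_indices, :]

By:

random_rows = df.iloc[random_indices, :]

random_indices holds row positions, so .iloc is the matching indexer. df.loc[random_indices, :] also works here, but only because read_excel gives the default 0..n-1 index, where labels and positions coincide.

You can also skip the index array and use:

random_rows = df.sample(n=sizeOfSample, replace=False)

replace=False is the default for sample; it is what gives rows that are not repeated.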
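
A minimal sketch of sampling rows by position, assuming a small made-up DataFrame in place of the Excel file. The data, the deliberately non-default index, and the names size_of_sample, rng and random_positions are all illustrative and not from the question. It uses .iloc because np.random.choice returns row positions, and df.sample with replace=False to match the "not replaced" requirement:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data: 20 rows whose index labels
# (100..119) intentionally differ from their positions (0..19).
df = pd.DataFrame({"value": range(20)}, index=range(100, 120))
size_of_sample = 10

# Seeded only so the sketch is repeatable; drop the seed for fresh samples.
rng = np.random.default_rng(seed=0)
random_positions = rng.choice(len(df), size=size_of_sample, replace=False)

# Positional selection: works whatever the index labels are.
rows_by_position = df.iloc[random_positions]

# Same idea in one call; replace=False is the default, spelled out for clarity.
rows_by_sample = df.sample(n=size_of_sample, replace=False, random_state=0)

print(rows_by_position)
print(rows_by_sample)
```

With an index like this one, df.loc[random_positions] would raise a KeyError, since the positions 0..19 are not labels in the index. That is the case where .iloc and .loc stop being interchangeable.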