Iterating through and analyzing a list of lists

I am trying to iterate through and analyze several nested lists. The list I start with usually contains over 200 sublists:

[
  [
    1499040000000,      // Open time
    "0.01634790",       // Open
    "0.80000000",       // High
    "0.01575800",       // Low
    "0.01577100",       // Close
    "148976.11427815",  // Volume
    1499644799999,      // Close time
    "2434.19055334",    // Quote asset volume
    308,                // Number of trades
    "1756.87402397",    // Taker buy base asset volume
    "28.46694368",      // Taker buy quote asset volume
    "17928899.62484339" // Ignore.
  ]
]

I want to iterate over different subsections of that nested list, e.g. only the last quarter of the list or only the second half.

For each of those subsections, I want to determine the maximum of the "High" value, i.e. index 2.

This is what I've tried:

import itertools

def get_highest_high(klines, lookback_period):
    # Slice bounds for the last `lookback_period` hours of a 24-hour list
    start = int(len(klines) / 24 * (24 - lookback_period) + 1)
    stop = int(len(klines) + 1)

    highest_high = None
    for line in itertools.islice(klines, start, stop):
        if highest_high is None:
            highest_high = float(line[2])
        elif float(line[2]) > highest_high:
            highest_high = float(line[2])
    return highest_high

# twentyfour_hour_klines holds the initial list of sublists shown above

# last 6 hours:
lookback_period = 6
six_hour_highest_high = get_highest_high(klines=twentyfour_hour_klines, lookback_period=lookback_period)
print(six_hour_highest_high, flush=True)

It works, but it seems like quite a clunky solution. Is there anything leaner? Please also keep in mind that I need to perform this calculation many times, so speed is a concern.

CodePudding user response：

Whenever I need to apply the same operation to every item of a list, I use map. map applies one function to each item in the list separately.

The only thing to work out is what the function looks like. We need a lambda function that takes a list and returns its nth item.

x = [1499040000000,
    "0.01634790",
    "0.80000000",
    "0.01575800",       
    "0.01577100",
    "148976.11427815",
    1499644799999,
    "2434.19055334",
    308,                
    "1756.87402397",
    "28.46694368",
    "17928899.62484339"
  ]

x[2]    # returns "0.80000000" (a string); indices start at 0

Now let's try creating a longer list, and performing a map.

 y = [
      [1499040000000,
       "0.01634790",
       "0.80000000",
       "0.01575800",       
       "0.01577100",
       "148976.11427815",
       1499644799999,
       "2434.19055334",
       308,                
       "1756.87402397",
       "28.46694368",
       "17928899.62484339"
       ],
      [1499040000000,
       "0.01634790",
       "0.80000000",
       "0.01575800",       
       "0.01577100",
       "148976.11427815",
       1499644799999,
       "2434.19055334",
       308,                
       "1756.87402397",
       "28.46694368",
       "17928899.62484339"
     ]    
 ]

 res = map(lambda lst: lst[2], y)
 for a in res:
   print(a)    # prints 0.80000000 twice (the values are strings)

Finally, creating a function:

 def extract(lst, n):
     return map(lambda x: x[n],lst)

map returns an iterator, so you can loop over it with for x in extract(lst, n), or convert it to a list with list(extract(lst, n)). Note that an iterator can only be consumed once.
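To tie this back to the question, here is a minimal sketch of how the extracted values could feed straight into max for just the tail of the list. It uses a generator expression in place of map (the two are interchangeable here). The function name highest_high, its parameters, and the sample data are made up for illustration, and it assumes the klines are evenly spaced over total_hours.

```python
def highest_high(klines, lookback_hours, total_hours=24, high_index=2):
    """Return the largest "High" among the klines covering the last `lookback_hours`."""
    # Number of klines that fall inside the lookback window
    count = len(klines) * lookback_hours // total_hours
    # Negative slicing takes the last `count` sublists; guard against count == 0,
    # because klines[-0:] would return the whole list
    recent = klines[-count:] if count else []
    # The values are strings, so convert each one to float before comparing
    return max((float(k[high_index]) for k in recent), default=None)

# Made-up sample: 24 klines (one per hour); only index 2 ("High") matters here
sample = [[0, "0", str(100 + i % 7), "0", "0"] for i in range(24)]
print(highest_high(sample, lookback_hours=6))   # last quarter
print(highest_high(sample, lookback_hours=12))  # second half
```

Slicing copies only references to the sublists, so it stays cheap for a few hundred klines. If the same list is analyzed many times, converting the "High" column to floats once up front avoids repeating the float() calls.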

CodePudding user response：

Based on a suggestion from @Kraigolas, I arrived at the following solution:

    import pandas as pd
    from binance.client import Client  # assumes the python-binance package

    # `client` is assumed to be an already-created python-binance Client instance.

    def get_minute_data(symbol, interval, start_str):
        price_data = client.futures_historical_klines(symbol=symbol, interval=interval, start_str=start_str)

        # Keep only the first seven fields of each kline and name them
        df = pd.DataFrame(price_data)
        df = df.iloc[:, :7]
        df.columns = ["Open time", "Open", "High", "Low", "Close", "Volume", "Close time"]

        # Prices and volume arrive as strings; timestamps are in milliseconds
        numeric_columns = ["Open", "High", "Low", "Close", "Volume"]
        df[numeric_columns] = df[numeric_columns].astype(float)
        df["Open time"] = pd.to_datetime(df["Open time"], unit='ms')
        df["Close time"] = pd.to_datetime(df["Close time"], unit='ms')

        return df

    price_data = get_minute_data(symbol="BTCUSDT", interval=Client.KLINE_INTERVAL_5MINUTE, start_str='1 day ago UTC')
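With the data in a DataFrame, the highest high over a lookback window could then be read off with a filter and max. The following is only a sketch under assumptions: highest_high_df is a hypothetical helper, and the frame built here is made-up stand-in data with the same "Open time" and "High" columns, not real market data.

```python
import pandas as pd

def highest_high_df(df, lookback_hours):
    """Highest "High" among rows whose "Open time" is within the last `lookback_hours`."""
    # Measure the window back from the most recent kline in the frame
    cutoff = df["Open time"].max() - pd.Timedelta(hours=lookback_hours)
    return df.loc[df["Open time"] > cutoff, "High"].max()

# Made-up stand-in for the frame returned by get_minute_data: 288 five-minute rows
times = pd.date_range("2022-01-01", periods=288, freq="5min")
df = pd.DataFrame({"Open time": times, "High": [float(i % 50) for i in range(288)]})

print(highest_high_df(df, lookback_hours=6))
print(highest_high_df(df, lookback_hours=1))
```

Filtering on "Open time" instead of on row positions means the window stays correct even if a few klines are missing from the response.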