I tried solving a problem on HackerRank, called 'Apple and orange', here's the code:
def countApplesAndOranges(s, t, a, b, apples, oranges):
count_apples = 0
count_oranges = 0
x = [x for x in range(s, t 1)]
pos_apple = [apple a for apple in apples]
pos_orange = [orange b for orange in oranges]
for i in x:
for j in pos_apple:
if j == i:
count_apples =1
for l in pos_orange:
if l == i:
count_oranges = 1
print(count_apples)
print(count_oranges)
The code works. However, when I try submitting it, it passes the first 3 tests and fails the rest with the exception 'Terminated due to timeout'. I checked out the input from one of the tests and it's a huge amount of data to process, you can take a look at the data here: https://hr-testcases-us-east-1.s3.amazonaws.com/25220/input03.txt?AWSAccessKeyId=AKIAR6O7GJNX5DNFO3PV&Expires=1642016820&Signature=J4ypdP0YzRxcOWp+y5XaD5ITeMw=&response-content-type=text/plain
It fails cause it takes like 2 minutes to process the code with the same input via my IDE, but the HackerRank tests are limited to 10 seconds.
I need your help to optimize the code and get it to run faster.
Nested loops seem to be the biggest problem here, but I have no idea what should I replace the with.
CodePudding user response:
Checking each apple/orange against each coordinate in range turns your code's runtime complexity into O(n * a n * o) where n is the length of the house, a is the number of the apples and o is the number of oranges. Ideally, your code must run in O(a o).
Here's a refactored version of your solution:
def countApplesAndOranges(s, t, a, b, apples, oranges):
count_apples = 0
count_oranges = 0
for apple in pos_apple:
if s <= apple a <= t:
count_apples =1
for orange in pos_orange:
if s <= orange b <= t:
count_oranges = 1
print(count_apples)
print(count_oranges)
CodePudding user response:
Do not build intermediate lists if you can, if this is really needed (this is not the case for tis problem) use a generator instead
And try to do not repeat yourself, factorize with a function when it is possible
def countApplesAndOranges(home_start, home_end, tree_apple, tree_orange, apples, oranges):
def count_fruits(home_start, home_end, tree, fruits):
count = 0
for fruit in fruits:
if home_start <= tree fruit <= home_end:
count =1
return count
print(count_fruits(home_start, home_end, tree_apple, apples))
print(count_fruits(home_start, home_end, tree_orange, oranges))
CodePudding user response:
I think I would sum() a couple of list comprehensions as a starting point:
apple_hits = sum(
1 for apple in apples
if house_min <= apple apple_tree_origin <= house_max
)
One thing that points out to me though is that subtracting apple_tree_origin from the sides of that test should not change it:
apple_hits = sum(
1 for apple in apples
if house_min - apple_tree_origin <= apple <= house_max - apple_tree_origin
)
Now we might observe that house_min - apple_tree_origin can be precomputed and the rest my answer falls out as:
def countApplesAndOranges1(s, t, a, b, apples, oranges):
house_min = s
house_max = t
apple_tree_origin = a
orange_tree_origin = b
house_min_apples = house_min - apple_tree_origin
house_max_apples = house_max - apple_tree_origin
house_min_oranges = house_min - orange_tree_origin
house_max_oranges = house_max - orange_tree_origin
apple_hits = sum(1 for apple in apples if house_min_apples <= apple <= house_max_apples)
orange_hits = sum(1 for orange in oranges if house_min_oranges <= orange <= house_max_oranges)
return apple_hits, orange_hits
With the test data you provided I get (18409, 19582). Hopefully that is correct.
Feel free to timeit against other solutions:
import timeit
setup = """
with open("apples_oranges.txt") as file_in:
s,t = list(map(int, file_in.readline().split()))
a,b = list(map(int, file_in.readline().split()))
m,n = list(map(int, file_in.readline().split()))
apples = list(map(int, file_in.readline().split()))
oranges = list(map(int, file_in.readline().split()))
def countApplesAndOranges_jonsg(s, t, a, b, apples, oranges):
house_min = s
house_max = t
apple_tree_origin = a
orange_tree_origin = b
house_min_apples = house_min - apple_tree_origin
house_max_apples = house_max - apple_tree_origin
house_min_oranges = house_min - orange_tree_origin
house_max_oranges = house_max - orange_tree_origin
apple_hits = sum(1 for apple in apples if house_min_apples <= apple <= house_max_apples)
orange_hits = sum(1 for orange in oranges if house_min_oranges <= orange <= house_max_oranges)
return apple_hits, orange_hits
"""
print(timeit.timeit("countApplesAndOranges_jonsg(s, t, a, b, apples, oranges)", setup=setup, number=1000))
CodePudding user response:
Thank you guys!
I figured it all out, and used
if s <= apple a <= t:
in my code as it's the most simple solution to get rid of the nested loop. It works so good it makes me wonder how haven't I thought of it.
I really like all of your solutions and appreciate the help!
