I am writing unit tests that check two data frames for equality by converting them to dictionaries and using unittest's assertDictEqual(). The context is that I'm converting Excel functions to Python, but because the two use different rounding systems, some values are off by +/- 1.
I've attempted to use DF.round(-1) to round to the nearest 10, but because of the +/- 1 difference some numbers may round in opposite directions. For example, 15 would round up but 14 would round down, and the test would fail. All values in the 12x20 data frame are integers.
What I'm looking for (feel free to suggest any alternate solution):
- A CLEAN way to test for approximate equality of data frames or nested dictionaries
- or a way to make the ones-digit of each element '0' to avoid the rounding issue
Thank you, and please let me know if any additional context is required. Due to confidentiality issues and my NDA (non-disclosure agreement), I cannot share the code, but I can formulate an example if necessary.
CodePudding user response:
You could take the element-wise absolute difference between the two DataFrames and check that all values are within a certain tolerance (in your case 1; note that for integers that differ by exactly 1, the comparison needs to be <= rather than <). For example, we can create two DataFrames with values in the interval [0.0, 1.0).
import numpy as np
import pandas as pd
np.random.seed(42)
# df1 and df2 are 10x10 DataFrames with values in the interval [0.0, 1.0)
df1 = pd.DataFrame(np.random.random_sample((10,10)))
df2 = pd.DataFrame(np.random.random_sample((10,10)))
Then the following should return True:
(abs(df2-df1) < 1).all(axis=None)
And you can write an assert statement like:
assert (abs(df2 - df1) < 1).all(axis=None)
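For the integer case in the question, where values may differ by exactly 1, the same idea can be sketched with an inclusive tolerance. The frames `expected` and `actual` and the helper `frames_within` are hypothetical names made up for this illustration, not part of pandas or unittest.

```python
import numpy as np
import pandas as pd

# Hypothetical integer frames standing in for the Excel and Python results.
expected = pd.DataFrame({"a": [10, 25, 40], "b": [14, 15, 99]})
actual = pd.DataFrame({"a": [11, 24, 40], "b": [15, 14, 100]})


def frames_within(left, right, tol=1):
    """Return True if every element of left and right differs by at most tol."""
    diff = (left - right).abs()
    return bool((diff <= tol).to_numpy().all())


# Every difference here is 0 or 1, so the inclusive tolerance of 1 passes.
within = frames_within(actual, expected)

# A tolerance of 0 demands exact equality, which these frames do not satisfy.
exact = frames_within(actual, expected, tol=0)

# NumPy's testing helper performs a comparable check and raises an
# AssertionError describing the mismatches if the tolerance is exceeded.
# With rtol=0 the criterion is |actual - expected| <= atol.
np.testing.assert_allclose(actual.to_numpy(), expected.to_numpy(), rtol=0, atol=1)
```

Inside a unittest.TestCase, something like self.assertTrue(frames_within(actual, expected)) should work without converting the frames to dictionaries first.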
CodePudding user response:
I'm not 100 percent sure I understand what you are trying to do, but why not just divide by 10 to lose the last digit that is bothering you? Floor division with "//" drops the ones digit and keeps only the remaining digits. You can then multiply by 10 if you want to keep the overall magnitude of the numbers.
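A minimal sketch of this suggestion, using a made-up single-column frame `df`. One caveat to keep in mind: two values that differ by 1 can still land in different tens (19 and 20 below), so truncating the ones digit does not remove the boundary problem on its own.

```python
import pandas as pd

# Hypothetical sample values chosen to include a tens boundary (19 vs 20).
df = pd.DataFrame({"x": [14, 15, 19, 20]})

# Floor-divide by 10 to drop the ones digit, then scale back up.
truncated = (df // 10) * 10

# 14 and 15 now agree (both become 10), but 19 and 20 still differ
# (10 versus 20) even though the original values are only 1 apart.
same_14_15 = truncated["x"].iloc[0] == truncated["x"].iloc[1]
same_19_20 = truncated["x"].iloc[2] == truncated["x"].iloc[3]
```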
