Here is a toy example. I have a string like this:
import numpy as np
z = str([np.nan, "ab", "abc"])
Printed, it looks like "[nan, 'ab', 'abc']", but what I have to process is z = str([np.nan, "ab", "abc"]).
From z, I want to get a list of the strings, excluding nan:
zz = ["ab", "abc"]
To be clear: z is the input (a string that looks like a list), and zz is the desired output (a list).
There is no problem if z doesn't contain nan; in that case ast.literal_eval(z) does the job. With nan, however, I get an error about a malformed node or string.
Note: np.nan doesn't have to be the first element.
CodePudding user response:
ast.literal_eval is suggested over eval exactly because it allows only a very limited set of expressions. As stated in the docs: "Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, None and Ellipsis." In the string, np.nan shows up as the bare name nan, which is none of those, so it cannot be evaluated.
There are a few ways to handle this.
- Remove nan by operating on the string before evaluating it. This might be problematic if you want to avoid also removing nan from inside the actual strings.
- NOT ADVISED - SECURITY RISKS - standard eval can handle this if you define a nan variable in the namespace.
- And finally, I think the best choice but also the hardest to implement: as explained here, you take the source code for ast, subclass it, and reimplement literal_eval in such a way that it knows how to handle the nan string on its own.
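A lighter variant of the last option can be sketched without copying the ast source: parse the string into a syntax tree, swap every bare nan name for a float constant, and hand the tree to ast.literal_eval, which also accepts nodes. This is only a minimal sketch; the names _NanToConstant and parse_list_with_nan are made up for illustration, and it assumes the input is a list display like the one in the question.

```python
import ast

class _NanToConstant(ast.NodeTransformer):
    """Replace the bare name `nan` with a float('nan') constant (hypothetical helper)."""

    def visit_Name(self, node):
        if node.id == "nan":
            return ast.copy_location(ast.Constant(value=float("nan")), node)
        return node  # any other name is left alone and rejected by literal_eval

def parse_list_with_nan(text):
    """Parse a list-like string and keep only its str elements."""
    tree = ast.parse(text, mode="eval")
    tree = _NanToConstant().visit(tree)
    values = ast.literal_eval(tree)  # still refuses calls, attribute access, etc.
    return [v for v in values if isinstance(v, str)]

# np.nan is a plain float, so float("nan") produces the same text here
z = str([float("nan"), "ab", "abc"])
print(parse_list_with_nan(z))  # ['ab', 'abc']
```

Because the replacement happens on the syntax tree, a nan inside a quoted string is a string constant, not a name, and is left untouched.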
CodePudding user response:
As I understand it, your goal is to parse CSV or similar.
If you want a trade-off solution that should work in most cases, you can use a regex to get rid of the "nan". It will fail on strings that contain the substring nan followed by a comma, but this seems to be a reasonably unlikely edge case. Note also that the pattern requires a trailing comma, so a nan in the last position is not removed. It is worth exploring with your real data.
import re
from ast import literal_eval
import numpy as np
z = str([np.nan, "ab", np.nan, "nan,", "abc", "x nan , y", "x nan y"])
literal_eval(re.sub(r'\bnan\s*,\s*', '', z))
output: ['ab', '', 'abc', 'x y', 'x nan y']
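The edge case above (a quoted string that itself contains nan followed by a comma) can be avoided by letting the regex match whole quoted strings first and leave them alone, so only a bare nan gets replaced. The following is a sketch under the assumption that z was produced by str() on a list of strings and floats; the pattern and the name strings_without_nan are illustrative, not taken from any library.

```python
import re
from ast import literal_eval

# Alternative 1 (captured): a complete single- or double-quoted string literal.
# Alternative 2: a bare nan outside of any quoted string.
_TOKEN = re.compile(r"""('(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*")|\bnan\b""")

def strings_without_nan(text):
    """Replace bare nan with None, evaluate safely, keep only str elements."""
    # Quoted strings are put back unchanged; a bare nan becomes None.
    replaced = _TOKEN.sub(lambda m: m.group(1) or "None", text)
    return [v for v in literal_eval(replaced) if isinstance(v, str)]

# float("nan") renders as nan, the same text str() gives for np.nan
z = str([float("nan"), "ab", float("nan"), "nan,", "abc", "x nan , y", float("nan")])
print(strings_without_nan(z))  # ['ab', 'nan,', 'abc', 'x nan , y']
```

Unlike the simpler pattern, this keeps 'nan,' and 'x nan , y' intact and also handles a nan in the last position, at the cost of a less readable regex.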
CodePudding user response:
What about:
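eval(z,{'nan':'nan'}) # nan is evaluated as the string 'nan'; to drop those entries:
[i for i in eval(z,{'nan':'nan'}) if i != 'nan']
Note that eval can execute arbitrary code, so this has security implications if z comes from an untrusted source.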
CodePudding user response:
There are many solutions. One of them, assuming z is already a list rather than a string:
import numpy as np
z = [np.nan, 'string', 'another_one']
string_list = []
for item in z:
    # keep only the items that are instances of str
    if isinstance(item, str):
        string_list.append(item)
CodePudding user response:
Something like this:
import numpy as np
z = [item for item in [np.nan, "ab", "abc"] if isinstance(item, str)]
print(z)
CodePudding user response:
Use the filter() function (with z as a list):
list(filter(lambda f: isinstance(f, str), z))
