I have a DataFrame with the following schema and want to convert it to TFRecords:
root
|-- col1: string (nullable = true)
|-- col2: array (nullable = true)
| |-- element: string (containsNull = true)
|-- col3: array (nullable = true)
| |-- element: string (containsNull = true)
|-- col4: array (nullable = true)
| |-- element: float (containsNull = true)
|-- col5: array (nullable = true)
| |-- element: float (containsNull = true)
|-- col6: array (nullable = true)
| |-- element: integer (containsNull = true)
|-- col7: array (nullable = true)
| |-- element: string (containsNull = true)
|-- col8: array (nullable = true)
| |-- element: string (containsNull = true)
|-- col9: array (nullable = true)
| |-- element: string (containsNull = true)
I'm using the Spark TensorFlow connector:
df.write.mode("overwrite").format("tfrecords").option("recordType", "Example").save("targetpath.tf")
This is the error I get when saving the data as TFRecords:
java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps
I tried the same approach in Databricks Community Edition and got a similar error.
Can anyone help?
CodePudding user response:
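The most probable cause (judging from the information on Maven Central) is that you're using a connector compiled for Scala 2.11 on a Databricks runtime that uses Scala 2.12. A NoSuchMethodError on scala.Predef$.refArrayOps is a typical symptom of that kind of binary mismatch.
Either use DBR 6.4 (which is based on Scala 2.11) for the conversion, or compile the connector for Scala 2.12 and use that build.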
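To confirm the mismatch before switching runtimes, you can compare the Scala suffix in the connector's artifact name with the Scala version of the cluster. The following is only a rough sketch: the helper names are made up for illustration, and the artifact ID `spark-tensorflow-connector_2.11` is used as an example of the usual `_<scala version>` naming convention. On a live cluster the runtime version can usually be read from PySpark with `spark.sparkContext._jvm.scala.util.Properties.versionNumberString()`; here it is passed in as a plain string so the check runs without Spark.

```python
import re
from typing import Optional


def scala_suffix(artifact_id: str) -> Optional[str]:
    """Return the Scala binary version encoded in an artifact ID, e.g. '2.11'.

    Hypothetical helper: assumes the common '<name>_<major>.<minor>' convention.
    """
    match = re.search(r"_(\d+\.\d+)$", artifact_id)
    return match.group(1) if match else None


def binary_version(full_version: str) -> str:
    """Reduce a full Scala version such as '2.12.10' to its binary version '2.12'."""
    return ".".join(full_version.split(".")[:2])


def is_compatible(artifact_id: str, runtime_scala: str) -> bool:
    """True if the artifact has no Scala suffix or the suffix matches the runtime."""
    suffix = scala_suffix(artifact_id)
    return suffix is None or suffix == binary_version(runtime_scala)


# Example values (assumptions): a 2.11 build of the connector on a Scala 2.12 runtime.
print(is_compatible("spark-tensorflow-connector_2.11", "2.12.10"))  # False
print(is_compatible("spark-tensorflow-connector_2.11", "2.11.12"))  # True
```

If the check returns False, the library attached to the cluster was built for a different Scala binary version than the runtime, which matches the situation described above.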
