How to run a SQL query on a Delta table

Time:01-22

I'm having trouble with the Delta Lake docs. I know that I can query a Delta table with Presto, Hive, Spark SQL, and other tools, but the Delta documentation says: "You can load a Delta table as a DataFrame by specifying a table name or a path."

(from the Delta Lake documentation)

That isn't clear to me. How can I run a SQL query against a table loaded that way?

CodePudding user response:

Use the spark.sql() function and reference the table by its path with the delta.`<path>` syntax:

spark.sql("select * from delta.`hdfs://192.168.2.131:9000/Delta_Table/test001`").show()
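The DataFrame route mentioned in the question can be combined with SQL by registering the loaded DataFrame as a temporary view. The following is a minimal sketch, not a definitive implementation: the helper names delta_path_sql and query_via_temp_view, the path /tmp/delta/events, and the view name events are all hypothetical, and spark is assumed to be a SparkSession already configured with the Delta Lake package.

```python
def delta_path_sql(path: str, columns: str = "*") -> str:
    """Build a SELECT statement that addresses a Delta table by its storage path."""
    # The path is wrapped in backticks, so a backtick inside it would break the query.
    if "`" in path:
        raise ValueError("path must not contain a backtick")
    return f"SELECT {columns} FROM delta.`{path}`"


def query_via_temp_view(spark, path: str, view_name: str, sql: str):
    """Load the Delta table at `path` as a DataFrame, expose it as a temp view, run `sql`."""
    df = spark.read.format("delta").load(path)  # load by path as a DataFrame
    df.createOrReplaceTempView(view_name)       # make it visible to spark.sql()
    return spark.sql(sql)


# Usage (assumes `spark` is a SparkSession with Delta Lake configured;
# the path and view name are placeholders):
# spark.sql(delta_path_sql("/tmp/delta/events")).show()
# query_via_temp_view(spark, "/tmp/delta/events", "events",
#                     "SELECT COUNT(*) FROM events").show()
```

If the table is registered in the metastore, spark.table("<table name>") returns the same kind of DataFrame by name instead of by path.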

CodePudding user response:

To read data from Delta Lake tables, you can use the Java API or Python without Apache Spark. For details, see: https://databricks.com/blog/2020/12/22/natively-query-your-delta-lake-with-scala-java-and-python.html

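For example, to load a table into a pandas DataFrame:

# In a shell: install the package, then start Python
pip3 install deltalake
python3
# In the Python session:
from deltalake import DeltaTable
table_path = "/opt/data/delta/my-table"  # path or object-store URI of your Delta table
# Load the whole table into a pandas DataFrame
df = DeltaTable(table_path).to_pandas()
df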
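Loading the table this way gives you a DataFrame, not a SQL engine. If you still want to run SQL without Spark, one option (a swapped-in technique, not part of the deltalake package) is to copy the rows into an in-memory SQLite database from Python's standard library. This is only a sketch for small tables with simple column types; run_sql_on_rows and the table name my_table are hypothetical names.

```python
import sqlite3


def run_sql_on_rows(rows, table_name, sql):
    """Copy a list of dict rows into an in-memory SQLite table and run `sql` on it."""
    if not rows:
        raise ValueError("rows must contain at least one record")
    columns = list(rows[0].keys())
    quoted = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn = sqlite3.connect(":memory:")
    try:
        # Columns are created without declared types; SQLite stores values as given.
        conn.execute(f'CREATE TABLE "{table_name}" ({quoted})')
        conn.executemany(
            f'INSERT INTO "{table_name}" ({quoted}) VALUES ({placeholders})',
            [tuple(row[c] for c in columns) for row in rows],
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Usage (assumes the deltalake package is installed and table_path is defined as above):
# from deltalake import DeltaTable
# rows = DeltaTable(table_path).to_pyarrow_table().to_pylist()
# print(run_sql_on_rows(rows, "my_table", "SELECT COUNT(*) FROM my_table"))
```

Because everything is copied into memory, this suits exploration of small tables; for large tables a Spark-based query is the more appropriate route.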