I created an external table in an Azure Synapse Analytics serverless SQL pool. The file format is CSV, and the table points to a Data Lake Gen2 folder containing multiple CSV files that hold the actual data. The CSV files are updated from time to time.
I would like to anticipate the problems that may arise when a user runs a long-running query against the external table while the underlying CSV files are being updated. Will the query fail, or could the result set contain dirty data or inconsistent results?
CodePudding user response:
There is no issue as such in connecting a Synapse serverless SQL pool to Azure Data Lake. Synapse is well suited to querying, transforming, and analyzing data stored in the data lake.
Microsoft provides a well-explained troubleshooting document in case of any error. Please refer to Troubleshoot the Azure Synapse Analytics.
CodePudding user response:
Synapse serverless SQL lets you control this behavior. If you want to avoid query failures caused by files that are constantly being appended to, you can use the ALLOW_INCONSISTENT_READS option.
You can see the details here: https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/query-single-csv-file#querying-appendable-files
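
As a rough sketch of how the ALLOW_INCONSISTENT_READS option might be applied, the statements below set it on an external table through TABLE_OPTIONS and on an ad hoc OPENROWSET query through ROWSET_OPTIONS. The names dbo.SalesCsv, MyDataLake, CsvFormat, the sales/*.csv path, and the columns are hypothetical placeholders, and the example assumes the external data source and file format already exist. As far as I know, the option is intended for append-only CSV files: it skips the file modification check and reads whatever is in the file at that moment, so rows may still be inconsistent if existing content is rewritten in place.

```sql
-- Hypothetical names: dbo.SalesCsv, MyDataLake, CsvFormat, and the column list.
-- Assumes the external data source and external file format were created beforehand.
CREATE EXTERNAL TABLE dbo.SalesCsv (
    OrderId   INT,
    OrderDate DATE,
    Amount    DECIMAL(18, 2)
)
WITH (
    LOCATION = 'sales/*.csv',
    DATA_SOURCE = MyDataLake,
    FILE_FORMAT = CsvFormat,
    -- Tolerate files that change (are appended to) while a query is running
    TABLE_OPTIONS = N'{"READ_OPTIONS":["ALLOW_INCONSISTENT_READS"]}'
);

-- The same read option on an ad hoc query over the same (hypothetical) folder
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'sales/*.csv',
    DATA_SOURCE = 'MyDataLake',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE,
    ROWSET_OPTIONS = '{"READ_OPTIONS":["ALLOW_INCONSISTENT_READS"]}'
) AS rows;
```

If the files are overwritten rather than appended to, a safer pattern is probably to write new files to a separate folder and switch the table location once the write has finished.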
