I have a Pandas dataframe df which I want to push to a relational database as a table. I setup a connection object (<Connection>) using SQLAlchemy (pyodbc is the connection engine), and called the command
df.to_sql(<Table_Name>, <Connection>)
which I was able to confirm was written as a table to the desired relational database by visual examination of it in SQL Server Management Studio (SSMS). But in the left-hand-side list of databases and their tables in SSMS I see that it has named it
<Sender_Server>\<Username>.<Table_Name>
where <Sender_Server> is (I think) related to the name of the server I ran the Python command from, <Username> is my username on that server, and <Table_Name> is the desired table name.
When I right-click on the table and select to query the top one thousand rows I get a query of the form
SELECT * FROM [<Data_Base_Name>].[<Sender_Server>\<Username>].[<Table_Name>]
which also has the <Sender_Server>\<Username> info in it. The inclusion of <Sender_Server>\<Username> is undesired behaviour in our use case.
How can I instead have the data pushed such that
SELECT * FROM [<Data_Base_Name>].[<Table_Name>]
is the appropriate query?
CodePudding user response:
By default, .to_sql() assumes the default schema for the current user unless schema="schema_name" is provided. Say, for example, the database contains a table named dbo.thing and the database user named joan has a default schema named engineering. If Joan does
df.to_sql("thing", engine, if_exists="append", index=False)
it will not append to dbo.thing but will instead try to create an engineering.thing table and append to that. If Joan wants to append to dbo.thing she needs to do
df.to_sql("thing", engine, schema="dbo", if_exists="append", index=False)
