So I'm trying to write a SQL query with parameters in the group by clause and have pyodbc execute it; I have the following schema as such:
http://sqlfiddle.com/#!18/4e7bb7/2
In case sqlfiddle fails,
CREATE TABLE Persons (
Personid int IDENTITY(1,1) PRIMARY KEY,
Name varchar(255) NOT NULL,
Birthday datetime
);
INSERT INTO PERSONS (NAME, BIRTHDAY)
VALUES
('a','20220101'),
('b','20220102'),
('c','20220103'),
('d','20220104'),
('e','20220105'),
('f','20220106'),
('g','20220108'),
('h','20220110'),
('i','20220111'),
('j','20220112'),
('k','20220113'),
('l','20220114'),
('m','20220115')
and I have the following query as an example, which is valid SQL:
select
COUNT(*)
,dateadd(week, datediff(week,0, birthday), 0)
from Persons
group by
dateadd(week, datediff(week,0, birthday), 0)
order by
dateadd(week, datediff(week,0, birthday), 0)
This query will group the users' birthdays by week. This is simply a contrived example. I have real data that is similar to this.
I'm trying to write a python function where it would grab the data and group them by week. I want this function to be able to decide what day of the week is the start of the week. I have the following function:
import pyodbc
def TestSQLServerDB2(dayOfWeekStart=0):
"""Tests the query.
:param: dayOfWeekStart: Int. 0 = Monday, 1 = Tuesday, ... 6 = Saturday
"""
hostname = 'DESKTOPHOST'
database_instance = "test"
db_conn = pyodbc.connect('Trusted_Connection=yes;' r"DRIVER=" "{SQL SERVER}"
";SERVER=" hostname ";DATABASE="
database_instance ";")
targetWeekday = dayOfWeekStart % 7
sql = '''
select
COUNT(*)
,dateadd(week, datediff(week,0, birthday), ?)
from Persons
group by
dateadd(week, datediff(week,0, birthday), ?)
order by
dateadd(week, datediff(week,0, birthday), ?)
'''
params = (targetWeekday ,targetWeekday ,targetWeekday )
cur = db_conn.cursor()
cur.execute(sql, params)
print(cur.fetchall())
Running this function will produce the following error:
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Column 'Persons.Birthday' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. (8120) (SQLExecDirectW); [42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)")
It's trivial enough for me to do a string replace on the SQL query prior to executing, and I won't face any issues; what I want to know is whether SQL parameterized statements are just limited to value comparisons in the WHERE clause basically or not? Is this an intended design?
Thanks
CodePudding user response:
The problem is that you are essentially passing the parameter into the query three times, and the GROUP BY and SELECT are using different parameters. Although you know they are the same, the server does not. Effectively, your query is:
select
COUNT(*)
,dateadd(week, datediff(week,0, birthday), @p0)
from Persons
group by
dateadd(week, datediff(week,0, birthday), @p1)
order by
dateadd(week, datediff(week,0, birthday), @p2)
And it's quite obvious why it doesn't work.
The solution is to use the same value on each.
- Unfortunately, you cannot use named parameters with pyodbc, so that is out.
- You could use a variable, but that would disable parameter sniffing, which may be undesirable.
- You could put the whole thing inside a nested subquery/derived table. However nesting is messy, and over-complicates the query.
- Instead, put the value inside a
CROSS APPLY (VALUES, which means you can reuse it anywhere in the query
select
COUNT(*)
,v.birthweek
from Persons
cross apply (values (
dateadd(week, datediff(week,0, birthday), ?)
) ) v(birthweek)
group by
v.birthweek
order by
v.birthweek;
This is in any case a sensible solution to avoid repetition of calculations.
