Home > Software engineering >  Improper identifier with SQL compilation error in Snowflake
Improper identifier with SQL compilation error in Snowflake

Time:02-08

I am trying join two table in SQL using the following code:

SELECT 
  lo.PATIENT_ID AS pid,
  lo.SERVICE_CODE,
  de.PATIENT_ID as pid,
  de.STATE,
  de.GENDER,
  de.ETHNICITY
FROM
  "DB"."SCHEMA"."TABLE" AS lo
INNER JOIN
  "DB"."SCHEMA"."TABLE" AS de
ON
  lo.pid = de.pid
WHERE 
  lo.SERVICE_CODE = `2345.7';

I am getting an error stating "SQL compilation error: invalid identifier 'LO.PID'. Line 13 is referring to the ON statement "lo.pid = de.pid" The data types are the same. Is there a glaring issue in the code. I've tried changing the alias on both the tables and variables but that didn't seem to work.

CodePudding user response:

For the avoidance of doubt, the commenters are suggesting the following changes:

SELECT 
  lo.PATIENT_ID AS lo_pid,
  lo.SERVICE_CODE,
  de.PATIENT_ID as de_pid,
  de.STATE,
  de.GENDER,
  de.ETHNICITY
FROM
  "DB"."SCHEMA"."TABLE" lo
  INNER JOIN "DB"."SCHEMA"."TABLE" de ON lo.PATIENT_ID = de.PATIENT_ID 
WHERE 
  lo.SERVICE_CODE = '2345.7';

JOIN runs first, then WHERE, then SELECT. You cannot use your alias pid in a JOIN because it hasn't been created yet. Naming two different columns the same would potentially work until you came to the point where you wanted to use this query in something else, then it would cause a problem. Ensure aliases created in the SELECT block are unique

If "DB"."SCHEMA"."TABLE" is repeated because you truly do want to join the table to itself, I'd point out that this might be an error, and at the very least unnecessary if PATIENT_ID is the PK. It might have some use if the table has multiple rows with the same patient_id, and you want to cross them together, but it;s more likely that your lo and de should represent different tables; Check if one of the "DB"."SCHEMA"."TABLE" is a typo

CodePudding user response:

the problem is as everyone else says "you on clauses have to referrer to the table.column not the alias from the select region.

reproduction:

This SQL is wrong and gives your error:

with table_a (id, val_a) AS (
    SELECT * FROM VALUES
        (1,'one'),
        (2,'two'),
        (3,'three')
), table_b (id, val_b) AS (
    SELECT * FROM VALUES
        (4,'XXXX'),
        (2,'XX'),
        (3,'XXX')
)
SELECT a.id as a_id
       ,b.id as b_id
       ,a.val_a
       ,b.val_b
FROM table_a AS a
JOIN table_b AS b
    on a_id = b_id;

SQL compilation error: error line 18 at position 7 invalid identifier 'A_ID'

corrected:

it's a single change on the ON you use a.id = b.id

with table_a (id, val_a) AS (
    SELECT * FROM VALUES
        (1,'one'),
        (2,'two'),
        (3,'three')
), table_b (id, val_b) AS (
    SELECT * FROM VALUES
        (4,'XXXX'),
        (2,'XX'),
        (3,'XXX')
)
SELECT a.id as a_id
       ,b.id as b_id
       ,a.val_a
       ,b.val_b
FROM table_a AS a
JOIN table_b AS b
    on a.id = b.id
;

gives the correct results:

A_ID B_ID VAL_A VAL_B
2 2 two XX
3 3 three XXX

counter points:

Now to be fair, you can use alias's from the SELECT in the WHERE region, but in the JOIN you are correctly use the table alias lo and de but pid needs to be something on those tables. Which it doesn't feel like it is, as you have an alias of patient_id to pid

which if you have had valid named columns available like in the SQL below, it just works fine:

with table_a (id, pid, val_a) AS (
    SELECT column1, column1, column2 FROM VALUES
        (1,'one'),
        (2,'two'),
        (3,'three')
), table_b (id, pid, val_b) AS (
    SELECT column1, column1, column2 FROM VALUES
        (4,'XXXX'),
        (2,'XX'),
        (3,'XXX')
)
SELECT a.id as pid
       ,b.id as pid
       ,a.val_a
       ,b.val_b
FROM table_a AS a
JOIN table_b AS b
    on a.pid = b.pid
;
  •  Tags:  
  • Related