I have the following XML file which i want to load into SQL:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns4:Dataview1 xmlns="ux:no::ehe:v5:actual:aver" xmlns:ns4="ux:no:ehe:v5:move" xmlns:ns2="ux:no:ehe:v5:centre" xmlns:ns5="ux:no:ehe:v5:tran">
<ns4:Content versionCode="16000">
<ns4:gen start="1961-07-01" end="1961-07-01">
</ns4:gen>
<Weight unit="KG">30.0000</Weight>
<ns4:gen start="2017-09-19">
</ns4:gen>
<ns4:gen start="1961-07-02" end="2016-09-30">
<ns5:translation>
<ns2:En>Example Data English 234354</ns2:En>
</ns5:translation>
<Type>Circle</Type>
<Weight unit="KG">10.0000</Weight>
</ns4:gen>
<ns4:gen start="2016-10-01" end="2017-09-18">
<ns5:translation>
<ns2:De>Beispieldaten Deutschland 23443</ns2:De>
<ns2:En>Example Data English 23443</ns2:En>
</ns5:translation>
<Weight unit="KG">20.0000</Weight>
</ns4:gen>
</ns4:Content>
</ns4:Dataview1>
Following code is used:
CREATE TABLE tbl (
ID INT IDENTITY(1, 1) PRIMARY KEY,
XmlColumn XML
);
INSERT INTO tbl(XmlColumn)
SELECT * FROM OPENROWSET(BULK N'C:\Data\demoset.xml', SINGLE_BLOB) AS x;
WITH XMLNAMESPACES(DEFAULT 'ux:no::ehe:v5:actual:aver',
'ux:no:ehe:v5:tran' as ns5,
'ux:no:ehe:v5:move' AS ns4,
'ux:no:ehe:v5:cat:fill' as ns3,
'ux:no:ehe:v5:centre' as ns2)
SELECT c.value('@versionCode', 'VARCHAR(100)') as versionCode,
x.value('@start', 'DATE') as Start_date,
x.value('@end', 'DATE') as End_date,
x.value('(ns5:translation/ns2:En)[1]', 'VARCHAR(100)') as Trans_EN,
x.value('Type[1]', 'VARCHAR(100)') as Typed,
x.value('(Weight/unit)[1]', 'VARCHAR(100)') as Unitdata,
x.value('Weight[1]', 'VARCHAR(100)') as Weightdata
INTO dbo.testtbl
FROM tbl
CROSS APPLY XmlColumn.nodes('/ns4:Dataview1/ns4:Content') AS t1(c)
OUTER APPLY t1.c.nodes('ns4:gen') AS t2(x);
I have three questions:
1.) I don't understand why it is not possible to use @ns2:En, @Type, @Weight. Therefore, why do i need to place [1] behind data ns2:En, Type & Weight?
2.) Why is '(Weight/unit)[1]' not working?
3.) If I change outer apply to cross apply results are the same. I think i need to use outer apply since it is a left outer join in my opinion, but I don't understand completely. In my previous topic cross apply is used in the answer. Slow XML import with SQL server
CodePudding user response:
-
.valuealways needs a singleton: in other words it must be statically known (upfront) that there will be exactly one node to retrieve. This is normally done using a predicate[1]to retrieve the first node.- Retrieving a whole node using
.valueimplicitly takes thetext()node (of which there may be more than one), however it is better and more performant to specify it explicitly like(Type/text())[1] - In the case of attributes (see 2.), there is always one unique attribute, so this doesn't need
[1]. But this only applies if you are retrieving a top-level attribute. If you go down a level then you may have multiple attributes, again requiring[1].
- Attributes are retrieved using
@, so you need(Weight/@unit)[1]. - The
.nodesfunction shreds the XML into separate rows.CROSS APPLYimplies that you expect rows and to act like anINNER JOIN, whereasOUTER APPLYimplies that there may not be rows but you still want nulls, like aLEFT JOIN. Up to you as to what you expect.
WITH XMLNAMESPACES(DEFAULT 'ux:no::ehe:v5:actual:aver',
'ux:no:ehe:v5:tran' as ns5,
'ux:no:ehe:v5:move' AS ns4,
'ux:no:ehe:v5:cat:fill' as ns3,
'ux:no:ehe:v5:centre' as ns2)
SELECT c.value('@versionCode', 'VARCHAR(100)') as versionCode,
x.value('@start', 'DATE') as Start_date,
x.value('@end', 'DATE') as End_date,
x.value('(ns5:translation/ns2:En/text())[1]', 'VARCHAR(100)') as Trans_EN,
x.value('(Type/text())[1]', 'VARCHAR(100)') as Typed,
x.value('(Weight/@unit)[1]', 'VARCHAR(20)') as Unitdata,
x.value('(Weight/text())[1]', 'decimal(18,9)') as Weightdata
FROM tbl
CROSS APPLY tbl.XmlColumn.nodes('/ns4:Dataview1/ns4:Content') AS t1(c)
OUTER APPLY t1.c.nodes('ns4:gen') AS t2(x);
You don't need to create a table just to use OPENROWSET, you can feed it straight in:
WITH XMLNAMESPACES(DEFAULT 'ux:no::ehe:v5:actual:aver',
'ux:no:ehe:v5:tran' as ns5,
'ux:no:ehe:v5:move' AS ns4,
'ux:no:ehe:v5:cat:fill' as ns3,
'ux:no:ehe:v5:centre' as ns2)
SELECT c.value('@versionCode', 'VARCHAR(100)') as versionCode,
x.value('@start', 'DATE') as Start_date,
x.value('@end', 'DATE') as End_date,
x.value('(ns5:translation/ns2:En/text())[1]', 'VARCHAR(100)') as Trans_EN,
x.value('(Type/text())[1]', 'VARCHAR(100)') as Typed,
x.value('(Weight/@unit)[1]', 'VARCHAR(20)') as Unitdata,
x.value('(Weight/text())[1]', 'decimal(18,9)') as Weightdata
FROM OPENROWSET(BULK N'C:\Data\demoset.xml', SINGLE_BLOB) AS blk(col)
CROSS APPLY (VALUES (
CAST(blk.col AS xml)
) ) AS t(XmlColumn)
CROSS APPLY t.XmlColumn.nodes('/ns4:Dataview1/ns4:Content') AS t1(c)
OUTER APPLY t1.c.nodes('ns4:gen') AS t2(x);
