Bring a dataset with header into a specific json format with pandas-CodePudding

What I have as Input as string text file

HeaderCol1; HeaderCol2; HeaderCol3
ValueRow1Col1; ValueRow1Col2; VRow1Col3
ValueRow2Col1; ValueRow2Col2; VRow2Col3
ValueRow3Col1; ValueRow3Col2; VRow3Col3

Important to know: This is an example set and there are much more rows and the headers title don't have the same title in every incoming dataset. They can change.

What I need as output in JSON

{
    "Row1": {
        "HeaderCol1": "ValueRow1C1",
        "HeaderCol2": "ValueRow1C2",
        "HeaderCol3": "ValueRow1C3"
    },
    "Row2": {
        "HeaderCol1": "ValueRow2C1",
        "HeaderCol2": "ValueRow2C2",
        "HeaderCol3": "ValueRow2C3"
    },
    "Row3": {
        "HeaderCol1": "ValueRow3C1",
        "HeaderCol2": "ValueRow3C2",
        "HeaderCol3": "ValueRow3C3"
    }
}

What I already did

#rawdata is the textfile as str()
import panda as pd    
list1 = (rawdata.splitlines())
list2 = [[el] for el in list1]
df = pd.DataFrame(
    [list2],
    columns= ['HeaderCol1', 'HeaderCol2', 'HeaderCol3']
    )

This brings an error: ValueError: 3 columns passed, passed data had 4 columns

CodePudding user response：

You can do this directly with pandas:

df = pd.read_csv('path_to_file.txt', sep=';')

Output:

      HeaderCol1      HeaderCol2  HeaderCol3
0  ValueRow1Col1   ValueRow1Col2   VRow1Col3
1  ValueRow2Col1   ValueRow2Col2   VRow2Col3
2  ValueRow3Col1   ValueRow3Col2   VRow3Col3

Then simply call:

df.to_json(orient='index')

Output:

'{"0":{"HeaderCol1":"ValueRow1Col1"," HeaderCol2":" ValueRow1Col2"," HeaderCol3":" VRow1Col3"},"1":{"HeaderCol1":"ValueRow2Col1"," HeaderCol2":" ValueRow2Col2"," HeaderCol3":" VRow2Col3"},"2":{"HeaderCol1":"ValueRow3Col1"," HeaderCol2":" ValueRow3Col2"," HeaderCol3":" VRow3Col3"}}'

EDIT

Reading from a buffer instead:

from io import StringIO

df = pd.read_csv(StringIO(rawdata), sep=';')