I want to group something to become a dictionary and list. Also group them based on the heading

Time:01-10

I want to group the following data into a dictionary, with the names, sections, and grades each collected in a list.

My input data:

mystring = """STUDENT;SECTION;GRADE
    Abordo;BSIT4A;2.25
    Agustin;BSIT4A;1.75
    Asiatico;BSIT4A;3.00
    Asilo;BSIT4A;2.75
    Bernabe;BSIT4A;2.25
    Borja;BSIT4A;2.00
    Botabara;BSIT4A;3.00
    Cagoco;BSIT4A;3.00
    Cariño;BSIT4A;3.00
    Cruz;BSIT4A;3.00
    Dapatnapo;BSIT4A;3.00
    Darupan;BSIT4A;2.25
    Delos Reyes;BSIT4A;3.00
    Ono;BSIT4A;3.00
    Torres;BSIT4A;2.50
    Ugale;BSIT4A;2.25
    Elpedes;BSIT4B;3.00
    Endozo;BSIT4B;2.50
    Estrada;BSIT4B;3.00
    Evangelista;BSIT4B;2.75
    Fernandez;BSIT4B;3.00
    Flores;BSIT4B;3.00
    Gayeta;BSIT4B;2.25
    Gernale;BSIT4B;2.25
    Guarino;BSIT4B;2.50
    Lecaros;BSIT4B;3.00
    Legarda;BSIT4B;2.50
    Longcop;BSIT4B;2.75
    Mabansag;BSIT4B;2.75
    Malaluan;BSIT4B;2.50
    Manaba;BSIT4B;2.25
    Manarin;BSIT4B;3.00
    Mengol;BSIT4B;3.00
    Opriasa;BSIT4B;2.50
    Pangan;BSIT4B;1.75
    Cortez;BSIT4C;3.00
    Pantilag;BSIT4C;2.25
    Penuliar;BSIT4C;3.00
    Relojo;BSIT4C;3.00
    Reyes;BSIT4C;2.75
    Salazar;BSIT4C;3.00
    Santiago;BSIT4C;2.25
    Seberre;BSIT4C;3.00
    Suayan;BSIT4C;3.00
    Sulit;BSIT4C;3.00
    Tejada;BSIT4C;2.50
    Tura;BSIT4C;2.25
    Tuvieron;BSIT4C;1.75
    Vicente;BSIT4C;2.25
    Yacub;BSIT4C;2.75"""

My code so far:

a = mystring.split("\n")
for i, j in enumerate(a):
     a[i] = j.replace(";",":")

heading = mystring[0]

I want to separate them based on groups like:

{
    'STUDENT': ['Abordo', 'Agustin', ...],
    'SECTION': ['BSIT4A', 'BSIT4A', ...],
    'GRADE': [2.25, 1.75, ...]
}

I am new to Python, so I would really appreciate any help. So far I have only split the string into lines and replaced the semicolon separators with colons.

CodePudding user response:

Given a shortened version of your data:

mystring = """STUDENT;SECTION;GRADE
    Abordo;BSIT4A;2.25
    Agustin;BSIT4A;1.75
    Asiatico;BSIT4A;3.00
    Asilo;BSIT4A;2.75
    Bernabe;BSIT4A;2.25
    Borja;BSIT4A;2.00
    Botabara;BSIT4A;3.00"""

The first thing I would probably do is reshape it into a "list of lists" or a list of dictionaries. Let's do a list of lists.

rows = [
    [cell.strip() for cell in row.split(";")]
    for row in mystring.split("\n")
]

At this point we can view what we have with print(rows):

[
    ['STUDENT', 'SECTION', 'GRADE'],
    ['Abordo', 'BSIT4A', '2.25'],
    ['Agustin', 'BSIT4A', '1.75'],
    ['Asiatico', 'BSIT4A', '3.00'],
    ['Asilo', 'BSIT4A', '2.75'],
    ['Bernabe', 'BSIT4A', '2.25'],
    ['Borja', 'BSIT4A', '2.00'],
    ['Botabara', 'BSIT4A', '3.00']
]

Now we can reshape that using the data in the first row as the keys of a dictionary with corresponding values appended to lists.

import collections

# The first row holds the column names; the remaining rows hold the data.
headers = rows[0]
results = collections.defaultdict(list)
for row_values in rows[1:]:
    for column_index, column_name in enumerate(headers):
        results[column_name].append(row_values[column_index])

Again we can print what we have (shown here as a plain dict, as you would get from dict(results)):

{
    'STUDENT': ['Abordo', 'Agustin', 'Asiatico', 'Asilo', 'Bernabe', 'Borja', 'Botabara'],
    'SECTION': ['BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A'],
    'GRADE': ['2.25', '1.75', '3.00', '2.75', '2.25', '2.00', '3.00']
}
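
The grades in this result are still strings, whereas the question asked for numbers. Here is a minimal sketch of one way to convert them, assuming every GRADE cell parses as a float. The results dict below is a small hand-written stand-in for the output above, not the full data:

```python
# Hand-written stand-in for the dictionary built above (assumption: same shape).
results = {
    "STUDENT": ["Abordo", "Agustin", "Asiatico"],
    "SECTION": ["BSIT4A", "BSIT4A", "BSIT4A"],
    "GRADE": ["2.25", "1.75", "3.00"],
}

# Replace the list of grade strings with a list of floats.
# float() raises ValueError if a cell is not numeric.
results["GRADE"] = [float(grade) for grade in results["GRADE"]]

print(results["GRADE"])  # [2.25, 1.75, 3.0]
```

The other two columns are left untouched, so they stay as lists of strings.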

Note though that the pandas module would do most of the work for you if you wanted to explore it.
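
If installing pandas is not an option, the standard library's csv module can also handle the splitting. The following is only a sketch of that alternative (it is not the pandas approach), using a shortened copy of the data and a hypothetical variable name, columns:

```python
import csv

mystring = """STUDENT;SECTION;GRADE
    Abordo;BSIT4A;2.25
    Agustin;BSIT4A;1.75
    Asiatico;BSIT4A;3.00"""

# Strip the leading indentation from each line before handing it to csv.
lines = [line.strip() for line in mystring.splitlines()]

# DictReader uses the first line as the field names.
reader = csv.DictReader(lines, delimiter=";")

# Start with one empty list per column, then append each row's values.
columns = {name: [] for name in reader.fieldnames}
for record in reader:
    for name, value in record.items():
        columns[name].append(value)

print(columns)
```

As with the loop above, every value comes back as a string, so the grades would still need converting if you want numbers.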

CodePudding user response:

Here's another way to do it.

First, split the string on the newline character '\n', which gives a list of strings, one per row. Then strip the leading whitespace from each string and split it on ';'. This gives a list of lists, lsts.

Then use the unpacking operator * to pass the sublists to zip as separate arguments. zip yields one tuple per column, in which the first item is the header (the key) and the remaining items are the values. A dict comprehension then builds the desired result.

lsts = [x.lstrip().split(';') for x in mystring.split('\n')]
out = {tpl[0]: list(tpl[1:]) for tpl in zip(*lsts)}

The same code as a one-liner:

out = {tpl[0]: list(tpl[1:]) for tpl in zip(*[x.lstrip().split(';') for x in mystring.split('\n')])}
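
To see what zip(*lsts) does in the code above, here is a small sketch on a hand-written three-row lsts (the shortened data and the name columns are only for illustration):

```python
lsts = [
    ["STUDENT", "SECTION", "GRADE"],
    ["Abordo", "BSIT4A", "2.25"],
    ["Agustin", "BSIT4A", "1.75"],
]

# zip(*lsts) is equivalent to zip(lsts[0], lsts[1], lsts[2]):
# it groups the items found at the same position in every row,
# which turns the rows into columns.
columns = list(zip(*lsts))
print(columns[0])  # ('STUDENT', 'Abordo', 'Agustin')
print(columns[2])  # ('GRADE', '2.25', '1.75')

# The first item of each column tuple is the key, the rest are the values.
out = {tpl[0]: list(tpl[1:]) for tpl in columns}
print(out["SECTION"])  # ['BSIT4A', 'BSIT4A']
```

Note that zip stops at the shortest row, so a row with a missing cell would silently shorten every column.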

Finally, if you need to cast the 'GRADE' values to floats instead of strings, you can do:

out['GRADE'][:] = map(float, out['GRADE'])

Output:

{'STUDENT': ['Abordo', 'Agustin', 'Asiatico', 'Asilo', 'Bernabe', 'Borja', 'Botabara', 'Cagoco', 'Cariño', 
             'Cruz', 'Dapatnapo', 'Darupan', 'Delos Reyes', 'Ono', 'Torres', 'Ugale', 'Elpedes', 'Endozo', 
             'Estrada', 'Evangelista', 'Fernandez', 'Flores', 'Gayeta', 'Gernale', 'Guarino', 'Lecaros', 
             'Legarda', 'Longcop', 'Mabansag', 'Malaluan', 'Manaba', 'Manarin', 'Mengol', 'Opriasa', 'Pangan', 
             'Cortez', 'Pantilag', 'Penuliar', 'Relojo', 'Reyes', 'Salazar', 'Santiago', 'Seberre', 'Suayan', 
             'Sulit', 'Tejada', 'Tura', 'Tuvieron', 'Vicente', 'Yacub'], 
 'SECTION': ['BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 
             'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4A', 'BSIT4B', 'BSIT4B', 'BSIT4B', 'BSIT4B', 
             'BSIT4B', 'BSIT4B', 'BSIT4B', 'BSIT4B', 'BSIT4B', 'BSIT4B', 'BSIT4B', 'BSIT4B', 'BSIT4B', 'BSIT4B', 
             'BSIT4B', 'BSIT4B', 'BSIT4B', 'BSIT4B', 'BSIT4B', 'BSIT4C', 'BSIT4C', 'BSIT4C', 'BSIT4C', 'BSIT4C', 
             'BSIT4C', 'BSIT4C', 'BSIT4C', 'BSIT4C', 'BSIT4C', 'BSIT4C', 'BSIT4C', 'BSIT4C', 'BSIT4C', 'BSIT4C'], 
 'GRADE': [2.25, 1.75, 3.0, 2.75, 2.25, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 2.25, 3.0, 3.0, 2.5, 2.25, 3.0, 2.5, 3.0, 
           2.75, 3.0, 3.0, 2.25, 2.25, 2.5, 3.0, 2.5, 2.75, 2.75, 2.5, 2.25, 3.0, 3.0, 2.5, 1.75, 3.0, 2.25, 3.0, 
           3.0, 2.75, 3.0, 2.25, 3.0, 3.0, 3.0, 2.5, 2.25, 1.75, 2.25, 2.75]}