I would like to create a directed graph using the networkx library in Python.
I have a pandas DataFrame that looks like this:
                             Head Mounted Display  Marker  Smartphone
2D data extrusion                               3       0           1
AgiSoft PhotoScan 3D design                     1       2           2
AuGeo Esri AR template                          1       1           2
BIM                                             1       1           0
Blender 3D design                               0       2           4
Bluetooth localization                          1       1           0
CityEngine                                      3       1           2
GIS data processing                             3       1           2
GNSS localization                               1       2           4
Google ARCore                                   0       1           5
Google SketchUp 3D design                       1       2           0
Image Stitching                                 1       1           4
Java Development Kit                            0       1           0
SLAM                                            1       2           2
Unity 3D                                        8      12          10
Unreal Engine                                   1       1           0
Vuforia                                         2       7           3
As input for the "networkx.DiGraph.add_weighted_edges_from" function, I need to convert this into a list of 3-tuples like this:
('Head Mounted Display', '2D data extrusion', 3),
('Head Mounted Display', 'AgiSoft PhotoScan 3D design', 1),
('Head Mounted Display', 'AuGeo Esri AR template', 1),
etc.
Furthermore, tuples that have a weight of 0 such as:
('Marker', '2D data extrusion', 0)
need to be removed from the list.
Does anyone have an idea how to do this?
Thanks in advance!
CodePudding user response:
Use df.columns[0] to get 'Head Mounted Display' and df.index[i] to get the row names. Here df is the name of your DataFrame.
Then build the tuples with a conditional comprehension. Compare with != rather than "is not", because "is" tests object identity, not value. Note that this only covers the first column:
[(df.columns[0], df.index[i], df.iloc[i, 0]) for i in range(len(df)) if df.iloc[i, 0] != 0]
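To cover every column rather than only the first, one option is a nested comprehension over the columns and their rows. This is only a sketch: the small DataFrame is a stand-in built from three rows of the question's data, and the name `edges` is made up for illustration.

```python
import pandas as pd

# Stand-in for the DataFrame in the question (three of its rows, values copied as-is).
df = pd.DataFrame(
    {"Head Mounted Display": [3, 1, 0], "Marker": [0, 2, 2], "Smartphone": [1, 2, 4]},
    index=["2D data extrusion", "AgiSoft PhotoScan 3D design", "Blender 3D design"],
)

# One (column, row, weight) tuple per non-zero cell.
edges = [
    (col, row, int(weight))
    for col in df.columns
    for row, weight in df[col].items()
    if weight != 0
]
print(edges)
```

The int() call just converts NumPy integers to plain Python ints so the tuples print cleanly; networkx accepts either.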
CodePudding user response:
You can use the code below:
lstOfTuples = []
# visit every cell and keep only the non-zero weights
for i in range(df.shape[0]):
    for j in range(df.shape[1]):
        index = df.index[i]
        col = df.columns[j]
        value = float(df.loc[index, col])
        if value > 0:
            lstOfTuples.append((col, index, value))
lstOfTuples
Then create a directed graph like this (use nx.DiGraph, since nx.Graph is undirected):
import networkx as nx
G = nx.DiGraph()
G.add_weighted_edges_from(ebunch_to_add=lstOfTuples)
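A quick way to confirm that the result is directed and that the weights were stored is to inspect the graph afterwards. The sketch below assumes a short, hand-written edge list in the same (source, target, weight) form as lstOfTuples; the two edges are taken from the question's data.

```python
import networkx as nx

# Hand-written sample in the same shape as lstOfTuples (illustrative only).
edges = [
    ("Head Mounted Display", "2D data extrusion", 3.0),
    ("Smartphone", "2D data extrusion", 1.0),
]

G = nx.DiGraph()
G.add_weighted_edges_from(edges)

print(G.is_directed())
# The weight is stored as the "weight" attribute of the edge.
print(G["Head Mounted Display"]["2D data extrusion"]["weight"])
# In a DiGraph the reverse edge does not exist unless it is added explicitly.
print(G.has_edge("2D data extrusion", "Head Mounted Display"))
```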
CodePudding user response:
You can create the list of required tuples for a single column as follows:
def createTuples(df, onColumn=0):
    # onColumn is the position of the column used as the edge source
    colName = df.columns[onColumn]
    rslt = []
    for r in range(df.shape[0]):
        weight = df.iloc[r, onColumn]
        if weight > 0:
            rslt.append((colName, df.index[r], weight))
    return rslt
This approach lets you choose, by position, the column whose heading goes in the first tuple position. Call it once per column to collect all edges.
CodePudding user response:
Using .melt will help to get you the shape you are interested in. Here's a reproducible example:
import networkx as nx
import pandas as pd
# create a dummy dataframe with a similar structure
df = pd.DataFrame(zip(range(6), range(5, -1, -1)))
df.columns = list("ab")
df.index = list("qwerty")
# flatten the dataframe for easier processing
flat = df.melt(ignore_index=False).reset_index()
# ignore 0
mask = flat["value"] > 0
flat = flat.loc[mask]
# create a directed graph
G = nx.DiGraph()
# fill in the edges (row label -> column label)
for start, end, weight in flat.values:
    G.add_edge(start, end, weight=weight)
print(G.nodes()) # ['w', 'a', 'e', 'r', 't', 'y', 'q', 'b']
print(
G.edges()
) # [('w', 'a'), ('w', 'b'), ('e', 'a'), ('e', 'b'), ('r', 'a'), ('r', 'b'), ('t', 'a'), ('t', 'b'), ('y', 'a'), ('q', 'b')]
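The example above points edges from row label to column label. If the edges should instead run from the column heading to the row label, as in the question's tuples, a possible variant is sketched below. The column names "source", "target" and "weight" are assumptions chosen for this illustration, and the DataFrame is a small stand-in using three rows of the question's data.

```python
import networkx as nx
import pandas as pd

# Stand-in for the question's DataFrame (two columns, three rows).
df = pd.DataFrame(
    {"Head Mounted Display": [3, 1, 0], "Marker": [0, 2, 2]},
    index=["2D data extrusion", "AgiSoft PhotoScan 3D design", "Blender 3D design"],
)

# Move the row labels into a "target" column, then melt the remaining columns.
flat = df.rename_axis("target").reset_index().melt(
    id_vars="target", var_name="source", value_name="weight"
)

# Drop zero-weight rows.
flat = flat[flat["weight"] > 0]

# Plain (source, target, weight) tuples, ready for add_weighted_edges_from.
edges = list(flat[["source", "target", "weight"]].itertuples(index=False, name=None))

G = nx.DiGraph()
G.add_weighted_edges_from(edges)
print(list(G.edges(data="weight")))
```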

