Scatter Plot Binary Data Color Coded Points from Data Labels

Time:01-27

I'd like to use matplotlib.pyplot.scatter to create a scatter plot similar to the picture below. The data is in a dataframe whose header is formatted like the table here. All the points for a given sample should be colored according to the label in the first column, and a point should be plotted only for each gene with a value of 1 (no point for genes with a value of 0):

label  gene a  gene b  gene c  gene d
1      0       1       0       0
0      1       1       0       1
0      0       0       1       0
1      0       0       0       0
1      0       1       0       0

[image: sample scatter plot]

Note: my sample data does not match my sample scatter plot output.

CodePudding user response:

After melting your dataframe to a long format, you can draw a matrix plot with seaborn.
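The melting step could look something like the sketch below. It rebuilds the sample table from the question (the column names 'gene a' to 'gene d' are taken from the table header as written) and turns the row index into a 'sample' column. The 'variable' and 'value' names are simply the pandas melt defaults, which is what the plotting snippet further down expects.

```python
import pandas as pd

# Sample data from the question; column names copied from the table header
wide = pd.DataFrame({
    'label':  [1, 0, 0, 1, 1],
    'gene a': [0, 1, 0, 0, 0],
    'gene b': [1, 1, 0, 0, 1],
    'gene c': [0, 0, 1, 0, 0],
    'gene d': [0, 1, 0, 0, 0],
})

# Name the row index 'sample', make it a column, then melt to long format.
# Each row of df is now one (sample, gene) pair with its label and 0/1 value.
df = wide.rename_axis('sample').reset_index().melt(id_vars=['sample', 'label'])

print(df.head())
```

After this, df has the columns sample, label, variable (the gene name) and value.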

With the melted dataframe you can access plt.scatter directly from pandas, but I think you have to add your own custom legend for the labels.

df.plot(x='variable', y='sample', s=(df.value + 0.1) * 300, kind='scatter',
        ylim=[df['sample'].max() + .5, df['sample'].min() - .5],  # reversed limits flip the y-axis
        figsize=(7, 6), c='label', cmap='coolwarm', colorbar=False
);

[image: pandas matrix plot]
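If you want no point at all for the 0 values, as the question asks, one option is to filter the melted frame before plotting and build the legend by hand. This is only a rough sketch under a few assumptions: it uses the sample data from the question, and the colors, marker size and legend entries are arbitrary choices of mine rather than anything matplotlib requires.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.lines import Line2D

# Sample data from the question, melted to long format
wide = pd.DataFrame({
    'label':  [1, 0, 0, 1, 1],
    'gene a': [0, 1, 0, 0, 0],
    'gene b': [1, 1, 0, 0, 1],
    'gene c': [0, 0, 1, 0, 0],
    'gene d': [0, 1, 0, 0, 0],
})
genes = [col for col in wide.columns if col != 'label']
df = wide.rename_axis('sample').reset_index().melt(id_vars=['sample', 'label'])

# Keep only the (sample, gene) pairs with a value of 1, so zeros are never drawn
hits = df[df['value'] == 1]

# Arbitrary color per label (an assumption, pick whatever you like)
colors = {0: 'tab:blue', 1: 'tab:red'}

# Fixed x position per gene, so a gene with no hits still keeps its column
xpos = {gene: i for i, gene in enumerate(genes)}

fig, ax = plt.subplots(figsize=(7, 6))
points = ax.scatter(hits['variable'].map(xpos), hits['sample'],
                    c=hits['label'].map(colors).tolist(), s=300)

ax.set_xticks(list(range(len(genes))))
ax.set_xticklabels(genes)
ax.set_xlim(-0.5, len(genes) - 0.5)
ax.set_yticks(list(wide.index))
ax.set_ylim(wide.index.max() + 0.5, wide.index.min() - 0.5)  # reversed limits flip the y-axis
ax.set_xlabel('gene')
ax.set_ylabel('sample')

# Custom legend: one proxy marker per label
handles = [Line2D([], [], marker='o', linestyle='', color=color,
                  markersize=10, label=str(label))
           for label, color in colors.items()]
ax.legend(handles=handles, title='label', loc='upper left',
          bbox_to_anchor=(1.02, 1))
fig.tight_layout()
# Call plt.show() or fig.savefig(...) to view the figure
```

Note that a sample whose genes are all 0 (the fourth row of the sample data) gets an empty row in the plot, which seems to be what the question wants.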
