I divided my dataset into two variables, one for the independent columns and one for the dependent column, and split them into training and testing sets. My independent variable is two-dimensional and my dependent variable is one-dimensional. I get a ValueError when I try to plot my data using pyplot. As far as I understand, this is caused by the difference in dimensions between my two variables (x and y). Can anyone explain what is wrong here and how I can correct it?
Since I can't post a screenshot here, here is the link to a screenshot of my code.
CodePudding user response:
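First of all, it is quite common, and usually even necessary, in machine learning to have more than one feature. So your variable x is a matrix with multiple columns (features), while y is a single column. A scatter plot needs exactly one x value for each y value, which is why passing the whole matrix together with y fails.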
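To make the mismatch concrete, here is a minimal sketch. It assumes the error comes from a call like plt.scatter(x_train, y_train); the screenshot is not available, so that is a guess. The arrays are synthetic stand-ins for your data, and only the names x_train and y_train are taken from the question.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt

# Synthetic stand-ins (assumption): 50 samples, 2 features, 1 target column
rng = np.random.default_rng(0)
x_train = rng.normal(size=(50, 2))
y_train = 2.0 * x_train[:, 0] + rng.normal(size=50)

print(x_train.shape, y_train.shape)

# Passing the whole matrix gives scatter 100 x values but only 50 y values
try:
    plt.scatter(x_train, y_train)
    mismatch_error = None
except ValueError as err:
    mismatch_error = err
print("scatter failed with:", mismatch_error)

# Fix: plot one feature column at a time against y
fig, axes = plt.subplots(1, 2, figsize=(8, 3), sharey=True)
for i, ax in enumerate(axes):
    ax.scatter(x_train[:, i], y_train)
    ax.set_xlabel(f"feature {i}")
axes[0].set_ylabel("y")
```

If x_train is a pandas DataFrame rather than a NumPy array, select a column with x_train.iloc[:, i] or by its column name instead of x_train[:, i].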
If you want to make a scatter plot of your x and y data, you have to be more specific about what you want to show. You might want to look at seaborn's scatterplot, as it gives you quick access to a wide range of possibilities.
For example, you might try:
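import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Assumes x_train is a pandas DataFrame and y_train has one value per row.
# "column1" and "column2" are placeholders: replace them with the names
# of two feature columns in x_train.
sns.scatterplot(
    data=x_train,
    x="column1",
    y="column2",
    hue=y_train,  # color each point by its target value
    palette="blend:gold,dodgerblue",
)
plt.show()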
Look at this awesome example from seaborn for heatmap-like scatterplots.
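If you would rather stay with plain matplotlib, a similar picture can be sketched by plotting the two features against each other and coloring each point by its target value. The data below is synthetic and the axis labels are placeholders, so treat it as an illustration rather than a drop-in solution.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt

# Synthetic stand-ins (assumption): 200 samples, 2 features, 1 target column
rng = np.random.default_rng(1)
x_train = rng.normal(size=(200, 2))
y_train = x_train[:, 0] + x_train[:, 1]

fig, ax = plt.subplots()
# One point per sample: feature 0 on the x axis, feature 1 on the y axis,
# with the color encoding the target value
points = ax.scatter(x_train[:, 0], x_train[:, 1], c=y_train, cmap="viridis")
fig.colorbar(points, ax=ax, label="y_train")
ax.set_xlabel("feature 0")
ax.set_ylabel("feature 1")
```

In an interactive session, drop the matplotlib.use("Agg") line and call plt.show() at the end to display the figure.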
