I'm still learning AI and trying to create a linear regression model, but I keep getting a KeyError

I'm learning from a Python course by Tech With Tim. I used his code for a basic linear regression algorithm and made my own small data sheet to try it out. Whenever I import my data sheet, I get the following error:

KeyError: 'xvalue'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
4 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2898                 return self._engine.get_loc(casted_key)
   2899             except KeyError as err:
-> 2900                 raise KeyError(key) from err
   2901 
   2902         if tolerance is not None:

KeyError: 'xvalue'

My code is as follows:

from __future__ import absolute_import, division, print_function, unicode_literals

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import clear_output
from six.moves import urllib

import tensorflow.compat.v2.feature_column as fc

import tensorflow as tf
# Load dataset.
train = "https://docs.google.com/spreadsheets/d/1qQfNL2ePWVsOiqSNgJu_ZwmI9KFwFTrtdb1boqIKFZQ/edit?usp=sharing"
eval = "https://docs.google.com/spreadsheets/d/1Ip1Key9NYx3boTUFkPUOzEWdtUBMeGJJEY3MbSpVkUo/edit?usp=sharing"
dftrain = pd.read_csv(train, sep='\t,\s*') # training data
dfeval = pd.read_csv(eval, sep='\t,\s*') # testing data
y_train = dftrain.pop('xvalue')
y_eval = dfeval.pop('xvalue')

I'm using Google's Python notebook to code this. Any help would be greatly appreciated!

Answer:

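The share link you are passing to read_csv (the one ending in /edit?usp=sharing) returns the spreadsheet's HTML page rather than CSV data, so the resulting DataFrame has no 'xvalue' column and pop raises the KeyError. If you want to export the contents of a Google Sheets spreadsheet, you can use the suggestion in this answer.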

In your case, you're using Pandas, so you don't need to use the requests package.

So you can define this function:

def get_docs_url(doc_id):
    url = f"https://docs.google.com/spreadsheet/ccc?key={doc_id}&output=csv"
    return url

And use it like this:

train = get_docs_url('1qQfNL2ePWVsOiqSNgJu_ZwmI9KFwFTrtdb1boqIKFZQ')
dftrain = pd.read_csv(train) # training data
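
If you would rather paste the share link than copy the document ID by hand, here is a minimal sketch of a wrapper around get_docs_url. The name share_link_to_csv_url and the regular expression are my own assumptions, not part of any Google or Pandas API; the pattern assumes the ID is the path segment after /spreadsheets/d/ and consists of letters, digits, hyphens and underscores.

```python
import re


def get_docs_url(doc_id):
    # Same export URL pattern as in the answer above.
    return f"https://docs.google.com/spreadsheet/ccc?key={doc_id}&output=csv"


def share_link_to_csv_url(share_url):
    # Hypothetical helper: pull the document ID out of a share link.
    # Assumes the ID follows "/spreadsheets/d/" and uses only
    # letters, digits, "-" and "_".
    match = re.search(r"/spreadsheets/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError(f"No spreadsheet ID found in: {share_url}")
    return get_docs_url(match.group(1))


share_url = "https://docs.google.com/spreadsheets/d/1qQfNL2ePWVsOiqSNgJu_ZwmI9KFwFTrtdb1boqIKFZQ/edit?usp=sharing"
csv_url = share_link_to_csv_url(share_url)
print(csv_url)
```

The resulting csv_url can then be passed to pd.read_csv in the same way as train above.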

PS: I removed the sep='\t,\s*' part because it's not needed to parse a CSV file.
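
To see why the custom separator gets in the way, here is a small offline sketch. The CSV text is made-up sample data, not the contents of your sheet. The pattern '\t,\s*' only matches a tab followed by a comma, which never occurs in ordinary comma-separated text, so as far as I can tell Pandas should leave each row as one unsplit column.

```python
import io

import pandas as pd

# Made-up sample data standing in for the exported sheet.
csv_text = "xvalue,yvalue\n1,2\n3,4\n"

# Default separator (a comma): two columns, as expected.
df_default = pd.read_csv(io.StringIO(csv_text))

# The separator from the question, treated as a regular expression.
# Regex separators need the Python parsing engine.
df_regex = pd.read_csv(io.StringIO(csv_text), sep=r"\t,\s*", engine="python")

print(list(df_default.columns))
print(list(df_regex.columns))
print("xvalue" in df_regex.columns)
```

So even with a correct CSV URL, keeping that sep argument would likely still leave you without a column named 'xvalue'.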

PPS: The test dataset has different column names than the train dataset. I needed to add dfeval = dfeval.rename(columns={'X': 'xvalue', 'Y': 'yvalue'}) to get it to work.
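
Here is a rough sketch of that rename step followed by the pop calls from the question. The two DataFrames are built in memory with invented values so the snippet runs without network access; in your notebook they would come from pd.read_csv instead. The column check before pop is my own addition, meant to give a clearer message than the bare KeyError.

```python
import pandas as pd

# Invented stand-ins for the downloaded sheets.
dftrain = pd.DataFrame({"xvalue": [1, 2, 3], "yvalue": [2, 4, 6]})
dfeval = pd.DataFrame({"X": [4, 5], "Y": [8, 10]})

# Align the test column names with the training column names.
dfeval = dfeval.rename(columns={"X": "xvalue", "Y": "yvalue"})

# Optional sanity check: fail early if the two frames still differ.
missing = set(dftrain.columns) - set(dfeval.columns)
if missing:
    raise KeyError(f"dfeval is missing columns: {sorted(missing)}")

# pop removes the column from the frame and returns it as a Series.
y_train = dftrain.pop("xvalue")
y_eval = dfeval.pop("xvalue")

print(list(dfeval.columns))
print(list(y_eval))
```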