How to analyze external data using command line arguments

Time:01-16

I have started writing a program that analyzes external data, using command line arguments to choose what to compute. However, I can't execute the program. Is something wrong, or am I on the right track?

import argparse
parser = argparse.ArgumentParser()
parser.add_argument("statistic", choices=["avg", "max"], help="Which statistic should be run?")
parser.add_argument("variable", choices=["distance", "delay"], help="What variable should be used for the calculation?")
parser.add_argument("tsvfile", help="Name of data file to be analyzed")
import pandas as pd
df = pd.read_csv("flights.tsv", sep="\t")

args = parser.parse_args()
s = args.statistic
v = args.variable
t = args.tsvfile

if s == "avg" and v == "distance" and t == "flights.tsv":
    print(df["DISTANCE"].mean())
elif s == "avg" and v == "delay" and t == "flights.tsv":
    print(df["DEPATURE_DELAY"].mean())
elif s == "max" and v == "delay":
    print(df["DEPATURE_DELAY"].max())
elif s == "max" and v == "distance" and t == "flights.tsv":
    print(df["DISTANCE"].max())

This is the exception I got

I would really appreciate some help.

CodePudding user response:

If you run the script from the command line, you need to pass all three arguments (e.g., python3 tom_script.py avg delay flights.tsv):

$ python3 tom_script.py  # Incorrect

usage: tom_script.py [-h] {avg,max} {distance,delay} tsvfile
tom_script.py: error: the following arguments are required: statistic, variable, tsvfile

$ python3 tom_script.py avg delay flights.tsv  # Correct
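The reason the second command works is that parse_args(), when called without a list, parses sys.argv[1:], which is everything typed after the script name. The following is a minimal sketch of that behaviour. The simulated_argv list is written out by hand purely for illustration, and the help texts from the question are omitted for brevity.

```python
import argparse

# Same three positional arguments as in the question.
parser = argparse.ArgumentParser(prog="tom_script.py")
parser.add_argument("statistic", choices=["avg", "max"])
parser.add_argument("variable", choices=["distance", "delay"])
parser.add_argument("tsvfile")

# What sys.argv would hold for `python3 tom_script.py avg delay flights.tsv`
# (written out by hand here for illustration; element 0 is the script name).
simulated_argv = ["tom_script.py", "avg", "delay", "flights.tsv"]

# parse_args() with no argument parses sys.argv[1:]; passing the same slice
# explicitly gives the same result.
args = parser.parse_args(simulated_argv[1:])
print(args.statistic, args.variable, args.tsvfile)  # avg delay flights.tsv
```

Whatever process starts Python decides what ends up in sys.argv, so the same call can behave differently outside a normal shell invocation.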

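However, to run the code in a Jupyter Notebook, you have to pass the argument values explicitly within a cell (e.g., args = parser.parse_args(args=['avg', 'delay', 'flights.tsv', ])). Otherwise parse_args() reads the command line of the Jupyter kernel launcher rather than your own arguments: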

NOTE - Tested using Ubuntu 20.04, Python 3.8, IPython 7.13, Firefox 95.0, and Jupyter 6.0

In [1]: import argparse

In [2]: parser = argparse.ArgumentParser()

In [3]: parser.add_argument("statistic", choices=["avg", "max", ], help="Which statistic should be run?")
        parser.add_argument("variable", choices=["distance", "delay", ],
                            help="What variable should be used for the calculation?")
        parser.add_argument("tsvfile", help="Name of data file to be analyzed")

Out[3]: _StoreAction(option_strings=[], dest='tsvfile', nargs=None, const=None, default=None, type=None, choices=None, help='Name of data file to be analyzed', metavar=None)

In [4]: # This does not work
        args = parser.parse_args()

        usage: ipykernel_launcher.py [-h] {avg,max} {distance,delay} tsvfile
        ipykernel_launcher.py: error: argument statistic: invalid choice: '/home/stack/.local/share/jupyter/runtime/kernel-66ce9d80-ab79-4f2f-9836-c98cdbcd20c5.json' (choose from 'avg', 'max')
        ERROR:root:Internal Python error in the inspect module.
        Below is the traceback from this internal error.
        ...

In [5]: # This does not work either (this also caused your exception)
        args = parser.parse_args(args=[])

        usage: ipykernel_launcher.py [-h] {avg,max} {distance,delay} tsvfile
        ipykernel_launcher.py: error: the following arguments are required: statistic, variable, tsvfile
        An exception has occurred, use %tb to see the full traceback.

        SystemExit: 2

In [6]: # This works
        args = parser.parse_args(args=['avg', 'delay', 'flights.tsv', ])

In [7]: s = args.statistic
        v = args.variable
        t = args.tsvfile

In [8]: print(s, v, t)

        avg delay flights.tsv
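
One way to make the same code usable from both a shell and a notebook is to wrap the parsing in a function that accepts an optional argument list. The following is only a sketch under some assumptions: build_parser, main and COLUMNS are hypothetical names introduced here, the column names are copied from the question as written, and the file is assumed to be tab-separated. It also reads the file named by tsvfile instead of a hard-coded path, and looks the column up in a dictionary instead of using an if/elif chain.

```python
import argparse

# Column names copied from the question, including its spelling of
# DEPATURE_DELAY; adjust them if the header of your file differs.
COLUMNS = {"distance": "DISTANCE", "delay": "DEPATURE_DELAY"}


def build_parser():
    """Create the parser with the three positional arguments from the question."""
    parser = argparse.ArgumentParser()
    parser.add_argument("statistic", choices=["avg", "max"],
                        help="Which statistic should be run?")
    parser.add_argument("variable", choices=["distance", "delay"],
                        help="What variable should be used for the calculation?")
    parser.add_argument("tsvfile", help="Name of data file to be analyzed")
    return parser


def main(argv=None):
    """Parse argv (or sys.argv[1:] when argv is None) and print the statistic."""
    args = build_parser().parse_args(argv)

    # Imported here so that argument errors are reported before pandas loads.
    import pandas as pd

    # Read the file named in the arguments instead of a hard-coded path.
    df = pd.read_csv(args.tsvfile, sep="\t")
    series = df[COLUMNS[args.variable]]
    result = series.mean() if args.statistic == "avg" else series.max()
    print(result)
    return result
```

In a notebook cell you would call main(["avg", "delay", "flights.tsv"]). In a script file, calling main() with no argument lets argparse read the real command line.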