Home > database >  Macro that loads multiple datasets that updates and change names?
Macro that loads multiple datasets that updates and change names?

Time:01-06

Im working with a database in SAS that updates every so often. I want the macro to automatically load the most recent dataset of a given year. The datasets cover the years 2015-2018 and each year has a different updated version which is stated in the name of the dataset, i.e. 2015_version9. With my current code you need to update the macro manually everytime a dataset change its version and name.

CodePudding user response:

You can scan through each library and find the max version number, then save those to a single macro variable string that you can supply to a set statement. Here are the assumptions of this solution:

  1. Your libraries are named lib_2015, lib_2016, etc. and follow 8-char libname requirements
  2. Your libraries are static for years 2015-2018
  3. Your datasets are named _version1, _version2, etc.

Here's how we'll do it.

%let libraries = "LIB_2015", "LIB_2016", "LIB_2017", "LIB_2018";
    
proc sql noprint;
    select cats(libname, '.', memname)
         , input(compress(memname,,'KD'), 8.) as version
    into :data separated by ' '
    from dictionary.members
    where     upcase(libname) IN(&libraries.)
          AND upcase(memname) LIKE "^_VERSION%" escape '^'
    group by libname
    having version = max(version)
    ;
quit;

data want;
    set &data. indsname=name;
    dsn = name;
run;

This code does the following:

  • Gets all dataset names from each library that starts with _VERSION. The ^ in the like clause is an escape character that we defined so that we can match _ literally.
  • Removes all non-digits from the dataset name and converts it to a version number, version. The KD option in the compress() function says to keep only digits from the string.
  • Keeps only names in each library where version is the highest value
  • Saves all the dataset names to a single macro variable, &data

&data will store a string of all the relevant datasets you want with the highest version number for each library. For example:

%put &data.;

LIB_2015._VERSION9 LIB_2016._VERSION19 LIB_2017._VERSION12 LIB_2018._VERSION8

The indsname option in the data step will store the full dataset name of each observation. We're saving that to a variable named dsn. This shows where each observation comes from so you can split them out to individual datasets as needed.

  •  Tags:  
  • Related