I have data with current names of companies, old names, and the date of name changes. It looks like this:
| name | former_name1 | name_change_date1 |
|---|---|---|
| ACMAT CORP | nan | NaT |
| ACME ELECTRIC CORP | nan | NaT |
| ACME UNITED CORP | nan | NaT |
| COLUMBIA ACORN TRUST | LIBERTY ACORN TRUST | 2003-10-20 |
| MULTIGRAPHICS INC | AM INTERNATIONAL INC | 1997-03-17 |
| MILLER LLOYD I III | nan | NaT |
| AFFILIATED COMPUTER SERVICES INC | nan | NaT |
| ADAMS RESOURCES & ENERGY, INC. | ADAMS RESOURCES & ENERGY INC | 2005-04-01 |
| BK Technologies Corp | BK Technologies, Inc. | 2019-03-28 |
I want to figure out what the name of each company was at a particular date. Let's say I want to figure out the name of a company as of January 1st 2002. Then I could create a new column called say, edited_name, which would contain the current name of the company unless the company has changed names since 1/1/2002, in which case it would contain the historical name (i.e. former_name1) of the company. So the output should look something like this:
| name | former_name1 | name_change_date1 | edited_name |
|---|---|---|---|
| ACMAT CORP | nan | NaT | ACMAT CORP |
| ACME ELECTRIC CORP | nan | NaT | ACME ELECTRIC CORP |
| ACME UNITED CORP | nan | NaT | ACME UNITED CORP |
| COLUMBIA ACORN TRUST | LIBERTY ACORN TRUST | 2003-10-20 | LIBERTY ACORN TRUST |
| MULTIGRAPHICS INC | AM INTERNATIONAL INC | 1997-03-17 | MULTIGRAPHICS INC |
| MILLER LLOYD I III | nan | NaT | MILLER LLOYD I III |
| AFFILIATED COMPUTER SERVICES INC | nan | NaT | AFFILIATED COMPUTER SERVICES INC |
| ADAMS RESOURCES & ENERGY, INC. | ADAMS RESOURCES & ENERGY INC | 2005-04-01 | ADAMS RESOURCES & ENERGY INC |
| BK Technologies Corp | BK Technologies, Inc. | 2019-03-28 | BK Technologies, Inc. |
In Stata (with which I am much more familiar) this could be easily accomplished with:
gen edited_name = name
replace edited_name = former_name1 if name_change_date_1 > date("2002-01-01", "YMD") & name_change_date_1 != .
Unfortunately I am at a loss of how to accomplish this in Python/Pandas.
Data:
{'name': ['ACMAT CORP', 'ACME ELECTRIC CORP', 'ACME UNITED CORP', 'COLUMBIA ACORN TRUST',
'MULTIGRAPHICS INC', 'MILLER LLOYD I III', 'AFFILIATED COMPUTER SERVICES INC',
'ADAMS RESOURCES & ENERGY, INC.', 'BK Technologies Corp'],
'former_name1': [nan, nan, nan, 'LIBERTY ACORN TRUST', 'AM INTERNATIONAL INC', nan, nan,
'ADAMS RESOURCES & ENERGY INC', 'BK Technologies, Inc.'],
'name_change_date1': [NaT, NaT, NaT, '2003-10-20', '1997-03-17', NaT, NaT,
'2005-04-01', '2019-03-28']}
CodePudding user response:

