This notebook uses SAS to visualize the performance of the Florida Retirement System pension fund, one of the largest in the United States. The data is from the Center for Retirement Research at Boston College, which tracks 180 pension funds across the United States. The performance of the fund is compared to those of other large funds.
Reading the data:
/*Reading in the dataset*/
proc import datafile="pension_plans/pension_plans.csv"
out=pension_plans
dbms=csv
replace;
getnames=yes;
/*guessingrows=max makes it go a lot slower but it's necessary because
if the first rows of a numeric column are blanks, SAS labels the column
as a text (character) column*/
guessingrows=max;
run;
/*I want to print the dataset dimensions.
The CONTENTS statement prints a *screenful* of information.
I just need the number of rows and columns. Using the option
"short" creates a dataset pension_plans_columns where each
observation is a variable from the original data set.*/
proc contents
data=pension_plans out=pension_plans_columns short noprint;
run;
/*Then we create two dummy tables, one from the original dataset
and one from the columns dataset, and merge them*/
pro sql;
title j=l "Reading the data" ;
/*Creating dummy tables NumRowsTable and NumColumnsTable*/
create table NumRowsTable as
select count(*) as Number_of_rows label = "Number of rows" from pension_plans;
create table NumColumnsTable as
select count(*) as Number_of_columns label = "Number of columns" from pension_plans_columns;
/*Adding a Dataset column to both dummy tables so we have something to merge on*/
alter table NumRowsTable
add Dataset char(30);
update NumRowsTable
set Dataset = "pension_plans dataset";
alter table NumColumnsTable
add Dataset char(30);
update NumColumnsTable
set Dataset = "pension_plans dataset";
run;
/*Merging the two tables*/
data NumRowsAndColumns;
merge work.NumRowsTable work.NumColumnsTable;
by Dataset;
run;
/*Finally*/
proc sql;
select Dataset, Number_of_Rows, Number_of_columns from NumRowsAndColumns;
run;
SAS comes with its own version of SQL, which we can use to get the asset allocations of FRS. The CRR provides detailed documentation of the dataset, so we know which columns to query.
proc sql;
title j=l "Investment allocations of FRS";
create table pension_plans_frs_alloc as
select PlanName, fy as Year,
EQTotal_Actl format=percent8.2 as Equities,
FITotal_Actl format=percent8.2 as Fixed_Income,
RETotal_Actl format=percent8.2 as Real_Estate,
PETotal_Actl format=percent8.2 as Private_Equity,
HFTotal_Actl format=percent8.2 as Hedge_Funds,
CashTotal_Actl format=percent8.2 as Cash
from work.pension_plans
where PlanName = "Florida RS";
select * from work.pension_plans_frs_alloc; /*show new table*/
quit;
We can use sgplot
to plot the asset allocations over time.
proc sgplot data=work.pension_plans_frs_alloc;
series x=Year y=Equities / lineattrs=(color=CX999999 thickness=2);
series x=Year y=Fixed_Income / lineattrs=(color=CXE69F00 thickness=2);
series x=Year y=Real_Estate / lineattrs=(color=CX56B4E9 thickness=2);
series x=Year y=Private_Equity / lineattrs=(color=CX009E73 thickness=2);
series x=Year y=Hedge_Funds / lineattrs=(color=CXF0E442 thickness=2);
series x=Year y=Cash / lineattrs=(color=CX0072B2 thickness=2);
yaxis grid label=" " ;
xaxis values=(2000 to 2019 by 1);
title "Investment percentage allocations of Florida RS pension fund" ;
ods graphics / noborder;
keylegend / position=top;
ods graphics on / width=8in;
run;
FRS used to allocate about 90% of its funds to equities and fixed income investments, but has reduced its allocations to those two asset classes to about 75%. FRS has roughly doubled its allocations to real estate and private equity, and has diversified into hedge funds as well.
We can run nearly identical code to tabulate and plot the returns for each asset class, as well as the total yearly returns.
proc sql;
create table pension_plans_frs_alloc_returns as
select PlanName, fy as Year,
EQTotal_Rtrn format=percent8.2 as Equities,
FITotal_Rtrn format=percent8.2 as Fixed_Income,
RETotal_Rtrn format=percent8.2 as Real_Estate,
PETotal_Rtrn format=percent8.2 as Private_Equity,
HFTotal_Rtrn format=percent8.2 as Hedge_Funds,
CashTotal_Rtrn format=percent8.2 as Cash,
InvestmentReturn_1yr format=percent8.2 as Total
from work.pension_plans
where PlanName = "Florida RS";
select * from work.pension_plans_frs_alloc_returns; /*show new table*/
title j=l "Investment returns by asset class of Florida RS pension fund" ;
quit;
proc sgplot data=work.pension_plans_frs_alloc_returns;
series x=Year y=Equities / lineattrs=(color=CX999999 thickness=2);
series x=Year y=Fixed_Income / lineattrs=(color=CXE69F00 thickness=2);
series x=Year y=Real_Estate / lineattrs=(color=CX56B4E9 thickness=2);
series x=Year y=Private_Equity / lineattrs=(color=CX009E73 thickness=2);
series x=Year y=Hedge_Funds / lineattrs=(color=CXF0E442 thickness=2);
series x=Year y=Cash / lineattrs=(color=CX0072B2 thickness=2);
series x=Year y=Total / lineattrs=(color=CX000000 thickness=3);
yaxis grid label=" " ;
xaxis values=(2000 to 2019 by 1);
title "Investment returns by asset class of Florida RS pension fund" ;
ods graphics / noborder;
keylegend / position=top;
ods graphics on / width=8in;
run;
I wanted to see how the performance of FRS compared to those of other large funds. Besides return on investment, a fund is graded on the funded ratio: its assets divided by its liabilities. Ideally, it should be at least 1. The code below tabulates the return on investment and the funded ratio of the 25 largest funds by assets in the United States in 2016.
/*Displaying the top 25 funds by assets
You have to do it in two steps. First you CREATE the table you want,
ordered by assets...*/
proc sql;
create table top_funds_by_assets as
select PlanName, StateName, fy as Year, ActAssets_GASB format=dollar14.0 as PlanAssets,
InvestmentReturn_1yr format=percent8.2 as Return_1yr,
ActFundedRatio_GASB format=percent8.2 as FundedRatio
from work.pension_plans
where fy = 2016
order by PlanAssets desc;
title j=l "2016 returns and funded ratios of the 25 largest pension funds by assets" ;
run;
/*...and then you tell SAS to put the first 25 rows
from that table and put them in a new table*/
proc sql inobs=25;
create table top25_funds_by_assets as
select * from top_funds_by_assets;
run;
/*Display your table*/
proc sql;
select * from top25_funds_by_assets;
run;
We see that only the Wisconsin pension fund met the target.
We can also have SAS create a column where each fund's position in a ranking is shown. For example, if we wanted to show the ranking of funds by funded ratio in the table above:
proc rank data=work.top25_funds_by_assets ties=low out=work.top25_funds_by_assets_ranked_1 descending;
var FundedRatio; /*variable by which you want to rank*/
ranks FundedRatioRank; /*name of column with the rankings*/
run;
/*Only showing the first 10 rows to save space*/
proc print data=work.top25_funds_by_assets_ranked_1 (obs=10);
title " ";
run;
I have only shown the first 10 rows. We can see FRS ranked number 8 among the largest funds in the U.S. in 2016.
Doing it for a single year, by itself, is not that useful. We could just have ordered the 2016 Top-25 funds table by funded ratio and seen where FRS ranked that year. However, I wanted to see where FRS ranked each year among the largest 25 funds, on both funded ratio and return on investment. We certainly wouldn't want to scan 15 or so tables. SAS lets us loop over values using DO
, and create functions using macro
, so we can avail ourselves of both of those features.
%macro sqlloop(start,end);
/*Creating the top-25-funds-by-assets tables, from start=2001 to end=2016
The procedure is exactly the same as before, except now it's embedded
in a loop.*/
%DO year=&start. %TO &end.;
PROC SQL;
CREATE TABLE temp_table as
select PlanName, StateName, fy as Year, ActAssets_GASB format=dollar14.0 as PlanAssets,
InvestmentReturn_1yr format=percent8.2 as Return_1yr,
ActFundedRatio_GASB format=percent8.2 as FundedRatio
from work.pension_plans
where fy = &year.
order by PlanAssets desc;
RUN;
PROC SQL inobs=25;
CREATE TABLE FRS_Perf_&year. as
select * from temp_table
RUN;
%END;
/*Creating a table for each year where the top-25 funds are ranked by returns.
Also almost identical to what we had before, but now embedded in a loop*/
%DO year=&start. %TO &end.;
proc rank data=work.FRS_Perf_&year. ties=low out=FRS_Perf_ranked_returns_&year. descending;
var Return_1yr;
ranks ReturnsRank;
run;
%END;
/*Creating a table for each year where the top-25 funds are ranked by funded ratio.*/
%DO year=&start. %TO &end.;
proc rank data=work.FRS_Perf_&year. ties=low out=work.FRS_Perf_ranked_fundratio_&year. descending;
var FundedRatio;
ranks FundedRatioRank;
run;
%END;
/*Now we will create a table that shows FRS' yearly ranking by returns
First we create the table and put in the first row, by querying the
2001 table with the ranked returns*/
proc sql;
create table Florida_RS_Perf_Rank_Returns as
select * from work.FRS_Perf_ranked_returns_2001
where PlanName='Florida RS' and Year=2001;
quit;
/*Then we loop over the 2002-2016 tables with the ranked returns,
querying them and inserting the relevant rows into the yearly
rankings table*/
proc sql;
%DO year=2002 %TO &end.;
insert into Florida_RS_Perf_Rank_Returns
/*values ('Florida RS', 'Florida', 2001, '$95,517,948', '( 7.60%)', 12, 19);*/
select * from work.FRS_Perf_ranked_returns_&year.
where PlanName='Florida RS' and Year=&year.;
%END;
quit;
/*Now we repeat what we just did to create a table that shows FRS'
yearly ranking by funded ratio*/
proc sql;
create table Florida_RS_Perf_Rank_FundRatio as
select * from work.FRS_Perf_ranked_fundratio_2001
where PlanName='Florida RS' and Year=2001;
quit;
proc sql;
%DO year=2002 %TO &end.;
insert into Florida_RS_Perf_Rank_FundRatio
/*values ('Florida RS', 'Florida', 2001, '$95,517,948', '( 7.60%)', 12, 19);*/
select * from work.FRS_Perf_ranked_fundratio_&year.
where PlanName='Florida RS' and Year=&year.;
%END;
quit;
%mend;
/*Call the macro*/
%sqlloop(start=2001, end=2016)
/*Merge the two tables*/
data Florida_RS_Perf_Rank_merged;
merge work.Florida_RS_Perf_Rank_Returns work.Florida_RS_Perf_Rank_FundRatio;
by Year;
run;
/*Display the table
The rank proc *labels* the columns, which makes it very confusing
when you want to plot the table later, so I created a second merged
table and changed the labels. :| */
proc sql;
create table Florida_RS_Perf_Rank_merged2 as
select PlanName, Year, PlanAssets, Return_1yr, FundedRatio,
ReturnsRank label="ReturnsRank", FundedRatioRank label="FundedRatioRank"
from work.Florida_RS_Perf_Rank_merged;
select * from Florida_RS_Perf_Rank_merged2;
title j=l "FRS ranking in returns and funded ratio among 25 largest funds in the U.S., 2001-2016";
run;
We can see that FRS consistently ranks in the top third of funds in funded ratio and slightly below the middle in returns.
Plotting the rankings:
proc sgplot data=work.Florida_RS_Perf_Rank_merged2;
series x=Year y=ReturnsRank / lineattrs=(color=CXE69F00 thickness=2);
series x=Year y=FundedRatioRank / lineattrs=(color=CX56B4E9 thickness=2);
xaxis values=(2001 to 2016 by 1);
yaxis reverse grid label=" " values =(1 to 25 by 4) labe="Ranking";
title "FRS ranking in returns and funded ratio among 25 largest funds in the U.S.";
keylegend / position=top;
ods graphics on / width=8in;
run;