*  spaj7792.sas ;
*  4/20/93 ;
* ;
*  Purpose: ;
*  ------- ;
*  Use a robust method to adjust the Standard & Poor's price ;
*  change series for deterministic seasonal effects in location ;
*  and scale.  The ideas behind the adjustment procedure are ;
*  similar to those outlined in Gallant, Rossi, and Tauchen ;
*  (Review of Financial Studies, 1992), except that a robust ;
*  procedure is used to estimate the location and scale equations. ;
* ;
*  Method: ;
*  ------ ;
*  The procedure works as follows.  Define the location ;
*  equation ;
* ;
*  (location)   p(t) = x(t)*beta + u(t) ;
* ;
*  where p(t) is the variable to be adjusted and x(t) denotes the ;
*  calendar variables.  Initially the location equation is fitted ;
*  to the data using the Huber robust estimation procedure, which ;
*  is described further below.  Let uhat(t) = p(t) - x(t)*betahat ;
*  denote the residual from this robust estimation.  In the next ;
*  step, the scale equation ;
* ;
*  (scale)      |uhat(t)| = x(t)*gamma + w(t) ;
* ;
*  is fitted to the absolute residual, also using the Huber robust ;
*  procedure.  Observe that the variable ;
* ;
*      z(t) = uhat(t)/sighat(t) ;
* ;
*  where ;
* ;
*      sighat(t) = x(t)*gammahat ;
* ;
*  (the fitted value of the scale equation, as used in the code ;
*  below) is p(t) purged of the effects of x(t) in location and ;
*  scale.  The series z(t) is the adjusted series, but its units of ;
*  measurement are awkward to interpret.  The final adjusted ;
*  series is thus ;
* ;
*      padj(t) = psi0 + psi1*z(t) ;
* ;
*  with psi0 and psi1 chosen so that padj(t) and p(t) have the ;
*  same mean and standard deviation in the sample, that is, ;
*  psi1 = std(p)/std(z) and psi0 = mean(p) - psi1*mean(z). ;
*  This last affine linear transformation has no substantive ;
*  effect in subsequent analysis.  Its only purpose is to put the ;
*  adjusted series in the same units of measurement as the ;
*  original series, which is convenient for interpretation. ;
* ;
*  Another way to view the adjustment process is that it is a ;
*  time-dependent affine linear transformation of the p(t) ;
*  process, ;
* ;
*      padj(t) = g0(t) + g1(t)*p(t) ;
* ;
*  where ;
* ;
*      g0(t) = psi0 - [psi1/sighat(t)]*[x(t)*betahat] ;
* ;
*      g1(t) = [psi1/sighat(t)] ;
* ;
*  and as before psi0 and psi1 are chosen so that the adjusted ;
*  and unadjusted series have the same first two sample moments. ;
*  The intercept, g0(t), and slope coefficient, g1(t), completely ;
*  characterize the adjustment process mapping p(t) to padj(t). ;
* ;
* ;
*  The Huber procedure: ;
*  ------------------- ;
* ;
*  To fit the location and scale equations, we employ Huber's ;
*  proposal 2 as described in Hampel, Ronchetti, Rousseeuw, and ;
*  Stahel, Robust Statistics: The Approach Based on Influence ;
*  Functions (HRRS), pp. 105-106, p. 237.  Both the location and ;
*  scale equations are linear models of the form ;
* ;
*      y(t) = x(t)*beta + u(t) . ;
* ;
*  The estimator of beta is ;
* ;
*      betahat = argmin { sum rho[(y(t) - x(t)*beta)/rscale] } ;
*                 beta     t ;
* ;
*  where rho is the Huber criterion function ;
* ;
*                 z*z              for |z| <= c ;
*      rho(z) = ;
*                 2*c*|z| - c*c    for |z| >  c . ;
* ;
*  Following HRRS (pp. 105 and 237), rscale = 1.483*[median ;
*  absolute deviation of OLS residuals] and c = 2.0, which ;
*  provides a breakdown point of eps = 0.19. ;
* ;
* ;
*  Variables: ;
*  ---------- ;
* ;
*  p: 100*[log(SP(t)) - log(SP(t-1))], SP(t) = Standard and ;
*     Poor's Composite Index, daily 1977-1992. ;
* ;
* ;
*  Explanatory Variables (x): ;
* ;
*   1. Day of week dummies (Tues, Weds, Thurs, Fri) ;
* ;
*      day2 = (day eq 2); ;
*      day3 = (day eq 3); ;
*      day4 = (day eq 4); ;
*      day5 = (day eq 5); ;
* ;
*   2. Sqrt(gap) where gap = number of calendar days since the ;
*      preceding trading day ;
* ;
*      rgap = sqrt(gap); ;
* ;
*   3. Dummies for months March, April, ..., November.  February ;
*      is the omitted month (mon02 is created in the data step but ;
*      does not enter the regressions) ;
* ;
*      mon03 = (mm eq 3); ;
*      mon04 = (mm eq 4); ;
*      mon05 = (mm eq 5); ;
*      mon06 = (mm eq 6); ;
*      mon07 = (mm eq 7); ;
*      mon08 = (mm eq 8); ;
*      mon09 = (mm eq 9); ;
*      mon10 = (mm eq 10); ;
*      mon11 = (mm eq 11); ;
* ;
*   4. Dummies for weeks within January and December ;
* ;
*      mon01_1 = (mm eq 1 and (1 le dd and dd le 7) ); ;
*      mon01_2 = (mm eq 1 and (8 le dd and dd le 14) ); ;
*      mon01_3 = (mm eq 1 and (15 le dd and dd le 21) ); ;
*      mon01_4 = (mm eq 1 and (22 le dd and dd le 31) ); ;
*      mon12_1 = (mm eq 12 and (1 le dd and dd le 7) ); ;
*      mon12_2 = (mm eq 12 and (8 le dd and dd le 14) ); ;
*      mon12_3 = (mm eq 12 and (15 le dd and dd le 21) ); ;
*      mon12_4 = (mm eq 12 and (22 le dd and dd le 31) ); ;
* ;

*-------------------------------------------------------------------------;
*  Begin SAS Code for Adjustments;
*-------------------------------------------------------------------------;

*-------------------------------------------------------------------------;
*  Macro for robust regression;
*-------------------------------------------------------------------------;

%MACRO ROBCODE;

array x{*} day2 day3 day4 day5 rgap
           mon01_1 mon01_2 mon01_3 mon01_4
           mon03 mon04 mon05 mon06 mon07 mon08 mon09 mon10 mon11
           mon12_1 mon12_2 mon12_3 mon12_4;

parms b0=0.50
      b1=0  b2=0  b3=0  b4=0  b5=0  b6=0  b7=0  b8=0
      b9=0  b10=0 b11=0 b12=0 b13=0 b14=0 b15=0 b16=0
      b17=0 b18=0 b19=0 b20=0 b21=0 b22=0;

pred = b0
     + b1*x{1}   + b2*x{2}   + b3*x{3}   + b4*x{4}
     + b5*x{5}   + b6*x{6}   + b7*x{7}   + b8*x{8}
     + b9*x{9}   + b10*x{10} + b11*x{11} + b12*x{12}
     + b13*x{13} + b14*x{14} + b15*x{15} + b16*x{16}
     + b17*x{17} + b18*x{18} + b19*x{19} + b20*x{20}
     + b21*x{21} + b22*x{22};

e = y - pred;
z = e/rscale;

/* Huber criterion rho(z) and its derivative delrho = d(rho)/dz */
if ( z > c ) then do;
   rho = 2*c*z - c*c;
   delrho = 2*c;
end;
if ( z < -c ) then do;
   rho = 2*c*(-z) - c*c;
   delrho = -2*c;
end;
if ( z >= -c and z <= c ) then do;
   rho = z*z;
   delrho = 2*z;
end;

/* With naught = 0, least squares on rrho = sqrt(rho) minimizes the sum of rho */
rrho = sqrt(rho);

model naught = rrho;

der.b0 = -0.50*(1/rrho)*delrho;
der.b1 = -0.50*(1/rrho)*delrho*x{1};
der.b2 = -0.50*(1/rrho)*delrho*x{2};
der.b3 = -0.50*(1/rrho)*delrho*x{3};
der.b4 = -0.50*(1/rrho)*delrho*x{4};
der.b5 = -0.50*(1/rrho)*delrho*x{5};
der.b6 = -0.50*(1/rrho)*delrho*x{6};
der.b7 = -0.50*(1/rrho)*delrho*x{7};
der.b8 = -0.50*(1/rrho)*delrho*x{8};
der.b9 = -0.50*(1/rrho)*delrho*x{9};
der.b10 = -0.50*(1/rrho)*delrho*x{10};
der.b11 = -0.50*(1/rrho)*delrho*x{11};
der.b12 = -0.50*(1/rrho)*delrho*x{12};
der.b13 = -0.50*(1/rrho)*delrho*x{13};
der.b14 = -0.50*(1/rrho)*delrho*x{14};
der.b15 = -0.50*(1/rrho)*delrho*x{15};
der.b16 = -0.50*(1/rrho)*delrho*x{16};
der.b17 = -0.50*(1/rrho)*delrho*x{17};
der.b18 = -0.50*(1/rrho)*delrho*x{18};
der.b19 = -0.50*(1/rrho)*delrho*x{19};
der.b20 = -0.50*(1/rrho)*delrho*x{20};
der.b21 = -0.50*(1/rrho)*delrho*x{21};
der.b22 = -0.50*(1/rrho)*delrho*x{22};

%MEND;

*--------------------------------------------------------------------------;

options pagesize=55 linesize=80;

data zero;

   infile '/home/get/sv/adjust/sp7792.dat';
   input year mm dd price;

   yy = year-1900;
   yymmdd = yy*10000 + mm*100 + dd;
   daynum = mdy(mm,dd,yy);

   p = 100*( log(price) - log(lag(price)) );   * percentage log price change;

   day = weekday(daynum)-1;                    * 1 = Monday, ..., 5 = Friday;

   day1 = (day eq 1);
   day2 = (day eq 2);
   day3 = (day eq 3);
   day4 = (day eq 4);
   day5 = (day eq 5);

   * hol flags a break in the usual sequence of trading days;
   hol = ( yymmdd le 520526 )*( mod(lag(day)+1,6) ne mod(day,6) )
       + ( yymmdd ge 520527 )*( mod(lag(day)+1,5) ne mod(day,5) );

   wkend = ( lag(day) > day);                  * first trading day of a week;

   fromjan = daynum - mdy(01,01,yy) + 1;       * day of the year;

   gap = daynum - lag(daynum);                 * calendar days since last trade;
   rgap = sqrt(gap);

   gap1 = (gap eq 1);
   gap2 = (gap eq 2);
   gap3 = (gap eq 3);
   gap4 = (gap eq 4);
   gap5 = (gap eq 5);

   mon01_1 = (mm eq 1 and (1 le dd and dd le 7) );
   mon01_2 = (mm eq 1 and (8 le dd and dd le 14) );
   mon01_3 = (mm eq 1 and (15 le dd and dd le 21) );
   mon01_4 = (mm eq 1 and (22 le dd and dd le 31) );
   mon02 = (mm eq 2);
   mon03 = (mm eq 3);
   mon04 = (mm eq 4);
   mon05 = (mm eq 5);
   mon06 = (mm eq 6);
   mon07 = (mm eq 7);
   mon08 = (mm eq 8);
   mon09 = (mm eq 9);
   mon10 = (mm eq 10);
   mon11 = (mm eq 11);
   mon12_1 = (mm eq 12 and (1 le dd and dd le 7) );
   mon12_2 = (mm eq 12 and (8 le dd and dd le 14) );
   mon12_3 = (mm eq 12 and (15 le dd and dd le 21) );
   mon12_4 = (mm eq 12 and (22 le dd and dd le 31) );

   naught = 0;                                 * dummy response for proc nlin;

   if (yymmdd lt 770000) then delete;          * keep 1977 onward;
   if ( p eq . ) then delete;

proc means data=zero;

*-------------------------------------------------------;
*  First do an OLS;
*-------------------------------------------------------;

proc reg data=zero;
   model p = day2-day5 rgap mon01_1-mon01_4 mon03-mon11 mon12_1-mon12_4;
   output out=res01ols r=prols p=phols;

*-------------------------------------------------------;
*  Put the OLS second moments in statsols;
*-------------------------------------------------------;

proc means data=res01ols;
   var p phols prols;
   output mean = pbar pholsbar prolsbar
          std  = pse  pholsse  prolsse
          out  = statsols;

proc print data=statsols;

*-------------------------------------------------------;
*  Determine the median absolute deviation of the OLS
   residuals;
*-------------------------------------------------------;

data work00;
   set res01ols;
   abpr = abs(prols);
   keep abpr;

proc univariate data=work00;
   var abpr;
   output out=work01 median = madpr;

proc print data=work01;

* Attach madpr to every observation;
data one;
   set zero;
   if (_n_=1) then set work01;

proc datasets ddname=work nolist;
   delete work00 work01;

proc means data=one;

*-------------------------------------------------------;
*  Estimate the linear mean function robustly;
*-------------------------------------------------------;

proc nlin method=gauss data=one;

   y = p;                    * dependent variable;
   rscale = 1.483*madpr;     * robust scale estimate;
   c = 2.00;                 * switch point of Huber criterion;

   %ROBCODE;                 * bring in the macro;

   ph = pred;                * robust predicted value;
   pr = y - pred;            * robust residuals;
   rhol = rho;               * value of rho;
   zl = z;

   id ph pr e zl rhol;
   output out=res01rob;

proc means data=res01rob;

*-------------------------------------------------------;
*  Put a copy of the residual second moments in stats;
*-------------------------------------------------------;

data res01;
   set res01rob;
   keep p ph pr e rhol;

proc means data=res01;
   var p ph pr e;
   output mean = pbar phbar prbar ebar
          std  = pse  phse  prse  ese
          out  = stats;

proc print data=stats;

data auxl;
   set res01;
   keep pr rhol;

*-------------------------------------------------------;
*  Form absolute residuals;
*-------------------------------------------------------;

data two;
   merge one res01(keep = ph pr);
   ap = abs(pr);

proc datasets ddname=work nolist;
   delete one;

*-------------------------------------------------------;
*  Do an OLS on the absolute residuals;
*  This is an OLS fit of the scale equation;
*-------------------------------------------------------;

proc reg data=two ;
   model ap = day2-day5 rgap mon01_1-mon01_4 mon03-mon11 mon12_1-mon12_4;
   output out=res02ols p=aphols r=aprols;

proc univariate data=res02ols;
   var ap aprols aphols;

*-------------------------------------------------------;
*  Determine the median absolute deviation of the OLS
   residuals from the scale equation;
*-------------------------------------------------------;

data work00;
   set res02ols;
   abapr = abs(aprols);
   keep abapr;

proc univariate data=work00;
   var abapr;
   output out=work01 median = madabapr;

proc print data=work01;

* Attach madabapr to every observation;
data three;
   set two;
   if (_n_=1) then set work01;

proc datasets ddname=work nolist;
   delete work00 work01;

proc means data=three;

*-------------------------------------------------------;
*  Do the robust NLS on absolute residuals, i.e., fit
   the scale equation;
*-------------------------------------------------------;

proc nlin method=gauss data=three;

   y = ap;                   * dependent variable;
   rscale = 1.483*madabapr;  * robust scale estimate;
   c = 2.00;                 * switch point of Huber criterion;

   %ROBCODE;                 * bring in the macro;

   aph = pred;               * robust predicted value;
   apr = y - pred;           * robust residuals;
   rhos = rho;               * value of rho;
   zs = z;

   id aph apr e zs rhos;
   output out=res02rob;

proc means data=res02rob;
   var p ap aph apr e zs rhos;

data res02;
   set res02rob;
   keep aph;

data auxs;
   set res02rob;
   keep apr rhos;

proc datasets ddname=work nolist;
   delete res01rob res02rob;

*-------------------------------------------------------;
*  Determine the mean and standard deviation of the
   rescaled and recentered residual;
*-------------------------------------------------------;

data four;
   merge three res02(keep = aph);
   scale = 1/aph;            * 1/sighat(t);
   zp = scale*pr;            * z(t) = uhat(t)/sighat(t);

proc datasets ddname=work nolist;
   delete three;

proc means data=four;
   var zp;
   output mean = zpbar std = zpse out = statszp;

proc print data=statszp;

data statsall;
   merge stats statszp;

proc print data=statsall;

*-------------------------------------------------------;
*  Determine the intercept (g0) and slope (g1) of the
   affine linear transformation for the adjustment
   and use it to adjust the data;
*-------------------------------------------------------;

data five;
   set four;
   merge auxl auxs;
   if _n_=1 then set statsall;

   * In the header notation, psi1 = pse/zpse and
     psi0 = pbar - (pse/zpse)*zpbar;
   g0 = pbar - (pse/zpse)*zpbar - (pse/zpse)*scale*ph;
   g1 = (pse/zpse)*scale;

   padj = g0 + g1*p;

proc datasets ddname=work nolist;
   delete four;

proc univariate data=five;
   var p padj g0 g1;

proc means data=five;
   var p padj g0 g1;

*-------------------------------------------------------;
*  Write out adjusted data;
*-------------------------------------------------------;

data _null_;
   set five;
   file '/home/get/sv/adjust/spaj7792.dat' notitles;
   put yymmdd 1-6 day 7-8 hol 9-10 wkend 11-12 gap 13-15 fromjan 16-19
       +1 padj e17. +1 p e17. +1 price 6.2;

*-------------------------------------------------------;
*  Dump more stuff to a .dmp file;
*-------------------------------------------------------;

data _null_;
   set five;
   file '/home/get/sv/adjust/spaj7792.dmp' notitles;
   put yymmdd 1-6 day 7-8 hol 9-10 wkend 11-12 gap 13-15 fromjan 16-19
       +1 padj e17. +1 p e17. +1 price 6.2
       +1 g0 10.5 +1 g1 10.5
       +1 pr 10.5 +1 rhol 10.5
       +1 apr 10.5 +1 rhos 10.5;
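
*-------------------------------------------------------;
*  Optional consistency check.  This is a sketch added
   for illustration and is not part of the original
   procedure.  It assumes that data set five still holds
   zp, pbar, pse, zpbar, and zpse as created above.  The
   adjusted series is rebuilt directly as psi0 + psi1*zp
   and compared with padj = g0 + g1*p.  The two forms
   are algebraically the same, so the largest absolute
   difference written to the log should be at the level
   of rounding error.  The names psi0, psi1, padj2, and
   maxdiff are introduced here only for this check;
*-------------------------------------------------------;

data _null_;
   set five end=last;
   retain maxdiff 0;
   psi1 = pse/zpse;                   * slope of the final rescaling;
   psi0 = pbar - psi1*zpbar;          * intercept of the final rescaling;
   padj2 = psi0 + psi1*zp;            * direct form of the adjusted series;
   maxdiff = max(maxdiff, abs(padj - padj2));
   if last then put 'Max abs(padj - padj2) = ' maxdiff e12.;
run;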