TIME SERIES ANALYSIS IN SQL AND PL/SQL

articles: 

TIME SERIES ANALYSIS IN SQL AND PL/SQL
Written JP Vijaykumar
Date Mar 8th 2020

****************************************************************************************************************************
This article/script is provided for educational purpose ONLY.
The readers are advised to make necessary changes to the script as may be required for their use in their respective environments.
Pls test these scripts thoroughly in a lower environment, ONLY if you are satisfied with the tests' results,
then ONLY use in any higher environments with CAUTION. Reader's discretion advised.
It is advisable to monitor the load on the db/server, while you are testing these scripts in your environment.

This script was generated/tested in Oracle 12C environment.
This script may not work in other versions of Oracle, pls modify the script as may be needed using your own discretion/ingenuity.

HUMAN BRAIN IS A COMPLEX ALGORITHM/SYSTEM. HARVEST/HARNESS IT FOR CONSTRUCTIVE/POSITIVE/PRODUCTIVE PURPOSES.
****************************************************************************************************************************

Programming on Time Series Analysis(Datamining Algorith) using pl/sql & sql is quite complicated and challenging.
I love data mining algorithms for their complexity and number crunching toughest logic.

There are slight variations in the way, Time Series Analysis is performed, from presentation to presentation.
I followed the below mentioned vedio presentation, and few other vedios referred in the references section,
on Time Series Analysis for programming in sql and pl/sql, in this article.
https://www.youtube.com/watch?v=HIWXdHlDSFs --TIME SERIES ANALYSIS

Questions to be answered:
01) Using the ratio to moving average method calculate seasonally adjusted indicies for each quarter.
02) Obtain a regression trend line representing the above data.
03) Obtain a seasonally adjusted trend estimate for the 4th quarter of 2011.

Create a table and loaded the data for time series analysis scripting task.

connect Veeksha/Saketh --connected to my Oracle 12 database.

drop table temp_jp;
create table temp_jp(year number,q1 number,q2 number,q3 number,q4 number);
insert into temp_jp values(2008,20,30,39,60);
insert into temp_jp values(2009,40,51,62,81);
insert into temp_jp values(2010,50,64,74,95);
commit;

select * from temp_jp;

YEAR Q1 Q2 Q3 Q4
---------- ---------- ---------- ---------- ----------
2008 20 30 39 60
2009 40 51 62 81
2010 50 64 74 95

How to rotate the data in my table temp_jp for processing the given data using Time Series Analysis:

set serverout on size 1000000 timing on
declare
begin
for c1 in (select rownum rnum,value from (( select q1 v1, q2 v2,q3 v3,q4 v4 from temp_jp) unpivot(value for value_type in (v1,v2,v3,v4)))) loop
dbms_output.put_line(c1.rnum||' '||c1.value);
end loop;
end;
/
1 20
2 30
3 39
4 60
5 40
6 51
7 62
8 81
9 50
10 64
11 74
12 95

PL/SQL procedure successfully completed.

From the above table, I used this sql to generate the required result set data using Time Series Analysis method:

--UNPIVOT ROWS FROM TABLE AND PRESENT IN COLUMNS FOR PERFORMING TIME SERIES ANALYSIS AND DISPLAY RESULTS:
with t as (select rownum rnum,value from (( select q1 v1, q2 v2,q3 v3,q4 v4 from temp_jp) unpivot(value for value_type in (v1,v2,v3,v4)))),
--CALCULATING FOUR QUARTERS MOVING AVERAGE MA(4)
frqma01 as (select round(avg(value),2) frqma_val from t where rnum>=1 and rnum<=4 ), --QUARTERS 01-04
frqma02 as (select round(avg(value),2) frqma_val from t where rnum>=2 and rnum<=5 ), --QUARTERS 02-05
frqma03 as (select round(avg(value),2) frqma_val from t where rnum>=3 and rnum<=6 ), --QUARTERS 03-06
frqma04 as (select round(avg(value),2) frqma_val from t where rnum>=4 and rnum<=7 ), --QUARTERS 04-07
frqma05 as (select round(avg(value),2) frqma_val from t where rnum>=5 and rnum<=8 ), --QUARTERS 05-08
frqma06 as (select round(avg(value),2) frqma_val from t where rnum>=6 and rnum<=9 ), --QUARTERS 06-09
frqma07 as (select round(avg(value),2) frqma_val from t where rnum>=7 and rnum<=10 ), --QUARTERS 07-10
frqma08 as (select round(avg(value),2) frqma_val from t where rnum>=8 and rnum<=11 ), --QUARTERS 08-11
frqma09 as (select round(avg(value),2) frqma_val from t where rnum>=9 and rnum<=12 ), --QUARTERS 09-12
--DISPLAYING FOUR QUARTER MOVING AVERAGE MA(4)
frqma_rpt as (select cast('Four Quarter Moving Average:' as char(60))||to_char(frqma_val) description from(
select frqma_val from frqma01
union all
select frqma_val from frqma02
union all
select frqma_val from frqma03
union all
select frqma_val from frqma04
union all
select frqma_val from frqma05
union all
select frqma_val from frqma06
union all
select frqma_val from frqma07
union all
select frqma_val from frqma08
union all
select frqma_val from frqma09
)),
--CALCULATING CENTERED MOVING AVERAGE CMA(4)
ctdma1 as (select round(avg(frqma_val),2) ctdma_val from (select frqma_val from frqma01 union all select frqma_val from frqma02)),
ctdma2 as (select round(avg(frqma_val),2) ctdma_val from (select frqma_val from frqma02 union all select frqma_val from frqma03)),
ctdma3 as (select round(avg(frqma_val),2) ctdma_val from (select frqma_val from frqma03 union all select frqma_val from frqma04)),
ctdma4 as (select round(avg(frqma_val),2) ctdma_val from (select frqma_val from frqma04 union all select frqma_val from frqma05)),
ctdma5 as (select round(avg(frqma_val),2) ctdma_val from (select frqma_val from frqma05 union all select frqma_val from frqma06)),
ctdma6 as (select round(avg(frqma_val),2) ctdma_val from (select frqma_val from frqma06 union all select frqma_val from frqma07)),
ctdma7 as (select round(avg(frqma_val),2) ctdma_val from (select frqma_val from frqma07 union all select frqma_val from frqma08)),
ctdma8 as (select round(avg(frqma_val),2) ctdma_val from (select frqma_val from frqma08 union all select frqma_val from frqma09)),
--DISPLAYING CENTERED MVOING AVERAGE CMA(4)
ctdma_rpt as (select cast('Centered Average:' as char(60))||to_char(ctdma_val) from (
select ctdma_val from ctdma1
union all
select ctdma_val from ctdma2
union all
select ctdma_val from ctdma3
union all
select ctdma_val from ctdma4
union all
select ctdma_val from ctdma5
union all
select ctdma_val from ctdma6
union all
select ctdma_val from ctdma7
union all
select ctdma_val from ctdma8
)),
--CALCULATING PCT_AVG
pctavg_rpt as (select cast('PCT of Average:' as char(60))||to_char(pct_avg) from (
select round(value*100/ctdma_val,2) pct_avg from t,ctdma1 where rnum=3
union all
select round(value*100/ctdma_val,2) pct_avg from t,ctdma2 where rnum=4
union all
select round(value*100/ctdma_val,2) pct_avg from t,ctdma3 where rnum=5
union all
select round(value*100/ctdma_val,2) pct_avg from t,ctdma4 where rnum=6
union all
select round(value*100/ctdma_val,2) pct_avg from t,ctdma5 where rnum=7
union all
select round(value*100/ctdma_val,2) pct_avg from t,ctdma6 where rnum=8
union all
select round(value*100/ctdma_val,2) pct_avg from t,ctdma7 where rnum=9
union all
select round(value*100/ctdma_val,2) pct_avg from t,ctdma8 where rnum=10
)),
q3_1 as (select round(value*100/ctdma_val,2) pct_avg from t,ctdma1 where rnum=3 ),
q4_1 as (select round(value*100/ctdma_val,2) pct_avg from t,ctdma2 where rnum=4 ),
q1_2 as (select round(value*100/ctdma_val,2) pct_avg from t,ctdma3 where rnum=5 ),
q2_2 as (select round(value*100/ctdma_val,2) pct_avg from t,ctdma4 where rnum=6 ),
q3_2 as (select round(value*100/ctdma_val,2) pct_avg from t,ctdma5 where rnum=7 ),
q4_2 as (select round(value*100/ctdma_val,2) pct_avg from t,ctdma6 where rnum=8 ),
q1_3 as (select round(value*100/ctdma_val,2) pct_avg from t,ctdma7 where rnum=9 ),
q2_3 as (select round(value*100/ctdma_val,2) pct_avg from t,ctdma8 where rnum=10),
--CALCULATING MEAN
mean_rpt as (select cast('Mean:' as char(60))||to_char(mean) from (
select round(avg(pct_avg),2) mean from (select * from q1_2 union all select * from q1_3)
union all
select round(avg(pct_avg),2) mean from (select * from q2_2 union all select * from q2_3)
union all
select round(avg(pct_avg),2) mean from (select * from q3_1 union all select * from q3_2)
union all
select round(avg(pct_avg),2) mean from (select * from q4_1 union all select * from q4_2)
)),
m1 as (select round(avg(pct_avg),2) mean from (select * from q1_2 union all select * from q1_3)),
m2 as (select round(avg(pct_avg),2) mean from (select * from q2_2 union all select * from q2_3)),
m3 as (select round(avg(pct_avg),2) mean from (select * from q3_1 union all select * from q3_2)),
m4 as (select round(avg(pct_avg),2) mean from (select * from q4_1 union all select * from q4_2)),
adj_factor as (select round(400/sum(mean),4) adj_factor from (select mean from m1 union all select mean from m2 union all select mean from m3 union all select mean from m4 )),
s1 as (select round(adj_factor*mean,2) seasonal_idx from adj_factor,m1),
s2 as (select round(adj_factor*mean,2) seasonal_idx from adj_factor,m2),
s3 as (select round(adj_factor*mean,2) seasonal_idx from adj_factor,m3),
s4 as (select round(adj_factor*mean,2) seasonal_idx from adj_factor,m4),
--VERIFY THE SUM SEASONAL INDEX IS ADDING TO 400
sum_ssidx as (select cast('Sum Seasonal Index:' as char(60))||to_char(sum(seasonal_idx)) from (select seasonal_idx from s1 union all select seasonal_idx from s2 union all select seasonal_idx from s3 union all select seasonal_idx from s4)),
tc as (select count(*) num_recs,sum(rnum) sum_x,sum(rnum)/count(*) mean_x,sum(value) sum_y,sum(value)/count(*) mean_y,sum(rnum*value) sum_xy,sum(power(rnum,2)) sum_x_sqr from t),
b as (select round((num_recs*sum_xy - sum_x*sum_y)/(num_recs*sum_x_sqr - power(sum_x,2)),2) b_val from tc),
a as (select round( mean_y - b_val*mean_x,2) a_val from tc,b)
select cast('X code and Y code values:' as char(60))||rnum||' '||value description from t
union all
select * from frqma_rpt
union all
select * from ctdma_rpt
union all
select * from pctavg_rpt
union all
select * from mean_rpt
union all
select cast('Seasonal Index:' as char(60))||seasonal_idx from s1
union all
select cast('Seasonal Index:' as char(60))||seasonal_idx from s2
union all
select cast('Seasonal Index:' as char(60))||seasonal_idx from s3
union all
select cast('Seasonal Index:' as char(60))||seasonal_idx from s4
union all
select * from sum_ssidx
union all
select cast('X Adjustment Factor:' as char(60))||adj_factor from adj_factor
union all
select cast('b value :' as char(60))||b_val from b
union all
select cast('a value :' as char(60))||a_val from a
union all
--HERE 16 IS THE "x" VALUE(rnum) FOR 2011 FOURTH QUARTER( EXISTING 12 QUARTERS + 4 QUARTER OF 2011)
select cast('Seasonally Adjusted Trend Estimate for 1st Quarter of 2011:' as char(60))||to_char(round((a_val + b_val*13)*seasonal_idx/100,2)) seasonal_index from a,b,s1
union all
select cast('Seasonally Adjusted Trend Estimate for 2nd Quarter of 2011:' as char(60))||to_char(round((a_val + b_val*14)*seasonal_idx/100,2)) seasonal_index from a,b,s2
union all
select cast('Seasonally Adjusted Trend Estimate for 3rd Quarter of 2011:' as char(60))||to_char(round((a_val + b_val*15)*seasonal_idx/100,2)) seasonal_index from a,b,s3
union all
select cast('Seasonally Adjusted Trend Estimate for 4th Quarter of 2011:' as char(60))||to_char(round((a_val + b_val*16)*seasonal_idx/100,2)) seasonal_index from a,b,s4
/

DESCRIPTION
--------------------------------------------------------------------------------
X code and Y code values: 1 20
X code and Y code values: 2 30
X code and Y code values: 3 39
X code and Y code values: 4 60
X code and Y code values: 5 40
X code and Y code values: 6 51
X code and Y code values: 7 62
X code and Y code values: 8 81
X code and Y code values: 9 50
X code and Y code values: 10 64
X code and Y code values: 11 74
X code and Y code values: 12 95
Four Quarter Moving Average: 37.25
Four Quarter Moving Average: 42.25
Four Quarter Moving Average: 47.5
Four Quarter Moving Average: 53.25
Four Quarter Moving Average: 58.5
Four Quarter Moving Average: 61
Four Quarter Moving Average: 64.25
Four Quarter Moving Average: 67.25
Four Quarter Moving Average: 70.75
Centered Average: 39.75
Centered Average: 44.88
Centered Average: 50.38
Centered Average: 55.88
Centered Average: 59.75
Centered Average: 62.63
Centered Average: 65.75
Centered Average: 69
PCT of Average: 98.11
PCT of Average: 133.69
PCT of Average: 79.4
PCT of Average: 91.27
PCT of Average: 103.77
PCT of Average: 129.33
PCT of Average: 76.05
PCT of Average: 92.75
Mean: 77.73
Mean: 92.01
Mean: 100.94
Mean: 131.51
Seasonal Index: 77.31
Seasonal Index: 91.51
Seasonal Index: 100.39
Seasonal Index: 130.8
Sum Seasonal Index: 400.01
X Adjustment Factor: .9946
b value : 5.17
a value : 21.9
Seasonally Adjusted Trend Estimate for 1st Quarter of 2011: 68.89
Seasonally Adjusted Trend Estimate for 2nd Quarter of 2011: 86.28
Seasonally Adjusted Trend Estimate for 3rd Quarter of 2011: 99.84
Seasonally Adjusted Trend Estimate for 4th Quarter of 2011: 136.84

53 rows selected.

I want to process Time Series Analysis in pl/sql scripting:

set serverout on size 1000000 timing on
declare
v_cnt number:=0;
v_prd number:=3; --FOUR PERIODS MOVING AVERAGE
v_sum number:=0;
v_mav number;
v_cav number;
v_max number;
i number:=0;
j number:=0;
type counter is table of number index by pls_integer;
frprdma counter;
ctdma counter;
pctav counter;
q1 number:=0;
q11 number:=0;
q2 number:=0;
q22 number:=0;
q3 number:=0;
q33 number:=0;
q4 number:=0;
q44 number:=0;
si1 number;
si2 number;
si3 number;
si4 number;
sif number;
begin
--QUERY THE ROW COUNT FROM THE DATASET
--execute immediate 'select count(*) from temp_jp' into v_max;
execute immediate 'select count(*) from (select rownum rnum,value from (( select q1 v1, q2 v2,q3 v3,q4 v4 from temp_jp) unpivot(value for value_type in (v1,v2,v3,v4)))) ' into v_max;
--CALCULATE THE 4 QUARTERS MOVING AVERAGE AND CENTERED AVERAGE
--for c1 in (select rownum rnum,value from (select year,sales value from temp_jp)) loop
for c1 in (select rownum rnum,value from (( select q1 v1, q2 v2,q3 v3,q4 v4 from temp_jp) unpivot(value for value_type in (v1,v2,v3,v4)))) loop
v_cnt:=c1.rnum;
v_mav:=0;
if ((v_cnt + v_prd) <= v_max) then
i:=i +1;
--execute immediate 'select round(avg(value),2) from (select rownum rnum,value from (select year,sales value from temp_jp)) where rnum between '||v_cnt||' and ('||v_cnt||' +'||v_prd||')' into frprdma(i);
execute immediate 'select round(avg(value),2) from (select rownum rnum,value from (( select q1 v1, q2 v2,q3 v3,q4 v4 from temp_jp) unpivot(value for value_type in (v1,v2,v3,v4)))) where rnum between '||v_cnt||' and ('||v_cnt||' +'||v_prd||')' into frprdma(i);
dbms_output.put_line('Four Quarter Moving Avg: '||frprdma(i));
v_mav:=frprdma(i);
end if;
if (v_sum <> 0 and v_mav <> 0) then
j:=j+1;
v_cav:=round((v_sum + v_mav)/2,2);
ctdma(j):=v_cav;
--dbms_output.put_line('Centered Avg: '||ctdma(j));
--execute immediate 'select round((value*100)/'||ctdma(j)||',2) from (select rownum rnum,value from (select year,sales value from temp_jp)) where rnum='||j||'+2' into pctav(j);

execute immediate 'select round((value*100)/'||ctdma(j)||',2) from (select rownum rnum,value from (( select q1 v1, q2 v2,q3 v3,q4 v4 from temp_jp) unpivot(value for value_type in (v1,v2,v3,v4)))) where rnum='||j||'+2' into pctav(j);
dbms_output.put_line('Centered Avg: '||ctdma(j)||' '||j||' %Avg '||pctav(j));
if ( j = 1 or mod((j+2),4)=3) then
q3:=q3+pctav(j);
q33:=q33+1;
elsif ( j = 2 or mod((j+2),4)=0) then
q4:=q4+pctav(j);
q44:=q44+1;
elsif (mod((j+2),4)=1)then
q1:=q1+pctav(j);
q11:=q11+1;
elsif (mod((j+2),4)=2) then
q2:=q2+pctav(j);
q22:=q22+1;
end if;
end if;
v_sum:=frprdma(i);
end loop;
--CALCULATING SEASONAL INDEX
si1:=round(q1/q11,2);
dbms_output.put_line('Seasonal Index: '||si1);
si2:=round(q2/q22,2);
dbms_output.put_line('Seasonal Index: '||si2);
si3:=round(q3/q33,2);
dbms_output.put_line('Seasonal Index: '||si3);
si4:=round(q4/q44,2);
dbms_output.put_line('Seasonal Index: '||si4);
--CALCULATING ADJUSTMENT FACTOR
sif:=round(400/(q1/q11 + q2/q22 + q3/q33 + q4/q44),4);
dbms_output.put_line('Adjustment Factor: '||sif);
si1:=round(si1*sif,2);
si2:=round(si2*sif,2);
si3:=round(si3*sif,2);
si4:=round(si4*sif,2);
dbms_output.put_line('Seasonal Index: '||si1);
dbms_output.put_line('Seasonal Index: '||si2);
dbms_output.put_line('Seasonal Index: '||si3);
dbms_output.put_line('Seasonal Index: '||si4);
--FORECASTING SALES THE THREE QUARTERS IN THE 4TH YEAR
for c2 in ( with t as (select rownum rnum,value from (( select q1 v1, q2 v2,q3 v3,q4 v4 from temp_jp) unpivot(value for value_type in (v1,v2,v3,v4)))),
tc as (select count(*) num_recs,sum(rnum) sum_x,sum(rnum)/count(*) mean_x,sum(value) sum_y,sum(value)/count(*) mean_y,sum(rnum*value) sum_xy,sum(power(rnum,2)) sum_x_sqr from t),
b as (select round((num_recs*sum_xy - sum_x*sum_y)/(num_recs*sum_x_sqr - power(sum_x,2)),2) b_val from tc),
a as (select round( mean_y - b_val*mean_x,2) a_val from tc,b),
cnt as (select count(*) val from t),
qtr as (select level val from dual connect by level < 5)
select qtr.val + cnt.val qtr_prdct, round((a_val + b_val*(qtr.val + cnt.val))*
(case
when mod((qtr.val + cnt.val),4)=1 then si1
when mod((qtr.val + cnt.val),4)=2 then si2
when mod((qtr.val + cnt.val),4)=3 then si3
when mod((qtr.val + cnt.val),4)=0 then si4
end)/100,2) val_prdct from a,b,cnt,qtr) loop

dbms_output.put_line('Projected Sales for Quarter '||c2.qtr_prdct||' : '||c2.val_prdct);
end loop;
end;
/

Four Quarter Moving Avg: 37.25
Four Quarter Moving Avg: 42.25
Centered Avg: 39.75 1 %Avg 98.11
Four Quarter Moving Avg: 47.5
Centered Avg: 44.88 2 %Avg 133.69
Four Quarter Moving Avg: 53.25
Centered Avg: 50.38 3 %Avg 79.4
Four Quarter Moving Avg: 58.5
Centered Avg: 55.88 4 %Avg 91.27
Four Quarter Moving Avg: 61
Centered Avg: 59.75 5 %Avg 103.77
Four Quarter Moving Avg: 64.25
Centered Avg: 62.63 6 %Avg 129.33
Four Quarter Moving Avg: 67.25
Centered Avg: 65.75 7 %Avg 76.05
Four Quarter Moving Avg: 70.75
Centered Avg: 69 8 %Avg 92.75
Seasonal Index: 77.73
Seasonal Index: 92.01
Seasonal Index: 100.94
Seasonal Index: 131.51
Adjustment Factor: .9946
Seasonal Index: 77.31
Seasonal Index: 91.51
Seasonal Index: 100.39
Seasonal Index: 130.8
Projected Sales for Quarter 13 : 68.89
Projected Sales for Quarter 14 : 86.28
Projected Sales for Quarter 15 : 99.84
Projected Sales for Quarter 16 : 136.84

PL/SQL procedure successfully completed.

Here and there, the messages' content and presentation of results, need to changing/formatting.
I crosschecked the results and they are matching.

I found the following script on Time Series Analysis using Oracle/sql
https://stackoverflow.com/questions/55134697/forecasting-time-series-data-in-oracle-sql

When I used the above sql on my data, I got somewhat different forecasted values.
But the script is easy and ready to use.

with t as (
select case when rownum in (1,5,9) then 1
when rownum in (2,6,10) then 2
when rownum in (3,7,11) then 3
when rownum in (4,8,12) then 4 end cust_id,
case when rownum <= 4 then 2008
when rownum >=5 and rownum <=8 then 2009
when rownum >=9 then 2010 end year,
value from (( select q1 v1, q2 v2,q3 v3,q4 v4 from temp_jp) unpivot(value for value_type in (v1,v2,v3,v4))))
select
cust_id,
max(year) +1 forecast_year,
-- y = mx+b
round(regr_slope(value, year)
* (max(year) + 1)
+ regr_intercept(value, year),2) forecasted_revenue
from t
group by cust_id order by cust_id;

CUST_ID FORECAST_YEAR FORECASTED_REVENUE
---------- ------------- ------------------
1 2011 66.67
2 2011 82.33
3 2011 93.33
4 2011 113.67

Happy scripting.

References:
https://www.youtube.com/watch?v=HIWXdHlDSFs --TIME SERIES ANALYSIS
https://www.youtube.com/watch?v=gHdYEZA50KE
https://www.youtube.com/watch?v=-XWtSUCOv5Q
https://www.youtube.com/watch?v=UjTVWoT6zu8
https://www.youtube.com/watch?v=U516XeI-gKU&t=284s
https://www.youtube.com/watch?v=ZaxpCw6lCe4
https://www.youtube.com/watch?v=I26f4q40upM
https://www.youtube.com/watch?v=ptDezc0ZojM
https://www.youtube.com/watch?v=w9LXjZfcgOo&t=1317s
https://www.youtube.com/watch?v=GormX3_eN6A
https://www.machinelearningplus.com/time-series/time-series-analysis-python/
https://towardsdatascience.com/an-end-to-end-project-on-time-series-analysis-and-forecasting-with-python-4835e6bf050b
https://towardsdatascience.com/time-series-analysis-in-python-an-introduction-70d5a5b1d52a
https://www.analyticsvidhya.com/blog/2016/02/time-series-forecasting-codes-python/
https://www.dataquest.io/blog/tutorial-time-series-analysis-with-pandas/
https://ourcodingclub.github.io/2019/01/07/pandas-time-series.html
https://stackoverflow.com/questions/55134697/forecasting-time-series-data-in-oracle-sql