Part 1 Section F.4.4. Analytic Tools 分析ツール

Linear Regression Analysis 線形回帰分析

Time Series Analysis 時系列分析

  • Time series analysis is used for descriptive modeling.
  • Time series forecasting, on the other hand, is predictive.
  • A time series may have one or more of four patterns that influence its behavior over time.
    1. Trend
    2. Cyclical
    3. Seasonal
    4. Irregular

A trend projection is performed with simple linear regression analysis.

Simple Linear Regression Formula

\( \displaystyle \bf \hat{y} = a + bx \)

\( \displaystyle \hat{y} \) = the predicted value of the dependent variable, y, on the regression line corresponding to each value of x.
a = the constant coefficient, or the y-intercept, the value of \( \displaystyle \hat{y} \) on the regression line when x is zero.
b = the variable coefficient and the slope of the regression line, which is the amount by which the \( \displaystyle \hat{y} \) value on the regression line changes (either increases or decreases) when the value of x increases by one unit.
x = the independent variable, or the value of x on the x-axis that corresponds to the predicted value of \( \displaystyle \hat{y} \) on the regression line.
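As a minimal sketch of how a and b can be obtained, the block below applies the ordinary least-squares formulas \( b = \sum (x - \bar{x})(y - \bar{y}) / \sum (x - \bar{x})^2 \) and \( a = \bar{y} - b\bar{x} \) to a small data set. The data (period number and units sold) and the function name `fit_line` are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through (x_mean, y_mean).
    a = y_mean - b * x_mean
    return a, b


# Hypothetical data: period number (x) and units sold (y).
periods = [1, 2, 3, 4, 5]
sales = [12, 15, 19, 22, 27]

a, b = fit_line(periods, sales)
forecast_6 = a + b * 6  # trend projection for period 6
print(f"y_hat = {a:.1f} + {b:.1f}x, forecast for period 6 = {forecast_6:.1f}")
```

For this data the slope is 3.7 and the intercept is 7.9, so the trend projection for period 6 is 30.1.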

Correlation Analysis 相関分析

Correlation analysis is used to assess how well a model can predict an outcome.

  • The correlation coefficient, R
    • -1 ≤ R ≤ 1
  • The standard error of the estimate, also called the standard error of the regression
    • y = a + bx + e
  • The coefficient of determination, R^2
    • 0 ≤ R^2 ≤ 1
  • The T-statistic
    • |T| > 2 (rule of thumb)

e = the error term, also called the residual, which for each value of x is the difference between the actual value of y for that value of x and the estimated \( \displaystyle \hat{y} \) value on the regression line for that value of x. The error term will be different for each value of x used in the regression function.

The standard error of the estimate (S) represents the average distance that the observed values fall from the regression line.
It describes how wrong the regression model is on average.

\( \displaystyle \bf e = y - \hat{y} \)
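The residuals and the standard error of the estimate can be sketched as follows. The data set is hypothetical, and the block assumes the usual simple-regression definition \( S = \sqrt{\sum e^2 / (n - 2)} \), where n - 2 is the degrees of freedom left after estimating a and b.

```python
import math

# Hypothetical data: period number (x) and units sold (y).
xs = [1, 2, 3, 4, 5]
ys = [12, 15, 19, 22, 27]

# Least-squares fit of y_hat = a + b*x.
n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
sxx = sum((x - x_mean) ** 2 for x in xs)
b = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
a = y_mean - b * x_mean

# Residual for each observation: e = y - y_hat.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Sum of squared errors and standard error of the estimate.
sse = sum(e ** 2 for e in residuals)
s = math.sqrt(sse / (n - 2))

print("residuals:", [round(e, 2) for e in residuals])
print(f"S = {s:.4f}")
```

With a fitted intercept, the residuals sum to zero, so S (about 0.61 units here) is a more useful summary of the typical miss than the plain average of e.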

The coefficient of determination is the percentage of the total variation in the dependent variable (y) that can be explained by variations in the independent variable (x), as depicted by the regression line.
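A short sketch of the two measures, using hypothetical data: R is computed directly from the co-variation of x and y, and R^2 is computed as the explained share of total variation, \( 1 - SSE / SST \). In simple linear regression the second equals the square of the first.

```python
import math

# Hypothetical data: period number (x) and units sold (y).
xs = [1, 2, 3, 4, 5]
ys = [12, 15, 19, 22, 27]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
sxx = sum((x - x_mean) ** 2 for x in xs)
syy = sum((y - y_mean) ** 2 for y in ys)  # total variation in y (SST)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

# Correlation coefficient R.
r = sxy / math.sqrt(sxx * syy)

# Coefficient of determination from the fitted line: 1 - SSE / SST.
b = sxy / sxx
a = y_mean - b * x_mean
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
r2 = 1 - sse / syy

print(f"R = {r:.4f}, R^2 = {r2:.4f}")
```

Here about 99.2% of the variation in y is explained by x.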

The t-statistic, or t-value, measures the degree to which the independent variable has a valid, long-term relationship with the dependent variable. The t-value for the independent variable used in a simple regression analysis should generally be greater than 2 in absolute value.
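As an illustration, the t-value of the slope can be computed as the coefficient divided by its standard error, \( t = b / SE(b) \) with \( SE(b) = S / \sqrt{\sum (x - \bar{x})^2} \). The data below is hypothetical, and the formula is the standard one for simple regression rather than something stated in these notes.

```python
import math

# Hypothetical data: period number (x) and units sold (y).
xs = [1, 2, 3, 4, 5]
ys = [12, 15, 19, 22, 27]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
sxx = sum((x - x_mean) ** 2 for x in xs)
b = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
a = y_mean - b * x_mean

# Standard error of the estimate (n - 2 degrees of freedom).
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
s = math.sqrt(sse / (n - 2))

# Standard error of the slope and its t-value.
se_b = s / math.sqrt(sxx)
t = b / se_b

print(f"b = {b:.2f}, SE(b) = {se_b:.4f}, t = {t:.2f}")
```

The t-value here is roughly 19.3, well above the rule-of-thumb threshold of 2.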

Multiple Regression Analysis 重回帰分析


Goodness of Fit in Regression Analysis 回帰分析における適合度


Goodness of fit describes how close the actual values used in a statistical model are to the expected values, that is, the predicted values, in the model.

Confidence Interval in Regression Analysis 回帰分析における信頼区間

The confidence interval is used in regression analysis to describe the amount of uncertainty caused by the sampling method used when drawing conclusions about a population based on a sample.

If several samples are drawn from a population using the same sampling method and a confidence interval at a confidence level of 95% is used, about 95% of the interval estimates computed from those samples can be expected to include the true parameter of the population.
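The repeated-sampling interpretation can be checked with a small simulation. This sketch makes simplifying assumptions not taken from the notes: the parameter is a population mean rather than a regression coefficient, the population is normal with a known standard deviation, and the function name `coverage` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean


def coverage(trials=2000, n=30, mu=50.0, sigma=10.0, level=0.95, seed=0):
    """Fraction of sample-based intervals that contain the true mean mu."""
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and build its interval estimate.
        sample_mean = mean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / trials


rate = coverage()
print(f"share of intervals containing the true mean: {rate:.3f}")
```

The printed share should be close to 0.95. Any single interval either contains the true value or does not; the 95% describes the procedure over many samples.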




19-3. The meaning of the 95% confidence interval | 統計学の時間 | 統計WEB

Sensitivity Analysis 感度分析

Sensitivity analysis can be used to determine how much the prediction of a model will change if one input to the model is changed.

It can be used to determine which input parameter is most important for achieving accurate predictions.

Sensitivity analysis is also known as “what-if” analysis.
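A one-at-a-time sensitivity analysis can be sketched like this. The profit model, its input values, and the 10% change are all hypothetical; the point is only to show each input being changed while the others stay at their base values.

```python
def profit(price, units, variable_cost, fixed_cost):
    """Hypothetical model: contribution margin times volume, less fixed cost."""
    return units * (price - variable_cost) - fixed_cost


def sensitivity(model, inputs, change=0.10):
    """Change in model output when each input alone is raised by `change`."""
    base_out = model(**inputs)
    effects = {}
    for name, value in inputs.items():
        scenario = dict(inputs)  # copy so the other inputs stay at base values
        scenario[name] = value * (1 + change)
        effects[name] = model(**scenario) - base_out
    return effects


base = {"price": 20.0, "units": 1000.0, "variable_cost": 12.0, "fixed_cost": 5000.0}
effects = sensitivity(profit, base)
most_sensitive = max(effects, key=lambda name: abs(effects[name]))

for name, delta in effects.items():
    print(f"{name:>14} +10% -> profit changes by {delta:+.0f}")
print("most sensitive input:", most_sensitive)
```

With these numbers, a 10% rise in price moves profit the most (+2,000 against a base of 3,000), so price is the input whose estimate matters most.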

Monte Carlo Simulation Analysis モンテカルロシミュレーション

Whereas sensitivity analysis involves changing one input variable at a time, a Monte Carlo simulation analysis can be used to find solutions to mathematical problems that involve changes to multiple variables at the same time.
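A minimal Monte Carlo sketch follows, in which several inputs vary at once. The profit model and the probability distributions assigned to each input are assumptions made for illustration, as is the function name `simulate_profit`.

```python
import random


def simulate_profit(trials=10_000, seed=42):
    """Simulate profit with price, volume, and variable cost all uncertain."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        # Every input is drawn at random in every trial (assumed distributions).
        price = rng.uniform(18.0, 22.0)
        units = rng.gauss(1000.0, 100.0)
        variable_cost = rng.triangular(10.0, 14.0, 12.0)  # low, high, mode
        fixed_cost = 5000.0
        results.append(units * (price - variable_cost) - fixed_cost)
    return results


profits = simulate_profit()
mean_profit = sum(profits) / len(profits)
loss_probability = sum(p < 0 for p in profits) / len(profits)

print(f"mean simulated profit: {mean_profit:.0f}")
print(f"estimated probability of a loss: {loss_probability:.3f}")
```

The result is a distribution of outcomes rather than a single number, which makes it possible to estimate quantities such as the probability of a loss.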