Part 1 Section F.4.4. Analytic Tools

Linear Regression Analysis

Time Series Analysis

  • Time series analysis is used for descriptive modeling.
  • Time series forecasting, on the other hand, is predictive.
  • A time series may have one or more of four patterns that influence its behavior over time.
    1. Trend
    2. Cyclical
    3. Seasonal
    4. Irregular

A trend projection is performed with simple linear regression analysis.

Simple Linear Regression Formula

\( \displaystyle \bf \hat{y} = a + bx \)

\( \displaystyle \hat{y} \) = the predicted value of the dependent variable on the regression line corresponding to each value of x.
a = the constant coefficient, or the y-intercept, the value of \( \displaystyle \hat{y} \) on the regression line when x is zero.
b = the variable coefficient and the slope of the regression line, which is the amount by which \( \displaystyle \hat{y} \) value of the regression line changes (either increases or decreases) when the value of x increases by one unit.
x = the independent variable, or the value of x on the x-axis that corresponds to the predicted value of \( \displaystyle \hat{y} \) on the regression line.
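As a rough illustration of a trend projection, the sketch below fits a and b by ordinary least squares and projects one period ahead. The function name `fit_simple_regression` and the sales figures are hypothetical and only serve the example.

```python
def fit_simple_regression(xs, ys):
    """Return (a, b) for y_hat = a + b*x using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope: change in y_hat per one-unit increase in x
    a = mean_y - b * mean_x  # intercept: y_hat when x is zero
    return a, b


# Hypothetical data: x is the time period, y is sales in that period
periods = [1, 2, 3, 4, 5]
sales = [110.0, 125.0, 128.0, 145.0, 152.0]

a, b = fit_simple_regression(periods, sales)
forecast_period_6 = a + b * 6  # trend projection for the next period
print(f"a = {a:.1f}, b = {b:.1f}, forecast for period 6 = {forecast_period_6:.1f}")
```

Here the time period plays the role of x, which is what makes the regression a trend projection.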

Correlation Analysis

Correlation analysis is used to assess how well a model can predict an outcome.

  • The correlation coefficient, R
    • -1 ≤ R ≤ 1
  • The standard error of the estimate, also called the standard error of the regression
    • y = a + bx + e
  • The coefficient of determination, R^2
    • 0 ≤ R^2 ≤ 1
  • The T-statistic
    • |T| > 2 (rule of thumb)

e = the error term, also called the residual, which for each value of x is the difference between the estimated \( \displaystyle \hat{y} \) value on the regression line for that value of x and the actual value of y for that value of x. The error term will be different for each value of x used in the regression function.

The standard error of the estimate (S) represents the average distance that the observed values fall from the regression line.
It describes how wrong the regression model is on average.

\( \displaystyle \bf e = y - \hat{y} \)
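
For a simple regression fitted to n observations, the standard error of the estimate is commonly computed from the residuals as shown below. This is the usual textbook form, not a formula stated in these notes.

\( \displaystyle \bf S = \sqrt{\frac{\sum e^2}{n - 2}} \)

n = the number of observations. The divisor is n - 2 because two coefficients, a and b, are estimated from the data.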

The coefficient of determination is the percentage of the total variation in the dependent variable (y) that can be explained by variations in the independent variable (x), as depicted by the regression line.

The t-statistic, or t-value, measures the degree to which the independent variable has a valid, long-term relationship with the dependent variable. The t-value for the independent variable used in a simple regression analysis should generally be greater than 2 in absolute value.
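
The four measures above can be computed by hand for a small data set. The following is a minimal sketch using the standard textbook formulas for simple regression; the function name `regression_diagnostics` and the data are hypothetical.

```python
import math


def regression_diagnostics(xs, ys):
    """Return R, R^2, the standard error of the estimate, and the slope's t-value."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    b = sxy / sxx
    a = mean_y - b * mean_x

    # Residuals e = y - y_hat and their sum of squares
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sse = sum(e * e for e in residuals)

    r = sxy / math.sqrt(sxx * syy)      # correlation coefficient
    r_squared = 1 - sse / syy           # share of variation in y explained by x
    s = math.sqrt(sse / (n - 2))        # standard error of the estimate
    # t-value of the slope: slope divided by its standard error.
    # Assumes the fit is not perfect (s > 0); otherwise this divides by zero.
    t_slope = b / (s / math.sqrt(sxx))
    return {"r": r, "r_squared": r_squared, "s": s, "t_slope": t_slope}


# Hypothetical data
periods = [1, 2, 3, 4, 5]
sales = [110.0, 125.0, 128.0, 145.0, 152.0]
diagnostics = regression_diagnostics(periods, sales)
for name, value in diagnostics.items():
    print(f"{name} = {value:.4f}")
```

In simple regression R^2 equals the square of R, which is a quick consistency check on the results.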

Multiple Regression Analysis

Calculation problems on this topic are not tested on the exam.

Goodness of Fit in Regression Analysis

Just memorize the terminology.

Goodness of fit describes how close the actual values used in a statistical model are to the expected values, that is, the predicted values, in the model.

Confidence Interval in Regression Analysis

The confidence interval is used in regression analysis to describe the amount of uncertainty caused by the sampling method used when drawing conclusions about a population based on a sample.

If several samples are drawn from a population using the same sampling method and a confidence interval at a confidence level of 95% is used, 95% of the interval estimates in the samples can be expected to include the true parameter of the population.

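A "95% confidence interval" is often described as meaning that the population mean lies within that range with 95% probability. Some people may take this to mean that "when a sample is drawn from a normally distributed population and a 95% confidence interval is computed from its mean, the population mean falls inside that interval with 95% probability," but this is incorrect.

The population mean is a fixed value (a constant) and does not vary probabilistically. In other words, a computed confidence interval either contains the population mean or it does not. Therefore, one cannot say that "the population mean is contained in the estimated confidence interval with 95% probability."

The correct meaning is: "if the procedure of drawing a sample from the population and computing a 95% confidence interval from its mean is repeated 100 times, the population mean will be contained in the interval about 95 of those times."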

Source: 19-3. 95%信頼区間のもつ意味 (The Meaning of the 95% Confidence Interval) | 統計学の時間 | 統計WEB
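
The repeated-sampling interpretation can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: a normal population with a known standard deviation, so the interval is the sample mean ± 1.96 × σ/√n. The function name `coverage_rate` and all parameter values are hypothetical.

```python
import random


def coverage_rate(true_mean=50.0, sigma=10.0, n=30, trials=2000, seed=0):
    """Fraction of 95% intervals (known sigma) that contain the true mean."""
    rng = random.Random(seed)
    z = 1.96  # approximate 97.5th percentile of the standard normal
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample and build one interval around its mean
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        # The true mean is fixed; only the interval changes between samples
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

Each individual interval either contains the true mean or does not; the 95% describes how often the procedure succeeds over many samples.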

Sensitivity Analysis

Sensitivity analysis can be used to determine how much the prediction of a model will change if one input to the model is changed.

It can be used to determine which input parameter is most important for achieving accurate predictions.

Sensitivity analysis is also known as “what-if” analysis.
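
A minimal what-if sketch: each input is increased by 10% while the others stay at their base values, and the change in the output is recorded. The profit model, the input values, and the names `profit` and `one_at_a_time` are hypothetical and chosen only for illustration.

```python
def profit(price, volume, unit_cost, fixed_cost):
    """Hypothetical model: contribution margin times volume, less fixed cost."""
    return (price - unit_cost) * volume - fixed_cost


def one_at_a_time(model, base_inputs, change=0.10):
    """Change one input at a time by `change` and return the effect on the output."""
    base_output = model(**base_inputs)
    effects = {}
    for name, value in base_inputs.items():
        scenario = dict(base_inputs)           # copy so other inputs stay at base
        scenario[name] = value * (1 + change)  # bump only this input
        effects[name] = model(**scenario) - base_output
    return base_output, effects


base_inputs = {"price": 20.0, "volume": 1000.0, "unit_cost": 12.0, "fixed_cost": 5000.0}
base_profit, effects = one_at_a_time(profit, base_inputs)

# The input with the largest absolute effect is the most influential one
most_sensitive = max(effects, key=lambda name: abs(effects[name]))
print(f"Base profit: {base_profit:.0f}")
for name, delta in effects.items():
    print(f"{name} +10% changes profit by {delta:+.0f}")
print(f"Most sensitive input: {most_sensitive}")
```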

Monte Carlo Simulation Analysis

Whereas sensitivity analysis involves changing one input variable at a time, a Monte Carlo simulation analysis can be used to find solutions to mathematical problems that involve changes to multiple variables at the same time.
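
As a rough sketch of the idea, the example below draws price, volume, and unit cost at random in every trial, so all three change at the same time, and then summarizes the resulting profits. The profit model, the probability distributions, and the name `simulate_profit` are assumptions made for illustration only.

```python
import random


def simulate_profit(trials=10_000, seed=42):
    """Return a list of simulated profits with three inputs varying together."""
    rng = random.Random(seed)
    fixed_cost = 5000.0
    profits = []
    for _ in range(trials):
        # Assumed distributions for each uncertain input
        price = rng.uniform(18.0, 22.0)               # equally likely in a range
        volume = rng.gauss(1000.0, 150.0)             # normal around 1,000 units
        unit_cost = rng.triangular(10.0, 14.0, 12.0)  # low, high, most likely
        profits.append((price - unit_cost) * volume - fixed_cost)
    return profits


profits = simulate_profit()
mean_profit = sum(profits) / len(profits)
prob_loss = sum(p < 0 for p in profits) / len(profits)
print(f"Average simulated profit: {mean_profit:.0f}")
print(f"Estimated probability of a loss: {prob_loss:.1%}")
```

The output is a distribution of outcomes, not a single answer, which is what allows statements such as the estimated probability of a loss.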
